A Summary of Calculus

Karl Heinz Dovermann
Professor of Mathematics
University of Hawaii
July 28, 2003
© Copyright 2003 by the author. All rights reserved. No part of this
publication may be reproduced, stored in a retrieval system, or transmitted,
in any form or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of the author.
Printed in the United States of America.
This publication was typeset using AMS-TeX, the American Mathematical
Society's TeX macro system, and LaTeX2ε. The graphics were produced
with the help of Mathematica.¹
This is an incomplete draft which will undergo further changes.
¹ Mathematica Version 2.2, Wolfram Research, Inc., Champaign, Illinois (1993).
Contents
Preface
1 Basic Concepts
    1.1 Real Numbers and Functions
    1.2 Limits
        1.2.1 Two important estimates
    1.3 More Limits
    1.4 Continuous Functions
    1.5 Lines
    1.6 Tangent Lines and the Derivative
        1.6.1 Derivatives without Limits
    1.7 Secant Lines and the Derivative
    1.8 Differentiability implies Continuity
    1.9 Basic Examples of Derivatives
    1.10 The Exponential and Logarithm Functions
    1.11 Differentiability on Closed Intervals
    1.12 Other Notations for the Derivative
    1.13 Rules of Differentiation
        1.13.1 Linearity of the Derivative
        1.13.2 Product and Quotient Rules
        1.13.3 Chain Rule
        1.13.4 Hyperbolic Functions
        1.13.5 Derivatives of Inverse Functions
        1.13.6 Implicit Differentiation
    1.14 Related Rates
    1.15 Exponential Growth and Decay
    1.16 More Exponential Growth and Decay
    1.17 The Second and Higher Derivatives
    1.18 Numerical Methods
        1.18.1 Approximation by Differentials
        1.18.2 Newton's Method
        1.18.3 Euler's Method
    1.19 Table of Important Derivatives
2 Global Theory
    2.1 Cauchy's Mean Value Theorem
    2.2 Unique Solutions of Differential Equations
    2.3 The First Derivative and Monotonicity
        2.3.1 Monotonicity on Intervals
        2.3.2 Monotonicity at a Point
    2.4 The Second Derivative and Concavity
        2.4.1 Concavity on Intervals
        2.4.2 Concavity at a Point
    2.5 Local Extrema and Inflection Points
    2.6 Detection of Local Extrema
    2.7 Detection of Inflection Points
    2.8 Absolute Extrema of Functions
    2.9 Optimization Story Problems
    2.10 Sketching Graphs
3 Integration
    3.1 Properties of Areas
    3.2 Partitions and Sums
        3.2.1 Upper and Lower Sums
        3.2.2 Riemann Sums
    3.3 Limits and Integrability
        3.3.1 The Darboux Integral and Areas
        3.3.2 The Riemann Integral
    3.4 Integrable Functions
    3.5 Some elementary observations
    3.6 Areas and Integrals
    3.7 Anti-derivatives
    3.8 The Fundamental Theorem of Calculus
        3.8.1 Some Proofs
    3.9 Substitution
        3.9.1 Substitution and Definite Integrals
    3.10 Areas between Graphs
    3.11 Numerical Integration
    3.12 Applications of the Integral
    3.13 The Exponential and Logarithm Functions
        3.13.1 Other Bases
4 Trigonometric Functions
Preface
In these notes we would like to summarize calculus.
Chapter 1
Basic Concepts
Introduction
In this chapter we introduce limits and derivatives. These are basic concepts
of calculus. We provide some rules for computing them.
1.1 Real Numbers and Functions
We assume that the reader is familiar with the real numbers (denoted by R)
and the operations of addition and multiplication. A real number is either
positive, negative, or zero. This allows us to order the real numbers. If
x and y are real numbers, then x is larger than y (i.e., x > y) if x − y is
positive.
Until further notice, we will work with real valued functions in one real
variable. Their domains, the sets on which these functions are defined, are
subsets of the real numbers, and they take values in R. The range of a
function is a set in which the function takes values. The image of a function
f consists of all those points y in the range for which there exists an x in
the domain of f, such that f(x) = y.
We will make frequent use of the absolute value function:
\[
|x| = \begin{cases} x & \text{if } x > 0 \\ -x & \text{if } x < 0 \\ 0 & \text{if } x = 0 \end{cases}
\]
The distance between two points a and b on the real line is |a − b|, and
{x ∈ R | |x − a| < ε} is the set of all real numbers whose distance from
a is less than ε. Expressed as an interval this set is (a − ε, a + ε). For
computations with absolute values it is worth noting that, for any two real
numbers x and y,
\[
|x \cdot y| = |x| \cdot |y|, \quad |x + y| \le |x| + |y|, \quad \text{and} \quad \bigl| |x| - |y| \bigr| \le |x - y|. \tag{1.1}
\]
The first inequality is referred to as the triangle inequality, and the last one is
a variation of it.
Every now and then we will allude to the completeness of the real line,
which means that every bounded subset of the real line has a least upper
bound. This property is crucial for calculus, but arguments using it are too
difficult for an introductory course on the subject.
1.2 Limits
Limits are a central tool in calculus and other areas of mathematics. We
discuss them in this section.
Definition 1.1. Let f be a function and L a real number. We say that
\[
L = \lim_{x \to a} f(x) \tag{1.2}
\]
if for all ε > 0 there exists a δ > 0, such that |f(x) − L| < ε whenever x is
in the domain of f and 0 < |x − a| < δ.
The equation in (1.2) reads as L is the limit of f(x) as x approaches
a. We also say that f(x) approaches or converges to L as x approaches a.
An intuitive interpretation is that the expected value of f(x) at x = a is L,
based on the values of f(x) for x near a.
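A small numerical illustration of Definition 1.1 may help. The sketch below uses a hypothetical example that is not taken from the text: f(x) = 3x + 1 at a = 2 with L = 7, for which δ = ε/3 works. It only spot-checks the inequality at sample points, so it can fail to find a counterexample but proves nothing.

```python
def f(x):
    # Hypothetical example function (an assumption, not from the text).
    return 3 * x + 1


def delta_for(eps):
    # For f(x) = 3x + 1 we have |f(x) - 7| = 3|x - 2|, so delta = eps/3 works.
    return eps / 3


def spot_check(eps, a=2.0, L=7.0, samples=1000):
    """Check |f(x) - L| < eps at sample points with 0 < |x - a| < delta."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (a - offset, a + offset):
            if not abs(f(x) - L) < eps:
                return False
    return True


results = {eps: spot_check(eps) for eps in (1.0, 0.1, 0.001)}
print(results)
```

The choice of δ depends on ε, which is exactly the order of quantifiers in the definition: ε is given first, and δ is chosen afterwards.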
In all but a few degenerate cases, limits are unique if they exist.
Proposition 1.2. If f(x) has a limit at x = a, then this limit is
unique, provided that the domain of the function f contains points arbitrarily
close to a.¹ ²
¹ Expressed in mathematical language this means: for all δ > 0 there is a point b in the
domain of f, such that 0 < |b − a| < δ.
² Some authors do not apply the concept of a limit at isolated points of the domain of
a function, points for which there are no other arbitrarily close points in the domain of
the function.
The latter assumption in the proposition is satisfied if the domain of f
contains an interval, and either a belongs to this interval or a is an end point
of it. To avoid intricate language, we make this kind of assumption for
the remainder of this section. Taking limits is compatible with the basic
algebraic operations in the following sense.
Proposition 1.3. Assume that the domains of the functions f(x) and g(x)
both contain an interval of the form (d, a) or (a, e) where d < a < e. Suppose
that
\[
\lim_{x \to a} f(x) = L \quad \text{and} \quad \lim_{x \to a} g(x) = M,
\]
and that c is a constant. Then
\[
\begin{aligned}
\lim_{x \to a} (f + g)(x) &= L + M, \\
\lim_{x \to a} c f(x) &= cL, \\
\lim_{x \to a} (f \cdot g)(x) &= L \cdot M, \\
\lim_{x \to a} (f/g)(x) &= L/M \quad \text{provided that } M \neq 0.
\end{aligned}
\]
As a special case we obtain the following useful observation:
\[
\lim_{x \to c} f(x) = L \quad \text{if and only if} \quad \lim_{x \to c} (f(x) - L) = 0. \tag{1.3}
\]
Proposition 1.4 (Pinching Theorem). Assume that the domains of the
functions f(x), g(x), and h(x) all contain an interval of the form (d, a) or
(a, e) where d < a < e and that f(x) ≤ h(x) ≤ g(x). If
\[
\lim_{x \to a} f(x) = L = \lim_{x \to a} g(x),
\]
then the limit of h(x) exists as x approaches a, and it is equal to L.
For many functions the computation of limits is no challenge.
Proposition 1.5. If f(x) is a polynomial, a rational function, or a
trigonometric function and f(a) is defined, then
\[
\lim_{x \to a} f(x) = f(a).
\]
The following limits are important in the calculations of some derivatives.
\[
\lim_{x \to 0} \frac{\cos x - 1}{x} = 0, \quad \lim_{x \to 0} \frac{\sin x}{x} = 1, \quad \text{and} \quad \lim_{x \to a} \frac{x^n - a^n}{x - a} = n a^{n-1}. \tag{1.4}
\]
Hints: The first two limits follow easily from the estimates in Theo-
rem 1.7, discussed in the following subsection. The last assertion can be
proved using synthetic division, at least if n is an integer.
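The three limits in (1.4) can be made plausible numerically. The sketch below tabulates the quotients for a few small values of h; the helper names and the sample choice a = 2, n = 5 are assumptions made for illustration, and a table of values is of course no substitute for the proofs hinted at above.

```python
import math


def cos_quotient(h):
    # (cos h - 1) / h, expected to approach 0 as h -> 0
    return (math.cos(h) - 1) / h


def sin_quotient(h):
    # sin h / h, expected to approach 1 as h -> 0
    return math.sin(h) / h


def power_quotient(x, a, n):
    # (x^n - a^n) / (x - a), expected to approach n * a^(n-1) as x -> a
    return (x**n - a**n) / (x - a)


for h in (0.1, 0.01, 0.001):
    print(h, cos_quotient(h), sin_quotient(h), power_quotient(2 + h, 2, 5))
# The third column should approach 5 * 2**4 = 80.
```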
1.2.1 Two important estimates
In preparation for the proof of Theorem 1.7 we show
Theorem 1.6. If h ∈ [−π/4, π/4], then
| sin h| ≤ |h| ≤ | tan h|. (1.5)
Proof. In Figure 1.1 you see part of the unit circle. For h ∈ [−π/4, π/4] we
set C = (cos h, sin h). Given two points X and Y in the plane, the distance
between them is denoted by XY. We denote by $\widehat{BC}$ the length of the arc
(part of the unit circle) between B and C.
[Figure 1.1: The unit circle, with labeled points O, A, B, C, D, and E]
We find that |sin h| = AC ≤ |h| = $\widehat{BC}$ because going from C straight
down to the x-axis is shorter than following the circle from C to the x-axis.
Secondly, to show that |h| = $\widehat{BC}$ ≤ |tan h| = BD, imagine that you roll
the circle along the vertical line through B until the point C touches it in the
point E. We use the process of rolling the circle along the line to measure
|h|. In particular, |h| = BE. It appears to be clear³ that BE ≤ BD. This
verifies that |h| ≤ tan |h|, the second inequality which we claimed in the
theorem.
³ Here our argument relies on intuition. A rigorous argument requires work. One can
show that the area of a disk with radius one is π. From this it follows by elementary
geometry that the slice of the disk with vertices O, B and C has area |h|/2.
This slice is contained in the triangle with vertices O, B and D, and the area of the
triangle is (tan |h|)/2. It follows that |h| ≤ tan |h|.
Theorem 1.7. If h ∈ [−π/4, π/4], then⁴
\[
|1 - \cos h| \le \frac{h^2}{2} \quad \text{and} \quad |h - \sin h| \le \frac{h^2}{2}. \tag{1.6}
\]
Proof of Theorem 1.7. In Figure 1.2 you see half of a circle of radius 1
centered at the origin, and a triangle with vertices A, B, and C. Let
h ∈ [−π/4, π/4] be the number for which we want to show the inequality
and C = (cos h, sin h). Denote by XY the length of the straight line
segment between the points X and Y. Let $\widehat{BC}$ be the length of the arc
(part of the unit circle) between B and C.
[Figure 1.2: The unit circle, with labeled points A, B, C, and D]
From the picture we read off that
\[
AB = 2, \quad DB = 1 - \cos h, \quad \widehat{BC} = |h|, \quad \text{and} \quad BC \le \widehat{BC}.
\]
Using similar triangles we see AB/BC = BC/DB and (BC)² = AB × DB.
In other words
\[
2(1 - \cos h) = AB \times DB = (BC)^2 \le (\widehat{BC})^2 = h^2.
\]
The first estimate in (1.6) is an immediate consequence.
If h = 0, then both sides of the second inequality in (1.6) are zero,
verifying the assertion in this case. If 0 ≠ h ∈ [−π/4, π/4], then Theorem 1.6
tells us that
\[
|\sin h| \le |h| \le |\tan h| = \frac{|\sin h|}{\cos h}, \quad \text{hence} \quad 0 \le \cos h \le \frac{\sin h}{h} \le 1.
\]
Subtracting the terms in this inequality from 1 we find
\[
0 \le 1 - \frac{\sin h}{h} \le 1 - \cos h \le 1.
\]
Using our previous estimate for |1 − cos h| and our assumption that |h| ≤
π/4 < 1, we conclude that
\[
\left| \frac{h - \sin h}{h} \right| \le |1 - \cos h| \le \frac{h^2}{2} \le \frac{|h|}{2}.
\]
The second estimate claimed in the theorem is an immediate consequence.
⁴ The inequalities hold without the restriction on h, but we only need them on an
interval around zero. Restricting ourselves to this interval simplifies the proofs somewhat.
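As a sanity check on the estimates (1.5) and (1.6), one can evaluate both sides on a grid of points in [−π/4, π/4]. The following is a minimal sketch; the grid size and the rounding tolerance `tol` are arbitrary choices made here, and passing the check does not replace the geometric proofs.

```python
import math


def estimates_hold(h, tol=1e-12):
    """Check (1.5) and (1.6) at a single h, allowing a small rounding tolerance."""
    sine_le_h = abs(math.sin(h)) <= abs(h) + tol
    h_le_tan = abs(h) <= abs(math.tan(h)) + tol
    cos_bound = abs(1 - math.cos(h)) <= h * h / 2 + tol
    sin_bound = abs(h - math.sin(h)) <= h * h / 2 + tol
    return sine_le_h and h_le_tan and cos_bound and sin_bound


n = 2000
grid = [-math.pi / 4 + k * (math.pi / 2) / n for k in range(n + 1)]
all_ok = all(estimates_hold(h) for h in grid)
print(all_ok)
```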
1.3 More Limits
The material in the previous, first section about limits suffices for a while.
In some situations one would like to modify the definition in Section 1.2,
and we do so in this section. The first two limits express how the function
behaves as we approach a point a from the right or left. They are called the
right and left hand limits. The next two limits express what happens as the
variable tends to plus or minus infinity. We call them limits at infinity. The
last two limits allow us to express that the values of a function tend to plus
or minus infinity. We call them infinite limits.
Definition 1.8. Let f be a function and L a real number. We say that
\[
L = \lim_{x \to a^+} f(x)
\]
if for all ε > 0 there exists a δ > 0, such that |f(x) − L| < ε whenever x is
in the domain of f and a < x < a + δ.
Definition 1.9. Let f be a function and L a real number. We say that
\[
L = \lim_{x \to a^-} f(x)
\]
if for all ε > 0 there exists a δ > 0, such that |f(x) − L| < ε whenever x is
in the domain of f and a − δ < x < a.
For example, if f(x) = sign(x) = x/|x|, then
\[
\lim_{x \to 0^+} f(x) = 1 \quad \text{and} \quad \lim_{x \to 0^-} f(x) = -1.
\]
We can consider what happens to the values of a function f(x) as x
approaches ∞ or −∞.
Definition 1.10. Let f be a function and L a real number. We say that
\[
L = \lim_{x \to \infty} f(x)
\]
if for all ε > 0 there exists a number M, such that |f(x) − L| < ε whenever
x is in the domain of f and x > M.
Definition 1.11. Let f be a function and L a real number. We say that
\[
L = \lim_{x \to -\infty} f(x)
\]
if for all ε > 0 there exists a number M, such that |f(x) − L| < ε whenever
x is in the domain of f and x < M.
For example
\[
\lim_{x \to \infty} \frac{1}{x} = 0 \quad \text{and} \quad \lim_{x \to -\infty} \frac{1}{1 + x^2} = 0.
\]
Definition 1.12. Let f be a function and a a real number. We say that
\[
\lim_{x \to a} f(x) = \infty
\]
if for all M there exists a δ > 0 such that f(x) > M whenever x is in the
domain of f and 0 < |a − x| < δ.
In other words, we can make sure that the value of f(x) is larger than
any given number M, no matter how large, by taking x close to a.
Definition 1.13. Let f be a function and a a real number. We say that
\[
\lim_{x \to a} f(x) = -\infty
\]
if for all M there exists a δ > 0 such that f(x) < M whenever x is in the
domain of f and 0 < |a − x| < δ.
In the last two definitions a may be replaced by a⁻ or a⁺, so that we approach
a from the left or right, and a can be replaced by ±∞.
For example
\[
\lim_{x \to 0^+} \frac{1}{x} = \infty \quad \text{and} \quad \lim_{x \to \infty} \sqrt{x} = \infty.
\]
1.4 Continuous Functions
We define continuous functions and discuss a few of their basic properties.
The class of continuous functions will play a central role later.
Definition 1.14. Let f be a function and c a point in its domain. The
function is said to be continuous at c if for all ε > 0 there exists a δ > 0,
such that |f(c) − f(x)| < ε whenever x belongs to the domain of f and
|x − c| < δ. A function f is continuous if it is continuous at all points in its
domain.
In most cases the condition in Definition 1.14 says that
\[
\lim_{x \to c} f(x) = f(c). \tag{1.7}
\]
In fact, this equation holds whenever there are points in the domain of f
arbitrarily close to c. See the footnote to Proposition 1.2. If c is an isolated
point in the domain of f, i.e., there are no other points in the domain of f
arbitrarily close to c, then the function is always continuous at c.
Polynomials, rational functions, and trigonometric functions are contin-
uous. One can produce many more continuous functions through standard
operations on functions.
Proposition 1.15. Let f and g be continuous functions. Then f +g, f · g,
f/g and f ◦ g are continuous, wherever these functions are defined.
To clarify the remark about the domain in the proposition, we note that
the function (f + g)(x) = f(x) + g(x) is defined for those x for which both f
and g are defined. The same statement holds for (f · g)(x) = f(x) · g(x). To
determine the domain of f/g one needs to exclude those points where g is
zero. For the composition (f ◦ g)(x) = f(g(x)) one needs that g takes values
in the domain of f.
One may also reverse the order of applying a continuous function and
calculating a limit:
\[
\lim_{x \to c} f(g(x)) = f\left( \lim_{x \to c} g(x) \right), \tag{1.8}
\]
provided the natural technical assumptions hold, i.e., g is defined at points
arbitrarily close to c, f is defined for all g(x) where x is in the domain of g
and close to c, and f is continuous at $\lim_{x \to c} g(x)$.
Theorem 1.16 (Intermediate Value Theorem). Suppose that f is defined
and continuous on the closed interval [a, b]. If C is in between f(a)
and f(b), then there exists a c ∈ [a, b], such that f(c) = C.
E.g., suppose that p(x) = x³ − x² + 2x − 1. The polynomial is certainly
a continuous function, p(0) = −1 and p(1) = 1. According to the theorem
there exists some c ∈ (0, 1), such that p(c) = 0.
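The theorem only asserts that such a c exists. One standard way to approximate it, not discussed in the text at this point, is bisection: halve the interval repeatedly and keep the half on which the sign change persists. The sketch below applies this to the polynomial p from the example; the function name `bisect` and the tolerance are choices made here for illustration.

```python
def p(x):
    # The polynomial from the example above.
    return x**3 - x**2 + 2 * x - 1


def bisect(f, lo, hi, tol=1e-10):
    """Approximate a zero of a continuous f on [lo, hi].

    Assumes f(lo) and f(hi) have opposite signs (or one of them is zero).
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                # sign change is in the left half
        else:
            lo, f_lo = mid, f_mid   # sign change is in the right half
    return (lo + hi) / 2


root = bisect(p, 0.0, 1.0)
print(root, p(root))
```

Each step relies on the Intermediate Value Theorem applied to the smaller interval, which is why continuity of p matters.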
Theorem 1.17 (Extreme Value Theorem). Let f be defined and con-
tinuous on the closed interval [a, b]. Then there exist points c and d in [a, b],
such that f(c) ≤ f(x) ≤ f(d) for all x ∈ [a, b].
Expressed in words, the theorem says that a continuous function on a
closed interval assumes a smallest and largest value.
The Intermediate Value and Extreme Value theorems are typically proved
in an introductory analysis course. They are equivalent to the completeness
of the real line. We mentioned this property of the real numbers in
Section 1.1.
1.5 Lines
In general, a line consists of the points (x, y) in the plane which satisfy the
equation
\[
ax + by = c \tag{1.9}
\]
for some given real numbers a, b and c, where it is assumed that a and b
are not both zero. The line is vertical if and only if b = 0. If b ≠ 0 we may
rewrite the equation as
\[
y = -\frac{a}{b} x + \frac{c}{b} = mx + B. \tag{1.10}
\]
The number m is called the slope of the line, and B is the point in which the
line intersects the y-axis, also called the y-intercept. Given any two points
(x₁, y₁) and (x₂, y₂) in the plane with x₁ ≠ x₂, the line through them has slope
\[
m = \frac{y_2 - y_1}{x_2 - x_1}.
\]
For our purposes, the most useful version of the equation of a line is its
point-slope formula. The equation of a line with slope m through the point
(x₁, y₁) is
\[
y = m(x - x_1) + y_1. \tag{1.11}
\]
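The slope formula and the point-slope form translate directly into a few lines of code. This is a minimal sketch with hypothetical helper names `slope` and `line_through`; the sample points are chosen only for illustration.

```python
def slope(p1, p2):
    """Slope of the line through two points with different x-coordinates."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    return (y2 - y1) / (x2 - x1)


def line_through(p1, p2):
    """Return the function x -> m(x - x1) + y1, the point-slope form (1.11)."""
    m = slope(p1, p2)
    x1, y1 = p1
    return lambda x: m * (x - x1) + y1


line = line_through((1.0, 2.0), (3.0, 8.0))
print(slope((1.0, 2.0), (3.0, 8.0)), line(0.0))
```

Evaluating the returned function at x = 0 gives the y-intercept B of (1.10).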
1.6 Tangent Lines and the Derivative
We would like to introduce the concept of tangent lines. To be able to express
ourselves concisely, let us say
Definition 1.18. A point c is an interior point of a subset B of R if there
is an open interval I, such that c ∈ I ⊆ B.
We give a first definition for a tangent line.
Definition 1.19. Suppose f(x) is a function and c is an interior point of
its domain. We call a line t(x) the tangent line to the graph of f(x) at x = c
if t(x) is the best linear approximation of f(x) on some open interval around
c, i.e., the line t(x) is closer to the graph of f(x) than any other line for all
x in some open interval around c.
For a given function and an interior point c in its domain there may or
may not be a tangent line, but if there is a tangent line, then it is unique.
Although the term ‘best linear approximation near c’ gives an excellent
intuitive picture of what a tangent line is, this definition is hard to work with.
It is easier to work with a more concrete definition.
Definition 1.20. Suppose f(x) is a function and c is an interior point of
its domain. We call a line t(x) the tangent line to the graph of f(x) at x = c
if
\[
\lim_{x \to c} \frac{f(x) - t(x)}{x - c} = 0. \tag{1.12}
\]
The equation in (1.12) expresses in a precise form in which sense the
tangent line is close to the graph of f(x) near c. Not only does f(x) −t(x)
converge to zero as x approaches c, it does so even when divided by x −c.
We use tangent lines to define the concept of differentiability and the
derivative.
Definition 1.21. Suppose f(x) is a function and c is an interior point of
its domain, and assume that there is a tangent line to the graph of f(x) at
x = c. Then we say that f(x) is differentiable at c. We call the slope of the
tangent line the derivative of f(x) at c, and we denote it by f′(c).
Utilizing the notation in the previous definition we can write down the
equation of the tangent line to the graph of f(x) at x = c in point-slope
form:
\[
t(x) = f'(c)(x - c) + f(c). \tag{1.13}
\]
To differentiate a function means to find its derivative.
By definition, an open set is a set, such that each of its points is an
interior point.
Definition 1.22. Suppose the domain of the function f(x) is an open set.
Then we say that f(x) is differentiable if it is differentiable at each point of
its domain. We consider f′(x) as a function, whose domain consists of all
those points where f(x) is differentiable.
Example 1.23. Let p(x) = 2x⁴ − 3x² + 5. Find the tangent line t(x) to the
graph of p(x) at x = −2 and p′(−2).
Solution: As a first step we expand p in powers of u = (x + 2). To do
so, we substitute u − 2 for x and expand p in powers of u. You are expected
to fill in some of the arithmetic steps.
\[
\begin{aligned}
p &= 2(u - 2)^4 - 3(u - 2)^2 + 5 \\
  &= 2(u^4 - 8u^3 + 24u^2 - 32u + 16) - 3(u^2 - 4u + 4) + 5 \\
  &= 2u^4 - 16u^3 + 45u^2 - 52u + 25
\end{aligned}
\]
Reversing the substitution, replacing u by (x + 2), we find:
\[
p(x) = 2(x + 2)^4 - 16(x + 2)^3 + 45(x + 2)^2 - 52(x + 2) + 25.
\]
We assert that t(x) = −52(x + 2) + 25 and p′(−2) = −52.
For t(x) as proposed, we see that
\[
\begin{aligned}
\left| \frac{p(x) - t(x)}{x - c} \right| = \left| \frac{p(x) - t(x)}{x + 2} \right|
  &= |2(x + 2)^3 - 16(x + 2)^2 + 45(x + 2)| \\
  &\le 65\,|x + 2| \quad (\text{provided } |x + 2| \le 1)
\end{aligned}
\]
This estimate shows that (p(x) − t(x))/(x − c) converges to zero as x approaches
c = −2. By definition, this means that t(x) is the desired tangent
line. Its slope is p′(−2) = −52. ♦
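The expansion of p in powers of (x + 2) can also be carried out mechanically by repeated synthetic division by (x − c), where each pass produces the next coefficient as its remainder. The sketch below is one way to do this; the function name `expand_about` and the coefficient ordering are conventions chosen here, not taken from the text.

```python
def expand_about(coeffs, c):
    """Rewrite a polynomial in powers of (x - c).

    coeffs lists the coefficients from the highest power down to the constant.
    Returns [A_0, A_1, ..., A_n] with p(x) = sum of A_k * (x - c)**k.
    """
    work = list(coeffs)
    shifted = []
    while work:
        # One pass of synthetic division by (x - c);
        # afterwards the last entry is the remainder.
        for i in range(1, len(work)):
            work[i] += c * work[i - 1]
        shifted.append(work.pop())
    return shifted


# p(x) = 2x^4 - 3x^2 + 5 expanded about c = -2
A = expand_about([2, 0, -3, 0, 5], -2)
print(A)  # [25, -52, 45, -16, 2]
```

The entries A[0] = 25 and A[1] = −52 are the value p(−2) and the slope of the tangent line found in the example.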
The example is generic. We can use any polynomial p(x) and point x = c
and write p(x) in powers of (x − c). Say, the result is
\[
p(x) = A_n (x - c)^n + \cdots + A_1 (x - c) + A_0.
\]
The technique used in the example, suitably generalized, shows that
\[
t(x) = A_1 (x - c) + A_0
\]
is the tangent line to the graph of p(x) at x = c, and p′(c) = A₁. Eventually
we will find a more efficient method for differentiating polynomials, but we
have shown that
Proposition 1.24. Polynomials are differentiable.
1.6.1 Derivatives without Limits
Without a doubt, the definition of a limit is the most difficult one in a
first semester of calculus, and it is interesting to explore ways to develop
calculus, rigorously, without the limit concept. One can do this by replacing
the condition in (1.12) by a slightly stronger one.
Definition 1.25. Suppose f(x) is a function and c is an interior point of
its domain. We call a line t(x) the tangent line to the graph of f(x) at x = c
if there exists an open interval I around c and a number A, such that
\[
|f(x) - t(x)| \le A (x - c)^2 \tag{1.14}
\]
for all x ∈ I.
With this definition fewer functions will be differentiable than with the
one given in Definition 1.20, but this is not crucial.
The inequality in (1.14) can be rewritten as
\[
q(x) = t(x) - A(x - c)^2 \le f(x) \le t(x) + A(x - c)^2 = p(x),
\]
where the parabolas q(x) and p(x) are defined by the expressions they are
adjacent to. All four functions f(x), q(x), p(x), and t(x) have the same value
at x = c. In an example, this situation is shown in Figure 1.3. There you
see the function f(x) = sin x, the parabola p(x) (dotted and open upwards),
the parabola q(x) (dotted and open downwards), and the tangent line t(x)
(dashed). The parabolas p(x) and q(x) touch each other without crossing,
and the picture shows how they ‘hug’ each other. There is very little space
left between p(x) and q(x), and f(x) and t(x) are squeezed in between them.
In this sense, the graphs of f(x) and t(x) have to be close to each other near
x = c.
A pedagogical advantage of the approach is that one does not have to
understand limits before one can understand the definition of the derivative.
There is also a geometric picture which illustrates the concept of closeness,
tangent line, and derivative. The condition in (1.14) is also more accessible
to computer assisted algebra than the limit definition. In terms of algebraic
geometry (1.14) at least alludes to a divisibility condition.
[Figure 1.3: Sine Function and Tangent Line between two Parabolas]
1.7 Secant Lines and the Derivative
Often a different approach is taken to motivate and introduce the derivative.
Theorem 1.26. Suppose f is a function and c is an interior point of its
domain. If f is differentiable at c, then
\[
f'(c) = \lim_{x \to c} \frac{f(c) - f(x)}{c - x}.
\]
Proof. This is obvious once one uses the expression for the tangent line in
(1.13) and substitutes it in the expression in (1.12) inside the limit:
\[
\frac{f(x) - t(x)}{x - c} = \frac{f(x) - f(c)}{x - c} - f'(c). \tag{1.15}
\]
Apply limits to both sides of the equation and the assertion follows.
Let us explain the situation geometrically. Suppose a and b are distinct
points in the domain of the function f. The line through (a, f(a)) and
(b, f(b)) is called a secant line, and its slope (f(a) − f(b))/(a − b) is called
the average rate of change of f over the interval [a, b]. In (1.15) we are
considering the slopes of secant lines through (c, f(c)) and (x, f(x)), and
then we take the limit as x approaches c. The theorem asserts that for a
differentiable function this limit of the slopes of secant lines is the slope of
the tangent line. For the obvious reason f′(c) is called the rate of change or
instantaneous rate of change of f at c.
Many authors introduce the derivative as the limit of the slopes of secant
lines, call t(x) = f′(c)(x − c) + f(c) the tangent line, and possibly illustrate
that the tangent line is close to the graph in the sense of Definition 1.20.
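To see secant slopes approach the slope of the tangent line, one can reuse the polynomial p(x) = 2x⁴ − 3x² + 5 from Example 1.23 at c = −2, where the derivative was found to be −52. The sketch below is only a numerical illustration; the helper name `secant_slope` and the chosen step sizes are assumptions made here.

```python
def p(x):
    # The polynomial from Example 1.23.
    return 2 * x**4 - 3 * x**2 + 5


def secant_slope(f, c, x):
    """Slope of the secant line through (c, f(c)) and (x, f(x)), x != c."""
    return (f(x) - f(c)) / (x - c)


c = -2.0
steps = (0.1, 0.01, 0.001, -0.001)
slopes = [secant_slope(p, c, c + h) for h in steps]
for h, m in zip(steps, slopes):
    print(h, m)
# The slopes should approach p'(-2) = -52 as h shrinks.
```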
[Figure 1.4: f(x) = x² sin(1/x), plotted for x between −0.1 and 0.1]
[Figure 1.5: f(x) = x² sin(1/x), plotted for x between −0.01 and 0.01]
It is misleading to say that the graph of f(x) looks like, or resembles, a
line near c. Eventually you will be able to show that the function
\[
f(x) = \begin{cases} x^2 \sin(1/x) & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}
\]
is differentiable everywhere on the real line. You see part of its graph over
two different intervals in Figures 1.4 and 1.5. By no stretch of the imagination
will you say that the graph of the function looks like a line.
1.8 Differentiability implies Continuity
It is worth pointing out that
Theorem 1.27. If a function is differentiable at a point, then it is
continuous at this point.
Proof. Denote the function by f(x) and the point of differentiability by c.
By assumption we have the derivative f′(c) and
\[
\lim_{x \to c} \left( \frac{f(x) - f(c)}{x - c} - f'(c) \right) = 0.
\]
Then certainly
\[
\lim_{x \to c} \bigl[ (f(x) - f(c)) - f'(c)(x - c) \bigr] = 0.
\]
Because f′(c)(x − c) converges to zero as x approaches c, so does (f(x) − f(c)).
This implies that $\lim_{x \to c} f(x) = f(c)$ and that f(x) is continuous at c.
[Figure 1.6: The absolute value function]
The converse of the theorem is false. There are continuous functions
which are not differentiable. E.g., the function f(x) = |x| is continuous,
but it is not differentiable at x = 0. It is apparent from the graph (see
Figure 1.6) that there is no line close to the graph of this function near
x = 0.
We can also give an analytic argument. According to the definition of
differentiability, we have to study the difference quotients (|x|−|0|)/(x−0) =
|x|/x. They are 1 if x > 0 and −1 if x < 0. There is no number these
difference quotients converge to, and f(x) = |x| is not differentiable at x = 0.
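The analytic argument is easy to reproduce numerically. The short sketch below evaluates the difference quotients |x|/x on both sides of zero; the sample points are arbitrary choices made for illustration.

```python
def abs_quotient(x):
    # Difference quotient (|x| - |0|) / (x - 0) for x != 0
    return abs(x) / x


right = [abs_quotient(10.0**-k) for k in range(1, 6)]     # x -> 0 from the right
left = [abs_quotient(-(10.0**-k)) for k in range(1, 6)]   # x -> 0 from the left
print(right, left)
```

The quotients stay at 1 on the right and at −1 on the left, so the right and left hand limits exist but differ.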
1.9 Basic Examples of Derivatives
Let us use the definitions and work out a few derivatives.
Example 1.28. If f(x) = xⁿ and n is a non-negative integer, i.e., n = 0,
1, 2, . . . , then f′(x) = nxⁿ⁻¹.
Proof. Suppose that n ≥ 2. Then
\[
\lim_{x \to c} \frac{x^n - c^n}{x - c} = \lim_{x \to c} \left( x^{n-1} + x^{n-2} c + \cdots + x c^{n-2} + c^{n-1} \right) = n c^{n-1}.
\]
The cases n = 0 and n = 1 are even easier and left to the reader.
Example 1.29. If f(x) = 1/x, then f′(x) = −1/x².
Proof. Suppose c ≠ 0.
\[
\lim_{x \to c} \frac{\frac{1}{x} - \frac{1}{c}}{x - c} = \lim_{x \to c} \frac{c - x}{xc(x - c)} = -\frac{1}{c^2}.
\]
Example 1.30. If f(x) = √x and x > 0, then f′(x) = 1/(2√x).
Proof.
\[
\lim_{x \to c} \frac{\sqrt{x} - \sqrt{c}}{x - c} = \lim_{x \to c} \frac{x - c}{(x - c)(\sqrt{x} + \sqrt{c})} = \frac{1}{2\sqrt{c}}.
\]
Remark 1. Eventually we will see that if f(x) = xᵃ for any real number
a, then f′(x) = axᵃ⁻¹, generalizing all of the examples above.
Exercise 1. Suppose that f(x) = √(ax + b) and ax + b > 0. Show that
\[
f'(x) = \frac{a}{2\sqrt{ax + b}}.
\]
The tangent line to the graph of f(x) at x = c is then
\[
t(x) = \frac{a}{2\sqrt{ac + b}} (x - c) + \sqrt{ac + b}.
\]
Verify that
\[
|f(x) - t(x)| \le \frac{a^2}{2(\sqrt{ac + b})^3} (x - c)^2. \tag{1.16}
\]
The estimate in (1.16) shows differentiability in the sense of Defini-
tion 1.25, and provides an explicit error estimate, a bound on the difference
between the function and its tangent line.
Example 1.31. Show that sin′ x = cos x. For this equation to hold, the
angle x needs to be measured in radians.
Proof. Below we will set x = c + h and x − c = h.
\[
\begin{aligned}
\lim_{x \to c} \frac{\sin x - \sin c}{x - c}
  &= \lim_{h \to 0} \frac{\sin(c + h) - \sin c}{h} \\
  &= \lim_{h \to 0} \frac{\sin c \cos h + \cos c \sin h - \sin c}{h} \\
  &= \lim_{h \to 0} \frac{\sin c \,(\cos h - 1) + \cos c \sin h}{h} \\
  &= \sin c \cdot \lim_{h \to 0} \frac{\cos h - 1}{h} + \cos c \cdot \lim_{h \to 0} \frac{\sin h}{h} \\
  &= \cos c.
\end{aligned}
\]
For computation of the limits in the second to last line see (1.4).
The tangent line to the graph of the sine function at x = c is
\[
t(x) = \cos c \,(x - c) + \sin c.
\]
It is left as an exercise for the reader to show that
\[
|\sin x - t(x)| \le (x - c)^2. \tag{1.17}
\]
The steps are essentially the same as in the proof above. The estimate in
(1.17) does not only show differentiability in the sense of Definition 1.25,
but it provides an explicit error estimate, a bound on the difference between
the function and its tangent line.
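Before attempting the exercise, it can be reassuring to test the estimate (1.17) numerically. The sketch below checks it for pairs (c, x) on a grid; the grid, the helper names, and the rounding tolerance `tol` are choices made here, and a passing check is evidence rather than a proof.

```python
import math


def tangent_to_sine(c):
    """Tangent line to the sine function at x = c, as a function of x."""
    return lambda x: math.cos(c) * (x - c) + math.sin(c)


def bound_holds(c, x, tol=1e-12):
    # Checks |sin x - t(x)| <= (x - c)^2, up to a small rounding tolerance.
    t = tangent_to_sine(c)
    return abs(math.sin(x) - t(x)) <= (x - c) ** 2 + tol


points = [k / 10 for k in range(-50, 51)]
ok = all(bound_holds(c, x) for c in points for x in points)
print(ok)
```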
Exercise 2. If f(x) = cos x, then f′(x) = −sin x. The details are similar
to the ones in Example 1.31. Furthermore, if
\[
t(x) = -\sin c \,(x - c) + \cos c
\]
is the tangent line to the graph of f(x) at x = c, then
\[
|f(x) - t(x)| \le (x - c)^2.
\]
1.10 The Exponential and Logarithm Functions
The exponential and logarithm are of great importance and we do not want
to delay their introduction any further. Still, technically we are not quite
prepared for it and at a later point we have to revisit the introduction to fill
in details.
Suppose a is a positive real number and a ≠ 1. For any rational number
r = p/q (p and q are integers) one can define $a^r = \sqrt[q]{a^p}$. First we take a p-th
power and then a q-th root. In this sense we have a function $h(r) = a^r$, whose
domain consists of all rational numbers. This function is monotonic. More
precisely, h(r) is increasing if a > 1 and decreasing when 0 < a < 1.
Theorem-Definition 1.32. Let a be a positive number, a ≠ 1. There
exists exactly one monotonic function, called the exponential function with
base a and denoted by $\exp_a(x)$, which is defined for all real numbers x such
that $\exp_a(x) = a^x$ whenever x is a rational number. Furthermore, $a^x > 0$ for
all x, so that the domain of the exponential function is (−∞, ∞). For every
number y > 0 there exists exactly one number x, such that $\exp_a(x) = y$, so
we use (0, ∞) as the range of the exponential function $\exp_a(x)$.
It is common, and we will follow this convention, to use the notation
$a^x$ for $\exp_a(x)$ also if $x$ is not rational. The arithmetic properties of the
exponential function, also called the exponential laws, are collected in our
next theorem. The theorem just says that the exponential laws, which you
previously learned for rational exponents, also hold in the generality of our
current discussion.
Theorem 1.33 (Exponential Laws). For any positive real number $a$ and
all real numbers $x$ and $y$
\[ a^x a^y = a^{x+y}, \qquad a^x / a^y = a^{x-y}, \qquad (a^x)^y = a^{xy}. \]
If $x$ is the unique solution of the equation $a^x = y$, then we set
\[ \log_a(y) = x. \tag{1.18} \]
We just defined a function $\log_a(y)$. It is called the logarithm function with
base $a$, and by construction it is the inverse of the exponential function
$\exp_a(x)$. More explicitly,
\[ a^{\log_a y} = y \quad\text{and}\quad \log_a(a^x) = x \]
for all x ∈ R and all y > 0. The domain of the logarithm function is
(0, ∞) and its range is (−∞, ∞). It is increasing if a > 1 and decreasing if
0 < a < 1.
Corresponding to the exponential laws in Theorem 1.33 we have the laws
of logarithms. One set of laws implies the other one, and vice versa.
Theorem 1.34 (Laws of Logarithms). For any positive real number $a \ne 1$,
for all positive real numbers $x$ and $y$, and any real number $z$
\[ \log_a(xy) = \log_a(x) + \log_a(y) \]
\[ \log_a(x/y) = \log_a(x) - \log_a(y) \]
\[ \log_a(x^z) = z \log_a(x) \]
In Figures 1.7 and 1.8 you see parts of the graphs of the exponential and
logarithm functions with base 2.
Figure 1.7: $\exp_2(x)$
Figure 1.8: $\log_2(x)$
The Euler number e as base
There is one number which is preferable as base over the others. This
irrational number is called the Euler number (named after Leonhard Euler)
and denoted by $e$, and $e \approx 2.718281828$. We will define it precisely later.
Definition 1.35. The exponential function is the exponential function for
the base $e$. It is denoted by $\exp(x)$ or $e^x$. Its inverse is the natural
logarithm function. It is denoted by $\ln(x)$. So $\exp(x) = \exp_e(x)$ and
$\ln(x) = \log_e(x)$.
Eventually we will see
\[ \exp'(x) = \exp(x) \quad\text{and}\quad \ln'(x) = \frac{1}{x}. \tag{1.19} \]
The derivative of the exponential function is the exponential function, and
the derivative of the natural logarithm function is 1/x.
Other Bases
Finally, let us relate the exponential and logarithm functions for different
bases to those with base $e$.
Theorem 1.36. For any positive number $a$ ($a \ne 1$),
\[ a^x = e^{x \ln a} \quad\text{and}\quad \log_a x = \frac{\ln x}{\ln a}. \]
These identities follow from the exponential laws and the laws of loga-
rithms.
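The change of base identities in Theorem 1.36 are easy to try out numerically. The sketch below uses only the base $e$ functions of Python's standard `math` module; the helper names `exp_base` and `log_base` are made up for this illustration.

```python
import math

def exp_base(a, x):
    """a**x computed as e**(x ln a); requires a > 0."""
    return math.exp(x * math.log(a))

def log_base(a, y):
    """log_a(y) computed as ln(y) / ln(a); requires a > 0, a != 1, y > 0."""
    return math.log(y) / math.log(a)

if __name__ == "__main__":
    print(exp_base(2.0, 10.0))                   # close to 2**10
    print(log_base(2.0, 8.0))                    # close to 3
    print(exp_base(0.5, log_base(0.5, 7.0)))     # close to 7, the two are inverses
```

The results agree with the exact values only up to floating point rounding, so comparisons should use a small tolerance.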
1.11 Differentiability on Closed Intervals
In Definition 1.22 we defined what it means that a function is differentiable
on an open set. There are situations in which one would like to apply the
notion of differentiability to functions with other kinds of domains. Let us
formalize the idea of extending functions.
Definition 1.37. Suppose that I and J are subsets of the real line R and
I ⊆ J, that I is the domain of a function f, and that J is the domain of
a function F. We call F an extension of f if it agrees with f on I, i.e.,
F(x) = f(x) for all x ∈ I.
Definition 1.38. A function $f$ is said to be differentiable on a subset $I$
of $\mathbb{R}$ if it extends to a differentiable function $F$ on an open set. We set
$f'(x) = F'(x)$ for all $x \in I$.
Without some restrictions on I, a function may be differentiable without
the derivative being well defined. The least technical and for our purposes
sufficient solution is captured in
Proposition 1.39. Suppose the function $f$ is defined on an interval $I$, the
interval is neither empty nor a single point, and $f$ extends to a differentiable
function $F$ on an open interval containing $I$. Then $f'(x) = F'(x)$ is uniquely
determined for all $x \in I$.
We are mostly concerned with defining differentiability for functions
whose domain is a closed interval [a, b], where a < b. Some authors use
one-sided limits and one-sided derivatives to contemplate derivatives at the
end points of the interval. Our discussion is less painful, and it lends itself
more to generalizations in higher dimensions.
Let us discuss two examples. The function $f(x) = x^2$ with domain $[0, 1]$
is differentiable. It extends to the differentiable function $F(x) = x^2$ with
the open set $(-\infty, \infty)$ as its domain. In contrast, the function
$g(x) = \sqrt{x}$ is not differentiable on the interval $[0, \infty)$. The only
sensible candidate for the tangent line to the graph of $g(x)$ at the point
$(0, 0)$ is a vertical line. The slope of this line is not a real number and we
do not have a derivative. (The function $g(x)$ is differentiable if we use
$(0, \infty)$ as domain.)
1.12 Other Notations for the Derivative
There are different notations for the derivative of a function. Physicists will
indicate a derivative with respect to time by a dot. E.g., if $x$ is a function
of time, then they will write $\dot{x}(t)$ instead of $x'(t)$. Leibnitz’ notation
for the derivative of a function $f$ of a variable $x$ is $\frac{df}{dx}$. We will
use it frequently. Expressing the derivatives of the exponential and natural
logarithm functions this way (see (1.19)) we have:
\[ \text{If } y(x) = e^x, \text{ then } \frac{dy}{dx} = y = e^x, \text{ and if } y(x) = \ln x, \text{ then } \frac{dy}{dx} = \frac{1}{x}. \]
This notation is not always specific enough. The expression $dy/dx$ stands
for the derivative of $y$ with respect to $x$, and that is a function. The
expression does not tell where $dy/dx$ is evaluated. To be specific about this
aspect, it makes sense to write (compare Example 1.31):
\[ \text{If } y(x) = \sin x, \text{ then } \frac{dy}{dx}(x) = \cos x. \]
In this notation $x$ plays two roles. It is the name of the variable of $y$ as
well as the name of the variable of the derivative of $y$. This is acceptable
because it won’t lead to confusion. Instead of $\frac{df}{dx}(x)$ we also write
$\frac{d}{dx} f(x)$. This is particularly convenient if $f$ stands for a larger
expression as in
\[ \frac{d}{dx} \sin x = \cos x \quad\text{or}\quad \frac{d}{dx} e^x = e^x. \]
1.13 Rules of Differentiation
We discuss formulas for calculating the derivative of a composite function
from the derivatives of its constituents. These formulas, together with the
knowledge of the derivatives of some basic functions, turn the process of
differentiation for many functions into an algorithm, a rather mechanical
process. You can even do it on a computer, which means that no “understanding”
is required. You are expected to learn the basic rules, be able to
apply them accurately, and practice many examples. In the last section of this
chapter we summarize the computational results of this section. We collect
the rules established in this section and tabulate the derivatives of many of
the important functions which we considered.
1.13.1 Linearity of the Derivative
Differentiation is compatible with addition of functions and multiplication
with a constant. In a more mathematical language one says that differen-
tiation is linear. Let f and g be functions, and assume that both of them
are differentiable at x. Let c be a real number. Then f + g and cf are
differentiable at x and their derivatives are given by
\[ (f + g)'(x) = f'(x) + g'(x) \quad\text{and}\quad (cf)'(x) = c f'(x). \tag{1.20} \]
In Leibnitz’ notation this reads
\[ \frac{d}{dx}(f + g)(x) = \frac{df}{dx}(x) + \frac{dg}{dx}(x) \quad\text{and}\quad \frac{d}{dx}(cf)(x) = c \frac{df}{dx}(x). \tag{1.21} \]
In words, the derivative of a sum of functions is the sum of the derivatives,
and the derivative of a multiple of a function is the multiple of the derivative.
Example 1.40. Differentiate
\[ h(x) = x^2 + 3e^x. \]
Solution: Set $f(x) = x^2$, $g(x) = e^x$ and $c = 3$. Then $h(x) = f(x) + 3g(x)$.
Previously we found that $f'(x) = 2x$ and that $g'(x) = e^x$, see (1.19). We
conclude that
\[ h'(x) = \frac{dh}{dx}(x) = 2x + 3e^x. \] ♦
Example 1.41. Differentiate $\log_a x$, the logarithm function for an
arbitrary positive base $a$, $a \ne 1$.
Solution: Recall that $\log_a x = \frac{\ln x}{\ln a}$, see Theorem 1.36. In this sense
$\log_a x = c f(x)$ where $c = 1/\ln a$ and $f(x) = \ln x$. We stated previously that
$\ln' x = 1/x$, see (1.19). Using the linearity of the derivative, we find
\[ \log_a' x = \frac{d}{dx}\left(\frac{\ln x}{\ln a}\right) = \frac{1}{\ln a} \ln' x = \frac{1}{\ln a} \times \frac{1}{x} = \frac{1}{x \ln a}. \] ♦
Suppose $f$ and $g$ are defined and differentiable on an open set. Thinking of $f$
and $g$ more as functions, and not so much as functions evaluated at a point,
we may omit $(x)$ from the notation. Then the differentiation rules are
\[ (f + g)' = f' + g' \quad\text{or}\quad \frac{d}{dx}(f + g) = \frac{df}{dx} + \frac{dg}{dx} \tag{1.22} \]
and
\[ (cf)' = c f' \quad\text{or}\quad \frac{d}{dx}(cf) = c \frac{df}{dx}. \tag{1.23} \]
Example 1.42. Find the derivative of an arbitrary polynomial.
Solution: A polynomial is a finite sum of multiples of non-negative
powers of the variable, i.e., a function of the form
\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \]
where the $a_i$ are constants. Using Example 1.28 and the linearity of the
derivative we see right away that
\[ f'(x) = n a_n x^{n-1} + (n - 1) a_{n-1} x^{n-2} + \cdots + a_1. \]
Here is a specific example, a special case of the formula which we just
derived.
\[ \text{If } f(x) = 4x^5 - 3x^2 + 4x + 5, \text{ then } f'(x) = 20x^4 - 6x + 4. \] ♦
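The polynomial rule is mechanical enough to write down as a short program. The following sketch represents a polynomial by its list of coefficients in ascending order, $[a_0, a_1, \dots, a_n]$; this representation and the function names are assumptions made for the illustration.

```python
def poly_derivative(coeffs):
    """Map the coefficients [a_0, a_1, ..., a_n] to those of the derivative.

    The term a_k x^k contributes k * a_k x^(k-1); the constant term drops out.
    """
    return [k * a for k, a in enumerate(coeffs)][1:]

def poly_eval(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n by Horner's scheme."""
    value = 0
    for a in reversed(coeffs):
        value = value * x + a
    return value

if __name__ == "__main__":
    f = [5, 4, -3, 0, 0, 4]           # 4x^5 - 3x^2 + 4x + 5
    df = poly_derivative(f)           # coefficients of 20x^4 - 6x + 4
    print(df, poly_eval(df, 2))
```

For the specific polynomial of Example 1.42 the derivative coefficients come out as `[4, -6, 0, 0, 20]`, which matches $20x^4 - 6x + 4$.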
1.13.2 Product and Quotient Rules
Next we state the product and the quotient rule. They allow us to calculate
the derivatives of products and quotients of functions. Again, let f and g
be functions, and assume that both of them are differentiable at x. For the
quotient rule assume in addition that g(x) ≠ 0. Then the product fg and
the quotient f/g are differentiable at x and their derivatives are given by
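\[ (fg)'(x) = f'(x) g(x) + f(x) g'(x) \tag{1.24} \]
\[ \left(\frac{f}{g}\right)'(x) = \frac{f'(x) g(x) - f(x) g'(x)}{[g(x)]^2}. \tag{1.25} \]
In Leibnitz’ notation these formulas become
\[ \frac{d}{dx}(fg)(x) = \frac{df}{dx}(x) g(x) + f(x) \frac{dg}{dx}(x) \tag{1.26} \]
\[ \frac{d}{dx}\left(\frac{f}{g}\right)(x) = \frac{\frac{df}{dx}(x) g(x) - f(x) \frac{dg}{dx}(x)}{[g(x)]^2}. \tag{1.27} \]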
Example 1.43. Differentiate the function $h(x) = x^2 \ln x$.
Solution: Write $h(x) = f(x) g(x)$ with $f(x) = x^2$ and $g(x) = \ln x$.
Then $f'(x) = 2x$ and $g'(x) = 1/x$, see (1.19). Putting this into the product
formula yields
\[ h'(x) = f'(x) g(x) + f(x) g'(x) = 2x \ln x + x^2 \cdot \frac{1}{x} = x(2 \ln x + 1). \] ♦
Example 1.44. Find the derivative of the rational function
\[ r(x) = \frac{x^2 - 5}{x^3 + 1}. \]
Solution: We set $p(x) = x^2 - 5$ and $q(x) = x^3 + 1$. Then $p'(x) = 2x$
and $q'(x) = 3x^2$. According to the quotient rule
\[ r'(x) = \frac{2x(x^3 + 1) - (x^2 - 5) 3x^2}{(x^3 + 1)^2} = \frac{-x^4 + 15x^2 + 2x}{(x^3 + 1)^2}. \] ♦
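A quick way to catch algebra slips in a computation like Example 1.44 is to compare the formula with a symmetric difference quotient. The sketch below does this; the helper `central_difference` and the step size `h` are choices made for this illustration, and the comparison is only approximate.

```python
def r(x):
    """The rational function of Example 1.44; requires x != -1."""
    return (x**2 - 5) / (x**3 + 1)

def r_prime(x):
    """The derivative obtained from the quotient rule; requires x != -1."""
    return (-x**4 + 15 * x**2 + 2 * x) / (x**3 + 1) ** 2

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

if __name__ == "__main__":
    for x in (0.5, 2.0, -3.0):
        print(x, r_prime(x), central_difference(r, x))
```

The two printed columns should agree to several decimal places at points away from $x = -1$, where the function is not defined.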
Example 1.45. The formula
\[ \frac{d}{dx} x^n = n x^{n-1} \]
holds for all integer powers $n$. If $n \le -1$, then the domain of the function
is $\mathbb{R} \setminus \{0\}$, the real line with the origin removed.
Solution: We verified this formula for $n \ge 0$ in Example 1.28. Let $n$ be
a negative integer and $m = -n$. Then, by the quotient rule,
\[ \frac{d}{dx} x^n = \frac{d}{dx}\left(\frac{1}{x^m}\right) = \frac{0 \cdot x^m - 1 \cdot m x^{m-1}}{x^{2m}} = \frac{-m}{x^{m+1}} = n x^{n-1}. \] ♦
Example 1.46. Find the derivative of
\[ f(x) = \tan x. \]
Solution: We express $f(x)$ as a quotient of two functions, $f(x) =
\sin x / \cos x$, and apply the quotient rule. Use that $\sin' x = \cos x$ (see
Example 1.31) and $\cos' x = -\sin x$ (see Exercise 2 on page 17). We find
\[ \tan' x = \frac{\sin' x \cos x - \sin x \cos' x}{\cos^2 x} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x} = \sec^2 x. \tag{1.28} \]
Some books and computer programs will give this result in a different form.
Based on the relevant trigonometric identity, they write
\[ \tan' x = 1 + \tan^2 x. \tag{1.29} \]
That draws our attention to the fact that the function $f(x) = \tan x$ satisfies
the differential equation
\[ f'(x) = 1 + f^2(x). \] ♦
Example 1.47. Differentiate the function
\[ f(x) = \sec x. \]
Solution: We write the function as a quotient: $f(x) = 1/\cos x$. The
function is defined for all $x$ for which $\cos x \ne 0$, i.e., for $x$ not of the
form $(n + 1/2)\pi$, where $n$ is an integer. We apply the quotient rule, using that
$\cos' x = -\sin x$ (see Exercise 2 on page 17), and that the derivative of a
constant vanishes. We find
\[ \sec' x = \frac{\sin x}{\cos^2 x} = \frac{\sin x}{\cos x} \cdot \frac{1}{\cos x} = \tan x \sec x. \tag{1.30} \] ♦
Suppose $f$ and $g$ are defined and differentiable on an open set. Thinking
of $f$ and $g$ again more as functions, and not so much as functions evaluated
at a point, we may once more omit $(x)$ from the notation. Then the product
rule and quotient rule become
\[ (fg)' = f' g + f g' \quad\text{or}\quad \frac{d}{dx}(fg) = \frac{df}{dx} g + f \frac{dg}{dx} \tag{1.31} \]
and, wherever $g(x) \ne 0$,
\[ \left(\frac{f}{g}\right)' = \frac{f' g - f g'}{g^2} \quad\text{or}\quad \frac{d}{dx}\left(\frac{f}{g}\right) = \frac{\frac{df}{dx} g - f \frac{dg}{dx}}{g^2}. \tag{1.32} \]
Here $g^2$ is the square of the function $g$, given by $g^2(x) = [g(x)]^2$.
1.13.3 Chain Rule
Let f and g be functions, and suppose that the domain of f contains the
range of g, so that the composition (f ◦ g)(x) = f(g(x)) is defined for all x
in the domain of g. Set h = f ◦ g, so that h(x) = f(g(x)). The chain rule
says that whenever g is differentiable at x and f is differentiable at g(x),
then $h$ is differentiable at $x$ and
\[ h'(x) = (f \circ g)'(x) = f'(g(x)) g'(x). \tag{1.33} \]
In Leibnitz’ notation the chain rule says that
\[ \frac{dh}{dx}(x) = \frac{d}{dx} f(g(x)) = \frac{df}{du}(g(x)) \frac{dg}{dx}(x). \tag{1.34} \]
Example 1.48. Differentiate the function
\[ h(x) = e^{x^2 + 1}. \]
Solution: We write $h = f \circ g$ as a composition of two functions, with
$g(x) = x^2 + 1$ and $f(u) = e^u$. Remember that $f'(u) = f(u) = e^u$ and
$g'(x) = 2x$. In particular, $f'(g(x)) = e^{x^2 + 1}$. The chain rule tells us that
\[ h'(x) = f'(g(x)) g'(x) = 2x e^{x^2 + 1}. \]
In the last expression we reversed the order of the factors to make the
expression more readable. ♦
Example 1.49. Let $u(x)$ be a differentiable function.
\[ \text{If } f(x) = e^{u(x)} \text{ then } f'(x) = u'(x) e^{u(x)}. \]
Here are some specific examples:
\[ \frac{d}{dx} e^{2x + 5} = 2 e^{2x + 5} \]
\[ \frac{d}{dx} e^{\sin x} = \cos x \, e^{\sin x} \]
\[ \frac{d}{dx} e^{\tan x} = \sec^2 x \, e^{\tan x}. \] ♦
Example 1.50. Combining Example 1.45 with the chain rule we find
\[ \frac{d}{dx} u^n(x) = n u'(x) u^{n-1}(x) \]
for all integers $n$, assuming only that $u$ is differentiable at $x$ and
$u(x) \ne 0$ if $n \le -1$.
Solution: Set $g(x) = u(x)$ and $f(u) = u^n$. Then
\[ h(x) = f(g(x)) = u^n(x). \]
According to Example 1.45, $f'(u) = n u^{n-1}$. The chain rule tells us now that
\[ h'(x) = f'(g(x)) g'(x) = n (g(x))^{n-1} g'(x) = n u'(x) u^{n-1}(x). \]
We reordered the factors so that the expression is more readable.
To be specific, here are concrete examples:
\[ \frac{d}{dx} (3x + 5)^8 = 8(3x + 5)^{8-1} \cdot 3 = 24(3x + 5)^7 \]
\[ \frac{d}{dx} (x^2 + 1)^{25} = 25(x^2 + 1)^{24} \cdot 2x = 50x(x^2 + 1)^{24} \]
\[ \frac{d}{dx} \tan^3 x = 3 \sec^2 x \tan^2 x \]
\[ \frac{d}{dx} \cos^2 x = 2 \cos x (-\sin x) = -2 \cos x \sin x \]
\[ \frac{d}{dx} \sec^5 x = 5 \sec^4 x \sec x \tan x = 5 \sec^5 x \tan x. \] ♦
Example 1.51. Differentiate the function $\ln |u|$ for $u \ne 0$.
Solution: We asserted that $\ln' u = 1/u$ for positive values of $u$, see
(1.19). So, suppose that $u < 0$. Then $u = -|u|$ and $\ln |u| = \ln(-u)$. The
chain rule tells us that, for $u < 0$,
\[ \frac{d}{du} \ln |u| = \frac{1}{|u|} \frac{d}{du}(-u) = (-1) \frac{1}{-u} = \frac{1}{u}. \]
This means that for all non-zero $u$
\[ \frac{d}{du} \ln |u| = \frac{1}{u}. \tag{1.35} \] ♦
More generally, invoking the chain rule,
\[ \frac{d}{dx} \ln |u(x)| = \frac{u'(x)}{u(x)}, \tag{1.36} \]
assuming that $u$ is differentiable and nowhere zero on its domain. E.g.,
\[ \frac{d}{dx} \ln |x^2 - 4| = \frac{2x}{x^2 - 4} \]
for all $x \ne \pm 2$.
We push matters a bit further. We use the formulae for differentiating
the exponential and natural logarithm functions. Eventually we will verify
them independently.
Consider a function u which is differentiable and nowhere zero on its
domain and q any real number. Then
\[ \text{If } f(x) = |u(x)|^q \text{ then } f'(x) = q \frac{u'(x)}{u(x)} |u(x)|^q. \tag{1.37} \]
The assertion follows from (1.36), Example 1.49 and the exponential laws.
\[ f'(x) = \frac{d}{dx} e^{\ln f(x)} = \frac{d}{dx} e^{\ln(|u(x)|^q)} = \frac{d}{dx} e^{q \ln |u(x)|} = \left(\frac{d}{dx}(q \ln |u(x)|)\right) e^{q \ln |u(x)|} = q \frac{u'(x)}{u(x)} |u(x)|^q. \]
Here is a concrete example:
\[ \frac{d}{dx} \left| \frac{1}{2} - \sin x \right|^5 = 5 \, \frac{-\cos x}{\frac{1}{2} - \sin x} \left| \frac{1}{2} - \sin x \right|^5 \]
whenever $\sin x \ne 1/2$. Specifically, we have to exclude all $x$ of the form
$\frac{\pi}{6} + 2n\pi$ and $\frac{5\pi}{6} + 2n\pi$, where $n$ is an arbitrary integer.
For differentiable functions which are everywhere positive on their domain
and any real number $q$ the differentiation formula in (1.37) specializes
to
\[ \frac{d}{dx} u^q(x) = q u'(x) u^{q-1}(x). \tag{1.38} \]
For example:
\[ \frac{d}{dx} (\sin x)^{1/2} = \frac{\cos x}{2 \sqrt{\sin x}} \]
for $x \in (0, \pi)$ and
\[ \frac{d}{dx} (\sec^2 x + 5)^{\pi} = 2\pi \sec^2 x \tan x \, (\sec^2 x + 5)^{\pi - 1} \]
for $x \in (-\pi/2, \pi/2)$.
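Using the tricks from above, we get the following derivatives:
\[ \frac{d}{dx} a^x = a^x \ln a \quad \text{(Assume } a > 0\text{. Hint: } a^x = e^{x \ln a}\text{)} \]
\[ \frac{d}{dx} x^x = (1 + \ln x) x^x \quad \text{(Assume } x > 0,\ x \ne 1\text{. Hint: } x^x = e^{x \ln x}\text{)} \]
\[ \frac{d}{dx} x^{\sin x} = \left(\frac{\sin x}{x} + \cos x \ln x\right) x^{\sin x} \quad \text{(Assume } x \in (0, \pi/4)\text{)}. \]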
To differentiate a composition of more than two differentiable functions
we apply the chain rule repeatedly. E.g.,
\[ \frac{d}{dx} f(g(h(x))) = f'(g(h(x))) \frac{d}{dx} g(h(x)) = f'(g(h(x))) \cdot g'(h(x)) \cdot h'(x). \]
For example
\[ \frac{d}{dx} e^{\sqrt{x^2 + 1}} = e^{\sqrt{x^2 + 1}} \cdot \frac{1}{2\sqrt{x^2 + 1}} \cdot 2x = \frac{x e^{\sqrt{x^2 + 1}}}{\sqrt{x^2 + 1}} \]
\[ \frac{d}{dx} \tan^3(5x^2 - x + 5) = 3 \tan^2(5x^2 - x + 5) \sec^2(5x^2 - x + 5) \cdot (10x - 1) \]
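Results obtained by repeated use of the chain rule are easy to get wrong by a factor, so a numerical comparison is a useful habit. The sketch below checks the two examples above against a symmetric difference quotient. The names `f1`, `f2`, `central_difference` and the step size are assumptions of this illustration, and the sample point for the second function is chosen away from the poles of the tangent.

```python
import math

def f1(x):
    return math.exp(math.sqrt(x**2 + 1))

def f1_prime(x):
    """x e^{sqrt(x^2+1)} / sqrt(x^2+1), as obtained from the chain rule."""
    root = math.sqrt(x**2 + 1)
    return x * math.exp(root) / root

def f2(x):
    return math.tan(5 * x**2 - x + 5) ** 3

def f2_prime(x):
    """3 tan^2(u) sec^2(u) (10x - 1) with u = 5x^2 - x + 5."""
    u = 5 * x**2 - x + 5
    return 3 * math.tan(u) ** 2 / math.cos(u) ** 2 * (10 * x - 1)

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

if __name__ == "__main__":
    print(f1_prime(2.0), central_difference(f1, 2.0))
    print(f2_prime(0.3), central_difference(f2, 0.3))
```

Each line should show two numbers that agree to several significant digits.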
1.13.4 Hyperbolic Functions
The exponential function may be used to define the hyperbolic sine and
cosine.
\[ \sinh x = \frac{1}{2}\left(e^x - e^{-x}\right) \quad\text{and}\quad \cosh x = \frac{1}{2}\left(e^x + e^{-x}\right) \tag{1.39} \]
You are invited to verify that
\[ \cosh^2 x - \sinh^2 x = 1. \]
Conversely, one can show that any point $(u, v)$ on the hyperbola
\[ u^2 - v^2 = 1 \]
can be expressed as $(\pm\cosh x, \sinh x)$ for some $x \in (-\infty, \infty)$. These
observations motivate the attribute ‘hyperbolic’.
It is elementary to compute the derivatives of the hyperbolic functions:
\[ \sinh' x = \cosh x \quad\text{and}\quad \cosh' x = \sinh x. \]
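One may also define other hyperbolic functions
\[ \tanh x = \frac{\sinh x}{\cosh x}, \quad \coth x = \frac{\cosh x}{\sinh x}, \quad \operatorname{sech} x = \frac{1}{\cosh x}, \quad\text{and}\quad \operatorname{csch} x = \frac{1}{\sinh x}. \]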
As a routine application of the rules of differentiation, you may calculate
the derivatives of these functions. There are identities for these hyperbolic
functions, comparable to the identities for the trigonometric functions. You
can find them in any table of mathematical formulas, or you can work them
out yourself.
1.13.5 Derivatives of Inverse Functions
Let us recall. Two functions f and g are said to be inverses of each other
(or each function is the inverse of the other one) if the domain of f is equal
to the range of g, the domain of g is equal to the range of f, and
g(f(x)) = x and f(g(y)) = y (1.40)
for all x in the domain of f and all y in the domain of g. A few essential
properties of inverse functions are listed in
Proposition 1.52. Suppose f and g are inverses of each other.
1. The graph of g is obtained from the graph of f by reflection at the
diagonal.
2. If f is increasing, then so is g. If f is decreasing, then so is g.
3. If f is continuous, then it is monotonic (increasing or decreasing) on
any interval in its domain.
4. If f is continuous and I is an interval in the domain of f, then J =
f(I), the image of I under the map f, is an interval. If I is an open
interval, then J is an open interval.
Some parts of this proposition are elementary, others are consequences
of the intermediate value theorem.
For example, the function $f(x) = \cos x$ maps the interval $[0, \pi]$ to the
interval $[-1, 1]$. The function $f(x) = e^x$ maps the interval $(-\infty, \infty)$ to
the interval $(0, \infty)$. The function $\tan x$ maps the interval $(-\pi/2, \pi/2)$ to
the interval $(-\infty, \infty)$. It is customary to define its inverse $\arctan x$ as a
function from $(-\infty, \infty)$ to $(-\pi/2, \pi/2)$. The function $\sin x$ maps the
interval $[-\pi/2, \pi/2]$ to the interval $[-1, 1]$. Its inverse $\arcsin x$ is
typically used with domain $[-1, 1]$, and its range is $[-\pi/2, \pi/2]$. You see the
graphs of these two functions in Figures 1.9 and 1.10.
Figure 1.9: $\sin x$ on $[-\pi/2, \pi/2]$
Figure 1.10: $\arcsin y$ on $[-1, 1]$
The relation between the derivative of a function and its inverse is spelled
out in our next theorem.
Theorem 1.53. Let $f$ be a differentiable and invertible function which is
defined on an open interval $(a, b)$, and denote the image of $f$ by $(A, B)$.
Denote the inverse of $f$ by $g$. Then $g$ is differentiable at all points
$y \in (A, B)$ for which $f'(g(y)) \ne 0$. For these values of $y$ and for $x$ such
that $f(x) = y$ the derivative is given by:
\[ g'(y) = \frac{1}{f'(g(y))} \quad\text{or}\quad g'(f(x)) = \frac{1}{f'(x)}. \]
Proof. We will not give a formally complete proof of the differentiability
assertion. Still, if the line t(x) is close to the graph of the function f(x) at
the point (x, f(x)) and y = f(x), then its reflection T(x) at the diagonal
is close to the graph of the function g(x) at the point (f(x), x) = (y, g(y)).
We need that T(x) is not vertical, and this is assured by the assumption
that t(x) is not horizontal. With the role of x and y being interchanged,
the slope of t(x) is the reciprocal of the slope of T(x). This provides the
formula for the derivative. Actually, this is also easy to calculate.
By definition we have $f(g(y)) = y$ for all $y \in (A, B)$. Differentiate both
sides of the equation. We find
\[ f'(g(y)) g'(y) = 1 \quad\text{and}\quad g'(y) = \frac{1}{f'(g(y))}, \]
as claimed. If $y = f(x)$, then $g(y) = g(f(x)) = x$, and we obtain the second
version of the formula for the derivative of the inverse of the function:
\[ g'(f(x)) = \frac{1}{f'(x)}. \]
We apply the theorem to find some important derivatives.
Example 1.54. Assume that the natural logarithm function is differentiable
and that $\ln' x = 1/x$, as asserted in (1.19). Show that the exponential
function is differentiable and that
\[ \frac{d}{dy} e^y = e^y. \]
Solution: By definition, the exponential function is the inverse of the
natural logarithm function $\ln$. Set $f(x) = \ln x$ and $g(y) = e^y$ in
Theorem 1.53. We note that $\ln'(x) \ne 0$ for all $x$ in $(0, \infty)$, the domain of
the natural logarithm. The theorem says that the exponential function is
differentiable and provides the formula for the derivative:
\[ \frac{d}{dy} e^y = \frac{1}{\ln'(e^y)} = \frac{1}{1/e^y} = e^y, \]
as claimed. ♦
Example 1.55. Show that the function $g(y) = \arctan y$ (the inverse of
$f(x) = \tan x$) is differentiable, and that
\[ \frac{d}{dy} \arctan y = \frac{1}{1 + y^2}. \]
According to standard conventions we use $(-\infty, \infty)$ as the domain and
$(-\pi/2, \pi/2)$ as the range for $\arctan$.
Solution: The function $f(x) = \tan x$ is differentiable on its entire
domain, and $f'(x) = \sec^2 x$ is nowhere zero. Theorem 1.53 tells us that
$g(y) = \arctan y$ is differentiable on its entire domain $(-\infty, \infty)$. The
theorem also provides us with the formula for the derivative:
\[ \arctan'(y) = \frac{1}{\tan'(\arctan y)} = \frac{1}{\sec^2(\arctan y)} = \cos^2(\arctan y). \]
All we need to do now is to figure out what $\cos^2(\arctan y)$ is. To do this
we draw a triangle in which we identify the available data. We refer to the
notation in Figure 1.11.
Figure 1.11: An informative triangle (vertices $A$ and $B$, angle $u$, sides of length 1 and $y$)
There you see a right triangle; the right angle is at the vertex
$B$. The angle at the vertex $A$ is called $u$. The adjacent side to this angle is
chosen to be of length 1, and the opposing side of length $y$. So, by definition,
$\tan u = y$ and $\arctan y = u$.
By the theorem of Pythagoras, the length of the hypotenuse is $\sqrt{1 + y^2}$.
Then
\[ \cos u = \frac{1}{\sqrt{1 + y^2}} \quad\text{and}\quad \cos^2(\arctan y) = \frac{1}{1 + y^2}. \]
The conclusion is that
\[ \arctan'(y) = \frac{1}{1 + y^2}. \tag{1.41} \]
This is exactly what we claimed. ♦
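Both steps of this example, the identity $\cos^2(\arctan y) = 1/(1 + y^2)$ and the derivative formula (1.41), can be checked numerically. The sketch below uses `math.atan` from Python's standard library; the helper names and the step size are choices made for the illustration.

```python
import math

def arctan_prime(y):
    """The derivative of arctan according to (1.41)."""
    return 1.0 / (1.0 + y * y)

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

if __name__ == "__main__":
    for y in (-3.0, 0.0, 0.5, 10.0):
        triangle_value = math.cos(math.atan(y)) ** 2      # cos^2(arctan y)
        numeric_slope = central_difference(math.atan, y)  # approximate arctan'(y)
        print(y, arctan_prime(y), triangle_value, numeric_slope)
```

In each row the last three numbers should agree up to rounding.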
Combined with the chain rule, and assuming the differentiability of $u(x)$,
we find a slightly more general formula:
\[ \frac{d}{dx} \arctan(u(x)) = \frac{u'(x)}{1 + u^2(x)}. \tag{1.42} \]
For example:
\[ \frac{d}{dx} \arctan(x^2 + 5) = \frac{2x}{1 + (x^2 + 5)^2} \]
\[ \frac{d}{dx} \arctan(\sin x) = \frac{\cos x}{1 + \sin^2 x}. \]
The reader is invited to verify the formulas for the other inverse trigono-
metric functions arcsin x, arccos x, arccot x, and arcsec x as they are given
in Table 1.3 on page 63. For example
Exercise 3. It is customary to think of $\arcsin x$ as a function from $[-1, 1]$
to $[-\pi/2, \pi/2]$. Show that $\arcsin x$ is differentiable on $(-1, 1)$, and that its
derivative is
\[ \frac{d}{dx} \arcsin x = \frac{1}{\sqrt{1 - x^2}}. \]
We may once more improve on this formula. Let $u(x)$ be a differentiable
function which is defined on an open interval, and suppose that $|u(x)| < 1$.
Then, using the chain rule, we find that
\[ \frac{d}{dx} \arcsin(u(x)) = \frac{u'(x)}{\sqrt{1 - u^2(x)}}. \tag{1.43} \]
For example:
\[ \frac{d}{dx} \arcsin(3x) = \frac{3}{\sqrt{1 - 9x^2}} \quad\text{if } x \in (-1/3, 1/3) \]
\[ \frac{d}{dx} \arcsin(x^2) = \frac{2x}{\sqrt{1 - x^4}} \quad\text{if } x \in (-1, 1) \]
1.13.6 Implicit Differentiation
Until now we considered functions which were given explicitly. I.e., we were
given an equation y = f(x), where f(x) is some instruction which assigns a
value to x. The points on the graph of f are the points which satisfy the
equation. Consider the equation
\[ (x^2 + y^2)^2 = x^2 - y^2. \tag{1.44} \]
The solutions of this equation form a curve$^5$ in the plane called a lemniscate,
see Figure 1.12. Parts of this curve look like the graph of a function, such
as the points for which $y \ge 0$. Without solving the equation for $y$, we would
still like to calculate the slope of the curve at one of its points. This process
is called implicit differentiation.
Figure 1.12: Lemniscate
Let us start out with an example which we have studied before.
Example 1.56. The unit circle consists of all points which satisfy the
equation $x^2 + y^2 = 1$. Find the slope of the tangent line to the unit circle at
the point $(1/2, \sqrt{3}/2)$.
Solution: We write $y = y(x)$ to emphasize that $y$ is a function of $x$.
Differentiating both sides of the equation of the circle we get
\[ 2x + 2y \frac{dy}{dx} = 0 \quad\text{or}\quad \frac{dy}{dx} = \frac{-x}{y}. \]
Plugging in the coordinates of the specified point, we find that
\[ \left. \frac{dy}{dx} \right|_{(1/2, \sqrt{3}/2)} = \frac{-1}{\sqrt{3}}. \]
We used a different way to indicate at which point we evaluate the derivative
because we had to specify the $x$ and the $y$ coordinate of the point. ♦
5 We will rely on the reader’s intuitive idea of a curve in the plane.
Example 1.57. Find the slope of the tangent line to the lemniscate
\[ (x^2 + y^2)^2 = x^2 - y^2, \]
and find the coordinates of the points where the tangent line is horizontal.
Solution: You see a picture of the lemniscate in Figure 1.12. As in
Example 1.56, we consider $y$ as a function of $x$ and differentiate both sides
of the equation. We find
\[ 2(x^2 + y^2)\left(2x + 2y \frac{dy}{dx}\right) = 2x - 2y \frac{dy}{dx}. \]
Bring all terms with a factor $dy/dx$ to the left hand side of the equation and
those without to the right hand side, and divide by 2.
\[ \left(2y(x^2 + y^2) + y\right) \frac{dy}{dx} = x\left(1 - 2(x^2 + y^2)\right). \]
Finally we get an explicit expression for $\frac{dy}{dx}$ in terms of $x$ and $y$:
\[ \frac{dy}{dx} = \frac{x(1 - 2(x^2 + y^2))}{2y(x^2 + y^2) + y} = \frac{x(1 - 2(x^2 + y^2))}{y(2(x^2 + y^2) + 1)}. \]
Given any point $(x, y)$ with $y \ne 0$ on the lemniscate, we can plug it into the
expression for $\frac{dy}{dx}$ and we get the slope of the curve at this point.
E.g., the point $(x, y) = \left(\frac{1}{2}, \frac{1}{2}\sqrt{-3 + 2\sqrt{3}}\right)$ is a point on the
lemniscate, and at this point the slope of the tangent line is
\[ \frac{dy}{dx} = \frac{2 - \sqrt{3}}{\sqrt{3} \sqrt{-3 + 2\sqrt{3}}}. \]
This specific calculation takes a bit of arithmetic skill and effort to carry
out.
The tangent line is horizontal whenever $\frac{dy}{dx} = 0$. A quick look at
Figure 1.12 tells us that we may ignore points where $x = 0$ or $y = 0$. That
means that $\frac{dy}{dx} = 0$ whenever
\[ 1 - 2(x^2 + y^2) = 0 \quad\text{or}\quad x^2 + y^2 = \frac{1}{2}. \]
Substitute $x^2 + y^2 = \frac{1}{2}$, and $y^2 = \frac{1}{2} - x^2$ into the equation of the curve.
Then we get an equation in one variable:
\[ \frac{1}{4} = x^2 - \left(\frac{1}{2} - x^2\right) \quad\text{or}\quad x^2 = \frac{3}{8} \text{ and } y^2 = \frac{1}{8}. \]
The points at which the tangent line to the lemniscate is horizontal are
\[ (x, y) = \left(\pm\frac{\sqrt{6}}{4}, \pm\frac{\sqrt{2}}{4}\right) \approx (\pm .6124, \pm .3536). \] ♦
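The outcome of Example 1.57 can be confirmed numerically: the four points should lie on the lemniscate, and the slope formula obtained by implicit differentiation should vanish there. The following sketch does this check; the function names and the tolerance are assumptions of the illustration.

```python
import math

def on_lemniscate(x, y, tol=1e-12):
    """True if (x, y) satisfies (x^2 + y^2)^2 = x^2 - y^2 up to the tolerance."""
    return abs((x**2 + y**2) ** 2 - (x**2 - y**2)) < tol

def slope(x, y):
    """dy/dx from implicit differentiation; requires y != 0."""
    s = x**2 + y**2
    return x * (1 - 2 * s) / (y * (2 * s + 1))

if __name__ == "__main__":
    x0, y0 = math.sqrt(6) / 4, math.sqrt(2) / 4
    for sx in (1, -1):
        for sy in (1, -1):
            # Each of the four sign combinations should give a point on the
            # curve with slope (numerically) zero.
            print(sx * x0, sy * y0, on_lemniscate(sx * x0, sy * y0), slope(sx * x0, sy * y0))
```

The same `slope` function can be evaluated at any other point of the curve with $y \ne 0$, for instance at $(1/2, \frac{1}{2}\sqrt{-3 + 2\sqrt{3}})$.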
Example 1.58. Suppose you drop a circle of radius 1 into a parabola with
the equation $y = 2x^2$. At which points will the circle touch the parabola?$^6$
Figure 1.13: Ball in a Cup.
Solution: You see a picture of the problem in Figure 1.13. The crucial
observation in this example is, that the tangent line to the parabola and the
circle will be the same at the point of contact.
Suppose the coordinates of the center of the circle are $(0, a)$, then its
equation is $x^2 + (y - a)^2 = 1$. Differentiating the equation of the parabola
with respect to $x$, we find that $\frac{dy}{dx} = 4x$. Differentiating the equation of
the circle with respect to $x$, we get
\[ 2x + 2(y - a) \frac{dy}{dx} = 0. \]
Assuming that $\frac{dy}{dx}$ is the same for both curves at the point of contact, we
substitute $\frac{dy}{dx} = 4x$ into the second equation. After some simplifications we
find:
\[ x(1 + 4(y - a)) = 0. \]
The ball is too large to fit into the parabola and touch at $(0, 0)$. So we may
assume that $x \ne 0$. Solving the equation $1 + 4(y - a) = 0$ for $y$, we find
that the $y$ coordinate of the point of contact is $y = a - \frac{1}{4}$. We substitute
this expression into the equation of the circle and find that the $x$ coordinate
of the point of contact is $x = \pm\frac{\sqrt{15}}{4}$. Substituting this into the equation
of the parabola, we find that $y = \frac{15}{8}$ at the point of contact. In summary,
the circle touches the parabola in the points
\[ (x, y) = \left(\pm\frac{\sqrt{15}}{4}, \frac{15}{8}\right). \] ♦
6 More sensibly, drop a ball of radius 1 into a cup whose vertical cross section is the
parabola $y = 2x^2$.
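The result of Example 1.58 can be double checked with a few lines of arithmetic. The sketch below takes the right-hand contact point, recovers the height $a$ of the center from $y = a - \frac{1}{4}$ (so $a = 17/8$, a value implied by the example but not stated in it), and confirms that the point lies on both curves and that the two slopes agree. All variable names are illustrative.

```python
import math

x = math.sqrt(15) / 4          # x coordinate of the right-hand contact point
y = 2 * x**2                   # the point lies on the parabola y = 2x^2
a = y + 0.25                   # height of the center, from y = a - 1/4

parabola_slope = 4 * x                 # dy/dx for the parabola
circle_slope = -x / (y - a)            # from 2x + 2(y - a) dy/dx = 0
on_circle = x**2 + (y - a) ** 2        # should equal the squared radius, 1

if __name__ == "__main__":
    print(y, a, on_circle, parabola_slope, circle_slope)
```

The printed values should show $y = 15/8$, $a = 17/8$, a squared distance of 1 from the center, and two equal slopes, up to rounding.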
Exercise 4. Consider the curve given by the equation
\[ x^3 + y^3 = 1 + 3xy^2. \]
Find the slope of the curve at the point $(x, y) = (2, -1)$.
Exercise 5. Consider the curve given by the equation $x^2 = \sin y$. Find the
slope of the curve at the point with coordinates $x = 1/\sqrt[4]{2}$ and $y = \pi/4$.
Exercise 6. Repeat Example 1.57 with the curve given by the equation
$y^2 - x^2(1 - x^2) = 0$. You find a picture of this Lissajous figure in Figure 1.14.
1.14 Related Rates
Many times you encounter situations in which you have two related variables,
you know at which rate one of them changes, and you would like to know at which
rate the other one changes. In this section we treat such problems.
Example 1.59. Suppose the radius of a ball changes at a rate of 2 cm/min.
At which rate does its volume change when $r = 20$ cm?
Solution: Denote the volume of the ball by $V$ and its radius by $r$. We
use $t$ to denote the time variable. We consider $V$ as a function of $r$ as well
as $t$. The formula for the volume of a ball is $V(r) = \frac{4\pi}{3} r^3$. According to
the chain rule:
\[ \frac{dV}{dt} = \frac{dV}{dr} \frac{dr}{dt} = 4\pi r^2 \frac{dr}{dt}. \]
With $r = 20$ and $\frac{dr}{dt} = 2$ we get $\frac{dV}{dt} = 3200\pi$ cm$^3$/min. This is the rate at
which the volume of the ball changes with respect to time. ♦
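As a sanity check on this related rates computation, one can compare the chain rule formula with a difference quotient of the volume along a concrete motion of the radius. The sketch below assumes, purely for illustration, that the radius grows linearly, $r(t) = 20 + 2t$, so that $r = 20$ and $dr/dt = 2$ at $t = 0$; the function names are made up.

```python
import math

def volume(r):
    """Volume of a ball of radius r."""
    return 4 * math.pi / 3 * r**3

def volume_rate(r, dr_dt):
    """dV/dt = 4 pi r^2 dr/dt, from the chain rule."""
    return 4 * math.pi * r**2 * dr_dt

def radius(t):
    """Hypothetical radius as a function of time: 20 cm at t = 0, growing 2 cm/min."""
    return 20 + 2 * t

if __name__ == "__main__":
    h = 1e-6
    numeric = (volume(radius(h)) - volume(radius(-h))) / (2 * h)
    print(volume_rate(20, 2), numeric)   # both close to 3200 * pi
```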
Figure 1.14: $y^2 - x^2(1 - x^2) = 0$
Example 1.60. Suppose a particle moves on a circle of radius 10 cm and
centered at the origin $(0, 0)$ in the Cartesian plane. At some time the particle
is at the point $(5, 5\sqrt{3})$ and moves downwards at a rate of 3 cm/min. At
which rate does it move in the horizontal direction?
Solution: The equation of the circle is $x^2 + y^2 = 100$. We consider both
variables, $x$ and $y$, as functions of the time variable $t$. Implicit differentiation
of the equation of the circle gives us the equation
\[ 2x \frac{dx}{dt} + 2y \frac{dy}{dt} = 0. \]
In the given situation $x = 5$, $y = 5\sqrt{3}$, and $\frac{dy}{dt} = -3$. We find that
$\frac{dx}{dt} = 3\sqrt{3}$, so that the particle is moving to the right at a rate of
$3\sqrt{3}$ cm/min. ♦
Example 1.61. Pressure ($P$) and volume ($V$) of air at room temperature
are related by the equation$^7$
\[ PV^{1.4} = C. \]
Here $C$ is a constant. At some instant $t_0$ the pressure of the gas is 25 kg/cm$^2$
and the volume is 200 cm$^3$. Find the rate of change of $P$ if the volume
increases at a rate of 10 cm$^3$/min.
7 Boyle-Mariotte described the relation between the pressure and volume of a gas. They
derived the equation $PV^{\gamma} = C$. It is called the adiabatic law. The constant $\gamma$ depends on
the molecular structure of the gas and the temperature. For the purpose of this problem,
we suppose that $\gamma = 1.4$ for air at room temperature.
Solution: We consider $P$ as a function of $V$. Differentiation of the
equation yields
\[ \frac{dP}{dV} V^{1.4} + 1.4 P V^{.4} = 0 \quad\text{or}\quad \frac{dP}{dV} = -\frac{1.4 P}{V}. \]
According to the chain rule
\[ \frac{dP}{dt} = \frac{dP}{dV} \frac{dV}{dt} = -\frac{1.4 P}{V} \frac{dV}{dt}. \]
Substituting the given information we find that the pressure decreases at a
rate of 1.75 kg/cm$^2$ per minute. ♦
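The final substitution in this example is a one-line computation. The sketch below evaluates $\frac{dP}{dt} = -\frac{1.4 P}{V} \frac{dV}{dt}$ for the given data. Since the volume rate is given in cm$^3$/min, the result is in kg/cm$^2$ per minute. The function name and the default value of the exponent are illustrative.

```python
def pressure_rate(P, V, dV_dt, gamma=1.4):
    """dP/dt for a gas obeying P V^gamma = C, from the chain rule."""
    return -gamma * P / V * dV_dt

if __name__ == "__main__":
    # P in kg/cm^2, V in cm^3, dV/dt in cm^3/min
    print(pressure_rate(25, 200, 10))   # negative: the pressure decreases
```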
Example 1.62. The mass $M$ of a particle at velocity $v$, as perceived by an
observer in resting position, is
\[ M(v) = \frac{m}{\sqrt{1 - v^2/c^2}}, \]
where $m$ is the mass at rest and $c$ is the speed of light. This formula is from
Einstein’s special theory of relativity. At which rate is the mass changing
when the particle’s velocity is 90% of the speed of light, and increasing at
$.001c$ per second?
Solution: According to our rules of differentiation
\[ \frac{dM}{dv} = \frac{mvc}{(c^2 - v^2)^{3/2}}. \]
Applying the chain rule and substituting the values, we find
\[ \frac{dM}{dt} = \frac{dM}{dv} \frac{dv}{dt} = \frac{mvc^2}{1000 (c^2 - v^2)^{3/2}} = \frac{9\sqrt{19}\, m}{3610} \approx .010867 m. \]
The perceived mass increases at a rate of approximately 1% of its mass at
rest per second. ♦
Exercise 7. A ladder, 7 m long, is leaning against a wall. Right now the
foot of the ladder is 1 m away from the wall. You are pulling the foot of the
ladder further away from the wall at a rate of .1 m/sec. At which rate is
the top of the ladder sliding down the wall?
1.15 Exponential Growth and Decay
An idealized, but very useful model for population growth is the Malthusian Law
\[A'(t) = aA(t). \tag{1.45}\]
It says that the rate of change of a population is proportional to its size. We denote the proportionality factor by \(a\). We saw that the functions \(A(t) = Ce^{at}\) are solutions of this equation, and it can be shown that on an interval any solution is of this form. We also say that \(A(t)\) grows exponentially and \(a\) is the relative growth rate.
The equation in (1.45) is an example of a differential equation, an equa-
tion which involves a function and its derivatives, and the unknown is a
function.
We may specify the value of \(A\) at some time \(t_0\), say \(A_0 = A(t_0)\). Then we have an initial value problem
\[A'(t) = aA(t) \quad\text{and}\quad A_0 = A(t_0). \tag{1.46}\]
Theorem 1.63. On an interval which contains \(t_0\) the function
\[A(t) = A_0e^{a(t-t_0)}\]
is the unique solution\(^8\) of the initial value problem in (1.46).
The essential aspects of dealing with (1.46) are addressed in
Example 1.64. Suppose the size of a population of bacteria in a laboratory experiment is \(C_1 = 5,000\) at time \(t_1 = 2\) and \(C_2 = 7,000\) at time \(t_2 = 5\). Here time is measured in hours since the beginning of the experiment.
1. Find the relative growth rate \(a\) of the population.
2. Find the formula for the size of the population at any time \(t \ge 0\).
3. Predict the size of the population at time \(t = 10\).
4. Find the time at which the population reaches 8,000.
5. Find the time within which the population doubles\(^9\).
\(^8\) That the function satisfies the differential equation follows from (1.19), which we still need to prove. The uniqueness assertion follows from Proposition 2.9 on page 68.
\(^9\) Note that the doubling time depends only on the relative growth rate \(a\).
Solution: We denote the size of the population at time \(t\) by \(A(t)\). The theorem tells us that \(A(t) = A_0e^{a(t-t_0)}\), where \(A_0 = A(t_0)\).
1. To calculate the relative growth rate \(a\) observe that
\[\frac{C_2}{C_1} = \frac{A_0e^{a(t_2-t_0)}}{A_0e^{a(t_1-t_0)}} = e^{a(t_2-t_1)} \quad\text{and}\quad \ln\left(\frac{C_2}{C_1}\right) = a(t_2 - t_1).\]
We find that
\[a = \frac{\ln C_2 - \ln C_1}{t_2 - t_1} = \frac{\ln 1.4}{3} \approx .11.\]
The population grows at a relative rate of about 11% per hour.
2. and 3. The size of the population at any time \(t \ge 0\) is
\[A(t) = 5000e^{a(t-2)},\]
where \(a\) is as above. Substituting \(t = 10\) we find that \(A(10) \approx 12,264\).
4. Suppose the size of the population reaches 8,000 at time \(t_3\), then
\[8000 = 5000e^{a(t_3-2)} \quad\text{or}\quad \ln(8/5) = a(t_3 - 2) \quad\text{or}\quad t_3 = \frac{\ln(1.6)}{a} + 2 \approx 6.2.\]
The size of the population reaches 8,000 about 6.2 hours into the experiment.
5. Suppose at some time \(t_0\) the size of the population is \(A_0 = A(t_0)\) and \(T\) hours later the size of the population is \(2A_0 = A(t_0 + T)\). Then
\[A(t_0 + T) = A_0e^{aT} = 2A_0 \quad\text{or}\quad e^{aT} = 2 \quad\text{and}\quad aT = \ln 2.\]
Thus the doubling time is \(T = \frac{\ln 2}{a} \approx 6.18\) hours.
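The five parts of Example 1.64 can be reproduced with a few lines of Python. This is only an illustrative sketch, not part of the original text; the helper names `population` and `time_to_reach` are hypothetical, and the code uses the unrounded value of \(a\), so the last digits may differ slightly from hand computations done with \(a \approx .11\).

```python
import math

# Data from Example 1.64: size C1 at time t1 and size C2 at time t2 (hours).
t1, C1 = 2.0, 5000.0
t2, C2 = 5.0, 7000.0

# Part 1: relative growth rate a = (ln C2 - ln C1) / (t2 - t1).
a = (math.log(C2) - math.log(C1)) / (t2 - t1)


def population(t):
    """Part 2: A(t) = C1 * exp(a * (t - t1)), the solution through (t1, C1)."""
    return C1 * math.exp(a * (t - t1))


def time_to_reach(level):
    """Part 4: solve level = C1 * exp(a * (t - t1)) for t."""
    return t1 + math.log(level / C1) / a


# Part 5: the doubling time depends only on a.
doubling_time = math.log(2) / a

print(round(a, 4), round(population(10.0)),
      round(time_to_reach(8000.0), 2), round(doubling_time, 2))
```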
Consider a radioactive substance. Experiments have shown that the
rate at which radioactive decays occur is proportional to the amount of
radioactive material present. This rate is proportional to the rate at which
the amount of the material decreases. Suppose t denotes time and A(t) the
amount of radioactive substance at time \(t\). The observation which we just described can be expressed as a differential equation
\[A'(t) = -kA(t). \tag{1.47}\]
The minus sign in the equation is included so that \(k\) will be positive. The half-life \(T\) of a radioactive substance is the time within which half of it decays. As in the computation of the doubling time in the previous example, one finds
\[T = \frac{\ln 2}{k}. \tag{1.48}\]
In the late 1940s Willard Libby invented (or discovered) the method of
carbon-14 dating. He was awarded the Nobel Prize for it. In brief, the idea is
as follows. Carbon-14 occurs naturally in the atmosphere, and the amount
is believed to have been essentially constant for a long time (until recent
nuclear testing). All living organisms absorb it. Within a living organism
there is an equilibrium. The amount which is absorbed equals the amount
which decays. The level of the equilibrium is characteristic for the organism,
or a part thereof (e.g. wood from an oak or a human bone). After death
no more carbon-14 is absorbed, and the carbon-14 which was present at the
time of death decays. The half-life of carbon-14 has been determined to be
about 5568 years. For many organisms one also knows how many carbon-14
decays to expect at the time of death. Measuring the number of decays in
a dead organism allows us to determine the time of death, approximately.
We explain the process in a numerical example.
Example 1.65. Suppose we measure 6.68 carbon-14 decays per minute and
gram in a certain kind of wood at the time of death of the tree. Suppose
dead wood of the same kind shows 1.8 decays per minute and gram. How
long ago did the tree die?
Solution: Let \(t_0 = 0\) be the time of death of the tree, and \(t_1\) the present time, measured in years. The number of decays to be expected \(t\) years after death is
\[A(t) = 6.68e^{-\frac{\ln 2}{5568}t}.\]
We have that \(A(t_1) = 1.8\). From this we calculate:
\[\ln\frac{1.8}{6.68} = -\frac{\ln 2}{5568}t_1 \quad\text{or}\quad t_1 = -\frac{5568}{\ln 2}\ln\left(\frac{1.8}{6.68}\right) \approx 10,534. \tag{1.49}\]
The tree died approximately 10,500 years ago.
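The dating computation in (1.49) is easy to repeat for other measurements. The sketch below is an illustration added here, not the author’s code; `age_from_decay_rate` is a hypothetical helper name, and the half-life of 5568 years is the value quoted in the text.

```python
import math

HALF_LIFE_YEARS = 5568.0  # half-life of carbon-14 as quoted in the text


def age_from_decay_rate(rate_now, rate_at_death, half_life=HALF_LIFE_YEARS):
    """Solve rate_now = rate_at_death * exp(-k * t) for t, with k = ln 2 / half_life."""
    k = math.log(2) / half_life
    return -math.log(rate_now / rate_at_death) / k


# Example 1.65: 6.68 decays per minute and gram at death, 1.8 measured now.
age = age_from_decay_rate(1.8, 6.68)
print(round(age))
```

A sample that still shows half of the original decay rate comes out exactly one half-life old, which is a quick sanity check of the helper.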
1.16 More Exponential Growth and Decay
More generally than in (1.45), consider the differential equation
\[f'(t) = af(t) + b, \tag{1.50}\]
where \(a\) and \(b\) are constants, and \(a \neq 0\). A time independent solution (steady state solution) of this equation is \(f(t) = -b/a\).
Theorem 1.66. Functions of the form
\[f(t) = ce^{at} - \frac{b}{a}\]
are solutions of the differential equation in (1.50). Here \(c\) denotes an arbitrary constant. On an interval every solution of (1.50) is of this form.
We obtain a unique solution if we add an initial condition to the differential equation in (1.50).
Theorem 1.67. On an interval which contains \(t_0\), the function
\[f(t) = \left(y_0 + \frac{b}{a}\right)e^{a(t-t_0)} - \frac{b}{a}\]
is the unique solution of the initial value problem
\[f'(t) = af(t) + b \quad\text{and}\quad f(t_0) = y_0.\]
Remark 2. It is not hard to verify that the given functions are solutions of
the respective problems. The uniqueness assertion is a minor modification
of Proposition 2.9 on page 68.
Let us apply these ideas to solve some problems. The important step is to translate the given information into a mathematical equation. The rest will be routine calculation.
Example 1.68. On graduation day the balance of your student loan is
$15,000. Interest is added at a rate of .5% per month, and you are repaying
the loan at a rate of $ 200.00 per month. Analyze the future of the loan.
Solution: As variable we use time, denoted by t and measured in
months. We set t = 0 at the time of graduation. This is the time at
which you start to repay the loan. Denote the balance of your loan at time
t by B(t). The balance increases at a rate of .005B(t) due to interest being
added and decreases at a rate of $200.00 per month due to payments which
you make. In summary, we have the initial value problem
\[B'(t) = .005B(t) - 200 \quad\text{and}\quad B(0) = 15,000.\]
According to Theorem 1.67 the solution of the initial value problem is
\[B(t) = \left(15,000 + \frac{-200}{.005}\right)e^{.005t} - \frac{-200}{.005} = -25,000e^{.005t} + 40,000.\]
For example, \(B(T) = 0\) if
\[T = \frac{1}{.005}\ln\left(\frac{40}{25}\right) \approx 94.\]
After approximately 94 months (7 years and 10 months) you have repaid the loan. Your total payments were $18,800, so that you paid the principal plus $3,800 in interest. ♦
Example 1.69. You are absorbing a medication at a rate of 3 mg per hour.
(You can keep this rate constant with a skin patch.) The liver metabolizes
the medication at a rate of 4% per hour. Analyze the amount of medication
in your body at any time.
Solution: We use time as independent variable, denote it by t and
measure it in hours. We denote by t = 0 the time when we start taking
the medication. Let A(t) denote the amount of medication in your body,
measured in milligrams. Then A(t) increases at a rate of 3 mg per hour
because you are taking in medication and at the same time A(t) decreases
at a rate of .04A(t) due to your liver metabolizing the medication. We have
the initial value problem
\[A'(t) = -.04A(t) + 3 \quad\text{and}\quad A(0) = 0.\]
The solution of this problem is
\[A(t) = -75e^{-.04t} + 75.\]
For example, after 12 hours there will be about 28.6 milligrams of medication in your body. It will take slightly more than 40 hours before the amount of medication in your body reaches 60 milligrams. The steady state solution of the problem is \(A(t) = 75\). The amount of medication will stabilize at this amount with time. ♦
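Examples 1.68 and 1.69 both apply the formula of Theorem 1.67 with different constants. The following Python sketch, added here for illustration, wraps that formula in a hypothetical helper `linear_ivp_solution` and evaluates it for the loan and for the medication; the variable names are assumptions made for this sketch.

```python
import math


def linear_ivp_solution(a, b, y0, t0=0.0):
    """Return f with f(t) = (y0 + b/a) * exp(a * (t - t0)) - b/a (Theorem 1.67)."""
    def f(t):
        return (y0 + b / a) * math.exp(a * (t - t0)) - b / a
    return f


# Loan of Example 1.68: B'(t) = .005 B(t) - 200, B(0) = 15000.
balance = linear_ivp_solution(a=0.005, b=-200.0, y0=15000.0)

# B(T) = 0 gives T = (1 / .005) * ln(40000 / 25000).
payoff_months = math.log(40000.0 / 25000.0) / 0.005

# Medication of Example 1.69: A'(t) = -.04 A(t) + 3, A(0) = 0.
amount = linear_ivp_solution(a=-0.04, b=3.0, y0=0.0)

print(round(payoff_months, 1), round(amount(12.0), 1))
```

Keeping the formula in one place makes the role of the steady state \(-b/a\) visible: it is 40,000 for the loan and 75 for the medication.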
Example 1.70 (Newton’s Law of Cooling). Suppose you have an ob-
ject whose temperature is different from the temperature of its surround-
ings. With time, the temperature of the object will approach the one of
its surroundings. We discuss how this happens, at least under idealized
circumstances.
Think of a cup of coffee. You stir the coffee gently so that the temperature in the cup remains homogeneous and almost no energy is added through the process of stirring.\(^{10}\) Denote the temperature of the coffee by \(T\). It is a function of time, so that we write \(T(t)\). Newton’s law of cooling says that the rate at which the heat is transferred, and with this the rate of change of temperature of the coffee, is proportional to the temperature difference. If \(K\) is the temperature of the surroundings, then
\[T'(t) = a(T(t) - K) = aT(t) - aK. \tag{1.51}\]
Setting \(b = -aK\), this is the differential equation in (1.50).
Let us work out a numerical example. At time \(t = 0\), just after you poured the coffee into your cup, its temperature is 95 degrees Celsius. Five minutes later the temperature has dropped to 80 degrees, while you stir it slightly and patiently. The room temperature is 25 degrees.
1. Determine the function \(T(t)\).
2. Find \(t_1\), such that \(T(t_1) = 70\) degrees Celsius.
Solution: To apply Theorem 1.67, we set \(t_0 = 0\), \(y_0 = 95\), and \(K = 25\). Note that \(-b/a = K\). Putting all of this into the formula for the solution of the initial value problem, we get that
\[T(t) = (95 - 25)e^{at} + 25 = 70e^{at} + 25.\]
To determine \(a\) we use that
\[T(5) = 80 = 70e^{5a} + 25,\]
and we conclude that \(a = \frac{1}{5}\ln\left(\frac{55}{70}\right) \approx -.0482\). Using these data, Equation (1.51) says that the temperature of the coffee drops at a rate of about .048 degrees per minute for each degree of difference between the temperature of the coffee and the room temperature. Having a numerical value for \(a\) gives us an explicit expression for the temperature \(T\) as a function of \(t\):
\[T(t) = 70e^{-.0482t} + 25.\]
10
The physics of heat transfer changes substantially if you take a solid object, such
as a turkey in the oven. The temperature in the solid object will not be homogeneous,
the outside warms up much faster than the inside. In addition, the specific heat (the
amount of energy needed to increase the temperature of one unit of the material by one
degree) varies. It is different for fat, protein, and bone. Furthermore, the specific heat
is highly temperature dependent for substances like protein. That means, a in (1.51)
depends on the temperature T. All of this leads to a significantly different development
of the temperature inside a turkey as you roast it for your Thanksgiving dinner.
We would like to find the time \(t_1\) for which
\[T(t_1) = 70e^{-.0482t_1} + 25 = 70.\]
Solving the equation for \(t_1\), we find that \(t_1 \approx 9.17\). That means that the temperature drops to 70 degrees approximately 9.17 minutes after pouring the coffee. ♦
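The two steps of Example 1.70, fitting \(a\) from one later measurement and then solving for the time at which a target temperature is reached, can be sketched in Python as follows. The function names are hypothetical and chosen only for this illustration. Note that with the unrounded value of \(a\) the time comes out as about 9.16 minutes; the value 9.17 in the text results from using the rounded \(a = -.0482\).

```python
import math


def fit_cooling_rate(T0, T_later, minutes, K):
    """Solve T_later = (T0 - K) * exp(a * minutes) + K for the constant a."""
    return math.log((T_later - K) / (T0 - K)) / minutes


def temperature(t, T0, K, a):
    """T(t) = (T0 - K) * exp(a * t) + K, the solution of T' = a (T - K), T(0) = T0."""
    return (T0 - K) * math.exp(a * t) + K


def time_to_reach(target, T0, K, a):
    """Solve target = (T0 - K) * exp(a * t) + K for t."""
    return math.log((target - K) / (T0 - K)) / a


# Coffee data: 95 degrees at t = 0, 80 degrees after 5 minutes, room at 25 degrees.
a = fit_cooling_rate(95.0, 80.0, 5.0, 25.0)
t1 = time_to_reach(70.0, 95.0, 25.0, a)
print(round(a, 4), round(t1, 2))
```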
Exercise 8. A chemical factory is located on the banks of a river. Downstream from the factory is a lake, and the river is the only contributor to the lake. Assume that the amount of water carried by the river is the same all year round, and the amount of water in the lake is 10 times the amount of water carried by the river per year. In negotiations with the EPA, the owner has agreed to an acceptable level of 2.5 mg per m\(^3\) of a pollutant in the lake. After a major accident the level has risen to 15 mg per m\(^3\). As a remedy, the factory owner proposes to reduce the emission of pollution so that the level of pollutant in the river is only 1.5 mg per m\(^3\). It is assumed that the pollutant is distributed uniformly in the lake at any time.
1. Let \(P(t)\) denote the amount of pollutant (measured in mg per m\(^3\)) in the lake at time \(t\). Let \(t_0 = 0\) be the time just after the accident and at which the clean-up strategy is implemented. State the initial value problem for \(P(t)\).
2. Find the function \(P(t)\).
3. At which time will the level of pollution be back to 2.5 mg per m\(^3\)?
Exercise 9. The population of an endangered species of birds on Kauai decreases at a relative rate of 25% per year. Currently, at time \(t_0 = 0\), the population is estimated to be 700 birds. A government agency raises the species in captivity and releases birds into the wild at a rate of 80 birds per year. Denote the size of the population at time \(t\) by \(P(t)\), where \(t\) denotes time and is measured in years.
1. State the initial value problem for P(t).
2. Find the function P(t).
3. At which time will the population drop to 500 birds?
4. What is the long term estimate for the population of this species in
the wild?
1.17 The Second and Higher Derivatives
Let \(f(x)\) be a function which is defined on an open set. If the function is differentiable at each point of its domain, then \(f'(x)\) is again a function with the same domain as \(f(x)\). We may ask whether the function \(f'(x)\) is differentiable. Its derivative, wherever it exists, is called the second derivative of \(f\). It is denoted by \(f''(x)\). This process can be iterated. The derivative of the second derivative is called the third derivative, and denoted by \(f'''(x)\), etc. We will make use of the second derivative. Leibniz’s notation for the second derivative of a function \(f(x)\) is \(d^2f/dx^2\). Here is a sample computation in which you are invited to fill in the details:
\[\frac{d^2}{dx^2}e^{\sin x} = \frac{d}{dx}\left(\cos x\, e^{\sin x}\right) = (-\sin x + \cos^2 x)e^{\sin x}.\]
Exercise 10. Find the second derivatives of the following functions:
(1) \(f(x) = 3x^3 + 5x^2\)
(2) \(g(x) = \sin 5x\)
(3) \(h(x) = \sqrt{x^2 + 2}\)
(4) \(i(x) = e^{5x}\)
(5) \(j(x) = \tan x\)
(6) \(k(x) = \cos(x^2)\)
(7) \(l(x) = \ln 2x\)
(8) \(m(x) = \ln(x^2 + 3)\)
(9) \(n(x) = \arctan 3x\)
(10) \(o(x) = \sec(x^3)\)
(11) \(p(x) = \ln^2(x + 4)\)
(12) \(q(x) = e^{\cos x}\)
(13) \(r(x) = \ln(\tan x)\)
(14) \(s(x) = e^{x^2 - 1}\)
(15) \(t(x) = \sin^3 x\).
1.18 Numerical Methods
In this section we introduce some methods for numerical computations.
Their common feature is, that for a differentiable function we do not make
a large error when we use the tangent line to the graph instead of the graph
itself. This rather casual statement will become clearer when you look at
the individual methods.
1.18.1 Approximation by Differentials
Suppose \(x_0\) is an interior point of the domain of a function \(f(x)\) and \(f(x)\) is differentiable at \(x_0\). Assume also that \(f(x_0)\) and \(f'(x_0)\) are known. The method of approximation by differentials provides an approximate value for \(f(x_1)\) if \(x_1\) is near \(x_0\). We use the symbol ‘≈’ to stand for ‘is approximately’. One uses the formula
\[f(x_1) \approx f(x_0) + f'(x_0)(x_1 - x_0). \tag{1.52}\]
On the right hand side in (1.52) we have \(l(x_1)\), the tangent line to the graph of \(f(x)\) at \((x_0, f(x_0))\) evaluated at \(x_1\). In the sense of the definition of the tangent line in Section 1.6, \(f(x_1)\) is close to \(l(x_1)\) for \(x_1\) near \(x_0\).
Example 1.71. Find an approximate value for \(\sqrt[3]{9}\).
Solution: We set \(f(x) = \sqrt[3]{x}\), so we are supposed to find \(f(9)\). Note that
\[f'(x) = \frac{1}{3}x^{-2/3}, \quad f(8) = 2, \quad\text{and}\quad f'(8) = \frac{1}{12}.\]
Formula (1.52), applied with \(x_1 = 9\) and \(x_0 = 8\), says that
\[\sqrt[3]{9} = f(9) \approx 2 + \frac{1}{12}(9 - 8) = \frac{25}{12} \approx 2.0833.\]
Your calculator will give you \(\sqrt[3]{9} \approx 2.0801\). ♦
Example 1.72. Find an approximate value for \(\tan 46^\circ\).
Solution: We carry out the calculation in radian measure. Note that \(46^\circ = 45^\circ + 1^\circ\), and this corresponds to \(\pi/4 + \pi/180\). Use the function \(f(x) = \tan x\). Then \(f'(x) = \sec^2 x\), \(f(\pi/4) = 1\), and \(f'(\pi/4) = 2\). Formula (1.52), applied with \(x_1 = \pi/4 + \pi/180\) and \(x_0 = \pi/4\), says
\[\tan 46^\circ = \tan\left(\frac{\pi}{4} + \frac{\pi}{180}\right) \approx \tan\left(\frac{\pi}{4}\right) + \sec^2\left(\frac{\pi}{4}\right)\left(\frac{\pi}{180}\right) = 1 + \frac{\pi}{90} \approx 1.0349.\]
Your calculator will give you \(\tan 46^\circ \approx 1.0355\). ♦
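Formula (1.52) translates directly into a one-line function. The Python sketch below is an added illustration; `linear_approx` is a hypothetical helper that takes the known values \(f(x_0)\) and \(f'(x_0)\) as inputs, and the script repeats the two approximations of Examples 1.71 and 1.72 and compares them with the values computed by the `math` module.

```python
import math


def linear_approx(f_x0, df_x0, x0, x1):
    """Approximation by differentials: f(x1) ~ f(x0) + f'(x0) * (x1 - x0)."""
    return f_x0 + df_x0 * (x1 - x0)


# Example 1.71: cube root of 9, using f(8) = 2 and f'(8) = 1/12.
cube_root_9 = linear_approx(2.0, 1.0 / 12.0, 8.0, 9.0)

# Example 1.72: tan 46 degrees, using x0 = pi/4 and f'(x) = sec^2 x.
x0 = math.pi / 4
tan_46 = linear_approx(math.tan(x0), 1.0 / math.cos(x0) ** 2, x0, x0 + math.pi / 180)

print(cube_root_9, 9.0 ** (1.0 / 3.0))
print(tan_46, math.tan(math.radians(46.0)))
```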
Exercise 11. Use approximation by differentials to find approximate values for
(1) \(\sqrt[5]{34}\) (2) \(\tan 31^\circ\) (3) \(\ln 1.2\) (4) \(\arctan 1.1\).
In each case, compare your answer with one found on your calculator.
We have been casual in (1.52) insofar as we have not estimated (provided an upper bound for) the error which we make using the right hand side of (1.52) instead of the actual value of the function on the left hand side. The inequality in Definition 1.25 on page 12 provides us with an estimate. According to this slightly more demanding definition, differentiability of the function \(f(x)\) means that there exist numbers \(A\) and \(d > 0\), such that
\[|f(x_1) - [f(x_0) + f'(x_0)(x_1 - x_0)]| \le A(x_1 - x_0)^2\]
whenever \(|x_1 - x_0| < d\). Thus, if we know \(A\) and \(d\), then we can estimate the error as long as \(|x_1 - x_0| < d\).
Example 1.73. Find an approximate value for \(\sin 31^\circ\) and estimate the error.
Solution: Set \(f(x) = \sin x\). Then \(f'(x) = \cos x\), \(f(\pi/6) = 1/2\), and \(f'(\pi/6) = \sqrt{3}/2\). Measuring angles in radians we set \(x_0 = \pi/6\) and \(x_1 = \pi/6 + \pi/180\). Applying the formula in (1.52), we find
\[\sin 31^\circ \approx \sin\frac{\pi}{6} + \frac{\pi}{180}\cos\frac{\pi}{6} = \frac{1}{2}\left(1 + \frac{\sqrt{3}\,\pi}{180}\right) \approx .515115.\]
The calculator will tell you that \(\sin 31^\circ \approx .515038\).
From the computation in Example 1.31 on page 17 we also know that we may use \(A = 1\) and \(d = \pi/4\) in the differentiability estimate. The estimate assures us that the error is at most
\[(x_1 - x_0)^2 = \left(\frac{\pi}{180}\right)^2 \le .000305.\]
Comparison of the actual and approximate values confirms this. ♦
Example 1.74. Use approximation by differentials to find an approximate value of \(\sqrt{10}\) and give an upper bound for the error.
Solution: We use \(f(x) = \sqrt{x}\) and \(x_0 = 9\). Then \(f'(x) = 1/(2\sqrt{x})\), \(f(x_0) = 3\), and \(f'(x_0) = 1/6\). The formula in (1.52) tells us that
\[\sqrt{10} = f(10) \approx f(9) + f'(9)(10 - 9) = 3 + \frac{1}{6} \approx 3.16666.\]
The calculator will give you \(\sqrt{10} \approx 3.16228\).
For the error estimate we may use
\[A = \frac{1}{2(\sqrt{x_0})^3}\]
and any \(d > 0\), see (1.16). The estimate assures us that the error is at most
\[\frac{1}{2(\sqrt{x_0})^3}(x_1 - x_0)^2 = \frac{1}{54}.\]
The actual error is again substantially less than this. ♦
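The error bounds of Examples 1.73 and 1.74 can be compared with the actual errors. The following Python sketch is an added illustration with a hypothetical helper `tangent_error`; the constants \(A = 1\) for the sine and \(A = 1/(2(\sqrt{x_0})^3)\) for the square root are the ones quoted in the two examples.

```python
import math


def tangent_error(f, df, x0, x1):
    """Actual error |f(x1) - [f(x0) + f'(x0) * (x1 - x0)]| of formula (1.52)."""
    approx = f(x0) + df(x0) * (x1 - x0)
    return abs(f(x1) - approx)


# Example 1.73: sin 31 degrees, bound A * (x1 - x0)^2 with A = 1.
step = math.pi / 180
sin_err = tangent_error(math.sin, math.cos, math.pi / 6, math.pi / 6 + step)
sin_bound = 1.0 * step ** 2

# Example 1.74: sqrt(10) from x0 = 9, bound with A = 1 / (2 * sqrt(x0)^3).
sqrt_err = tangent_error(math.sqrt, lambda x: 1.0 / (2.0 * math.sqrt(x)), 9.0, 10.0)
sqrt_bound = 1.0 / (2.0 * math.sqrt(9.0) ** 3) * (10.0 - 9.0) ** 2

print(sin_err, sin_bound)
print(sqrt_err, sqrt_bound)
```

In both cases the printed error is well below the bound, which matches the remark that the actual error is substantially smaller than the estimate.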
Exercise 12. Use approximation by differentials to find approximate values for
(1) \(\cos 28^\circ\) (2) \(\sqrt{26}\) (3) \(\sin 47^\circ\).
In each case, estimate also the maximal error which you may have made by using the method of approximation by differentials.
1.18.2 Newton’s Method
Newton’s method is designed to find the zeros of a function. You have learned how to solve linear and quadratic equations, i.e., finding the zeros of polynomials of degree 1 and 2. More sophisticated methods allow you to find the exact solutions of polynomial equations of degree three and four. For polynomials of degree greater than or equal to 5 and most other functions there are no general formulas for finding their roots exactly.
Newton’s method works as follows. Suppose we want to find a zero of a differentiable function \(f(x)\), i.e., we want to find some \(x\), such that \(f(x) = 0\). Suppose that by some means we know that such an \(x\) exists, and that \(x_0\) is not far from \(x\). Then we set
\[x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}, \quad x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}, \quad x_3 = x_2 - \frac{f(x_2)}{f'(x_2)}, \quad\text{etc.} \tag{1.53}\]
and in general
\[x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \tag{1.54}\]
Geometry of Newton’s Method: Let us give a geometric explanation for the formulas. Given any \(x_0\) at which \(f\) is defined and differentiable, we obtain the tangent line \(l(x)\) to the graph of \(f\) at this point. Then \(x_1\), as given in (1.53), is the point at which \(l(x)\) intersects the \(x\)-axis. Specifically,
\[l(x) = f'(x_0)(x - x_0) + f(x_0), \quad\text{and}\quad l(x_1) = 0 \text{ if } x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}.\]
This means that we accept that the tangent line is close to the graph of
the function, and instead of finding the zero of the function itself, we find
the zero of the tangent line. The process is then iterated.
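Let us calculate \(\sqrt{A}\), i.e., the positive root of the function \(f(x) = x^2 - A\). Then \(f'(x) = 2x\), and
\[x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^2 - A}{2x_n} = \frac{x_n^2 + A}{2x_n} = \frac{1}{2}\left(x_n + \frac{A}{x_n}\right). \tag{1.55}\]
If we use \(A = 3\) and \(x_0 = 2\), then we find
\[x_1 = \frac{1}{2}\left(2 + \frac{3}{2}\right) = \frac{7}{4}, \quad x_2 = \frac{1}{2}\left(\frac{7}{4} + \frac{12}{7}\right) = \frac{97}{56} \quad\text{and}\quad x_3 = \frac{18817}{10864}.\]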
We summarize the computation in Table 1.1. In the first column you find the subscript \(n\). In the following two columns you find the values of \(x_n\), once expressed as a fraction of integers, once in decimal form. In the last column you see the square of \(x_n\). At least \(x_3^2\) is rather close to 3. Your calculator will give you 1.73205080757 as an approximate value of \(\sqrt{3}\). You see that our value for \(x_3\) is rather precise. In fact, if you carry the calculation one step further and find \(x_4\), then the accuracy of this approximation of \(\sqrt{3}\) will exceed the accuracy of most calculators. The numbers in the last column show that we are making rapid progress in finding a good approximation of \(\sqrt{3}\).
n   x_n (fraction)   x_n (decimal)   x_n^2
0   2                2.0000000000    4.0000000000
1   7/4              1.7500000000    3.0625000000
2   97/56            1.7321428571    3.0003188775
3   18817/10864      1.7320508100    3.0000000085
Table 1.1: The Babylonian Method
More than 4000 years ago the Babylonians used the outermost expressions in (1.55)
\[x_{n+1} = \frac{1}{2}\left(x_n + \frac{A}{x_n}\right)\]
to find good approximations of square roots, expressed as rational numbers. We refer to the described procedure as the Babylonian method.
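The iteration \(x_{n+1} = \frac{1}{2}(x_n + A/x_n)\) can be carried out with exact rational arithmetic, which reproduces the fractions of Table 1.1. The Python sketch below is an added illustration; the function name `babylonian` is a hypothetical choice, and the standard library type `fractions.Fraction` is used so that no rounding occurs.

```python
from fractions import Fraction


def babylonian(A, x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_{n+1} = (x_n + A / x_n) / 2."""
    xs = [Fraction(x0)]
    for _ in range(steps):
        x = xs[-1]
        xs.append((x + Fraction(A) / x) / 2)
    return xs


# A = 3 and x_0 = 2, as in Table 1.1.
iterates = babylonian(3, 2, 3)
for n, x in enumerate(iterates):
    # columns: n, x_n as a fraction, x_n as a decimal, x_n squared
    print(n, x, float(x), float(x * x))
```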
Let us consider one more example to illustrate Newton’s method. Find a solution of the equation
\[x\sin x = \cos x.\]
Equivalently, we may say, find a root of the function \(f(x) = x\sin x - \cos x\).
Step 1: Let us make sure that there is a root of the function to be found. Observe that \(f(0) = -1 < 0\) and \(f(\pi/2) = \pi/2 > 0\). The intermediate value theorem tells us that \(f(x)\), as a continuous function, has a root in the interval \((0, \pi/2)\). Let us call this root \(x\).
Step 2: Let us come up with a first guess for a root. Considering the values of \(f\) at the end points of the interval, we guess that \(x_0 = 1\) is not too far away from the root, which we know to exist by Step 1. Actually \(f(1) \approx .3\).
Step 3: Let us improve the guess: With \(f'(x) = 2\sin x + x\cos x\), set
\[x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} \approx .8645.\]
Your calculator will tell you that \(f(x_1) \approx .00874\). You see that \(f(x_1)\) is much closer to zero than \(f(x_0)\), and in this sense we expect that \(x_1\) is much closer to the root \(x\) of \(f(x)\) than \(x_0\). We made progress finding \(x\).
Step 4: Repeat Step 3 and calculate \(x_2, x_3, \dots\). The distance between \(x\) and \(x_n\) will decrease rapidly as \(n\) increases.
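Steps 3 and 4 can be sketched in Python as follows. This is an added illustration under the assumption that the derivative is supplied by hand, here \(f'(x) = 2\sin x + x\cos x\); the helper name `newton` is hypothetical, and the code performs a fixed number of steps without any convergence test.

```python
import math


def newton(f, df, x0, steps):
    """Return [x_0, ..., x_steps] with x_{n+1} = x_n - f(x_n) / f'(x_n), see (1.54)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs


def f(x):
    return x * math.sin(x) - math.cos(x)


def df(x):
    # derivative of x sin x - cos x
    return 2.0 * math.sin(x) + x * math.cos(x)


xs = newton(f, df, 1.0, 4)
for n, x in enumerate(xs):
    print(n, x, f(x))
```

The printed values of \(f(x_n)\) shrink very quickly, which is the rapid decrease described in Step 4.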
We explained Newton’s method because we want to illustrate the power of the concept ‘tangent line.’ A full discussion of Newton’s method requires mathematical tools which are not available to us at this time. In general, many interesting phenomena can occur. Still, the principal problem is as follows. Suppose that \(f(x)\) is a differentiable function, \(f(x) = 0\), and \(x_0\) is given. Suppose that \(x_n\) for \(n \ge 1\) are computed according to (1.54). Do the \(x_n\) tend (converge) to \(x\), and how fast?
For completeness’ sake, we give an answer. Consider an interval \(I = [x - a, x + a]\), and suppose that \(|f'(x)| \ge m\) and \(|f''(x)| \le M\) on \(I\). Suppose \(x_n \in I\) and \(|x - x_n| < \frac{2m}{M}\). A theorem from advanced calculus asserts that
\[|x - x_{n+1}| \le \frac{M}{2m}(x - x_n)^2. \tag{1.56}\]
We illustrate the theorem by applying it to the previous example. Observe that \(f(.8) < 0\) and \(f(.9) > 0\). This tells us that \(x \in [.8, .9]\). Let us set \(a = .2\), so that \(I \subset J = [.6, 1.1]\). On \(J\), and with this also on \(I\), we have that \(|f'(x)| \ge m = 1.5\) and \(|f''(x)| \le M = 2.5\). You are invited to verify these estimates using technology. In (1.56) we use that \(\frac{M}{2m} < 1\). As first guess we used \(x_0 = 1\), so that we know that \(|x - x_0| < .2\). The quoted theorem asserts that \(|x - x_1| < .04\). If we repeat the process, then we see that \(|x - x_2| < .0016\) and \(|x - x_3| < .00000256\). This illustrates that the \(x_n\) approach \(x\) rapidly.
There is one feature of Newton’s method which helps. You may say
that with each iteration you make a fresh start, and in this sense previous
round-off errors don’t carry over.
1.18.3 Euler’s Method
Euler’s method is designed to find, by numerical means, an approximate
solution of the following kind of problem:
Problem 1. Find a function \(y(t)\) which satisfies
\[y' = \frac{dy}{dt} = F(t, y) \quad\text{and}\quad y(t_0) = y_0. \tag{1.57}\]
Here \(F(t, y)\) denotes a given function in two variables, and \(t_0\) and \(y_0\) are given numbers.
The first condition on y in (1.57) is a first order differential equation. It is
an equation which involves a function and its derivative, and the unknown is
the function. The second condition is called an initial condition. It specifies
the value of the function at one point. For short, the problem in (1.57) is
called an initial value problem.
Approach in one step: Suppose you want to find \(y(T)\) for some \(T \neq t_0\). Then you might try the formula
\[y(T) \approx y(t_0) + y'(t_0)(T - t_0) = y_0 + F(t_0, y_0)(T - t_0). \tag{1.58}\]
The tangent line to the graph of \(y\) at \((t_0, y_0)\) is
\[l(t) = y(t_0) + y'(t_0)(t - t_0),\]
so that the middle term in (1.58) is just \(l(T)\). The first, approximate equality in (1.58) expresses the philosophy that the graph of a differentiable function is close to its tangent line, at least as long as \(T\) is close to \(t_0\). To get the second equality in (1.58) we use the differential equation and initial condition in (1.57), which tell us that
\[y'(t_0) = F(t_0, y(t_0)) = F(t_0, y_0).\]
The Logistic Law
The differential equation in our next example is known as the logistic law of population growth. In the equation, \(t\) denotes time and \(y(t)\) the size of a population, which depends on \(t\). The constants \(a\) and \(b\) are called the vital coefficients of the population. The equation was first used in population studies by the Belgian mathematician-biologist Verhulst in 1837. The equation refines the Malthusian law for population growth (see (1.45)).
In the differential equation, the term \(ay\) expresses that population growth is proportional to the size of the population. In addition, the members of the population meet and compete for food and living space. The probability of this happening is proportional to \(y^2\), so that it is assumed that population growth is reduced by a term which is proportional to \(y^2\).
Example 1.75. Consider the initial value problem:
\[\frac{dy}{dt} = ay - by^2 \quad\text{and}\quad y(t_0) = y_0, \tag{1.59}\]
where \(a\) and \(b\) are given constants. Find an approximate value for \(y(T)\).
Remark 3. An exact solution of the initial value problem in (1.59) is given by the equation
\[y(t) = \frac{ay_0}{by_0 + (a - by_0)e^{-a(t-t_0)}} \tag{1.60}\]
This is not the time to derive this exact solution, though you are invited to verify that it satisfies (1.59). We are providing the exact solution, so that we can see how well our approximate values match it.
Solution: Setting \(F(t, y) = ay - by^2\), you see that the differential equation in this example is a special case of the one in (1.57). According to the formula in (1.58) we find
\[y(T) \approx y_0 + (ay_0 - by_0^2)(T - t_0). \tag{1.61}\]
We expect a close approximation only for \(T\) close to \(t_0\). ♦
Let us be even more specific and give a numerical example.
Example 1.76. Consider the initial value problem
\[\frac{dy}{dt} = \frac{1}{10}y - \frac{1}{10000}y^2 \quad\text{and}\quad y(0) = 300. \tag{1.62}\]
Find approximate values for \(y(1)\) and \(y(10)\).
Solution: Substituting \(a = 1/10\), \(b = 1/10000\), \(t_0 = 0\), and \(y_0 = 300\) into the formula in (1.61), we find that
\[y(1) \approx 300 + \left(\frac{300}{10} - \frac{300^2}{10000}\right)(1 - 0) = 321.\]
According to the exact solution in (1.60), we find that
\[y(t) = \frac{3000}{3 + 7e^{-t/10}}.\]
Substituting \(t = 1\), we find the exact value \(y(1) \approx 321.4\); this number is rounded off. So, our approximate value is close.
For \(T = 10\) the formula suggests that \(y(10) \approx 510\). According to the exact solution for this initial value problem, \(y(10) \approx 538.1\). For this larger value of \(T\), the formula in (1.61) gives us a less satisfactory result. ♦
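The comparison in Example 1.76 between the one-step formula (1.61) and the exact solution (1.60) can be repeated for any \(T\). The Python sketch below is an added illustration; the function names `one_step` and `exact` are hypothetical, and the default constants are those of the example.

```python
import math


def one_step(y0, t0, T, a=0.1, b=0.0001):
    """Formula (1.61): y(T) ~ y0 + (a*y0 - b*y0^2) * (T - t0)."""
    return y0 + (a * y0 - b * y0 * y0) * (T - t0)


def exact(t, y0, t0=0.0, a=0.1, b=0.0001):
    """Formula (1.60): exact solution of dy/dt = a*y - b*y^2, y(t0) = y0."""
    return a * y0 / (b * y0 + (a - b * y0) * math.exp(-a * (t - t0)))


for T in (1.0, 10.0):
    print(T, one_step(300.0, 0.0, T), exact(T, 300.0))
```

The gap between the two columns grows with \(T\), which is the weakness that motivates taking several shorter steps.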
Multi-step approach: We would like to find a remedy for the problem which we discovered in Example 1.76 for \(T\) further away from \(t_0\). Consider again Problem 1 on page 54. We want to get an approximate value for \(y(T)\). For notational convenience we assume that \(T > t_0\). Pick several \(t_i\) between \(t_0\) and \(T\):
\[t_0 < t_1 < t_2 < \cdots < t_n = T.\]
Starting out with \(t_0\) and \(y(t_0)\), we use the one step method from above to get an approximate value for \(y(t_1)\). Then we pretend that \(y(t_1)\) is exact, and we repeat the process. We use \(t_1\) and \(y(t_1)\) to calculate an approximate value for \(y(t_2)\). Again we pretend that \(y(t_2)\) is exact and use \(t_2\) and \(y(t_2)\) to calculate \(y(t_3)\). Iteratively, we calculate \([t_{i+1}, y(t_{i+1})]\) from \([t_i, y(t_i)]\) according to the formula in (1.58):
\[[t_{i+1}, y(t_{i+1})] = [t_{i+1}, y(t_i) + F(t_i, y(t_i))(t_{i+1} - t_i)] \tag{1.63}\]
We continue this process until we reach \(T\).
For reasonably nice\(^{11}\) expressions \(F(t, y)\) the accuracy of the value which we get for \(y(T)\) will increase with \(n\), the number of steps we make (at least if all steps are of the same length). On the other hand, in an actual numerical computation we also make round-off errors in each step, and the more steps we make the worse the result might get. Experience will guide you in the choice of the step length.
Example 1.77. Consider the initial value problem
\[\frac{dy}{dt} = \frac{1}{10}y - \frac{1}{10000}y^2 \quad\text{and}\quad y(0) = 10. \tag{1.64}\]
1. Apply the multi-step method to find approximate values for \(y(t)\) at \(t_1 = 5, t_2 = 10, t_3 = 15, \dots, t_{20} = 100\). Arrange them in a table.
2. Graph the points found in the previous step together with the actual solution of the initial value problem.
Solution: As points in the multi-step process we use
\[t_0 = 0,\ t_1 = 5,\ t_2 = 10,\ t_3 = 15,\ t_4 = 20,\ \dots,\ t_{20} = 100.\]
\(^{11}\) We do not want to make this term precise, but the \(F(t, y)\) in Example 1.75 is of this kind.
t    y(t)      t    y(t)      t     y(t)
0    10.00     35   153.96    70    857.73
5    14.95     40   219.09    75    918.74
10   22.31     45   304.62    80    956.07
15   33.22     50   410.55    85    977.07
20   49.28     55   531.55    90    988.27
25   72.70     60   656.05    95    994.07
30   106.41    65   768.87    100   997.02
Table 1.2: Solution of Problem 1.77
For each \(t_i\) (\(0 \le i \le 19\)) we use the formula
\[y(t_{i+1}) = y(t_i) + 5\left(\frac{y(t_i)}{10} - \frac{y(t_i)^2}{10000}\right)\]
and calculate \(y(t_1), y(t_2), y(t_3), \dots, y(t_{20})\) consecutively. We summarize the calculation in Table 1.2.
In Figure 1.15 you see the graph of the exact solution of the initial value
problem. You also see the points from Table 1.2. The points suggest a graph
which does follow the actual one reasonably closely. But you see that we are
definitely making errors, and they get worse as \(t\) increases\(^{12}\). You may try a shorter step length. The points will follow the curve much more closely if you use \(t_1 = 1, t_2 = 2, t_3 = 3, \dots, t_{100} = 100\) in your calculation. ♦
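The multi-step computation behind Table 1.2 can be sketched in Python. This is an added illustration with equal step lengths; the function name `euler` and its signature are hypothetical. The script computes the 20 steps of length 5 and, for comparison, 100 steps of length 1, and evaluates the exact solution (1.60) with \(y_0 = 10\).

```python
import math


def euler(F, t0, y0, T, n):
    """Apply (1.63) with n equal steps from t0 to T; return the lists of t_i and y_i."""
    h = (T - t0) / n
    ts, ys = [t0], [y0]
    for i in range(n):
        ys.append(ys[-1] + F(ts[-1], ys[-1]) * h)
        ts.append(t0 + (i + 1) * h)
    return ts, ys


def F(t, y):
    # right hand side of (1.64); it does not depend on t
    return y / 10 - y * y / 10000


def exact(t, y0=10.0, a=0.1, b=0.0001):
    """Exact solution (1.60) with t0 = 0."""
    return a * y0 / (b * y0 + (a - b * y0) * math.exp(-a * t))


ts, ys = euler(F, 0.0, 10.0, 100.0, 20)             # step length 5
ts_fine, ys_fine = euler(F, 0.0, 10.0, 100.0, 100)  # step length 1

for t, y in zip(ts, ys):
    print(f"{t:5.0f} {y:8.2f} {exact(t):8.2f}")
```

Comparing `ys_fine[50]` with `ys[10]` against `exact(50.0)` shows how much the shorter step length reduces the error at \(t = 50\).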
Steady States: Let us consider some very specific solutions of our initial value problem in (1.57):
\[y' = \frac{dy}{dt} = F(t, y) \quad\text{and}\quad y(t_0) = y_0.\]
Suppose \(F(t, y_0) = 0\) for all \(t\). Then the constant function \(y(t) = y_0\) is a solution of the problem. Such a solution is called a steady state solution.
\(^{12}\) It is incidental that the points eventually get closer to the graph again. This is due to the specific problem, and will not occur in general.
Figure 1.15: Illustration of Euler’s Method
Example 1.78. Find the steady states of the differential equation (see (1.50) in Section 1.16)
\[f'(t) = af(t) + b. \tag{1.65}\]
Solution: Apparently, \(f'(t) = 0\) if and only if \(f(t) = -b/a\). So the constant function \(f(t) = -b/a\) is the only steady state of this differential equation. ♦
Reviewing Example 1.68 in Section 1.16, you see that the steady state
in that example is B(t) = 40,000. I.e., if your loan balance is $40,000.00, the
bank charges you interest at a rate of 0.5% per month, and you are repaying
the loan at a rate of $200.00 per month, then the principal balance of your
account will stay unchanged. Your payments cover exactly the accruing
interest charges.
Example 1.79. For the logistic law (see Equation (1.59))
$$\frac{dy}{dt} = F(t, y) = ay - by^2 = y(a - by)$$
we find that $F(t, y) = 0$ if and only if $y = 0$ or $y = a/b$. There are two
steady state solutions: $y_u(t) = 0$ and $y_s(t) = a/b$.
Let us interpret these steady state solutions for the specific numerical
values of $a = 1/10$ and $b = 1/10{,}000$ in Example 1.77. If the initial value
$y_0$ of the population is positive, then the population size will tend to and
stabilize$^{13}$ at $y(t) = a/b = 1{,}000$. In this sense, $y_s(t) = a/b = 1{,}000$ is a
stable steady state solution. It is also referred to as the carrying capacity.
It tells you which size population of the given kind the specific habitat will
support.
If the initial value $y_0$ is negative, then $y(t)$ will tend to $-\infty$ as time
increases. If $y_0 \neq 0$, then $y(t)$ will not tend to the steady state $y(t) = 0$. In
this sense, $y(t) = 0$ is an unstable steady state. ♦
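The stability statements can be explored numerically. The following sketch is not part of the original calculation. It assumes the values $a = 1/10$, $b = 1/10{,}000$ from the example, applies Euler's method with step length 1, and uses a helper name `euler_final` of our own choosing. It approximates $y(300)$ for several nonnegative initial values.

```python
def euler_final(y0, a=0.1, b=0.0001, h=1.0, steps=300):
    """Euler approximation of y(h * steps) for y' = a*y - b*y**2, y(0) = y0."""
    y = y0
    for _ in range(steps):
        y += h * (a * y - b * y * y)
    return y


# y0 = 0 stays at the unstable steady state; positive y0 drift towards a/b = 1000.
for y0 in (0.0, 1.0, 10.0, 500.0, 1500.0):
    print(y0, round(euler_final(y0), 3))
```

The approximations for positive initial values end up near 1000, while the initial value 0 stays at 0. This is consistent with the discussion above, though a numerical experiment is of course no proof.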
Exercise 13. Consider the initial value problem
$$y'(t) = -50 + \frac{1}{2}\,y(t) - \frac{1}{2000}\,y^2(t) \quad\text{and}\quad y_0 = y(0) = 200. \tag{1.66}$$
To make the problem explicit, you should think of a population of deer in a
protected wildlife preserve. There are no predators. The deer are hunted at
a rate of 50 animals per year. The population has a growth rate of 50% per
year. Reproduction takes place at a constant rate all year round. Finally,
the last term in the differential equation accounts for the competition for
space and food.
1. Use Euler’s method to find the population size over the next 30 years.
Proceed in 1 year steps. Tabulate and plot your results.
2. Guess at which level the population stabilizes.
3. Repeat the first two steps of the problem if hunting is stopped.
4. Repeat the first two steps of the problem if the initial population is
100 animals.
5. Find the steady states of the original equation in which hunting takes
place. I.e., find the values of $y$ for which $y' = 0$. You will
find two values. Call the smaller one of them $Y_u$ and the larger one
$Y_s$. Experiment with different initial values to see which of the steady
states is stable, and which one is unstable.
$^{13}$ The common language meaning of these expressions suffices for the purpose of our
discussion, and the mathematical definition of ‘tends to’ and ‘stabilizes at’ only makes these
terms precise.
Orthogonal Trajectories
Let us explore a different kind of application. Suppose we are given a family
$F(x, y, a) = 0$ of curves. In Figure 1.16 you see a family of ellipses
$$C_a : F(x, y, a) = x^2 + 3y^2 - a = 0. \tag{1.67}$$
There is one ellipse for each $a > 0$. We would like to find curves $D_b$ which
intersect the curves $C_a$ perpendicularly. (We say that $D_b$ and $C_a$ intersect
perpendicularly in a point $(x_1, y_1)$ if the tangent lines to the curves at this
point intersect perpendicularly.) We call such a curve $D_b$ an orthogonal
trajectory to the family of the $C_a$’s. You also see one orthogonal trajectory in
Figure 1.16.
Figure 1.16: Orthogonal Trajectory to Level Curves
Let us explain where this type of situation occurs. Suppose the curves
$C_a$ are the level curves in a crater. Here $a$ represents the elevation, so that
the elevation is constant along each curve $C_a$. The orthogonal trajectory
gives a path of steepest descent. A new lava flow which originates at some
point in the crater will follow this path.
Suppose that each ellipse represents an equipotential line of an electro-
magnetic field. The orthogonal trajectory provides you with a path which
is always in the direction of the most rapid change of the field. A charged
particle will move along an orthogonal trajectory.
Suppose $a$ stands for temperature, so that along each ellipse the
temperature is constant. In this case the curves are called isothermal lines$^{14}$.
A heat seeking bug will, at any time, move in the direction in which the
temperature increases most rapidly, i.e., along an orthogonal trajectory to
the isothermal lines.
Suppose $a$ stands for the concentration of a nutrient in a solution. It is
constant along each curve $C_a$. On their search for food, bacteria will follow
a path in the direction in which the concentration increases most rapidly.
They will move along an orthogonal trajectory.
Example 1.80. Find orthogonal trajectories for the family of ellipses
$$C_a : F(x, y, a) = x^2 + 3y^2 - a = 0. \tag{1.68}$$
Solution: Differentiating the equation for the ellipses, we get
$$2x + 6y\frac{dy}{dx} = 0 \quad\text{or}\quad \frac{dy}{dx} = \frac{-x}{3y}.$$
The slope of the tangent line to a curve $C_a$ at a point $(x_1, y_1)$ is $\frac{-x_1}{3y_1}$. If
a curve $D_b$ intersects $C_a$ in $(x_1, y_1)$ perpendicularly, then we need that the
slope of the tangent line to $D_b$ at this point is the negative reciprocal of this
number, i.e., $\frac{3y_1}{x_1}$. Thus, to find an orthogonal
trajectory to the family of the $C_a$’s we need to find functions which
satisfy this differential equation. If we also require that the orthogonal
trajectory goes through a specific point $(x_0, y_0)$, then we end up with the initial
value problem
$$\frac{dy}{dx} = \frac{3y}{x} \quad\text{and}\quad y(x_0) = y_0.$$
This is exactly the kind of problem which we solved with Euler’s method. In
this particular example it is not difficult to find solutions for the differential
equation. They are functions of the form $y(x) = bx^3$. The orthogonal
trajectory shown in Figure 1.16 has the equation $y = x^3/25$. There is one
orthogonal trajectory which does not have this form, and this is the curve
$x = 0$.
Let us apply Euler’s method to solve the problem. Let us find approximate
values for the initial value problem
$$\frac{dy}{dx} = \frac{3y}{x} \quad\text{and}\quad y(1) = \frac{1}{25}.$$
$^{14}$ The idea of isothermal lines, and with this the method in all of these applications,
was pioneered by Alexander von Humboldt (1769–1859).
Use $x_0 = 1$, $x_1 = 1.2$, $x_2 = 1.4$, \ldots, $x_{20} = 5$.
We set $(x_0, y_0) = (1, 1/25)$ and calculate $(x_n, y_n)$ according to the formula
$$y_n = y_{n-1} + 0.2\,\frac{3y_{n-1}}{x_{n-1}} \quad\text{for } n = 1, 2, \ldots, 20.$$
Without recording the results of this calculation, we graphed the points in
Figure 1.16. ♦
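The calculation which is not recorded in the text can be sketched as follows. The function name `euler_points` is our own. The code assumes the step formula above with step length 0.2 and prints the value of the exact solution $y = x^3/25$ next to each Euler value for comparison.

```python
def euler_points(x0=1.0, y0=1 / 25, h=0.2, n=20):
    """Euler points for dy/dx = 3*y/x starting at (x0, y0)."""
    pts = [(x0, y0)]
    for k in range(1, n + 1):
        x_prev, y_prev = pts[-1]
        y_new = y_prev + h * 3 * y_prev / x_prev
        pts.append((x0 + k * h, y_new))  # x_k = x0 + k*h avoids accumulated rounding
    return pts


pts = euler_points()
for x, y in pts:
    print(f"{x:4.1f}  Euler {y:8.4f}   exact {x**3 / 25:8.4f}")
```

With this step length the Euler values fall visibly below the exact curve as $x$ grows: the last Euler value is about 3.34, whereas $5^3/25 = 5$. A shorter step length reduces the gap.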
Exercise 14. Consider the family of hyperbolas:
$$C_a : x^2 - 5y^2 + a = 0.$$
There is one hyperbola for each value of $a$; only for $a = 0$ does the hyperbola
degenerate into two intersecting lines.
1. Graph several of the curves $C_a$.
2. Find the differential equation for an orthogonal trajectory.
3. Use Euler’s method to find points on the orthogonal trajectory through
the point $(3, 4)$. Use the points $x_0 = 3$, $x_1 = 3.2$, $x_2 = 3.4$, \ldots,
$x_{20} = 7$. Plot the points $(x_n, y_n)$ in your figure.
4. Check that the graph of $y(x) = bx^{-5}$ is an orthogonal trajectory to the
family of hyperbolas for every $b$. Determine $b$, so that the orthogonal
trajectory passes through the point $(3, 4)$, and add this graph to your
figure.
1.19 Table of Important Derivatives
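$f(x)$                $f'(x)$                          Assumptions
$x^q$                 $q x^{q-1}$                      $q$ a natural number, or $x > 0$
$e^x$                 $e^x$                            $x \in (-\infty, \infty)$
$\ln |x|$             $1/x$                            $x \in (-\infty, \infty)$, $x \neq 0$
$\sin x$              $\cos x$                         $x \in (-\infty, \infty)$
$\cos x$              $-\sin x$                        $x \in (-\infty, \infty)$
$\tan x$              $\sec^2 x$                       all $x$ for which $\tan x$ is defined
$\cot x$              $-\csc^2 x$                      all $x$ for which $\cot x$ is defined
$\sec x$              $\sec x \tan x$                  all $x$ for which $\sec x$ is defined
$\csc x$              $-\csc x \cot x$                 all $x$ for which $\csc x$ is defined
$\arctan x$           $\frac{1}{1+x^2}$                $x \in (-\infty, \infty)$
$\arcsin x$           $\frac{1}{\sqrt{1-x^2}}$         $x \in (-1, 1)$, $\arcsin x \in (-\pi/2, \pi/2)$
$\arccos x$           $\frac{-1}{\sqrt{1-x^2}}$        $x \in (-1, 1)$, $\arccos x \in (0, \pi)$
$\text{arccot}\, x$   $\frac{-1}{1+x^2}$               $x \in (-\infty, \infty)$, $\text{arccot}\, x \in (0, \pi)$
$\text{arcsec}\, x$   $\frac{1}{|x|\sqrt{x^2-1}}$      $x < -1$ or $x > 1$, $\text{arcsec}\, x \in (0, \pi/2) \cup (\pi/2, \pi)$
$\text{arccsc}\, x$   $\frac{-1}{|x|\sqrt{x^2-1}}$     $x < -1$ or $x > 1$, $\text{arccsc}\, x \in (-\pi/2, 0) \cup (0, \pi/2)$
Table 1.3: Some Derivatives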
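A few entries of the table can be spot-checked numerically. The sketch below is our own addition. It compares a symmetric difference quotient with the tabulated derivative at one sample point per function. The helper `central_diff`, the step size and the sample points are arbitrary choices, and agreement at a single point is a plausibility check, not a proof.

```python
import math


def central_diff(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (function, derivative claimed in the table, sample point in the allowed domain)
rows = [
    (math.atan, lambda x: 1 / (1 + x**2), 0.7),
    (math.asin, lambda x: 1 / math.sqrt(1 - x**2), 0.3),
    (math.acos, lambda x: -1 / math.sqrt(1 - x**2), 0.3),
    (math.tan, lambda x: 1 / math.cos(x) ** 2, 0.5),
    (lambda x: math.log(abs(x)), lambda x: 1 / x, -2.0),
]
errors = [abs(central_diff(f, x) - df(x)) for f, df, x in rows]
print(errors)
```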
Chapter 2
Global Theory
So far we studied the local behaviour of a function. All concepts related to
the behaviour of a function near a point. In this chapter we will use local in-
formation about a function to draw global conclusions. We will discuss some
uniqueness properties of solutions of differential equations. Then we discuss
geometric properties of graphs, their monotonicity and concavity. We apply
these ideas to the study of extrema of functions. With this information it is
possible to sketch graphs capturing their essential features.
The fundamental result which allows us to do this is referred to as
Cauchy’s mean value theorem. Augustin-Louis Cauchy (1789–1857) was
one of the great mathematicians of the 19th century. He made major
contributions to making calculus a rigorous mathematical theory.
2.1 Cauchy’s Mean Value Theorem
It is useful to make the following
Definition 2.1. Let $f(x)$ be a function which is defined on the interval
$[a, b]$. Then we call
$$\frac{f(b) - f(a)}{b - a}$$
the average rate of change of $f$ over the interval $[a, b]$.
For example, the average rate of change of $f(x) = x^2$ over the interval
$[0, 2]$ is 2. The average rate of change of $f(x) = \sin x$ over $[0, \pi/2]$ is $2/\pi$
and over $[0, \pi]$ it is 0.
Theorem 2.2 (Cauchy’s Mean Value Theorem). Let $f$ be a real valued
function which is defined and continuous on the interval $[a, b]$ and
differentiable on $(a, b)$, where $a < b$. Then there exists a number $c \in (a, b)$ such
that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$
In words, the theorem asserts that the average rate of change over an
interval is equal to the rate of change at some point in the interval. For
example, the average rate of change of $f(x) = x^2$ over the interval $[-2, 1]$ is
$-1$, and $f'(-1/2) = -1$.
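The point $c$ in this example can also be located numerically. The sketch below is our own illustration. The function name `mean_value_point` is hypothetical, and plain bisection is used, which requires that $f'(c)$ minus the average rate of change has opposite signs at the two ends of the interval. That holds for $f(x) = x^2$ on $[-2, 1]$ but not for every function.

```python
def mean_value_point(f, df, a, b, tol=1e-12):
    """Bisection for a point c in (a, b) with df(c) = (f(b) - f(a)) / (b - a).

    Assumes that df(c) minus the average rate changes sign between a and b.
    """
    avg = (f(b) - f(a)) / (b - a)

    def g(c):
        return df(c) - avg

    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid  # a sign change (or a zero) lies in [lo, mid]
        else:
            lo = mid
    return (lo + hi) / 2


c = mean_value_point(lambda x: x**2, lambda x: 2 * x, -2.0, 1.0)
print(c)
```

For this example the search converges to the point $c = -1/2$ named in the text.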
The following special case of the theorem, called Rolle’s theorem (named
after Michel Rolle (1652–1719)), is of particular interest.
Theorem 2.3 (Rolle’s Theorem). Let $f$ be a real valued function which
is defined and continuous on the interval $[a, b]$ and differentiable on $(a, b)$,
where $a < b$. If $f(a) = f(b)$, then there exists a number $c$ between $a$ and $b$
(i.e., $a < c < b$) such that
$$f'(c) = 0.$$
We are not going to say anything about the proof of these two theorems,
except that Cauchy’s theorem and Rolle’s theorem are equivalent (each is
an easy consequence of the other one), and that the proof of both of them
depends heavily on the completeness$^{1}$ of the real numbers. We are also not
interested in finding the points $c$, as they occur in the two theorems. We
are interested in more general consequences.
Corollary 2.4. Let $f$ be a real valued function which is defined and
continuous on an interval $I$. If $f'(x) = 0$ for all interior points $x$ of $I$, then $f$ is
constant on this interval. In other words, there exists a number $d$ such that
$f(x) = d$ for all $x \in I$.
Proof. A different formulation of the claim is that $f(a) = f(b)$ for all $a$,
$b \in I$. We prove this statement using Cauchy’s theorem. If $f(a) \neq f(b)$,
then $a \neq b$, say $a < b$, and there exists some $c \in (a, b)$ such that
$$f'(c) = \frac{f(b) - f(a)}{b - a} \neq 0.$$
But this contradicts the assumption that $f'(c) = 0$ for all interior points $c$ of $I$, and the
corollary is proved.
$^{1}$ We discussed this property of the real numbers in Section 1.1.
We are going to use the following corollary frequently.
Corollary 2.5. Let $h$ and $g$ be functions which are defined and continuous
on an interval $I$. If $h'(x) = g'(x)$ for all $x \in I$, then $h$ and $g$ differ by a
constant, i.e., there exists a number $d$ such that
$$h(x) = g(x) + d$$
for all $x \in I$.
Proof. Apply the previous corollary to f(x) = h(x) −g(x).
Definition 2.6. Suppose the function $f(x)$ is defined on the interval $I$. We
call a function $F(x)$ with domain $I$ an antiderivative of $f$ if $F'(x) = f(x)$
for all $x \in I$.
Using this notion, we can reformulate Corollary 2.5.
Corollary 2.7. Suppose h and g are antiderivatives of a function f, defined
on an interval. Then h and g differ by a constant.
2.2 Unique Solutions of Differential Equations
Corollary 2.4 implies
Proposition 2.8. If the function $F(x)$ is defined on an interval $I$ and
$F'(x) = 0$ for all $x \in I$, then $F(x)$ is constant on $I$.
In other words, on intervals the only solutions of the differential equation
$F'(x) = 0$ are the constant functions.
More generally, if you would like to find all antiderivatives $F(x)$ of a function
$f(x)$ on an interval, then it suffices to find one antiderivative $H(x)$. Any
antiderivative $F(x)$ is of the form $H(x) + c$ where $c$ is a constant. The
constant $c$ is referred to as the integration constant. For the time being you
depend on being able to guess such a function $H(x)$. By differentiating
$H(x)$ you can check whether you guessed right.
For example, any antiderivative $F(x)$ of the function $f(x) = 2x$ on the
real line $(-\infty, \infty)$ is of the form $F(x) = x^2 + c$ where $c$ is a constant. Any
antiderivative $F(x)$ of the function $f(x) = \sec^2 x$ on the interval $(-\pi/2, \pi/2)$
is of the form $F(x) = \tan x + c$.
Typically, the integration constant is determined by an initial condition.
Suppose we would like to solve the initial value problem
$$f'(x) = \cos x \quad\text{and}\quad f(0) = 1.$$
Our first conclusion is that $f(x) = \sin x + c$. This follows from the above
because $(\sin x + c)' = \cos x$. Next we substitute $x = 0$ in the equation.
Then we see that $f(0) = c = 1$. The solution of the initial value problem is
$f(x) = \sin x + 1$.
Of particular importance to our discussion of exponential growth and
decay is
Proposition 2.9. Every solution $f(x)$ of the differential equation
$$f'(x) = af(x)$$
on an interval is of the form $f(x) = ce^{ax}$ for some constant $c$.
Proof. We asserted in (1.19), and will eventually prove, that all functions of
the form $f(x) = ce^{ax}$ satisfy the differential equation. We want to see that
these are the only solutions.
Let $f(x)$ be any function which satisfies the differential equation on some
interval. Consider the function
$$h(x) = f(x)e^{-ax}.$$
As a product of differentiable functions, $h$ is differentiable. Its derivative is
$$h'(x) = f'(x)e^{-ax} - af(x)e^{-ax} = af(x)e^{-ax} - af(x)e^{-ax} = 0.$$
Corollary 2.4 tells us that $h(x)$ is a constant function. Calling the constant
$c$ we find that
$$f(x) = ce^{ax}.$$
This means that all solutions of the differential equation $f'(x) = af(x)$ are
of the form $f(x) = ce^{ax}$, where $c$ is a constant.
Proposition 2.10. The initial value problem
$$f'(x) = af(x) \quad\text{and}\quad f(x_0) = C$$
has a unique solution on an interval containing $x_0$. In fact
$$f(x) = Ce^{a(x - x_0)}.$$
Proof. By the previous proposition we know that the solution is of the form
$f(x) = ce^{ax}$ for some $c$. Substituting the initial condition we obtain
$$C = f(x_0) = ce^{ax_0}.$$
Thus $c = Ce^{-ax_0}$ and $f(x) = ce^{ax} = Ce^{-ax_0}e^{ax} = Ce^{a(x - x_0)}$.
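As a concrete illustration, with numbers chosen here only for the sake of the example, take $a = 2$, $x_0 = 1$ and $C = 3$. The proposition then gives the following solution, which can be checked by differentiating and substituting $x = 1$:

```latex
\begin{align*}
f(x)  &= 3e^{2(x - 1)}, \\
f'(x) &= 2 \cdot 3e^{2(x - 1)} = 2f(x), \\
f(1)  &= 3e^{0} = 3.
\end{align*}
```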
Remark 4. The uniqueness of the solution of an initial value problem as in
the previous proposition is not only of theoretical importance. Imagine that
you study the growth rate of a strain of bacteria, as we did in Example 1.64
on page 41. Before you can publish your result, it must be certain that your
experiment can be reproduced at a different time in a different location.
That is a requirement which any experiment in science must satisfy. If there
is more than one mathematical solution to your problem, then you have to
expect that the experiment can go either way, and this would invalidate your
experiment.
2.3 The First Derivative and Monotonicity
One of the interesting properties of a function is whether it is increasing or
decreasing. We might want to find out whether the part of a population
which is infected with a disease is increasing or decreasing. We might want
to know how the level of pollution in a body of water is changing. The first
derivative of a function gives us information of this kind.
2.3.1 Monotonicity on Intervals
Recall that a function f is called increasing if f(b) > f(a) whenever b > a.
It is called decreasing if f(b) < f(a) whenever b > a. A function is called
monotonic if it is either increasing or decreasing.
Theorem 2.11. Suppose that the function $f$ is defined and continuous on
the interval $I$.
1. If $f'(x) > 0$ for all $x \in I$, then $f$ is increasing on $I$.
2. If $f'(x) < 0$ for all $x \in I$, then $f$ is decreasing on $I$.
3. More generally, the conclusions in (1) and (2) still hold if in each
finite interval $J \subset I$ there are only finitely many points at which the
assumption on $f'(x)$ is not satisfied.$^{2}$
Proof. We show (1). Let $a$ and $b$ be points in $I$, and suppose that $a < b$.
Cauchy’s theorem says that there exists a point $c$, $a < c < b$, such that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$
$^{2}$ It is permissible that $f$ is not differentiable at a few points in $J$, or that $f'(x) = 0$ there.
It is not possible that $f'(x) < 0$ at some point in the interval while $f(x)$ is increasing on
the interval.
We have that $f'(c) > 0$ and $b - a > 0$, and it follows that $f(b) - f(a) > 0$.
This means that $f(b) > f(a)$. The proof of the second claim is similar. We
leave it and the generalization of both statements to the reader.
For example, $\log_2' x = \left(\frac{\ln x}{\ln 2}\right)' = \frac{1}{x \ln 2} > 0$ for all $x \in (0, \infty)$. In the
computation we used (1.19), Theorem 1.36, and that $\ln 2 > 0$. It follows
from Theorem 2.11 that $\log_2 x$ is increasing on $(0, \infty)$. You see part of
the graph of the function in Figure 1.8.
The exponential function $\exp_a x = a^x$ is increasing on $(-\infty, \infty)$ if $a > 1$
and decreasing if $0 < a < 1$. To see this, observe that $a^x = e^{x \ln a}$ and
$\frac{d}{dx} a^x = (\ln a)a^x$. Furthermore, $a^x > 0$, and $\ln a > 0$ if $a > 1$ and $\ln a < 0$
if $0 < a < 1$. Now Theorem 2.11 implies our assertion. You may also
want to have a look at the graph of the exponential function with base 2 in
Figure 1.7.
The function $f(x) = 1/x$ is defined and differentiable on the set of all
nonzero real numbers, and its derivative is $f'(x) = -1/x^2$. In particular
$f'(x) < 0$ for all nonzero real numbers. According to Theorem 2.11, $f(x)$
is decreasing on the interval $(-\infty, 0)$, and $f(x)$ is decreasing on the
interval $(0, \infty)$. The function is not decreasing on the union of the two
intervals; for example, $f(-1) = -1 < 1 = f(1)$. The example illustrates that it is crucial in Theorem 2.11 that we
deal with functions which are defined and differentiable on an interval.
The function $f(x) = \tan x$, defined on $(-\pi/2, \pi/2)$, has as its derivative
$f'(x) = \sec^2 x$, and the derivative is positive. Consequently, $f(x)$ is
increasing on $(-\pi/2, \pi/2)$. Its inverse $g(x) = \arctan x$, defined on $(-\infty, \infty)$,
has as its derivative $g'(x) = \frac{1}{1+x^2}$, which is positive on $(-\infty, \infty)$, so that
$g(x) = \arctan x$ is increasing on $(-\infty, \infty)$. As a general principle, one may
show that the inverse of an increasing function is increasing.
Example 2.12. For a three dimensional solid we set $E = A/V$, where
$A$ denotes the surface area and $V$ the volume. For example, for a ball
$E(r) = (4\pi r^2)/(\frac{4}{3}\pi r^3) = 3/r$, where $r$ denotes the radius. Then $E'(r) = -3/r^2 < 0$.
The same principle holds for other shapes: $E$ decreases as we enlarge the
solid without changing its shape. What does this have to do with the size
of animals?
Warm blooded animals living in cold climates need to preserve their body
temperature. The total amount of heat stored in the body is proportional to
the volume, while the heat loss is proportional to the surface area. The ratio
of volume to surface area increases as the animal gets larger, so that for warm
blooded animals it is of advantage to be large if they live in cold climates.
In hot climates they need to give off heat, so that it is of advantage to be
small. Natural selection (Darwinism) should favor the larger specimens of
a warm blooded species in a cold climate and smaller ones in a hot climate.
You can observe this phenomenon in real life.
For cold blooded animals the converse holds. They absorb heat so
that their body reaches a temperature at which they can be active. In cold
climates it helps to be small, because then the surface area is relatively large,
compared to the volume. In hot climates cold blooded animals can afford to
be large, as it is easy to reach and maintain the temperature at which they
can be active. The argument is again consistent with real life.
Needless to say, there are other mechanisms to increase the surface area
of a body than decreasing its size, and the maintenance of the body temper-
ature is only one factor which influences the size of specimens of a species.
Larger animals need more food, are stronger, cannot hide so well, and are of-
ten less agile. All of these factors need to be taken into account to determine
the optimal size of an animal. ♦
So far we have only discussed examples where we used (1) and (2) of
Theorem 2.11. Let us show how to use the conclusion in (3). To apply it
we need to determine intervals on which a function does not change signs.
We recall a procedure which works well for continuous functions.
Definition 2.13. Suppose $f(x)$ is a function. We call a point $x_0$ on the
real line exceptional if either $f(x_0) = 0$ or $f(x_0)$ is not defined.
The following result is an immediate consequence of the Intermediate
Value Theorem, see Theorem 1.16 on page 8. Expressed casually it says
that a continuous function can change signs only at exceptional points.
Proposition 2.14. Suppose $f(x)$ is continuous and $f(x)$ has no exceptional
points in the interval $(x_0, x_1)$. Then $f(x) > 0$ for all points in the interval
$(x_0, x_1)$, or $f(x) < 0$ for all points in the interval $(x_0, x_1)$. In particular, if
$f(x)$ is positive at one point in the interval, then it is positive at all points
in the interval. If $f(x)$ is negative at one point in the interval, then it is
negative at all points in the interval.
Example 2.15. Consider the function
$$f(x) = \frac{x^2(x^2 - 4)}{x^2 + 2x - 15} = \frac{x^2(x - 2)(x + 2)}{(x - 3)(x + 5)}.$$
The zeros of the numerator, and with this the zeros of $f(x)$, are $x = 0$,
$x = 2$, and $x = -2$. The zeros of the denominator, i.e., the points where
$f(x)$ is not defined, are $x = 3$ and $x = -5$.
According to the proposition, the sign of f(x) remains unchanged on
each of the intervals (−∞, −5), (−5, −2), (−2, 0), (0, 2), (2, 3) and (3, ∞).
Counting signs of the factors in the expression for f(x), we see f(x) is
positive on the interval (−∞, −5), negative on (−5, −2), positive on (−2, 0)
and on (0, 2), negative on (2, 3), and positive on (3, ∞). You see that the
sign changes at some, but not all, exceptional numbers. ♦
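Proposition 2.14 means that one test point per interval suffices to determine the sign. The short sketch below, with test points chosen by us, evaluates the function of Example 2.15 once in each of the six intervals.

```python
def f(x):
    return x**2 * (x - 2) * (x + 2) / ((x - 3) * (x + 5))


# One test point inside each interval between consecutive exceptional points
# -5, -2, 0, 2, 3.
test_points = [-6, -3, -1, 1, 2.5, 4]
signs = ["+" if f(x) > 0 else "-" for x in test_points]
print(signs)
```

The resulting pattern agrees with the sign count in the example. In particular the sign does not change at the exceptional point $x = 0$, where the factor $x^2$ vanishes without changing sign.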
Exercise 15. Find intervals on which the following functions do not change
signs. Decide whether the functions are positive or negative on these intervals.
(1) $f(x) = x^3 - x^2 - 5x - 3$
(2) $g(x) = \frac{x}{x^3 + 5x^2 - 4x - 20}$.
We are ready to discuss the monotonicity of functions whose derivative
vanishes at some points.
Example 2.16. Find intervals of monotonicity for the function
$$f(x) = 3x^2 + 5x - 4.$$
Figure 2.1: A quadratic polynomial, $f(x) = 3x^2 + 5x - 4$
Figure 2.2: A cubic polynomial, $p(x) = x^3 - 3x^2 - 9x + 3$
Solution: We graphed the function in Figure 2.1. Its derivative is
$f'(x) = 6x + 5$. In particular, $f'(x) > 0$ if $x \in (-5/6, \infty)$. So $f'(x) > 0$
for all points $x \in [-5/6, \infty)$, except at $x = -5/6$. Theorem 2.11 (3) says
that $f$ is increasing on the interval $[-5/6, \infty)$. By a similar argument, $f$ is
decreasing on the interval $(-\infty, -5/6]$. ♦
Example 2.17. Find intervals of monotonicity for the degree three
polynomial (for a graph see Figure 2.2)
$$p(x) = x^3 - 3x^2 - 9x + 3.$$
Solution: The function is defined and differentiable on the real line. Its
derivative is
$$p'(x) = 3x^2 - 6x - 9 = 3(x^2 - 2x - 3) = 3(x - 3)(x + 1).$$
Counting the signs of the factors we see that $p'(x)$ is positive on $(-\infty, -1)$
and on $(3, \infty)$. We conclude that $p(x)$ is increasing on the interval $[3, \infty)$
and that it is increasing on the interval $(-\infty, -1]$. The derivative is negative
on the interval $(-1, 3)$. The theorem implies that $p(x)$ is decreasing on the
interval $[-1, 3]$. ♦
Example 2.18. Find intervals of monotonicity for the rational function
$$f(x) = \frac{x^2 + 3x}{x - 1}.$$
Solution: The simplified expression for the derivative of $f$ is
$$f'(x) = \frac{(x + 1)(x - 3)}{(x - 1)^2}.$$
We see that the exceptional points for $f'(x)$ are $x = 1$, $x = -1$ and $x = 3$. We
conclude that $f'(x)$ does not change signs on the intervals $(-\infty, -1)$, $(-1, 1)$,
$(1, 3)$, and $(3, \infty)$. Counting the signs of the factors of $f'(x)$, we conclude
that $f'(x) > 0$ on the intervals $(-\infty, -1)$ and $(3, \infty)$, and $f'(x) < 0$ on the
intervals $(-1, 1)$ and $(1, 3)$. Observe that $f(x)$ is defined and differentiable
on the entire real line with the only exception of $x = 1$. We conclude that
$f(x)$ is increasing on the intervals $(-\infty, -1]$ and $[3, \infty)$. The function is decreasing
on the intervals $[-1, 1)$ and $(1, 3]$. ♦
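The simplified derivative in Example 2.18 is stated without computation. For readers who want to see the intermediate steps, here is one way to obtain it with the quotient rule:

```latex
\begin{align*}
f'(x) &= \frac{(2x + 3)(x - 1) - (x^2 + 3x) \cdot 1}{(x - 1)^2} \\
      &= \frac{2x^2 + x - 3 - x^2 - 3x}{(x - 1)^2} \\
      &= \frac{x^2 - 2x - 3}{(x - 1)^2}
       = \frac{(x + 1)(x - 3)}{(x - 1)^2}.
\end{align*}
```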
Example 2.19. Find intervals on which the function
$$f(x) = \sin 2x + 2\sin x$$
is monotonic. Restrict your discussion to the interval $[0, 2\pi]$.
Solution: We differentiate the function and rewrite the expression for
the derivative so that it is easier to find its exceptional points.
$$\begin{aligned}
f'(x) &= 2\cos 2x + 2\cos x \\
      &= 2[2\cos^2 x + \cos x - 1] \\
      &= 4(\cos x + 1)\left(\cos x - \frac{1}{2}\right).
\end{aligned}$$
To see the second equality we used that $\cos 2x = 2\cos^2 x - 1$. Then we
solved the quadratic equation in terms of $\cos x$. We find exceptional points
where $\cos x = -1$ (i.e., $x = \pi$) and where $\cos x = \frac{1}{2}$ (i.e., $x = \frac{\pi}{3}$ and $x = \frac{5\pi}{3}$).
Figure 2.3: A function and its derivative.
Observe that $f$ is differentiable on $[0, 2\pi]$, and that $f'(x) \neq 0$ at the end
points of this interval. This provides us with the intervals $[0, \pi/3)$, $(\pi/3, \pi)$,
$(\pi, 5\pi/3)$ and $(5\pi/3, 2\pi]$ on which $f'$ does not change sign. Checking the
sign of $f'$ (at one point) in each of the intervals, we find that $f'(x) > 0$ for
$x \in [0, \pi/3)$ and $x \in (5\pi/3, 2\pi]$, and $f'(x) < 0$ for $x \in (\pi/3, \pi)$ and $x \in (\pi, 5\pi/3)$.
We conclude that $f$ is increasing on the intervals $[0, \pi/3]$ and $[5\pi/3, 2\pi]$. The
function is decreasing on the interval $[\pi/3, 5\pi/3]$, and in this interval there
are three points at which $f'(x)$ is not negative.
You may confirm the calculation by having a look at Figure 2.3. There
you see the graph of the function (solid line) and the graph of its derivative
(dashed line). As you see, wherever $f'(x)$ is positive, there $f(x)$ is increasing.
Wherever $f'(x)$ is negative, there $f(x)$ is decreasing. ♦
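The sign check "at one point in each of the intervals" can be carried out numerically. In the sketch below the sample points 0.5, 2, 4 and 6 are our own choices, one from each interval. The function `df_factored` is the factored form of the derivative from the solution, included only to compare it with the unfactored form.

```python
import math


def f(x):
    return math.sin(2 * x) + 2 * math.sin(x)


def df(x):
    return 2 * math.cos(2 * x) + 2 * math.cos(x)


def df_factored(x):
    return 4 * (math.cos(x) + 1) * (math.cos(x) - 0.5)


# One sample point from each interval between exceptional points of f'.
samples = {"[0, pi/3)": 0.5, "(pi/3, pi)": 2.0, "(pi, 5pi/3)": 4.0, "(5pi/3, 2pi]": 6.0}
for name, x in samples.items():
    print(name, "f' > 0" if df(x) > 0 else "f' < 0")
```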
Exercise 16. Find intervals on which the function $f$ increases and intervals
on which $f$ decreases. In the last two problems, (g) and (h), restrict yourself
to the interval $[0, 2\pi]$.
(a) $f(x) = 3x^2 + 5x + 7$
(b) $f(x) = x^3 - 3x^2 + 6$
(c) $f(x) = (x + 3)/(x - 7)$
(d) $f(x) = x + 1/x$
(e) $f(x) = x^3(1 + x)$
(f) $f(x) = x/(1 + x^2)$
(g) $f(x) = \cos 2x + 2\cos x$
(h) $f(x) = \sin^2 x - \sqrt{3}\sin x$
2.3.2 Monotonicity at a Point
It is quite natural to ask what it means that a function is increasing at a
point, and how this concept is related to the one of being increasing on an
interval. We address both questions in this subsection.
Definition 2.20. Suppose f is a function and c is an interior point of its
domain. We say that f is increasing at c if, for some d > 0,
f(x) < f(c) for all x ∈ (c −d, c) and f(x) > f(c) for all x ∈ (c, c +d).
We say that f is decreasing at c if this statement holds with the inequalities
reversed.
Expressed informally, to the left of c the function is smaller and to the
right of c it is larger than at c, at least for a while.
Being increasing or decreasing at a point $c$ is a local property. We are
making a statement about the behavior of the function on some open interval
which contains $c$. Being increasing on an interval is a global property. For
the global property the interval is given to us. For the local property we may
choose the, possibly rather small, interval. The global property has to hold
for any two points in the given interval. For the local property we compare
$f(x)$ to $f(c)$ where $c$ is fixed and $x$ is any point in an open interval around
$c$ which we may choose.
Theorem 2.21. Suppose $f$ is a function which is defined on an open
interval $I$. Then $f$ is increasing (decreasing) on $I$ if and only if it is increasing
(decreasing) at each point in $I$.
This theorem establishes the relation between the local and the global
property. The ‘only if’ part is not difficult to show, but the ‘if’ part uses
some deeper facts about finite closed intervals. Our second result gives us a
valuable tool to detect monotonicity of functions at a point.
Proposition 2.22. Let $f$ be a function and $c$ an interior point of its
domain. If $f$ is differentiable at $c$ and $f'(c) > 0$, then $f$ is increasing at $c$. If
$f'(c) < 0$, then $f$ is decreasing at $c$.
Remark 5. A function does not have to be differentiable to be increasing.
Graph the function $f(x) = 2x + |x|$ to convince yourself of this fact. A
function can be differentiable and increasing at a point $x$ even if the assumptions
of Proposition 2.22 do not hold; e.g., $f(x) = x^3$ is increasing at $x = 0$, but
$f'(0) = 0$. A function can also be increasing at a point $x$ although there is no
open interval which contains $x$ such that the function is increasing on this
interval.
Remark 6. The ideas of a function being increasing or decreasing at
a point may be generalized to cover domains of functions which are
half-closed or closed intervals, and where we would like to make a statement about the
behavior of a function at an endpoint. We have no specific needs for such
statements, but the motivated reader is encouraged to explore them.
2.4 The Second Derivative and Concavity
We would like to capture the property of a graph being bent upwards or downwards.
Secant lines will either be required to lie above or below the graph, and the
rates of change will be either increasing or decreasing. These properties can
be described globally over intervals and locally at points. You may use the
graphs in Figures 2.4 and 2.5 as illustrations of the discussion.
Figure 2.4: Concave Up
Figure 2.5: Concave Down
2.4.1 Concavity on Intervals
Let $f(x)$ be a function and let $(a, f(a))$ and $(b, f(b))$ be two distinct points
on its graph. The line through these two points is
$$l(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a).$$
If we restrict $l(x)$ to $x \in [a, b]$, then we get the secant line through the two
points, i.e., the line segment joining the two points.
Definition 2.23. Let f be a function which is defined on an interval I.
We say that f is concave up on I if f(c) < l(c) for all a, b in I and
c ∈ (a, b). Here l(x) is the secant line through (a, f(a)) and (b, f(b)). The
inequality expresses that between the points a and b the secant line lies above
the graph. We say that f is concave down on I if f(c) > l(c) for all a, b
in I and c ∈ (a, b). The inequality expresses that between the points a and b
the secant line lies below the graph.
We state a theorem which provides you with assumptions under which a
function is concave up or down. We will not provide a proof of the theorem.
Theorem 2.24. Let $f$ be a function which is defined on an interval $I$.
1. Suppose that $f(x)$ is differentiable on $I$. If $f'(x)$ is increasing on $I$,
then $f(x)$ is concave up on $I$. If $f'(x)$ is decreasing on $I$, then $f(x)$ is
concave down on $I$.
2. Suppose that $f(x)$ is twice differentiable$^{3}$ on $I$. If $f''(x) > 0$ for all $x$
in $I$, then $f(x)$ is concave up on $I$. If $f''(x) < 0$ for all $x$ in $I$, then
$f(x)$ is concave down on $I$.
3. More generally, the conclusions in (2) still hold if in each finite interval
$J \subset I$ there are only finitely many points at which the assumption
$f''(x) > 0$, resp. $f''(x) < 0$, is not satisfied.
For example, the function shown in Figure 2.4 is $q(x) = x^2 - 2x + 3$. Its
second derivative is $q''(x) = 2 > 0$. Theorem 2.24 (2) says that $q$ is concave
up on $(-\infty, \infty)$. The function shown in Figure 2.5 is $g(x) = -x^2 + 5x - 1$,
and its second derivative is $g''(x) = -2 < 0$. Theorem 2.24 (2) says that $g$
is concave down on $(-\infty, \infty)$.
$^{3}$ Strictly speaking, so far we can consider being ‘twice differentiable’ only for functions
which are defined on open intervals. More generally, we proceed as in Section 1.11. We say
that $f(x)$ is twice differentiable on $I$, if $f(x)$ extends to a function $F(x)$ which is defined
on an open interval $J$ which contains $I$, and $F(x)$ is twice differentiable on $J$. The second
derivative will be unique at all points in $I$ if $I$ is not empty and does not consist of exactly
one point.
The function $\ln x$ is concave down on the interval $(0, \infty)$. To see this, you
may use that $\ln''(x) = -1/x^2 < 0$ on $(0, \infty)$ and apply Theorem 2.24 (2).
Alternatively, you may note that the derivative $\ln'(x) = 1/x$ is decreasing on
$(0, \infty)$ and apply Theorem 2.24 (1). The exponential function $\exp(x) = e^x$ is
concave up on $(-\infty, \infty)$. To see this, you may note that $\exp''(x) = \exp(x) > 0$
and apply Theorem 2.24 (2). You may also use that $\exp'(x)$ is increasing
on the real line, and then quote Theorem 2.24 (1) to derive the desired
conclusion. Finally, you may observe that an increasing function is concave up
if its inverse is concave down⁴. So, $\ln x$ being concave down implies that
$\exp(x)$ is concave up.
Let us look at examples where we apply condition Theorem 2.24 (3).
Example 2.25. Study the concavity properties of the function
\[ p(x) = x^3 - 3x^2 - 9x + 3. \]
Solution: You find the graph of this function in Figure 2.2. Its second
derivative is $p''(x) = 6x - 6 = 6(x - 1)$. We see that $p''(x) > 0$ for $x \in (1, \infty)$,
and $p''(x) < 0$ for $x \in (-\infty, 1)$. This means that $p''(x) > 0$ for all $x \in [1, \infty)$
with only one exception, $x = 1$. Theorem 2.24 (3) tells us that $p(x)$ is
concave up on the interval $[1, \infty)$. Similarly, $p''(x) < 0$ for all $x \in (-\infty, 1]$
with only one exception, $x = 1$. One deduces that $p(x)$ is concave down on the
interval $(-\infty, 1]$. ♦
Consider the function $\tan x$. You may verify that $\tan'' x = 2 \sec^2 x \tan x$.
In particular, $\tan'' x < 0$ for $x \in (-\pi/2, 0)$ and $\tan'' x > 0$ for $x \in (0, \pi/2)$.
Theorem 2.24 (3) implies that $\tan x$ is concave down on $(-\pi/2, 0]$ and
concave up on $[0, \pi/2)$. You may confirm these statements visually by inspecting
a graph of the tangent function. You are invited to study the concavity of
the other trigonometric and hyperbolic functions.
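If you like to experiment, the sign of the second derivative can be sampled numerically. The following Python sketch is only a sanity check at individual points, not a proof of concavity on an interval. The helper names `second_derivative` and `concavity_at`, the step size and the tolerance are our own choices for illustration.

```python
import math

def second_derivative(f, x, h=1e-3):
    """Central-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def concavity_at(f, x, h=1e-3, tol=1e-6):
    """Return 'up', 'down' or 'undecided' from the sign of the approximate f''(x)."""
    d2 = second_derivative(f, x, h)
    if d2 > tol:
        return "up"
    if d2 < -tol:
        return "down"
    return "undecided"

def p(x):
    # The polynomial from Example 2.25.
    return x**3 - 3 * x**2 - 9 * x + 3

# Sample points on both sides of x = 1, where p'' changes sign.
for x in (-2.0, 0.0, 2.0, 4.0):
    print("p  ", x, concavity_at(p, x))

# Sample points on both sides of x = 0 for the tangent function.
for x in (-1.0, 1.0):
    print("tan", x, concavity_at(math.tan, x))
```

The samples to the left of $x = 1$ should report "down" for $p$ and those to the right "up", in agreement with the example.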
Remark 7. You may consider the spread of a disease. Denote the number
of infected people by $I(t)$. It may be scary if $I'(t) > 0$, i.e., $I(t)$ increases.
It is worse, and often true in the early stages of an epidemic, if $I''(t) > 0$.
This means that $I'(t)$ increases, and the disease spreads at an increasing
rate. Medical professionals will not necessarily wait for the time when $I(t)$,
the number of infected people, starts decreasing. When $I''(t)$ turns negative,
then $I'(t)$ decreases. The spread of the disease slows. One may hope that
eventually $I'(t)$ becomes negative, so that the actual number of sick people
decreases. The point at which $I''(t)$ changes sign from being positive to
being negative may be considered the turning point in the spread of the
disease. One of the recent presidents was confused by a subtle argument of
this kind⁵.
⁴ If f and g are increasing functions which are inverses of each other, then the graph of
one of the functions is obtained from that of the other one by reflection in the diagonal
x = y. In this process, secant lines which are above the graph turn into secant lines below
the graph. Thus, if f is concave up, then g is concave down, and vice versa.
Let us look at this phenomenon in a concrete example. Earlier we
considered the logistic equation
\[ y' = ay - by^2. \]
See Example 1.75 and the graph of a solution of this differential equation in
Figure 1.15. Use implicit differentiation to find the second derivative:
\[ y'' = ay' - 2byy' = (a - 2by)y'. \]
We see that $y'' = 0$ if $y' = 0$ or $y = a/(2b)$. The first case occurs if $y = 0$
or $y = a/b$. We called $y = a/b$ the carrying capacity of the system, and it
was the stable equilibrium point. The inflection occurs when $y$ is half the
carrying capacity. As long as $y$ is less than $a/(2b)$, the population grows at
an increasing rate. If $a/(2b) < y < a/b$, then growth slows. You see the
turning point in the graph in Figure 1.15. For a while the population seems
to explode, but after a while it levels off so that it does not exceed the
carrying capacity.
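The sign pattern of $y''$ can be tabulated directly from the two formulas $y' = ay - by^2$ and $y'' = (a - 2by)y'$. The Python sketch below does this. The parameter values $a = 2$ and $b = 0.5$ are hypothetical and chosen only for illustration, and the function name `logistic_rates` is our own.

```python
def logistic_rates(y, a, b):
    """Return (y', y'') for the logistic equation y' = a*y - b*y**2."""
    dy = a * y - b * y * y
    d2y = (a - 2.0 * b * y) * dy
    return dy, d2y

a, b = 2.0, 0.5            # hypothetical parameters, carrying capacity a/b = 4
half = a / (2.0 * b)       # half the carrying capacity

# Below `half` the growth accelerates (y'' > 0), above it the growth slows (y'' < 0).
for y in (0.5, 1.0, half, 3.0, 3.5):
    dy, d2y = logistic_rates(y, a, b)
    print(f"y = {y:3.1f}   y' = {dy:7.4f}   y'' = {d2y:8.4f}")
```

For these parameters the table shows $y'' > 0$ for $y < 2$, $y'' = 0$ at $y = 2$, and $y'' < 0$ for $2 < y < 4$, while $y'$ stays positive throughout.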
Exercise 17. Find intervals on which the following functions are concave
up, resp., concave down.
1. $f(x) = x^3 - 4x^2 + 8x - 7$
2. $g(x) = x^4 + 2x^3 - 3x^2 + 5x - 2$
3. $h(x) = x + 1/x$
4. $i(x) = 2x^4 - x^2$
5. $j(x) = x/(x^2 - 1)$
6. $k(x) = 2\cos^2 x - x^2$ for $x \in [0, 2\pi]$.
⁵ During a televised presidential debate, one of the candidates said (see the New York
Times from October 8th, 1984, page B6): “Some of these facts and figures just don’t add
up. Yes, there has been an increase in poverty but it is a lower rate of increase than it was
in the preceding years before we got here. It has begun to decline, but it is still going up.”
2.4.2 Concavity at a Point
The notion of being concave up or down was defined for functions which are
defined on intervals. Still, we have a picture of how the function has to look
near a point, and this is the behavior which we would like to capture in a
definition.
Definition 2.26. Let $f$ be a function and $c$ an interior point⁶ of its
domain. We say that $f$ is concave up, resp., concave down, at $c$ if there exists
an open interval $I$ containing $c$ and a line $l$, called a support line, such that
$l(c) = f(c)$ and
\[ f(x) > l(x), \quad\text{resp.,}\quad f(x) < l(x), \]
for all $x \in I$ with $x \neq c$.
Figure 2.6: Concave up at •
Figure 2.7: Concave down at •
In other words, we are asking for a line l(x), such that the graph lies
on one side of the line, at least near c. If the graph is above the line,
then the function is concave up; if it is below, then the function is concave
down. We assume that the graph and the line agree at c. You see this
situation illustrated in two generic pictures in Figures 2.6 and 2.7. One
shows a function which is concave up at the indicated point, the other shows a
function which is concave down.
Our next theorem tells us how to detect concavity, and it tells us how
to find the support line if the function is differentiable.
⁶ The idea of an interior point was defined in Definition 1.18 on page 10.
Theorem 2.27. Let f be a function and c an interior point of its domain.
1. If $f'$ is increasing at $c$ or if $f''(c) > 0$, then $f$ is concave up at $c$.
2. If $f'$ is decreasing at $c$ or if $f''(c) < 0$, then $f$ is concave down at $c$.
3. If $f$ is differentiable and concave up or down at $c$, then there is only
one support line, and this line is the tangent line to the graph of $f$ at
$c$.
The sign of the second derivative of a function tells us whether the function
is concave up or down at a point. If the second derivative is zero, then the
test is inconclusive. The function can be concave up, down, or neither.
In general, there can be many support lines at any given point, but if the
function is differentiable at c, then the support line is unique. It is the
tangent line. So, for a differentiable function which is concave up or down
at a point, we can draw the tangent line easily. We just hold the ruler
against the graph.
For example, the function $f(x) = x^5 - 7x^4 + 2x^3 + 2x^2 - 5x + 4$ is concave
down at $x = 2$ because $f''(2) = -148 < 0$.
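For readers who want to check the value of the second derivative used in this example, the computation can be written out term by term:

```latex
\begin{align*}
f(x)   &= x^5 - 7x^4 + 2x^3 + 2x^2 - 5x + 4,\\
f'(x)  &= 5x^4 - 28x^3 + 6x^2 + 4x - 5,\\
f''(x) &= 20x^3 - 84x^2 + 12x + 4,\\
f''(2) &= 160 - 336 + 24 + 4 = -148 < 0.
\end{align*}
```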
To relate concavity properties on an interval to those at each point in
the interval we state, without proof, the following theorem.
Theorem 2.28. Let f be a function which is defined on an open interval
(a, b). Then f is concave up (resp., down) on (a, b) if and only if f is concave
up (resp., down) at each point in (a, b).
2.5 Local Extrema and Inflection Points
We are going to discuss two types of points which are particularly important
in the discussion of (graphs of) functions. As we like to apply local properties
of the function, we focus on interior points in the domain of the function.
Definition 2.29 (Local Extrema). Let $f$ be a function and $c$ an interior
point in its domain⁷. We say that $f$ has a local maximum, resp. minimum,
at c if
f(c) ≥ f(x), resp. f(c) ≤ f(x),
for all x in some open interval I around c. In this case we call f(c) a local
maximum, resp. minimum, of f. A local extremum is a local maximum or
minimum.
Figure 2.8: A local minimum
Figure 2.9: An Inflection Point
In other words, $f$ has a local maximum of $f(c)$ at $c$ if $f(c)$ is the largest
value compared to the values at points near $c$. The function shown in Figure 2.8
has a local minimum at $x = -1$. We will soon study tests which allow us to find
local extrema. We do not need any test to see that $f(x) = |x|$ has a local
minimum at $x = 0$, and $f(x) = -(x - 1)^2$ has a local maximum at $x = 1$.
The vertex of a parabola is always a local extremum, a local minimum if the
coefficient of $x^2$ is positive, and a local maximum if the coefficient of $x^2$ is
negative.
Definition 2.30 (Inflection Points). Let f be a function and c an inte-
rior point of its domain. We call c an inflection point of f if the concavity
of f changes at c. I.e., for some numbers a and b with a < c < b, we have
that f is concave up on the interval (a, c] and concave down on [c, b), or vice
versa.
Soon we will develop tests which detect inflection points. No test is
required to see that f(x) = tan x has an inflection point at x = 0. The
function is concave down on the interval (−π/2, 0] and concave up on the
interval [0, π/2). So the concavity changes at x = 0 and that means that
there is an inflection point at x = 0. You see the graph of this function in
Figure 2.9.
⁷ According to Definition 1.18 on page 10 this means that f(x) is defined for all x in
some open interval around c.
2.6 Detection of Local Extrema
We will discuss how to detect local extrema. The first result excludes many
points. Typically, there are very few points where local extrema can occur.
Theorem 2.31. Let $f$ be a function and $c$ an interior point of its domain. If
$f$ is differentiable at $c$ and $f'(c) \neq 0$, then $f$ does not have a local extremum
at $c$. In other words, if $f$ has a local extremum at $c$, then $f$ is either not
differentiable at $c$ or $f'(c) = 0$.
To have an abbreviation for the points which are recognized as important
in this theorem, it is customary to say:
Definition 2.32 (Critical Points). Let $f$ be a function and $c$ an interior
point of its domain. We say that $c$ is a critical point of $f$ if $f$ is differentiable
at $c$ and $f'(c) = 0$, or if $f$ is not differentiable at $c$.
Theorem 2.31 provides us with a necessary condition. If a function has
a local extremum at c, then c is a critical point of the function. No local
extrema can occur at points which are not critical. The test does not give
a sufficient condition for a local extremum. If c is a critical point of the
function, then the function need not have a local extremum at c. It makes
sense to introduce one more word.
Definition 2.33 (Saddle Points). Let $f$ be a function and $c$ an interior
point of its domain. We say that $c$ is a saddle point of $f$ if $f$ is differentiable
at $c$ and $f'(c) = 0$, but $f$ does not have a local extremum at $c$.
Proof of Theorem 2.31. Suppose that $f$ is differentiable at $c$ and $f'(c) > 0$.
Proposition 2.22 on page 76 tells us that there exists some positive number
$d$, such that $f(x) < f(c)$ for all $x \in (c - d, c)$, and $f(x) > f(c)$ for all
$x \in (c, c + d)$. So, there are points $x$ to the left of and arbitrarily close to $c$
such that $f(x) < f(c)$, and there are points $x$ to the right of and arbitrarily
close to $c$ such that $f(x) > f(c)$. This means, by definition, that $f$ does not
have a local extremum at $c$. If $f'(c) < 0$, then the same argument applies
with inequalities reversed. If $f'(c) \neq 0$, then either $f'(c) > 0$ or $f'(c) < 0$,
and in neither case do we have an extremum at $c$.
Neither the exponential function nor the logarithm function has local
extrema. To see this, observe that these functions are differentiable on their
domains, and their derivatives $\exp' x = \exp x$ and $\ln' x = 1/x$ are
everywhere nonzero. These functions have no critical points, and according to
Theorem 2.31 they have no local extrema.
Example 2.34. Find the local extrema of the function
\[ q(x) = x^2 - 2x + 3. \]
Solution: The function is differentiable for all real numbers $x$, and
\[ q'(x) = 2x - 2 = 2(x - 1). \]
So $q'(x) = 0$ if $x = 1$. The only point at which we can have a local extremum,
i.e., the only critical point, is $x = 1$. If we write the function in the form
\[ q(x) = (x - 1)^2 + 2, \]
then we see that $q$ does indeed have a local minimum at $x = 1$. You should
confirm this result by having a look at Figure 2.10, where this function is
graphed. ♦
Figure 2.10: A local minimum
Figure 2.11: A saddle point
Example 2.35. Show that the function $g(x) = x^3$ has a saddle point at
$x = 0$.
Solution: The function $g(x)$ is everywhere differentiable, and its only
critical point is at $x = 0$, which is the only zero of $g'(x) = 3x^2$. Obviously,
$g(x) > 0$ for all $x \in (0, \infty)$ and $g(x) < 0$ for all $x \in (-\infty, 0)$. This means
that there is no local extremum at $x = 0$. As $g'(0) = 0$ and there is no
local extremum at $x = 0$, the function has a saddle point at this point. This
saddle point is shown in Figure 2.11. ♦
Let us formulate a criterion which confirms that a function has a local
extremum at a point c. It gives us a sufficient condition for a local extremum
at c.
Theorem 2.36. Suppose c is an interior point of the domain of a function
f, and suppose that for some d > 0 the function is increasing on (c−d, c] and
decreasing on [c, c + d). Then f has a local maximum at c. If the function
is decreasing on (c − d, c] and increasing on [c, c + d), then f has a local
minimum at c.
Taking advantage of the information provided by the first derivative, we
obtain the following test.
Theorem 2.37 (First Derivative Test). Suppose f is a function which
is defined and differentiable on (c − d, c + d) for some d > 0, and c is a
critical point.
1. If $f'(x) > 0$ for all $x \in (c - d, c)$ and $f'(x) < 0$ for all $x \in (c, c + d)$,
then $f$ has a local maximum at $c$.
2. If $f'(x) < 0$ for all $x \in (c - d, c)$ and $f'(x) > 0$ for all $x \in (c, c + d)$,
then $f$ has a local minimum at $c$.
3. If $f'(x) > 0$ for all $x \in (c - d, c) \cup (c, c + d)$, then $f$ has a saddle point at
$c$. This conclusion also holds if $f'(x) < 0$ for all $x \in (c - d, c) \cup (c, c + d)$.
Let us illustrate the use of the theorem with an example.
Example 2.38. Find the local extrema of the function
\[ f(x) = x^3 - 3x^2 + 2x + 2. \]
Solution: We differentiate $f(x)$ and express $f'(x)$ as a product of linear
factors:
\[ f'(x) = 3x^2 - 6x + 2
   = 3\left(x - \left(1 + \frac{\sqrt{3}}{3}\right)\right)\left(x - \left(1 - \frac{\sqrt{3}}{3}\right)\right). \]
It is easy to determine where the factors are zero, positive and negative. We
conclude that $f'(x) = 0$ if $x = 1 \pm \sqrt{3}/3$, $f'(x)$ is positive on the intervals
$(-\infty, 1 - \sqrt{3}/3)$ and $(1 + \sqrt{3}/3, \infty)$, and $f'(x)$ is negative on the interval
$(1 - \sqrt{3}/3, 1 + \sqrt{3}/3)$. You can see graphs of $f$ and $f'$ in Figures 2.12 and
2.13.
Figure 2.12: $f(x) = x^3 - 3x^2 + 2x + 2$
Figure 2.13: $f'(x) = 3x^2 - 6x + 2$
The only critical points of $f$ are at $x = 1 \pm \sqrt{3}/3$, and these are the
only points where a local extremum can occur. Based on the sign of $f'(x)$ on
intervals to the left and right of these two critical points we see that $f$ has
a local maximum at $x = 1 - \sqrt{3}/3$ and a local minimum at $x = 1 + \sqrt{3}/3$.
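The sign comparison in the First Derivative Test can be imitated numerically. The following Python sketch samples $f'$ slightly to the left and right of a critical point. It assumes that $f'$ does not change sign again within the small offset `d`, and the function name `classify_by_first_derivative` is our own choice.

```python
import math

def classify_by_first_derivative(df, c, d=1e-3):
    """Classify a critical point c with f'(c) = 0 from the sign of f' near c.

    df is the derivative f'. The offset d is an assumption: f' must keep
    its sign on (c - d, c) and on (c, c + d).
    """
    left, right = df(c - d), df(c + d)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    if left * right > 0:
        return "saddle point"
    return "undecided"

def df(x):
    # f'(x) for f(x) = x^3 - 3x^2 + 2x + 2 from Example 2.38
    return 3 * x**2 - 6 * x + 2

c1 = 1 - math.sqrt(3) / 3
c2 = 1 + math.sqrt(3) / 3
print(c1, classify_by_first_derivative(df, c1))
print(c2, classify_by_first_derivative(df, c2))
```

Applied to $g'(x) = 3x^2$ at $c = 0$, the same function reports a saddle point, in line with Example 2.35.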

Exercise 18. Find the local extrema of the following functions:
(1) $f(x) = \frac{x^2 + 3x}{x - 1}$
(2) $g(x) = \sin 2x + 2 \sin x$ for $x \in [0, 2\pi]$.
Hint: We discussed the monotonicity properties of these functions in
Examples 2.18 and 2.19.
Exercise 19. Find the local extrema of the following functions. In the last
two problems, (g) and (h), restrict yourself to the interval $[0, 2\pi]$.
(a) $f(x) = 3x^2 + 5x + 7$
(b) $f(x) = x^3 - 3x^2 + 6$
(c) $f(x) = (x + 3)/(x - 7)$
(d) $f(x) = x + 1/x$
(e) $f(x) = x^3(1 + x)$
(f) $f(x) = x/(1 + x^2)$
(g) $f(x) = \cos 2x + 2 \cos x$
(h) $f(x) = \sin^2 x - \sqrt{3} \sin x$
Hint: You discussed the intervals of monotonicity for these functions in
Exercise 16.
We may use the second derivative to detect the change of sign of the first
derivative, as it is called for in the assumptions in Theorem 2.37.
Theorem 2.39 (Second Derivative Test). Let $f$ be a function and $c$ an
interior point in its domain. Assume also that $f'(c)$ and $f''(c)$ exist and that
$f'(c) = 0$. If $f''(c) > 0$, then $f$ has a local minimum at $c$. If $f''(c) < 0$, then
$f$ has a local maximum at $c$.
To apply the theorem to the detection of the local extrema of a
differentiable function $f(x)$, we differentiate $f$ and find the critical points, the
zeros of $f'(x)$. Then we differentiate $f'(x)$. The sign of $f''$ at the critical
points tells us whether we found a local minimum or a local maximum. If
$f'(c) = f''(c) = 0$, then the test is inconclusive. There may or may not be a
local extremum at $c$. Furthermore, the function $f$ can have a local extremum
at $c$ even though the assumptions of the test are not satisfied. In this sense,
the test provides us with a sufficient condition for the existence of a local
extremum at a point. It does not provide us with a necessary condition.
Example 2.40. Find the local extrema of the function (for a graph, see
Figure 2.2 on page 72)
\[ p(x) = x^3 - 3x^2 - 9x + 3. \]
Solution: We calculated the first derivative,
\[ p'(x) = 3x^2 - 6x - 9 = 3(x + 1)(x - 3). \]
The critical points of the function are $x = -1$ and $x = 3$. Furthermore,
\[ p''(x) = 6x - 6 = 6(x - 1). \]
In particular, $p''(-1) = -12$ and $p''(3) = 12$. The second derivative test
tells us that we have a local maximum at $x = -1$, because this is a critical
point and $p''(-1) < 0$. We also have a local minimum at $x = 3$ because at
this critical point the second derivative of the function is positive. ♦
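The steps of this example translate into a few lines of Python. The sketch below takes the derivatives $p'$ and $p''$ as given formulas and applies the Second Derivative Test at each critical point. The function name `second_derivative_test` is our own, and the case $p''(c) = 0$ is reported as inconclusive, as discussed above.

```python
def second_derivative_test(d2f_at_c):
    """Interpret the sign of f''(c) at a point c where f'(c) = 0."""
    if d2f_at_c > 0:
        return "local minimum"
    if d2f_at_c < 0:
        return "local maximum"
    return "inconclusive"

def dp(x):
    # p'(x) for p(x) = x^3 - 3x^2 - 9x + 3
    return 3 * x**2 - 6 * x - 9

def d2p(x):
    # p''(x)
    return 6 * x - 6

for c in (-1.0, 3.0):
    # Both points are zeros of p', i.e., critical points.
    print(c, dp(c), d2p(c), second_derivative_test(d2p(c)))
```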
Proof of the Second Derivative Test. First, let us assume that $f'(c) = 0$ and
$f''(c) > 0$. We will show that $f$ has a local minimum at $c$. The assumption
that $f'(c) = 0$ means that the tangent line to the graph of $f$ at $(c, f(c))$
is horizontal. Its equation is $l(x) = f(c)$. The assumption that $f''(c) > 0$
means that $f$ is concave up at $c$ (see Theorem 2.27 (1)). Spelled out explicitly
this means that
\[ f(x) > l(x) = f(c) \]
for some positive number $d$ and for all $x \in (c - d, c) \cup (c, c + d)$. In other
words, $f$ has a local minimum at $c$.
The proof that $f$ has a local maximum at $c$ if $f'(c) = 0$ and $f''(c) < 0$ is
similar. We leave it to the reader.
Exercise 20. Find the critical points and the local extrema.
(a) $f(x) = 4x^2 - 7x + 13$
(b) $f(x) = x^3 - 3x^2 + 6$
(c) $f(x) = x + 3/x$
(d) $f(x) = x^2(1 - x)$
(e) $f(x) = |x^2 - 16|$
(f) $f(x) = x^2/(1 + x^2)$.
2.7 Detection of Inflection Points
We defined an inflection point to be a point at which the concavity of a
function changes. If we know where the function is concave up and down,
then we can read off the inflection points directly. We want to detect
inflection points more efficiently. Theorem 2.42 provides a necessary condition
and a sufficient condition for the existence of an inflection point. Let us start
out with an example.
Example 2.41. Find the inflection points of the function
\[ g(x) = x^3 - 4x^2 + 3x - 5. \]
Figure 2.14: The graph of $g$.
Figure 2.15: The graph of $g'$.
You see the graph of $g$ in Figure 2.14 and the one of $g'$ in Figure 2.15.
We calculate the first and second derivative of $g$:
\[ g'(x) = 3x^2 - 8x + 3 \quad\text{and}\quad g''(x) = 6x - 8. \]
From the formula for the second derivative we conclude that
\[ g''(x) < 0 \text{ if } x \in (-\infty, 4/3) \quad\text{and that}\quad g''(x) > 0 \text{ if } x \in (4/3, \infty). \]
This means that $g$ is concave down on the interval $(-\infty, 4/3]$ and concave
up on $[4/3, \infty)$. By definition, we have an inflection point at $x = 4/3$. You
see the inflection point indicated as a dot in Figure 2.14. You also see that
$g'(x)$ has a local extremum at the same point. ♦
Theorem 2.42. Let f be a function and c an interior point of its domain.
Suppose that the first and second derivatives of f exist at c.
1. If $f$ has an inflection point at $c$, then $f''(c) = 0$.
2. If $f''(c) = 0$, $f'''(c)$ exists and $f'''(c) \neq 0$, then $f$ has an inflection
point at $c$.
Example 2.43. Find the inflection points of
\[ f(t) = 2t^4 - 6t^3 + 5t^2 - 7t + 4. \]
Solution: We calculate the second derivative of the function and find
\[ f''(t) = 24t^2 - 36t + 10. \]
According to the theorem, we have to find the zeros of $f''(t)$ to determine
where an inflection point can be. The roots are
\[ t = \frac{3}{4} \pm \frac{1}{12}\sqrt{21} = \frac{9 \pm \sqrt{21}}{12}. \]
Now, let us check whether there are inflection points at either of these values
for $t$. We calculate the third derivative of $f$:
\[ f'''(t) = 48t - 36. \]
We could plug $t = (9 \pm \sqrt{21})/12$ into the expression for $f'''$, but this is a
bit cumbersome. We see right away that $f'''(t) = 0$ exactly if $t = 3/4$, and
this means that $f'''((9 \pm \sqrt{21})/12) \neq 0$. The theorem says that the inflection
points of $f(t)$ are at $t = (9 \pm \sqrt{21})/12$. ♦
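If you prefer to check the two candidates numerically, the following Python sketch evaluates $f''$ and $f'''$ at the roots found above. The roots are computed with the quadratic formula, and the names `d2f`, `d3f` and `roots` are our own.

```python
import math

def d2f(t):
    # f''(t) for f(t) = 2t^4 - 6t^3 + 5t^2 - 7t + 4
    return 24 * t**2 - 36 * t + 10

def d3f(t):
    # f'''(t)
    return 48 * t - 36

# Zeros of f'' from the quadratic formula: t = (36 +- sqrt(36^2 - 4*24*10)) / 48.
disc = 36**2 - 4 * 24 * 10
roots = [(36 - math.sqrt(disc)) / 48, (36 + math.sqrt(disc)) / 48]

for t in roots:
    # f''(t) should vanish up to rounding, and f'''(t) should be nonzero.
    print(f"t = {t:.6f}   f''(t) = {d2f(t):.2e}   f'''(t) = {d3f(t):.4f}")
```

Both values of $f'''$ are nonzero (they equal $\pm 4\sqrt{21}$), so part (2) of the theorem applies at both roots.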
Apparently, our ability to find inflection points of a function is limited by
our ability to find the zeros of its second derivative. If we are given graphical
information, then this is quite easy.
Example 2.44. Find the inflection points of the function
\[ f(x) = \sqrt{1.2 + x^2 - 3(\sin x)^3}. \]
Figure 2.16: The graph of $f$.
Figure 2.17: The graph of $f''$.
Apparently, it will take an effort to calculate the second derivative of
this function, and it will be nearly impossible to find the zeros of $f''$. Any
reasonable software has no problem with this. We asked the computer to
graph $f$ and $f''$ for $x \in [-3, 3]$. You see the graphs in Figures 2.16 and 2.17.
A look at the graph of $f$ barely reveals some of the inflection points, but
the graph of $f''$ shows them clearly. Zooming in on parts of the graph of $f$ will
not improve this. At least in this example, the graph of $f''$ tells us much
more about the concavity of the function $f$ than its own graph. ♦
Exercise 21. Discuss the relation between the inflection points of a function
$f$ and the local extrema of its derivative $f'$.
2.8 Absolute Extrema of Functions
We said that a function f has a local maximum at c if its value at c is largest
in comparison to the values at points near c. In many cases we like to find the
maximal value of a function, and where it occurs, anywhere in the domain
of the function. This concept is captured in the following definition.
Definition 2.45. Let f be a function, and c a point in its domain. We say
that f has an absolute maximum at c if f(x) ≤ f(c) for all x in the domain
of f. Then we call f(c) the absolute maximum of f. If f(x) ≥ f(c) for all
x in the domain of f, then we say that f has an absolute minimum at c,
and we call f(c) the absolute minimum of f.
A different expression is to say that the function assumes its absolute
extremum at c.
Theorem 2.46. A continuous function on a closed interval [a, b] assumes
its absolute maximum and minimum either at a critical point or at an end-
point of the interval.
Proof. In Theorem 1.17 we asserted that a continuous function on a closed
interval assumes its absolute maximum at some point in the interval. If the
function does not assume its absolute maximum at an endpoint, then it does so
at some interior point $c$, and the function has a local maximum at $c$. If $f$ is
not differentiable at $c$, then $c$ is critical. If $f$ is differentiable at $c$, then
$f'(c) = 0$ by Theorem 2.31, and $c$ is critical as well. The argument for the
absolute minimum is left to the reader.
Example 2.47. Find the absolute extrema of the function
\[ f(x) = x^3 - 5x^2 + 6x + 1 \]
for $x \in [0, 4]$.
Figure 2.18: $x^3 - 5x^2 + 6x + 1$.
Figure 2.19: $3x^2 - 10x + 6$
Solution: According to Theorem 2.46, the absolute extrema of the function
occur either at one of the end points $x = 0$, $x = 4$, or at a critical
point. In fact, $f(0) = 1$ and $f(4) = 9$. The critical points, i.e., the zeros of
$f'(x) = 3x^2 - 10x + 6$, are $x = (5 \pm \sqrt{7})/3$. Approximate values of these
roots are 2.5486 and 0.7847. You may also check that $f''(x) = 6x - 10$, and
\[ f''((5 + \sqrt{7})/3) > 0 \quad\text{and}\quad f''((5 - \sqrt{7})/3) < 0. \]
The second derivative test tells us that the function has a local minimum at
$x = (5 + \sqrt{7})/3$ and a local maximum at $x = (5 - \sqrt{7})/3$. The approximate
values of the function at these points are
\[ f((5 + \sqrt{7})/3) \approx 0.3689 \quad\text{and}\quad f((5 - \sqrt{7})/3) \approx 3.1126. \]
Comparing the values of $f(x)$ at these four points, we conclude that the
function assumes its absolute maximum of 9 at $x = 4$, and its absolute
minimum of approximately 0.3689 at $x = (5 + \sqrt{7})/3$.
You may compare our calculation with the graph of $f$ in Figure 2.18
and the one of $f'$ in Figure 2.19. ♦
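The comparison of values at the end points and the critical points, as justified by Theorem 2.46, can be sketched in Python. The critical points are entered from the formula $x = (5 \pm \sqrt{7})/3$ found in the solution, and the variable names are our own.

```python
import math

def f(x):
    return x**3 - 5 * x**2 + 6 * x + 1

a, b = 0.0, 4.0
# Zeros of f'(x) = 3x^2 - 10x + 6.
critical = [(5 - math.sqrt(7)) / 3, (5 + math.sqrt(7)) / 3]

# Candidates for absolute extrema: end points and interior critical points.
candidates = [a, b] + [c for c in critical if a < c < b]
values = {x: f(x) for x in candidates}

x_max = max(values, key=values.get)
x_min = min(values, key=values.get)

for x in sorted(values):
    print(f"x = {x:.4f}   f(x) = {values[x]:.4f}")
print("absolute maximum at", x_max, "absolute minimum at", x_min)
```

This approach relies on $f$ being differentiable on the whole interval, so that the zeros of $f'$ are the only critical points.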
Exercise 22. Find the absolute extrema of the functions on the indicated
intervals.
(a) $f(x) = x^2 - 5x + 2$ for $x \in [0, 5]$
(b) $f(x) = x^3 + 3x^2 - 5x + 2$ for $x \in [-3, 2.5]$
(c) $f(x) = \sqrt{2 + x}/\sqrt{1 + x}$ for $x \in [0, 5]$
(d) $f(x) = \cos 2x + 2 \cos x$ for $0 \le x \le 2\pi$
(e) $f(x) = \sin x + \cos x$ for $0 \le x \le 2\pi$
2.9 Optimization Story Problems
Many real-life problems are formulated as optimization problems. Calculus
helps us to solve these optimization problems. To avoid lengthy
introductions to real-life problems, we content ourselves with problems of an
algebraic or geometric nature. We consider a few examples and give some
problems for practice.
Example 2.48. Cut a string of length 50 centimeters into two pieces. Use
one piece as the perimeter of an equilateral triangle and the other one as the
perimeter of a disk. How long should each piece be, so that the combined
area of the triangle and the circle is minimal? How long should each piece
be, so that the combined area of the triangle and the circle is maximal?
In our solution we will go through several steps.
Introduction of notation: There are many ways to set up the notation
to solve this problem. Among them we say that the side length of the triangle
is a and the radius of the circle is r.
Express information as equations: The perimeter of the triangle
will be $3a$ and the perimeter of the circle will be $2\pi r$. This means that
\[ 3a + 2\pi r = 50 \quad\text{and}\quad a = \frac{50 - 2\pi r}{3}. \]
The height of the triangle is $h = \frac{a\sqrt{3}}{2}$, and its area is $\frac{a^2\sqrt{3}}{4}$. The area of the
disk is $\pi r^2$. The combined area of the triangle and disk is
\[ A = \frac{a^2\sqrt{3}}{4} + \pi r^2 = \frac{\sqrt{3}}{4}\left(\frac{50 - 2\pi r}{3}\right)^2 + \pi r^2. \]
For this to make sense, we need that $0 \le r \le 25/\pi$.
Formulate the problem mathematically: Find the absolute minimum
(maximum) of the function
\[ A(r) = \frac{\sqrt{3}}{4}\left(\frac{50 - 2\pi r}{3}\right)^2 + \pi r^2 \]
for $r \in [0, 25/\pi]$.
Solve the mathematical problem: The derivative of $A(r)$ is
\[ A'(r) = \frac{\sqrt{3}}{4} \cdot 2\left(\frac{50 - 2\pi r}{3}\right)\left(-\frac{2\pi}{3}\right) + 2\pi r
        = -\frac{\pi}{\sqrt{3}}\left(\frac{50 - 2\pi r}{3}\right) + 2\pi r, \]
and $A'(r) = 0$ if and only if $r = 50/(2\pi + 6\sqrt{3})$. We note that the graph of
$A(r)$ is a parabola which is open upwards. The critical point, which we just
found, is where the local minimum occurs. It is also the absolute minimum of
$A(r)$ on any interval which contains the critical point. For the end points we have:
\[ A(0) \approx 120.28 \quad\text{and}\quad A(25/\pi) \approx 198.94. \]
Answer the original question: The combined area of the disk and
the triangle will be minimal if $r = 50/(2\pi + 6\sqrt{3})$, and it will be maximal
if $r = 25/\pi$. In the latter case, all of the string is used for the circle. ♦
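The numbers in this example are easy to reproduce. The Python sketch below evaluates the area function and its derivative from the formulas in the solution. The names `area`, `area_derivative`, `r_crit` and `r_end` are our own, and the string length 50 is the one given in the example.

```python
import math

def area(r):
    """Combined area of the triangle and the disk for disk radius r."""
    a = (50 - 2 * math.pi * r) / 3          # side length of the triangle
    return math.sqrt(3) / 4 * a**2 + math.pi * r**2

def area_derivative(r):
    """A'(r) = -(pi/sqrt(3)) * (50 - 2*pi*r)/3 + 2*pi*r."""
    return -(math.pi / math.sqrt(3)) * (50 - 2 * math.pi * r) / 3 + 2 * math.pi * r

r_crit = 50 / (2 * math.pi + 6 * math.sqrt(3))   # critical point
r_end = 25 / math.pi                              # all string used for the circle

print("r_crit    =", r_crit, " A'(r_crit) =", area_derivative(r_crit))
print("A(0)      =", area(0.0))
print("A(r_crit) =", area(r_crit))
print("A(25/pi)  =", area(r_end))
```

The output confirms that the smallest of the three areas occurs at the critical point and the largest at $r = 25/\pi$.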
Exercise 23. Repeat the previous example with
1. a disk and a square.
2. an equilateral triangle and a square.
3. a regular hexagon and a square.
4. a disk and half an equilateral triangle (the angles at 30, 60 and 90
degrees).
5. two geometric shapes of your own choice.
Example 2.49. Construct an open box from a rectangular piece of card
board of length L and width W. What are the dimensions of the box with
the largest possible volume?
In our solution we will go through several steps.
Clarification and introduction of notation: We construct the box
by making an incision at a 45 degree angle at each corner. Then we fold
up a strip of width $x$ along each side⁸. For yourself, draw a picture of
this production process, and convince yourself that any box obtained by a
different process will have smaller volume. To simplify matters, we call the
longer side of the rectangle $L$ and the shorter one $W$.
Express information as equations: As we folded up a strip of width
$x$, the box will have width $W - 2x$, length $L - 2x$, height $x$, and volume
\[ V(x) = (W - 2x)(L - 2x)x = WLx - 2(L + W)x^2 + 4x^3. \]
By construction, $x \ge 0$, $x \le W/2$, and $x \le L/2$. As $W \le L$, it suffices to
require $x \le W/2$.
Formulate the problem mathematically: Find the absolute maximum
of the function
\[ V(x) = WLx - 2(L + W)x^2 + 4x^3 \]
for $x \in [0, W/2]$.
Solve the mathematical problem: At the end points of the interval
$V$ vanishes, i.e., $V(0) = V(W/2) = 0$. On the interior of the interval the
function is positive. The derivative of $V$ is
\[ V'(x) = WL - 4(W + L)x + 12x^2. \]
The zeros of $V'$ are at
\[ x = \frac{1}{6}\left((L + W) \pm \sqrt{L^2 + W^2 - LW}\right). \]
⁸ You could have cut out a square of size $x \times x$ at each corner.
The function has an inflection point at $(W + L)/6$, to the right of which $V''$
is positive and $V$ is concave up, and to the left of which $V$ is concave down.
We conclude that $V$ has a local maximum at
\[ x = \frac{1}{6}\left((L + W) - \sqrt{L^2 + W^2 - LW}\right). \]
As the function $V(x)$ has only one local maximum in the interval, the local
maximum is the same as the absolute maximum.
Answer the original question: The box with the largest volume will
have a height of
\[ x = \frac{1}{6}\left((L + W) - \sqrt{L^2 + W^2 - LW}\right). \]
Its width will be $W - 2x$ and its length $L - 2x$. ♦
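To see the formula at work, you may try it with specific numbers. The Python sketch below uses the hypothetical dimensions $L = 8$ and $W = 5$, which are not taken from the text and are chosen so that the square root comes out as an integer. The function names `best_fold` and `volume` are our own.

```python
import math

def best_fold(L, W):
    """Strip width x that maximizes the volume, from the formula in Example 2.49."""
    return ((L + W) - math.sqrt(L * L + W * W - L * W)) / 6

def volume(x, L, W):
    """Volume of the open box obtained by folding up strips of width x."""
    return (W - 2 * x) * (L - 2 * x) * x

L, W = 8.0, 5.0                      # hypothetical dimensions of the cardboard
x_best = best_fold(L, W)
print("x =", x_best, " volume =", volume(x_best, L, W))

# Nearby strip widths give a smaller volume.
for x in (x_best - 0.1, x_best + 0.1):
    print("x =", x, " volume =", volume(x, L, W))
```

For these dimensions the optimal strip width is $x = 1$, and the box measures $3 \times 6 \times 1$.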
Exercise 24. Repeat the previous example with specific numbers for the
width and length of the piece of card board.
Exercise 25. Start out with an equilateral piece of card board with side
length a. Make incisions at the corners, and fold up strips along the edges.
You will get an open box whose base is an equilateral triangle. How broad
should the folded up strips be, so that the volume of the box is maximal?
Exercise 26. Modify the problem from above, constructing a box with a
round base from a circular piece of card board.
Exercise 27. What is the largest possible volume for a right circular cone
of slant height a?
Example 2.50. Determine the rectangle of maximal area which can be
placed between the x-axis and the graph of the function f(x) = sin x.
Solution: Draw a graph of sin x so that you can follow the discussion.
Convince yourself that the vertices of the rectangle should be (x, 0), (π −
x, 0), (x, sin x) and (π − x, sin x) for some x ∈ [0, π/2]. The width of the
rectangle is π −2x and its height is sin x, so that its area is
A(x) = (π −2x) sin x.
We need to find the absolute maximum for this function for x ∈ [0, π/2].
The first derivative of this function is $A'(x) = -2 \sin x + (\pi - 2x) \cos x$.
After a simple algebraic simplification, you find that
\[ A'(x) = 0 \quad\text{if and only if}\quad \tan x = \frac{\pi - 2x}{2}. \]
Find an approximate solution of the equation using Newton’s method or your
calculator. A fairly good approximation of the zero of $A'(x)$ is $x_0 = 0.710462$.
Convince yourself⁹ that this is the only zero of $A'(x)$ for $x \in [0, \pi/2]$. We
conclude that $x_0$ is the only critical point of $A(x)$.
You may calculate $A''(x)$. Substituting $x_0$ you will see that $A''(x_0) < 0$.
It follows from the second derivative test that $A(x)$ has a local maximum
at $x_0$. Apparently $A(x) = 0$ at the end points $x = 0$ and $x = \pi/2$ of the
interval. This tells us that $A(x)$ assumes its absolute maximum at $x_0$.
With this, the final answer to our problem is: The rectangle of maximal
area which can be placed between the $x$-axis and the graph of the sine
function will have a width of approximately $\pi - 2x_0 = 1.72066$ and a height
of $\sin x_0 = 0.652183$. Its area will be about 1.12218. ♦
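The approximation of $x_0$ can be reproduced with a few Newton steps applied to $A'(x)$. The Python sketch below is one way to do this. The starting value 0.7, the iteration cap and the stopping tolerance are our own choices, and $A''(x) = -4\cos x - (\pi - 2x)\sin x$ is obtained by differentiating $A'(x)$.

```python
import math

def dA(x):
    # A'(x) for A(x) = (pi - 2x) sin x
    return -2 * math.sin(x) + (math.pi - 2 * x) * math.cos(x)

def d2A(x):
    # A''(x), the derivative of A'(x)
    return -4 * math.cos(x) - (math.pi - 2 * x) * math.sin(x)

def newton(g, dg, x, steps=20, tol=1e-12):
    """Newton's method for g(x) = 0, starting at x."""
    for _ in range(steps):
        step = g(x) / dg(x)
        x -= step
        if abs(step) < tol:
            break
    return x

x0 = newton(dA, d2A, 0.7)
width = math.pi - 2 * x0
height = math.sin(x0)
print("x0 =", x0, " A''(x0) =", d2A(x0))
print("width =", width, " height =", height, " area =", width * height)
```

The sign of `d2A(x0)` is negative, which matches the use of the second derivative test in the solution.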
To find the absolute extrema of a continuous function on an interval of
the form [a, b] we could inspect the values of the function at the critical
points and at a and b. This allows us to decide whether a local extremum is
also an absolute one. Our next result allows us to do the same even if the
interval is not closed and bounded. The assumptions of this theorem are
satisfied in many applied problems.
Theorem 2.51. Suppose f is defined on an interval I.
(a) If $f$ is concave up on $I$ and has a local minimum at $x_0$, then $f$ assumes
its absolute minimum at $x_0$.
(b) If $f$ is concave down on $I$ and has a local maximum at $x_0$, then $f$
assumes its absolute maximum at $x_0$.
Example 2.52. Find the absolute minimum of the function
\[ f(x) = x + \frac{1}{x} \]
for $x \in (0, \infty)$.
Solution: We calculate the first and second derivative of f(x):
\[ f'(x) = 1 - \frac{1}{x^2} \quad\text{and}\quad f''(x) = \frac{2}{x^3}. \]
We find that $f'(x) = 0$ if $x = 1$, and that $f''(x) > 0$ for all x in (0, ∞). So f
has a local minimum at x = 1, and f is concave up on (0, ∞). Theorem 2.51
tells us that the absolute minimum of the function is f(1) = 2. ♦
9 One possible argument is that tan x is increasing on the interval [0, π/2), and that
$\frac{\pi - 2x}{2}$ is decreasing. So these functions can intersect in only one point.
Exercise 28. Find the largest possible area for a rectangle with base on
the x-axis and upper vertices on the curve $y = 4 - x^2$.
Exercise 29. A rectangular warehouse will have 5000 m² of floor space and
will be separated into two rectangular rooms by an interior wall. The cost of
the exterior walls is $1,000.00 per linear meter and the cost of the interior
wall is $600.00 per linear meter. Find the dimensions of the warehouse that
minimize the construction cost.
Exercise 30. One side of a rectangular meadow is bounded by a cliff, the
other three sides by straight fences. The total length of the fence is 600
meters. Determine the dimensions of the meadow so that its area is maximal.
Exercise 31. Draw a rectangle with one vertex at the origin (0, 0) in the
plane, one vertex on the positive x-axis, one vertex on the positive y-axis,
and one vertex on the line 3x + 5y = 15. What are the dimensions of a
rectangle of this kind with maximal area?
Exercise 32. Two hallways, one 8 feet wide and one 6 feet wide, meet at a
right angle. Determine the length of the longest ladder that can be carried
horizontally from one hallway into the other one.
Exercise 33. Inscribe a right circular cylinder into a right circular cone of
height 25 cm and radius 6 cm. Find the dimensions of the cylinder if its
volume is to be a maximum.
Exercise 34. A right circular cone is inscribed in a sphere of radius R.
Find the dimensions of the cone if its volume is to be maximal.
Exercise 35. Find the dimensions of a right circular cone of minimal vol-
ume, so that a ball of radius 10 centimeters can be inscribed.
Exercise 36. Consider a triangle in the plane with vertices (0, 0), (a, 0),
and (0, b). Suppose that a and b are positive, and that (2, 5) lies on the line
through the points (a, 0), and (0, b). What should the slope of the line be,
so that the area of the triangle is minimal?
Exercise 37. Minimize the cost of the material needed to make a round
drum with a volume of 200 liters (i.e., .2 m³) if
(a) the drum has a bottom and a top, and the same material is used for
the top, bottom and sides.
(b) the drum has no top (but a bottom) and the same material is used for
the bottom and sides.
(c) the drum has a bottom and a top, the same material is used for the
top and bottom, and the material for the top and bottom is twice as
expensive as the material for the sides.
(d) the situation is as in the previous case, but the top and the bottom
are cut out of squares, and the left over material is recycled for half
its value.
Exercise 38. Consider a box with a round base and no lid whose interior
is subdivided into six wedge shaped sectors. Which shape should it have,
so that its volume is maximal, assuming you are allowed a fixed amount of
material? More specifically determine the ratio of radius and height which
will maximize the volume.
Exercise 39. Design a roman window with a perimeter of 4 m which admits
the largest amount of light. (A roman window has the shape of a rectangle
capped by a semicircle.)
Exercise 40. A rectangular banner has a red border and a white center.
The width of the border at top and bottom is 15 cm, and along the sides
10 cm. The total area is 1 m². What should be the dimensions of the banner
if the area of the white center is to be maximized?
Exercise 41. A power line is needed to connect a power station on the
shore line to an island 2 km off shore. The point on the coast line closest to
the island is 6 km from the power station, and, for all practical purposes, you
may suppose that the shore line is straight. To lay the cable costs $40,000
per kilometer under ground and $70,000 per kilometer under water. Find the
minimal cost for laying the cable.
Exercise 42. Consider the distance D(x) between a point P(x) = (x, f(x))
on the graph of a differentiable function f(x) and a point $Q = (x_0, y_0)$
not on this graph. Suppose D(x) has a local minimum at $x_1$. Then the
tangent line to the graph of f at $x_1$ intersects the line joining $P(x_1)$ and Q
perpendicularly.
2.10 Sketching Graphs
The techniques which we developed so far provide us with some valuable tools
for graphing functions. Let us make a list of data which we may determine,
so that we can sketch a graph rather precisely. Going through the following
program is also a good review of the material which we developed in this
chapter.
Useful information for graphing a function: We call the function
f(x).
(a) Plot some points on the graph, such as the y-intercept. If the function
is given on a closed interval, plot the values at its endpoints.
(b) Plot the zeros of the function. If you cannot find the zeros by analytical
means, try it numerically (Newton’s method).
(c) If possible, decide on which intervals the function is positive, resp.,
negative.
(d) Find the first derivative $f'(x)$ of f(x).
(e) Repeat (b) and (c) with $f'(x)$ in place of f(x). Intervals on which
$f'(x)$ is positive give you intervals on which f(x) is increasing, and
intervals on which $f'(x)$ is negative give you intervals on which f(x)
is decreasing. The zeros of $f'(x)$ provide you with the critical points
of f(x). Plot the critical points (x and y value), and keep track of the
intervals on which the function is increasing, resp., decreasing.
(f) Find the second derivative $f''(x)$ of f(x).
(g) Repeat (b) and (c) with $f''(x)$ in place of f(x). Intervals on which
$f''(x)$ is positive give you intervals on which f(x) is concave up, and
intervals on which $f''(x)$ is negative give you intervals on which f(x)
is concave down. Find the inflection points of the function, i.e., the
points where the concavity changes. Plot the inflection points (x and y
value), and keep track of the intervals on which the function is concave
up, resp., concave down.
(h) Decide at which critical points of f(x) the function has a saddle point
or local extremum, and whether it is a minimum or a maximum.
If you now draw a graph which exhibits all of the properties which you
gathered in the course of the suggested program, then your graph will look
very much like the graph of f(x). More importantly, the graph will have all
of the essential features of the graph of f(x). Let us go through the program
in an example.
Example 2.53. Discuss the graph of the function
\[ f(x) = x^4 - 2x^3 - 3x^2 + 8x - 4 \quad\text{for } x \in [-3, 3]. \]
Solution: To make the discussion a little easier, we note that
\[ f(x) = (x - 1)^2 (x^2 - 4) = (x - 1)^2 (x - 2)(x + 2). \tag{2.1} \]
You should verify this by multiplying out the expression for f(x) in (2.1).
(a): Plot the y intercept of the function and its values at the end points
of the given interval: f(−3) = 80, f(0) = −4 and f(3) = 20.
(b): As a polynomial, the function f(x) is differentiable on the given
interval. The only exceptional points are its zeros. Having written f(x) as
in (2.1), we see right away that f(x) = 0 if and only if x = −2, x = 1, or
x = 2. Plot these x-intercepts.
(c): Counting the signs of the factors of f(x), we see that f(x) is positive
on the intervals [−3, −2) and (2, 3], and negative on (−2, 1) and (1, 2).
(d): We calculate the derivative of f(x):
\[ f'(x) = 2(x - 1)(x^2 - 4) + (x - 1)^2 \cdot 2x = 2(x - 1)(2x^2 - x - 4). \]
We based the calculation on the description of f(x) in (2.1). In the first
step we applied the product rule, and then we used elementary algebra.
(e): We use the quadratic formula to find the zeros of the factor $2x^2 - x - 4$
in the expression for $f'(x)$. They are $(1 \pm \sqrt{33})/4$. This allows us to factor
the expression for $f'(x)$, and we find:
\[ f'(x) = 4(x - 1)\left(x - \tfrac{1}{4}[1 + \sqrt{33}]\right)\left(x - \tfrac{1}{4}[1 - \sqrt{33}]\right). \]
We conclude that:
• $f'(x)$ is negative on the interval $[-3, (1 - \sqrt{33})/4)$ and f(x) is decreasing
on $[-3, (1 - \sqrt{33})/4]$.
• $f'(x)$ is positive on the interval $((1 - \sqrt{33})/4, 1)$ and f(x) is increasing
on $[(1 - \sqrt{33})/4, 1]$.
• $f'(x)$ is negative on the interval $(1, (1 + \sqrt{33})/4)$ and f(x) is decreasing
on $[1, (1 + \sqrt{33})/4]$.
• $f'(x)$ is positive on the interval $((1 + \sqrt{33})/4, 3]$ and f(x) is increasing
on $[(1 + \sqrt{33})/4, 3]$.
• f(x) has a critical point and local minimum at $(1 - \sqrt{33})/4 \approx -1.19$,
a critical point and local maximum at x = 1, and a critical point and
local minimum at $(1 + \sqrt{33})/4 \approx 1.69$.
The values of the function at its three critical points are approximately:
\[ f\left(\frac{1 - \sqrt{33}}{4}\right) \approx -12.39, \qquad f(1) = 0, \qquad f\left(\frac{1 + \sqrt{33}}{4}\right) \approx -.54. \]
Plot these points.
(f): We rewrite the first derivative as $f'(x) = 4x^3 - 6x^2 - 6x + 8$, and
find
\[ f''(x) = 12x^2 - 12x - 6. \]
(g): We use the quadratic formula to find the zeros of $f''(x)$ and factor
it:
\[ f''(x) = 12\left(x - \tfrac{1}{2}[1 + \sqrt{3}]\right)\left(x - \tfrac{1}{2}[1 - \sqrt{3}]\right). \]
We conclude that:
• $f''(x)$ is positive on the interval $[-3, (1 - \sqrt{3})/2)$ and f(x) is concave
up on $[-3, (1 - \sqrt{3})/2]$.
• $f''(x)$ is negative on the interval $((1 - \sqrt{3})/2, (1 + \sqrt{3})/2)$ and f(x) is
concave down on $[(1 - \sqrt{3})/2, (1 + \sqrt{3})/2]$.
• $f''(x)$ is positive on the interval $((1 + \sqrt{3})/2, 3]$ and f(x) is concave up
on $[(1 + \sqrt{3})/2, 3]$.
• f(x) has inflection points at $x = (1 - \sqrt{3})/2 \approx -.37$ and at
$x = (1 + \sqrt{3})/2 \approx 1.37$.
The values of the function at its inflection points are approximately:
\[ f\left(\frac{1 - \sqrt{3}}{2}\right) \approx -7.21, \qquad f\left(\frac{1 + \sqrt{3}}{2}\right) \approx -.29. \]
Plot these points.
(h): At this point we could use the second derivative test to find at which
critical points the function has local extrema, but we decided this already
based on first derivative behaviour in (e).
Let us gather and organize our information. We consider the intervals:
$I_1 = [-3, -2]$
$I_2 = \left[-2, \frac{1 - \sqrt{33}}{4}\right]$
$I_3 = \left[\frac{1 - \sqrt{33}}{4}, \frac{1 - \sqrt{3}}{2}\right]$
$I_4 = \left[\frac{1 - \sqrt{3}}{2}, 1\right]$
$I_5 = \left[1, \frac{1 + \sqrt{3}}{2}\right]$
$I_6 = \left[\frac{1 + \sqrt{3}}{2}, \frac{1 + \sqrt{33}}{4}\right]$
$I_7 = \left[\frac{1 + \sqrt{33}}{4}, 2\right]$
$I_8 = [2, 3]$.
We tabulate which properties hold on which interval. It should be
understood that at some end points of intervals the function is zero.

Property       I_1   I_2   I_3   I_4   I_5   I_6   I_7   I_8
Sign           pos   neg   neg   neg   neg   neg   neg   pos
Monotonicity   dec   dec   inc   inc   dec   dec   inc   inc
Concavity      up    up    up    down  down  up    up    up

Table 2.1: Properties of the Graph
In Figure 2.20 you see the graph of the function. We have shown it on
a slightly smaller interval, as the values at the endpoints are comparatively
large. Showing all of the graph would show less clearly what happens near
the intercepts, extrema, and inflection points. The dots indicate the points
which we suggested plotting.
In Figure 2.21 you see the graph of f on an even smaller interval, and
parts of the graphs of $f'$ and $f''$. You can use them to see that f is decreasing
where $f'$ is negative, f is concave down where $f''$ is negative, etc. ♦
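The numerical values quoted in Example 2.53 can be checked with a short computation. The following Python sketch is an illustration only and not part of the original text; the function names `f`, `f1`, `f2` are hypothetical, and the derivatives are entered by hand from the expanded form of f(x).

```python
import math

def f(x):
    return x**4 - 2*x**3 - 3*x**2 + 8*x - 4

def f1(x):
    # first derivative of f
    return 4*x**3 - 6*x**2 - 6*x + 8

def f2(x):
    # second derivative of f
    return 12*x**2 - 12*x - 6

# zeros of f' and f'' as found in steps (e) and (g)
critical = [(1 - math.sqrt(33)) / 4, 1.0, (1 + math.sqrt(33)) / 4]
inflection = [(1 - math.sqrt(3)) / 2, (1 + math.sqrt(3)) / 2]

for c in critical:
    # second derivative test at each critical point
    kind = "local minimum" if f2(c) > 0 else "local maximum"
    print(f"critical point x = {c:.2f}, f(x) = {f(c):.2f}, {kind}")

for p in inflection:
    print(f"inflection point x = {p:.2f}, f(x) = {f(p):.2f}")
```

The second derivative test used in the loop agrees with the classification obtained from the sign of the first derivative in step (e).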
Exercise 43. In analogy with the previous example, discuss the function
\[ f(x) = (x - 1)(x - 2)(x + 2) = x^3 - x^2 - 4x + 4 \]
on the interval [−3, 2.5]. In addition, find the absolute extrema of this
function.
Figure 2.20: The Graph
Figure 2.21: $f$, $f'$, $f''$
Exercise 44. In analogy with the previous example, discuss the function
\[ f(x) = x^3 - 3x + 2 \]
on the interval [−2, 2]. In addition, find the absolute extrema of this function.
Exercise 45. In analogy with the previous example, discuss the function
\[ f(x) = 2 \sin x + \cos 3x \]
on the interval [0, 2π]. In addition, find the absolute extrema of this function.
You may have to apply Newton’s method to find zeros of $f$, $f'$, and $f''$.
Chapter 3
Integration
We will introduce the ideas of the definite and the indefinite integral. Suppose
that f is a function which is defined and bounded on the interval [a, b].
If it exists, then the definite integral of f over the interval [a, b] is a real
number. It is denoted by
\[ \int_a^b f(x)\,dx. \]
The definition is set up, so that for a non-negative function it makes sense
to think of the integral as the area of the region bounded by the graph of
the function, the x-axis, and the lines x = a and x = b.
The indefinite integral of a function f is the family (set) of all antideriva-
tives of f, i.e. all functions whose derivative is f. For important classes of
functions one may utilize definite integrals to construct antiderivatives. The
Fundamental Theorem of Calculus relates definite integrals and antideriva-
tives.
To be concrete, consider the function $f(x) = x^2 e^{-x}$, shown in Figure 3.1,
and find the area of the region Ω bounded by the graph of f(x), the lines
x = 1 and x = 5, and the x-axis.
3.1 Properties of Areas
So far, we only know the area of some simple regions, like rectangles. We
will denote the area of a region Ω by Area(Ω). Whatever concept of area we
have in mind, it should have the following properties:
• The area of a rectangle is the product of the lengths of its sides.
• Suppose that $\Omega_1$ and $\Omega_2$ are regions in the plane, and that the area of
each of them is defined.
If $\Omega_1 \subseteq \Omega_2$, then $\mathrm{Area}(\Omega_1) \le \mathrm{Area}(\Omega_2)$.
• Suppose that $\Omega_1$ and $\Omega_2$ are regions in the plane, and that the area
of each of them is defined. If the regions $\Omega_1$ and $\Omega_2$ do not intersect,
then the area of the union $\Omega_1 \cup \Omega_2$ of $\Omega_1$ and $\Omega_2$ is defined, and
\[ \mathrm{Area}(\Omega_1 \cup \Omega_2) = \mathrm{Area}(\Omega_1) + \mathrm{Area}(\Omega_2). \]
Figure 3.1: $f(x) = x^2 e^{-x}$
Suppose for a moment that the region under the graph shown in Figure 3.1
has an area. In Figure 3.2 you see a rectangle $R_l$ with area .6, which
is contained in Ω. In Figure 3.3 you see a rectangle $R_u$ with area 2.24 which
contains Ω. The first two principles tell us that
\[ \mathrm{Area}(R_l) = .6 \le \mathrm{Area}(\Omega) \le \mathrm{Area}(R_u) = 2.24. \]
Figure 3.2: A rectangle $R_l$ contained in Ω
Figure 3.3: A rectangle $R_u$ containing Ω
From the above principles one may derive another one, which occurs
frequently in our upcoming constructions:
• Suppose the region R in the plane is the union of a finite number of
rectangles $R_1, \dots, R_n$ and any two of them intersect at most in an
edge. Then Area(R) is defined, and it is equal to the sum of the areas
of the regions $R_1, \dots, R_n$:
\[ \mathrm{Area}(R) = \mathrm{Area}(R_1) + \cdots + \mathrm{Area}(R_n). \]
3.2 Partitions and Sums
We would like to refine the approach to calculating areas of regions which we
started in the previous section. We do so by partitioning the interval before
applying the ideas from above, and then we add up what we get over the
individual intervals.
A partition of an interval [a, b] is a collection of points $\{x_j \mid 0 \le j \le n\}$,
such that
\[ a = x_0 \le x_1 \le \cdots \le x_{n-1} \le x_n = b. \]
The interval [a, b] is partitioned into n intervals $[x_{j-1}, x_j]$ with $1 \le j \le n$.
3.2.1 Upper and Lower Sums
As before, f denotes a function which is defined and bounded on [a, b]. On
each interval we pick numbers $m_j$ and $M_j$, such that
\[ m_j \le f(x) \le M_j \quad\text{for all } x \in [x_{j-1}, x_j]. \]
We define the lower sum to be
\[ S_l = m_1(x_1 - x_0) + m_2(x_2 - x_1) + \cdots + m_n(x_n - x_{n-1}) \tag{3.1} \]
and the upper sum to be
\[ S_u = M_1(x_1 - x_0) + M_2(x_2 - x_1) + \cdots + M_n(x_n - x_{n-1}). \tag{3.2} \]
These sums depend on the choice of partition and the choices for the $m_j$
and $M_j$.
Figure 3.4: A union of rectangles contained in Ω
Figure 3.5: A union of rectangles containing Ω
Let us return to the example of the function $f(x) = x^2 e^{-x}$ on the interval
[1, 5]. In the computation of the lower sum we use the partition
\[ x_0 = 1 < x_1 = 2 < x_2 = 3 < x_3 = 4 < x_4 = 5 \]
of the interval. We also pick $m_1 = .35$, $m_2 = .43$, $m_3 = .28$ and $m_4 = .16$.
This leads to a lower sum $S_l = 1.22$. In the computation of the upper sum
we use the partition
\[ x_0 = 1 < x_1 = 3 < x_2 = 4 < x_3 = 5 \]
of the interval. We also pick $M_1 = .55$, $M_2 = .45$ and $M_3 = .3$. This leads
to an upper sum $S_u = 1.85$. The $m_j$ and $M_j$ represent the heights of the
rectangles in Figures 3.4 and 3.5, and we trust these figures to show that
$m_j \le f(x)$ and $f(x) \le M_j$ on the respective interval.
As before, let Ω denote the region under the graph. Then the union of
the rectangles shown in Figure 3.4 is contained in Ω, and the union of the
rectangles shown in Figure 3.5 contains Ω. Thus, if Ω has an area, then our
principles tell us that
\[ S_l = 1.22 \le \mathrm{Area}(\Omega) \le S_u = 1.85. \]
In fact the only number greater than or equal to all lower sums and less than
or equal to all upper sums is $\frac{5}{e} - \frac{37}{e^5}$, and this will be the area of the region
Ω. Here e is the Euler number.
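The two sums and the stated area can be recomputed in a few lines. The Python sketch below is illustrative and not part of the original text; the helper name `step_sum` is hypothetical, and the heights are copied from the example rather than derived from the function.

```python
import math

def step_sum(points, heights):
    """Sum of height_j * (x_j - x_{j-1}) over the subintervals of a partition."""
    return sum(h * (right - left)
               for h, left, right in zip(heights, points, points[1:]))

# lower sum: partition 1 < 2 < 3 < 4 < 5 with the m_j from the text
lower = step_sum([1, 2, 3, 4, 5], [0.35, 0.43, 0.28, 0.16])

# upper sum: partition 1 < 3 < 4 < 5 with the M_j from the text
upper = step_sum([1, 3, 4, 5], [0.55, 0.45, 0.3])

# the value 5/e - 37/e^5 stated for the area
exact = 5 / math.e - 37 / math.e**5

print(lower, upper, exact)
```

The printed values show the stated area lying between the lower and the upper sum, as the inequality above requires.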
Example 3.1. Let us find upper and lower sums for the function
\[ f(x) = x^3 - 7x^2 + 14x - 8 \]
for $x \in [.5, 4.5]$. In contrast to the function in the previous example, this
function is not non-negative.
Figure 3.6: Rectangles for calculating an upper sum.
Figure 3.7: Rectangles for calculating a lower sum.
Solution: For the purpose of calculating an upper sum, we partitioned
the interval [.5, 4.5] using the intermediate points $x_0 = .5$, $x_1 = 1.1$, $x_2 = 2.4$,
$x_3 = 3.8$, and $x_4 = 4.5$. As numbers $M_i$ (so that $M_i \ge f(x)$ for $x \in [x_{i-1}, x_i]$)
we chose $M_1 = .3$, $M_2 = .7$, $M_3 = -.9$, and $M_4 = 4.4$. These data are shown
in Figure 3.6. With these choices, the upper sum is
\[ S_u = .3(1.1 - .5) + .7(2.4 - 1.1) + (-.9)(3.8 - 2.4) + 4.4(4.5 - 3.8) = 2.91. \]
In Figure 3.6 you see four rectangles. Their areas are combined to calculate
the upper sum. The areas of the ones above the x-axis are added, the ones
below the axis are subtracted, in accordance with the sign of the $M_i$.
In the calculation of the lower sum we partitioned [.5, 4.5] using $x_0 = .5$,
$x_1 = .8$, $x_2 = 2.3$, $x_3 = 4.2$, and $x_4 = 4.5$. As numbers $m_i$ (so that
$m_i \le f(x)$ for $x \in [x_{i-1}, x_i]$) we chose $m_1 = -2.7$, $m_2 = -.8$, $m_3 = -2.2$,
and $m_4 = 1.3$. These data are shown in Figure 3.7. With these choices we
calculate a lower sum of
\[ S_l = -2.7(.8 - .5) + (-.8)(2.3 - .8) + (-2.2)(4.2 - 2.3) + 1.3(4.5 - 4.2) = -5.8. \]
In Figure 3.7 you see four rectangles. Their areas are combined to calculate
the lower sum. The areas of the ones above the x-axis are added, the ones
below the axis are subtracted, in accordance with the sign of the $m_i$.
In summary, you see that we still combine areas of rectangles in the
calculation of the upper and lower sum, only that, depending on the sign
of the $M_i$ or $m_i$, these rectangles are either above or below the x-axis, and
depending on this, their areas are either added or subtracted. ♦
Let us make a simple albeit important observation:
Theorem 3.2. Let f be a function which is defined and bounded on a closed
interval [a, b]. Let $S_l$ be any lower sum of f and $S_u$ any upper sum. Then
\[ S_l \le S_u. \]
Let us repeat the statement of the theorem to emphasize its meaning.
Whichever partition of the interval [a, b] and whichever $m_i$ we use in the
calculation of the lower sum $S_l$ and whichever partition of the interval and
whichever $M_i$ we use in the calculation of the upper sum $S_u$, the lower sum
is always less than or equal to the upper sum. To see this, one refines the
partitions for the upper and lower sum computation so that they become
the same. Then one notes that $m_i \le M_i$ for all i.
3.2.2 Riemann Sums
Suppose once again that f(x) is a function which is defined on the interval
[a, b]. Pick once more a partition
\[ a = x_0 \le x_1 \le \cdots \le x_{n-1} \le x_n = b \]
of the interval. In each subinterval, pick a point $x_j^* \in [x_{j-1}, x_j]$. Then we
define the Riemann Sum
\[ S_R = f(x_1^*)(x_1 - x_0) + f(x_2^*)(x_2 - x_1) + \cdots + f(x_n^*)(x_n - x_{n-1}). \tag{3.3} \]
We leave it to the reader to contemplate
Proposition 3.3. Let f be a function which is defined and bounded on a
closed interval [a, b]. Let $S_l$ be any lower sum of f, $S_u$ any upper sum, and
$S_R$ any Riemann sum. Then
\[ S_l \le S_R \le S_u. \]
Figure 3.8: Representing a Riemann Sum
To be more concrete, let us return to the example $f(x) = x^2 e^{-x}$ on the
interval [1, 5]. Let us use the partition
\[ x_0 = 1 < x_1 = \sqrt{3} < x_2 = 5. \]
In the two intervals of this subdivision we pick the points $x_1^* = \sqrt{2} \in [1, \sqrt{3}]$
and $x_2^* = \pi \in [\sqrt{3}, 5]$. As Riemann sum we obtain
\[ S_R = f(x_1^*)(x_1 - x_0) + f(x_2^*)(x_2 - x_1) \approx 1.749741. \]
In Figure 3.8 you see the picture illustrating the computation. There are
two rectangles, their bases are the intervals in the subdivision, and their
heights are $f(x_1^*)$ and $f(x_2^*)$. The sum of the areas of these rectangles is the
Riemann sum.
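The Riemann sum of this example can be evaluated directly. The following Python sketch is an illustration, not part of the original text; the function name `riemann_sum` and the names `points` and `samples` are hypothetical, and the check that each chosen point lies in its subinterval is an added safeguard.

```python
import math

def f(x):
    return x**2 * math.exp(-x)

def riemann_sum(func, points, samples):
    """Sum of func(sample_j) * (x_j - x_{j-1}) for one sample per subinterval."""
    total = 0.0
    for left, right, s in zip(points, points[1:], samples):
        if not left <= s <= right:
            raise ValueError("sample point outside its subinterval")
        total += func(s) * (right - left)
    return total

points = [1.0, math.sqrt(3), 5.0]      # partition 1 < sqrt(3) < 5
samples = [math.sqrt(2), math.pi]      # chosen points in the two subintervals
s_r = riemann_sum(f, points, samples)

print(s_r)
```

Changing the entries of `samples` within their subintervals shows how much the Riemann sum depends on the choice of points for such a coarse partition.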
3.3 Limits and Integrability
The idea is to refine the partitions in our previous construction, so that in
the limit our sums can be justifiably called the area of the region under the
graph, if the function is non-negative. The specifics depend on which sums
we are working with.
3.3.1 The Darboux Integral and Areas
As we discussed earlier, whatever choices we make in the calculation of
lower and upper sums $S_l$ and $S_u$, we always have that $S_l \le S_u$. A crucial
additional fact is stated in the next result.
Theorem 3.4. Let f be a function which is defined and bounded on a closed
interval [a, b]. There exists a real number Y, such that
\[ S_l \le Y \le S_u \]
for all lower sums $S_l$ and upper sums $S_u$ of f.
Idea of Proof. To deduce the theorem from the completeness of the real
numbers, one observes that the set of all lower sums of f has a least upper
bound. Call it $Y_l$. The set of all upper sums of f has a greatest lower
bound. Call it $Y_u$. Apparently, $Y_l \le Y_u$. Then Y is any number such that
$Y_l \le Y \le Y_u$.
We are now prepared to define the concept of integrability of a function.
Definition 3.5. Let f be a function which is defined and bounded on a
closed interval [a, b]. If there is exactly one number Y, such that
\[ S_l \le Y \le S_u \]
for all lower sums $S_l$ and all upper sums $S_u$ of f, then we say that f is
integrable over the interval [a, b]. In this case, the number Y is called the
integral$^1$ of f for x between a and b. It is also denoted by
\[ \int_a^b f(x)\,dx. \]
1 To distinguish it from the result of a different, but typically equivalent, construction
we should call Y the Darboux integral.
Remark 8. For completeness’ sake and later use, let us explain what happens
when a function is not integrable. In this case there are at least two
different numbers, and with this an entire interval, between all upper and
lower sums. So, a function over a closed interval [a, b] is not integrable if
and only if there exists a positive number D such that $S_u - S_l \ge D$ for any
lower sum $S_l$ and any upper sum $S_u$.
On the other hand, a function is integrable if for every positive number
D there is an upper sum $S_u$ and a lower sum $S_l$ such that $S_u - S_l < D$.
Example 3.6. Explore upper sums, lower sums, and integrability for the
function $f(x) = x^2$ on the interval [0, 1].
Solution: Fix a natural number n and set
\[ x_0 = 0 < x_1 = \frac{1}{n} < x_2 = \frac{2}{n} < \cdots < x_{n-1} = \frac{n-1}{n} < x_n = \frac{n}{n} = 1. \]
This is an equidistant partition of the interval [0, 1]; all subintervals have
the same length 1/n.
For the upper sums we pick
\[ M_1 = f(x_1) = \left(\frac{1}{n}\right)^2, \quad M_2 = f(x_2) = \left(\frac{2}{n}\right)^2, \quad M_3 = f(x_3) = \left(\frac{3}{n}\right)^2, \ \dots \]
and $M_j = f(x_j) = \left(\frac{j}{n}\right)^2$ in general. Apparently, $M_j \ge f(x)$ for all
$x \in [x_{j-1}, x_j]$ because f(x) is increasing on [0, 1]. Without proof, we use that
\[ 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}. \]
We calculate the upper sum
\begin{align*}
S_u &= M_1(x_1 - x_0) + M_2(x_2 - x_1) + \cdots + M_n(x_n - x_{n-1}) \\
&= \left(\frac{1}{n}\right)^2 \times \frac{1}{n} + \left(\frac{2}{n}\right)^2 \times \frac{1}{n} + \cdots + \left(\frac{n}{n}\right)^2 \times \frac{1}{n} \\
&= \frac{1}{n^3}\left[1^2 + 2^2 + \cdots + n^2\right] \\
&= \frac{n(n + 1)(2n + 1)}{6n^3} \\
&= \frac{1}{3} + \frac{1}{2n} + \frac{1}{6n^2}.
\end{align*}
For the lower sums we pick
\[ m_1 = f(x_0) = 0, \quad m_2 = f(x_1) = \left(\frac{1}{n}\right)^2, \quad m_3 = f(x_2) = \left(\frac{2}{n}\right)^2, \ \dots \]
and $m_j = f(x_{j-1}) = \left(\frac{j-1}{n}\right)^2$ in general. The resulting lower sum is
\[ S_l = \frac{1}{3} - \frac{1}{2n} + \frac{1}{6n^2}. \]
Figure 3.9: Rectangles for calculating a lower sum.
Figure 3.10: Rectangles for calculating an upper sum.
For n = 5 you see the rectangles whose areas are the summands in the
lower and upper sums in Figures 3.9 and 3.10.
Using the expressions for $S_u$ and $S_l$ you see that $S_u - S_l = 1/n$. We
do not only see that $S_l \le \frac{1}{3} \le S_u$, but also that Y = 1/3 is the only real
number, so that $S_l \le Y \le S_u$ for all natural numbers n. According to the
definition this means that $f(x) = x^2$ is integrable over the interval [0, 1]
and that
\[ \int_0^1 x^2\,dx = \frac{1}{3}. \] ♦
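The closed formulas for the upper and lower sums of $f(x) = x^2$ can be confirmed with exact rational arithmetic. The Python sketch below is only an illustration and not part of the original text; the function names `upper_sum` and `lower_sum` are hypothetical, and the sums are formed term by term from the equidistant partition with n subintervals.

```python
from fractions import Fraction

def upper_sum(n):
    # M_j = (j/n)^2 on the j-th subinterval of length 1/n
    return sum(Fraction(j, n) ** 2 * Fraction(1, n) for j in range(1, n + 1))

def lower_sum(n):
    # m_j = ((j-1)/n)^2 on the j-th subinterval of length 1/n
    return sum(Fraction(j - 1, n) ** 2 * Fraction(1, n) for j in range(1, n + 1))

third = Fraction(1, 3)
for n in (5, 50, 500):
    s_l, s_u = lower_sum(n), upper_sum(n)
    # compare with the closed formulas from the example
    assert s_u == third + Fraction(1, 2 * n) + Fraction(1, 6 * n * n)
    assert s_l == third - Fraction(1, 2 * n) + Fraction(1, 6 * n * n)
    print(n, float(s_l), float(s_u), s_u - s_l)
```

As n grows, the two printed sums close in on 1/3 from below and above, while their difference stays equal to 1/n.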
We motivated our introduction of upper and lower sums by our quest to
define the concept of area. Our answer is formulated as a
Definition 3.7. Let f be a function which is defined, bounded, and non-negative
on a closed interval [a, b]. Let Ω be the region bounded by the graph
of f, the x-axis, and the lines x = a and x = b. If f is integrable over this
interval, then we say that the region Ω has an area and
\[ \mathrm{Area}(\Omega) = \int_a^b f(x)\,dx. \]
The upper and lower sum were constructed such that if there is any
justification to assigning an area to Ω then
\[ S_l \le \mathrm{Area}(\Omega) \le S_u. \]
For an integrable function there is exactly one real number between the
lower and upper sums, so this is the only number which we can call the area
of Ω.
For example, the area of the region Ω bounded by the graph of the
function $f(x) = x^2$, the x-axis, and the lines x = 0 and x = 1 is
\[ \mathrm{Area}(\Omega) = \int_0^1 x^2\,dx = \frac{1}{3}. \] ♦
3.3.2 The Riemann Integral
Earlier we introduced the idea of a Riemann sum. Consider an interval [a, b]
and a function f(x) defined on it. We picked a partition
\[ P : a = x_0 \le x_1 \le x_2 \le \cdots \le x_{n-1} \le x_n = b, \]
which broke [a, b] up into smaller intervals $[x_{j-1}, x_j]$. In each of the
subintervals we picked a point $x_j^* \in [x_{j-1}, x_j]$, and set
\[ S_R = f(x_1^*)(x_1 - x_0) + f(x_2^*)(x_2 - x_1) + \cdots + f(x_n^*)(x_n - x_{n-1}). \]
We want to consider a limit of Riemann sums. This is trickier than for
functions, because there are a lot of choices which we make to define such a
sum. We define the norm of the partition P to be
\[ |P| = \max\{x_j - x_{j-1} \mid 1 \le j \le n\}, \]
in other words, the norm of P is the length of the longest of the intervals
$[x_{j-1}, x_j]$.
Definition 3.8 (Limit for Riemann Sums). Suppose the function f(x)
is defined on [a, b]. We say that
\[ L = \lim_{|P| \to 0} S_R \]
if for all $\varepsilon > 0$ there exists a $\delta > 0$, such that $|L - S_R| < \varepsilon$ whenever $|P| < \delta$.
If the limit of the $S_R$ exists, then we say that f is Riemann integrable over
[a, b], call L the Riemann integral of f, and write
\[ L = \lim_{|P| \to 0} S_R = \int_a^b f(x)\,dx. \]
Thus $L = \lim S_R$ if we can force $S_R$ to be close to L, as close as we like,
by making the partition fine, that is, by making each subinterval no longer
than some small number.
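The limit in Definition 3.8 can be observed numerically: however the partition and the points in the subintervals are chosen, the Riemann sums approach the integral once the norm is small. The Python sketch below is an illustration under stated assumptions and not part of the original text. It uses $f(x) = x^2$ on [0, 1], whose integral is 1/3; the function name `random_riemann_sum`, the seed, and the partition sizes are hypothetical choices.

```python
import random

def random_riemann_sum(func, a, b, n, rng):
    """Return (norm, Riemann sum) for a random partition of [a, b] into n pieces."""
    cuts = sorted(rng.uniform(a, b) for _ in range(n - 1))
    points = [a] + cuts + [b]
    pieces = list(zip(points, points[1:]))
    norm = max(right - left for left, right in pieces)
    # one randomly chosen point in each subinterval
    total = sum(func(rng.uniform(left, right)) * (right - left)
                for left, right in pieces)
    return norm, total

rng = random.Random(0)   # fixed seed so that repeated runs agree
results = [random_riemann_sum(lambda x: x * x, 0.0, 1.0, n, rng)
           for n in (10, 100, 1000, 10000)]

for norm, total in results:
    print(f"norm = {norm:.5f}  S_R = {total:.6f}  |S_R - 1/3| = {abs(total - 1/3):.6f}")
```

For this particular function the distance $|S_R - 1/3|$ is at most twice the norm of the partition, because the derivative of $x^2$ is at most 2 on [0, 1]; this gives an explicit δ for every ε.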
It is worth pointing out and not very difficult to show the following
proposition.
Proposition 3.9. Suppose the function f is defined on the interval [a, b].
Then f is Riemann integrable if and only if it is Darboux integrable. If
defined, the Riemann and the Darboux integral are the same.
3.4 Integrable Functions
We would like to provide a supply of integrable functions. Our first result is
typically proved in an analysis course.
Theorem 3.10. Suppose f is defined and continuous on [a, b]. Then f is
integrable over [a, b].
According to this theorem, polynomials are integrable over any interval
of the form [a, b]. Rational functions (i.e., functions of the form p(x)/q(x)
where p(x) and q(x) are polynomials) are integrable over intervals of the
form [a, b] as long as q does not vanish anywhere on the interval. The
trigonometric functions (sin, cos, tan, cot, sec, and csc) are integrable on
intervals where the functions are defined. Arbitrary powers of a variable,
$f(x) = x^\alpha$, are integrable. One just needs to make sure that the function is
defined on the interval [a, b]. For any real number α it suffices to assume that
a > 0. For any real α ≥ 0, it suffices to assume a ≥ 0. For rational numbers
α = p/q, where p and q are integers and q is odd, it suffices to assume
0 ∉ [a, b]. For non-negative integers α no assumption needs to be made on a
and b. As long as the resulting functions are defined everywhere
on [a, b], the functions just mentioned may be added, subtracted, multiplied,
divided, and composed, and one still ends up with integrable functions.
Let us introduce another class of functions for which we can prove that
they are integrable.
Definition 3.11. Suppose f(x) is a function. We say that f(x) is non-decreasing
if $f(x_1) \le f(x_2)$ whenever $x_1$ and $x_2$ are in the domain of f(x)
and $x_1 \le x_2$. We say that f(x) is non-increasing if $f(x_1) \ge f(x_2)$ whenever
$x_1 \le x_2$.
Proposition 3.12. Let [a, b] be a closed interval and let f be defined and
non-increasing or non-decreasing on [a, b]. Then f is integrable on [a, b]. In
particular, monotonic (increasing or decreasing) functions are integrable.
Proof. We will use Darboux integrability. Let us assume that the function
f is non-decreasing on the interval. The non-increasing case is left as an
exercise. Take any partition of the interval:
\[ a = x_0 < x_1 < \cdots < x_n = b. \]
The reader may justify why we can use the same partition in the computation
of the upper and lower sum. For i = 1, . . . , n we set
\[ m_i = f(x_{i-1}) \quad\text{and}\quad M_i = f(x_i). \]
Then, because f is non-decreasing,
\[ m_i \le f(x) \le M_i \quad\text{for all } x \in [x_{i-1}, x_i]. \]
We use the $m_i$ and $M_i$ to compute upper and lower sums. Let ∆ be the
largest value of the $x_i - x_{i-1}$. Then
\begin{align*}
S_u - S_l &= [M_1(x_1 - x_0) + \cdots + M_n(x_n - x_{n-1})] \\
&\quad - [m_1(x_1 - x_0) + \cdots + m_n(x_n - x_{n-1})] \\
&= (M_1 - m_1)(x_1 - x_0) + \cdots + (M_n - m_n)(x_n - x_{n-1}) \\
&\le [(M_1 - m_1) + (M_2 - m_2) + \cdots + (M_n - m_n)]\,\Delta \\
&= (M_n - m_1)\Delta \\
&= [f(b) - f(a)]\Delta.
\end{align*}
The inequality in the computation follows from the choice of ∆. The
second to last equality follows because $M_{i-1} = m_i$ for all i = 2, . . . , n.
Many terms in the computation cancel. Given any positive number D, we
can make the partition fine enough so that $[f(b) - f(a)]\Delta < D$. According
to our Remark 8 this means that f is integrable over the interval, as we
claimed.
We illustrate the steps in the proof in a concrete example. In Figure 3.11
you see the upper and lower sum. The lower sum is the sum of the areas
of the darkly shaded rectangles. The upper sum is the sum of the areas of
the lightly and darkly shaded rectangles. The difference between the upper
and the lower sum is the sum of the lightly shaded rectangles shown in
Figure 3.12. We can combine these areas by sliding the rectangles sideways
so that they form one column. Its height will be f(b) −f(a). Its width may
vary, but in the widest place it is no wider than ∆, the width of the largest
interval in the partition of [a, b]. That means, the difference between the
upper and the lower sum is at most [f(b) −f(a)]∆. As above, we conclude
that the function is integrable.
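The estimate $S_u - S_l \le [f(b) - f(a)]\Delta$ from the proof can be tried out on a concrete non-decreasing function. The Python sketch below is an illustration and not part of the original text; the square root on [0, 1], the uneven partition, and the function names `darboux_gap` and `bound` are assumptions made for the example.

```python
import math

def darboux_gap(func, points):
    """Upper minus lower sum for a non-decreasing func,
    using m_i = func(x_{i-1}) and M_i = func(x_i) as in the proof."""
    pieces = list(zip(points, points[1:]))
    lower = sum(func(left) * (right - left) for left, right in pieces)
    upper = sum(func(right) * (right - left) for left, right in pieces)
    return upper - lower

def bound(func, points):
    """The quantity [f(b) - f(a)] * Delta, where Delta is the longest subinterval."""
    delta = max(right - left for left, right in zip(points, points[1:]))
    return (func(points[-1]) - func(points[0])) * delta

points = [0.0, 0.1, 0.35, 0.4, 0.7, 1.0]   # assumed, deliberately uneven partition
gap = darboux_gap(math.sqrt, points)

print(gap, bound(math.sqrt, points))
```

Inserting more points into `points` makes Δ smaller, and with it the bound on the gap, which is the step that finishes the proof.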
Figure 3.11: Rectangles for calculating a lower and an upper sum.
Figure 3.12: Rectangles for calculating the difference between an upper and a lower sum.
Remark 9. There are functions which are not integrable over any interval
of the form [a, b] with a < b.
Remark 10. Here we only discuss integrability of functions over closed finite
intervals, i.e., intervals of the form [a, b]. The discussion of integrability of
functions over intervals which are not of this form, e.g., half-open intervals
like [a, b) or unbounded closed intervals like [a, ∞), requires additional ideas
and techniques which we are not ready to discuss yet.
3.5 Some elementary observations
In spite of our success calculating some integrals using upper and lower
sums and the definition, this is certainly not the way to go in general. To
integrate “well behaved” functions we want a theory which allows us to
calculate integrals more easily. We have to develop a few basic tools. These
are fairly straightforward consequences of the definition of the integral.
Proposition 3.13. If the function f is defined at a, then
\[ \int_a^a f(x)\,dx = 0. \tag{3.4} \]
Proof. The reader should contemplate the proposition.
Proposition 3.14. Let [a, b] be a closed interval, c a point between a and
b, and f a function which is defined on the interval. Then
\[ \int_a^c f(x)\,dx + \int_c^b f(x)\,dx = \int_a^b f(x)\,dx. \tag{3.5} \]
Implicit in the formulation of the proposition is the statement that f
is integrable over [a, b] if and only if it is integrable over the intervals [a, c]
and [c, b]. If one of the sides of Equation (3.5) exists, then so does the other
one.
Idea of Proof. Use c as one of the points in the partition. The remaining
details are left to the reader.
As an immediate consequence of Propositions 3.12 and 3.14 we find
Corollary 3.15. Let f be defined on the interval [a, b]. Suppose that we
can partition the interval into a finite number of intervals such that f is
non-increasing or non-decreasing on each of them. Then f is integrable on
[a, b].
We can also extend Theorem 3.10.
Definition 3.16. Suppose that f is defined on an interval [a, b]. We call f
piecewise continuous if there is a partition
\[ a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b \]
such that f is continuous on the open intervals $(x_{j-1}, x_j)$ for all 1 ≤ j ≤ n,
and the one-sided limits (see Section 1.3)
\[ \lim_{x \to x_{j-1}^+} f(x) \quad\text{and}\quad \lim_{x \to x_j^-} f(x) \]
exist and are finite.
Corollary 3.17. If f is a piecewise continuous function on [a, b], then f is
integrable on [a, b].
Idea of Proof. According to Proposition 3.14 we may break the problem up,
and consider it over each of the intervals $[x_{j-1}, x_j]$ separately. On each of
these smaller intervals, we can change the definition of the function at a point
or two and make it continuous. This changes neither the integrability nor
the value of the integral. So the assertion follows from Theorem 3.10.
Definition 3.18. Let f be defined and integrable on the interval [a, b]. Then
\[ \int_a^b f(x)\,dx = -\int_b^a f(x)\,dx. \]
This definition is convenient and consistent with what we have said so
far about the integral. The approach to integrals via lower and upper sums
could also be generalized to include integrals $\int_a^b$ where b < a, leading to
exactly this formula.
Using the definition of the integral it is not difficult to show:
Proposition 3.19. Let [a, b] be a closed interval and c a scalar. Suppose
that f and g are integrable over the interval. Then f + g and cf are integrable
over [a, b] and
\[ \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \]
and
\[ \int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx. \]
We mention a few useful estimates for integrals.
Proposition 3.20. If f is integrable over [a, b], and f(x) ≥ 0 for all x ∈
[a, b], then
\[ \int_a^b f(x)\,dx \ge 0. \]
Proof. The proof is left to the reader.
Corollary 3.21. If h and g are integrable over [a, b], and g(x) ≥ h(x) for
all x ∈ [a, b], then
\[ \int_a^b g(x)\,dx \ge \int_a^b h(x)\,dx. \]
Proof. Use that f(x) = g(x) −h(x) ≥ 0 for all x ∈ [a, b].
Proposition 3.22. Let [a, b] be a closed interval and f integrable over [a, b].
Then the absolute value of f is integrable over [a, b], and
\[ \left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx. \tag{3.6} \]
The proof of this proposition is elementary, though a bit tricky.
3.6 Areas and Integrals
Let us return to the relation between areas and integrals. Suppose f(x) is
a non-negative integrable function over an interval [a, b]. If Ω is the area
bounded by the graph of f(x), the x-axis, and the lines x = a and x = b,
then
\[ \operatorname{Area}(\Omega) = \int_a^b f(x)\,dx. \]
The question is, what happens if f(x) is not non-negative?
Let f be a function which is defined and bounded on a closed interval
[a, b] and Ω the set of points which lie between the graph of f(x) and the
x-axis for a ≤ x ≤ b. We decompose Ω into the union of two sets, $\Omega^+$
and $\Omega^-$. Specifically, $\Omega^+$ consists of those points (x, y) in the plane for which
a ≤ x ≤ b and 0 ≤ y ≤ f(x), and $\Omega^-$ of those points for which a ≤ x ≤ b and
f(x) ≤ y ≤ 0. Then Ω is the union of the sets $\Omega^+$ and $\Omega^-$. We decompose
the region between the x-axis and the graph into the part $\Omega^+$ above the
x-axis and the part $\Omega^-$ below it. Making use of this notation, we have:
Proposition 3.23. If f is integrable, then the areas of the regions $\Omega^+$ and
$\Omega^-$ are defined [footnote 2] and
\[ \int_a^b f(x)\,dx = \operatorname{Area}(\Omega^+) - \operatorname{Area}(\Omega^-). \tag{3.7} \]
Idea of Proof. We define two functions:
\[ f^+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0 \\ 0 & \text{if } f(x) \le 0 \end{cases}
\quad\text{and}\quad
f^-(x) = \begin{cases} f(x) & \text{if } f(x) \le 0 \\ 0 & \text{if } f(x) \ge 0 \end{cases} \]
It is elementary, though a bit tricky, to show that the integrability of f(x)
implies the integrability of $f^+(x)$ and $f^-(x)$. Evidently, $f = f^+ + f^-$, so
that the additivity of the integral implies that
\[ \int_a^b f(x)\,dx = \int_a^b f^+(x)\,dx + \int_a^b f^-(x)\,dx. \tag{3.8} \]
According to Definition 3.7 we have
\[ \operatorname{Area}(\Omega^+) = \int_a^b f^+(x)\,dx. \tag{3.9} \]
[footnote 2] If you want to be formal, then you have to flip the region $\Omega^-$ to lie above the x-axis.
Only then have we addressed the question of it having an area.
Let $-\Omega^-$ be the region obtained by flipping $\Omega^-$ up, i.e., we take its mirror
image along the x-axis. This process does not change areas, so
$\operatorname{Area}(\Omega^-) = \operatorname{Area}(-\Omega^-)$. The function $-f^-(x)$ is non-negative, and $-\Omega^-$ is bounded by
the graph of $-f^-(x)$, the x-axis, and the lines x = a and x = b. According
to Definition 3.7 and our elementary properties of the integral we have
\[ \operatorname{Area}(\Omega^-) = \operatorname{Area}(-\Omega^-) = \int_a^b -f^-(x)\,dx = -\int_a^b f^-(x)\,dx. \tag{3.10} \]
Our claim follows now by substituting the results in (3.9) and (3.10) into
(3.8).
For example, $\int_{-\pi/2}^{\pi/2} \sin x\,dx = 0$ because the graph bounds congruent
regions above and below the x-axis.
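The decomposition into a positive and a negative part can be tried out numerically. The sketch below is an illustration under our own assumptions: the helper `midpoint_integral`, the number of subintervals, and the interval [0, 3π/2] (on which the two regions are not congruent) are hypothetical choices, not taken from the text.

```python
import math

def midpoint_integral(f, a, b, n=30_000):
    """Approximate the integral of f over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

f = math.sin
a, b = 0.0, 1.5 * math.pi

f_plus = lambda x: max(f(x), 0.0)    # agrees with f where f >= 0, else 0
f_minus = lambda x: min(f(x), 0.0)   # agrees with f where f <= 0, else 0

area_plus = midpoint_integral(f_plus, a, b)      # region above the x-axis
area_minus = -midpoint_integral(f_minus, a, b)   # region below, flipped up
signed = midpoint_integral(f, a, b)              # the integral itself

print(area_plus, area_minus, signed)
```

Up to the error of the numerical method, the integral equals the first area minus the second, as (3.7) asserts.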
3.7 Anti-derivatives
Consider a function f(x) with domain I. In Definition 2.6 we called a function
F(x) with domain I an antiderivative of f(x) if F′(x) = f(x). Having
an anti-derivative of a function will (typically) make it easy to integrate it
over a closed interval.
Remember that any two antiderivatives $F_1$ and $F_2$ of a function f on an
interval I differ only by a constant (see Corollary 2.5). In other words,
there exists a constant c, such that
\[ F_1(x) = F_2(x) + c \quad\text{for all } x \in I. \]
Definition 3.24. Let f be a function which is defined on an interval I, and
suppose that f has an antiderivative. The set of all antiderivatives of f is
called the indefinite integral of f. It is denoted by
\[ \int f(x)\,dx. \]
Given a function f and an antiderivative F of it, we typically write
\[ \int f(x)\,dx = F(x) + c. \tag{3.11} \]
In this expression c stands for an arbitrary constant. Different values for c
result in different functions. Allowing all real numbers as possible values for
c, we understand the right hand side of (3.11) as a set of functions. The
constant c in the expression is referred to as the integration constant.
Example 3.25. Given a function f(x) we might know or guess a function
F(x), such that F′(x) = f(x). Then we can write down the indefinite
integral of f in the form F(x) + c. You can check the correctness of your
guess by differentiation. You may want to consult Table 1.3 on page 63 to
come up with ideas for antiderivatives. Here are some examples.
\begin{align*}
\int 1\,dx &= x + c & \int x\,dx &= \tfrac{1}{2}x^2 + c \\
\int \sqrt{x}\,dx &= \tfrac{2}{3}x^{3/2} + c & \int x^n\,dx &= \tfrac{1}{n+1}x^{n+1} + c \quad (n \ne -1) \\
\int \sin x\,dx &= -\cos x + c & \int \sec^2 x\,dx &= \tan x + c \\
\int \cos x\,dx &= \sin x + c & \int \sec x \tan x\,dx &= \sec x + c \\
\int \frac{dx}{1 + x^2} &= \arctan x + c & \int \frac{dx}{\sqrt{1 - x^2}} &= \arcsin x + c
\end{align*}
Using the linearity of differentiation (see the differentiation rules
in (1.20)), it is easy to produce more examples. E.g.
\[ \int (5x^2 - 2\cos x)\,dx = \tfrac{5}{3}x^3 - 2\sin x + c. \]
Occasionally, an additional idea is required before we can see the
antiderivative. E.g., using the trigonometric identity $\cos^2 x = (1 + \cos 2x)/2$,
we find that
\[ \int \cos^2 x\,dx = \frac{1}{2} \int (1 + \cos(2x))\,dx = \frac{1}{2} \left( x + \frac{1}{2}\sin(2x) \right) + c. \]
Using a different trigonometric identity we find
\[ \int (1 + \cot^2 x)\,dx = \int \csc^2 x\,dx = -\cot x + c. \] ♦
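Checking a guess by differentiation can also be done numerically. The following is a minimal sketch, assuming a central difference quotient is accurate enough at the chosen sample points; the helper `derivative`, the step size, and the sample points are our own choices.

```python
import math

def derivative(F, x, h=1e-5):
    """Approximate F'(x) by a central difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Pairs (integrand f, proposed antiderivative F) taken from Example 3.25.
pairs = [
    (lambda x: math.sqrt(x), lambda x: (2 / 3) * x ** 1.5),
    (lambda x: 1 / (1 + x * x), math.atan),
    (lambda x: math.cos(x) ** 2, lambda x: 0.5 * (x + 0.5 * math.sin(2 * x))),
    (lambda x: 1 / math.sin(x) ** 2, lambda x: -1 / math.tan(x)),
]

sample_points = [0.3, 0.7, 1.1]
max_error = max(abs(derivative(F, x) - f(x))
                for f, F in pairs for x in sample_points)
print(max_error)
```

A small value of `max_error` supports the guesses; it does not replace the exact check by the differentiation rules.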
We shall explore additional ideas for finding antiderivatives later.
The reader may practice finding some antiderivatives for the functions in
the next exercise. As you go through them you are expected to pick up
some new ideas.
Exercise 46. Find the following indefinite integrals:
(a) $\int 3\,dx$
(b) $\int (x + 4)\,dx$
(c) $\int (x^2 - 5)\,dx$
(d) $\int \cos 2x\,dx$
(e) $\int (3 + x)^3\,dx$
(f) $\int (3 + 2x)^5\,dx$
(g) $\int \frac{1}{x^3}\,dx$
(h) $\int \csc^2 x\,dx$
(i) $\int (1 + \tan^2 x)\,dx$
(j) $\int \csc x \cot x\,dx$
(k) $\int \sin^2 x\,dx$
(l) $\int \sec^2(3x)\,dx$
(m) $\int e^{x/3}\,dx$
(n) $\int \frac{2x}{x^2 + 1}\,dx$
(o) $\int (4 - 3x)^5\,dx$
(p) $\int \cos(4 - 3x)\,dx$
(q) $\int \frac{2x}{(x^2 + 3)^2}\,dx$
(r) $\int x \sec^2(x^2 + 5)\,dx$
3.8 The Fundamental Theorem of Calculus
Our first result provides us with a large class of functions which have an-
tiderivatives.
Theorem 3.26. Continuous functions, defined over intervals, have
antiderivatives. More specifically, suppose that a function f is defined and
continuous over the interval I. Let a ∈ I. Then
\[ f(x) = \frac{d}{dx} \int_a^x f(t)\,dt \]
for all x ∈ I.
The major tool for calculating integrals, and the grand conclusion of our
discussion of antiderivatives is the Fundamental Theorem of Calculus.
Theorem 3.27 (Fundamental Theorem of Calculus). Suppose that f
is a continuous function over a closed interval [a, b] and that F is an
antiderivative of f. Then
\[ \int_a^b f(x)\,dx = F(b) - F(a). \]
For example, F(x) = −cos x is an antiderivative of f(x) = sin x, so that
the Fundamental Theorem of Calculus tells us that
\[ \int_0^\pi \sin x\,dx = -\cos(\pi) - (-\cos(0)) = -(-1) - (-1) = 2. \]
As another example, note that F(x) = tan x is an anti-derivative of
$f(x) = \sec^2 x$, so that the Fundamental Theorem of Calculus tells us that
\[ \int_0^{\pi/4} \sec^2 x\,dx = \tan(\pi/4) - \tan(0) = 1. \]
Remark 11 (Notational Convention). One commonly uses the notation
\[ F(x)\,\Big|_a^b = F(b) - F(a). \]
This is quite convenient. E.g., we write
\[ \sin x\,\Big|_0^\pi = \sin\pi - \sin 0. \]
If there are ambiguities due to the length of the expression to which this
construction is applied, we also use the notation shown in the following
example:
\[ \left[ x^3 - 5x^2 + 2x - 8 \right]_3^5 = p(5) - p(3) \]
where $p(x) = x^3 - 5x^2 + 2x - 8$.
Using this notation, we calculate that
\[ \int_{-2}^{3} (x^2 - 2x + 5)\,dx = \left[ \frac{x^3}{3} - x^2 + 5x \right]_{-2}^{3} = \frac{95}{3}. \]
Other examples are
\[ \int_0^{\pi/4} \sec x \tan x\,dx = \sec x\,\Big|_0^{\pi/4} = \sqrt{2} - 1 \]
and
\[ \int_{\pi/4}^{\pi/3} \csc x \cot x\,dx = -\csc x\,\Big|_{\pi/4}^{\pi/3} = \left( -\frac{2\sqrt{3}}{3} \right) - (-\sqrt{2}) = \sqrt{2} - \frac{2\sqrt{3}}{3}. \]
The reader is invited to practice a few examples.
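To build some confidence in the theorem, one can compare F(b) − F(a) with a Riemann sum. The sketch below assumes a midpoint sum with many subintervals is a good stand-in for the integral; the helper `riemann_midpoint` and the value of n are our choices, while the three integrals are the ones worked out above.

```python
import math

def riemann_midpoint(f, a, b, n=10_000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Tuples (f, F, a, b), where F is an antiderivative of f on [a, b].
examples = [
    (math.sin, lambda x: -math.cos(x), 0.0, math.pi),
    (lambda x: x * x - 2 * x + 5,
     lambda x: x ** 3 / 3 - x * x + 5 * x, -2.0, 3.0),
    (lambda x: 1 / math.cos(x) ** 2, math.tan, 0.0, math.pi / 4),
]

for f, F, a, b in examples:
    # The two printed numbers should agree to several decimal places.
    print(riemann_midpoint(f, a, b), F(b) - F(a))
```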
Exercise 47. Evaluate the following definite integrals:
(a) $\int_0^1 (3x + 2)\,dx$
(b) $\int_1^2 \frac{6 - t}{t^3}\,dt$
(c) $\int_2^5 2\sqrt{x - 1}\,dx$
(d) $\int_1^0 (t^3 - t^2)\,dt$
(e) $\int_{\pi/6}^{\pi/4} \csc x \cot x\,dx$
(f) $\int_{-1}^{-1} 7x^6\,dx$
(g) $\int_0^\pi \frac{1}{2}\cos x\,dx$
(h) $\int_0^\pi \cos(x/2)\,dx$
(i) $\int_{-2}^{2} |x^2 - 1|\,dx$
(j) $\int_0^{\pi/2} \cos^2 x\,dx$
(k) $\int_0^{\pi/2} \sin^2(2x)\,dx$
(l) $\int_0^{\pi/4} \sec^2 x\,dx$
3.8.1 Some Proofs
Because of their importance, we would like to prove Theorem 3.26 and the
Fundamental Theorem of Calculus.
Proof of the Fundamental Theorem of Calculus. Essentially, the desired
result is an easy consequence of Theorem 3.26. Let F(x) be any anti-derivative
of f(x) on I, and $H(x) = \int_a^x f(t)\,dt$ the one provided by Theorem 3.26. In
particular, F′(x) = H′(x) = f(x). Cauchy’s Theorem (see its application in
Corollary 2.5) tells us that F and H differ by a constant. For some constant
c and all x ∈ I:
\[ H(x) = \int_a^x f(t)\,dt = F(x) + c. \tag{3.12} \]
We can find out the value for c by substituting x = a in this equation. In
particular, we find that
\[ \int_a^a f(t)\,dt = 0 = F(a) + c \quad\text{or}\quad c = -F(a). \]
Using this calculation of c and substituting x = b in (3.12), we obtain
\[ \int_a^b f(t)\,dt = F(b) - F(a), \]
as claimed.
Proof of Theorem 3.26. Because we assumed continuity of f on the interval
I, it follows from Theorem 3.10 that
\[ F(x) = \int_a^x f(t)\,dt \]
exists. So it is our task to show that F is differentiable at x, and that
F′(x) = f(x). Using Theorem 1.26, after adjusting the notation to fit the
current setting, the task becomes to show that
\begin{align*}
f(x) &= \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} \\
&= \lim_{h \to 0} \frac{1}{h} \left[ \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt \right] \\
&= \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt.
\end{align*}
Here we assume that x is not an endpoint of I, so that x and x + h are both
in I when |h| is small. We omit (leave to the reader) the modifications of the
proof which are required in the case where x is an endpoint of I.
According to the Extreme Value Theorem (see Theorem 1.17) there are
points c and d between x and x + h, such that
\[ f(c) = m(h) \le f(t) \le f(d) = M(h) \tag{3.13} \]
for all t between x and x + h. The points c and d may not be uniquely
determined by h, but m and M are. It follows from (3.13) and Corollary 3.21
that, for h > 0 (the case h < 0 is similar),
\[ m(h) \cdot h = \int_x^{x+h} m(h)\,dt \le \int_x^{x+h} f(t)\,dt \le \int_x^{x+h} M(h)\,dt = M(h) \cdot h, \]
and with this that
\[ m(h) \le \frac{1}{h} \int_x^{x+h} f(t)\,dt \le M(h). \]
Continuity of f(x) implies that
\[ \lim_{h \to 0} m(h) = f(x) = \lim_{h \to 0} M(h). \]
It follows from a pinching argument (see Proposition 1.4) that
\[ \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt = f(x), \]
and this is exactly what we needed to show.
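The key limit in the proof says that the average of f over a short interval starting at x approaches f(x). The following sketch illustrates this numerically; the helper `average_over`, the function $e^{-t^2}$, and the point x = 0.5 are hypothetical choices made only for the illustration.

```python
import math

def average_over(f, x, h, n=1_000):
    """Approximate (1/h) times the integral of f from x to x + h
    by averaging f at n midpoints; h may be negative."""
    step = h / n
    return sum(f(x + (k + 0.5) * step) for k in range(n)) / n

f = lambda t: math.exp(-t * t)
x = 0.5
for h in (0.1, 0.01, 0.001, -0.001):
    # The middle column approaches the last one as |h| shrinks.
    print(h, average_over(f, x, h), f(x))
```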
3.9 Substitution
In some cases it is not that easy to ‘see’ an antiderivative of the function one
wants to integrate. Substitution is a method which, when applied correctly,
will simplify the expression for the function you want to integrate. You hope
that you can find an antiderivative for the simplified expression. The method
is based on the chain rule for differentiation. Sometimes this method is
helpful, other times it is not. Your success with this method depends greatly
on experience, i.e., practice.
We explain the method. Let F and g be functions which are defined and
differentiable on an interval I. Set F′ = f. Then, according to the chain
rule,
\[ \frac{d}{dx} F(g(x)) = f(g(x))\,g'(x). \]
Assume that f and g′ are continuous on I. Then f(g(x))g′(x) is continuous
as well. We may take antiderivatives of both sides of our previous equation,
and conclude that
\[ \int f(g(x))\,g'(x)\,dx = F(g(x)) + c. \tag{3.14} \]
The variable for the functions f and F is often called u, and this means in
context that u = g(x).
Let us give a few examples to illustrate how this method can be put
to use. There are no general rules for which substitution must be used; rather,
success justifies the means. Working through the examples will teach you
how to apply this method in some typical situations. It will give you at least
some experience which you may then rely on in similar examples.
For example,
\[ \int (2x - 3)^3\,dx = \frac{1}{2} \int (2x - 3)^3 \cdot 2\,dx = \frac{1}{8}(2x - 3)^4 + c. \]
Here we used g(x) = 2x −3, g′(x) = 2, $f(u) = u^3$, and $F(u) = u^4/4$.
There is a pattern, a way to use the notation, which can be applied to
write down the steps in an integration using substitution efficiently. Setting
u = g(x) we write
\[ du = g'(x)\,dx, \]
instead of g′(x) = du/dx [footnote 3]. Suppose also that F is an anti-derivative of f,
so F′ = f. Then the pattern for calculating an integral via substitution is
\[ \int f(g(x))\,g'(x)\,dx = \int f(u)\,du = F(u) + c = F(g(x)) + c. \tag{3.15} \]
In the first step of this calculation we carry out the substitution, in the
second one we find the anti-derivative, and in the third one we reverse the
substitution. We make use of this notation in our next example.
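For example, we calculate that
\begin{align*}
\int x \sqrt{x^2 + 2}\,dx &= \frac{1}{2} \int \sqrt{x^2 + 2} \cdot 2x\,dx \\
&= \frac{1}{2} \int \sqrt{u}\,du \\
&= \frac{1}{3} u^{3/2} + c \\
&= \frac{1}{3} (x^2 + 2)^{3/2} + c.
\end{align*}
We used the substitution $u = x^2 + 2$. Then $\frac{du}{dx} = 2x$, or du = 2x dx.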
We calculate that
\begin{align*}
\int t^2 (t + 1)^7\,dt &= \int (u - 1)^2 u^7\,du \\
&= \int (u^2 - 2u + 1) u^7\,du \\
&= \int (u^9 - 2u^8 + u^7)\,du \\
&= \frac{1}{10} u^{10} - \frac{2}{9} u^9 + \frac{1}{8} u^8 + c \\
&= \frac{1}{10} (t + 1)^{10} - \frac{2}{9} (t + 1)^9 + \frac{1}{8} (t + 1)^8 + c.
\end{align*}
Here we used the substitution u = t + 1. Then du = dt and t = u −1.
We may have to use a substitution and a trigonometric identity to solve
an integration problem:
\begin{align*}
\int 2x \sin^2(x^2 + 5)\,dx &= \int \sin^2 u\,du \\
&= \frac{1}{2} \int [1 - \cos 2u]\,du \\
&= \frac{1}{2} \left( u - \frac{1}{2} \sin 2u \right) + c \\
&= \frac{1}{2} \left( (x^2 + 5) - \frac{1}{2} \sin[2(x^2 + 5)] \right) + c.
\end{align*}
We used the substitution $u = x^2 + 5$, so that du = 2x dx, and the identity
$\sin^2\alpha = [1 - \cos 2\alpha]/2$.
[footnote 3] We do not attach any particular meaning to the symbols dx and du in their own right.
The equation du = g′(x)dx helps us to write down what happens when we perform the
substitution as in the first equality in (3.15). Thought of as infinitesimals or differentials,
these symbols have a meaning, but this is beyond the scope of these notes.
Find the substitution which we used in the following computation, and
check the details:
\begin{align*}
\int \sec^2 x \tan x\,dx &= \int \sec x \cdot \sec x \tan x\,dx \\
&= \int u\,du \\
&= \frac{1}{2} u^2 + c \\
&= \frac{1}{2} \sec^2 x + c.
\end{align*}
Sometimes we have to apply the method of substitution twice, or more
often, to work out an integral. Here is an example.
\begin{align*}
\int (x^2 + 1) \sin^3(x^3 + 3x - 2) \cos(x^3 + 3x - 2)\,dx &= \frac{1}{3} \int \sin^3 u \cos u\,du \\
&= \frac{1}{3} \int v^3\,dv \\
&= \frac{1}{12} v^4 + c \\
&= \frac{1}{12} \sin^4 u + c \\
&= \frac{\sin^4(x^3 + 3x - 2)}{12} + c.
\end{align*}
In the computation we used the substitution $u = x^3 + 3x - 2$. Then
$du = 3(x^2 + 1)\,dx$. In a second substitution we set v = sin u. Then dv = cos u du.
Here are two examples, which are important in the context of integrating
rational functions. In the first example we assume that a ≠ 0, and we use
the substitution x = au. Then dx = a du.
\begin{align*}
\int \frac{dx}{x^2 + a^2} &= \int \frac{a\,du}{a^2 u^2 + a^2} \\
&= \frac{1}{a} \int \frac{du}{u^2 + 1} \\
&= \frac{1}{a} \arctan(u) + c \\
&= \frac{1}{a} \arctan\left( \frac{x}{a} \right) + c.
\end{align*}
Adding another idea, we calculate
\begin{align*}
\int \frac{dx}{x^2 + 2x + 5} &= \int \frac{dx}{(x + 1)^2 + 4} \\
&= \int \frac{du}{u^2 + 4} \\
&= \frac{1}{2} \arctan\left( \frac{x + 1}{2} \right) + c.
\end{align*}
We used the substitution u = x+1, and then we proceeded as in the previous
example.
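As with every antiderivative, the result of a substitution can be checked by differentiation. The short computation below is our own check of the last example, using only the chain rule and the derivative of the arctangent; it is not taken from the text.

```latex
\begin{align*}
\frac{d}{dx}\left[ \frac{1}{2} \arctan\left( \frac{x+1}{2} \right) \right]
  &= \frac{1}{2} \cdot \frac{1}{1 + \left( \frac{x+1}{2} \right)^2} \cdot \frac{1}{2} \\
  &= \frac{1}{4} \cdot \frac{4}{4 + (x+1)^2} \\
  &= \frac{1}{x^2 + 2x + 5}.
\end{align*}
```

The factor 1/2 from the inner function and the factor 1/2 in front combine with the 4 in the denominator, which is why the constant in front of the arctangent is 1/2 and not 1/4.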
3.9.1 Substitution and Definite Integrals
Let us now explore how substitution is used to calculate definite integrals.
Assuming as before that f and g′ are continuous on the interval [a, b], we
have
\[ \int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du. \tag{3.16} \]
To see this, observe that f has an anti-derivative, which we again denote
by F. Then
\[ \int_a^b f(g(x))\,g'(x)\,dx = F(g(x))\,\Big|_a^b = F(u)\,\Big|_{g(a)}^{g(b)} = \int_{g(a)}^{g(b)} f(u)\,du. \]
The first identity is obtained as a combination of the Fundamental Theorem
of Calculus and (3.14). The second one is obvious, and the third one is
another application of the Fundamental Theorem of Calculus.
Let us apply this formula in a few examples.
\[ \int_0^1 (x^2 - 1)(x^3 - 3x + 5)^3\,dx = \frac{1}{3} \int_5^3 u^3\,du = \frac{1}{12} u^4\,\Big|_5^3 = -\frac{136}{3}. \]
We used the substitution $u = x^3 - 3x + 5$. Then $du = (3x^2 - 3)\,dx$, and
$\frac{1}{3}\,du = (x^2 - 1)\,dx$. To obtain the limits for the integral we calculate u(0) = 5
and u(1) = 3.
Another example is
\[ \int_0^{\pi/4} \cos^2 x \sin x\,dx = -\int_1^{\sqrt{2}/2} u^2\,du = -\frac{1}{3} u^3\,\Big|_1^{\sqrt{2}/2} = \frac{1}{3} \left( 1 - \frac{\sqrt{2}}{4} \right). \]
We use the substitution u = cos x. Then −du = sin x dx. If x = 0, then
u = 1, and if x = π/4, then u = √2/2.
Incorporating one of our previous techniques, we calculate
\[ \int_0^2 x(x + 1)^6\,dx = \int_1^3 (u - 1) u^6\,du = \int_1^3 (u^7 - u^6)\,du = \frac{3554}{7}. \]
We use the substitution u = x + 1. Then du = dx and x = u − 1. If x = 0,
then u = 1, and if x = 2, then u = 3.
Similarly,
\[ \int_0^{\sqrt{8}} x^3 \sqrt{x^2 + 1}\,dx = \frac{1}{2} \int_1^9 (u - 1)\sqrt{u}\,du = \frac{1}{2} \int_1^9 \left( u^{3/2} - u^{1/2} \right) du = \frac{596}{15}. \]
We use the substitution $u = x^2 + 1$. Then $\frac{1}{2}\,du = x\,dx$ and $x^2 = u - 1$. For
the limits we calculate, if x = 0, then u = 1, and if x = √8, then u = 9.
Finally,
\[ \int_0^1 \sqrt{1 - x^2}\,dx = \int_0^{\pi/2} \sqrt{1 - \sin^2 u}\,\cos u\,du = \int_0^{\pi/2} \cos^2 u\,du = \frac{\pi}{4}. \]
We use the substitution x = sin u. Then dx = cos u du. If x = 0, then
u = 0, and if x = 1, then u = π/2. For our given values of x, there are other
possible values for u, but they will lead to the same results.
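Formula (3.16) can be tested numerically by evaluating both sides separately. The sketch below does this for the first example of this subsection, with the factor 1/3 absorbed into f; the helper `midpoint` and the number of subintervals are our own assumptions.

```python
def midpoint(f, a, b, n=20_000):
    """Midpoint sum for the integral of f from a to b.
    If b < a the step h is negative, in line with Definition 3.18."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

g = lambda x: x ** 3 - 3 * x + 5       # the substitution u = g(x)
g_prime = lambda x: 3 * x * x - 3
f = lambda u: u ** 3 / 3               # integrand in u, including the factor 1/3

# Left side of (3.16): integrate f(g(x)) g'(x) over [0, 1].
left = midpoint(lambda x: f(g(x)) * g_prime(x), 0.0, 1.0)
# Right side of (3.16): integrate f(u) from g(0) = 5 to g(1) = 3.
right = midpoint(f, g(0.0), g(1.0))

print(left, right, -136 / 3)
```

Note that the limits on the right side run from 5 down to 3, which is where the negative sign of the result comes from.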
Remark 12. The graph of $f(x) = \sqrt{1 - x^2}$ is the northern part of a circle.
Using x ∈ [0, 1] means that we calculated the area under this graph in the
first quadrant, i.e., the area of one fourth of the disk of radius 1. You were
told a long time ago in school that the area of this unit disk is π, so that the
result of the calculation is hardly surprising.
There is a more serious matter. Is the example genuine, or did we assume
the answer previously? By definition, π is the ratio of the circumference of
a circle to its diameter. In our calculation of the derivative of the sine and
cosine functions we used the estimate that $|\sin h - h| \le h^2/2$. When we
showed this, we used that |h| ≤ | tan h| for h ∈ [−π/4, π/4]. A typical proof
of the latter inequality starts out by first showing that the area of the unit
disk is π. This means, we assumed the result in the example, we did not
derive it.
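Exercise 48. Find the following integrals:
(a) $\int \frac{dx}{\sqrt{2x + 1}}$
(b) $\int \frac{t}{(4t^2 + 9)^2}\,dt$
(c) $\int t(1 + t^2)^3\,dt$
(d) $\int \frac{2s}{\sqrt[3]{6 - 5s^2}}\,ds$
(e) $\int \frac{b^3 x^3}{\sqrt{1 - a^4 x^4}}\,dx$
(f) $\int_0^\pi x \cos x^2\,dx$
(g) $\int x^2 \sqrt{x + 1}\,dx$
(h) $\int \frac{x + 3}{\sqrt{x + 1}}\,dx$
(i) $\int \sin^2(3x)\,dx$
(j) $\int_0^{\pi/2} \cos^2 x\,dx$
(k) $\int_{\pi/6}^{\pi/4} \sec(2x) \tan(2x)\,dx$
(l) $\int_0^{1/2} \frac{dx}{4 + x^2}$
(m) $\int \frac{\sec^2 x}{\sqrt{1 + \tan x}}\,dx$
(n) $\int \sqrt{1 + \sin x}\,\cos x\,dx$
(o) $\int_0^r \sqrt{r^2 - x^2}\,dx$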
3.10 Areas between Graphs
Previously we related the integral to areas of a region under a graph. This
idea can be generalized to the discussion of areas of regions between two
graphs. Let us look at an example.
Example 3.28. Calculate the area of the region between the graphs of the
functions $f(x) = x^2$ and $g(x) = \sqrt{1 - x^2}$.
Solution: To get a better understanding, we draw the two graphs, see
Figure 3.13. Now you see the region between the two graphs whose area we
want to calculate. We call the region Ω.
The graphs intersect in two points. To find their x-coordinates, we solve
the equation
\[ f(x) = x^2 = g(x) = \sqrt{1 - x^2}. \]
After squaring the equation and solving it for $x^2$, we find $x^2 = \frac{-1 \pm \sqrt{5}}{2}$.
Only the + sign occurs as $x^2 \ge 0$. Taking the square root, we find the
x-coordinates of the points where the curves intersect:
\[ A = -\sqrt{\frac{-1 + \sqrt{5}}{2}} \quad\text{and}\quad B = \sqrt{\frac{-1 + \sqrt{5}}{2}}. \]
Figure 3.13: Region between two graphs
Figure 3.14: Region between two graphs
To get the areas of the regions under the graphs of f(x) and g(x) over the
interval [A, B] we can calculate the appropriate integrals. To get the area of
the region Ω between the graphs, we take the area of the region under the
graph of g(x) and subtract the area of the region under the graph of f(x).
Concretely:
\[ \operatorname{Area}(\Omega) = \int_A^B g(x)\,dx - \int_A^B f(x)\,dx = \int_A^B (g(x) - f(x))\,dx \approx 1.06651. \]
The numerical value was obtained by computer. You are invited to work
out the integral with the help of the Fundamental Theorem of Calculus to
verify the result. ♦
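One way to carry out this verification is sketched below. It assumes the antiderivative $\frac{1}{2}\left(x\sqrt{1-x^2} + \arcsin x\right)$ of $\sqrt{1-x^2}$, which one can confirm by differentiation; the function name `G` is our own.

```python
import math

# Right intersection point B from Example 3.28; the left one is A = -B.
B = math.sqrt((math.sqrt(5) - 1) / 2)

def G(x):
    """An antiderivative of sqrt(1 - x**2) - x**2 on [-1, 1]."""
    return 0.5 * (x * math.sqrt(1 - x * x) + math.asin(x)) - x ** 3 / 3

# Fundamental Theorem of Calculus: Area = G(B) - G(A).
area = G(B) - G(-B)
print(area)
```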
Some problems are a bit more subtle.
Example 3.29. Find the area of the region between the graphs of the functions
f(x) = cos x and g(x) = sin x for x between 0 and π.
Solution: The region Ω between the graphs is shown in Figure 3.14.
The region breaks up into two pieces, the region $\Omega_1$ over the interval [0, π/4]
on which f(x) ≥ g(x), and the region $\Omega_2$ over the interval [π/4, π] where
g(x) ≥ f(x). We calculate the areas of the regions $\Omega_1$ and $\Omega_2$ separately.
In each case, we proceed as in the previous example:
\begin{align*}
\operatorname{Area}(\Omega_1) &= \int_0^{\pi/4} (\cos x - \sin x)\,dx = (\sin x + \cos x)\,\Big|_0^{\pi/4} = \sqrt{2} - 1, \\
\operatorname{Area}(\Omega_2) &= \int_{\pi/4}^{\pi} (\sin x - \cos x)\,dx = -(\sin x + \cos x)\,\Big|_{\pi/4}^{\pi} = 1 + \sqrt{2}.
\end{align*}
In summary we find:
\[ \operatorname{Area}(\Omega) = \operatorname{Area}(\Omega_1) + \operatorname{Area}(\Omega_2) = 2\sqrt{2}. \]
An additional remark may be in order. When we compared integrals and
areas, we had to take into account where the function is non-negative, resp.,
non-positive. Here we did not. We took care of this aspect by breaking up
the interval into the part where f(x) ≥ g(x) and the part where g(x) ≥ f(x).
♦
Our general definition for the area between two graphs is as follows.
Definition 3.30. Suppose f(x) and g(x) are integrable functions over an
interval [a, b]. Let Ω be the region between the graphs of f(x) and g(x) for
x between a and b. The area of Ω is
\[ \operatorname{Area}(\Omega) = \int_a^b |f(x) - g(x)|\,dx. \]
This definition generalizes Definition 3.7 on page 114. The definition is
also consistent with the intuitive idea of the area of a region, and it
incorporates and generalizes Proposition 3.23 on page 121. Taking the absolute
value of the difference of f(x) and g(x) allows us to avoid the question where
f(x) ≥ g(x) and where g(x) ≥ f(x). Typically this problem gets addressed
when the integral is calculated. In some problems a and b are explicitly
given, in others you have to determine them from context. In all cases it is
good to graph the functions before calculating the area of the region between
them. Having the correct picture in mind helps you to avoid mistakes.
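When the integral is evaluated numerically, the absolute value can be used directly and no case distinction is needed. The sketch below applies this to the functions of Example 3.29; the helper `area_between` and the number of subintervals are hypothetical choices.

```python
import math

def area_between(f, g, a, b, n=40_000):
    """Midpoint sum for the integral of |f(x) - g(x)| over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        total += abs(f(x) - g(x))
    return h * total

area = area_between(math.cos, math.sin, 0.0, math.pi)
print(area, 2 * math.sqrt(2))
```

The two printed values should agree closely, in line with the result found by splitting the interval at π/4.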
Exercise 49. Sketch and find the area of the region bounded by the curves:
(a) $y = x^2$ and $y = x^3$.
(b) $y = 8 - x^2$ and $y = x^2$.
(c) $y = x^2$ and y = 3x + 5.
(d) y = sin x and $y = \pi x - x^2$.
(e) y = sin x and y = 2 sin x cos x for x between 0 and π.
3.11 Numerical Integration
The Fundamental Theorem of Calculus provided us with a highly efficient
method for calculating definite integrals. Still, for some functions we have
no good expression for their anti-derivatives. In such cases we may have to
rely on numerical methods for integrating. Let us take such a function, and
show some methods for finding an approximate value for the integral.
We describe different ways to find, by numerical means, approximate
values for the integral of a function f(x) over the interval [a, b]:
\[ \int_a^b f(x)\,dx. \]
In all of the different approaches we partition the interval into smaller ones:
\[ a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b. \]
Left and Right Endpoint Method: In the left endpoint method we
find the value of the function at each left endpoint of the intervals of the
partition. We multiply it by the length of the associated interval, and
then add up the terms. Explicitly, we calculate
\[ I_L = f(x_0)(x_1 - x_0) + f(x_1)(x_2 - x_1) + \cdots + f(x_{n-1})(x_n - x_{n-1}). \tag{3.17} \]
In the right endpoint method we proceed as we did in the left endpoint
method, only we use the value of the function at the right endpoint instead
of the left endpoint:
\[ I_R = f(x_1)(x_1 - x_0) + f(x_2)(x_2 - x_1) + \cdots + f(x_n)(x_n - x_{n-1}). \tag{3.18} \]
Both expressions provide us with specific examples of Riemann sums.
Example 3.31. Use the left and right endpoint method to find approximate
values for
\[ \int_0^2 e^{-x^2}\,dx. \]
Solution: Set $f(x) = e^{-x^2}$ and choose the partition:
\[ x_0 = 0 < x_1 = \tfrac{1}{2} < x_2 = 1 < x_3 = \tfrac{3}{2} < x_4 = 2. \]
Then $x_k - x_{k-1} = 1/2$ for k = 1, 2, 3, and 4. Formula (3.17) for $I_L$ specializes
to
\[ I_L = \frac{f(0) + f(1/2) + f(1) + f(3/2)}{2} \approx 1.126039724. \]
Formula (3.18) for $I_R$ specializes to
\[ I_R = \frac{f(1/2) + f(1) + f(3/2) + f(2)}{2} \approx 0.6351975438. \]
Figure 3.15: Use left end points
Figure 3.16: Use right end points
Apparently, $I_L$ and $I_R$ are calculated by combining the areas of certain
rectangles. In our case the values of f(x) are all positive and all of the
rectangles are above the x axis, so the areas of the rectangles are all added.
Note also, that our specific function f(x) is decreasing on the interval [0, 2],
so that $I_L$ is an upper sum for the function f(x) over the interval [0, 2], and
$I_R$ is a lower sum. In this sense, we have
\[ I_R = 0.6351975438 \le \int_0^2 e^{-x^2}\,dx \le I_L = 1.126039724. \]
The function and the rectangles whose areas are added to give us $I_L$ and $I_R$
are shown in Figure 3.15 and Figure 3.16. ♦
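Formulas (3.17) and (3.18) translate directly into a few lines of code. The following is a minimal sketch; the function names `left_sum` and `right_sum` are ours, and the partition is the one from Example 3.31.

```python
import math

def left_sum(f, points):
    """Left endpoint sum (3.17) for the partition given by points."""
    return sum(f(l) * (r - l) for l, r in zip(points, points[1:]))

def right_sum(f, points):
    """Right endpoint sum (3.18) for the partition given by points."""
    return sum(f(r) * (r - l) for l, r in zip(points, points[1:]))

f = lambda x: math.exp(-x * x)
points = [0.0, 0.5, 1.0, 1.5, 2.0]

I_L = left_sum(f, points)
I_R = right_sum(f, points)
print(I_L, I_R)
```

The partition does not have to be evenly spaced; any increasing list of points from a to b works.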
Midpoint and Trapezoid Method: We may try and improve on the
endpoint methods. In the midpoint method, we use the value of the function
at the midpoints of the intervals of the partition. That should be less biased.
We use the same partition and notation as above. Then the formula for the
midpoint method is:
\[ I_M = f\left( \frac{x_0 + x_1}{2} \right)(x_1 - x_0) + \cdots + f\left( \frac{x_n + x_{n-1}}{2} \right)(x_n - x_{n-1}). \tag{3.19} \]
In the trapezoid method we do not take the function at the average (i.e.
midpoint) of the end points of the intervals in the partition, but we average
the values of the function at the end points. Specifically, the formula is
\[ I_T = \frac{f(x_0) + f(x_1)}{2}(x_1 - x_0) + \cdots + \frac{f(x_{n-1}) + f(x_n)}{2}(x_n - x_{n-1}). \tag{3.20} \]
It is quite easy to see that
\[ I_T = \frac{I_L + I_R}{2}. \tag{3.21} \]
Let us explain the reference to the word trapezoid. For simplicity, suppose
that f(x) is non-negative on the interval [a, b]. Consider the trapezoid
of width $(x_1 - x_0)$ which has height $f(x_0)$ at its left and $f(x_1)$ at its right
edge. The area of this trapezoid is $\frac{f(x_0) + f(x_1)}{2}(x_1 - x_0)$. This is the first
summand in the formula for $I_T$, see (3.20). We have such a trapezoid over
each of the intervals in the partition, and their areas are added to give $I_T$.
Expressed differently, we can draw a secant line through the points
$(x_0, f(x_0))$ and $(x_1, f(x_1))$. This gives us the graph of a function T(x)
over the interval $[x_0, x_1]$. Over the interval $[x_1, x_2]$ the graph of T(x) is the
secant line through the points $(x_1, f(x_1))$ and $(x_2, f(x_2))$. Proceeding in
this fashion, we use appropriate secant lines above all of the intervals in the
partition to define the function T(x) over the entire interval [a, b]. Then
\[ I_T = \int_a^b T(x)\,dx. \]
This integral is easily computed by the formula in (3.20).
Example 3.32. Use the midpoint and trapezoid method to find approximate
values for
\[ \int_0^2 e^{-x^2}\,dx. \]
Solution: We use the same partition of [0, 2] as in Example 3.31. The
formula for $I_M$ (see (3.19)) specializes to
\[ I_M = \frac{f(.25) + f(.75) + f(1.25) + f(1.75)}{2} \approx 0.8827889485. \]
As for the endpoint methods, $I_M$ is the combined area of certain rectangles.
Their heights are the values of f at the midpoints of the intervals of the
partition. Their widths are the lengths of the intervals of the partition. You
see the rectangles for this calculation in Figure 3.17.
Figure 3.17: Use midpoints
Figure 3.18: Trapezoid Method
Based on our previous calculations and Formula (3.21) we find
\[ I_T = \frac{I_L + I_R}{2} \approx 0.8806186341. \]
We illustrated this calculation in Figure 3.18. There you see the function
$f(x) = e^{-x^2}$ and five dots on the graph. The dots are connected by straight
line segments. These line segments form the graph of a function T(x), and
$I_T$ is the area of the region under this graph. So
\[ I_T = \int_0^2 T(x)\,dx. \] ♦
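Formulas (3.19) and (3.20) can be coded in the same style as the endpoint methods. This is again only a sketch; the names `midpoint_sum` and `trapezoid_sum` are hypothetical, and the partition is the one used in the example.

```python
import math

def midpoint_sum(f, points):
    """Midpoint sum (3.19) for the partition given by points."""
    return sum(f((l + r) / 2) * (r - l) for l, r in zip(points, points[1:]))

def trapezoid_sum(f, points):
    """Trapezoid sum (3.20) for the partition given by points."""
    return sum((f(l) + f(r)) / 2 * (r - l) for l, r in zip(points, points[1:]))

f = lambda x: math.exp(-x * x)
points = [0.0, 0.5, 1.0, 1.5, 2.0]

I_M = midpoint_sum(f, points)
I_T = trapezoid_sum(f, points)
print(I_M, I_T)
```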
Simpson’s Method: In Simpson’s method we combine the endpoint
and midpoint methods in a weighted fashion. Again, we use the same
notation for the function and the partition as above. The specific formula for
an approximate value of the integral of f(x) over [a, b] is
\begin{align*}
I_S = {} & \frac{1}{6} \left[ f(x_0) + 4f\left( \frac{x_0 + x_1}{2} \right) + f(x_1) \right](x_1 - x_0) + \cdots \\
& + \frac{1}{6} \left[ f(x_{n-1}) + 4f\left( \frac{x_{n-1} + x_n}{2} \right) + f(x_n) \right](x_n - x_{n-1}) \tag{3.22}
\end{align*}
It is quite easy to see that
\[ I_S = \frac{I_L + 4I_M + I_R}{6} = \frac{I_T + 2I_M}{3}. \]
Let us explain the background to Simpson’s method. We define a function
P(x) over the interval [a, b] by defining a polynomial of degree at most 2 on each
of the intervals of the partition. The polynomial over the interval $[x_{k-1}, x_k]$
is chosen so that it agrees with f(x) at the end points and at the midpoint
of this interval. Simpson’s method is a refinement of the Trapezoid method.
In one method we use two points on the graph and connect them by a
straight line segment. In the other one we use three points on the graph and
construct a parabola through them. With some work one can show that
\[ I_S = \int_a^b P(x)\,dx. \]
Figure 3.19: Simpson’s Method
Example 3.33. Use Simpson’s method to find an approximate value for
\[ \int_0^2 e^{-x^2}\,dx. \]
Solution: We use the same partition of [0, 2] as in Example 3.31. The
formula for $I_S$ (see (3.22)) specializes to
\[ I_S = \frac{I_L + 4I_M + I_R}{6} \approx 0.8820655104, \]
where $I_L$, $I_M$ and $I_R$ are as above.
You see the method illustrated in Figure 3.19. There you see the graphs
of two functions, the function $f(x) = e^{-x^2}$ and the function P(x) from the
discussion of Simpson’s method. Only the thickness of the line suggests that
there are two graphs of almost identical functions. ♦
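Formula (3.22) can be implemented interval by interval. The sketch below uses a hypothetical function name `simpson` and the partition from Example 3.31; the second call illustrates that the method reproduces the integral of a parabola up to round-off, as the construction of P(x) suggests.

```python
import math

def simpson(f, points):
    """Simpson sum (3.22): on each interval, weight the endpoint values
    by 1 and the midpoint value by 4, then divide by 6."""
    total = 0.0
    for l, r in zip(points, points[1:]):
        total += (f(l) + 4 * f((l + r) / 2) + f(r)) * (r - l) / 6
    return total

f = lambda x: math.exp(-x * x)
I_S = simpson(f, [0.0, 0.5, 1.0, 1.5, 2.0])
print(I_S)

# For f(x) = x**2 on [0, 3] the result should be the exact integral 9,
# even on an uneven partition.
print(simpson(lambda x: x * x, [0.0, 1.0, 3.0]))
```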
Example 3.34. Compare the accuracy of the various approximate values
of
\[ \int_0^2 e^{-x^2}\,dx. \]
Solution: We compare the approximate values for the integral obtained
by the different formulas. We partition the interval [0, 2] into n intervals
of the same length, and vary n. We tabulate the results. They should be
compared with an approximate value for the integral of
0.882081390762421.
        n = 1       n = 10         n = 100          n = 1000
I_L     2.0000000   0.9800072469   0.891895792451   0.883063050702697
I_R     0.0366313   0.7836703747   0.872262105229   0.881095681980474
I_M     0.7357589   0.8822020700   0.882082611663   0.882081402972833
I_T     1.0183156   0.8818388108   0.882078948840   0.882081366341586
I_S     0.8299445   0.8820809836   0.882081390722   0.882081390762417
Table 3.1: Approximate Values of the Integral
Simpson’s method is more accurate than the other ones. E.g., Simpson’s
method with n = 4 gives a result which is better than the left and right
endpoint method with n = 1000. Even if you use the midpoint and trapezoid
methods with n = 1000, the result is far less accurate than Simpson's
method with n = 100. ♦
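The comparison in this example can be reproduced with a few lines of code. This is a sketch under stated assumptions: equal subintervals, and a reference value computed from the error function via the identity $\int_0^2 e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(2)$, which is not used in the text. The helper name `rules` is illustrative.

```python
import math

def f(x):
    return math.exp(-x * x)

def rules(f, a, b, n):
    """Return the five approximations for n equal subintervals of [a, b]."""
    h = (b - a) / n
    left = math.fsum(f(a + k * h) for k in range(n)) * h
    right = math.fsum(f(a + (k + 1) * h) for k in range(n)) * h
    mid = math.fsum(f(a + (k + 0.5) * h) for k in range(n)) * h
    trap = (left + right) / 2
    simp = (left + 4 * mid + right) / 6
    return {"I_L": left, "I_R": right, "I_M": mid, "I_T": trap, "I_S": simp}

# Reference value (assumption of this sketch): (sqrt(pi) / 2) * erf(2).
reference = math.sqrt(math.pi) / 2 * math.erf(2.0)

# Absolute error of each method for several values of n.
errors = {
    n: {name: abs(value - reference) for name, value in rules(f, 0.0, 2.0, n).items()}
    for n in (1, 4, 10, 100, 1000)
}

for n, row in errors.items():
    print(n, {name: f"{err:.2e}" for name, err in row.items()})
```

Printing the errors rather than the raw values makes the different rates of convergence easier to see.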
Remark 13. It is important that we keep the number n of intervals into
which we partition [a, b] small. This keeps the overall number of
computations small, but there is a second reason. In each computational step we
expect to make a round-off error, and these errors may add up. The fewer
computations we make, the smaller the cumulative round-off error will be.
Exercise 50. Proceed as in Example 3.34 and compare the different methods
applied to the calculation of
\[ \int_0^{\pi/2} \sin x\,dx = 1. \]
3.12 Applications of the Integral
In Definition 3.7 and Proposition 3.23 we related definite integrals to areas.
Based on the context, this can have a more concrete meaning. Consider a
function f(t) on an interval [a, b] and the integral
\[ I = \int_a^b f(t)\,dt. \]
If f(t) stands for the rate at which a drug is absorbed, then I is the total
amount of the drug which has been absorbed in the time interval [a, b]. If
f(t) stands for the speed with which you travel, then I stands for the total
distance which you traveled during the time interval [a, b]. You are invited
to come up with more interpretations. In addition, the following definition
expresses the common notion of the average value of a function.
Definition 3.35. Suppose that f(t) is an integrable function over the interval
[a, b]. Then the quantity
\[ f_{av} := \frac{1}{b - a} \int_a^b f(t)\,dt \]
is called the average value of f(t) over the interval [a, b].
For example, the average value of the sine function f(x) = sin x over the
interval [0, π] is 2/π.
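To see where this value comes from, one can apply Definition 3.35 with a = 0 and b = π. The following short calculation uses only the anti-derivative $-\cos x$ of $\sin x$:

```latex
\[
f_{av} = \frac{1}{\pi - 0} \int_0^{\pi} \sin x \, dx
       = \frac{1}{\pi} \Big[ -\cos x \Big]_0^{\pi}
       = \frac{1}{\pi} \big( 1 + 1 \big)
       = \frac{2}{\pi} \approx 0.6366 .
\]
```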
Let us explore the different aspects of integration in an example.
Example 3.36. The river Little Brook flows into a reservoir, referred to
as Beaver Pond by the locals. The amount of water carried by the river
depends on the season. As a function of time, it is
\[ g(t) = 2 + \sin\left(\frac{\pi t}{180}\right). \]
We measure time in days, and t = 0 corresponds to New Year. The units
of g(t) are millions of liters of water per day. Water is released from Beaver
Pond at a constant rate of 2 million liters per day. At the beginning of the
year, there are 200 million liters of water in the reservoir.
(a) How many liters of water are in Beaver Pond by the end of April?
(b) Suppose F(t) tells how much water there is in the reservoir on day t
of the year. Find F(t).
(c) At what rate does the amount of water in the reservoir change at the
beginning of September?
(d) On which days will there be 250 million liters of water in Beaver Pond?
(e) At what amount of water will the reservoir crest?
(f) On the average, by how much has the amount of water in Beaver Pond
increased per day during the first three months of the year?
Solution: Water enters and leaves the pond. The net rate entering is
\[ f(t) = g(t) - 2 = \sin\left(\frac{\pi t}{180}\right) \]
millions of liters per day.
We obtain the total change of the amount of water in the reservoir by integrating
f(t). Set
\[ A(T) = \int_0^T f(t)\,dt. \]
On the T-th day of the year, the total amount of water in Beaver Pond is
\[ F(T) = 200 + \int_0^T f(t)\,dt = 200 + \frac{180}{\pi}\left(1 - \cos\left(\frac{\pi T}{180}\right)\right) \]
millions of liters.
This answers (b). By the end of April, after 120 days, there are
\[ F(120) = 200 + \frac{180}{\pi}\left(1 - \cos\frac{2\pi}{3}\right) = 200 + \frac{180}{\pi}\cdot\frac{3}{2} \approx 285.9 \]
millions of liters of water in the pond. This answers (a).
The rate at which the amount of water in the pond changes is $F'(t) = f(t)$.
At the beginning of September, after 240 days, the rate of change is
f(240) ≈ −0.866. The pond is losing water at a rate of 866,000 liters per
day. This answers (c).
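To answer (d), we would like to know for which T we have F(T) = 250. We
solve the equation for T:
\[ 250 = 200 + \frac{180}{\pi}\left(1 - \cos\left(\frac{\pi T}{180}\right)\right) \quad\text{or}\quad \cos\left(\frac{\pi T}{180}\right) = 1 - \frac{5\pi}{18}. \]
We apply the function arccos to both sides of the last equation and find
\[ T = \frac{180}{\pi} \arccos\left(1 - \frac{5\pi}{18}\right) \approx 83. \]
Because cos(πT/180) takes the same value at T and at 360 − T, the second
solution within the year is T ≈ 277.
On the 83rd and 277th day of the year there will be 250 million liters
of water in the reservoir.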
To find at which amount the reservoir crests, we have to find the maximum
value of F(t). It occurs when cos(πt/180) = −1, that is, when
t = 180. The pond crests at mid-year, and then it holds
about 314.6 million liters of water. This answers (e).
After three months, or 90 days, there are about 257.3 million liters
of water in Beaver Pond. Within this time, the amount of water has increased
by 57.3 million liters. On the average, the amount of water in
the reservoir increased by about 640,000 liters per day. This answers (f). ♦
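The numbers in this example can be checked by evaluating the formula for F(T) directly. The following is a small sketch; the variable names are illustrative, and it follows the example's convention of counting 30 days per month.

```python
import math

def F(T):
    """Water in the pond on day T, in millions of liters (formula from part (b))."""
    return 200 + (180 / math.pi) * (1 - math.cos(math.pi * T / 180))

def f(t):
    """Net inflow rate F'(t), in millions of liters per day."""
    return math.sin(math.pi * t / 180)

amount_end_of_april = F(120)                                    # part (a)
rate_september = f(240)                                         # part (c)
first_day = (180 / math.pi) * math.acos(1 - 5 * math.pi / 18)   # part (d)
second_day = 360 - first_day                                    # part (d), by symmetry
crest = F(180)                                                  # part (e)
average_increase = (F(90) - F(0)) / 90                          # part (f)

print(round(amount_end_of_april, 1), round(rate_september, 3))
print(round(first_day, 1), round(second_day, 1))
print(round(crest, 1), round(average_increase, 3))
```

Substituting the two computed days back into F is a convenient way to confirm that both solve F(T) = 250.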
Exercise 51. A pain reliever has been formulated such that it is absorbed
at a rate of 600 sin(πt) (mg/hr) by the body. Here t measures time in hours,
t = 0 at the time you take the medication, and the absorption process is
complete at time t = 1.
(a) What is the total amount of the drug which is absorbed?
(b) Find a function F(t), such that F(t) tells how much medication has
been absorbed at time t.
(c) A total of 150 mg of the medication has to be absorbed before the
drug is effective. How long does it take until this threshold is reached?
3.13 The Exponential and Logarithm Functions
In Section 1.10 we introduced the exponential function $\exp(x) = e^x$ and the
natural logarithm function ln x. At the time we only stated that they exist
because we did not have the tools to properly define them. We will now fill
in the details. Many of the routine calculations are formulated as exercises.
Definition 3.37. Let x ∈ (0, ∞). The natural logarithm of x is defined as
\[ \ln x = \int_1^x \frac{dt}{t}. \tag{3.23} \]
Theorem 3.38. The natural logarithm function is differentiable on its entire
domain (0, ∞), its derivative is
\[ \ln' x = \frac{1}{x}, \]
and ln x is increasing on (0, ∞).
Proof. The function 1/x is defined and continuous on (0, ∞). According
to Theorem 3.10 this means that ln x is defined for all x in (0, ∞). Theorem 3.26
tells us that $\ln' x = 1/x$. According to Theorem 2.11, the function
is increasing because its derivative $\ln' x > 0$ for all x > 0.
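Definition 3.37 can be turned into a computation. The sketch below approximates the integral in (3.23) with Simpson's method and compares the result with the library logarithm; the function name `ln_by_integral` and the default of 2000 subintervals are assumptions made for illustration.

```python
import math

def ln_by_integral(x, n=2000):
    """Approximate ln x = integral of dt/t from 1 to x by Simpson's method.

    n is the number of equal subintervals. For 0 < x < 1 the step h is
    negative, which accounts for integrating from 1 down to x.
    """
    if x <= 0:
        raise ValueError("ln x is only defined for x > 0")
    h = (x - 1.0) / n
    total = 0.0
    for k in range(n):
        left = 1.0 + k * h
        right = left + h
        mid = (left + right) / 2
        total += (1 / left + 4 / mid + 1 / right) * h / 6
    return total

for x in (0.5, 1.0, 2.0, 4.0, 10.0):
    print(x, ln_by_integral(x), math.log(x))
```

The printed pairs should agree to many digits, and the values increase with x, in line with Theorem 3.38.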
Let us also verify one of the central equations for calculating with loga-
rithms, the third rule in Theorem 1.34.
Proposition 3.39. For any x, y > 0,
ln(xy) = ln x + ln y. (3.24)
Proof. We need a short calculation. Here x and y are fixed positive numbers.
We use the substitution u = t/x, so that du = (1/x) dt. For the adjustment of
the limits of integration, observe that t/x = u = 1 when t = x, and that
t/x = u = y when t = xy. Then
\[ \int_x^{xy} \frac{dt}{t} = \int_x^{xy} \frac{1}{(t/x)} \cdot \frac{1}{x}\,dt = \int_1^y \frac{du}{u} = \ln y. \]
Using this calculation we deduce that
\[ \ln(xy) = \int_1^{xy} \frac{dt}{t} = \int_1^x \frac{dt}{t} + \int_x^{xy} \frac{dt}{t} = \ln x + \ln y. \]
This is exactly our claim.
Exercise 52. Show:
(1) ln 1 = 0.
(2) ln(1/y) = −ln y for all y > 0.
(3) ln(x/y) = ln x −ln y for all x, y > 0.
Exercise 53. Show that ln 4 > 1. Hint: Using the partition
\[ 1 = x_0 < 2 = x_1 < 3 = x_2 < 4 = x_3, \]
find a lower sum $S_l$ for the function 1/t over the interval [1, 4] so that $S_l > 1$.
We can now define the Euler number:
Definition 3.40. The Euler number e is the unique number such
that
\[ \ln e = 1 \quad\text{or, equivalently,}\quad \int_1^e \frac{dt}{t} = 1. \]
For this definition to make sense, we have to show that there is exactly one number
e which has the property used in the definition. To see this, observe that
ln 1 = 0 < 1 < ln 4. Because ln x is differentiable, and hence continuous, it follows from the
Intermediate Value Theorem (see Theorem 1.16) that there is a number e
for which ln e = 1. Because ln x is increasing, there is only one such number.
It also follows that 1 < e < 4.
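The existence argument suggests a way to approximate e: repeatedly halve the interval [1, 4] and keep the half on which ln x still crosses the value 1. The sketch below does this by bisection; it assumes that `math.log` may stand in for the integral definition of ln x, and the function name is illustrative.

```python
import math

def solve_ln_equals_one(lo=1.0, hi=4.0, tol=1e-12):
    """Bisection for ln(e) = 1 on [1, 4], where ln 1 < 1 < ln 4.

    math.log stands in for the integral definition of ln x.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if math.log(mid) < 1.0:
            lo = mid      # ln is increasing, so the solution lies to the right
        else:
            hi = mid      # the solution lies to the left of mid (or at mid)
    return (lo + hi) / 2

e_approx = solve_ln_equals_one()
print(e_approx)
```

Bisection relies on exactly the two properties used in the text: continuity of ln x for existence and monotonicity for uniqueness.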
Proposition 3.41. For every real number x there exists exactly one positive
number y, such that
ln y = x (3.25)
Proof. Observe that $\ln(e^n) = n$ and $\ln(1/e^n) = -n$ for all natural numbers
n. So all integers (whole numbers) are values of the natural logarithm
function. Every real number x lies between two integers. According to the
Intermediate Value Theorem, every real number is a value of the function
ln y. We saw that ln y is an increasing function. This means that, for any
given x, the equation ln y = x has at most one solution. Taken together it
means that it has a unique solution.
Exercise 54. Show that
\[ \ln(a^r) = r \ln a \]
for all positive numbers a and all rational numbers r, i.e., numbers of the
form r = p/q where p and q are integers and q ≠ 0.
In summary, we have seen that
Corollary 3.42. The natural logarithm function ln x is a differentiable, increasing
function with domain (0, ∞) and range (−∞, ∞), and $\ln' x = 1/x$.
We are now ready to define the exponential function.
Definition 3.43. Given any real number x, we define exp(x) to be the
unique number for which
ln(exp(x)) = x, (3.26)
i.e., y = exp(x) is the unique solution of the equation ln(y) = x. This
assignment (mapping x to exp(x)) defines a function, called the exponential
function, with domain (−∞, ∞) and range (0, ∞).
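Definition 3.43 describes exp(x) as the solution of an equation, and this can be imitated numerically. The following sketch solves ln(y) = x by bisection; the name `exp_by_inverse`, the bracketing by powers of 2, and the use of `math.log` in place of the integral definition are assumptions of the sketch, not part of the text.

```python
import math

def exp_by_inverse(x, tol=1e-13):
    """Solve ln(y) = x for y > 0 by bisection; math.log plays the role of ln."""
    # Bracket the solution between powers of 2: ln(lo) <= x <= ln(hi).
    lo, hi = 1.0, 1.0
    while math.log(lo) > x:
        lo /= 2
    while math.log(hi) < x:
        hi *= 2
    # ln is increasing, so bisection keeps the unique solution inside [lo, hi].
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        if math.log(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for x in (-5.0, -1.0, 0.0, 0.5, 1.0, 3.0):
    print(x, exp_by_inverse(x), math.exp(x))
```

The bracketing step mirrors the proof of Proposition 3.41, where every real number is placed between two values of the logarithm.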
Exercise 55. Show that the exponential function exp and the natural log-
arithm function ln are inverses of each other. In addition to the equation in
(3.26), you need to show that
exp(ln(y)) = y (3.27)
for all y ∈ (0, ∞).
Summarizing this discussion, and adding some observations which we
have made elsewhere, we have:
Proposition 3.44. The exponential function exp(x) is a differentiable, increasing
function with domain (−∞, ∞) and range (0, ∞), and the exponential
function is its own derivative, i.e., $\exp'(x) = \exp(x)$.
Exercise 56. Show for all real numbers x and y that:
(1) exp(0) = 1
(2) exp(1) = e
(3) exp(x) exp(y) = exp(x + y)
(4) 1/exp(y) = exp(−y)
(5) exp(x)/exp(y) = exp(x − y).
Hint: Use the results of Exercise 52, the definition of e in Definition 3.40,
and that the exponential and logarithm functions are inverses of each other.
Exercise 57. Show that $\exp(r) = e^r$ for all rational numbers r. Hint: Use
Exercise 54 and that the exponential and logarithm functions are inverses
of each other.
So far, the expression $e^r$ makes sense only if r is a rational number. If r = p/q,
then we raise e to the p-th power and take the q-th root of the result. For
an arbitrary real number x we set
\[ e^x = \exp(x). \tag{3.28} \]
This is consistent with the meaning of the expression for rational exponents
due to Exercise 57, and it defines what we mean by raising e to any real
power.
3.13.1 Other Bases
So far we discussed the natural logarithm function and the exponential func-
tion with base e. We now expand the discussion to other bases.
Definition 3.45. Let a be a positive number, a ≠ 1. Set
\[ \log_a x = \frac{\ln x}{\ln a} \quad\text{and}\quad \exp_a(x) = \exp(x \ln a). \tag{3.29} \]
We call $\log_a(x)$ the logarithm function with base a and $\exp_a(x)$ the exponential
function with base a. For the function $\log_a$ we use the domain (0, ∞)
and range (−∞, ∞). For the exponential function $\exp_a$ we use the domain
(−∞, ∞) and range (0, ∞).
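The two formulas in (3.29) translate directly into code. This is a minimal sketch with illustrative function names; `math.log` and `math.exp` play the roles of ln and exp.

```python
import math

def log_base(a, x):
    """log_a(x) = ln(x) / ln(a) for a > 0, a != 1, x > 0."""
    if a <= 0 or a == 1 or x <= 0:
        raise ValueError("need a > 0, a != 1 and x > 0")
    return math.log(x) / math.log(a)

def exp_base(a, x):
    """exp_a(x) = exp(x * ln(a)) for a > 0, a != 1."""
    if a <= 0 or a == 1:
        raise ValueError("need a > 0 and a != 1")
    return math.exp(x * math.log(a))

print(log_base(2, 8), exp_base(2, 10), exp_base(0.5, 3))
# Composing the two functions should return the starting value, up to rounding.
print(exp_base(3, log_base(3, 7.0)))
```

The condition a ≠ 1 is needed because ln 1 = 0, so the quotient defining $\log_a$ would not make sense for a = 1.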
Exercise 58. Show
(1) ln a > 0 if a > 1 and ln a < 0 if 0 < a < 1.
(2) $\log_a(x)$ and $\exp_a(x)$ are differentiable functions.
(3) $\log_a(x)$ and $\exp_a(x)$ are increasing functions if a > 1.
(4) $\log_a(x)$ and $\exp_a(x)$ are decreasing functions if 0 < a < 1.
Exercise 59. Suppose a > 0 and a ≠ 1. Show that
(a) $\exp_a(\log_a(y)) = y$ for all y > 0.
(b) $\log_a(\exp_a(x)) = x$ for all real numbers x.
Taken together, the specifications for the domains and ranges for the
functions $\exp_a$ and $\log_a$ and the results from Exercise 59 tell us that
Corollary 3.46. Suppose a > 0 and a ≠ 1. The functions $\exp_a$ and $\log_a$
are inverses of each other.
Exercise 60. Suppose a > 0 and a ≠ 1. Show the laws of logarithms:
(a) $\log_a 1 = 0$ and $\log_a a = 1$.
(b) $\log_a(xy) = \log_a x + \log_a y$ for all x, y > 0.
(c) $\log_a(1/y) = -\log_a y$ for all y > 0.
(d) $\log_a(x/y) = \log_a x - \log_a y$ for all x, y > 0.
Exercise 61. Suppose a > 0 and a ≠ 1. Show the exponential laws:
(1) $\exp_a(0) = 1$ and $\exp_a(1) = a$
(2) $\exp_a(x) \exp_a(y) = \exp_a(x + y)$
(3) $1/\exp_a(y) = \exp_a(-y)$
(4) $\exp_a(x)/\exp_a(y) = \exp_a(x - y)$.
Exercise 62. Suppose a > 0, a ≠ 1, and r is a rational number. Show
\[ \log_a(a^r) = r \quad\text{and}\quad \exp_a(r) = a^r. \]
We rephrase a convention which we made previously for e. Suppose
a > 0 and a ≠ 1. The expression $a^r$ makes sense if r is a rational number.
If r = p/q, then we raise a to the p-th power and take the q-th root of the
result. For an arbitrary real number x we set
\[ a^x = \exp_a(x). \tag{3.30} \]
This is consistent with the meaning of the expression for rational exponents
due to Exercise 62, and it defines what we mean by raising a to any real
power. Equation 3.30 specializes to the one in Equation 3.28 if we set a = e.
It is also a standard convention to set
\[ 1^x = 1 \quad\text{and}\quad 0^x = 0, \]
where x is any real number in the first equation and any positive real number
in the second. Typically $0^0$ is set to 1.
We can now state an equation which is typically considered to be one of
the laws of logarithms:
Exercise 63. Suppose a > 0, a ≠ 1, x > 0, and z is any real number. Then
\[ \log_a(x^z) = z \log_a(x). \]
We are now ready to fill in the details for one of the major statements
which we made in Section 1.10. We are ready to prove
Theorem 3.47. Let a be a positive number, a ≠ 1. There exists exactly one
monotonic function, called the exponential function with base a and denoted
by $\exp_a(x)$, which is defined for all real numbers x such that $\exp_a(x) = a^x$
whenever x is a rational number.
Proof. In this section we constructed the function $\exp_a(x)$, and this function
has all of the properties called for in the theorem. That settles the existence
statement. We have to show the uniqueness statement, i.e., there is only
one such function.
Suppose f(x) is any monotonic function and $f(r) = a^r = \exp_a(r)$ for
all rational numbers r. We have to show that $f(x) = \exp_a(x)$ for all real
numbers x. We leave the verification of this assertion to the reader. Here one
uses that f(x) and $\exp_a(x)$ are monotonic, and that $\exp_a(x)$ is continuous.
Chapter 4
Trigonometric Functions
In this chapter we discuss the radian measure of angles and introduce the
trigonometric functions. These are the functions sine, cosine, tangent, etc.
We collect some formulas relating these functions.
Arc Length and Radian Measure of Angles: Consider the unit
circle (a circle with radius 1) centered at the origin in the Cartesian plane.
It is shown in Figure 4.1. We take a practical approach to measuring the
length of an arc on this circle. We imagine that we can straighten it out,
and measure how long it is. It requires some work to introduce the idea of
the length of a curve in a mathematically rigorous fashion.
Definition 4.1. The number π is the ratio between the circumference of a
circle and its diameter.
This definition goes back to the Greeks. Stated differently, it says that
the circumference of a circle of radius r is 2πr. Observe that the ratio
referred to in the definition does not depend on the radius of the circle.
Consider an angle α between the positive x-axis and a ray which originates
at the origin of the coordinate system and intersects the unit circle in
the point p. We would like to find the radian measure of the angle α. Consider an
arc on the unit circle which starts out at the point (1, 0) and ends at p, and
suppose its length is s. Then
α = ±s (radians). (4.1)
The + sign is used if the arc goes counterclockwise around the circle. The
− sign is used if it proceeds clockwise. We may also consider arcs which
wrap around the circle several times before they end at p. In this sense, the
radian measure of the angle α is not unique, but any two radian measures
of the angle differ by an integer multiple of 2π.
Figure 4.1: The unit circle (the marked point on the circle is (cos t, sin t))
Conversely, let t be any real number. We construct the angle with radian
measure t. Starting at the point (1, 0) we travel the distance |t| along the
unit circle (here |t| denotes the absolute value of t). By convention, we travel
counterclockwise if t is positive and clockwise if t is negative. In this way
we reach a point p on the circle. Let α be the angle between the positive
x-axis and the ray which starts at the origin and intersects the unit circle
in p. This angle has radian measure t.
Comparison of Angles in Degrees and Radians: We suppose that
you are familiar with measuring angles in degrees. Half a
revolution (a straight angle) measures π radians, or 180 degrees. So, one
degree corresponds to π/180 ≈ 0.017453293 radians, and one radian corresponds
to 180/π ≈ 57.29577951 degrees. We have the conversion formula
\[ x \text{ degrees} = \frac{\pi}{180}\, x \text{ radians}. \tag{4.2} \]
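The conversion in (4.2) and its inverse take one line of code each. The function names below are illustrative; Python's standard library offers the same conversions as `math.radians` and `math.degrees`.

```python
import math

def degrees_to_radians(x_degrees):
    """Formula (4.2): x degrees correspond to (pi / 180) * x radians."""
    return math.pi / 180 * x_degrees

def radians_to_degrees(t_radians):
    """Inverse conversion: t radians correspond to (180 / pi) * t degrees."""
    return 180 / math.pi * t_radians

print(degrees_to_radians(1), radians_to_degrees(1))
print(degrees_to_radians(180), radians_to_degrees(math.pi / 2))
```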
Trigonometric Functions: Let t be once more a real number. Starting
at the point (1, 0) we travel the distance |t| along the unit circle,
counterclockwise if t is positive and clockwise if t is negative. In this way we reach
a point p = (x(t), y(t)) on the circle, and we set
x(t) = cos t and y(t) = sin t. (4.3)
This defines the functions sin t and cos t. You see the construction implemented
in Figure 4.1. You can find the graphs of the sine and cosine
functions on the interval [0, 2π] in Figures 4.2 and 4.3.
Figure 4.2: f(x) = sin x
Figure 4.3: f(x) = cos x
The other trigonometric functions, tangent (tan), cotangent (cot), secant
(sec), and cosecant (csc) are defined as follows:
\[ \tan x = \frac{\sin x}{\cos x}, \quad \cot x = \frac{\cos x}{\sin x}, \quad \sec x = \frac{1}{\cos x}, \quad \csc x = \frac{1}{\sin x}. \tag{4.4} \]
To give you some idea about the behavior of the tangent
and cotangent functions, we provide two graphs for each of them. They
are drawn over different parts of the domain to show different aspects. See
Figures 4.4 to 4.7. You can see the graphs of the secant and cosecant
functions in Figures 4.8 and 4.9.
A small table with angles given in degrees and radians, as well as the
associated values for the trigonometric functions is given in Table 4.1. If the
functions are not defined at some point, then this is indicated by ‘n/a’. Older
calculus books may still contain tables with the values of the trigonometric
functions, and there are books which were published for the specific purpose
of providing these tables. This is really not necessary anymore because any
scientific calculator gives those values to you with rather good accuracy.
Figure 4.4: tan x on [−π, π]
Figure 4.5: tan x on [−1.1, 1.1]
Figure 4.6: cot x on [−π, π]
Figure 4.7: cot x on [π/2 − 1, π/2 + 1]
Figure 4.8: sec x on [−π, π]
Figure 4.9: csc x on [−π, π]

degrees  radians  sin x   cos x    tan x    cot x    sec x     csc x
0        0        0       1        0        n/a      1         n/a
30       π/6      1/2     √3/2     √3/3     √3       2√3/3     2
45       π/4      √2/2    √2/2     1        1        √2        √2
60       π/3      √3/2    1/2      √3       √3/3     2         2√3/3
90       π/2      1       0        n/a      0        n/a       1
120      2π/3     √3/2    −1/2     −√3      −√3/3    −2        2√3/3
135      3π/4     √2/2    −√2/2    −1       −1       −√2       √2
150      5π/6     1/2     −√3/2    −√3/3    −√3      −2√3/3    2
180      π        0       −1       0        n/a      −1        n/a
Table 4.1: Values of Trigonometric Functions

Trigonometric Functions defined at a right triangle: Occasionally
it is more convenient to use a right triangle to define the trigonometric
functions. To do this we return to Figure 4.1. You see a right triangle with
vertices (0, 0), (x, 0) and (x, y). We may use a circle of any radius r. The
right angle is at the vertex (x, 0) and the hypotenuse has length r. Let α
be the angle at the vertex (0, 0). In the following the words adjacent and
opposite are in relation to α. Then
\[ \sin\alpha = \frac{\text{opposite side}}{\text{hypotenuse}}, \qquad \cos\alpha = \frac{\text{adjacent side}}{\text{hypotenuse}}, \]
\[ \tan\alpha = \frac{\text{opposite side}}{\text{adjacent side}}, \qquad \cot\alpha = \frac{\text{adjacent side}}{\text{opposite side}}, \]
\[ \sec\alpha = \frac{\text{hypotenuse}}{\text{adjacent side}}, \qquad \csc\alpha = \frac{\text{hypotenuse}}{\text{opposite side}}. \]
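The sine and cosine entries of Table 4.1 and the right-triangle ratios can be spot-checked numerically. The following is a sketch with illustrative names; the 3-4-5 triangle is a hypothetical example and does not appear in the text.

```python
import math

# (degrees, sin, cos) in exact form, following Table 4.1.
sqrt2, sqrt3 = math.sqrt(2), math.sqrt(3)
table = [
    (0, 0.0, 1.0),
    (30, 1 / 2, sqrt3 / 2),
    (45, sqrt2 / 2, sqrt2 / 2),
    (60, sqrt3 / 2, 1 / 2),
    (90, 1.0, 0.0),
    (120, sqrt3 / 2, -1 / 2),
    (135, sqrt2 / 2, -sqrt2 / 2),
    (150, 1 / 2, -sqrt3 / 2),
    (180, 0.0, -1.0),
]

def max_table_error():
    """Largest deviation between the tabulated values and math.sin / math.cos."""
    worst = 0.0
    for degrees, sin_value, cos_value in table:
        t = math.radians(degrees)
        worst = max(worst, abs(math.sin(t) - sin_value), abs(math.cos(t) - cos_value))
    return worst

def triangle_ratios(adjacent, opposite):
    """sin, cos and tan of the angle at (0, 0) in a right triangle with the given legs."""
    hypotenuse = math.hypot(adjacent, opposite)
    return opposite / hypotenuse, adjacent / hypotenuse, opposite / adjacent

print(max_table_error())
print(triangle_ratios(3.0, 4.0))
```

The remaining columns of the table follow from these two through the definitions of tan, cot, sec and csc.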
Trigonometric Identities: There are several important identities for
the trigonometric functions. Some of them you should know, others you
should be aware of, so that you can look them up whenever needed. From
the theorem of Pythagoras and the definitions you obtain
\[ \sin^2 x + \cos^2 x = 1, \quad \sec^2 x = 1 + \tan^2 x, \quad \csc^2 x = 1 + \cot^2 x. \tag{4.5} \]
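The second and third identities in (4.5) follow from the first. As a short check, divide $\sin^2 x + \cos^2 x = 1$ by $\cos^2 x$ (where $\cos x \neq 0$) and by $\sin^2 x$ (where $\sin x \neq 0$):

```latex
\[
\frac{\sin^2 x}{\cos^2 x} + 1 = \frac{1}{\cos^2 x}
\;\Longrightarrow\; \tan^2 x + 1 = \sec^2 x ,
\]
\[
1 + \frac{\cos^2 x}{\sin^2 x} = \frac{1}{\sin^2 x}
\;\Longrightarrow\; 1 + \cot^2 x = \csc^2 x .
\]
```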
The following identities are obtained from elementary geometric observations
using the unit circle.
\[ \sin x = \sin(x + 2\pi) = \sin(\pi - x) = -\sin(-x) \]
\[ \cos x = \cos(x + 2\pi) = -\cos(\pi - x) = \cos(-x) \]
\[ \cos x = \sin\left(x + \frac{\pi}{2}\right) = -\cos(x + \pi) = -\sin\left(x + \frac{3\pi}{2}\right) \]
\[ \sin x = -\cos\left(x + \frac{\pi}{2}\right) = -\sin(x + \pi) = \cos\left(x + \frac{3\pi}{2}\right) \]
You should have seen, or even derived, the following addition formulas in
precalculus.
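\[ \sin(\alpha + \beta) = \sin\alpha\cos\beta + \cos\alpha\sin\beta \tag{4.6} \]
\[ \sin(\alpha - \beta) = \sin\alpha\cos\beta - \cos\alpha\sin\beta \tag{4.7} \]
\[ \cos(\alpha + \beta) = \cos\alpha\cos\beta - \sin\alpha\sin\beta \tag{4.8} \]
\[ \cos(\alpha - \beta) = \cos\alpha\cos\beta + \sin\alpha\sin\beta \tag{4.9} \]
\[ \tan(\alpha + \beta) = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\tan\beta} \tag{4.10} \]
\[ \tan(\alpha - \beta) = \frac{\tan\alpha - \tan\beta}{1 + \tan\alpha\tan\beta} \tag{4.11} \]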
These formulas specialize to the double angle formulas
\[ \sin 2\alpha = 2\sin\alpha\cos\alpha \quad\text{and}\quad \cos 2\alpha = \cos^2\alpha - \sin^2\alpha. \tag{4.12} \]
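As a quick check of the specialization, set β = α in the addition formulas (4.6) and (4.8):

```latex
\[
\sin 2\alpha = \sin(\alpha + \alpha)
             = \sin\alpha\cos\alpha + \cos\alpha\sin\alpha
             = 2\sin\alpha\cos\alpha ,
\]
\[
\cos 2\alpha = \cos(\alpha + \alpha)
             = \cos\alpha\cos\alpha - \sin\alpha\sin\alpha
             = \cos^2\alpha - \sin^2\alpha .
\]
```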
From the addition formulas we can also obtain
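\[ \sin\alpha\sin\beta = \frac{1}{2}\left[\cos(\alpha - \beta) - \cos(\alpha + \beta)\right] \tag{4.13} \]
\[ \sin\alpha\cos\beta = \frac{1}{2}\left[\sin(\alpha - \beta) + \sin(\alpha + \beta)\right] \tag{4.14} \]
\[ \cos\alpha\cos\beta = \frac{1}{2}\left[\cos(\alpha - \beta) + \cos(\alpha + \beta)\right] \tag{4.15} \]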
which specialize to the half-angle formulas
\[ \sin^2\alpha = \frac{1}{2}\left[1 - \cos 2\alpha\right] \quad\text{and}\quad \cos^2\alpha = \frac{1}{2}\left[1 + \cos 2\alpha\right]. \tag{4.16} \]
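A short derivation, using only the addition formulas (4.8) and (4.9), shows how these formulas arise. Subtracting (4.8) from (4.9) gives the product formula for $\sin\alpha\sin\beta$, adding them gives the one for $\cos\alpha\cos\beta$, and setting β = α yields the two half-angle formulas:

```latex
\[
\cos(\alpha - \beta) - \cos(\alpha + \beta) = 2\sin\alpha\sin\beta ,
\qquad
\cos(\alpha - \beta) + \cos(\alpha + \beta) = 2\cos\alpha\cos\beta ,
\]
\[
\beta = \alpha : \quad
\sin^2\alpha = \tfrac{1}{2}\left[\cos 0 - \cos 2\alpha\right] = \tfrac{1}{2}\left[1 - \cos 2\alpha\right],
\qquad
\cos^2\alpha = \tfrac{1}{2}\left[\cos 0 + \cos 2\alpha\right] = \tfrac{1}{2}\left[1 + \cos 2\alpha\right].
\]
```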

1. We provide some rules for their computations. and they take values in R. The range of a function is a set in which the function takes values.1 Real Numbers and Functions We assume that the reader is familiar with the real numbers (denoted by R) and the operations of addition and multiplication. This allows us to order the real numbers. These are basic concepts of calculus. We will make frequent use of the absolute value function. x > y) if x − y is positive. then x is larger than y (i.Chapter 1 Basic Concepts Introduction In this chapter we introduce limits and derivatives. the sets on which these functions are defined. A real number is either positive.. negative. and {x ∈ R | |x − a| < } is the set of all real numbers whose distance from 1 . Their domains.e. are subsets of the real numbers. If x and y are real numbers. we will work with real valued functions in one real variable. or zero. Until further notice. such that f (x) = y.  x if x > 0  |x| = −x if x < 0   0 if x = 0 The distance between two points a and b on the real line is |a − b|. The image of a function f consists of all those points y in the range for which there exists an x in the domain of f .

We also say that f (x) approaches or converges to L as x approaches a. Every now and then we will allude to the completeness of the real line. Suppose that f (x) has a limit at x = a.2 CHAPTER 1. but arguments using it are too difficult for an introductory course on the subject. provided that the domain of the function f contains points arbitrarily close to a. Let f be a function and L a real number. and either a belongs to this interval or a is an end point Expressed in mathematical language this means.2. limits are unique if they exist. and ||x| − |y|| ≤ |x − y|. |x + y| ≤ |x| + |y|.2) reads as L is the limit of f (x) as x approaches a. For computations with absolute values it is worth noting that. whenever x is The equation in (1.1) |x · y| = |x| · |y|. 1. based on the values of f (x) for x near a. 1 . such that |f (x) − L| < in the domain of f and 0 < |x − a| < δ.2) L = lim f (x) x→a if for all > 0 there exists a δ > 0. such that 0 < |b − a| < δ. 2 Some authors do not apply the concept of a limit at isolated points of the domain of a function. An intuitive interpretation is that the expected value of f (x) at x = a is L. Expressed as an interval this set is (a − . BASIC CONCEPTS a is less than . Definition 1. In all but a few degenerate cases. This property is crucial for calculus. points for which there are no other arbitrarily close points in the domain of the function. for all δ > 0 there is a point b in the domain of f .1. which means that every bounded subset of the real line has a least upper bound. a + ). Proposition 1.2 Limits Limits are a central tool in calculus and other areas of mathematics. The first inequality is referred to as triangle inequality.1 2 The latter assumption in the proposition is satisfied if the domain of f contains an interval. and the last one is a variation of it. then this limit is unique. We discuss them in this section. We say that (1. for any two real numbers x and x (1.

5. x→0 x lim sin x = 1. then x→a lim f (x) = f (a). we make this kind of an assumption for the remainder of this section. Suppose that x→a lim f (x) = L and x→a lim g(x) = M. Assume that the domains of the functions f (x) and g(x) both contain an interval of the form (d. To avoid intricate language.1. If f (x) is a polynomial. and that c is a constant. and h(x) all contain an interval of the form (d. at least if n is an integer. and it is equal to L. or a trigonometric function and f (a) is defined. Taking limits is compatible with the basic algebraic operations in the following sense. a rational function.3. a) or (a. As a special case we obtain the following useful observation: (1. (1. If x→a lim f (x) = L = lim g(x).4 (Pinching Theorem). x→a x − a Hints: The first two limits follow easily from the estimates in Theorem 1. Proposition 1. x→0 x lim and lim xn − an = nan−1 . e) where d < a < e. The last assertion can be proved using synthetic division. Proposition 1. Assume that the domains of the functions f (x).2. LIMITS 3 of it. . Proposition 1.3) x→c lim f (x) = L if and only if x→c lim (f (x) − L) = 0. For many functions the computation of limits is no challenge. e) where d < a < e and that f (x) ≤ h(x) ≤ g(x). The following limits are important in the calculations of some derivatives. a) or (a.4) cos x − 1 = 0. discussed in the following subsection. g(x).7. Then x→a lim (f + g)(x) = M + L x→a x→a lim cf (x) = cM lim (f · g)(x) = M · L x→a lim (f /g)(x) = M/L provided that L = 0. x→a then the limit of h(x) exists as x approaches a.

1.2.1 Two important estimates

In preparation of the proof of Theorem 1.7 we show

Theorem 1.6. If $h \in [-\pi/4, \pi/4]$, then

(1.5)  $|\sin h| \le |h| \le |\tan h|$.

Proof. In Figure 1.1 you see part of the unit circle. For $h \in [-\pi/4, \pi/4]$ we set $C = (\cos h, \sin h)$. Given two points $X$ and $Y$ in the plane, the distance between them is denoted by $XY$. We denote by $\widehat{BC}$ the length of the arc (part of the unit circle) between $B$ and $C$.

[Figure 1.1: The unit circle, with the points O, A, B, C, D and E]

We find that $|\sin h| = AC \le |h| = \widehat{BC}$ because going from $C$ straight down to the $x$-axis is shorter than following the circle from $C$ to the $x$-axis.

Secondly, to show that $|h| = \widehat{BC} \le |\tan h| = BD$, imagine that you roll the circle along the vertical line through $B$ until the point $C$ touches it in the point $E$. We use the process of rolling the circle along the line to measure $|h|$. In particular, $|h| = BE$. It appears to be clear (see footnote 3) that $BE \le BD$. This verifies that $|h| \le \tan |h|$, the second inequality which we claimed in the theorem.

Footnote 3: Here our argument relies on intuition. A rigorous argument requires work. One can show that the area of a disk with radius one is $\pi$. From this it follows by elementary geometry that the slice of the disk with vertices $O$, $B$ and $C$ has area $|h|/2$. This slice is contained in the triangle with vertices $O$, $B$ and $D$, and the area of the triangle is $(\tan |h|)/2$. It follows that $|h| \le \tan |h|$.

Theorem 1.7. If $h \in [-\pi/4, \pi/4]$, then (see footnote 4)

(1.6)  $|1 - \cos h| \le \frac{h^2}{2}$  and  $|h - \sin h| \le \frac{h^2}{2}$.

Footnote 4: The inequalities hold without the restriction on $h$, but we only need them on an interval around zero. Restricting ourselves to this interval simplifies the proofs somewhat.

[Figure 1.2: The unit circle, with the points A, B, C and D]

Proof of Theorem 1.7. Let $h \in [-\pi/4, \pi/4]$ be the number for which we want to show the inequality and $C = (\cos h, \sin h)$. In Figure 1.2 you see half of a circle of radius 1 centered at the origin, and a triangle with vertices $A$, $B$, and $C$. Denote by $XY$ the length of the straight line segment between the points $X$ and $Y$. Let $\widehat{BC}$ be the length of the arc (part of the unit circle) between $B$ and $C$.

From the picture we read off that $AB = 2$, $DB = 1 - \cos h$, $\widehat{BC} = |h|$, and $BC \le \widehat{BC}$. Using similar triangles we see $AB/BC = BC/DB$ and $(BC)^2 = AB \times DB$. In other words

$2(1 - \cos h) = AB \times DB = (BC)^2 \le (\widehat{BC})^2 = h^2$.

The first estimate in (1.6) is an immediate consequence.

If $h = 0$, then both sides of the second inequality in (1.6) are zero, verifying the assertion in this case. If $0 \ne h \in [-\pi/4, \pi/4]$, then Theorem 1.6 tells us that

$|\sin h| \le |h| \le |\tan h| = \frac{|\sin h|}{\cos h}$,  hence  $0 \le \cos h \le \frac{\sin h}{h} \le 1$.

Subtracting the terms in this inequality from 1 we find

$0 \le 1 - \frac{\sin h}{h} \le 1 - \cos h \le 1$.

Using our previous estimate for $|1 - \cos h|$ and our assumption that $|h| \le \pi/4 < 1$, we conclude that

$\left| \frac{h - \sin h}{h} \right| \le |1 - \cos h| \le \frac{h^2}{2} \le \frac{|h|}{2}$.

Multiplying the outer terms by $|h|$ gives $|h - \sin h| \le h^2/2$, so the second estimate claimed in the theorem is an immediate consequence.

1.3 More Limits

The material in the previous, first section about limits suffices for a while. In some situations one would like to modify the definition in Section 1.2, and we do so in this section. The first two limits express how the function behaves as we approach a point $a$ from the right or left. They are called the right and left hand limits. The next two limits express what happens as the variable tends to plus or minus infinity. We call them limits at infinity. The last two limits allow us to express that the values of a function tend to plus or minus infinity. We call them infinite limits.

Definition 1.8. Let $f$ be a function and $L$ a real number. We say that

$L = \lim_{x \to a^+} f(x)$

if for all $\varepsilon > 0$ there exists a $\delta > 0$, such that $|f(x) - L| < \varepsilon$ whenever $x$ is in the domain of $f$ and $a < x < a + \delta$.

Definition 1.9. Let $f$ be a function and $L$ a real number. We say that

$L = \lim_{x \to a^-} f(x)$

if for all $\varepsilon > 0$ there exists a $\delta > 0$, such that $|f(x) - L| < \varepsilon$ whenever $x$ is in the domain of $f$ and $a - \delta < x < a$.

For example, if $f(x) = \operatorname{sign}(x) = x/|x|$, then

$\lim_{x \to 0^+} f(x) = 1$  and  $\lim_{x \to 0^-} f(x) = -1$.

We can consider what happens to the values of a function $f(x)$ as $x$ approaches $\infty$ or $-\infty$.

Definition 1.10. Let $f$ be a function and $L$ a real number. We say that

$L = \lim_{x \to \infty} f(x)$

if for all $\varepsilon > 0$ there exists a number $M$, such that $|f(x) - L| < \varepsilon$ whenever $x$ is in the domain of $f$ and $x > M$.

Definition 1.11. Let $f$ be a function and $L$ a real number. We say that

$L = \lim_{x \to -\infty} f(x)$

if for all $\varepsilon > 0$ there exists a number $M$, such that $|f(x) - L| < \varepsilon$ whenever $x$ is in the domain of $f$ and $x < M$.

For example

$\lim_{x \to \infty} \frac{1}{x} = 0$  and  $\lim_{x \to -\infty} \frac{1}{1 + x^2} = 0$.

Definition 1.12. Let $f$ be a function and $a$ a real number. We say that

$\lim_{x \to a} f(x) = \infty$

if for all $M$ there exists a $\delta > 0$ such that $f(x) > M$ whenever $x$ is in the domain of $f$ and $0 < |a - x| < \delta$.

In other words, by taking $x$ close to $a$, we can make sure that the value of $f(x)$ is larger than any given number $M$, no matter how large.

Definition 1.13. Let $f$ be a function and $a$ a real number. We say that

$\lim_{x \to a} f(x) = -\infty$

if for all $M$ there exists a $\delta > 0$ such that $f(x) < M$ whenever $x$ is in the domain of $f$ and $0 < |a - x| < \delta$.

In the last two definitions $a$ may be replaced by $a^{\pm}$, so that we approach $a$ from the left or right, and $a$ can be replaced by $\pm\infty$. For example

$\lim_{x \to 0^+} \frac{1}{x} = \infty$  and  $\lim_{x \to \infty} \sqrt{x} = \infty$.
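To see how the definitions of this section are used in practice, here is a short worked verification for the function $f(x) = 1/x$, which appears in the examples above. The particular choices of $M$ and $\delta$ below are one convenient option among many and are not taken from the text.

```latex
% Limit at infinity (Definition 1.10): f(x) = 1/x has limit L = 0.
Given $\varepsilon > 0$, choose $M = 1/\varepsilon$. If $x > M$, then $x > 0$ and
\[
  \left| \frac{1}{x} - 0 \right| = \frac{1}{x} < \frac{1}{M} = \varepsilon .
\]

% Infinite limit (Definition 1.12 with a replaced by 0^+): 1/x tends to infinity.
Given a number $M$, choose $\delta = 1/\max\{M, 1\}$. If $0 < x < \delta$, then
\[
  \frac{1}{x} > \frac{1}{\delta} = \max\{M, 1\} \ge M .
\]
```

Taking the maximum with 1 in the second argument only serves to keep $\delta$ positive when $M \le 0$.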

1.4 Continuous Functions

We define continuous functions and discuss a few of their basic properties. The class of continuous functions will play a central role later.

Definition 1.14. Let $f$ be a function and $c$ a point in its domain. The function is said to be continuous at $c$ if for all $\varepsilon > 0$ there exists a $\delta > 0$, such that $|f(c) - f(x)| < \varepsilon$ whenever $x$ belongs to the domain of $f$ and $|x - c| < \delta$. A function $f$ is continuous if it is continuous at all points in its domain.

In most cases the condition in Definition 1.14 says that

(1.7)  $\lim_{x \to c} f(x) = f(c)$.

In fact, this equation holds whenever there are points in the domain of $f$ arbitrarily close to $c$. See the footnote to Proposition 1.2. If $c$ is an isolated point in the domain of $f$, i.e., there are no other points in the domain of $f$ arbitrarily close to $c$, then the function is always continuous at $c$.

Polynomials, rational functions, and trigonometric functions are continuous. One can produce many more continuous functions through standard operations on functions.

Proposition 1.15. Let $f$ and $g$ be continuous functions. Then $f + g$, $f \cdot g$, $f/g$ and $f \circ g$ are continuous, wherever these functions are defined.

To clarify the remark about the domain in the proposition, we note that the function $(f + g)(x) = f(x) + g(x)$ is defined for those $x$ for which both $f$ and $g$ are defined. The same statement holds for $(f \cdot g)(x) = f(x) \cdot g(x)$. To determine the domain of $f/g$ one needs to exclude those points where $g$ is zero. For the composition $(f \circ g)(x) = f(g(x))$ one needs that $g$ takes values in the domain of $f$.

One may also reverse the order of applying a continuous function and calculating a limit:

(1.8)  $\lim_{x \to c} f(g(x)) = f\left( \lim_{x \to c} g(x) \right)$,

provided the natural technical assumptions hold, i.e., $g$ is defined at points arbitrarily close to $c$, $f$ is defined for all $g(x)$ where $x$ is in the domain of $g$ and close to $c$, and $f$ is continuous at $\lim_{x \to c} g(x)$.

Theorem 1.16 (Intermediate Value Theorem). Suppose that $f$ is defined and continuous on the closed interval $[a, b]$. If $C$ is in between $f(a)$ and $f(b)$, then there exists a $c \in [a, b]$, such that $f(c) = C$.
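The Intermediate Value Theorem only asserts that a suitable $c$ exists. A common way to approximate such a point is bisection, sketched below for the case $C = 0$ (the general case reduces to it by applying the routine to $f(x) - C$). This is an illustration under stated assumptions, not a method from the text: the function name `bisect_root`, the tolerance, and the sample function $\cos x - x$ on $[0, 1]$ are all hypothetical choices.

```python
import math


def bisect_root(f, a, b, tol=1e-10, max_iter=200):
    """Approximate a point c in [a, b] with f(c) = 0.

    Assumes f is continuous on [a, b] and that f(a) and f(b) have
    opposite signs, so that 0 lies in between f(a) and f(b).
    """
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fm == 0.0 or (b - a) < tol:
            return mid
        # Keep the half-interval on which the sign change persists.
        if fa * fm < 0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)


# Sample use: cos(0) - 0 > 0 and cos(1) - 1 < 0, so a root lies in (0, 1).
root = bisect_root(lambda x: math.cos(x) - x, 0.0, 1.0)
print(root)
```

Each step halves the interval while preserving the sign change, so the theorem guarantees that a root remains inside the interval at every stage.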

E.g., suppose that $p(x) = x^3 - x^2 + 2x - 1$. The polynomial is certainly a continuous function, $p(0) = -1$ and $p(1) = 1$. According to the theorem there exists some $c \in (0, 1)$, such that $p(c) = 0$.

Theorem 1.17 (Extreme Value Theorem). Let $f$ be defined and continuous on the closed interval $[a, b]$. Then there exist points $c$ and $d$ in $[a, b]$, such that $f(c) \le f(x) \le f(d)$ for all $x \in [a, b]$.

Expressed in words, the theorem says that a continuous function on a closed interval assumes a smallest and largest value.

The Intermediate Value and Extreme Value theorems are typically proved in an introductory analysis course. They are equivalent to the completeness of the real line. We mentioned this property of the real numbers in Section 1.1.

1.5 Lines

In general, a line consists of the points $(x, y)$ in the plane which satisfy the equation

(1.9)  $ax + by = c$

for some given real numbers $a$, $b$ and $c$, where it is assumed that $a$ and $b$ are not both zero. The line is vertical if and only if $b = 0$. If $b \ne 0$ we may rewrite the equation as

(1.10)  $y = -\frac{a}{b} x + \frac{c}{b} = mx + B$.

The number $m$ is called the slope of the line, and $B$ is the point in which the line intersects the $y$-axis, also called the $y$-intercept.

Given any two points $(x_1, y_1)$ and $(x_2, y_2)$ in the plane with $x_1 \ne x_2$, the line through them has slope

$m = \frac{y_2 - y_1}{x_2 - x_1}$.

For our purposes, the most useful version of the equation of a line is its point-slope formula. The equation of a line with slope $m$ through the point $(x_1, y_1)$ is

(1.11)  $y = m(x - x_1) + y_1$.
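The slope formula and the point-slope formula (1.11) translate directly into code. The sketch below is only an illustration; the function names `slope` and `point_slope` and the sample points are assumptions made for this example.

```python
def slope(p1, p2):
    """Slope of the line through two points; vertical lines have none."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    return (y2 - y1) / (x2 - x1)


def point_slope(m, p1):
    """Return the function x -> m * (x - x1) + y1 from formula (1.11)."""
    x1, y1 = p1
    return lambda x: m * (x - x1) + y1


m = slope((1.0, 2.0), (3.0, 8.0))   # (8 - 2) / (3 - 1)
line = point_slope(m, (1.0, 2.0))
print(m, line(0.0), line(3.0))
```

Evaluating the returned function at $x = 0$ gives the $y$-intercept $B$ of (1.10), which ties the two forms of the equation together.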

1.6 Tangent Lines and the Derivative

We like to introduce the concept of tangent lines. To be able to express ourselves concisely, let us say

Definition 1.18. A point $c$ is an interior point of a subset $B$ of $\mathbb{R}$ if there is an open interval $I$, such that $c \in I \subseteq B$.

We give a first definition for a tangent line.

Definition 1.19. Suppose $f(x)$ is a function and $c$ is an interior point of its domain. We call a line $t(x)$ the tangent line to the graph of $f(x)$ at $x = c$ if $t(x)$ is the best linear approximation of $f(x)$ on some open interval around $c$, i.e., the line $t(x)$ is closer to the graph of $f(x)$ than any other line for all $x$ in some open interval around $c$.

Although the term 'best linear approximation near $c$' gives an excellent intuitive picture of what a tangent line is, this definition is hard to work with. It is easier to work with a more concrete definition.

Definition 1.20. Suppose $f(x)$ is a function and $c$ is an interior point of its domain. We call a line $t(x)$ the tangent line to the graph of $f(x)$ at $x = c$ if

(1.12)  $\lim_{x \to c} \frac{f(x) - t(x)}{x - c} = 0$.

The equation in (1.12) expresses in a precise form in which sense the tangent line is close to the graph of $f(x)$ near $c$. Not only does $f(x) - t(x)$ converge to zero as $x$ approaches $c$, it does so even when divided by $x - c$.

For a given function and an interior point $c$ in its domain there may or may not be a tangent line, but if there is a tangent line, then it is unique. We use tangent lines to define the concept of differentiability and the derivative.

Definition 1.21. Suppose $f(x)$ is a function and $c$ is an interior point of its domain, and assume that there is a tangent line to the graph of $f(x)$ at $x = c$. Then we say that $f(x)$ is differentiable at $c$. We call the slope of the tangent line the derivative of $f(x)$ at $c$, and we denote it by $f'(c)$.

Utilizing the notation in the previous definition we can write down the equation of the tangent line to the graph of $f(x)$ at $x = c$ in point-slope form:

(1.13)  $t(x) = f'(c)(x - c) + f(c)$.

To differentiate a function means to find its derivative.

Definition 1.22. Suppose the domain of the function $f(x)$ is an open set. By definition, an open set is a set, such that each of its points is an interior point. Then we say that $f(x)$ is differentiable if it is differentiable at each point of its domain.

We consider $f'(x)$ as a function, whose domain consists of all those points where $f(x)$ is differentiable.

Example 1.23. Let $p(x) = 2x^4 - 3x^2 + 5$. Find the tangent line $t(x)$ to the graph of $p(x)$ at $x = -2$ and $p'(-2)$.

Solution: As a first step we expand $p$ in powers of $u = (x + 2)$. To do so, we substitute $u - 2$ for $x$ and expand $p$ in powers of $u$. You are expected to fill in some of the arithmetic steps.

$p = 2(u - 2)^4 - 3(u - 2)^2 + 5$
$\;\;= 2(u^4 - 8u^3 + 24u^2 - 32u + 16) - 3(u^2 - 4u + 4) + 5$
$\;\;= 2u^4 - 16u^3 + 45u^2 - 52u + 25$

Reversing the substitution, replacing $u$ by $(x + 2)$, we find:

$p(x) = 2(x + 2)^4 - 16(x + 2)^3 + 45(x + 2)^2 - 52(x + 2) + 25$.

We assert that $t(x) = -52(x + 2) + 25$ and $p'(-2) = -52$. For $t(x)$ as proposed, we see that

$\left| \frac{p(x) - t(x)}{x - c} \right| = \left| \frac{p(x) - t(x)}{x + 2} \right| = |2(x + 2)^3 - 16(x + 2)^2 + 45(x + 2)| \le 65\,|x + 2|$  (provided $|x + 2| \le 1$).

This estimate shows that $(p(x) - t(x))/(x - c)$ converges to zero as $x$ approaches $c = -2$. By definition, this means that $t(x)$ is the desired tangent line. Its slope is $p'(-2) = -52$. ♦

The example is generic. We can use any polynomial $p(x)$ and point $x = c$ and write $p(x)$ in powers of $(x - c)$. Say, the result is

$p(x) = A_n (x - c)^n + \cdots + A_1 (x - c) + A_0$.

The technique used in the example, suitably generalized, shows that $t(x) = A_1 (x - c) + A_0$

is the tangent line to the graph of $p(x)$ at $x = c$, and $p'(c) = A_1$. Eventually we will find a more efficient method for differentiating polynomials, but we have shown that

Proposition 1.24. Polynomials are differentiable.

1.6.1 Derivatives without Limits

Without a doubt, the definition of a limit is the most difficult one in a first semester of calculus, and it is interesting to explore ways to develop calculus, rigorously, without the limit concept. One can do this by replacing the condition in (1.12) by a slightly stronger one.

Definition 1.25. Suppose $f(x)$ is a function and $c$ is an interior point of its domain. We call a line $t(x)$ the tangent line to the graph of $f(x)$ at $x = c$ if there exists an open interval $I$ around $c$ and a number $A$, such that

(1.14)  $|f(x) - t(x)| \le A(x - c)^2$  for all $x \in I$.

With this definition fewer functions will be differentiable than with the one given in Definition 1.20, but this is not crucial. A pedagogical advantage of the approach is that one does not have to understand limits before one can understand the definitions of differentiability, tangent line, and derivative. The condition in (1.14) is also more accessible to computer assisted algebra than the limit definition. In terms of algebraic geometry (1.14) at least alludes to a divisibility condition.

There is also a geometric picture which illustrates the concept of closeness. The inequality in (1.14) can be rewritten as

$q(x) = t(x) - A(x - c)^2 \le f(x) \le t(x) + A(x - c)^2 = p(x)$,

where the parabolas $q(x)$ and $p(x)$ are defined by the expressions they are adjacent to. In this sense, the graphs of $f(x)$ and $t(x)$ have to be close to each other near $x = c$.

In an example, this situation is shown in Figure 1.3. There you see the function $f(x) = \sin x$, the parabola $p(x)$ (dotted and open upwards), the parabola $q(x)$ (dotted and open downwards), and the tangent line $t(x)$ (dashed). All four functions $f(x)$, $p(x)$, $q(x)$, and $t(x)$ have the same value at $x = c$. The parabolas $p(x)$ and $q(x)$ touch each other without crossing, and $f(x)$ and $t(x)$ are squeezed in between them. There is very little space left between $p(x)$ and $q(x)$, and the picture shows how they 'hug' each other.
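The expansion of a polynomial in powers of $(x - c)$ that Example 1.23 carries out by hand can be automated with repeated synthetic division. The following is a minimal sketch of that standard technique, offered as an illustration and not as the author's procedure; the function name `expand_about` and the coefficient ordering are assumptions made for this example.

```python
def expand_about(coeffs, c):
    """Rewrite a polynomial in powers of (x - c).

    coeffs lists the coefficients from the highest power down, e.g.
    [2, 0, -3, 0, 5] for 2x^4 - 3x^2 + 5.  The result lists
    A_n, ..., A_1, A_0 with
    p(x) = A_n (x - c)^n + ... + A_1 (x - c) + A_0.
    """
    work = list(coeffs)
    n = len(work)
    # Repeated synthetic division by (x - c): each pass leaves the next
    # remainder A_k at the end of the shrinking active range.
    for last in range(n - 1, 0, -1):
        for i in range(1, last + 1):
            work[i] += c * work[i - 1]
    return work


# The polynomial and point of Example 1.23.
shifted = expand_about([2, 0, -3, 0, 5], -2)
slope_at_c, value_at_c = shifted[-2], shifted[-1]
print(shifted, slope_at_c, value_at_c)
```

The last two entries of the result are $A_1$ and $A_0$, so they give the slope and the value of the tangent line $t(x) = A_1 (x - c) + A_0$ directly.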

[Figure 1.3: Sine Function and Tangent Line between two Parabolas]

1.7 Secant Lines and the Derivative

Often a different approach is taken to motivate and introduce the derivative.

Theorem 1.26. Suppose $f$ is a function and $c$ is an interior point of its domain. If $f$ is differentiable at $c$, then

$f'(c) = \lim_{x \to c} \frac{f(c) - f(x)}{c - x}$.

Proof. This is obvious once one uses the expression for the tangent line in (1.13) and substitutes it in the expression in (1.12) inside the limit:

(1.15)  $\frac{f(x) - t(x)}{x - c} = \frac{f(x) - f(c)}{x - c} - f'(c)$.

Apply limits to both sides of the equation and the assertion follows.

Let us explain the situation geometrically. Suppose $a$ and $b$ are distinct points in the domain of the function $f$. The line through $(a, f(a))$ and $(b, f(b))$ is called a secant line, and its slope $(f(a) - f(b))/(a - b)$ is called

the average rate of change of $f$ over the interval $[a, b]$.

In (1.15) we are considering the slopes of secant lines through $(c, f(c))$ and $(x, f(x))$, and then we take the limit as $x$ approaches $c$. The theorem asserts that for a differentiable function this limit of the slopes of secant lines is the slope of the tangent line. For the obvious reason $f'(c)$ is called the rate of change or instantaneous rate of change of $f$ at $c$.

Many authors introduce the derivative as the limit of the slopes of secant lines, call $t(x) = f'(c)(x - c) + f(c)$ the tangent line, and possibly illustrate that the tangent line is close to the graph in the sense of Definition 1.20.

It is misleading to say that the graph of $f(x)$ looks like, or resembles, a line near $c$. Eventually you will be able to show that the function

$f(x) = x^2 \sin(1/x)$ if $x \ne 0$,  and  $f(x) = 0$ if $x = 0$,

is differentiable everywhere on the real line. You see part of its graph over two different intervals in Figures 1.4 and 1.5. By no stretch of imagination will you say that the graph of the function looks like a line.

[Figure 1.4: $f(x) = x^2 \sin(1/x)$]
[Figure 1.5: $f(x) = x^2 \sin(1/x)$]

1.8 Differentiability implies Continuity

It is worth pointing out that

Theorem 1.27. If a function is differentiable at a point, then it is continuous at this point.

Proof. Denote the function by $f(x)$ and the point of differentiability by $c$. By assumption we have the derivative $f'(c)$ and

$\lim_{x \to c} \left[ \frac{f(x) - f(c)}{x - c} - f'(c) \right] = 0$.

Then certainly

$\lim_{x \to c} \left[ (f(x) - f(c)) - f'(c)(x - c) \right] = 0$.

Because $f'(c)(x - c)$ converges to zero as $x$ approaches $c$, so does $(f(x) - f(c))$. This implies that $\lim_{x \to c} f(x) = f(c)$ and that $f(x)$ is continuous at $c$.

The converse of the theorem is false. There are continuous functions which are not differentiable. E.g., the function $f(x) = |x|$ is continuous, but it is not differentiable at $x = 0$. It is apparent from the graph (see Figure 1.6) that there is no line close to the graph of this function near $x = 0$.

[Figure 1.6: The absolute value function]

We can also give an analytic argument. According to the definition of differentiability, we have to study the difference quotients $(|x| - |0|)/(x - 0) = |x|/x$. They are $1$ if $x > 0$ and $-1$ if $x < 0$. There is no number these difference quotients converge to, and $f(x) = |x|$ is not differentiable at $x = 0$.

1.9 Basic Examples of Derivatives

Let us use the definitions and work out a few derivatives.

Example 1.28. If $f(x) = x^n$ and $n$ is a non-negative integer, i.e., $n = 0, 1, 2, \ldots$, then $f'(x) = n x^{n-1}$.

Proof. Suppose that $n \ge 2$. Then

$\lim_{x \to c} \frac{x^n - c^n}{x - c} = \lim_{x \to c} \left( x^{n-1} + x^{n-2} c + \cdots + x c^{n-2} + c^{n-1} \right) = n c^{n-1}$.

The cases $n = 0$ and $n = 1$ are even easier and left to the reader.

Example 1.29. If $f(x) = 1/x$, then $f'(x) = -1/x^2$.

Proof. Suppose $c \ne 0$. Then

$\lim_{x \to c} \frac{\frac{1}{x} - \frac{1}{c}}{x - c} = \lim_{x \to c} \frac{c - x}{xc(x - c)} = -\frac{1}{c^2}$.

Example 1.30. If $f(x) = \sqrt{x}$ and $x > 0$, then $f'(x) = 1/(2\sqrt{x})$.

Proof.

$\lim_{x \to c} \frac{\sqrt{x} - \sqrt{c}}{x - c} = \lim_{x \to c} \frac{x - c}{(x - c)(\sqrt{x} + \sqrt{c})} = \frac{1}{2\sqrt{c}}$.

Remark 1. Eventually we will see that if $f(x) = x^a$ for any real number $a$, then $f'(x) = a x^{a-1}$, generalizing all of the examples above.

Exercise 1. Suppose that $f(x) = \sqrt{ax + b}$ and $ax + b > 0$. Show that

$f'(x) = \frac{a}{2\sqrt{ax + b}}$.

The tangent line to the graph of $f(x)$ at $x = c$ is then

$t(x) = \frac{a}{2\sqrt{ac + b}} (x - c) + \sqrt{ac + b}$.

Verify that

(1.16)  $|f(x) - t(x)| \le \frac{a^2}{2\left(\sqrt{ac + b}\right)^3} (x - c)^2$.

The estimate in (1.16) shows differentiability in the sense of Definition 1.25, and provides an explicit error estimate, i.e., a bound on the difference between the function and its tangent line.

Example 1.31. Show that $\sin' x = \cos x$. For this equation to hold, the angle $x$ needs to be measured in radians.

Proof. Below we will set $x = c + h$ and $x - c = h$.

$\lim_{x \to c} \frac{\sin x - \sin c}{x - c} = \lim_{h \to 0} \frac{\sin(c + h) - \sin c}{h}$
$\;\;= \lim_{h \to 0} \frac{\sin c \cos h + \cos c \sin h - \sin c}{h}$
$\;\;= \lim_{h \to 0} \frac{\sin c\,(\cos h - 1) + \cos c \sin h}{h}$
$\;\;= \sin c \cdot \lim_{h \to 0} \frac{\cos h - 1}{h} + \cos c \cdot \lim_{h \to 0} \frac{\sin h}{h}$
$\;\;= \cos c$.

For computation of the limits in the second to last line see (1.4).

The tangent line to the graph of the sine function at $x = c$ is $t(x) = \cos c\,(x - c) + \sin c$. It is left as an exercise for the reader to show that

(1.17)  $|\sin x - t(x)| \le (x - c)^2$.

The steps are essentially the same as in the proof above. The estimate in (1.17) does not only show differentiability in the sense of Definition 1.25, but it provides an explicit error estimate, a bound on the difference between the function and its tangent line.

Exercise 2. If $f(x) = \cos x$, then $f'(x) = -\sin x$. Furthermore, if $t(x) = -\sin c\,(x - c) + \cos c$ is the tangent line to the graph of $f(x)$ at $x = c$, then $|f(x) - t(x)| \le (x - c)^2$. The details are similar to the ones in Example 1.31.
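The error estimate (1.17) lends itself to a quick numerical sanity check. The sketch below samples the ratio $|\sin x - t(x)| / (x - c)^2$ near several base points; it does not replace the exercise. The helper names, the sampling radius, and the chosen base points are illustrative assumptions. Taylor's theorem, which is not used in the text at this point, gives the sharper constant $1/2$, so the sampled ratios should stay below $1/2$ and hence below the constant 1 appearing in (1.17).

```python
import math


def tangent_gap(c, d):
    """|sin(c + d) - t(c + d)| for the tangent line t at x = c."""
    t = math.cos(c) * d + math.sin(c)
    return abs(math.sin(c + d) - t)


def worst_ratio(c, radius=3.0, samples=2001):
    """Largest sampled value of |sin x - t(x)| / (x - c)^2 with x = c + d."""
    worst = 0.0
    for k in range(samples):
        d = -radius + 2.0 * radius * k / (samples - 1)
        if d == 0.0:
            continue  # the ratio is not defined at x = c itself
        worst = max(worst, tangent_gap(c, d) / d ** 2)
    return worst


ratios = [worst_ratio(c) for c in (0.0, 0.5, 1.0, math.pi / 2, 2.0)]
print(ratios)
```

A sampled check of this kind can expose a wrong constant, but only the algebraic argument of the exercise establishes the bound for all $x$.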

1.10 The Exponential and Logarithm Functions

The exponential and logarithm are of great importance and we do not want to delay their introduction any further. Still, technically we are not quite prepared for it and at a later point we have to revisit the introduction to fill in details.

Let $a$ be a positive number, $a \ne 1$. For any rational number $r = p/q$ ($p$ and $q$ are integers) one can define $a^r = \sqrt[q]{a^p}$. First we take a $p$-th power and then a $q$-th root. In this sense we have a function $h(r) = a^r$, whose domain consists of all rational numbers. This function is monotonic. More precisely, $h(r)$ is increasing if $a > 1$ and decreasing when $0 < a < 1$.

Theorem-Definition 1.32. Suppose $a$ is a positive real number and $a \ne 1$. There exists exactly one monotonic function, called the exponential function with base $a$ and denoted by $\exp_a(x)$, which is defined for all real numbers $x$ such that $\exp_a(x) = a^x$ whenever $x$ is a rational number, so that the domain of the exponential function is $(-\infty, \infty)$. Furthermore, $a^x > 0$ for all $x$, so we use $(0, \infty)$ as the range of the exponential function $\exp_a(x)$.

It is common, and we will follow this convention, to use the notation $a^x$ for $\exp_a(x)$ also if $x$ is not rational.

The arithmetic properties of the exponential function, also called the exponential laws, are collected in our next theorem. The theorem just says that the exponential laws, which you previously learned for rational exponents, also hold in the generality of our current discussion.

Theorem 1.33 (Exponential Laws). For any positive real number $a$ and all real numbers $x$ and $y$

$a^x a^y = a^{x+y}$
$a^x / a^y = a^{x-y}$
$(a^x)^y = a^{xy}$

For every number $y > 0$ there exists exactly one number $x$, such that $\exp_a(x) = y$. If $x$ is the unique solution of the equation $a^x = y$, then we set

(1.18)  $\log_a(y) = x$.

We just defined a function $\log_a(y)$. It is called the logarithm function with base $a$, and by construction it is the inverse of the exponential function $\exp_a(x)$. More explicitly, $a^{\log_a y} = y$ and $\log_a(a^x) = x$

for all $x \in \mathbb{R}$ and all $y > 0$. The domain of the logarithm function is $(0, \infty)$ and its range is $(-\infty, \infty)$. It is increasing if $a > 1$ and decreasing if $0 < a < 1$.

Corresponding to the exponential laws in Theorem 1.33 we have the laws of logarithms. One set of laws implies the other one, and vice versa.

Theorem 1.34 (Laws of Logarithms). For any positive real number $a \ne 1$, for all positive real numbers $x$ and $y$, and any real number $z$

$\log_a(xy) = \log_a(x) + \log_a(y)$
$\log_a(x/y) = \log_a(x) - \log_a(y)$
$\log_a(x^z) = z \log_a(x)$

In Figures 1.7 and 1.8 you see parts of the graphs of the exponential and logarithm functions with base 2.

[Figure 1.7: $\exp_2(x)$]
[Figure 1.8: $\log_2(x)$]

The Euler number e as base

There is one number which is preferable as base over the others. This irrational number is called the Euler number (named after Leonhard Euler) and denoted by $e$. We will define it precisely later, and $e \approx 2.718281828$.

Definition 1.35. The exponential function is the exponential function for the base $e$. It is denoted by $\exp(x)$ or $e^x$. Its inverse is the natural logarithm function. It is denoted by $\ln(x)$. So $\exp(x) = \exp_e(x)$ and $\ln(x) = \log_e(x)$.
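The inverse relationship between $\exp_a$ and $\log_a$ and the laws of logarithms in Theorem 1.34 can be spot-checked in floating point arithmetic. The sketch below does so for one set of sample values; the values of `a`, `x`, `y`, `z` and the tolerance are arbitrary illustrative choices, and agreement at sample points is of course no proof.

```python
import math

a, x, y, z = 2.0, 3.0, 5.0, 1.7   # illustrative values only

# Each entry pairs a left-hand side with the right-hand side it should equal.
checks = {
    "a**log_a(y) = y": (a ** math.log(y, a), y),
    "log_a(a**x) = x": (math.log(a ** x, a), x),
    "log_a(xy)": (math.log(x * y, a), math.log(x, a) + math.log(y, a)),
    "log_a(x/y)": (math.log(x / y, a), math.log(x, a) - math.log(y, a)),
    "log_a(x**z)": (math.log(x ** z, a), z * math.log(x, a)),
}

results = {
    name: math.isclose(lhs, rhs, rel_tol=1e-12)
    for name, (lhs, rhs) in checks.items()
}
print(results)
```

The comparison uses a relative tolerance rather than exact equality because the two sides are computed along different routes and may differ in the last few bits.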

Eventually we will see

(1.19)  $\exp'(x) = \exp(x)$  and  $\ln'(x) = \frac{1}{x}$.

The derivative of the exponential function is the exponential function, and the derivative of the natural logarithm function is $1/x$.

Other Bases

Finally, let us relate the exponential and logarithm functions for different bases to those with base $e$.

Theorem 1.36. For any positive number $a$ ($a \ne 1$),

$a^x = e^{x \ln a}$  and  $\log_a x = \frac{\ln x}{\ln a}$.

These identities follow from the exponential laws and the laws of logarithms.

1.11 Differentiability on Closed Intervals

In Definition 1.22 we defined what it means that a function is differentiable on an open set. There are situations in which one would like to apply the notion of differentiability to functions with other kinds of domains. Let us formalize the idea of extending functions.

Definition 1.37. Suppose that $I$ and $J$ are subsets of the real line $\mathbb{R}$ and $I \subseteq J$, that $I$ is the domain of a function $f$, and that $J$ is the domain of a function $F$. We call $F$ an extension of $f$ if it agrees with $f$ on $I$, i.e., $F(x) = f(x)$ for all $x \in I$.

Definition 1.38. A function $f$ is said to be differentiable on a subset $I$ of $\mathbb{R}$ if it extends to a differentiable function $F$ on an open set. We set $f'(x) = F'(x)$ for all $x \in I$.

Without some restrictions on $I$, a function may be differentiable without the derivative being well defined. The least technical and for our purposes sufficient solution is captured in

Proposition 1.39. Suppose the function $f$ is defined on an interval $I$, the interval is neither empty nor a single point, and $f$ extends to a differentiable function $F$ on an open interval containing $I$. Then $f'(x) = F'(x)$ is unique for all $x \in I$.

We are mostly concerned with defining differentiability for functions whose domain is a closed interval $[a, b]$, where $a < b$. Some authors use one-sided limits and one-sided derivatives to contemplate derivatives at the end points of the interval. Our discussion is less painful, and it lends itself more to generalizations in higher dimensions.

Let us discuss two examples. The function $f(x) = x^2$ with domain $[0, 1]$ is differentiable. It extends to the differentiable function $F(x) = x^2$ with the open set $(-\infty, \infty)$ as domain. In contrast, the function $g(x) = \sqrt{x}$ is not differentiable on the interval $[0, \infty)$. The only sensible candidate for the tangent line to the graph of $g(x)$ at the point $(0, 0)$ is a vertical line. The slope of this line is not a real number and we do not have a derivative. (The function $g(x)$ is differentiable if we use $(0, \infty)$ as its domain.)

1.12 Other Notations for the Derivative

There are different notations for the derivative of a function. Physicists will indicate a derivative with respect to time by a dot. E.g., if $x$ is a function of time, then they will write $\dot{x}(t)$ instead of $x'(t)$. Leibniz' notation for the derivative of a function $f$ of a variable $x$ is $\frac{df}{dx}$. We will use it frequently. Instead of $\frac{df}{dx}(x)$ we also write $\frac{d}{dx} f(x)$. This is particularly convenient if $f$ stands for a larger expression as in

$\frac{d}{dx} \sin x = \cos x$  or  $\frac{d}{dx} e^x = e^x$.

Expressing the derivatives of the exponential and natural logarithm functions this way (see (1.19)) we have: If $y(x) = e^x$, then

$\frac{dy}{dx} = y = e^x$,

and if $y(x) = \ln x$, then

$\frac{dy}{dx} = \frac{1}{x}$.

This notation is not always specific enough. The expression $dy/dx$ stands for the derivative of $y$ with respect to $x$, and that is a function. The expression does not tell where $dy/dx$ is evaluated. To be specific about this aspect, it makes sense to write (compare Example 1.31): If $y(x) = \sin x$, then

$\frac{dy}{dx}(x) = \cos x$.

In this notation $x$ plays two roles. It is the name of the variable of $y$ as well as the name of the variable of the derivative of $y$. This is acceptable because it won't lead to confusion.

1.13 Rules of Differentiation

We discuss formulas for calculating the derivative of a composite function from the derivatives of its constituents. These formulas, together with the knowledge of the derivatives of some basic functions, turn the process of differentiation for many functions into an algorithm, a rather mechanical process, which means that no "understanding" is required. You can do it even on the computer. You are expected to learn the basic rules, be able to apply them accurately, and practice many examples.

In the last section of this chapter we summarize the computational results of this section. We collect the rules established in this section and tabulate the derivatives of many of the important functions which we considered.

1.13.1 Linearity of the Derivative

Differentiation is compatible with addition of functions and multiplication with a constant. In a more mathematical language one says that differentiation is linear.

Let $f$ and $g$ be functions, and assume that both of them are differentiable at $x$. Let $c$ be a real number. Then $f + g$ and $cf$ are differentiable at $x$ and their derivatives are given by

(1.20)  $(f + g)'(x) = f'(x) + g'(x)$  and  $(cf)'(x) = c f'(x)$.

In words, the derivative of a sum of functions is the sum of the derivatives, and the derivative of a multiple of a function is the multiple of the derivative. In Leibniz' notation this reads

(1.21)  $\frac{d}{dx}(f + g)(x) = \frac{df}{dx}(x) + \frac{dg}{dx}(x)$  and  $\frac{d}{dx}(cf)(x) = c \frac{df}{dx}(x)$.

Example 1.40. Differentiate $h(x) = x^2 + 3e^x$.

Solution: Set $f(x) = x^2$, $g(x) = e^x$ and $c = 3$. Then $h(x) = f(x) + 3g(x)$. Previously we found that $f'(x) = 2x$ and that $g'(x) = e^x$, see (1.19). We conclude that

$h'(x) = \frac{dh}{dx}(x) = 2x + 3e^x$. ♦

Suppose $f$ and $g$ are defined and differentiable on an open set. Thinking of $f$ and $g$ more as functions, and not so much as functions evaluated at a point, we may omit $(x)$ from the notation. Then the differentiation rules are

(1.22)  $(f + g)' = f' + g'$  or  $\frac{d}{dx}(f + g) = \frac{df}{dx} + \frac{dg}{dx}$

(1.23)  $(cf)' = c f'$  or  $\frac{d}{dx}(cf) = c \frac{df}{dx}$.

Example 1.41. Find the derivative of an arbitrary polynomial.

Solution: A polynomial is a finite sum of multiples of non-negative powers of the variable, i.e., a function of the form

$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$,

where the $a_i$ are constants. Using Example 1.28 and the linearity of the derivative we see right away that

$f'(x) = n a_n x^{n-1} + (n - 1) a_{n-1} x^{n-2} + \cdots + a_1$.

Here is a specific example, a special case of the formula which we just derived. If $f(x) = 4x^5 - 3x^2 + 4x + 5$, then $f'(x) = 20x^4 - 6x + 4$. ♦

Example 1.42. Differentiate $\log_a x$, the logarithm function for an arbitrary positive base $a$, $a \ne 1$.

Solution: Recall that $\log_a x = \frac{\ln x}{\ln a}$, see Theorem 1.36. In this sense $\log_a x = c f(x)$ where $c = 1/\ln a$ and $f(x) = \ln x$. We stated previously that $\ln' x = 1/x$, see (1.19). Using the linearity of the derivative, we find

$\log_a' x = \frac{d}{dx} \frac{\ln x}{\ln a} = \frac{1}{\ln a} \ln' x = \frac{1}{\ln a} \times \frac{1}{x} = \frac{1}{x \ln a}$. ♦

1.13.2 Product and Quotient Rules

Next we state the product and the quotient rule. They allow us to calculate the derivatives of products and quotients of functions. Again, let $f$ and $g$ be functions, and assume that both of them are differentiable at $x$. For the quotient rule assume in addition that $g(x) \ne 0$. Then the product $fg$ and the quotient $f/g$ are differentiable at $x$ and their derivatives are given by

(1.24)  $(fg)'(x) = f'(x) g(x) + f(x) g'(x)$

(1.25)  $\left( \frac{f}{g} \right)'(x) = \frac{f'(x) g(x) - f(x) g'(x)}{[g(x)]^2}$.

In Leibniz' notation these formulas become

(1.26)  $\frac{d}{dx}(fg)(x) = \frac{df}{dx}(x)\, g(x) + f(x)\, \frac{dg}{dx}(x)$

(1.27)  $\frac{d}{dx}\left( \frac{f}{g} \right)(x) = \frac{\frac{df}{dx}(x)\, g(x) - f(x)\, \frac{dg}{dx}(x)}{[g(x)]^2}$.

Example 1.43. Differentiate the function $h(x) = x^2 \ln x$.

Solution: Write $h(x) = f(x) g(x)$ with $f(x) = x^2$ and $g(x) = \ln x$. Then $f'(x) = 2x$ and $g'(x) = 1/x$, see (1.19). Putting this into the product formula yields

$h'(x) = f'(x) g(x) + f(x) g'(x) = 2x \ln x + x^2 \cdot \frac{1}{x} = x(2 \ln x + 1)$. ♦

Example 1.44. Find the derivative of the rational function

$r(x) = \frac{x^2 - 5}{x^3 + 1}$.

Solution: We set $p(x) = x^2 - 5$ and $q(x) = x^3 + 1$. Then $p'(x) = 2x$ and $q'(x) = 3x^2$. According to the quotient rule

$r'(x) = \frac{2x(x^3 + 1) - (x^2 - 5)\,3x^2}{(x^3 + 1)^2} = \frac{-x^4 + 15x^2 + 2x}{(x^3 + 1)^2}$. ♦

Example 1.45. The formula

$\frac{d}{dx} x^n = n x^{n-1}$

holds for all integer powers $n$. If $n \le -1$, then the domain of the function is $\mathbb{R} \setminus \{0\}$, the real line with the origin removed.

Solution: We verified this formula for $n \ge 0$ in Example 1.28. Let $n$ be a negative integer and $m = -n$. Then

$\frac{d}{dx} x^n = \frac{d}{dx} x^{-m} = \frac{d}{dx} \frac{1}{x^m} = \frac{0 \cdot x^m - 1 \cdot m x^{m-1}}{x^{2m}} = \frac{-m}{x^{m+1}} = n x^{n-1}$. ♦

Example 1.46. Find the derivative of $f(x) = \tan x$.

Solution: We express $f(x)$ as a quotient of two functions, $f(x) = \sin x / \cos x$, and apply the quotient rule. Use that $\sin' x = \cos x$ (see Example 1.31) and $\cos' x = -\sin x$ (see Exercise 2 on page 17). The function is defined for all $x$ for which $\cos x \neq 0$, i.e., for $x$ not of the form $n\pi + \pi/2$, where $n$ is an integer. We find

(1.28)    $\tan' x = \frac{\sin' x \cos x - \sin x \cos' x}{\cos^2 x} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x} = \sec^2 x.$

Some books and computer programs will give this result in a different form. Based on the relevant trigonometric identity, they write

(1.29)    $\tan' x = 1 + \tan^2 x.$

That draws our attention to the fact that the function $f(x) = \tan x$ satisfies the differential equation $f'(x) = 1 + f^2(x)$. ♦

Example 1.47. Differentiate the function $f(x) = \sec x$.

Solution: We write the function as a quotient: $f(x) = 1/\cos x$. We apply the quotient rule, using that $\cos' x = -\sin x$ (see Exercise 2 on page 17), and that the derivative of a constant vanishes. We find

(1.30)    $\sec' x = \frac{\sin x}{\cos^2 x} = \frac{\sin x}{\cos x} \cdot \frac{1}{\cos x} = \tan x \sec x.$ ♦

Suppose $f$ and $g$ are defined and differentiable on an open set. Thinking of $f$ and $g$ again more as functions, and not so much as functions evaluated at a point, we may once more omit $(x)$ from the notation. Then the product rule and quotient rule become

(1.31)    $(fg)' = f'g + fg'$  or  $\frac{d}{dx}(fg) = \frac{df}{dx}\,g + f\,\frac{dg}{dx}$

and, wherever $g(x) \neq 0$,

(1.32)    $\left(\frac{f}{g}\right)' = \frac{f'g - fg'}{g^2}$  or  $\frac{d}{dx}\left(\frac{f}{g}\right) = \frac{\frac{df}{dx}\,g - f\,\frac{dg}{dx}}{g^2}.$

Here $g^2$ is the square of the function $g$, given by $g^2(x) = [g(x)]^2$.
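As a quick plausibility check of (1.28), (1.29) and (1.30), one may compare the two forms of $\tan' x$ and the formula for $\sec' x$ with a difference quotient. This is only a sketch under the assumption that a central difference with step `h` is accurate enough at the sample point; the helper names are not from the text.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x); the step h is an assumed choice."""
    return (func(x + h) - func(x - h)) / (2 * h)

def sec(x):
    return 1.0 / math.cos(x)

x = 0.7  # sample point with cos x != 0

tan_slope = derivative(math.tan, x)
form_1_28 = sec(x) ** 2            # sec^2 x, see (1.28)
form_1_29 = 1 + math.tan(x) ** 2   # 1 + tan^2 x, see (1.29)

sec_slope = derivative(sec, x)
form_1_30 = math.tan(x) * sec(x)   # tan x sec x, see (1.30)

print(tan_slope, form_1_28, form_1_29)
print(sec_slope, form_1_30)
```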

1.13.3 Chain Rule

Let $f$ and $g$ be functions, and suppose that the domain of $f$ contains the range of $g$, so that the composition $(f \circ g)(x) = f(g(x))$ is defined for all $x$ in the domain of $g$. Set $h = f \circ g$, so that $h(x) = f(g(x))$. The chain rule says that whenever $g$ is differentiable at $x$ and $f$ is differentiable at $g(x)$, then $h$ is differentiable at $x$ and

(1.33)    $h'(x) = (f \circ g)'(x) = f'(g(x))\,g'(x).$

In Leibnitz' notation the chain rule says that

(1.34)    $\frac{dh}{dx}(x) = \frac{d}{dx} f(g(x)) = \frac{df}{du}(g(x))\,\frac{dg}{dx}(x).$

Example 1.48. Differentiate the function $h(x) = e^{x^2+1}$.

Solution: We write $h = f \circ g$ as a composition of two functions, with $g(x) = x^2 + 1$ and $f(u) = e^u$. In particular, $f(g(x)) = e^{x^2+1}$. Remember that $f'(u) = f(u) = e^u$ and $g'(x) = 2x$. The chain rule tells us that

$h'(x) = f'(g(x))\,g'(x) = 2x\,e^{x^2+1}.$

In the last expression we reversed the order of the factors to make the expression more readable. ♦

Example 1.49. Let $u(x)$ be a differentiable function. If $f(x) = e^{u(x)}$ then $f'(x) = u'(x)e^{u(x)}$. Here are some specific examples:

$\frac{d}{dx} e^{2x+5} = 2e^{2x+5}$
$\frac{d}{dx} e^{\sin x} = \cos x\, e^{\sin x}$
$\frac{d}{dx} e^{\tan x} = \sec^2 x\, e^{\tan x}.$ ♦

Example 1.50. Combining Example 1.45 with the chain rule we find

$\frac{d}{dx} u^n(x) = n u'(x) u^{n-1}(x)$

for all integers $n$, assuming only that $u$ is differentiable at $x$ and $u(x) \neq 0$ if $n \leq -1$.

Solution: Set $g(x) = u(x)$ and $f(u) = u^n$. Then $h(x) = f(g(x)) = u^n(x)$. According to Example 1.45, $f'(u) = n u^{n-1}$. The chain rule tells us now that

$h'(x) = f'(g(x))\,g'(x) = n(g(x))^{n-1} g'(x) = n u'(x) u^{n-1}(x).$

We reordered the factors so that the expression is more readable.

To be specific, here are concrete examples:

$\frac{d}{dx}(3x + 5)^8 = 8(3x + 5)^{8-1} \cdot 3 = 24(3x + 5)^7$
$\frac{d}{dx}(x^2 + 1)^{25} = 25(x^2 + 1)^{24} \cdot 2x = 50x(x^2 + 1)^{24}$
$\frac{d}{dx}\tan^3 x = 3 \sec^2 x \tan^2 x$
$\frac{d}{dx}\cos^2 x = 2\cos x(-\sin x) = -2\cos x \sin x$
$\frac{d}{dx}\sec^5 x = 5\sec^4 x \sec x \tan x = 5\sec^5 x \tan x.$ ♦

Example 1.51. Differentiate the function $\ln |u|$ for $u \neq 0$.

Solution: We asserted that $\ln' u = 1/u$ for positive values of $u$, see (1.19). So, suppose that $u < 0$. Then $u = -|u|$ and $\ln |u| = \ln(-u)$. The chain rule tells us that, for $u < 0$,

$\frac{d}{du}\ln |u| = \frac{1}{|u|}\,\frac{d}{du}(-u) = \frac{1}{-u}(-1) = \frac{1}{u}.$

This means that for all non-zero $u$

(1.35)    $\frac{d}{du}\ln |u| = \frac{1}{u}.$ ♦

More generally, invoking the chain rule,

(1.36)    $\frac{d}{dx}\ln |u(x)| = \frac{u'(x)}{u(x)},$

assuming that $u$ is differentiable and nowhere zero on its domain. E.g.,

$\frac{d}{dx}\ln |x^2 - 4| = \frac{2x}{x^2 - 4}$

for all $x \neq \pm 2$.

We push matters a bit further. We use the formulae for differentiating the exponential and natural logarithm functions. Eventually we will verify them independently. Consider a function $u$ which is differentiable and nowhere zero on its domain and $q$ any real number. Then

(1.37)    If $f(x) = |u(x)|^q$ then $f'(x) = q\,\frac{u'(x)}{u(x)}\,|u(x)|^q.$

The assertion follows from (1.36), Example 1.49 and the exponential laws.

$f'(x) = \frac{d}{dx} e^{\ln f(x)}$
$\quad = \frac{d}{dx} e^{\ln(|u(x)|^q)}$
$\quad = \frac{d}{dx} e^{q \ln |u(x)|}$
$\quad = \left[\frac{d}{dx}(q \ln |u(x)|)\right] e^{q \ln |u(x)|}$
$\quad = q\,\frac{u'(x)}{u(x)}\,|u(x)|^q.$

Here is a concrete example:

$\frac{d}{dx}\left|\frac{1}{2} - \sin x\right|^5 = 5\,\frac{-\cos x}{\frac{1}{2} - \sin x}\,\left|\frac{1}{2} - \sin x\right|^5$

whenever $\sin x \neq 1/2$. Specifically, we have to exclude all $x$ of the form $\frac{\pi}{6} + 2n\pi$ and $\frac{5\pi}{6} + 2n\pi$, where $n$ is an arbitrary integer.

For differentiable functions which are everywhere positive on their domain and any real number $q$ the differentiation formula in (1.37) specializes to

(1.38)    $\frac{d}{dx} u^q(x) = q u'(x) u^{q-1}(x).$

For example:

$\frac{d}{dx}(\sin x)^{1/2} = \frac{\cos x}{2\sqrt{\sin x}}$ for $x \in (0, \pi)$

and

$\frac{d}{dx}(\sec^2 x + 5)^{\pi} = 2\pi \sec^2 x \tan x\,(\sec^2 x + 5)^{\pi - 1}$ for $x \in (-\pi/2, \pi/2)$.

Using the tricks from above, we get the following derivatives:

$\frac{d}{dx} a^x = a^x \ln a$    (Assume $a > 0$. Hint: $a^x = e^{x \ln a}$)
$\frac{d}{dx} x^x = (1 + \ln x)\,x^x$    (Assume $x > 0$. Hint: $x^x = e^{x \ln x}$)
$\frac{d}{dx} x^{\sin x} = \left(\frac{\sin x}{x} + \cos x \ln x\right) x^{\sin x}$    (Assume $x \in (0, \infty)$.)

To differentiate a composition of more than two differentiable functions we apply the chain rule repeatedly. E.g.,

$\frac{d}{dx} f(g(h(x))) = f'(g(h(x)))\,\frac{d}{dx} g(h(x)) = f'(g(h(x))) \cdot g'(h(x)) \cdot h'(x).$

For example

$\frac{d}{dx} e^{\sqrt{x^2+1}} = e^{\sqrt{x^2+1}} \cdot \frac{1}{2\sqrt{x^2+1}} \cdot 2x = \frac{x\,e^{\sqrt{x^2+1}}}{\sqrt{x^2+1}}$

$\frac{d}{dx}\tan^3(5x^2 - x + 5) = 3\tan^2(5x^2 - x + 5)\,\sec^2(5x^2 - x + 5) \cdot (10x - 1)$

1.13.4 Hyperbolic Functions

The exponential function may be used to define the hyperbolic sine and cosine.

(1.39)    $\sinh x = \frac{1}{2}\left(e^x - e^{-x}\right)$  &  $\cosh x = \frac{1}{2}\left(e^x + e^{-x}\right)$

You are invited to verify that $\cosh^2 x - \sinh^2 x = 1$. Conversely, one can show that any point $(u, v)$ on the hyperbola $u^2 - v^2 = 1$ can be expressed as $(\pm\cosh x, \sinh x)$ for some $x \in (-\infty, \infty)$. These observations motivate the attribute 'hyperbolic'.

It is elementary to compute the derivatives of the hyperbolic functions:

$\sinh' x = \cosh x$  and  $\cosh' x = \sinh x$.
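The repeated use of the chain rule and the hyperbolic identities above can be spot-checked numerically. The sketch below is an illustration only; the function names and the sample point are assumptions, and the difference quotient gives an approximation rather than an exact derivative.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x); the step h is an assumed choice."""
    return (func(x + h) - func(x - h)) / (2 * h)

def exp_of_root(x):
    return math.exp(math.sqrt(x**2 + 1))

def exp_of_root_prime(x):
    # Chain rule applied twice: exp, square root, then x^2 + 1.
    root = math.sqrt(x**2 + 1)
    return x * math.exp(root) / root

def tan_cubed(x):
    return math.tan(5 * x**2 - x + 5) ** 3

def tan_cubed_prime(x):
    # Chain rule applied twice: cube, tangent, then 5x^2 - x + 5.
    u = 5 * x**2 - x + 5
    return 3 * math.tan(u) ** 2 / math.cos(u) ** 2 * (10 * x - 1)

x = 0.3  # sample point at which cos(5x^2 - x + 5) is not zero

print(derivative(exp_of_root, x), exp_of_root_prime(x))
print(derivative(tan_cubed, x), tan_cubed_prime(x))
print(derivative(math.sinh, x), math.cosh(x))
print(math.cosh(x) ** 2 - math.sinh(x) ** 2)  # should be close to 1
```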

One may also define other hyperbolic functions

$\tanh x = \frac{\sinh x}{\cosh x}$,  $\coth x = \frac{\cosh x}{\sinh x}$,  $\operatorname{sech} x = \frac{1}{\cosh x}$,  and  $\operatorname{csch} x = \frac{1}{\sinh x}$.

As a routine application of the rules of differentiation, you may calculate the derivatives of these functions. There are identities for these hyperbolic functions, comparable to the identities for the trigonometric functions. You can find them in any table of mathematical formulas, or you can work them out yourself.

1.13.5 Derivatives of Inverse Functions

Let us recall. Two functions $f$ and $g$ are said to be inverses of each other (or each function is the inverse of the other one) if the domain of $f$ is equal to the range of $g$, the domain of $g$ is equal to the range of $f$, and

(1.40)    $g(f(x)) = x$  and  $f(g(y)) = y$

for all $x$ in the domain of $f$ and all $y$ in the domain of $g$.

A few essential properties of inverse functions are listed in Proposition 1.52. Some parts of this proposition are elementary, others are consequences of the intermediate value theorem.

Proposition 1.52. Suppose $f$ and $g$ are inverses of each other.

1. The graph of $g$ is obtained from the graph of $f$ by reflection at the diagonal.
2. If $f$ is increasing, then so is $g$. If $f$ is decreasing, then so is $g$.
3. If $f$ is continuous, then it is monotonic (increasing or decreasing) on any interval in its domain.
4. If $f$ is continuous and $I$ is an interval in the domain of $f$, then $J = f(I)$, the image of $I$ under the map $f$, is an interval. If $I$ is an open interval, then $J$ is an open interval.

For example, the function $f(x) = \cos x$ maps the interval $[0, \pi]$ to the interval $[-1, 1]$. The function $f(x) = e^x$ maps the interval $(-\infty, \infty)$ to the interval $(0, \infty)$. The function $\tan x$ maps the interval $(-\pi/2, \pi/2)$ to the interval $(-\infty, \infty)$. It is customary to define its inverse $\arctan x$ as a function from $(-\infty, \infty)$ to $(-\pi/2, \pi/2)$. The function $\sin x$ maps the interval $[-\pi/2, \pi/2]$ to the interval $[-1, 1]$. Its inverse $\arcsin x$ is typically used with

domain $[-1, 1]$, and its range is $[-\pi/2, \pi/2]$. You see the graph of these two functions in Figures 1.9 and 1.10.

Figure 1.9: $\sin x$ on $[-\pi/2, \pi/2]$

Figure 1.10: $\arcsin y$ on $[-1, 1]$

The relation between the derivative of a function and its inverse is spelled out in our next theorem.

Theorem 1.53. Let $f$ be a differentiable and invertible function which is defined on an open interval $(a, b)$, and denote the image of $f$ by $(A, B)$. Denote the inverse of $f$ by $g$. Then $g$ is differentiable at all points $y \in (A, B)$ for which $f'(g(y)) \neq 0$. For these values of $y$ and for $x$ such that $f(x) = y$ the derivative is given by:

$g'(y) = \frac{1}{f'(g(y))}$  or  $g'(f(x)) = \frac{1}{f'(x)}.$

Proof. We will not give a formally complete proof of the differentiability assertion. Still, if the line $t(x)$ is close to the graph of the function $f(x)$ at the point $(x, f(x))$ and $y = f(x)$, then its reflection $T(x)$ at the diagonal is close to the graph of the function $g(x)$ at the point $(f(x), x) = (y, g(y))$. We need that $T(x)$ is not vertical, and this is assured by the assumption that $t(x)$ is not horizontal. With the role of $x$ and $y$ being interchanged, the slope of $t(x)$ is the reciprocal of the slope of $T(x)$. This provides the formula for the derivative.

Actually, this is also easy to calculate. By definition we have $f(g(y)) = y$ for all $y \in (A, B)$. Differentiate both sides of the equation. We find

$f'(g(y))\,g'(y) = 1$  and  $g'(y) = \frac{1}{f'(g(y))},$

as claimed. If $y = f(x)$, then $g(y) = g(f(x)) = x$, and we obtain the second version of the formula for the derivative of the inverse of the function:

$g'(f(x)) = \frac{1}{f'(x)}.$

We apply the theorem to find some important derivatives.

Example 1.54. Assume that the natural logarithm function is differentiable and that $\ln' x = 1/x$, as asserted in (1.19). Show that the exponential function is differentiable and that

$\frac{d}{dy} e^y = e^y.$

Solution: By definition, the exponential function is the inverse of the natural logarithm function $\ln$. Set $f(x) = \ln x$ and $g(y) = e^y$ in Theorem 1.53. We note that $\ln'(x) \neq 0$ for all $x$ in $(0, \infty)$, the domain of the natural logarithm. The theorem says that the exponential function is differentiable and provides the formula for the derivative:

$\frac{d}{dy} e^y = \frac{1}{\ln'(e^y)} = \frac{1}{1/e^y} = e^y,$

as claimed. ♦

Example 1.55. Show that the function $g(y) = \arctan y$ (the inverse of $f(x) = \tan x$) is differentiable, and that

$\frac{d}{dy}\arctan y = \frac{1}{1 + y^2}.$

According to standard conventions we use $(-\infty, \infty)$ as the domain and $(-\pi/2, \pi/2)$ as the range for $\arctan$.

Solution: The function $f(x) = \tan x$ is differentiable on its entire domain, and $f'(x) = \sec^2 x$ is nowhere zero. Theorem 1.53 tells us that $g(y) = \arctan y$ is differentiable on its entire domain $(-\infty, \infty)$. The theorem also provides us with the formula for the derivative:

$\arctan'(y) = \frac{1}{\tan'(\arctan y)} = \frac{1}{\sec^2(\arctan y)} = \cos^2(\arctan y).$

All we need to do now is to figure out what $\cos^2(\arctan y)$ is. To do this we draw a triangle in which we identify the available data. We refer to the notation in Figure 1.11.

Figure 1.11: An informative triangle (vertices $A$ and $B$, angle $u$ at $A$, adjacent side of length 1, opposite side of length $y$)

There you see a right triangle, the right angle is at the vertex $B$. The angle at the vertex $A$ is called $u$. The adjacent side to this angle is chosen to be of length 1, and the opposing side of length $y$. So, by definition, $\tan u = y$ and $\arctan y = u$. By the theorem of Pythagoras, the length of the hypotenuse is $\sqrt{1 + y^2}$. Then

$\cos u = \frac{1}{\sqrt{1 + y^2}}$  and  $\cos^2(\arctan y) = \frac{1}{1 + y^2}.$

The conclusion is that

(1.41)    $\arctan'(y) = \frac{1}{1 + y^2}.$

This is exactly what we claimed. ♦

Combined with the chain rule, and assuming the differentiability of $u(x)$, we find a slightly more general formula:

(1.42)    $\frac{d}{dx}\arctan(u(x)) = \frac{u'(x)}{1 + u^2(x)}.$
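The formula $g'(y) = 1/f'(g(y))$ of Theorem 1.53 translates directly into a small computation. The following sketch is not from the text; the function name `inverse_derivative` is hypothetical. It evaluates the formula for the pairs $(\tan, \arctan)$ and $(\ln, \exp)$ and compares the results with the closed forms from Examples 1.54 and 1.55.

```python
import math

def inverse_derivative(f_prime, g, y):
    """Return g'(y) = 1 / f'(g(y)), assuming f'(g(y)) is not zero (Theorem 1.53)."""
    return 1.0 / f_prime(g(y))

def tan_prime(x):
    return 1.0 / math.cos(x) ** 2   # sec^2 x

def ln_prime(x):
    return 1.0 / x

for y in (-2.0, 0.0, 0.5, 3.0):
    from_theorem = inverse_derivative(tan_prime, math.atan, y)
    closed_form = 1.0 / (1.0 + y**2)        # formula (1.41)
    print(y, from_theorem, closed_form)

# Example 1.54: the derivative of exp, obtained from the derivative of ln.
print(inverse_derivative(ln_prime, math.exp, 1.5), math.exp(1.5))
```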

For example:

$\frac{d}{dx}\arctan(x^2 + 5) = \frac{2x}{1 + (x^2 + 5)^2}$

$\frac{d}{dx}\arctan(\sin x) = \frac{\cos x}{1 + \sin^2 x}.$

Exercise 3. Show that $\arcsin x$ is differentiable on $(-1, 1)$, and that its derivative is

$\frac{d}{dx}\arcsin x = \frac{1}{\sqrt{1 - x^2}}.$

It is customary to think of $\arcsin x$ as a function from $[-1, 1]$ to $[-\pi/2, \pi/2]$.

We may once more improve on this formula. Let $u(x)$ be a differentiable function which is defined on an open interval, and suppose that $|u(x)| < 1$. Then, using the chain rule, we find that

(1.43)    $\frac{d}{dx}\arcsin(u(x)) = \frac{u'(x)}{\sqrt{1 - u^2(x)}}.$

For example:

$\frac{d}{dx}\arcsin(3x) = \frac{3}{\sqrt{1 - 9x^2}}$ if $x \in (-1/3, 1/3)$

$\frac{d}{dx}\arcsin(x^2) = \frac{2x}{\sqrt{1 - x^4}}$ if $x \in (-1, 1)$

The reader is invited to verify the formulas for the other inverse trigonometric functions $\arcsin x$, $\arccos x$, $\operatorname{arccot} x$, and $\operatorname{arcsec} x$ as they are given in Table 1.3 on page 63.

1.13.6 Implicit Differentiation

Until now we considered functions which were given explicitly. I.e., we were given an equation $y = f(x)$, where $f(x)$ is some instruction which assigns a value to $x$. The points on the graph of $f$ are the points which satisfy the equation. Consider the equation

(1.44)    $(x^2 + y^2)^2 = x^2 - y^2.$

The solutions of this equation form a curve$^5$ in the plane called a lemniscate, see Figure 1.12. Parts of this curve look like the graph of a function, such

Figure 1.12: Lemniscate

as the points for which $y \geq 0$. Without solving the equation for $y$, we still like to calculate the slope of the curve at one of its points. This process is called implicit differentiation.

Let us start out with an example which we have studied before.

Example 1.56. Find the slope of the tangent line to the unit circle at the point $(1/2, \sqrt{3}/2)$.

Solution: We write $y = y(x)$ to emphasize that $y$ is a function of $x$. The unit circle consists of all points which satisfy the equation $x^2 + y^2 = 1$. Differentiating both sides of the equation of the circle we get

$2x + 2y\,\frac{dy}{dx} = 0$  or  $\frac{dy}{dx} = \frac{-x}{y}.$

Plugging in the coordinates of the specified point, we find that

$\left.\frac{dy}{dx}\right|_{(1/2,\,\sqrt{3}/2)} = \frac{-1}{\sqrt{3}}.$

We used a different way to indicate at which point we evaluate the derivative because we had to specify the $x$ and the $y$ coordinate of the point. ♦

$^5$ We will rely on the reader's intuitive idea of a curve in the plane.
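For the unit circle in Example 1.56 the implicit slope $-x/y$ can be compared with the slope of the explicitly given upper semicircle $y = \sqrt{1 - x^2}$. The sketch below is an illustration under the assumption that a central difference quotient is an adequate stand-in for the derivative; the names are not from the text.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x); the step h is an assumed choice."""
    return (func(x + h) - func(x - h)) / (2 * h)

def implicit_slope(x, y):
    """dy/dx = -x/y, obtained by differentiating x^2 + y^2 = 1 implicitly (y != 0)."""
    return -x / y

def upper_semicircle(x):
    """Explicit solution y(x) of x^2 + y^2 = 1 with y >= 0."""
    return math.sqrt(1 - x**2)

x0, y0 = 0.5, math.sqrt(3) / 2

print(implicit_slope(x0, y0))            # slope from implicit differentiation
print(derivative(upper_semicircle, x0))  # slope of the explicit graph
print(-1 / math.sqrt(3))                 # value stated in Example 1.56
```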

Example 1.57. Find the slope of the tangent line to the lemniscate $(x^2 + y^2)^2 = x^2 - y^2$, and find the coordinates of the points where the tangent line is horizontal.

Solution: You see a picture of the lemniscate in Figure 1.12. As in Example 1.56, we consider $y$ as a function of $x$ and differentiate both sides of the equation. We find

$2(x^2 + y^2)\left(2x + 2y\,\frac{dy}{dx}\right) = 2x - 2y\,\frac{dy}{dx}.$

Bring all terms with a factor $dy/dx$ to the left hand side of the equation and those without to the right hand side.

$\left(2y(x^2 + y^2) + y\right)\frac{dy}{dx} = x\left(1 - 2(x^2 + y^2)\right).$

Finally we get an explicit expression for $\frac{dy}{dx}$ in terms of $x$ and $y$:

$\frac{dy}{dx} = \frac{x(1 - 2(x^2 + y^2))}{2y(x^2 + y^2) + y} = \frac{x(1 - 2(x^2 + y^2))}{y(2(x^2 + y^2) + 1)}.$

Given any point $(x, y)$ with $y \neq 0$ on the lemniscate, we can plug it into the expression for $\frac{dy}{dx}$ and we get the slope of the curve at this point. E.g., the point $(x, y) = \left(\frac{1}{2}, \frac{1}{2}\sqrt{-3 + 2\sqrt{3}}\right)$ is a point on the lemniscate, and at this point the slope of the tangent line is

$\frac{dy}{dx} = \frac{2 - \sqrt{3}}{\sqrt{3}\,\sqrt{-3 + 2\sqrt{3}}}.$

This specific calculation takes a bit of arithmetic skill and effort to carry out.

The tangent line is horizontal whenever $\frac{dy}{dx} = 0$. A quick look at Figure 1.12 tells us that we may ignore points where $x = 0$ or $y = 0$. That means that $\frac{dy}{dx} = 0$ whenever $1 - 2(x^2 + y^2) = 0$ or

$x^2 + y^2 = \frac{1}{2}.$

Substitute $x^2 + y^2 = \frac{1}{2}$, and $y^2 = \frac{1}{2} - x^2$ into the equation of the curve. Then we get an equation in one variable:

$\frac{1}{4} = x^2 - \left(\frac{1}{2} - x^2\right)$  or  $x^2 = \frac{3}{8}$  and  $y^2 = \frac{1}{8}.$

The points at which the tangent line to the lemniscate is horizontal are

$(x, y) = \left(\pm\frac{\sqrt{6}}{4}, \pm\frac{\sqrt{2}}{4}\right) \approx (\pm .6124, \pm .3536).$ ♦

Example 1.58. Suppose you drop a circle of radius 1 into a parabola with the equation $y = 2x^2$. At which points will the circle touch the parabola?$^6$

Figure 1.13: Ball in a Cup.

Solution: You see a picture of the problem in Figure 1.13. The crucial observation in this example is that the tangent line to the parabola and the circle will be the same at the point of contact.

Suppose the coordinates of the center of the circle are $(0, a)$, then its equation is $x^2 + (y - a)^2 = 1$. Differentiating the equation of the parabola with respect to $x$, we find that $\frac{dy}{dx} = 4x$. Differentiating the equation of the circle with respect to $x$, we get

$2x + 2(y - a)\,\frac{dy}{dx} = 0.$

Assuming that $\frac{dy}{dx}$ is the same for both curves at the point of contact, we substitute $\frac{dy}{dx} = 4x$ into the second equation. After some simplifications we

$^6$ More sensibly, drop a ball of radius 1 into a cup whose vertical cross section is the parabola $y = 2x^2$.

Example 1. y) = (2.14 Related Rates Many times you encounter situations in which you have two related variables.57 with the curve given by the equation y 2 −x2 (1−x2 ) = 0. 0). We substitute 4 this expression into the equation √ the circle and find that the x coordinate of 15 of the point of contact is x = ± 4 . Consider the curve given by the equation x2 = sin y. Suppose the radius of a ball changes at a rate of 2 cm/min. The ball it too large to fit into the parabola and touch at (0. Find the slope of the curve at the point (x. Find the √ slope of the curve at the point with coordinates x = 1/ 4 2 and y = π/4. According the 3 the chain rule: dV dr dV dr = = 4πr 2 . We consider V as a function of r as well as t. The formula for the volume of a ball is V (r) = 4π r 3 . ♦ . Exercise 5. This is the rate at dt dt which the volume of the ball changes with respect to time. So we may assume that x = 0. We use t to denote the time variable. Repeat Example 1. At which rate does its volume change when r = 20 cm? Solution: Denote the volume of the ball by V and its radius by r. BASIC CONCEPTS x(1 + 4(y − a)) = 0. the 8 circle touches the parabola in the points √ 15 15 (x. and you like to know at which rate the other one changes. we find that y = 15 at the point of contact. .38 find: CHAPTER 1. Substituting this into the equation of the parabola. dt dr dt dt With r = 20 and dr = 2 we get dV = 3200π cm3 /min. we find that the y coordinate of the point of contact is y = a − 1 . Exercise 6.59. Consider the curve given by the equation x3 + y 3 = 1 + 3xy 2 . you know at which rate one of them changes. y) = ± . In summary. In this section we treat such problems. 1. Solving the equation 1 + 4(y − a) = 0 for y. −1). ♦ 4 8 Exercise 4.14. You find a picture of this Lissajous figure in Figure 1.

Figure 1.14: $y^2 - x^2(1 - x^2) = 0$

Example 1.60. Suppose a particle moves on a circle of radius 10 cm and centered at the origin $(0, 0)$ in the Cartesian plane. At some time the particle is at the point $(5, 5\sqrt{3})$ and moves downwards at a rate of 3 cm/min. At which rate does it move in the horizontal direction?

Solution: The equation of the circle is $x^2 + y^2 = 100$. We consider both variables, $x$ and $y$, as functions of the time variable $t$. Implicit differentiation of the equation of the circle gives us the equation

$2x\,\frac{dx}{dt} + 2y\,\frac{dy}{dt} = 0.$

In the given situation $x = 5$, $y = 5\sqrt{3}$, and $\frac{dy}{dt} = -3$. We find that $\frac{dx}{dt} = 3\sqrt{3}$, so that the particle is moving to the right at a rate of $3\sqrt{3}$ cm/min. ♦

Example 1.61. Pressure ($P$) and volume ($V$) of air at room temperature are related by the equation$^7$

$P V^{1.4} = C.$

$^7$ Boyle-Mariotte described the relation between the pressure and volume of a gas. They derived the equation $P V^{\gamma} = C$. It is called the adiabatic law. The constant $\gamma$ depends on the molecular structure of the gas and the temperature. For the purpose of this problem, we suppose that $\gamma = 1.4$ for air at room temperature.

Here $C$ is a constant. At some instant $t_0$ the pressure of the gas is 25 kg/cm$^2$ and the volume is 200 cm$^3$. Find the rate of change of $P$ if the volume increases at a rate of 10 cm$^3$/min.

Solution: We consider $P$ as a function of $V$. Differentiation of the equation yields

$\frac{dP}{dV}\,V^{1.4} + 1.4\,P V^{.4} = 0$  or  $\frac{dP}{dV} = -\frac{1.4\,P V^{.4}}{V^{1.4}} = -\frac{1.4P}{V}.$

According to the chain rule

$\frac{dP}{dt} = \frac{dP}{dV}\,\frac{dV}{dt} = -\frac{1.4P}{V}\,\frac{dV}{dt}.$

Substituting the given information we find that the pressure decreases at a rate of 1.75 kg/cm$^2$ per minute. ♦

Example 1.62. This formula is from Einstein's special theory of relativity. The mass $M$ of a particle at velocity $v$, as perceived by an observer in resting position, is

$M(v) = \frac{m}{\sqrt{1 - v^2/c^2}},$

where $m$ is the mass at rest and $c$ is the speed of light. At which rate is the mass changing when the particle's velocity is 90% of the speed of light, and increasing at $.001c$ per second?

Solution: According to our rules of differentiation

$\frac{dM}{dv} = \frac{mvc}{(c^2 - v^2)^{3/2}}.$

Applying the chain rule and substituting the values, we find

$\frac{dM}{dt} = \frac{dM}{dv}\,\frac{dv}{dt} = \frac{mvc^2}{1000(c^2 - v^2)^{3/2}} = \frac{9\sqrt{19}\,m}{3610} \approx .010867m.$

The perceived mass increases at a rate of approximately 1% of its mass at rest per second. ♦

Exercise 7. A ladder, 7 m long, is leaning against a wall. Right now the foot of the ladder is 1 m away from the wall. You are pulling the foot of the ladder further away from the wall at a rate of .1 m/sec. At which rate is the top of the ladder sliding down the wall?
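The numbers in Examples 1.61 and 1.62 follow from the chain rule formulas by direct substitution. The following sketch, with hypothetical function names, evaluates both rates. For the relativistic example it assumes units in which $m = 1$ and $c = 1$, so the result is expressed as a multiple of the rest mass per second.

```python
import math

def pressure_rate(P, V, dV_dt, gamma=1.4):
    """dP/dt = -(gamma P / V) dV/dt, from differentiating P V^gamma = C."""
    return -gamma * P / V * dV_dt

def mass_rate(v, dv_dt, m=1.0, c=1.0):
    """dM/dt = m v c / (c^2 - v^2)^(3/2) * dv/dt for M(v) = m / sqrt(1 - v^2/c^2)."""
    return m * v * c / (c**2 - v**2) ** 1.5 * dv_dt

# Example 1.61: P = 25 kg/cm^2, V = 200 cm^3, dV/dt = 10 cm^3/min.
print(pressure_rate(25, 200, 10))          # about -1.75 (kg/cm^2 per minute)

# Example 1.62: v = 0.9 c and dv/dt = 0.001 c per second, in units with m = c = 1.
print(mass_rate(0.9, 0.001))               # about 0.010867
print(9 * math.sqrt(19) / 3610)            # closed form from the text
```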

1.15 Exponential Growth and Decay

An idealized, but very useful model for population growth is the Malthusian Law

(1.45)    $A'(t) = aA(t).$

It says that the rate of change of a population is proportional to its size. We denoted the proportionality factor by $a$. We also say that $A(t)$ grows exponentially and $a$ is the relative growth rate.

The equation in (1.45) is an example of a differential equation, an equation which involves a function and its derivatives, and the unknown is a function. We saw that the functions $A(t) = Ce^{at}$ are solutions of this equation, and it can be shown that on an interval any solution is of this form.

We may specify the value of $A$ at some time $t_0$, say $A_0 = A(t_0)$. Then we have an initial value problem

(1.46)    $A'(t) = aA(t)$  and  $A_0 = A(t_0).$

Theorem 1.63. On an interval which contains $t_0$ the function

$A(t) = A_0 e^{a(t - t_0)}$

is the unique solution$^8$ of the initial value problem in (1.46).

The essential aspects of dealing with (1.46) are addressed in the following example.

Example 1.64. Suppose the size of a population of bacteria in a laboratory experiment is $C_1 = 5{,}000$ at time $t_1 = 2$ and $C_2 = 7{,}000$ at time $t_2 = 5$. Here time is measured in hours since the beginning of the experiment.

1. Find the relative growth rate $a$ of the population.
2. Find the formula for the size of the population at any time $t \geq 0$.
3. Predict the size of the population at time $t = 10$.
4. Find the time at which the population reaches 8,000.
5. Find the time within which the population doubles$^9$.

$^8$ That the function satisfies the differential equation follows from (1.19). The uniqueness assertion follows from Proposition 2.9 on page 68, which we still need to prove.

$^9$ Note that the doubling time depends only on the relative growth rate $a$.

Solution: We denote the size of the population at time $t$ by $A(t)$. The theorem tells us that $A(t) = A_0 e^{a(t - t_0)}$, where $A_0 = A(t_0)$.

1. To calculate the relative growth rate $a$ observe that

$\frac{C_2}{C_1} = \frac{A_0 e^{a(t_2 - t_0)}}{A_0 e^{a(t_1 - t_0)}} = e^{a(t_2 - t_1)}$  and  $\ln\frac{C_2}{C_1} = a(t_2 - t_1).$

We find that

$a = \frac{\ln C_2 - \ln C_1}{t_2 - t_1} = \frac{\ln 1.4}{3} \approx .11.$

The population grows at a rate of about 11% per hour.

2. The size of the population at any time $t \geq 0$ is $A(t) = 5000e^{a(t - 2)}$, where $a$ is as above.

3. Substituting $t = 10$ we find that $A(10) \approx 12{,}264$.

4. Suppose the size of the population reaches 8,000 at time $t_3$, then

$8000 = 5000e^{a(t_3 - 2)}$  or  $\ln(8/5) = a(t_3 - 2)$  or  $t_3 = \frac{\ln(1.6)}{a} + 2 \approx 6.2.$

The size of the population reaches 8,000 about 6.2 hours into the experiment.

5. Suppose at some time $t_0$ the size of the population is $A_0 = A(t_0)$ and $T$ hours later the size of the population is $2A_0 = A(t_0 + T)$. Then

$A(t_0 + T) = A_0 e^{aT} = 2A_0$  or  $e^{aT} = 2$  and  $aT = \ln 2.$

Thus the doubling time is

$T = \frac{\ln 2}{a} \approx 6.18$ hours. ♦

Consider a radioactive substance. Suppose $t$ denotes time and $A(t)$ the amount of radioactive substance at time $t$. Experiments have shown that the rate at which radioactive decays occur is proportional to the amount of radioactive material present. This rate is proportional to the rate at which the amount of the material decreases. The experience which we just described can be expressed as a differential equation

(1.47)    $A'(t) = -kA(t).$

The minus sign in the equation is included so that $k$ will be positive.

The half-life $T$ of a radioactive substance is the time within which half of it decays. As in the computation of the doubling time in the previous example, one finds

(1.48)    $T = \frac{\ln 2}{k}.$
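The five parts of Example 1.64 amount to a few evaluations of the exponential and logarithm functions. The sketch below reproduces them; the variable names are chosen for this illustration and are not from the text.

```python
import math

c1, t1 = 5000.0, 2.0   # population and time of the first count
c2, t2 = 7000.0, 5.0   # population and time of the second count

# Part 1: relative growth rate a = ln(C2/C1) / (t2 - t1).
growth_rate = math.log(c2 / c1) / (t2 - t1)

# Part 2: A(t) = C1 * exp(a (t - t1)), the solution through the first data point.
def population(t):
    return c1 * math.exp(growth_rate * (t - t1))

# Part 3: predicted size at t = 10.
size_at_10 = population(10.0)

# Part 4: time at which the population reaches 8,000.
time_to_8000 = t1 + math.log(8000.0 / c1) / growth_rate

# Part 5: doubling time, which depends only on the growth rate.
doubling_time = math.log(2.0) / growth_rate

print(growth_rate, size_at_10, time_to_8000, doubling_time)
```

Replacing `math.log(2.0) / growth_rate` by `math.log(2.0) / k` gives the half-life in (1.48) for a decay constant `k`.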

In the late 1940s Willard Libby invented (or discovered) the method of carbon-14 dating. He was awarded the Nobel Prize for it. In brief, the idea is as follows. Carbon-14 occurs naturally in the atmosphere, and the amount is believed to have been essentially constant for a long time (until recent nuclear testing). All living organisms absorb it. Within a living organism there is an equilibrium. The amount which is absorbed equals the amount which decays. The level of the equilibrium is characteristic for the organism, or a part thereof (e.g. wood from an oak or a human bone). After death no more carbon-14 is absorbed, and the carbon-14 which was present at the time of death decays. For many organisms one also knows how many carbon-14 decays to expect at the time of death. Measuring the number of decays in a dead organism allows us to determine the time of death. The half-life of carbon-14 has been determined to be about 5568 years. We explain the process in a numerical example.

Example 1.65. Suppose we measure 6.68 carbon-14 decays per minute and gram in a certain kind of wood at the time of death of the tree. Suppose dead wood of the same kind shows 1.8 decays per minute and gram. How long ago did the tree die?

Solution: Let $t_0 = 0$ be the time of death of the tree, and $t_1$ the present time, measured in years. The number of decays to be expected $t$ years after death is

$A(t) = 6.68\,e^{-\frac{\ln 2}{5568} t}.$

We have that $A(t_1) = 1.8$. From this we calculate:

(1.49)    $\ln\frac{1.8}{6.68} = -\frac{\ln 2}{5568}\,t_1$  or  $t_1 = -\frac{5568}{\ln 2}\,\ln\frac{1.8}{6.68} \approx 10{,}534.$

The tree died approximately 10,500 years ago. ♦

1.16 More Exponential Growth and Decay

More generally than in (1.45), consider the differential equation

(1.50)    $f'(t) = af(t) + b,$

where $a$ and $b$ are constants, and $a \neq 0$. A time independent solution (steady state solution) of this equation is $f(t) = -b/a$.
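The computation in (1.49) of Example 1.65 can be packaged as a small function. This is a sketch under the assumptions of the example (a half-life of 5568 years and a known decay rate at the time of death); the names `decay_constant` and `age_from_decays` are hypothetical.

```python
import math

HALF_LIFE_YEARS = 5568.0   # half-life of carbon-14 as used in the text

def decay_constant(half_life):
    """k = ln 2 / T, the relation (1.48) solved for k."""
    return math.log(2.0) / half_life

def age_from_decays(initial_rate, current_rate, half_life=HALF_LIFE_YEARS):
    """Solve current_rate = initial_rate * exp(-k t) for the elapsed time t."""
    k = decay_constant(half_life)
    return -math.log(current_rate / initial_rate) / k

age = age_from_decays(6.68, 1.8)
print(age)   # roughly 10,534 years, as in (1.49)
```

As a sanity check, a sample whose decay rate has dropped to exactly half of the initial rate is assigned an age of one half-life.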

Theorem 1.66. Functions of the form

$f(t) = ce^{at} - \frac{b}{a}$

are solutions of the differential equation in (1.50). Here $c$ denotes an arbitrary constant. On an interval every solution of (1.50) is of this form.

We obtain a unique solution if we add an initial condition to the differential equation in (1.50).

Theorem 1.67. On an interval which contains $t_0$, the function

$f(t) = \left(y_0 + \frac{b}{a}\right) e^{a(t - t_0)} - \frac{b}{a}$

is the unique solution of the initial value problem

$f'(t) = af(t) + b$  and  $f(t_0) = y_0.$

Remark 2. It is not hard to verify that the given functions are solutions of the respective problems. The uniqueness assertion is a minor modification of Proposition 2.9 on page 68.

Let us apply these ideas to solve some problems. The important aspects are to translate the given information into a mathematical equation. The rest will be routine calculation.

Example 1.68. On graduation day the balance of your student loan is $15,000. Interest is added at a rate of .5% per month, and you are repaying the loan at a rate of $200.00 per month. Analyze the future of the loan.

Solution: As variable we use time, denoted by $t$ and measured in months. We set $t = 0$ at the time of graduation. This is the time at which you start to repay the loan. Denote the balance of your loan at time $t$ by $B(t)$. The balance increases at a rate of $.005B(t)$ due to interest being added and decreases at a rate of $200.00 per month due to payments which you make. In summary, we have the initial value problem

$B'(t) = .005B(t) - 200$  and  $B(0) = 15{,}000.$

According to Theorem 1.67 the solution of the initial value problem is

$B(t) = \left(15{,}000 + \frac{-200}{.005}\right) e^{.005t} - \frac{-200}{.005} = -25{,}000\,e^{.005t} + 40{,}000.$

For example, $B(T) = 0$ if

$T = \frac{1}{.005}\ln\frac{40}{25} \approx 94.$

After approximately 94 months (7 years and 10 months) you repaid the loan. Your total payments were $18,800, so that you paid the principal plus $3,800 in interest. ♦

Example 1.69. You are absorbing a medication at a rate of 3 mg per hour. (You can keep this rate constant with a skin patch.) The liver metabolizes the medication at a rate of 4% per hour. Analyze the amount of medication in your body at any time.

Solution: We use time as independent variable, denote it by $t$ and measure it in hours. We denote by $t = 0$ the time when we start taking the medication. Let $A(t)$ denote the amount of medication in your body, measured in milligrams. Then $A(t)$ increases at a rate of 3 mg per hour because you are taking in medication and at the same time $A(t)$ decreases at a rate of $.04A(t)$ due to your liver metabolizing the medication. We have the initial value problem

$A'(t) = -.04A(t) + 3$  and  $A(0) = 0.$

The solution of this problem is

$A(t) = -75e^{-.04t} + 75.$

For example, after 12 hours there will be about 28.6 milligrams of medication in your body. It will take slightly more than 40 hours before the amount of medication in your body reaches 60 milligrams. The steady state solution of the problem is $A(t) = 75$. The amount of medication will stabilize at this amount with time. ♦

Example 1.70 (Newton's Law of Cooling). Suppose you have an object whose temperature is different from the temperature of its surroundings. With time, the temperature of the object will approach the one of its surroundings. We discuss how this happens, at least under idealized circumstances.

Think of a cup of coffee. You stir the coffee gently so that the temperature in the cup remains homogeneous and almost no energy is added

through the process of stirring.$^{10}$ Denote the temperature of the coffee by $T$. It is a function of time, so that we write $T(t)$. Newton's law of cooling says that the rate at which the heat is transferred, and with this the rate of change of temperature of the coffee, is proportional to the temperature difference. If $K$ is the temperature of the surroundings, then

(1.51)    $T'(t) = a(T(t) - K) = aT(t) - aK.$

Setting $b = -aK$, this is the differential equation in (1.50). Note that $-b/a = K$.

Let us work out a numerical example. At time $t = 0$, just after you poured the coffee into your cup, its temperature is 95 degrees Celsius. Five minutes later the temperature has dropped to 80 degrees, while you stir it slightly and patiently. The room temperature is 25 degrees. Using these data,

1. Determine the function $T(t)$.
2. Find $t_1$, such that $T(t_1) = 70$ degrees Celsius.

Solution: To apply Theorem 1.67, we set $t_0 = 0$, $y_0 = 95$, and $K = 25$. Putting all of this into the formula for the solution of the initial value problem, we get that

$T(t) = (95 - 25)e^{at} + 25 = 70e^{at} + 25.$

To determine $a$ we use that

$T(5) = 80 = 70e^{5a} + 25,$

and we conclude that $a = \frac{1}{5}\ln\frac{55}{70} \approx -.0482$. Equation (1.51) says that the temperature of the coffee drops at a rate of about .048 degrees per minute for each degree of difference between the temperature of the coffee and the room temperature.

$^{10}$ The physics of heat transfer changes substantially if you take a solid object, such as a turkey in the oven. The temperature in the solid object will not be homogeneous. That means, the outside warms up much faster than the inside. In addition, the specific heat (the amount of energy needed to increase the temperature of one unit of the material by one degree) varies. It is different for fat, protein, and bone. Furthermore, the specific heat is highly temperature dependent for substances like protein. That means, $a$ in (1.51) depends on the temperature $T$. All of this leads to a significantly different development of the temperature inside a turkey as you roast it for your Thanksgiving dinner.
Having a numerical value for a gives us an explicit expression for the temperature T as a function of t: T (t) = 70e−.67.51) says that the temperature of the coffee drops at a rate of about . the outside warms up much faster than the inside.
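The determination of $a$ and of $T(t)$ in Example 1.70 can be checked numerically. The following Python lines are only a sketch of that computation; the names `temperature` and `time_to_reach` are chosen here for illustration and do not come from the text.

```python
import math

# Data of Example 1.70: initial temperature, temperature after
# five minutes, and room temperature (degrees Celsius).
T0, T5, K = 95.0, 80.0, 25.0

# From T(5) = (T0 - K) e^{5a} + K we get a = (1/5) ln((T5 - K)/(T0 - K)).
a = math.log((T5 - K) / (T0 - K)) / 5.0


def temperature(t):
    """T(t) = (T0 - K) e^{a t} + K, with t measured in minutes."""
    return (T0 - K) * math.exp(a * t) + K


def time_to_reach(target):
    """Solve T(t) = target for t (assumes K < target <= T0)."""
    return math.log((target - K) / (T0 - K)) / a


print(round(a, 4))                 # -0.0482
print(round(temperature(5.0), 6))  # 80.0
```

Calling `time_to_reach(70.0)` with the unrounded value of $a$ returns about 9.16; working with the rounded value $a \approx -.0482$ instead gives about 9.17.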

We would like to find out the time $t_1$ for which

$T(t_1) = 70e^{-.0482t_1} + 25 = 70$.

Solving the equation for $t_1$, we find that $t_1 \approx 9.17$. That means that the temperature drops to 70 degrees approximately 9.17 minutes after pouring it. ♦

Exercise 8. The population of an endangered species of birds on Kauai decreases at a relative rate of 25% per year. A government agency raises the species in captivity and releases birds into the wild at a rate of 80 birds per year. Currently, at time $t_0 = 0$, the population is estimated to be 700 birds. Denote the size of the population at time $t$ by $P(t)$, where $t$ denotes time and is measured in years.

1. State the initial value problem for $P(t)$.
2. Find the function $P(t)$.
3. At which time will the population drop to 500 birds?
4. What is the long term estimate for the population of this species in the wild?

Exercise 9. A chemical factory is located on the banks of a river. Downstream from the factory is a lake, and the river is the only contributor to the lake. Assume that the amount of water carried by the river is the same all year round, and the amount of water in the lake is 10 times the amount of water carried by the river per year. In negotiations with the EPA, the owner has agreed to an acceptable level of 2.5 mg per m³ of a pollutant in the lake. After a major accident the level has risen to 15 mg per m³. As a remedy, the factory owner proposes to reduce the emission of pollution so that the level of pollutant in the river is only 1.5 mg per m³. It is assumed that the pollutant is distributed uniformly in the lake at any time. Let $t_0 = 0$ be the time just after the accident and at which the clean-up strategy is implemented. Let $P(t)$ denote the amount of pollutant (measured in mg per m³) in the lake at time $t$.

1. State the initial value problem for $P(t)$.
2. Find the function $P(t)$.
3. At which time will the level of pollution be back to 2.5 mg per m³?

1.17 The Second and Higher Derivatives

Let $f(x)$ be a function which is defined on an open set. If the function is differentiable at each point of its domain, then $f'(x)$ is again a function with the same domain as $f(x)$. We may ask whether the function $f'(x)$ is differentiable. Its derivative, wherever it exists, is called the second derivative of $f$. It is denoted by $f''(x)$. This process can be iterated. The derivative of the second derivative is called the third derivative, and denoted by $f'''(x)$, etc. Leibniz's notation for the second derivative of a function $f(x)$ is $d^2f/dx^2$. We will make use of the second derivative.

Here is a sample computation in which you are invited to fill in the details:

$\frac{d^2}{dx^2} e^{\sin x} = \frac{d}{dx}\left(\cos x \, e^{\sin x}\right) = (-\sin x + \cos^2 x)e^{\sin x}$.

Exercise 10. Find the second derivatives of the following functions:

(1) $f(x) = 3x^3 + 5x^2$
(2) $g(x) = \sin 5x$
(3) $h(x) = x^2 + 2$
(4) $i(x) = e^{5x}$
(5) $j(x) = \tan x$
(6) $k(x) = \cos(x^2)$
(7) $l(x) = \ln 2x$
(8) $m(x) = \ln(x^2 + 3)$
(9) $n(x) = \arctan 3x$
(10) $o(x) = \sec(x^3)$
(11) $p(x) = \ln^2(x + 4)$
(12) $q(x) = e^{\cos x}$
(13) $r(x) = \ln(\tan x)$
(14) $s(x) = e^{x^2 - 1}$
(15) $t(x) = \sin^3 x$

1.18 Numerical Methods

In this section we introduce some methods for numerical computations. Their common feature is that for a differentiable function we do not make a large error when we use the tangent line to the graph instead of the graph itself. This rather casual statement will become clearer when you look at the individual methods.

1.18.1 Approximation by Differentials

Suppose $x_0$ is an interior point of the domain of a function $f(x)$ and $f(x)$ is differentiable at $x_0$. Assume also that $f(x_0)$ and $f'(x_0)$ are known. The method of approximation by differentials provides an approximate value of $f(x_1)$ if $x_1$ is near $x_0$. One uses the formula

(1.52) $f(x_1) \approx f(x_0) + f'(x_0)(x_1 - x_0)$.

We use the symbol '≈' to stand for 'is approximately'.

12 Formula (1. Thus. .72.52) insofar as we have not estimated (provided an upper bound for) the error which we make using the right hand side of (1. Find an approximate value for 3 9.52). We have been causal in (1. 180 90 ♦ Your calculator will give you tan 46◦ ≈ 1.0349. Formula (1.1. Find an approximate value for tan 46◦ .52) we have l(x1 ).0801. applied with x1 = 9 and x0 = 8. f (π/4) = 1. compare your answer with one found on your calculator. f (x1 ) is close to l(x1 ) for x1 near x0 . In each case. √ Example 1. x 3 f (8) = 2. √ Solution: We set f (x) = 3 x. and f (8) = 1 . The inequality in Definition 1.52). and f (π/4) = 2. Example 1. NUMERICAL METHODS 49 On the right hand side in (1. Use the function 46 f (x) = tan x. Solution: We carry out the calculation in radial measure. f (x0 )) evaluated at x1 . and this corresponds to π/4 + π/180. Exercise 11. applied with x1 = (π/4 + π/180) and x0 = π/4 says tan 46◦ = tan π π π π + ≈ tan + sec2 4 180 4 4 π π =1+ ≈ 1.6. differentiability of the function f (x) means that there exist numbers A and d > 0.52) instead of of the actual value of the function on the left hand side.18.0833. so we are supposed to find f (9). In the sense of the definition of the tangent line in Section 1.71. According to this slightly more demanding definition.2 (4) arctan 1. 12 12 ♦ Your calculator will give you 9 ≈ 2.1. the tangent line to the graph of f (x) at (x0 . such that |f (x1 ) − [f (x0 ) + f (x0 )(x1 − x0 )| ≤ A(x1 − x0 )2 whenever |x1 − x0 | < d. Use approximation by differentials to find approximate values for √ 5 (1) 34 (2) tan 31◦ (3) ln 1. Note that f (x) = 1 −2/3 . says that √ 3 9 = f (9) ≈ 2 + √ 3 1 25 (9 − 8) = ≈ 2. if we know A and d. then we can approximate the error as long as |x1 − x0 | < d. Note that ◦ = 45◦ + 1◦ .25 on page 12 provides us with an estimate. Then f (x) = sec2 x.0355.

Example 1.73. Use approximation by differentials to find an approximate value of $\sqrt{10}$ and give an upper bound for the error.

Solution: We use $f(x) = \sqrt{x}$ and $x_0 = 9$. Then $f'(x) = 1/(2\sqrt{x})$, $f(x_0) = 3$, and $f'(x_0) = 1/6$. The formula in (1.52) tells us that

$\sqrt{10} = f(10) \approx f(9) + f'(9)(10 - 9) = 3 + \frac{1}{6} \approx 3.16666$.

The calculator will give you $\sqrt{10} \approx 3.16228$. For the error estimate we may use $A = \frac{1}{2(\sqrt{x_0})^3}$ and any $d > 0$, see (1.16). The estimate assures us that the error is at most

$\frac{1}{2(\sqrt{x_0})^3}(x_1 - x_0)^2 = \frac{1}{54}$.

The actual error is again substantially less than this. ♦

Example 1.74. Find an approximate value for $\sin 31^\circ$ and estimate the error.

Solution: Set $f(x) = \sin x$. Then $f'(x) = \cos x$. Measuring angles in radians we set $x_0 = \pi/6$ and $x_1 = \pi/6 + \pi/180$. We have $f(\pi/6) = 1/2$ and $f'(\pi/6) = \sqrt{3}/2$. Applying the formula in (1.52), we find

$\sin 31^\circ \approx \sin\frac{\pi}{6} + \cos\left(\frac{\pi}{6}\right)\frac{\pi}{180} = \frac{1}{2}\left(1 + \frac{\sqrt{3}\,\pi}{180}\right) \approx .515115$.

The calculator will tell you that $\sin 31^\circ \approx .515038$. From the computation in Example 1.31 on page 17 we also know that we may use $A = 1$ and $d = \pi/4$ in the differentiability estimate. The estimate assures us that the error is at most

$(x_1 - x_0)^2 = \left(\frac{\pi}{180}\right)^2 \le .000305$.

Comparison of the actual and approximate value confirms this. ♦

Exercise 12. Use approximation by differentials to find approximate values for

(1) $\cos 28^\circ$ (2) $\sqrt{26}$ (3) $\sin 47^\circ$.

In each case, estimate also the maximal error which you may have made by using the method of approximation by differentials.
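The computations of Examples 1.73 and 1.74 can be repeated numerically. The sketch below is only an illustration: the helper names `differential_approx` and `error_bound` are chosen here, and the constants $A$ are the ones quoted in the two examples.

```python
import math


def differential_approx(f_x0, df_x0, x0, x1):
    """Right hand side of (1.52): f(x0) + f'(x0)(x1 - x0)."""
    return f_x0 + df_x0 * (x1 - x0)


def error_bound(A, x0, x1):
    """Upper bound A (x1 - x0)^2 from the differentiability estimate."""
    return A * (x1 - x0) ** 2


# Example 1.73: f(x) = sqrt(x), x0 = 9, x1 = 10, A = 1/(2 sqrt(x0)^3).
approx_sqrt10 = differential_approx(3.0, 1.0 / 6.0, 9.0, 10.0)
bound_sqrt10 = error_bound(1.0 / (2.0 * 3.0 ** 3), 9.0, 10.0)

# Example 1.74: f(x) = sin x, x0 = pi/6, x1 = pi/6 + pi/180, A = 1.
x0 = math.pi / 6
x1 = x0 + math.pi / 180
approx_sin31 = differential_approx(math.sin(x0), math.cos(x0), x0, x1)
bound_sin31 = error_bound(1.0, x0, x1)

# Columns: approximation, actual error, guaranteed error bound.
for name, approx, exact, bound in [
    ("sqrt(10)", approx_sqrt10, math.sqrt(10.0), bound_sqrt10),
    ("sin(31 deg)", approx_sin31, math.sin(x1), bound_sin31),
]:
    print(name, round(approx, 6), round(abs(exact - approx), 6), round(bound, 6))
```

In both rows the actual error in the second column stays below the bound in the third column.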

1.18.2 Newton's Method

Newton's method is designed to find the zeros of a function. You have learned how to solve linear and quadratic equations, i.e., finding the zeros of functions of degree 1 and 2. More sophisticated methods allow you to find the exact solutions of polynomial equations of degree three and four. For polynomials of degree greater or equal to 5 and most other functions there are no general methods for finding their roots.

Newton's method works as follows. Suppose we want to find a zero of a differentiable function $f(x)$, i.e., we want to find some $x$, such that $f(x) = 0$. Suppose that by some means we know that such an $x$ exists, and that $x_0$ is not far from $x$. Then we set

(1.53) $x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$.

The process is then iterated. Specifically,

$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}$, $x_3 = x_2 - \frac{f(x_2)}{f'(x_2)}$, etc.,

and in general

(1.54) $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$.

Geometry of Newton's Method: Let us give a geometric explanation for the formulas. Given any $x_0$ at which $f$ is defined and differentiable, we obtain the tangent line $l(x)$ to the graph of $f$ at this point,

$l(x) = f'(x_0)(x - x_0) + f(x_0)$, and $l(x_1) = 0$ if $x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$.

Then $x_1$, as given in (1.53), is the point at which $l(x)$ intersects the $x$-axis. This means that we accept that the tangent line is close to the graph of the function, and instead of finding the zero of the function itself, we find the zero of the tangent line.

Let us calculate $\sqrt{A}$, i.e., the positive root of the function $f(x) = x^2 - A$. Then $f'(x) = 2x$, and

(1.55) $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^2 - A}{2x_n} = \frac{x_n^2 + A}{2x_n} = \frac{1}{2}\left(x_n + \frac{A}{x_n}\right)$.

If we use $A = 3$ and $x_0 = 2$, then we find

$x_1 = \frac{1}{2}\left(2 + \frac{3}{2}\right) = \frac{7}{4}$, $x_2 = \frac{1}{2}\left(\frac{7}{4} + \frac{12}{7}\right) = \frac{97}{56}$ and $x_3 = \frac{18817}{10864}$.

We summarize the computation in Table 1.1. In the first column you find the subscript $n$. In the following two columns you find the values of $x_n$, once expressed as a fraction of integers, once in decimal form. In the last column you see the square of $x_n$.

n    x_n (fraction)    x_n (decimal)    x_n^2
0    2                 2.0000000000     4.0000000000
1    7/4               1.7500000000     3.0625000000
2    97/56             1.7321428571     3.0003188775
3    18817/10864       1.7320508100     3.0000000085

Table 1.1: The Babylonian Method

The numbers in the last column show that we are making rapid progress in finding a good approximation of $\sqrt{3}$. At least $x_3^2$ is rather close to 3. Your calculator will give you 1.73205080757 as an approximate value of $\sqrt{3}$. You see that our value for $x_3$ is rather precise. In fact, if you carry the calculation one step further and find $x_4$, then the accuracy of this approximation of $\sqrt{3}$ will exceed the accuracy of most calculators.

More than 4000 years ago the Babylonians used the outermost expressions in (1.55),

$x_{n+1} = \frac{1}{2}\left(x_n + \frac{A}{x_n}\right)$,

to find good approximations of square roots, expressed as rational numbers. We refer to the described procedure as the Babylonian method.

Let us consider one more example to illustrate Newton's method. Find a solution of the equation $x \sin x = \cos x$. Equivalently, we may say, find a root of the function $f(x) = x \sin x - \cos x$.

Step 1: Let us make sure that there is a root of the function to be found. Observe that $f(0) = -1 < 0$ and $f(\pi/2) = \pi/2 > 0$. The intermediate value theorem tells us that $f(x)$, as a continuous function, has a root in the interval $(0, \pi/2)$. Let us call this root $x$.

Step 2: Let us come up with a first guess for a root. Considering the values of $f$ at the end points of the interval, we guess that $x_0 = 1$ is not too far away from the root. Actually $f(1) \approx .3$.

Step 3: Let us improve the guess: Set

$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} \approx .8645$.

Your calculator will tell you that $f(x_1) \approx .00874$. You see that $f(x_1)$ is much closer to zero than $f(x_0)$, and in this sense we expect that $x_1$ is much closer to the root $x$ of $f(x)$ than $x_0$. We made progress finding $x$, which we know to exist by Step 1.

Step 4: Repeat Step 3 and calculate $x_2$, $x_3$, ....

There is one feature of Newton's method which helps. You may say that with each iteration you make a fresh start, and in this sense previous round-off errors don't carry over.

We explained Newton's method because we want to illustrate the power of the concept 'tangent line.' A full discussion of Newton's method requires mathematical tools which are not available to us at this time. In general, many interesting phenomena can occur. Still, the principal problem is as follows. Suppose that $f(x)$ is a differentiable function and $f(x) = 0$ and $x_0$ is given. Suppose that $x_n$ for $n \ge 1$ are computed according to (1.54). Do the $x_n$ tend (converge) to $x$, and how fast?

For completeness sake, we give an answer. Consider an interval $I = [x - a, x + a]$, and suppose that $|f'(x)| \ge m$ and $|f''(x)| \le M$ on $I$. Suppose $x_n \in I$ and $|x - x_n| < \frac{2m}{M}$. A theorem from advanced calculus asserts that

(1.56) $|x - x_{n+1}| \le \frac{M}{2m}(x - x_n)^2$.

The distance between $x$ and $x_n$ will decrease rapidly as $n$ increases.

We illustrate the theorem by applying it to the previous example. Observe that $f(.8) < 0$ and $f(.9) > 0$. This tells us that $x \in [.8, .9]$. As first guess we used $x_0 = 1$, so that we know that $|x - x_0| < .2$. Let us set $a = .2$, so that $I \subset J = [.6, 1.1]$. On $J$, and with this also on $I$, we have that $|f'(x)| \ge m = 1.5$ and $|f''(x)| \le M = 2.2$. You are invited to verify these estimates using technology. In (1.56) we use that $\frac{M}{2m} < 1$. The quoted theorem asserts that $|x - x_1| < .04$. If we repeat the process, then we see that $|x - x_2| < .0016$ and $|x - x_3| < .00000256$. This illustrates that the $x_n$ approach $x$ rapidly.
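The iteration (1.54) is short enough to try out directly. The following Python lines are a minimal sketch, not a robust root finder: the function names are chosen here for illustration, no safeguard against $f'(x_n) = 0$ is included, and the derivative $f'(x) = 2\sin x + x\cos x$ of the second example was worked out by hand for this sketch.

```python
import math
from fractions import Fraction


def newton_step(f, df, x):
    """One step of (1.54): x - f(x)/f'(x)."""
    return x - f(x) / df(x)


def newton(f, df, x0, steps):
    """Return [x0, x1, ..., x_steps] obtained by iterating (1.54)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(newton_step(f, df, xs[-1]))
    return xs


# Babylonian method for sqrt(3) with x0 = 2, in exact rational arithmetic.
A = 3
roots = newton(lambda x: x * x - A, lambda x: 2 * x, Fraction(2), 3)
print(roots[1], roots[2], roots[3])   # 7/4 97/56 18817/10864

# Second example: f(x) = x sin x - cos x with first guess x0 = 1.
f = lambda x: x * math.sin(x) - math.cos(x)
df = lambda x: 2 * math.sin(x) + x * math.cos(x)
xs = newton(f, df, 1.0, 4)
print(round(xs[1], 4))                # 0.8645
```

Using `Fraction` for the square root example reproduces the fractions of Table 1.1 exactly; the second example uses ordinary floating point numbers.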

1.18.3 Euler's Method

Euler's method is designed to find, by numerical means, an approximate solution of the following kind of problem:

Problem 1. Find a function $y(t)$ which satisfies

(1.57) $y' = \frac{dy}{dt} = F(t, y)$ and $y(t_0) = y_0$.

Here $F(t, y)$ denotes a given function in two variables, and $t_0$ and $y_0$ are given numbers.

The first condition on $y$ in (1.57) is a first order differential equation. It is an equation which involves a function and its derivative, and the unknown is the function. The second condition is called an initial condition. It specifies the value of the function at one point. For short, the problem in (1.57) is called an initial value problem.

Approach in one step: Suppose you want to find $y(T)$ for some $T \ne t_0$. Then you might try the formula

(1.58) $y(T) \approx y(t_0) + y'(t_0)(T - t_0) = y_0 + F(t_0, y_0)(T - t_0)$.

The tangent line to the graph of $y$ at $(t_0, y_0)$ is $l(t) = y(t_0) + y'(t_0)(t - t_0)$, so that the middle term in (1.58) is just $l(T)$. The first, approximate equality in (1.58) expresses the philosophy that the graph of a differentiable function is close to its tangent line, at least as long as $T$ is close to $t_0$. To get the second equality in (1.58) we use the differential equation and initial condition in (1.57), which tell us that $y'(t_0) = F(t_0, y(t_0)) = F(t_0, y_0)$.

The Logistic Law

The differential equation in our next example is known as the logistic law of population growth. In the equation, $t$ denotes time and $y(t)$ the size of a population, which depends on $t$. The constants $a$ and $b$ are called the vital coefficients of the population. The equation was first used in population studies by the Dutch mathematician-biologist Verhulst in 1837. The equation refines the Malthusian law for population growth (see (1.45)). In the differential equation, the term $ay$ expresses that population growth is proportional to the size of the population. In addition, the members of the population meet and compete for food and living space. The probability of this happening is proportional to $y^2$, so that it is assumed that population growth is reduced by a term which is proportional to $y^2$.

Example 1.75. Consider the initial value problem:

(1.59) $\frac{dy}{dt} = ay - by^2$ and $y(t_0) = y_0$,

where $a$ and $b$ are given constants. Find an approximate value for $y(T)$.

Remark 3. An exact solution of the initial value problem in (1.59) is given by the equation

(1.60) $y(t) = \frac{ay_0}{by_0 + (a - by_0)e^{-a(t - t_0)}}$.

This is not the time to derive this exact solution, though you are invited to verify that it satisfies (1.59). We are providing the exact solution, so that we can see how well our approximate values match it.

Solution: Setting $F(t, y) = ay - by^2$, you see that the differential equation in this example is a special case of the one in (1.57). According to the formula in (1.58) we find

(1.61) $y(T) \approx y_0 + (ay_0 - by_0^2)(T - t_0)$.

We expect a close approximation only for $T$ close to $t_0$. ♦

Let us be even more specific and give a numerical example.

Example 1.76. Consider the initial value problem

(1.62) $\frac{dy}{dt} = \frac{1}{10}y - \frac{1}{10000}y^2$ and $y(0) = 300$.

Find approximate values for $y(1)$ and $y(10)$.

Solution: Substituting $a = 1/10$, $b = 1/10000$, $t_0 = 0$, and $y_0 = 300$ into the formula in (1.61), we find that

$y(1) \approx 300 + \left(\frac{300}{10} - \frac{300^2}{10000}\right)(1 - 0) = 321$.

According to the exact solution in (1.60), we find that

$y(t) = \frac{3000}{3 + 7e^{-t/10}}$.

Substituting $t = 1$, we find the exact value $y(1) = 321.4$; this number is rounded off. So, our approximate value is close.

For $T = 10$ the formula suggests that $y(10) \approx 510$. According to the exact solution for this initial value problem, $y(10) = 538.1$. For this larger value of $T$, the formula in (1.61) gives us a less satisfactory result. ♦
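The comparison in Example 1.76 between the one-step formula and the exact solution can be reproduced with a few lines of Python. This is only a sketch; the names `euler_one_step` and `logistic_exact` are introduced here for illustration.

```python
import math


def euler_one_step(F, t0, y0, T):
    """Formula (1.58): approximate y(T) by y0 + F(t0, y0)(T - t0)."""
    return y0 + F(t0, y0) * (T - t0)


def logistic_exact(a, b, t0, y0, t):
    """Exact solution (1.60) of dy/dt = a y - b y^2 with y(t0) = y0."""
    return a * y0 / (b * y0 + (a - b * y0) * math.exp(-a * (t - t0)))


# Data of Example 1.76.
a, b, t0, y0 = 1 / 10, 1 / 10000, 0.0, 300.0
F = lambda t, y: a * y - b * y * y

for T in (1.0, 10.0):
    approx = euler_one_step(F, t0, y0, T)
    exact = logistic_exact(a, b, t0, y0, T)
    print(T, round(approx, 1), round(exact, 1))
# 1.0 321.0 321.4
# 10.0 510.0 538.1
```

The gap between the two columns grows from about 0.4 at $T = 1$ to about 28 at $T = 10$, which is the loss of accuracy described in the example.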

Multi-step approach: We would like to find a remedy for the problem which we discovered in Example 1.76 for $T$ further away from $t_0$. Consider again Problem 1 on page 54. We want to get an approximate value for $y(T)$. For notational convenience we assume that $T > t_0$. Pick several $t_i$ between $t_0$ and $T$:

$t_0 < t_1 < t_2 < \cdots < t_n = T$.

Starting out with $t_0$ and $y(t_0)$, we use the one step method from above to get an approximate value for $y(t_1)$. Then we pretend that $y(t_1)$ is exact, and we repeat the process. We use $t_1$ and $y(t_1)$ to calculate an approximate value for $y(t_2)$. Again we pretend that $y(t_2)$ is exact and use $t_2$ and $y(t_2)$ to calculate $y(t_3)$. Iteratively, we calculate $[t_{i+1}, y(t_{i+1})]$ from $[t_i, y(t_i)]$ according to the formula in (1.58):

(1.63) $[t_{i+1}, y(t_{i+1})] = [t_{i+1}, \; y(t_i) + F(t_i, y(t_i))(t_{i+1} - t_i)]$.

We continue this process until we reach $T$.

For reasonably nice[11] expressions $F(t, y)$ the accuracy of the value which we get for $y(T)$ will increase with $n$, the number of steps we make (at least if all steps are of the same length). On the other hand, in an actual numerical computation we also make round-off errors in each step, and the more steps we make the worse the result might get. Experience will guide you in the choice of the step length.

[11] We do not want to make this term precise, but the $F(t, y)$ in Example 1.75 is of this kind.

Example 1.77. Consider the initial value problem

(1.64) $\frac{dy}{dt} = \frac{1}{10}y - \frac{1}{10000}y^2$ and $y(0) = 10$.

1. Apply the multi-step method to find approximate values for $y(t)$ at $t_1 = 5$, $t_2 = 10$, $t_3 = 15$, ..., $t_{20} = 100$. Arrange them in a table.
2. Graph the points found in the previous step together with the actual solution of the initial value problem.

Solution: As points in the multi-step process we use $t_0 = 0$, $t_1 = 5$, $t_2 = 10$, $t_3 = 15$, $t_4 = 20$, ..., $t_{20} = 100$.

For each $t_i$ ($0 \le i \le 19$) we use the formula

$y(t_{i+1}) = y(t_i) + 5\left(\frac{y(t_i)}{10} - \frac{y^2(t_i)}{10000}\right)$

and calculate $y(t_1)$, $y(t_2)$, $y(t_3)$, ..., $y(t_{20})$ consecutively. We summarize the calculation in Table 1.2.

  t     y(t)   |   t     y(t)   |   t     y(t)
  0    10.00   |  35   153.96   |  70   857.73
  5    14.95   |  40   219.09   |  75   918.74
 10    22.31   |  45   304.62   |  80   956.07
 15    33.22   |  50   410.55   |  85   977.07
 20    49.28   |  55   531.55   |  90   988.27
 25    72.70   |  60   656.05   |  95   994.07
 30   106.41   |  65   768.87   | 100   997.02

Table 1.2: Solution of Example 1.77

In Figure 1.15 you see the graph of the exact solution of the initial value problem. You also see the points from Table 1.2. The points suggest a graph which does follow the actual one reasonably closely. But you see that we are definitely making errors, and they get worse as $t$ increases[12]. You may try a shorter step length. The points will follow the curve much more closely if you use $t_1 = 1$, $t_2 = 2$, $t_3 = 3$, ..., $t_{100} = 100$ in your calculation. ♦

[12] It is incidental that the points eventually get closer to the graph again. This is due to the specific problem, and will not occur in general.

Steady States: Let us consider some very specific solutions of our initial value problem in (1.57):

$y' = \frac{dy}{dt} = F(t, y)$ and $y(t_0) = y_0$.

Suppose $F(t, y_0) = 0$ for all $t$. Then the constant function $y(t) = y_0$ is a solution of the problem. Such a solution is called a steady state solution.
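Returning to the multi-step computation of Example 1.77, the repeated use of (1.63) can be sketched in Python as follows. The function name `euler` and the choice of equally long steps are assumptions made for this illustration; the expression for the exact solution is obtained by substituting $a = 1/10$, $b = 1/10000$, $t_0 = 0$ and $y_0 = 10$ into (1.60).

```python
import math


def euler(F, t0, y0, T, n):
    """Multi-step method (1.63) with n steps of equal length (T - t0)/n.

    Returns the list of points [(t0, y0), (t1, y1), ..., (tn, yn)].
    """
    h = (T - t0) / n
    points = [(t0, y0)]
    for i in range(n):
        t, y = points[-1]
        points.append((t0 + (i + 1) * h, y + F(t, y) * h))
    return points


# Example 1.77: dy/dt = y/10 - y^2/10000, y(0) = 10, step length 5.
F = lambda t, y: y / 10 - y * y / 10000
exact = lambda t: 1000 / (1 + 99 * math.exp(-t / 10))   # from (1.60)

# Columns: t, Euler value, exact value.
for t, y in euler(F, 0.0, 10.0, 100.0, 20):
    print(f"{t:6.1f} {y:9.2f} {exact(t):9.2f}")
```

Calling `euler(F, 0.0, 10.0, 100.0, 100)` instead uses the shorter step length 1 suggested in the solution.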

[Figure 1.15: Illustration of Euler's Method]

Example 1.78. Find the steady states of the differential equation (see (1.50) in Section 1.16)

(1.65) $f'(t) = af(t) + b$.

Solution: Apparently, $f'(t) = 0$ if and only if $f(t) = -b/a$. So the constant function $f(t) = -b/a$ is the only steady state of this differential equation. ♦

In review of Example 1.68 in Section 1.16, you see that the steady state in that example is $B(t) = 40,000$. I.e., if your loan balance is \$40,000.00, the bank charges you interest at a rate of .5% per month, and you are repaying the loan at a rate of \$200.00 per month, then the principal balance of your account will stay unchanged. Your payments cover exactly the occurring interest charges.

Example 1.79. For the logistic law (see Equation (1.59))

$\frac{dy}{dt} = F(t, y) = ay - by^2 = y(a - by)$

we find that $F(t, y) = 0$ if and only if $y = 0$ or $y = a/b$. There are two steady state solutions: $y_u(t) = 0$ and $y_s(t) = a/b$.

Let us interpret these steady state solutions for the specific numerical values of $a = 1/10$ and $b = 1/10,000$ in Example 1.77. If the initial value $y_0$ of the population is positive, then the population size will tend to and stabilize[13] at $y(t) = a/b = 1,000$. In this sense, $y_s(t) = a/b = 1,000$ is a stable steady state solution. It is also referred to as the carrying capacity. It tells you which size population of the given kind the specific habitat will support. If the initial value $y_0$ is negative, then $y(t)$ will tend to $-\infty$ as time increases. If $y_0 \ne 0$, then $y(t)$ will not tend to the steady state $y(t) = 0$. In this sense, $y(t) = 0$ is an unstable steady state. ♦

[13] The common language meaning of these expressions suffices for the purpose of our discussion, and the mathematical definition of 'tends to' and 'stabilizes at' only make these terms precise.

Exercise 13. Consider the initial value problem

(1.66) $y'(t) = -50 + \frac{1}{2}y(t) - \frac{1}{2000}y^2(t)$ and $y_0 = y(0) = 200$.

To make the problem explicit, you should think of a population of deer in a protected wildlife preserve. There are no predators. The deer are hunted at a rate of 50 animals per year. The population has a growth rate of 50% per year. Reproduction takes place at a constant rate all year round. Finally, the last term in the differential equation accounts for the competition for space and food.

1. Use Euler's method to find the population size over the next 30 years. Proceed in 1 year steps.
2. Tabulate and plot your results. Guess at which level the population stabilizes.
3. Repeat the first two steps of the problem if the initial population is 100 animals.
4. Repeat the first two steps of the problem if hunting is stopped.
5. Find the steady states of the original equation in which hunting takes place. I.e., find for which values of $y$ you have that $y' = 0$. You will find two values. Call the smaller one of them $Y_u$ and the larger one $Y_s$. Experiment with different initial values to see which of the steady states is stable, and which one is unstable.

Orthogonal Trajectories

Let us explore a different kind of application. Suppose we are given a family $F(x, y, a) = 0$ of curves. We would like to find curves $D_b$ which intersect the curves $C_a$ perpendicularly. (We say that $D_b$ and $C_a$ intersect perpendicularly in a point $(x_1, y_1)$, if the tangent lines to the curves at this point intersect perpendicularly.) We call such a curve $D_b$ an orthogonal trajectory to the family of the $C_a$'s.

In Figure 1.16 you see a family of ellipses

(1.67) $C_a : F(x, y, a) = x^2 + 3y^2 - a = 0$.

There is one ellipse for each $a > 0$. You also see one orthogonal trajectory in Figure 1.16.

[Figure 1.16: Orthogonal Trajectory to Level Curves]

Let us explain where this type of situation occurs.

Suppose the curves $C_a$ are the level curves in a crater. Here $a$ represents the elevation, so that the elevation is constant along each curve $C_a$. The orthogonal trajectory gives a path of steepest descent. A new lava flow which originates at some point in the crater will follow this path.

Suppose that each ellipse represents an equipotential line of an electromagnetic field. The orthogonal trajectory provides you with a path which is always in the direction of the most rapid change of the field. A charged particle will move along an orthogonal trajectory.

Suppose $a$ stands for temperature, so that along each ellipse the temperature is constant. In this case the curves are called isothermal lines[14]. A heat seeking bug will, at any time, move in the direction in which the temperature increases most rapidly, i.e., along an orthogonal trajectory to the isothermal lines.

Suppose $a$ stands for the concentration of a nutrient in a solution. It is constant along each curve $C_a$. On their search for food, bacteria will follow a path in the direction in which the concentration increases most rapidly. They will move along an orthogonal trajectory.

[14] The idea of isothermal lines, and with this the method in all of these applications, was pioneered by Alexander von Humboldt (1769–1859).

Example 1.80. Find orthogonal trajectories for the family of ellipses

(1.68) $C_a : F(x, y, a) = x^2 + 3y^2 - a = 0$.

Solution: Differentiating the equation for the ellipses, we get

$2x + 6y\frac{dy}{dx} = 0$ or $\frac{dy}{dx} = \frac{-x}{3y}$.

The slope of the tangent line to a curve $C_a$ at a point $(x_1, y_1)$ is $\frac{-x_1}{3y_1}$. If a curve $D_b$ intersects $C_a$ in $(x_1, y_1)$ perpendicularly, then we need that the slope of the tangent line to $D_b$ at this point is $\frac{3y_1}{x_1}$. Thus, to find an orthogonal trajectory to the family of the $C_a$'s we need to find functions which satisfy the differential equation $\frac{dy}{dx} = \frac{3y}{x}$. If we also require that the orthogonal trajectory goes through a specific point $(x_0, y_0)$, then we end up with the initial value problem

$\frac{dy}{dx} = \frac{3y}{x}$ and $y(x_0) = y_0$.

This is exactly the kind of problem which we solved with Euler's method.

In this particular example it is not difficult to find solutions for the differential equation. They are functions of the form $y(x) = bx^3$. There is one orthogonal trajectory which does not have this form, and this is the curve $x = 0$. The orthogonal trajectory shown in Figure 1.16 has the equation $y = x^3/25$.

Let us apply Euler's method to solve the problem. Let us find approximate values for the initial value problem

$\frac{dy}{dx} = \frac{3y}{x}$ and $y(1) = \frac{1}{25}$.

Use $x_0 = 1$, $x_1 = 1.2$, $x_2 = 1.4$, ..., $x_{20} = 5$. We set $(x_0, y_0) = (1, 1/25)$ and calculate $(x_n, y_n)$ according to the formula

$y_n = y_{n-1} + .2 \cdot \frac{3y_{n-1}}{x_{n-1}}$ for $n = 1$, ..., 20.

Without recording the results of this calculation, we graphed the points in Figure 1.16. ♦

Exercise 14. Consider the family of hyperbolas:

$C_a : x^2 - 5y^2 + a = 0$.

There is one hyperbola for each value of $a$, only for $a = 0$ the hyperbola degenerates into two intersecting lines.

1. Graph several of the curves $C_a$.
2. Find the differential equation for an orthogonal trajectory.
3. Use Euler's method to find points on the orthogonal trajectory through the point $(3, 4)$. Use the points $x_0 = 3$, $x_1 = 3.2$, $x_2 = 3.4$, ..., $x_{20} = 7$. Plot the points $(x_n, y_n)$ in your figure.
4. Check that the graph of $y(x) = bx^{-5}$ is an orthogonal trajectory to the family of hyperbolas for every $b$. Determine $b$, so that the orthogonal trajectory passes through the point $(3, 4)$, and add this graph to your figure.
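The Euler computation described in Example 1.80 for the ellipses can be sketched as follows. The helper name `euler_points` is chosen here for illustration; the comparison curve $y = x^3/25$ is the exact orthogonal trajectory quoted in the example.

```python
def euler_points(slope, x0, y0, h, n):
    """n Euler steps of length h for dy/dx = slope(x, y), y(x0) = y0."""
    pts = [(x0, y0)]
    for i in range(n):
        x, y = pts[-1]
        pts.append((x0 + (i + 1) * h, y + h * slope(x, y)))
    return pts


# Example 1.80: dy/dx = 3y/x, y(1) = 1/25, with x_n = 1 + 0.2 n.
pts = euler_points(lambda x, y: 3 * y / x, 1.0, 1 / 25, 0.2, 20)

# Print every fifth point next to the exact value x^3/25.
for x, y in pts[::5]:
    print(f"x = {x:.1f}  Euler: {y:.4f}  exact: {x**3 / 25:.4f}")
```

With this step length the Euler points fall noticeably below the exact curve as $x$ grows; a shorter step length reduces the gap.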

1.19 Table of Important Derivatives

f(x)        f'(x)                      Assumptions
x^q         q x^(q−1)                  q a natural number, or x > 0
e^x         e^x                        x ∈ (−∞, ∞)
ln |x|      1/x                        x ∈ (−∞, ∞), x ≠ 0
sin x       cos x                      x ∈ (−∞, ∞)
cos x       −sin x                     x ∈ (−∞, ∞)
tan x       sec² x                     all x for which tan x is defined
cot x       −csc² x                    all x for which cot x is defined
sec x       sec x tan x                all x for which sec x is defined
csc x       −csc x cot x               all x for which csc x is defined
arctan x    1/(1 + x²)                 x ∈ (−∞, ∞)
arcsin x    1/√(1 − x²)                x ∈ (−1, 1), arcsin x ∈ (−π/2, π/2)
arccos x    −1/√(1 − x²)               x ∈ (−1, 1), arccos x ∈ (0, π)
arccot x    −1/(1 + x²)                x ∈ (−∞, ∞), arccot x ∈ (0, π)
arcsec x    1/(|x| √(x² − 1))          x < −1 or x > 1, arcsec x ∈ (0, π/2) ∪ (π/2, π)
arccsc x    −1/(|x| √(x² − 1))         x < −1 or x > 1, arccsc x ∈ (−π/2, 0) ∪ (0, π/2)

Table 1.3: Some Derivatives


Chapter 2

Global Theory

So far we studied the local behaviour of a function. All concepts were related to the behaviour of a function near a point. In this chapter we will use local information about a function to draw global conclusions. The fundamental result which allows us to do this is referred to as Cauchy's mean value theorem. We will discuss some uniqueness properties of solutions of differential equations. Then we discuss geometric properties of graphs, their monotonicity and concavity. With this information it is possible to sketch graphs capturing their essential features. We apply these ideas to the study of extrema of functions.

2.1 Cauchy's Mean Value Theorem

Augustin-Louis Cauchy (1789–1857) was one of the great mathematicians of the 19th century. He made major contributions to make calculus a rigorous mathematical theory.

It is useful to make the following

Definition 2.1. Let $f(x)$ be a function which is defined on the interval $[a, b]$. Then we call

$\frac{f(b) - f(a)}{b - a}$

the average rate of change of $f$ over the interval $[a, b]$.

For example, the average rate of change of $f(x) = x^2$ over the interval $[0, 2]$ is 2. The average rate of change of $f(x) = \sin x$ over $[0, \pi/2]$ is $2/\pi$ and over $[0, \pi]$ it is 0.
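The three sample values after Definition 2.1 are easy to check numerically. The helper name `average_rate_of_change` below is chosen here for illustration only.

```python
import math


def average_rate_of_change(f, a, b):
    """(f(b) - f(a)) / (b - a) as in Definition 2.1 (requires a != b)."""
    return (f(b) - f(a)) / (b - a)


square = lambda x: x * x

print(average_rate_of_change(square, 0.0, 2.0))             # 2.0
print(average_rate_of_change(math.sin, 0.0, math.pi / 2))   # 2/pi, about 0.6366
print(average_rate_of_change(math.sin, 0.0, math.pi))       # 0 up to round-off
```

The last value is not printed as exactly 0 because `math.sin(math.pi)` is only approximately zero in floating point arithmetic.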

Theorem 2.2 (Cauchy's Mean Value Theorem). Let $f$ be a real valued function which is defined and continuous on the interval $[a, b]$ and differentiable on $(a, b)$, where $a < b$. Then there exists a number $c \in (a, b)$ such that

$f'(c) = \frac{f(b) - f(a)}{b - a}$.

In words, the theorem asserts that the average rate of change over an interval is equal to the rate of change at some point in the interval. For example, the average rate of change of $f(x) = x^2$ over the interval $[-2, 1]$ is $-1$, and $f'(-1/2) = -1$.

The following special case of the theorem, called Rolle's theorem (named after Michel Rolle (1652–1719)), is of particular interest.

Theorem 2.3 (Rolle's Theorem). Let $f$ be a real valued function which is defined and continuous on the interval $[a, b]$ and differentiable on $(a, b)$, where $a < b$. If $f(a) = f(b)$, then there exists a number $c$ between $a$ and $b$ (i.e., $a < c < b$) such that $f'(c) = 0$.

We are not going to say anything about the proof of these two theorems, except that Cauchy's theorem and Rolle's theorem are equivalent (each is an easy consequence of the other one), and that the proof of both of them depends heavily on the completeness[1] of the real numbers. We are also not interested in finding the points $c$, as they occur in the two theorems. We are interested in more general consequences.

[1] We discussed this property of the real numbers in Section 1.1.

Corollary 2.4. Let $f$ be a real valued function which is defined and continuous on an interval $I$. If $f'(x) = 0$ for all interior points $x$ of $I$, then $f$ is constant on this interval. In other words, there exists a number $d$ such that $f(x) = d$ for all $x \in I$.

Proof. A different formulation of the claim is that $f(a) = f(b)$ for all $a, b \in I$. We prove this statement using Cauchy's theorem. If $f(a) \ne f(b)$, then $a \ne b$ and there exists some $c \in (a, b)$, such that

$f'(c) = \frac{f(b) - f(a)}{b - a} \ne 0$.

But this contradicts the assumption that $f'(c) = 0$ for all interior points $c$ of $I$, and the corollary is proved.

We are going to use the following corollary frequently.

Corollary 2.5. Let $h$ and $g$ be functions which are defined and continuous on an interval $I$. If $h'(x) = g'(x)$ for all $x \in I$, then $h$ and $g$ differ by a constant, i.e., there exists a number $d$ such that $h(x) = g(x) + d$ for all $x \in I$.

Proof. Apply the previous corollary to $f(x) = h(x) - g(x)$.

2.2 Unique Solutions of Differential Equations

Corollary 2.4 implies

Proposition 2.6. If the function $F(x)$ is defined on an interval $I$ and $F'(x) = 0$ for all $x \in I$, then $F(x)$ is constant on $I$. In other words, on intervals the only solutions of the differential equation $F'(x) = 0$ are the constant functions.

Definition 2.7. Suppose the function $f(x)$ is defined on the interval $I$. We call a function $F(x)$ with domain $I$ an antiderivative of $f$ if $F'(x) = f(x)$ for all $x \in I$.

Using this notion, we can reformulate Corollary 2.5.

Corollary 2.8. Suppose $h$ and $g$ are antiderivatives of a function $f$, defined on an interval. Then $h$ and $g$ differ by a constant.

For example, any antiderivative $F(x)$ of the function $f(x) = 2x$ on the real line $(-\infty, \infty)$ is of the form $F(x) = x^2 + c$ where $c$ is a constant. Any antiderivative $F(x)$ of the function $f(x) = \sec^2 x$ on the interval $(-\pi/2, \pi/2)$ is of the form $F(x) = \tan x + c$.

More generally, if you like to find all antiderivatives $F(x)$ of a function $f(x)$ on an interval, then it suffices to find one antiderivative $H(x)$. Any antiderivative $F(x)$ is of the form $H(x) + c$ where $c$ is a constant. The constant $c$ is referred to as integration constant. For the time being you depend on being able to guess such a function $H(x)$. By differentiating $H(x)$ you can check whether you guessed right.

Typically, the integration constant is determined by an initial condition. Suppose we like to solve the initial value problem

$f'(x) = \cos x$ and $f(0) = 1$.

Our first conclusion is that $f(x) = \sin x + c$, where $c$ is a constant. This follows from the above because $(\sin x + c)' = \cos x$. Next we substitute $x = 0$ in the equation. Then we see that $f(0) = c = 1$. The solution of the initial value problem is $f(x) = \sin x + 1$.

Of particular importance to our discussion of exponential growth and decay is

Proposition 2.9. Every solution $f(x)$ of the differential equation $f'(x) = af(x)$ on an interval is of the form $f(x) = ce^{ax}$ for some constant $c$.

Proof. We asserted in (1.19), and will eventually prove, that all functions of the form $f(x) = ce^{ax}$ satisfy the differential equation. We want to see that these are the only solutions. Let $f(x)$ be any function which satisfies the differential equation on some interval. Consider the function $h(x) = f(x)e^{-ax}$. As a product of differentiable functions, $h$ is differentiable. Its derivative is
\[ h'(x) = f'(x)e^{-ax} - af(x)e^{-ax} = af(x)e^{-ax} - af(x)e^{-ax} = 0. \]
Corollary 2.4 tells us that $h(x)$ is a constant function. Calling the constant $c$ we find that $f(x) = ce^{ax}$. This means that all solutions of the differential equation $f'(x) = af(x)$ are of the form $f(x) = ce^{ax}$.

Proposition 2.10. The initial value problem $f'(x) = af(x)$ and $f(x_0) = C$ has a unique solution on an interval containing $x_0$. In fact $f(x) = Ce^{a(x - x_0)}$.

Proof. By the previous proposition we know that the solution is of the form $f(x) = ce^{ax}$ for some $c$. Substituting the initial condition we obtain $C = f(x_0) = ce^{ax_0}$. Thus $c = Ce^{-ax_0}$ and $f(x) = ce^{ax} = Ce^{-ax_0}e^{ax} = Ce^{a(x - x_0)}$.
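As a plausibility check of the closed form $Ce^{a(x - x_0)}$ for the initial value problem $f'(x) = af(x)$, $f(x_0) = C$, one can compare it with a crude numerical integration. The sketch below uses Euler's method, which is not discussed in the text; the parameter values $a = 0.5$, $C = 2$, $x_0 = 1$ are arbitrary choices made only for illustration.

```python
import math


def exact_solution(x, a, C, x0):
    """Closed form solution of f' = a f with f(x0) = C."""
    return C * math.exp(a * (x - x0))


def euler_solution(x, a, C, x0, steps=100000):
    """Approximate f(x) by Euler's method, stepping from x0 to x.

    Each step replaces f by f + h * f', where f' = a * f is taken from
    the differential equation. The result is only an approximation.
    """
    h = (x - x0) / steps
    y = C
    for _ in range(steps):
        y += h * a * y
    return y


approx = euler_solution(3.0, a=0.5, C=2.0, x0=1.0)
exact = exact_solution(3.0, a=0.5, C=2.0, x0=1.0)
print("Euler approximation:", approx)
print("closed form value:  ", exact)
```

The two values agree to several digits, which is consistent with, but of course does not prove, the uniqueness statement.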


Remark 4. The uniqueness of the solution of an initial value problem as in the previous proposition is not only of theoretical importance. Imagine that you study the growth rate of a strain of bacteria, as we did in Example 1.64 on page 41. Before you can publish your result, it must be certain that your experiment can be reproduced at a different time in a different location. That is a requirement which any experiment in science must satisfy. If there is more than one mathematical solution to your problem, then you have to expect that the experiment can go either way, and this would invalidate your experiment.

2.3 The First Derivative and Monotonicity

One of the interesting properties of a function is whether it is increasing or decreasing. We might want to find out whether the part of a population which is infected with a disease is increasing or decreasing. We might want to know how the level of pollution in a body of water is changing. The first derivative of a function gives us information of this kind.

2.3.1 Monotonicity on Intervals

Recall that a function $f$ is called increasing if $f(b) > f(a)$ whenever $b > a$. It is called decreasing if $f(b) < f(a)$ whenever $b > a$. A function is called monotonic if it is either increasing or decreasing.

Theorem 2.11. Suppose that the function $f$ is defined and continuous on the interval $I$.
1. If $f'(x) > 0$ for all $x \in I$, then $f$ is increasing on $I$.
2. If $f'(x) < 0$ for all $x \in I$, then $f$ is decreasing on $I$.
3. More generally, the conclusions in (1) and (2) still hold if in each finite interval $J \subset I$ there are only finitely many points at which the assumption on $f'(x)$ is not satisfied.$^2$

$^2$ It is permissible that $f$ is not differentiable at a few points in $J$, or that $f'(x) = 0$ there. It is not possible that $f'(x) < 0$ at some point in the interval and that $f(x)$ is increasing on the interval.

Proof. We show (1). Let $a$ and $b$ be points in $I$, and suppose that $a < b$. Cauchy's theorem says that there exists a point $c$, $a < c < b$, such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]


We have that $f'(c) > 0$ and $b - a > 0$, and it follows that $f(b) - f(a) > 0$. This means that $f(b) > f(a)$. The proof of the second claim is similar. We leave it and the generalization of both statements to the reader.
For example, $\log_2' x = \left(\frac{\ln x}{\ln 2}\right)' = \frac{1}{x \ln 2} > 0$ for all $x \in (0, \infty)$. In the computation we used (1.19), Theorem 1.36, and that $\ln 2 > 0$. It follows from Theorem 2.11 that $\log_2 x$ is increasing on $(0, \infty)$. You see part of the graph of the function in Figure 1.8.

The exponential function $\exp_a x = a^x$ is increasing on $(-\infty, \infty)$ if $a > 1$ and decreasing if $0 < a < 1$. To see this, observe that $a^x = e^{x \ln a}$ and $\frac{d}{dx} a^x = (\ln a) a^x$. Furthermore, $a^x > 0$, and $\ln a > 0$ if $a > 1$ and $\ln a < 0$ if $0 < a < 1$. Now Theorem 2.11 implies our assertion. You may also want to have a look at the graph of the exponential function with base 2 in Figure 1.7.

The function $f(x) = 1/x$ is defined and differentiable on the set of all nonzero real numbers, and its derivative is $f'(x) = -1/x^2$. In particular $f'(x) < 0$ for all nonzero real numbers. According to Theorem 2.11, $f(x)$ is decreasing on the interval $(-\infty, 0)$ and decreasing on the interval $(0, \infty)$. The function is not decreasing on the union of the two intervals. The example illustrates that it is crucial in Theorem 2.11 that we deal with functions which are defined and differentiable on an interval.

The function $f(x) = \tan x$, defined on $(-\pi/2, \pi/2)$, has as its derivative $f'(x) = \sec^2 x$, and the derivative is positive. Consequently, $f(x)$ is increasing on $(-\pi/2, \pi/2)$. Its inverse $g(x) = \arctan x$, defined on $(-\infty, \infty)$, has as its derivative $g'(x) = \frac{1}{1 + x^2}$, which is positive on $(-\infty, \infty)$, so that $g(x) = \arctan x$ is increasing on $(-\infty, \infty)$. As a general principle, one may show that the inverse of an increasing function is increasing.
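The remark about $f(x) = 1/x$ is easy to test on sample points. The following sketch checks the defining inequality of a decreasing function on a finite sample only; the helper `is_decreasing_on` and the chosen sample points are assumptions made for illustration, and a finite sample can refute monotonicity but never prove it.

```python
def f(x):
    """The function f(x) = 1/x, defined for nonzero x."""
    return 1.0 / x


def is_decreasing_on(points):
    """Check f(u) > f(v) for consecutive sample points u < v.

    By transitivity this verifies the decreasing condition for every
    pair of points in the sample, but says nothing beyond the sample.
    """
    pts = sorted(points)
    return all(f(u) > f(v) for u, v in zip(pts, pts[1:]))


negative_side = [-3.0, -2.0, -1.0, -0.5]
positive_side = [0.5, 1.0, 2.0, 3.0]

print("decreasing on negative samples:", is_decreasing_on(negative_side))
print("decreasing on positive samples:", is_decreasing_on(positive_side))
print("decreasing on the union:", is_decreasing_on(negative_side + positive_side))
```

The last check fails because $f(-0.5) = -2 < 2 = f(0.5)$ although $-0.5 < 0.5$, which is exactly the obstruction described in the text.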

Example 2.12. For a three dimensional solid we set $E = A/V$, where $A$ denotes the surface area and $V$ the volume. For example, for a ball $E(r) = (4\pi r^2)/(\frac{4}{3}\pi r^3) = 3/r$, where $r$ denotes the radius. Then $E'(r) < 0$. The same principle holds for other shapes: $E$ decreases as we enlarge the solid without changing its shape.

What does this have to do with the size of animals? Warm blooded animals living in cold climates need to preserve their body temperature. The total amount of heat stored in the body is proportional to the volume, while the heat loss is proportional to the surface area. The ratio of volume to surface area increases as the animal gets larger, so that for warm blooded animals it is of advantage to be large if they live in cold climates. In hot climates they need to give off heat, so that it is of advantage to be small. Natural selection (Darwinism) should favor the larger specimens of a warm blooded species in a cold climate and smaller ones in a hot climate. You can observe this phenomenon in real life.

For cold blooded animals the converse holds. They absorb heat so that their body reaches a temperature at which they can be active. In cold climates it helps to be small, because then the surface area is relatively large, compared to the volume. In hot climates cold blooded animals can afford to be large, as it is easy to reach and maintain the temperature at which they can be active. The argument is again consistent with real life.

Needless to say, there are other mechanisms to increase the surface area of a body than decreasing its size, and the maintenance of the body temperature is only one factor which influences the size of specimens of a species. Larger animals need more food, are stronger, cannot hide so well, and are often less agile. All of these factors need to be taken into account to determine the optimal size of an animal. ♦

So far we have only discussed examples where we used (1) and (2) of Theorem 2.11. Let us show how to use the conclusion in (3). To apply it we need to determine intervals on which a function does not change signs. We recall a procedure which works well for continuous functions.

Definition 2.13. Suppose $f(x)$ is a function. We call a point $x_0$ on the real line exceptional if either $f(x_0) = 0$ or $f(x_0)$ is not defined.

The following result is an immediate consequence of the Intermediate Value Theorem, see Theorem 1.16 on page 8. Expressed casually it says that a continuous function can change signs only at exceptional points.

Proposition 2.14. Suppose $f(x)$ is continuous and $f(x)$ has no exceptional points in the interval $(x_0, x_1)$. Then $f(x) > 0$ for all points in the interval $(x_0, x_1)$, or $f(x) < 0$ for all points in the interval $(x_0, x_1)$.
In particular, if $f(x)$ is positive at one point in the interval, then it is positive at all points in the interval. If $f(x)$ is negative at one point in the interval, then it is negative at all points in the interval.

Example 2.15. For example, consider the function
\[ f(x) = \frac{x^2(x^2 - 4)}{x^2 + 2x - 15} = \frac{x^2(x - 2)(x + 2)}{(x - 3)(x + 5)}. \]

The zeros of the numerator, and with this the zeros of f (x), are x = 0, x = 2, and x = −2. The zeros of the denominator, i.e., the points where f (x) is not defined, are x = 3 and x = −5.

According to the proposition, the sign of $f(x)$ remains unchanged on each of the intervals $(-\infty, -5)$, $(-5, -2)$, $(-2, 0)$, $(0, 2)$, $(2, 3)$ and $(3, \infty)$. Counting signs of the factors in the expression for $f(x)$, we see $f(x)$ is positive on the interval $(-\infty, -5)$, negative on $(-5, -2)$, positive on $(-2, 0)$ and on $(0, 2)$, negative on $(2, 3)$, and positive on $(3, \infty)$. You see that the sign changes at some, but not all, exceptional numbers. ♦

Exercise 15. Find intervals on which the following functions do not change signs. Decide whether the functions are positive or negative on these intervals.
(1) $f(x) = x^3 - x^2 - 5x - 3$
(2) $g(x) = \frac{x}{x^3 + 5x^2 - 4x - 20}$

We are ready to discuss the monotonicity of functions whose derivative vanishes at some points.

Example 2.16. Find intervals of monotonicity for the function $f(x) = 3x^2 + 5x - 4$.

Solution: We graphed the function in Figure 2.1. Its derivative is $f'(x) = 6x + 5$. In particular, $f'(x) > 0$ if $x \in (-5/6, \infty)$. So $f'(x) > 0$ for all points $x \in [-5/6, \infty)$, except at $x = -5/6$. Theorem 2.11 (3) says that $f$ is increasing on the interval $[-5/6, \infty)$. By a similar argument, $f$ is decreasing on the interval $(-\infty, -5/6]$. ♦

[Figure 2.1: A quadratic polynomial, $f(x) = 3x^2 + 5x - 4$.]
[Figure 2.2: A cubic polynomial, $p(x) = x^3 - 3x^2 - 9x + 3$.]
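The sign count for the rational function of Example 2.15 can be double checked by evaluating the function at one sample point in each interval between exceptional points, which is legitimate by Proposition 2.14. The sketch below does this in Python; the helper `sign_chart` and the choice of sample points (midpoints, plus one point beyond each end) are illustrative assumptions.

```python
def f(x):
    """The rational function of Example 2.15 in factored form."""
    return x**2 * (x - 2) * (x + 2) / ((x - 3) * (x + 5))


# Zeros of the numerator and of the denominator.
exceptional_points = [-5, -2, 0, 2, 3]


def sign_chart(points):
    """Return the sign of f on each interval cut out by the points.

    One sample is taken left of the first point, at the midpoint of
    each pair of neighbors, and right of the last point.
    """
    pts = sorted(points)
    samples = [pts[0] - 1]
    samples += [(u + v) / 2 for u, v in zip(pts, pts[1:])]
    samples += [pts[-1] + 1]
    return ["+" if f(s) > 0 else "-" for s in samples]


chart = sign_chart(exceptional_points)
print(chart)
```

The entries of `chart` correspond, in order, to the intervals $(-\infty, -5)$, $(-5, -2)$, $(-2, 0)$, $(0, 2)$, $(2, 3)$ and $(3, \infty)$.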

Example 2.17. Find intervals of monotonicity for the degree three polynomial (for a graph see Figure 2.2)
\[ p(x) = x^3 - 3x^2 - 9x + 3. \]

Solution: The function is defined and differentiable on the real line. Its derivative is
\[ p'(x) = 3x^2 - 6x - 9 = 3(x^2 - 2x - 3) = 3(x - 3)(x + 1). \]
Counting the signs of the factors we see that $p'(x)$ is positive on $(-\infty, -1)$ and on $(3, \infty)$. We conclude that $p(x)$ is increasing on the interval $[3, \infty)$ and that it is increasing on the interval $(-\infty, -1]$. The derivative is negative on the interval $(-1, 3)$. The theorem implies that $p(x)$ is decreasing on the interval $[-1, 3]$. ♦

Example 2.18. Find intervals of monotonicity for the rational function
\[ f(x) = \frac{x^2 + 3x}{x - 1}. \]

Solution: The simplified expression for the derivative of $f$ is
\[ f'(x) = \frac{(x + 1)(x - 3)}{(x - 1)^2}. \]
We see that the exceptional points for $f'(x)$ are $x = 1$, $x = -1$ and $x = 3$. We conclude that $f'(x)$ does not change signs on the intervals $(-\infty, -1)$, $(-1, 1)$, $(1, 3)$, and $(3, \infty)$. Counting the signs of the factors of $f'(x)$, we conclude that $f'(x) > 0$ on the intervals $(-\infty, -1)$ and $(3, \infty)$, and $f'(x) < 0$ on the intervals $(-1, 1)$ and $(1, 3)$. Observe that $f(x)$ is defined and differentiable on the entire real line with the only exception of $x = 1$. We conclude that $f(x)$ is increasing on the intervals $(-\infty, -1]$ and $[3, \infty)$. The function is decreasing on the intervals $[-1, 1)$ and $(1, 3]$. ♦

Example 2.19. Find intervals on which the function $f(x) = \sin 2x + 2 \sin x$ is monotonic. Restrict your discussion to the interval $[0, 2\pi]$.

Solution: We differentiate the function and rewrite the expression for the derivative so that it is easier to find its exceptional points.
\[ f'(x) = 2\cos 2x + 2\cos x = 2[2\cos^2 x + \cos x - 1] = 4(\cos x + 1)\left(\cos x - \frac{1}{2}\right). \]

To see the second equality we used that $\cos 2x = 2\cos^2 x - 1$. Then we solved the quadratic equation in terms of $\cos x$. We find exceptional points where $\cos x = -1$ (i.e., $x = \pi$) and where $\cos x = \frac{1}{2}$ (i.e., $x = \frac{\pi}{3}$ and $x = \frac{5\pi}{3}$). This provides us with the intervals $[0, \pi/3)$, $(\pi/3, \pi)$, $(\pi, 5\pi/3)$ and $(5\pi/3, 2\pi]$ on which $f'$ does not change sign. Checking the sign of $f'$ (at one point) in each of the intervals, we find that $f'(x) > 0$ for $x \in [0, \pi/3)$ and $x \in (5\pi/3, 2\pi]$, and $f'(x) < 0$ for $x \in (\pi/3, \pi)$ and $(\pi, 5\pi/3)$. We conclude that $f$ is increasing on the intervals $[0, \pi/3]$ and $[5\pi/3, 2\pi]$. The function is decreasing on the interval $[\pi/3, 5\pi/3]$. Observe that $f$ is differentiable on $[0, 2\pi]$, and in the interval $[\pi/3, 5\pi/3]$ there are three points at which $f'(x)$ is not negative: $f'(x) = 0$ at the end points of this interval and at $x = \pi$.

You may confirm the calculation by having a look at Figure 2.3. There you see the graph of the function (solid line) and the graph of its derivative (dashed line). As you see, wherever $f'(x)$ is positive, there $f(x)$ is increasing. Wherever $f'(x)$ is negative, there $f(x)$ is decreasing. ♦

[Figure 2.3: A function and its derivative.]

Exercise 16. Find intervals on which the function $f$ increases and intervals on which $f$ decreases. In the last two problems, (g) and (h), restrict yourself to the interval $[0, 2\pi]$.
(a) $f(x) = 3x^2 + 5x + 7$
(b) $f(x) = x^3 - 3x^2 + 6$
(c) $f(x) = (x + 3)/(x - 7)$
(d) $f(x) = x + 1/x$
(e) $f(x) = x^3(1 + x)$
(f) $f(x) = x/(1 + x^2)$
(g) $f(x) = \cos 2x + 2\cos x$
(h) $f(x) = \sin^2 x - \sqrt{3} \sin x$

2.3.2 Monotonicity at a Point

It is quite natural to ask what it means that a function is increasing at a point, and how this concept is related to the one of being increasing on an interval. We address both questions in this subsection.

Definition 2.20. Suppose $f$ is a function and $c$ is an interior point of its domain. We say that $f$ is increasing at $c$ if, for some $d > 0$,
\[ f(x) < f(c) \text{ for all } x \in (c - d, c) \quad\text{and}\quad f(x) > f(c) \text{ for all } x \in (c, c + d). \]
We say that $f$ is decreasing at $c$ if this statement holds with the inequalities reversed.

Expressed informally, at least for a while, to the left of $c$ the function is smaller and to the right of $c$ it is larger than at $c$.

Being increasing on an interval is a global property. The global property has to hold for any two points in the given interval. Being increasing or decreasing at a point $c$ is a local property. We are making a statement about the behavior of the function on some open interval which contains $c$. For the local property we compare $f(x)$ to $f(c)$ where $c$ is fixed and $x$ is any point in an open interval around $c$ which we may choose. For the global property the interval is given to us. For the local property we may choose the, possibly rather small, interval.

Theorem 2.21. Suppose $f$ is a function which is defined on an open interval $I$. Then $f$ is increasing (decreasing) on $I$ if and only if it is increasing (decreasing) at each point in $I$.

This theorem establishes the relation between the local and the global property. The 'only if' part is not difficult to show, but the 'if' part uses some deeper facts about finite closed intervals. Our second result gives us a valuable tool to detect monotonicity of functions at a point.

Proposition 2.22. Let $f$ be a function and $c$ an interior point of its domain. If $f$ is differentiable at $c$ and $f'(c) > 0$, then $f$ is increasing at $c$. If $f'(c) < 0$, then $f$ is decreasing at $c$.

Remark 5. A function does not have to be differentiable to be increasing. Graph the function $f(x) = 2x + |x|$ to convince yourself of this fact. A function can also be increasing at a point $x$, even if the assumptions of Proposition 2.22 do not hold; e.g., $f(x) = x^3$ is increasing at $x = 0$, but $f'(0) = 0$. A function can be differentiable and increasing at a point $x$, while there is no open interval which contains $x$ such that the function is increasing on this interval.

Remark 6. The ideas of a function being increasing or decreasing at a point may be generalized to cover domains of functions which are half-closed or closed intervals, and where we like to make a statement about the behavior of a function at an endpoint. We have no specific needs for such statements, but the motivated reader is encouraged to explore them.

2.4 The Second Derivative and Concavity

We like to capture the property of a graph being bent upwards or downwards. You may use the graphs in Figures 2.4 and 2.5 as illustrations of the discussion. Secant lines will either be required to lie above or below the graph, and the rates of change will be either increasing or decreasing. These properties can be described globally over intervals and locally at points.

[Figure 2.4: Concave Up]
[Figure 2.5: Concave Down]

2.4.1 Concavity on Intervals

Let $f(x)$ be a function and let $(a, f(a))$ and $(b, f(b))$ be two distinct points on its graph. The line through these two points is
\[ l(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a). \]
If we restrict $l(x)$ to $x \in [a, b]$, then we get the secant line through the two points, i.e., the line segment joining the two points.

Definition 2.23. Let $f$ be a function which is defined on an interval $I$. We say that $f$ is concave up on $I$ if $f(c) < l(c)$ for all $a$, $b$ in $I$ and $c \in (a, b)$. The inequality expresses that between the points $a$ and $b$ the secant line lies above the graph. We say that $f$ is concave down on $I$ if $f(c) > l(c)$ for all $a$, $b$ in $I$ and $c \in (a, b)$. The inequality expresses that between the points $a$ and $b$ the secant line lies below the graph. Here $l(x)$ is the secant line through $(a, f(a))$ and $(b, f(b))$.

We state a theorem which provides you with assumptions under which a function is concave up or down. We will not provide a proof of the theorem.

Theorem 2.24. Let $f$ be a function which is defined on an interval $I$.
1. Suppose that $f(x)$ is differentiable on $I$. If $f'(x)$ is increasing on $I$, then $f(x)$ is concave up on $I$. If $f'(x)$ is decreasing on $I$, then $f(x)$ is concave down on $I$.
2. Suppose that $f(x)$ is twice differentiable$^3$ on $I$. If $f''(x) > 0$ for all $x$ in $I$, then $f(x)$ is concave up on $I$. If $f''(x) < 0$ for all $x$ in $I$, then $f(x)$ is concave down on $I$.
3. More generally, the conclusions in (2) still hold if in each finite interval $J \subset I$ there are only finitely many points at which the assumption $f''(x) > 0$, resp., $f''(x) < 0$, is not satisfied.

$^3$ Strictly speaking, so far we can consider being 'twice differentiable' only for functions which are defined on open intervals. More generally, we proceed as in Section 1.11. We say that $f(x)$ is twice differentiable on $I$, if $f(x)$ extends to a function $F(x)$ which is defined on an open interval $J$ which contains $I$, and $F(x)$ is twice differentiable on $J$. The second derivative will be unique at all points in $I$ if $I$ is not empty and does not consist of exactly one point.

For example, the function shown in Figure 2.4 is $q(x) = x^2 - 2x + 3$. Its second derivative is $q''(x) = 2 > 0$. Theorem 2.24 (2) says that $q$ is concave up on $(-\infty, \infty)$. The function shown in Figure 2.5 is $g(x) = -x^2 + 5x - 1$, and its second derivative is $g''(x) = -2 < 0$. Theorem 2.24 (2) says that $g$ is concave down on $(-\infty, \infty)$.

The function $\ln x$ is concave down on the interval $(0, \infty)$. To see this, you may note that the derivative $\ln' x = 1/x$ is decreasing on $(0, \infty)$ and apply Theorem 2.24 (1). Alternatively, you may use that $\ln''(x) = -1/x^2 < 0$ on $(0, \infty)$ and apply Theorem 2.24 (2).

The exponential function $\exp(x) = e^x$ is concave up on $(-\infty, \infty)$. To see this, you may note that $\exp''(x) = \exp(x) > 0$ and apply Theorem 2.24 (2). You may also use that $\exp'(x)$ is increasing on the real line, and then quote Theorem 2.24 (1) to derive the desired conclusion. Finally, you may observe that an increasing function is concave up if its inverse is concave down$^4$. So, $\ln x$ being concave down implies that $\exp(x)$ is concave up.

$^4$ If $f$ and $g$ are inverses of each other, then the graph of one of the functions is obtained from the one of the other one by reflection at the diagonal $x = y$. In this process, secant lines which are above the graph turn into secant lines below the graph, and vice versa. Thus, if $f$ is concave up, then $g$ is concave down.

Let us look at examples where we apply condition Theorem 2.24 (3).

Example 2.25. Study the concavity properties of the function $p(x) = x^3 - 3x^2 - 9x + 3$.

Solution: You find the graph of this function in Figure 2.2. Its second derivative is $p''(x) = 6x - 6 = 6(x - 1)$. We see that $p''(x) > 0$ for $x \in (1, \infty)$, and $p''(x) < 0$ for $x \in (-\infty, 1)$. This means that $p''(x) > 0$ for all $x \in [1, \infty)$ with only one exception, $x = 1$. Theorem 2.24 (3) tells us that $p(x)$ is concave up on the interval $[1, \infty)$. Similarly, $p''(x) < 0$ for all $x \in (-\infty, 1]$ with only one exception, $x = 1$. Thus, $p(x)$ is concave down on the interval $(-\infty, 1]$. ♦

Consider the function $\tan x$. You may verify that $\tan'' x = 2\sec^2 x \tan x$. In particular, $\tan'' x < 0$ for $x \in (-\pi/2, 0)$ and $\tan'' x > 0$ for $x \in (0, \pi/2)$. Theorem 2.24 (3) implies that $\tan x$ is concave down on $(-\pi/2, 0]$ and concave up on $[0, \pi/2)$. You may confirm these statements visually by inspecting a graph of the tangent function. You are invited to study the concavity of the other trigonometric and hyperbolic functions.

Remark 7. You may consider the spread of a disease. Denote the number of infected people by $I(t)$. It may be scary if $I'(t) > 0$, i.e., $I(t)$ increases. It is worse, and often true in the early stages of an epidemic, if $I''(t) > 0$.

This means that $I'(t)$ increases, and the disease spreads at an increasing rate. When $I''(t)$ turns negative, then $I'(t)$ decreases. The spread of the disease slows. The point at which $I''(t)$ changes signs from being positive to being negative may be considered the turning point in the spread of the disease. One may hope that eventually $I'(t)$ becomes negative, so that the actual number of sick people decreases. Medical professionals will not necessarily wait for the time when $I(t)$, the number of infected people, starts decreasing. One of the recent presidents was confused by a subtle argument of this kind$^5$.

$^5$ During a televised presidential debate, one of the candidates said (see the New York Times from October 8th, 1984, page B6): "Some of these facts and figures just don't add up. Yes, there has been an increase in poverty but it is a lower rate of increase than it was in the preceding years before we got here. It has begun to decline, but it is still going up."

Let us look at this phenomenon in a concrete example. Earlier we considered the logistic equation $y' = ay - by^2$. See Example 1.75 and the graph of a solution of this differential equation in Figure 1.15. Use implicit differentiation to find the second derivative:
\[ y'' = ay' - 2byy' = (a - 2by)y'. \]
We see that $y'' = 0$ if $y' = 0$ or $y = a/(2b)$. The first case occurs if $y = 0$ or $y = a/b$. We called $y = a/b$ the carrying capacity of the system, and it was the stable equilibrium point. As long as $y$ is less than $a/(2b)$, the population grows at an increasing rate. If $a/(2b) < y < a/b$, then growth slows. The inflection occurs when $y$ is half the carrying capacity. You see the turning point in the graph in Figure 1.15. For a while the population seems to explode, but after a while it levels off so that it does not exceed the carrying capacity.

Exercise 17. Find intervals on which the following functions are concave up, resp., concave down.
1. $f(x) = x^3 - 4x^2 + 8x - 7$
2. $g(x) = x^4 + 2x^3 - 3x^2 + 5x - 2$
3. $h(x) = x + 1/x$
4. $i(x) = 2x^4 - x^2$
5. $j(x) = x/(x^2 - 1)$
6. $k(x) = 2\cos^2 x - x^2$ for $x \in [0, 2\pi]$.
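The logistic computation above, $y'' = (a - 2by)y'$ with $y' = ay - by^2$, can be explored numerically. The sketch below evaluates both expressions for hypothetical parameter values $a = 2$ and $b = 0.5$ (chosen only for illustration, so that the carrying capacity is $a/b = 4$ and the inflection level is $a/(2b) = 2$); the function names are likewise illustrative.

```python
def growth_rate(y, a, b):
    """Right hand side of the logistic equation, y' = a*y - b*y^2."""
    return a * y - b * y * y


def second_derivative(y, a, b):
    """y'' = (a - 2*b*y) * y', obtained by implicit differentiation."""
    return (a - 2 * b * y) * growth_rate(y, a, b)


a, b = 2.0, 0.5                # hypothetical parameters
carrying_capacity = a / b      # stable equilibrium, here 4.0
inflection_level = a / (2 * b) # half the carrying capacity, here 2.0

for y in (1.0, inflection_level, 3.0, carrying_capacity):
    print(y, growth_rate(y, a, b), second_derivative(y, a, b))
```

For a population level below `inflection_level` the second derivative is positive (growth accelerates), above it and below `carrying_capacity` it is negative (growth slows), in line with the discussion.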

2.4.2 Concavity at a Point

The notion of being concave up or down was defined for functions which are defined on intervals. Still, we got a picture of how the function has to look near a point, and this is the behavior which we like to capture in a definition. You see this situation illustrated in two generic pictures in Figures 2.6 and 2.7. One shows a function which is concave up at the indicated point, one shows a function which is concave down.

[Figure 2.6: Concave up at •]
[Figure 2.7: Concave down at •]

Definition 2.26. Let $f$ be a function and $c$ an interior point$^6$ of its domain. We say that $f$ is concave up, resp., concave down, at $c$ if there exists an open interval $I$ containing $c$ and a line $l$, called a support line, such that $l(c) = f(c)$ and $f(x) > l(x)$, resp., $f(x) < l(x)$, for all $x \in I$ with $x \neq c$.

$^6$ The idea of an interior point was defined in Definition 1.18 on page 10.

In other words, we are asking for a line $l(x)$, such that the graph lies on one side of the line, at least near $c$. We assume that the graph and the line agree at $c$. If the graph is above the line, then the function is concave up, if it is below, then the function is concave down.

Our next theorem tells us how to detect concavity, and it tells us how to find the support line if the function is differentiable.

Theorem 2.27. Let $f$ be a function and $c$ an interior point of its domain.
1. If $f$ is differentiable and concave up or down at $c$, then there is only one support line, and this line is the tangent line to the graph of $f$ at $c$.
2. If $f'$ is increasing at $c$ or if $f''(c) > 0$, then $f$ is concave up at $c$.
3. If $f'$ is decreasing at $c$ or if $f''(c) < 0$, then $f$ is concave down at $c$.

In general, there can be many support lines at any given point, but if the function is differentiable at $c$, then the support line is unique. It is the tangent line. So, for a differentiable function which is concave up or down at a point, we can draw the tangent line easily. We just hold the ruler against the graph.

The sign of the second derivative of a function tells us whether a function is concave up or down at a point. If the second derivative is zero, then the test is inconclusive. The function can be concave up, down, or neither. For example, the function $f(x) = x^5 - 7x^4 + 2x^3 + 2x^2 - 5x + 4$ is concave down at $x = 2$ because $f''(2) = -148 < 0$.

To relate concavity properties on an interval to those at each point in the interval we state, without proof, the following theorem.

Theorem 2.28. Let $f$ be a function which is defined on an open interval $(a, b)$. Then $f$ is concave up (resp., down) on $(a, b)$ if and only if $f$ is concave up (resp., down) at each point in $(a, b)$.

2.5 Local Extrema and Inflection Points

We are going to discuss two types of points which are particularly important in the discussion of (graphs of) functions. As we like to apply local properties of the function, we focus on interior points in the domain of the function.

Definition 2.29 (Local Extrema). Let $f$ be a function and $c$ an interior point in its domain$^7$. We say that $f$ has a local maximum, resp., minimum, at $c$ if $f(c) \geq f(x)$, resp., $f(c) \leq f(x)$, for all $x$ in some open interval $I$ around $c$. In this case we call $f(c)$ a local maximum, resp., minimum, of $f$. A local extremum is a local maximum or minimum.

$^7$ According to Definition 1.18 on page 10 this means that $f(x)$ is defined for all $x$ in some open interval around $c$.

In other words, $f$ has a local maximum of $f(c)$ at $c$, if $f(c)$ is the largest value compared to the values at points near $c$. The function shown in Figure 2.8 has a local minimum at $x = -1$. We do not need any test to see that $f(x) = |x|$ has a local minimum at $x = 0$, and $f(x) = -(x - 1)^2$ has a local maximum at $x = 1$. The vertex of a parabola is always a local extremum, a local minimum if the coefficient of $x^2$ is positive, and a local maximum if the coefficient of $x^2$ is negative. We will study tests which allow us to find local extrema soon.

[Figure 2.8: A local minimum]
[Figure 2.9: An Inflection Point]

Definition 2.30 (Inflection Points). Let $f$ be a function and $c$ an interior point of its domain. We call $c$ an inflection point of $f$ if the concavity of $f$ changes at $c$. I.e., for some numbers $a$ and $b$ with $a < c < b$, we have that $f$ is concave up on the interval $(a, c]$ and concave down on $[c, b)$, or vice versa.

No test is required to see that $f(x) = \tan x$ has an inflection point at $x = 0$. You see the graph of this function in Figure 2.9. The function is concave down on the interval $(-\pi/2, 0]$ and concave up on the interval $[0, \pi/2)$. So the concavity changes at $x = 0$ and that means that there is an inflection point at $x = 0$. Soon we will develop tests which detect inflection points.

33 (Saddle Points). or if f is not differentiable at c. there are points x to the left of and arbitrarily close to c such that f (x) < f (c). If a function has a local extremum at c. c + d). If c is a critical point of the function. Let f be a function and c an interior point of its domain. and there are points x to the right of and arbitrarily close to c such that f (x) > f (c). then c is a critical point of the function. Definition 2. then either f (c) > 0 or f (c) < 0. It makes sense to introduce one more word.31. No local extrema can occur at points which are not critical. We say that c is a critical point of f if f is differentiable at c and f (c) = 0. DETECTION OF LOCAL EXTREMA 83 2. This means. but f does not have a local extremum at c.6. To see this. These functions have no critical points.31. Typically. and f (x) > f (c) for all x ∈ (c. Let f be a function and c an interior point of its domain. Proof of Theorem 2. then the same argument applies with inequalities reversed. c). it is customary to say: Definition 2. Theorem 2. To have an abbreviation for the points which are recognized as important in this theorem. If f (x) < 0. and in neither case we have an extremum at c.31 they have no local extrema.32 (Critical Points). So. The first result excludes many points. such that f (x) < f (c) for all x ∈ (c − d. and their derivatives exp x = exp x and ln x = 1/x are everywhere nonzero. In other words. If f (c) = 0. We say that c is a saddle point of f if f is differentiable at c and f (c) = 0. Let f be a function and c an interior point of its domain. . If f is differentiable at c and f (c) = 0.31 provides us with a necessary condition. The test does not give a sufficient condition for a local extremum. observe that these functions are differentiable on their domain. Theorem 2. there are very few points where local extrema can occur. by definition. if f has a local extremum at c.6 Detection of Local Extrema We will discuss how to detect local extrema. 
that f does not have a local extremum at c. Neither the exponential function nor the logarithm function have local extrema. and according to Theorem 2.2. then f does not have a local extremum at c. then f is either not differentiable at c or f (c) = 0. Proposition 2.22 on page 76 tells us that there exists some positive number d. then the function need not have a local extremum at c. Suppose that f is differentiable at c and f (c) > 0.

Example 2.34. Find the local extrema of the function $q(x) = x^2 - 2x + 3$.

Solution: The function is differentiable for all real numbers $x$, and $q'(x) = 2x - 2 = 2(x - 1)$. So $q'(x) = 0$ if $x = 1$. The only point at which we can have a local extremum, i.e., the only critical point, is $x = 1$. If we write the function in the form $q(x) = (x - 1)^2 + 2$, then we see that $q$ does indeed have a local minimum at $x = 1$. You should confirm this result by having a look at Figure 2.10, where this function is graphed. ♦

[Figure 2.10: A local minimum]
[Figure 2.11: A saddle point]

Example 2.35. Show that the function $g(x) = x^3$ has a saddle point at $x = 0$.

Solution: The function $g(x)$ is everywhere differentiable, and its only critical point is at $x = 0$, which is the only zero of $g'(x) = 3x^2$. Obviously, $g(x) > 0$ for all $x \in (0, \infty)$ and $g(x) < 0$ for all $x \in (-\infty, 0)$. This means that there is no local extremum at $x = 0$. As $g'(0) = 0$ and there is no local extremum at $x = 0$, the function has a saddle point at this point. This saddle point is shown in Figure 2.11. ♦
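The difference between the local minimum of $q(x) = x^2 - 2x + 3$ at $x = 1$ and the saddle point of $g(x) = x^3$ at $x = 0$ can be seen by comparing function values just left and right of the critical point. The sketch below does only that: it samples one point on each side at a hypothetical distance `d`, so it is a heuristic illustration of the definitions and not a proof.

```python
def q(x):
    """Example 2.34: q(x) = x^2 - 2x + 3, critical point at x = 1."""
    return x * x - 2 * x + 3


def g(x):
    """Example 2.35: g(x) = x^3, critical point at x = 0."""
    return x ** 3


def compare_neighbors(func, c, d=1e-3):
    """Compare func(c) with func(c - d) and func(c + d).

    Returns a label suggested by these two samples only. The step d is
    an arbitrary small number chosen for illustration.
    """
    left, mid, right = func(c - d), func(c), func(c + d)
    if left > mid and right > mid:
        return "looks like a local minimum"
    if left < mid and right < mid:
        return "looks like a local maximum"
    return "no local extremum suggested"


print("q at 1:", compare_neighbors(q, 1.0))
print("g at 0:", compare_neighbors(g, 0.0))
```

Since $g'(0) = 0$ and the samples on the two sides of $0$ lie on opposite sides of $g(0)$, the output for `g` matches the saddle point found in Example 2.35.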

Let us formulate a criterion which confirms that a function has a local extremum at a point c. It gives us a sufficient condition for a local extremum at c.

Theorem 2.36. Suppose c is an interior point of the domain of a function f, and suppose that for some d > 0 the function is increasing on (c − d, c] and decreasing on [c, c + d). Then f has a local maximum at c. If the function is decreasing on (c − d, c] and increasing on [c, c + d), then f has a local minimum at c.

Taking advantage of the information provided by the first derivative, we obtain the following test.

Theorem 2.37 (First Derivative Test). Suppose f is a function which is defined and differentiable on (c − d, c + d) for some d > 0, and c is a critical point.
1. If f'(x) > 0 for all x ∈ (c − d, c) and f'(x) < 0 for all x ∈ (c, c + d), then f has a local maximum at c.
2. If f'(x) < 0 for all x ∈ (c − d, c) and f'(x) > 0 for all x ∈ (c, c + d), then f has a local minimum at c.
3. If f'(x) > 0 for all x ∈ (c − d, c) ∪ (c, c + d), then f has a saddle point at c. This conclusion also holds if f'(x) < 0 for all x ∈ (c − d, c) ∪ (c, c + d).

Let us illustrate the use of the theorem with an example.

Example 2.38. Find the local extrema of the function f(x) = x^3 − 3x^2 + 2x + 2.

Solution: We differentiate f(x) and express f'(x) as a product of linear factors:

    f'(x) = 3x^2 − 6x + 2 = 3 (x − [1 + √3/3]) (x − [1 − √3/3]).

It is easy to determine where the factors are zero, positive and negative. We conclude that f'(x) = 0 if x = 1 ± √3/3, f'(x) is positive on the intervals (−∞, 1 − √3/3) and (1 + √3/3, ∞), and f'(x) is negative on the interval (1 − √3/3, 1 + √3/3). You can see graphs of f and f' in Figures 2.12 and 2.13.

[Figure 2.12: f(x) = x^3 − 3x^2 + 2x + 2]
[Figure 2.13: f'(x) = 3x^2 − 6x + 2]

The only critical points of f are at x = 1 ± √3/3, and these are the only points where a local extremum can occur. Based on the sign of f'(x) on intervals to the left and right of these two critical points we see that f has a local maximum at x = 1 − √3/3 and a local minimum at x = 1 + √3/3. ♦

Exercise 18. Find the local extrema of the following functions:
(1) f(x) = (x^2 + 3x)/(x − 1)
(2) g(x) = sin 2x + 2 sin x for x ∈ [0, 2π].
Hint: We discussed the monotonicity properties of these functions in Examples 2.18 and 2.19.

Exercise 19. Find the local extrema of the following functions.
(a) f(x) = 3x^2 + 5x + 7
(b) f(x) = x^3 − 3x^2 + 6
(c) f(x) = (x + 3)/(x − 7)
(d) f(x) = x + 1/x
(e) f(x) = x^3 (1 + x)
(f) f(x) = x/(1 + x^2)
(g) f(x) = cos 2x + 2 cos x
(h) f(x) = sin^2 x − √3 sin x
Hint: You discussed the intervals of monotonicity for these functions in Exercise 16. In the last two problems, (g) and (h), restrict yourself to the interval [0, 2π].
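The sign pattern used in Example 2.38 can be reproduced with a few lines of code. This is a sketch under stated assumptions, not the author's method: the function name `first_derivative_test` and the offset `d` are made up for illustration, and the derivative is sampled at only one point on each side of c, whereas Theorem 2.37 asks for the sign on a whole interval.

```python
import math


def fprime(x):
    # derivative of f(x) = x^3 - 3x^2 + 2x + 2 from Example 2.38
    return 3*x**2 - 6*x + 2


def first_derivative_test(df, c, d=1e-2):
    """Read off the sign pattern of Theorem 2.37 from one sample
    of the derivative df on each side of the critical point c."""
    left, right = df(c - d), df(c + d)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    if left * right > 0:
        return "saddle point"
    return "inconclusive"


c1 = 1 - math.sqrt(3) / 3
c2 = 1 + math.sqrt(3) / 3
print(c1, first_derivative_test(fprime, c1))
print(c2, first_derivative_test(fprime, c2))
```

The offset d has to be smaller than the distance to the next critical point; otherwise the samples say nothing about the behaviour near c.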

We may use the second derivative to detect the change of sign of the first derivative, as it is called for in the assumptions in Theorem 2.37.

Theorem 2.39 (Second Derivative Test). Let f be a function and c an interior point in its domain. Assume also that f'(c) and f''(c) exist and that f'(c) = 0.
If f''(c) > 0, then f has a local minimum at c.
If f''(c) < 0, then f has a local maximum at c.

In this sense, the test provides us with a sufficient condition for the existence of a local extremum at a point. It does not provide us with a necessary condition. In particular, the function f can have a local extremum at c, and the assumptions of the test are not satisfied. If f'(c) = f''(c) = 0, then the test is inconclusive. There may or may not be a local extremum at c.

To apply the theorem to the detection of the local extrema of a differentiable function f(x), we proceed as follows. First, we differentiate f and find the critical points, the zeros of f'(x). Then we differentiate f'(x). The sign of f'' at the critical points tells us whether we found a local minimum or a local maximum.

Example 2.40. Find the local extrema of the function (for a graph, see Figure 2.2 on page 72)

    p(x) = x^3 − 3x^2 − 9x + 3.

Solution: We calculate the first derivative, p'(x) = 3x^2 − 6x − 9 = 3(x + 1)(x − 3). The critical points of the function are x = −1 and x = 3. Furthermore, p''(x) = 6x − 6 = 6(x − 1). In particular, p''(−1) = −12 and p''(3) = 12. The second derivative test tells us that we have a local maximum at x = −1, because this is a critical point and p''(−1) < 0. We also have a local minimum at x = 3 because at this critical point the second derivative of the function is positive. ♦

Proof of the Second Derivative Test. First, let us assume that f'(c) = 0 and f''(c) > 0. We will show that f has a local minimum at c. The assumption that f'(c) = 0 means that the tangent line to the graph of f at (c, f(c)) is horizontal. Its equation is l(x) = f(c). The assumption that f''(c) > 0 means that f is concave up at c (see Theorem 2.27 (1)). Spelled out explicitly this means that

    f(x) > l(x) = f(c)

for some positive number d and for all x ∈ (c − d, c) ∪ (c, c + d). In other words, f has a local minimum at c. The proof that f has a local maximum at c if f'(c) = 0 and f''(c) < 0 is similar. We leave it to the reader.

Exercise 20. Find the critical points and the local extrema.
(a) f(x) = 4x^2 − 7x + 13
(b) f(x) = x^3 − 3x^2 + 6
(c) f(x) = x + 3/x
(d) f(x) = x^2 (1 − x)
(e) f(x) = |x^2 − 16|
(f) f(x) = x^2/(1 + x^2).

2.7 Detection of Inflection Points

We defined an inflection point to be a point at which the concavity of a function changes. If we know where the function is concave up and down, then we can just answer this question. We want to detect inflection points more efficiently. A theorem provides a necessary and a sufficient condition for the existence of an inflection point. Let us start out with an example.

[Figure 2.14: The graph of g]
[Figure 2.15: The graph of g']

Example 2.41. Find the inflection points of the function g(x) = x^3 − 4x^2 + 3x − 5.

We calculate the first and second derivative of g:

    g'(x) = 3x^2 − 8x + 3 and g''(x) = 6x − 8.

From the formula for the second derivative we conclude that g''(x) < 0 if x ∈ (−∞, 4/3) and that g''(x) > 0 if x ∈ (4/3, ∞). This means that g is concave down on the interval (−∞, 4/3] and concave up on [4/3, ∞). By definition, we have an inflection point at x = 4/3. You see the graph of g in Figure 2.14 and the one of g' in Figure 2.15. You see the inflection point indicated as a dot in Figure 2.14. You also see that g'(x) has a local extremum at the same point. ♦

Theorem 2.42. Let f be a function and c an interior point of its domain. Suppose that the first and second derivatives of f exist at c.
1. If f has an inflection point at c, then f''(c) = 0.
2. If f''(c) = 0, f'''(c) exists and f'''(c) ≠ 0, then f has an inflection point at c.

Example 2.43. Find the inflection points of f(t) = 2t^4 − 6t^3 + 5t^2 − 7t + 4.

Solution: We calculate the second derivative of the function and find f''(t) = 24t^2 − 36t + 10. According to the theorem, we have to find the zeros of f''(t) to determine where an inflection point can be. The roots are

    t = 3/4 ± (1/12)√21 = (9 ± √21)/12.

Now, let us check whether there are inflection points at either of these values for t. We calculate the third derivative of f: f'''(t) = 48t − 36. We could plug t = (9 ± √21)/12 into the expression for f''', but this is a bit cumbersome. We see right away that f'''(t) = 0 exactly if t = 3/4, and this means that f'''((9 ± √21)/12) ≠ 0. The theorem says that the inflection points of f(t) are at t = (9 ± √21)/12. ♦
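The two conditions of Theorem 2.42 (2) used in Example 2.43 are easy to confirm numerically. The snippet below is only a sanity check, written under the assumption that the derivatives are entered by hand as in the example; the names `f2`, `f3` and `roots` are illustrative.

```python
import math


def f2(t):
    # second derivative of f(t) = 2t^4 - 6t^3 + 5t^2 - 7t + 4
    return 24*t**2 - 36*t + 10


def f3(t):
    # third derivative of f
    return 48*t - 36


roots = [(9 - math.sqrt(21)) / 12, (9 + math.sqrt(21)) / 12]

for t in roots:
    # f''(t) should vanish up to rounding, and f'''(t) should be nonzero
    print(f"t = {t:.4f}, f''(t) = {f2(t):.1e}, f'''(t) = {f3(t):.4f}")
```

The printed values of f'' are zero up to floating point rounding, while f''' is clearly nonzero at both roots, in agreement with the argument that f''' vanishes only at t = 3/4.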

Example 2.44. Find the inflection points of the function f(x) = 1.2 + x^2 − 3(sin x)^3.

Solution: Apparently, it will take an effort to calculate the second derivative of this function, and it will be nearly impossible to find the zeros of f''. Any reasonable software has no problem with this. We asked the computer to graph f and f'' for x ∈ [−3, 3]. You see the graphs in Figures 2.16 and 2.17. A look at the graph of f barely reveals some of the inflection points, but the graph of f'' shows them clearly. Zooming in on parts of the graph of f will not improve this. At least in this example, the graph of f'' tells us much more about the concavity of the function f than its own graph. ♦

[Figure 2.16: The graph of f]
[Figure 2.17: The graph of f'']

Apparently, our ability to find inflection points of a function is limited by our ability to find the zeros of its second derivative. If we are given graphical information, then this is quite easy.

Exercise 21. Discuss the relation between the inflection points of a function f and the local extrema of its derivative f'.

2.8 Absolute Extrema of Functions

We said that a function f has a local maximum at c if its value at c is largest in comparison to the values at points near c. In many cases we like to find the maximal value of a function, and where it occurs, anywhere in the domain of the function. This concept is captured in

Definition 2.45. Let f be a function, and c a point in its domain. We say that f has an absolute maximum at c if f(x) ≤ f(c) for all x in the domain of f. Then we call f(c) the absolute maximum of f. If f(x) ≥ f(c) for all x in the domain of f, then we say that f has an absolute minimum at c, and we call f(c) the absolute minimum of f. A different expression is to say that the function assumes its absolute extremum at c.

Theorem 2.46. A continuous function on a closed interval [a, b] assumes its absolute maximum and minimum either at a critical point or at an endpoint of the interval.

Proof. In Theorem 1.17 we asserted that a continuous function assumes its absolute maximum at some point in the interval. If the function does not assume its absolute maximum at an endpoint, then it does so at some interior point c, and the function has a local maximum at c. If f is differentiable at c, then f'(c) = 0 by Theorem 2.31, and c is critical. If f is not differentiable at c, then c is critical as well. The argument for the absolute minimum is left to the reader.

[Figure 2.18: x^3 − 5x^2 + 6x + 1]
[Figure 2.19: 3x^2 − 10x + 6]

Example 2.47. Find the absolute extrema of the function f(x) = x^3 − 5x^2 + 6x + 1 for x ∈ [0, 4].

Solution: According to Theorem 2.46, the absolute extrema of the function occur either at one of the end points x = 0, x = 4, or at a critical

point. The critical points, i.e., the zeros of f'(x) = 3x^2 − 10x + 6, are x = (5 ± √7)/3. Approximate values of these roots are 2.5486 and 0.7848. The approximate values of the function at these points are

    f((5 + √7)/3) = 0.3689 and f((5 − √7)/3) = 3.1126.

In fact, f(0) = 1 and f(4) = 9. Comparing the values of f(x) at these four points, we conclude that the function assumes its absolute maximum of 9 at x = 4, and its absolute minimum of approximately 0.3689 at x = (5 + √7)/3. You may also check that f''(x) = 6x − 10, and

    f''((5 + √7)/3) > 0 and f''((5 − √7)/3) < 0.

The second derivative test tells us that the function has a local minimum at x = (5 + √7)/3 and a local maximum at x = (5 − √7)/3. You may compare our calculation with the graphs of f in Figure 2.18 and the one of f' in Figure 2.19. ♦

Exercise 22. Find the absolute extrema of the functions on the indicated intervals.
(a) f(x) = x^2 − 5x + 2 for x ∈ [0, 5]
(b) f(x) = x^3 + 3x^2 − 5x + 2 for x ∈ [−3, 0.5]
(c) f(x) = √(2 + x)/√(1 + x) for x ∈ [0, 5]
(d) f(x) = cos 2x + 2 cos x for 0 ≤ x ≤ 2π
(e) f(x) = sin x + cos x for 0 ≤ x ≤ 2π

2.9 Optimization Story Problems

Many real-life problems are formulated as optimization problems. Calculus helps us to solve these optimization problems. We consider a few examples and give some problems for practice. To avoid lengthy introductions to real-life problems, we content ourselves with problems of an algebraic or geometric nature.

Example 2.48. Cut a string of length 50 centimeters into two pieces. Use one piece as the perimeter of an equilateral triangle and the other one as the perimeter of a disk. How long should each piece be, so that the combined

area of the triangle and the circle is minimal? How long should each piece be, so that the combined area of the triangle and the circle is maximal?

In our solution we will go through several steps.

Introduction of notation: There are many ways to set up the notation to solve this problem. Among them we say that the side length of the triangle is a and the radius of the circle is r.

Express information as equations: The perimeter of the triangle will be 3a and the perimeter of the circle will be 2πr. This means that

    3a + 2πr = 50 and a = (50 − 2πr)/3.

The height of the triangle is h = (a/2)√3, and its area is (a^2/4)√3. The area of the disk is πr^2. The combined area of the triangle and disk is

    A = (√3/4) a^2 + πr^2 = (√3/4) ((50 − 2πr)/3)^2 + πr^2.

For this to make sense, we need that 0 ≤ r ≤ 25/π.

Formulate the problem mathematically: Find the absolute minimum (maximum) of the function

    A(r) = (√3/4) ((50 − 2πr)/3)^2 + πr^2

for r ∈ [0, 25/π].

Solve the mathematical problem: The derivative of A(r) is

    A'(r) = −(√3/4) · ((50 − 2πr)/3) · (4π/3) + 2πr = −(π/√3) · ((50 − 2πr)/3) + 2πr,

and A'(r) = 0 if and only if r = 50/(2π + 6√3). We note that the graph of A(r) is a parabola which is open upwards. The critical point, which we just found, is where the local minimum occurs. It is also the absolute minimum of A(r) on any interval which contains the critical point. For the end points we have: A(0) ≈ 120.28 and A(25/π) ≈ 198.94.

Answer the original question: The combined area of the disk and the triangle will be minimal if r = 50/(2π + 6√3), and it will be maximal if r = 25/π. In the latter case, all string is used for the circle. ♦

Exercise 23. Repeat the previous example with
1. a disk and a square.

2. an equilateral triangle and a square.
3. a regular hexagon and a square.
4. a disk and half an equilateral triangle (the angles at 30, 60 and 90 degrees).
5. two geometric shapes of your own choice.

Example 2.49. Construct an open box from a rectangular piece of card board of length L and width W. What are the dimensions of the box with the largest possible volume?

In our solution we will go through several steps.

Clarification and introduction of notation: We construct the box by making an incision at a 45 degree angle at each corner. Then we fold up a strip of width x along each side [8]. To simplify matters, we call the longer side of the rectangle L and the shorter one W. For yourself, draw a picture of this production process, and convince yourself that any box obtained by a different process will have smaller volume.

[8] You could have cut out a square of size x × x at each corner.

Express information as equations: As we folded up a strip of width x, the box will have width W − 2x, length L − 2x, height x, and volume

    V(x) = (W − 2x)(L − 2x)x = WLx − 2(L + W)x^2 + 4x^3.

By construction, x ≥ 0, x ≤ W/2, and x ≤ L/2; in fact, x ≤ W/2 suffices, because W ≤ L.

Formulate the problem mathematically: Find the absolute maximum of the function

    V(x) = WLx − 2(L + W)x^2 + 4x^3

for x ∈ [0, W/2].

Solve the mathematical problem: At the end points of the interval V vanishes, i.e., V(0) = V(W/2) = 0. On the interior of the interval the function is positive. The derivative of V is

    V'(x) = WL − 4(W + L)x + 12x^2.

The zeros of V' are at

    x = (1/6) [(L + W) ± √(L^2 + W^2 − LW)].

The function has an inflection point at (W + L)/6, to the right of which V'' is positive and V is concave up, and to the left of which V is concave down. We conclude that V has a local maximum at

    x = (1/6) [(L + W) − √(L^2 + W^2 − LW)].

As the function V(x) has only one local maximum in the interval, the local maximum is the same as the absolute maximum.

Answer the original question: The box with the largest volume will have a height of

    x = (1/6) [(L + W) − √(L^2 + W^2 − LW)].

Its width will be W − 2x and its length L − 2x. ♦

Exercise 24. Repeat the previous example with specific numbers for the width and length of the piece of card board.

Exercise 25. Start out with an equilateral piece of card board with side length a. Make incisions at the corners, and fold up strips along the edges. You will get an open box whose base is an equilateral triangle. How broad should the folded up strips be, so that the volume of the box is maximal?

Exercise 26. Modify the problem from above, constructing a box with a round base from a circular piece of card board.

Exercise 27. What is the largest possible volume for a right circular cone of slant height a?

Example 2.50. Determine the rectangle of maximal area which can be placed between the x-axis and the graph of the function f(x) = sin x.

Solution: Draw a graph of sin x so that you can follow the discussion. Convince yourself that the vertices of the rectangle should be (x, 0), (x, sin x), (π − x, sin x) and (π − x, 0) for some x ∈ [0, π/2]. The width of the rectangle is π − 2x and its height is sin x, so that its area is A(x) = (π − 2x) sin x. We need to find the absolute maximum for this function for x ∈ [0, π/2]. The first derivative of this function is

    A'(x) = −2 sin x + (π − 2x) cos x.

After a simple algebraic simplification, you find that A'(x) = 0 if and only if

    tan x = (π − 2x)/2.

Find an approximate solution of the equation using Newton's method or your calculator. A fairly good approximation of the zero of A'(x) is x_0 = 0.710462. Convince yourself [9] that this is the only zero of A'(x) for x ∈ [0, π/2]. We conclude that x_0 is the only critical point of A(x). You may calculate A''(x). Substituting x_0 you will see that A''(x_0) < 0. It follows from the second derivative test that A(x) has a local maximum at x_0. Apparently A(x) = 0 at the end points x = 0 and x = π/2 of the interval. This tells us that A(x) assumes its absolute maximum at x_0. With this, the final answer to our problem is: The rectangle of maximal area which can be placed between the x-axis and the graph of the sine function will have a width of approximately π − 2x_0 = 1.72066 and a height of sin x_0 = 0.652183. Its area will be about 1.12218. ♦

[9] One possible argument is that tan x is increasing on the interval [0, π/2), and that (π − 2x)/2 is decreasing. So these functions can intersect in only one point.

To find the absolute extrema of a continuous function on an interval of the form [a, b] we could inspect the values of the function at the critical points and at a and b. Our next result allows us to do the same even if the interval is not closed and bounded. It allows us to decide whether a local extremum is also an absolute one.

Theorem 2.51. Suppose f is defined on an interval I.
(a) If f is concave up on I and has a local minimum at x_0, then f assumes its absolute minimum at x_0.
(b) If f is concave down on I and has a local maximum at x_0, then f assumes its absolute maximum at x_0.

The assumptions of this theorem are satisfied in many applied problems.

Example 2.52. Find the absolute minimum of the function f(x) = x + 1/x for x ∈ (0, ∞).

Solution: We calculate the first and second derivative of f(x):

    f'(x) = 1 − 1/x^2 and f''(x) = 2/x^3.

We find that f'(x) = 0 if x = 1, and that f''(x) > 0 for all x in (0, ∞). So f has a local minimum at x = 1, and f is concave up on (0, ∞). Theorem 2.51 tells us that the absolute minimum of the function is f(1) = 2. ♦
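Returning to Example 2.50, the approximation of the zero of A'(x) can be reproduced with Newton's method. The sketch below is one possible way to do it and is not taken from the text: the function names, the starting value 0.7 and the fixed number of iterations are assumptions chosen for illustration, and the formula for A''(x) is obtained by differentiating A'(x) by hand.

```python
import math


def A(x):
    # area of the rectangle in Example 2.50
    return (math.pi - 2*x) * math.sin(x)


def dA(x):
    # A'(x) = -2 sin x + (pi - 2x) cos x
    return -2*math.sin(x) + (math.pi - 2*x) * math.cos(x)


def d2A(x):
    # A''(x) = -4 cos x - (pi - 2x) sin x
    return -4*math.cos(x) - (math.pi - 2*x) * math.sin(x)


def newton(g, dg, x, steps=20):
    """Plain Newton iteration x <- x - g(x)/dg(x) with a fixed number of steps."""
    for _ in range(steps):
        x = x - g(x) / dg(x)
    return x


x0 = newton(dA, d2A, 0.7)
print("x0     =", x0)
print("A''(x0) =", d2A(x0))          # negative, so a local maximum
print("width  =", math.pi - 2*x0)
print("height =", math.sin(x0))
print("area   =", A(x0))
```

Newton's method is applied here to the equation A'(x) = 0, so its "derivative" argument is A''(x); the same value A''(x_0) then serves for the second derivative test.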

Exercise 28. One side of a rectangular meadow is bounded by a cliff, the other three sides by straight fences. The total length of the fence is 600 meters. Determine the dimensions of the meadow so that its area is maximal.

Exercise 29. A rectangular warehouse will have 5000 m^2 of floor space and will be separated into two rectangular rooms by an interior wall. The cost of the exterior walls is $1,000.00 per linear meter and the cost of the interior wall is $600.00 per linear meter. Find the dimensions of the warehouse that minimizes the construction cost.

Exercise 30. Find the largest possible area for a rectangle with base on the x-axis and upper vertices on the curve y = 4 − x^2.

Exercise 31. Draw a rectangle with one vertex at the origin (0, 0) in the plane, one vertex on the positive x-axis, one vertex on the positive y-axis, and one vertex on the line 3x + 5y = 15. What are the dimensions of a rectangle of this kind with maximal area?

Exercise 32. Two hallways, one 8 feet wide and one 6 feet wide, meet at a right angle. Determine the length of the longest ladder that can be carried horizontally from one hallway into the other one.

Exercise 33. Inscribe a right circular cylinder into a right circular cone of height 25 cm and radius 6 cm. Find the dimensions of the cylinder if its volume is to be a maximum.

Exercise 34. A right circular cone is inscribed in a sphere of radius R. Find the dimensions of the cone if its volume is to be maximal.

Exercise 35. Find the dimensions of a right circular cone of minimal volume, so that a ball of radius 10 centimeters can be inscribed.

Exercise 36. Consider a triangle in the plane with vertices (0, 0), (a, 0), and (0, b). Suppose that a and b are positive, and that (2, 5) lies on the line through the points (a, 0) and (0, b). What should the slope of the line be, so that the area of the triangle is minimal?

Exercise 37. Minimize the cost of the material needed to make a round drum with a volume of 200 liter (i.e., 0.2 m^3) if
(a) the drum has a bottom and a top, and the same material is used for the top, bottom and sides.
(b) the drum has no top (but a bottom) and the same material is used for the bottom and sides.

(c) the drum has a bottom and a top, the same material is used for the top and bottom, and the material for the top and bottom is twice as expensive as the material for the sides.
(d) the situation is as in the previous case, but the top and the bottom are cut out of squares, and the left over material is recycled for half its value.

Exercise 38. Consider a box with a round base and no lid whose interior is subdivided into six wedge shaped sectors. Which shape should it have, so that its volume is maximal, assuming you are allowed a fixed amount of material? More specifically determine the ratio of radius and height which will maximize the volume.

Exercise 39. Design a roman window with a perimeter of 4 m which admits the largest amount of light. (A roman window has the shape of a rectangle capped by a semicircle.)

Exercise 40. A rectangular banner has a red border and a white center. The width of the border at top and bottom is 15 cm, and along the sides 10 cm. The total area is 1 m^2. What should be the dimensions of the banner if the area of the white area is to be maximized?

Exercise 41. A power line is needed to connect a power station on the shore line to an island 2 km off shore. The point on the coast line closest to the island is 6 km from the power station, and, for all practical purposes, you may suppose that the shore line is straight. To lay the cable costs $40,000 per kilometer under ground and $70,000 under water. Find the minimal cost for laying the cable.

Exercise 42. Consider the distance D(x) between a point P(x) = (x, f(x)) on the graph of a differentiable function f(x) and a point Q = (x_0, y_0) not on this graph. Suppose D(x) has a local minimum at x_1. Then the tangent line to the graph of f at x_1 intersects the line joining P(x_1) and Q perpendicularly.

2.10 Sketching Graphs

The techniques which we developed so far provide us with some valuable tools for graphing functions. Let us make a list of data which we may determine, so that we can sketch a graph rather precisely. Going through the following

program is also a good review of the material which we developed in this chapter.

Useful information for graphing a function: We call the function f(x).
(a) Plot some points on the graph, such as the y-intercept. If the function is given on a closed interval, plot the values at its endpoints.
(b) Plot the zeros of the function. If you cannot find the zeros by analytical means, try it numerically (Newton's method).
(c) If possible, decide on which intervals the function is positive, resp., negative.
(d) Find the first derivative f'(x) of f(x).
(e) Repeat (b) and (c) with f'(x) in place of f(x). The zeros of f'(x) provide you with the critical points of f(x). Plot the critical points (x and y value). Intervals on which f'(x) is positive give you intervals on which f(x) is increasing, and intervals on which f'(x) is negative give you intervals on which f(x) is decreasing. Keep track of the intervals on which the function is increasing, resp., decreasing.
(f) Find the second derivative f''(x) of f(x).
(g) Repeat (b) and (c) with f''(x) in place of f(x). Intervals on which f''(x) is positive give you intervals on which f(x) is concave up, and intervals on which f''(x) is negative give you intervals on which f(x) is concave down. Find the inflection points of the function, i.e., the points where the concavity changes. Plot the inflection points (x and y value), and keep track of the intervals on which the function is concave up, resp., concave down.
(h) Decide at which critical points of f(x) the function has a saddle point or local extremum, and whether it is a minimum or a maximum.

If you now draw a graph which exhibits all of the properties which you gathered in the course of the suggested program, then your graph will look very much like the graph of f(x). More importantly, the graph will have all of the essential features of the graph of f(x). Let us go through the program in an example.

Example 2.53. Discuss the graph of the function

    f(x) = x^4 − 2x^3 − 3x^2 + 8x − 4 for x ∈ [−3, 3].

Solution: To make the discussion a little easier, we note that

(2.1)    f(x) = (x − 1)^2 (x^2 − 4) = (x − 1)^2 (x − 2)(x + 2).

You should verify this by multiplying out the expression for f(x) in (2.1).

(a): Plot the y intercept of the function and its values at the end points of the given interval: f(−3) = 80, f(0) = −4 and f(3) = 20.

(b): Having written f(x) as in (2.1), we see right away that f(x) = 0 if and only if x = −2, x = 1, or x = 2. Plot these x-intercepts.

(c): Counting the signs of the factors of f(x), we see that f(x) is positive on the intervals [−3, −2) and (2, 3], and negative on (−2, 1) and (1, 2). The only exceptional points are its zeros.

(d): As a polynomial, the function f(x) is differentiable on the given interval. We calculate the derivative of f(x):

    f'(x) = 2(x − 1)(x^2 − 4) + (x − 1)^2 · 2x = 2(x − 1)(2x^2 − x − 4).

We based the calculation on the description of f(x) in (2.1). In the first step we applied the product rule, and then we used elementary algebra.

(e): We use the quadratic formula to find the zeros of the factor 2x^2 − x − 4 in the expression for f'(x). They are (1 ± √33)/4. This allows us to factor the expression for f'(x), and we find:

    f'(x) = 4 (x − 1) (x − [1 + √33]/4) (x − [1 − √33]/4).

We conclude that:
• f'(x) is negative on the interval [−3, (1 − √33)/4) and f(x) is decreasing on [−3, (1 − √33)/4].
• f'(x) is positive on the interval ((1 − √33)/4, 1) and f(x) is increasing on [(1 − √33)/4, 1].
• f'(x) is negative on the interval (1, (1 + √33)/4) and f(x) is decreasing on [1, (1 + √33)/4].
• f'(x) is positive on the interval ((1 + √33)/4, 3] and f(x) is increasing on [(1 + √33)/4, 3].

• f(x) has a critical point and local minimum at (1 − √33)/4 ≈ −1.19, a critical point and local maximum at x = 1, and a critical point and local minimum at (1 + √33)/4 ≈ 1.69.

The values of the function at its three critical points are approximately:

    f((1 − √33)/4) ≈ −12.39 & f(1) = 0 & f((1 + √33)/4) ≈ −0.54.

Plot these points.

(f): We rewrite the first derivative as f'(x) = 4x^3 − 6x^2 − 6x + 8, and find f''(x) = 12x^2 − 12x − 6.

(g): We use the quadratic formula to find the zeros of f''(x) and factor it:

    f''(x) = 12 (x − [1 + √3]/2) (x − [1 − √3]/2).

We conclude that:
• f''(x) is positive on the interval [−3, (1 − √3)/2) and f(x) is concave up on [−3, (1 − √3)/2].
• f''(x) is negative on the interval ((1 − √3)/2, (1 + √3)/2) and f(x) is concave down on [(1 − √3)/2, (1 + √3)/2].
• f''(x) is positive on the interval ((1 + √3)/2, 3] and f(x) is concave up on [(1 + √3)/2, 3].
• f(x) has inflection points at x = (1 − √3)/2 ≈ −0.37 and at x = (1 + √3)/2 ≈ 1.37.

The values of the function at its inflection points are approximately:

    f((1 − √3)/2) ≈ −7.21 & f((1 + √3)/2) ≈ −0.29.

Plot these points.

(h): At this point we could use the second derivative test to find at which critical points the function has local extrema, but we decided this already based on first derivative behaviour in (e).

Let us gather and organize our information. We consider the intervals:

    I_1 = [−3, −2]
    I_2 = [−2, (1 − √33)/4]
    I_3 = [(1 − √33)/4, (1 − √3)/2]
    I_4 = [(1 − √3)/2, 1]
    I_5 = [1, (1 + √3)/2]
    I_6 = [(1 + √3)/2, (1 + √33)/4]
    I_7 = [(1 + √33)/4, 2]
    I_8 = [2, 3]

We tabulate which properties hold on which interval. It should be understood that at some end points of intervals the function is zero, etc.

    Property      I_1  I_2  I_3  I_4   I_5   I_6  I_7  I_8
    Sign          pos  neg  neg  neg   neg   neg  neg  pos
    Monotonicity  dec  dec  inc  inc   dec   dec  inc  inc
    Concavity     up   up   up   down  down  up   up   up

    Table 2.1: Properties of the Graph

In Figure 2.20 you see the graph of the function. We have shown it on a slightly smaller interval, as the values at the endpoints are comparatively large. Showing all of the graph would show less clearly what happens near the intercept. The dots indicate the points which we suggest to plot. In Figure 2.21 you see the graph of f on an even smaller interval, and parts of the graphs of f' and f''. You can use them to see that f is decreasing where f' is negative, f is concave down where f'' is negative, etc. ♦

Exercise 43. In analogy with the previous example, discuss the function

    f(x) = (x − 1)(x − 2)(x + 2) = x^3 − x^2 − 4x + 4

on the interval [−3, 2.5]. In addition, find the absolute extrema of this function.
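The numerical values quoted in Example 2.53 can be double-checked with a short script. This is only a sketch for verification, assuming the derivatives are typed in by hand from steps (d) and (f); the variable names are illustrative and not part of the text.

```python
import math


def f(x):
    # factored form (2.1); equals x^4 - 2x^3 - 3x^2 + 8x - 4
    return (x - 1)**2 * (x**2 - 4)


def df(x):
    # f'(x) = 2(x - 1)(2x^2 - x - 4) = 4x^3 - 6x^2 - 6x + 8
    return 2 * (x - 1) * (2*x**2 - x - 4)


def d2f(x):
    # f''(x) = 12x^2 - 12x - 6
    return 12*x**2 - 12*x - 6


critical = [(1 - math.sqrt(33)) / 4, 1.0, (1 + math.sqrt(33)) / 4]
inflection = [(1 - math.sqrt(3)) / 2, (1 + math.sqrt(3)) / 2]

for c in critical:
    # second derivative test at each critical point
    kind = "minimum" if d2f(c) > 0 else "maximum" if d2f(c) < 0 else "undecided"
    print(f"critical point x = {c:.2f}, f(x) = {f(c):.2f}, local {kind}")

for c in inflection:
    print(f"inflection point x = {c:.2f}, f(x) = {f(c):.2f}")

# candidates for the absolute extrema on [-3, 3]: endpoints and critical points
candidates = [-3.0, 3.0] + critical
print("absolute maximum at x =", max(candidates, key=f))
print("absolute minimum at x =", min(candidates, key=f))
```

The last two lines apply Theorem 2.46: on the closed interval [−3, 3] it suffices to compare the values of f at the endpoints and at the critical points.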

Exercise 45. find the absolute extrema of this function. 2]. f Exercise 44. In analogy with the previous example. find the absolute extrema of this function. In analogy with the previous example. In addition. f . You may have to apply Newton’s method to find zeros of f .2. discuss the function f (x) = 2 sin x + cos 3x on the interval [0. SKETCHING GRAPHS 103 10 10 5 5 -3 -2 -1 -5 1 2 3 -2 -1 -5 1 2 -10 -10 Figure 2. 2π].21: f . .20: The Graph Figure 2. f . discuss the function f (x) = x3 − 3x + 2 on the interval [−2.10. In addition. and f .


Chapter 3

Integration

We will introduce the ideas of the definite and the indefinite integral. Suppose that f is a function which is defined and bounded on the interval [a, b]. If it exists, then the definite integral of f over the interval [a, b] is a real number. It is denoted by

    ∫_a^b f(x) dx.

The definition is set up, so that for a non-negative function it makes sense to think of the integral as the area of the region bounded by the graph of the function, the x-axis, and the lines x = a and x = b. The indefinite integral of a function f is the family (set) of all antiderivatives of f, i.e., all functions whose derivative is f. The Fundamental Theorem of Calculus relates definite integrals and antiderivatives. For important classes of functions one may utilize definite integrals to construct antiderivatives.

3.1 Properties of Areas

So far, we only know the area of some simple regions, like rectangles. To be concrete, consider the function f(x) = x^2 e^(−x), shown in Figure 3.1, and find the area of the region Ω bounded by the graph of f(x), the lines x = 1 and x = 5, and the x-axis. We will denote the area of a region Ω by Area(Ω). Whatever concept of area we have in mind, it should have the following properties:

• The area of a rectangle is the product of the lengths of its sides.

• Suppose that $\Omega_1$ and $\Omega_2$ are regions in the plane, and that the area of each of them is defined. If $\Omega_1 \subseteq \Omega_2$, then $\mathrm{Area}(\Omega_1) \le \mathrm{Area}(\Omega_2)$.

• Suppose that $\Omega_1$ and $\Omega_2$ are regions in the plane, and that the area of each of them is defined. If the regions $\Omega_1$ and $\Omega_2$ do not intersect, then the area of the union $\Omega_1 \cup \Omega_2$ of $\Omega_1$ and $\Omega_2$ is defined, and $\mathrm{Area}(\Omega_1 \cup \Omega_2) = \mathrm{Area}(\Omega_1) + \mathrm{Area}(\Omega_2)$.

Figure 3.1: $f(x) = x^2 e^{-x}$

Suppose for a moment that the region under the graph shown in Figure 3.1 has an area. In Figure 3.2 you see a rectangle $R_l$ with area $.6$, which is contained in $\Omega$. In Figure 3.3 you see a rectangle $R_u$ with area $2.24$, which contains $\Omega$. The first two principles tell us that

\[ \mathrm{Area}(R_l) = .6 \le \mathrm{Area}(\Omega) \le \mathrm{Area}(R_u) = 2.24. \]

From the above principles one may derive another one, which occurs frequently in our upcoming constructions:

• Suppose the region $R$ in the plane is the union of a finite number of rectangles $R_1, \dots, R_n$, and any two of them intersect at most in an edge. Then $\mathrm{Area}(R)$ is defined, and it is equal to the sum of the areas of the regions $R_1, \dots, R_n$:

\[ \mathrm{Area}(R) = \mathrm{Area}(R_1) + \cdots + \mathrm{Area}(R_n). \]

Figure 3.2: A rectangle $R_l$ contained in $\Omega$

Figure 3.3: A rectangle $R_u$ containing $\Omega$

3.2 Partitions and Sums

We like to refine the approach to calculating areas of regions which we started in the previous section. We do so by partitioning the interval before applying the ideas from above, and then we add up what we get over the individual intervals. A partition of an interval $[a, b]$ is a collection of points $\{x_j \mid 0 \le j \le n\}$, such that

\[ a = x_0 \le x_1 \le \cdots \le x_{n-1} \le x_n = b. \]

The interval $[a, b]$ is partitioned into $n$ intervals $[x_{j-1}, x_j]$ with $1 \le j \le n$.


3.2.1 Upper and Lower Sums

As before, $f$ denotes a function which is defined and bounded on $[a, b]$. On each interval we pick numbers $m_j$ and $M_j$, such that

\[ m_j \le f(x) \le M_j \quad \text{for all } x \in [x_{j-1}, x_j]. \]

We define the lower sum to be

\[ S_l = m_1(x_1 - x_0) + m_2(x_2 - x_1) + \cdots + m_n(x_n - x_{n-1}) \tag{3.1} \]

and the upper sum to be

\[ S_u = M_1(x_1 - x_0) + M_2(x_2 - x_1) + \cdots + M_n(x_n - x_{n-1}). \tag{3.2} \]

These sums depend on the choice of partition and the choices for the $m_j$ and $M_j$.
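The sums (3.1) and (3.2) have the same shape: one height per subinterval, multiplied by the length of that subinterval. The following Python sketch evaluates such a sum. The function name `step_sum` is an assumption made for illustration, and the partitions and heights are the ones used in this section for $f(x) = x^2 e^{-x}$ on $[1, 5]$. The code does not verify that the heights really bound $f$; that remains the responsibility of whoever picks them.

```python
def step_sum(points, heights):
    """Return sum of heights[j] * (points[j+1] - points[j]).

    `points` is a partition x_0 <= x_1 <= ... <= x_n and `heights`
    holds one number per subinterval (the m_j or the M_j).
    """
    if len(heights) != len(points) - 1:
        raise ValueError("need exactly one height per subinterval")
    return sum(h * (right - left)
               for h, left, right in zip(heights, points, points[1:]))


# Lower sum: heights m_j with m_j <= f(x) on each subinterval.
lower = step_sum([1, 2, 3, 4, 5], [0.35, 0.43, 0.28, 0.16])

# Upper sum: heights M_j with M_j >= f(x) on each subinterval.
upper = step_sum([1, 3, 4, 5], [0.55, 0.45, 0.3])

print(round(lower, 2), round(upper, 2))
```

Note that the two sums are allowed to use different partitions, exactly as in the text.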

Figure 3.4: A union of rectangles contained in Ω

Figure 3.5: A union of rectangles containing Ω

Let us return to the example of the function $f(x) = x^2 e^{-x}$ on the interval $[1, 5]$. In the computation of the lower sum we use the partition

\[ x_0 = 1 < x_1 = 2 < x_2 = 3 < x_3 = 4 < x_4 = 5 \]

of the interval. We also pick $m_1 = .35$, $m_2 = .43$, $m_3 = .28$ and $m_4 = .16$. This leads to a lower sum $S_l = 1.22$. In the computation of the upper sum we use the partition

\[ x_0 = 1 < x_1 = 3 < x_2 = 4 < x_3 = 5 \]

of the interval. We also pick $M_1 = .55$, $M_2 = .45$ and $M_3 = .3$. This leads to an upper sum $S_u = 1.85$. The $m_j$ and $M_j$ represent the heights of the rectangles in Figures 3.4 and 3.5, and we trust these figures to show that $m_j \le f(x)$ and $f(x) \le M_j$ on the respective interval.

As before, let $\Omega$ denote the region under the graph. Then the union of the rectangles shown in Figure 3.4 is contained in $\Omega$, and the union of the rectangles shown in Figure 3.5 contains $\Omega$. Thus, if $\Omega$ has an area, then our principles tell us that

\[ S_l = 1.22 \le \mathrm{Area}(\Omega) \le S_u = 1.85. \]

In fact, the only number greater than or equal to all lower sums and less than or equal to all upper sums is $\frac{5}{e} - \frac{37}{e^5}$, and this will be the area of the region $\Omega$. Here $e$ is the Euler number.

Example 3.1. Let us find upper and lower sums for the function

\[ f(x) = x^3 - 7x^2 + 14x - 8 \]

for $x \in [.5, 4.5]$. In contrast to the function in the previous example, this function is not non-negative.

Figure 3.6: Rectangles for calculating an upper sum.

Figure 3.7: Rectangles for calculating a lower sum.

Solution: For the purpose of calculating an upper sum, we partitioned the interval $[.5, 4.5]$ using the points $x_0 = .5$, $x_1 = 1.1$, $x_2 = 2.4$, $x_3 = 3.8$, and $x_4 = 4.5$. As numbers $M_i$ (so that $M_i \ge f(x)$ for $x \in [x_{i-1}, x_i]$) we chose $M_1 = .3$, $M_2 = .7$, $M_3 = -.9$, and $M_4 = 4.4$. These data are shown in Figure 3.6. With these choices, the upper sum is

\[ S_u = .3(1.1 - .5) + .7(2.4 - 1.1) + (-.9)(3.8 - 2.4) + 4.4(4.5 - 3.8) = 2.91. \]

In Figure 3.6 you see four rectangles. Their areas are combined to calculate the upper sum. The areas of the ones above the $x$-axis are added, the ones below the axis are subtracted, in accordance with the sign of the $M_i$.

In the calculation of the lower sum we partitioned $[.5, 4.5]$ using $x_0 = .5$, $x_1 = .8$, $x_2 = 2.3$, $x_3 = 4.2$, and $x_4 = 4.5$. As numbers $m_i$ (so that $m_i \le f(x)$ for $x \in [x_{i-1}, x_i]$) we chose $m_1 = -2.7$, $m_2 = -.8$, $m_3 = -2.2$, and $m_4 = 1.3$. These data are shown in Figure 3.7. With these choices we calculate a lower sum of

\[ S_l = -2.7(.8 - .5) + (-.8)(2.3 - .8) + (-2.2)(4.2 - 2.3) + 1.3(4.5 - 4.2) = -5.8. \]

In Figure 3.7 you see four rectangles. Their areas are combined to calculate the lower sum. The areas of the ones above the $x$-axis are added, the ones below the axis are subtracted, in accordance with the sign of the $m_i$.

In summary, you see that we still combine areas of rectangles in the calculation of the upper and lower sum, only that, depending on the sign of the $M_i$ or $m_i$, these rectangles are either above or below the $x$-axis, and depending on this, their areas are either added or subtracted. ♦

Let us make a simple albeit important observation:

Theorem 3.2. Let $f$ be a function which is defined and bounded on a closed interval $[a, b]$. Let $S_l$ be any lower sum of $f$ and $S_u$ any upper sum. Then $S_l \le S_u$.

Let us repeat the statement of the theorem to emphasize its meaning. Whichever partition of the interval $[a, b]$ and whichever $m_i$ we use in the calculation of the lower sum $S_l$, and whichever partition of the interval and whichever $M_i$ we use in the calculation of the upper sum $S_u$, the lower sum is always less than or equal to the upper sum.

To see this, one refines the partitions for the upper and lower sum computation so that they become the same. Splitting a subinterval and keeping its height on both pieces changes neither sum. Then one notes that $m_i \le M_i$ for all $i$.
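The refinement argument can be traced numerically with the data of Example 3.1. The sketch below is only an illustration under stated assumptions: the helper names `step_sum` and `refine` are hypothetical, and the code takes the heights from the example at face value instead of checking them against $f$. It merges the two partitions, carries the heights over to the common refinement, and compares them piece by piece.

```python
def step_sum(points, heights):
    """Sum of height * length over the subintervals of a partition."""
    return sum(h * (right - left)
               for h, left, right in zip(heights, points, points[1:]))


def refine(points, heights, new_points):
    """Carry heights over to a finer partition containing all old points."""
    refined = []
    j = 0  # index of the old subinterval [points[j], points[j + 1]]
    for _, right in zip(new_points, new_points[1:]):
        while right > points[j + 1]:
            j += 1
        refined.append(heights[j])
    return refined


upper_pts, upper_hts = [0.5, 1.1, 2.4, 3.8, 4.5], [0.3, 0.7, -0.9, 4.4]
lower_pts, lower_hts = [0.5, 0.8, 2.3, 4.2, 4.5], [-2.7, -0.8, -2.2, 1.3]

# Common refinement: all points of both partitions, in increasing order.
common = sorted(set(upper_pts) | set(lower_pts))
M = refine(upper_pts, upper_hts, common)
m = refine(lower_pts, lower_hts, common)

# Refining changes neither sum, and the heights compare piece by piece.
print(round(step_sum(common, m), 2), round(step_sum(common, M), 2))
print(all(mi <= Mi for mi, Mi in zip(m, M)))
```

Once both sums are written over the same subintervals, the inequality $S_l \le S_u$ is a term-by-term comparison.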

3.2.2 Riemann Sums

Suppose once again that $f(x)$ is a function which is defined on the interval $[a, b]$. Pick once more a partition

\[ a = x_0 \le x_1 \le \cdots \le x_{n-1} \le x_n = b \]

of the interval. In each subinterval, pick a point $x_j^* \in [x_{j-1}, x_j]$. Then we define the Riemann Sum

\[ S_R = f(x_1^*)(x_1 - x_0) + f(x_2^*)(x_2 - x_1) + \cdots + f(x_n^*)(x_n - x_{n-1}). \tag{3.3} \]

To be more concrete, let us return to the example $f(x) = x^2 e^{-x}$ on the interval $[1, 5]$. Let us use the partition

\[ x_0 = 1 < x_1 = \sqrt{3} < x_2 = 5. \]

In the two intervals of this subdivision we pick the points $x_1^* = \sqrt{2} \in [1, \sqrt{3}]$ and $x_2^* = \pi \in [\sqrt{3}, 5]$. As Riemann sum we obtain

\[ S_R = f(x_1^*)(x_1 - x_0) + f(x_2^*)(x_2 - x_1) \approx 1.749741. \]

In Figure 3.8 you see the picture illustrating the computation. There are two rectangles, their bases are the intervals in the subdivision, and their heights are $f(x_1^*)$ and $f(x_2^*)$. The sum of the areas of these rectangles is the Riemann sum.

Figure 3.8: Representing a Riemann Sum

We leave it to the reader to contemplate

Proposition 3.3. Let $f$ be a function which is defined and bounded on a closed interval $[a, b]$. Let $S_l$ be any lower sum of $f$, $S_u$ any upper sum, and $S_R$ any Riemann sum. Then

\[ S_l \le S_R \le S_u. \]
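A Riemann sum is determined by the function, the partition, and one tagged point per subinterval. As a minimal sketch (the function name `riemann_sum` and the variable names are assumptions made here, not notation from the text), the sum for $f(x) = x^2 e^{-x}$ with partition $1 < \sqrt{3} < 5$ and tagged points $\sqrt{2}$ and $\pi$ can be computed as follows.

```python
import math


def f(x):
    return x**2 * math.exp(-x)


def riemann_sum(func, points, tags):
    """Sum of func(tag_j) * (x_j - x_{j-1}); tag_j must lie in [x_{j-1}, x_j]."""
    for tag, left, right in zip(tags, points, points[1:]):
        if not left <= tag <= right:
            raise ValueError("each tag must lie in its subinterval")
    return sum(func(tag) * (right - left)
               for tag, left, right in zip(tags, points, points[1:]))


points = [1.0, math.sqrt(3), 5.0]
tags = [math.sqrt(2), math.pi]
value = riemann_sum(f, points, tags)
print(value)  # approximately 1.7497
```

The value lies between the lower sum 1.22 and the upper sum 1.85 found earlier for this function, in line with Proposition 3.3.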

3.3 Limits and Integrability

The idea is to refine the partitions in our previous construction, so that in the limit our sums can be justifiably called the area of the region under the graph, if the function is non-negative. The specifics depend on which sums we are working with.

3.3.1 The Darboux Integral and Areas

As we discussed earlier, whatever choices we make in the calculation of lower and upper sums $S_l$ and $S_u$, we always have that $S_l \le S_u$. A crucial additional fact is stated in the next result.

Theorem 3.4. Let $f$ be a function which is defined and bounded on a closed interval $[a, b]$. There exists a real number $Y$, such that

\[ S_l \le Y \le S_u \]

for all lower sums $S_l$ and upper sums $S_u$ of $f$.

Idea of Proof. To deduce the theorem from the completeness of the real numbers, one observes that the set of all lower sums of $f$ has a least upper bound. Call it $Y_l$. The set of all upper sums of $f$ has a greatest lower bound. Call it $Y_u$. Apparently, $Y_l \le Y_u$. Then $Y$ is any number such that $Y_l \le Y \le Y_u$.

We are now prepared to define the concept of integrability of a function.

Definition 3.5. Let $f$ be a function which is defined and bounded on a closed interval $[a, b]$. If there is exactly one number $Y$, such that

\[ S_l \le Y \le S_u \]

for all lower sums $S_l$ and all upper sums $S_u$ of $f$, then we say that $f$ is integrable over the interval $[a, b]$. In this case, the number $Y$ is called the integral¹ of $f$ for $x$ between $a$ and $b$. It is also denoted by

\[ \int_a^b f(x)\,dx. \]

¹ To distinguish it from the result of a different, but typically equivalent, construction we should call $Y$ the Darboux integral.

Remark 8. For completeness sake and later use, let us explain what happens when a function is not integrable. In this case there are at least two different numbers, and with this an entire interval, between all upper and lower sums. So, a function over a closed interval $[a, b]$ is not integrable if and only if there exists a positive number $D$ such that $S_u - S_l \ge D$ for any lower sum $S_l$ and any upper sum $S_u$. On the other hand, a function is integrable if for every positive number $D$ there is an upper sum $S_u$ and a lower sum $S_l$ such that $S_u - S_l < D$.

Example 3.6. Explore upper sums, lower sums, and integrability for the function $f(x) = x^2$ on the interval $[0, 1]$.

Solution: Fix a natural number $n$ and set

\[ x_0 = 0 < x_1 = \frac{1}{n} < x_2 = \frac{2}{n} < \cdots < x_{n-1} = \frac{n-1}{n} < x_n = \frac{n}{n} = 1. \]

This is an equidistant partition of the interval $[0, 1]$; all subintervals have the same length $1/n$. For the upper sums we pick

\[ M_1 = f(x_1) = \left(\frac{1}{n}\right)^2, \quad M_2 = f(x_2) = \left(\frac{2}{n}\right)^2, \quad M_3 = f(x_3) = \left(\frac{3}{n}\right)^2, \ \dots, \]

and $M_j = f(x_j) = \left(\frac{j}{n}\right)^2$ in general. Apparently, $M_j \ge f(x)$ for all $x \in [x_{j-1}, x_j]$ because $f(x)$ is increasing on $[0, 1]$. Without proof, we use that

\[ 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \]

We calculate the upper sum

\begin{align*}
S_u &= M_1(x_1 - x_0) + M_2(x_2 - x_1) + \cdots + M_n(x_n - x_{n-1}) \\
&= \left(\frac{1}{n}\right)^2 \times \frac{1}{n} + \left(\frac{2}{n}\right)^2 \times \frac{1}{n} + \cdots + \left(\frac{n}{n}\right)^2 \times \frac{1}{n} \\
&= \frac{1}{n^3}\left[1^2 + 2^2 + \cdots + n^2\right] \\
&= \frac{n(n+1)(2n+1)}{6n^3} \\
&= \frac{1}{3} + \frac{1}{2n} + \frac{1}{6n^2}.
\end{align*}

For the lower sums we pick

\[ m_1 = f(x_0) = 0, \quad m_2 = f(x_1) = \left(\frac{1}{n}\right)^2, \quad m_3 = f(x_2) = \left(\frac{2}{n}\right)^2, \ \dots, \]

and $m_j = f(x_{j-1}) = \left(\frac{j-1}{n}\right)^2$ in general. The resulting lower sum is

\[ S_l = \frac{1}{3} - \frac{1}{2n} + \frac{1}{6n^2}. \]

For $n = 5$ you see the rectangles whose areas are the summands in the lower and upper sums in Figures 3.9 and 3.10.

Figure 3.9: Rectangles for calculating a lower sum.

Figure 3.10: Rectangles for calculating an upper sum.

Using the expressions for $S_u$ and $S_l$ you see that $S_u - S_l = 1/n$. We do not only see that $S_l \le \frac{1}{3} \le S_u$, but also that $Y = 1/3$ is the only real number, so that $S_l \le Y \le S_u$ for all natural numbers $n$. According to the definition this means that $f(x) = x^2$ is integrable over the interval $[0, 1]$ and that

\[ \int_0^1 x^2\,dx = \frac{1}{3}. \]

♦

We motivated our introduction of upper and lower sums by our quest to define the concept of area. Our answer is formulated as a

Definition 3.7. Let $f$ be a function which is defined, bounded, and nonnegative on a closed interval $[a, b]$. Let $\Omega$ be the region bounded by the graph of $f$, the $x$-axis, and the lines $x = a$ and $x = b$. If $f$ is integrable over this interval, then we say that the region $\Omega$ has an area and

\[ \mathrm{Area}(\Omega) = \int_a^b f(x)\,dx. \]
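The closed forms for the upper and lower sums of $f(x) = x^2$ on $[0, 1]$ can be checked with exact rational arithmetic. This is a small sketch, assuming the equidistant partition with $n$ subintervals used in the example; the function names are chosen here for illustration.

```python
from fractions import Fraction


def upper_sum(n):
    """Upper sum of x^2 on [0, 1] with heights M_j = (j/n)^2."""
    return sum(Fraction(j, n) ** 2 * Fraction(1, n) for j in range(1, n + 1))


def lower_sum(n):
    """Lower sum of x^2 on [0, 1] with heights m_j = ((j-1)/n)^2."""
    return sum(Fraction(j - 1, n) ** 2 * Fraction(1, n) for j in range(1, n + 1))


for n in (5, 50, 500):
    s_u, s_l = upper_sum(n), lower_sum(n)
    # Compare with the closed forms derived in the example.
    assert s_u == Fraction(1, 3) + Fraction(1, 2 * n) + Fraction(1, 6 * n * n)
    assert s_l == Fraction(1, 3) - Fraction(1, 2 * n) + Fraction(1, 6 * n * n)
    assert s_u - s_l == Fraction(1, n)
    print(n, float(s_l), float(s_u))
```

For $n = 5$ the two sums are $6/25$ and $11/25$, and as $n$ grows both approach $1/3$ while their difference $1/n$ shrinks, which is the criterion of Remark 8.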

which broke [a. and set SR = f (x1 )(x1 − x0 ) + f (x2 )(x2 − x1 ) + · · · + f (xn )(xn − xn−1 ). because there are a lot of choices which we make to define such a sum. If the limit of the SR exists. We picked a partition P : a = x0 ≤ x1 ≤ x2 ≤ · · · xn−1 ≤ xn = b. then we say that f is Riemann integrable over [a.2 The Riemann Integral Earlier we introduced the idea of a Riemann sum. b]. . For an integrable function there is exactly one real number between the lower and upper sums.3.8 (Limit for Riemann Sums). the norm of P is the length of the longest of the intervals [xj−1 . Suppose the function f (x) is defined on [a. so this is the only number which we can call the area of Ω. and the lines x = 0 and x = 1 is 1 Area(Ω) = 0 1 x2 dx = . call L the Riemann integral of f . This is trickier than for functions. ♦ 3 3. such that |L − SR | < whenever |P| < δ. We want to consider a limit Riemann sums. In each of the subintervals we picked a point xj ∈ [xj−1 . We define the norm of the partition P to be |P| = max{xj − xj−1 | 1 ≤ j ≤ n}. For example. LIMITS AND INTEGRABILITY 115 The upper and lower sum were constructed such that if there is any justification to assigning an area to Ω then Sl ≤ Area(Ω) ≤ Su . the area of the region Ω bounded by the graph of the function f (x) = x2 . in other words. and write b L = lim SR = |P|→0 a f (x) dx. b] up into smaller interval [xj−1 . b]. xj ]. We say that L = lim SR |P|→0 if for all > 0 there exists a δ > 0. xj ]. xj ].3. Consider an interval [a. Definition 3. b] and a function f (x) defined on it.3. the x-axis.


Thus $L = \lim S_R$ if we can force $S_R$ to be close to $L$, as close as we like, by making the partition fine, that is, by making each subinterval no longer than some number. It is worth pointing out, and not very difficult to show, the following proposition.

Proposition 3.9. Suppose the function $f$ is defined on the interval $[a, b]$. Then $f$ is Riemann integrable if and only if it is Darboux integrable. If defined, the Riemann and the Darboux integral are the same.
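The limit over partitions can be explored numerically. The sketch below is an experiment, not a proof: it builds random partitions of $[0, 1]$ with random tagged points, computes the Riemann sum of $f(x) = x^2$, and compares the distance to $L = 1/3$ with the norm of the partition. All names (`random_tagged_partition`, `experiment`, the seed) are choices made for this illustration. For this particular increasing function the distance can never exceed the norm, since both $S_R$ and $1/3$ lie between a lower and an upper sum that differ by at most the norm.

```python
import random


def riemann_sum(func, points, tags):
    return sum(func(tag) * (right - left)
               for tag, left, right in zip(tags, points, points[1:]))


def norm(points):
    """Length of the longest subinterval of the partition."""
    return max(right - left for left, right in zip(points, points[1:]))


def random_tagged_partition(a, b, n, rng):
    """A random partition of [a, b] into n pieces, with one random tag each."""
    cuts = sorted(rng.uniform(a, b) for _ in range(n - 1))
    points = [a] + cuts + [b]
    tags = [rng.uniform(left, right)
            for left, right in zip(points, points[1:])]
    return points, tags


def experiment(n, seed=0):
    rng = random.Random(seed)
    points, tags = random_tagged_partition(0.0, 1.0, n, rng)
    error = abs(riemann_sum(lambda x: x * x, points, tags) - 1 / 3)
    return norm(points), error


for n in (10, 100, 1000, 10000):
    mesh, error = experiment(n)
    print(f"n={n:6d}  |P|={mesh:.5f}  |L - S_R|={error:.6f}")
```

In the language of Definition 3.8, for this function one may take $\delta = \varepsilon$.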

3.4 Integrable Functions

We like to provide a supply of integrable functions. Our first result is typically proved in an analysis course.

Theorem 3.10. Suppose $f$ is defined and continuous on $[a, b]$. Then $f$ is integrable over $[a, b]$.

According to this theorem, polynomials are integrable over any interval of the form $[a, b]$. Rational functions (i.e., functions of the form $p(x)/q(x)$ where $p(x)$ and $q(x)$ are polynomials) are integrable over intervals of the form $[a, b]$ as long as $q$ does not vanish anywhere on the interval. The trigonometric functions (sin, cos, tan, cot, sec, and csc) are integrable on intervals where the functions are defined.

Arbitrary powers of a variable, $f(x) = x^\alpha$, are integrable. One just needs to make sure that the function is defined on the interval $[a, b]$. For any real number $\alpha$ it suffices to assume that $a > 0$. For any real $\alpha \ge 0$, it suffices to assume $a \ge 0$. For rational numbers $\alpha = p/q$, where $p$ and $q$ are integers and $q$ is odd, it suffices to assume $0 \notin [a, b]$. For non-negative integers $\alpha$ no assumption needs to be made on $a$ and $b$.

As long as the resulting functions are defined everywhere on $[a, b]$, the functions just mentioned may be added, subtracted, multiplied, divided, and composed, and one still ends up with integrable functions.

Let us introduce another class of functions for which we can prove that they are integrable.

Definition 3.11. Suppose $f(x)$ is a function. We say that $f(x)$ is non-decreasing if $f(x_1) \le f(x_2)$ whenever $x_1$ and $x_2$ are in the domain of $f(x)$ and $x_1 \le x_2$. We say that $f(x)$ is non-increasing if $f(x_1) \ge f(x_2)$ whenever $x_1$ and $x_2$ are in the domain of $f(x)$ and $x_1 \le x_2$.

Proposition 3.12. Let $[a, b]$ be a closed interval and let $f$ be defined and non-increasing or non-decreasing on $[a, b]$. Then $f$ is integrable on $[a, b]$. In particular, monotonic (increasing or decreasing) functions are integrable.


Proof. We will use Darboux integrability. Let us assume that the function $f$ is non-decreasing on the interval. The non-increasing case is left as an exercise. Take any partition of the interval:

\[ a = x_0 < x_1 < \cdots < x_n = b. \]

The reader may justify why we can use the same partition in the computation of the upper and lower sum. For $i = 1, \dots, n$ we set

\[ m_i = f(x_{i-1}) \quad \text{and} \quad M_i = f(x_i). \]

Then, because $f$ is non-decreasing, $m_i \le f(x) \le M_i$ for all $x \in [x_{i-1}, x_i]$.

We use the $m_i$ and $M_i$ to compute upper and lower sums. Let $\Delta$ be the largest value of the $x_i - x_{i-1}$. Then

\begin{align*}
S_u - S_l &= [M_1(x_1 - x_0) + \cdots + M_n(x_n - x_{n-1})] - [m_1(x_1 - x_0) + \cdots + m_n(x_n - x_{n-1})] \\
&= (M_1 - m_1)(x_1 - x_0) + \cdots + (M_n - m_n)(x_n - x_{n-1}) \\
&\le [(M_1 - m_1) + (M_2 - m_2) + \cdots + (M_n - m_n)]\,\Delta \\
&= (M_n - m_1)\,\Delta \\
&= [f(b) - f(a)]\,\Delta.
\end{align*}

The inequality in the computation follows from the choice of $\Delta$. The second to last equality follows because $M_{i-1} = m_i$ for all $i = 2, \dots, n$. Many terms in the computation cancel. Given any positive number $D$, we can make the partition fine enough so that $[f(b) - f(a)]\Delta < D$. According to our Remark 8 this means that $f$ is integrable over the interval, as we claimed.

We illustrate the steps in the proof in a concrete example. In Figure 3.11 you see the upper and lower sum. The lower sum is the sum of the areas of the darkly shaded rectangles. The upper sum is the sum of the areas of the lightly and darkly shaded rectangles. The difference between the upper and the lower sum is the sum of the lightly shaded rectangles shown in Figure 3.12. We can combine these areas by sliding the rectangles sideways so that they form one column. Its height will be $f(b) - f(a)$. Its width may vary, but in the widest place it is no wider than $\Delta$, the width of the largest interval in the partition of $[a, b]$. That means, the difference between the upper and the lower sum is at most $[f(b) - f(a)]\Delta$. As above, we conclude that the function is integrable.
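The estimate $S_u - S_l \le [f(b) - f(a)]\Delta$ does not use continuity, only monotonicity. As an illustration, the sketch below checks it for a step function with jumps, using exact rational arithmetic. The choice of $f(x) = \lfloor 4x \rfloor$ on $[0, 1]$ and all function names are assumptions made for this example.

```python
from fractions import Fraction
from math import floor


def f(x):
    """A non-decreasing step function with jumps at 1/4, 1/2, 3/4 and 1."""
    return floor(4 * x)


def gap(points):
    """S_u - S_l with m_i = f(x_{i-1}) and M_i = f(x_i)."""
    return sum((f(right) - f(left)) * (right - left)
               for left, right in zip(points, points[1:]))


def bound(points):
    """[f(b) - f(a)] * Delta, where Delta is the longest subinterval."""
    delta = max(right - left for left, right in zip(points, points[1:]))
    return (f(points[-1]) - f(points[0])) * delta


def uniform(n):
    return [Fraction(k, n) for k in range(n + 1)]


uneven = [Fraction(0), Fraction(1, 10), Fraction(3, 10), Fraction(7, 10), Fraction(1)]
print(gap(uneven), bound(uneven))              # 6/5 8/5
print(gap(uniform(100)), bound(uniform(100)))  # 1/25 1/25
```

For the equidistant partition the estimate is attained with equality, and choosing $n$ with $4/n < D$ gives $S_u - S_l < D$ as required in Remark 8.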


Figure 3.11: Rectangles for calculating a lower and an upper sum.

Figure 3.12: Rectangles for calculating the difference between an upper and a lower sum.

Remark 9. There are functions which are not integrable over any interval of the form $[a, b]$ with $a < b$.

Remark 10. Here we only discuss integrability of functions over closed finite intervals, i.e., intervals of the form $[a, b]$. The discussion of integrability of functions over intervals which are not of this form, e.g., half-open intervals like $[a, b)$ or unbounded closed intervals like $[a, \infty)$, requires additional ideas and techniques which we are not ready to discuss yet.

3.5 Some elementary observations

In spite of our success calculating some integrals using upper and lower sums and the definition, this is certainly not the way to go in general. To integrate “well behaved” functions we want a theory which allows us to calculate integrals more easily. We have to develop a few basic tools. These are fairly straightforward consequences of the definition of the integral.

Proposition 3.13. If the function $f$ is defined at $a$, then

\[ \int_a^a f(x)\,dx = 0. \tag{3.4} \]

Proof. The reader should contemplate the proposition.

Proposition 3.14. Let $[a, b]$ be a closed interval, $c$ a point between $a$ and $b$, and $f$ a function which is defined on the interval. Then

\[ \int_a^c f(x)\,dx + \int_c^b f(x)\,dx = \int_a^b f(x)\,dx. \tag{3.5} \]

If one of the sides of Equation (3.5) exists, then so does the other one. Implicitly in the formulation of the proposition is the statement that $f$ is integrable over $[a, b]$ if and only if it is integrable over the intervals $[a, c]$ and $[c, b]$.

Idea of Proof. Use $c$ as one of the points in the partition. The remaining details are left to the reader.

As an immediate consequence of Propositions 3.12 and 3.14 we find

Corollary 3.15. Let $f$ be defined on the interval $[a, b]$. Suppose that we can partition the interval into a finite number of intervals such that $f$ is non-increasing or non-decreasing on each of them. Then $f$ is integrable on $[a, b]$.

We can also extend Theorem 3.10.

Definition 3.16. Suppose that $f$ is defined on an interval $[a, b]$. We call $f$ piecewise continuous if there is a partition

\[ a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b \]

such that $f$ is continuous on the open intervals $(x_{j-1}, x_j)$ for all $1 \le j \le n$, and the one-sided limits (see Section 1.3)

\[ \lim_{x \to x_{j-1}^+} f(x) \quad \text{and} \quad \lim_{x \to x_j^-} f(x) \]

exist and are finite.

Corollary 3.17. If $f$ is a piecewise continuous function on $[a, b]$, then $f$ is integrable on $[a, b]$.

Idea of Proof. According to Proposition 3.14 we may break the problem up, and consider it over each of the intervals $[x_{j-1}, x_j]$ separately. On each of these smaller intervals, we can change the definition of the function at a point or two and make it continuous. This changes neither the integrability nor the value of the integral. So the assertion follows from Theorem 3.10.

Definition 3.18. Let $f$ be defined and integrable on the interval $[a, b]$. Then

\[ \int_b^a f(x)\,dx = -\int_a^b f(x)\,dx. \]

This definition is convenient and consistent with what we have said so far about the integral. The approach to integrals via lower and upper sums could also be generalized to include integrals $\int_a^b$ where $b < a$, leading to exactly this formula.

Using the definition of the integral it is not difficult to show:

Proposition 3.19. Let $[a, b]$ be a closed interval and $c$ a scalar. Suppose that $f$ and $g$ are integrable over the interval. Then $f + g$ and $cf$ are integrable over $[a, b]$ and

\[ \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \quad \text{and} \quad \int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx. \]

We mention a few useful estimates for integrals.

Proposition 3.20. If $f$ is integrable over $[a, b]$, and $f(x) \ge 0$ for all $x \in [a, b]$, then

\[ \int_a^b f(x)\,dx \ge 0. \]

Proof. The proof is left to the reader.

Corollary 3.21. If $h$ and $g$ are integrable over $[a, b]$, and $g(x) \ge h(x)$ for all $x \in [a, b]$, then

\[ \int_a^b g(x)\,dx \ge \int_a^b h(x)\,dx. \]

Proof. Use that $f(x) = g(x) - h(x) \ge 0$ for all $x \in [a, b]$.

Proposition 3.22. Let $[a, b]$ be a closed interval and $f$ integrable over $[a, b]$. Then the absolute value of $f$ is integrable over $[a, b]$, and

\[ \left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx. \tag{3.6} \]

The proof of this proposition is elementary, though a bit tricky.

3.6 Areas and Integrals

Let us return to the relation between areas and integrals. Suppose $f(x)$ is a non-negative integrable function over an interval $[a, b]$. If $\Omega$ is the region bounded by the graph of $f(x)$, the $x$-axis, and the lines $x = a$ and $x = b$, then

\[ \mathrm{Area}(\Omega) = \int_a^b f(x)\,dx. \]

The question is, what happens if $f(x)$ is not non-negative? We decompose the region between the $x$-axis and the graph into the part $\Omega^+$ above the $x$-axis and the part $\Omega^-$ below it. Making use of this notation, we have:

Proposition 3.23. Let $f$ be a function which is defined and bounded on a closed interval $[a, b]$ and $\Omega$ the set of points which lie between the graph of $f(x)$ and the $x$-axis for $a \le x \le b$. If $f$ is integrable, then the areas of the regions $\Omega^+$ and $\Omega^-$ are defined² and

\[ \int_a^b f(x)\,dx = \mathrm{Area}(\Omega^+) - \mathrm{Area}(\Omega^-). \tag{3.7} \]

Idea of Proof. We decompose $\Omega$ into the union of two sets, $\Omega^+$ and $\Omega^-$. Specifically, $\Omega^+$ consists of those points $(x, y)$ in the plane for which $a \le x \le b$ and $0 \le y \le f(x)$, and $\Omega^-$ of those points for which $a \le x \le b$ and $f(x) \le y \le 0$. Then $\Omega$ is the union of the sets $\Omega^+$ and $\Omega^-$. We define two functions:

\[ f^+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0 \\ 0 & \text{if } f(x) \le 0 \end{cases} \quad \text{and} \quad f^-(x) = \begin{cases} f(x) & \text{if } f(x) \le 0 \\ 0 & \text{if } f(x) \ge 0. \end{cases} \]

It is elementary, though a bit tricky, to show that the integrability of $f(x)$ implies the integrability of $f^+(x)$ and $f^-(x)$. Apparently, $f = f^+ + f^-$, so that the additivity of the integral implies that

\[ \int_a^b f(x)\,dx = \int_a^b f^+(x)\,dx + \int_a^b f^-(x)\,dx. \tag{3.8} \]

According to Definition 3.7 we have

\[ \mathrm{Area}(\Omega^+) = \int_a^b f^+(x)\,dx. \tag{3.9} \]

² If you want to be formal, then you have to flip the region $\Omega^-$ to lie above the $x$-axis. Only then have we addressed the question of it having an area.

Let $-\Omega^-$ be the region obtained by flipping $\Omega^-$ up, i.e., we take its mirror image along the $x$-axis. This process does not change areas, so $\mathrm{Area}(\Omega^-) = \mathrm{Area}(-\Omega^-)$. The function $-f^-(x)$ is non-negative, and $-\Omega^-$ is bounded by the graph of $-f^-(x)$, the $x$-axis, and the lines $x = a$ and $x = b$. According to Definition 3.7 and our elementary properties of the integral we have

\[ \mathrm{Area}(\Omega^-) = \mathrm{Area}(-\Omega^-) = \int_a^b -f^-(x)\,dx = -\int_a^b f^-(x)\,dx. \tag{3.10} \]

Our claim follows now by substituting the results in (3.9) and (3.10) into (3.8).

For example,

\[ \int_{-\pi/2}^{\pi/2} \sin x\,dx = 0 \]

because the graph bounds congruent regions above and below the $x$-axis.

3.7 Anti-derivatives

Consider a function $f(x)$ with domain $I$. In Definition 2.6 we called a function $F(x)$ with domain $I$ an antiderivative of $f(x)$ if $F'(x) = f(x)$. Having an anti-derivative of a function will (typically) make it easy to integrate it over a closed interval. Remember that any antiderivatives $F_1$ and $F_2$ of a function $f$ on an interval $I$ differ only by a constant (see Corollary 2.5). In other words, there exists a constant $c$, such that

\[ F_1(x) = F_2(x) + c \]

for all $x \in I$.

Definition 3.24. Let $f$ be a function which is defined on an interval $I$, and suppose that $f$ has an antiderivative. The set of all antiderivatives of $f$ is called the indefinite integral of $f$. It is denoted by

\[ \int f(x)\,dx. \]

Given a function $f$ and an antiderivative $F$ of it, we typically write

\[ \int f(x)\,dx = F(x) + c. \tag{3.11} \]

In this expression $c$ stands for an arbitrary constant. Different values for $c$ result in different functions. Allowing all real numbers as possible values for $c$, we understand the right hand side of (3.11) as a set of functions. The constant $c$ in the expression is referred to as integration constant.

Example 3.25. Given a function $f(x)$ we might know or guess a function $F(x)$, such that $F'(x) = f(x)$. Then we can write down the indefinite integral of $f$ in the form $F(x) + c$. You can check the correctness of your guess by differentiation. You may want to consult Table 1.3 on page 63 to come up with ideas for antiderivatives. Here are some examples.

\begin{align*}
\int 1\,dx &= x + c & \int \sin x\,dx &= -\cos x + c \\
\int x\,dx &= \tfrac{1}{2}x^2 + c & \int \cos x\,dx &= \sin x + c \\
\int \sqrt{x}\,dx &= \tfrac{2}{3}x^{3/2} + c & \int \sec^2 x\,dx &= \tan x + c \\
\int x^n\,dx &= \tfrac{1}{n+1}x^{n+1} + c \quad (n \ne -1) & \int \sec x \tan x\,dx &= \sec x + c \\
\int \frac{dx}{\sqrt{1 - x^2}} &= \arcsin x + c & \int \frac{dx}{1 + x^2} &= \arctan x + c
\end{align*}

Using the linearity of the differentiation (see the differentiation rules in (1.20)), it is easy to produce more examples. E.g.,

\[ \int (5x^2 - 2\cos x)\,dx = \frac{5}{3}x^3 - 2\sin x + c. \]

Occasionally, an additional idea is required before we can see the antiderivative. E.g., using the trigonometric identity $\cos^2 x = (1 + \cos 2x)/2$, we find that

\[ \int \cos^2 x\,dx = \frac{1}{2}\int (1 + \cos(2x))\,dx = \frac{1}{2}x + \frac{1}{4}\sin(2x) + c. \]

Using a different trigonometric identity we find

\[ \int (1 + \cot^2 x)\,dx = \int \csc^2 x\,dx = -\cot x + c. \]

♦

We shall explore additional ideas for finding antiderivatives later. The reader may practice finding some antiderivatives for the functions in the next exercise. As you go through them you are expected to learn, or pick up some new ideas as you go along.
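Checking a guessed antiderivative by differentiation can also be done numerically when one only wants a quick plausibility test. The sketch below uses a central difference quotient as a stand-in for the derivative, which is an approximation and not a proof. The sample points, the step size `h`, and all names are choices made for this illustration.

```python
import math

# Pairs (f, F) taken from the list of examples: F is the claimed antiderivative.
PAIRS = {
    "sqrt(x)": (lambda x: math.sqrt(x), lambda x: 2 / 3 * x ** 1.5),
    "sec^2 x": (lambda x: 1 / math.cos(x) ** 2, math.tan),
    "1/sqrt(1-x^2)": (lambda x: 1 / math.sqrt(1 - x * x), math.asin),
    "1/(1+x^2)": (lambda x: 1 / (1 + x * x), math.atan),
    "cos^2 x": (lambda x: math.cos(x) ** 2,
                lambda x: x / 2 + math.sin(2 * x) / 4),
    "5x^2 - 2cos x": (lambda x: 5 * x * x - 2 * math.cos(x),
                      lambda x: 5 / 3 * x ** 3 - 2 * math.sin(x)),
}


def derivative(F, x, h=1e-5):
    """Central difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


def max_error(f, F, xs):
    """Largest deviation between the approximate F' and f on the sample points."""
    return max(abs(derivative(F, x) - f(x)) for x in xs)


SAMPLES = [0.1, 0.3, 0.5, 0.7, 0.9]
for name, (f, F) in PAIRS.items():
    print(f"{name:15s} max |F' - f| = {max_error(f, F, SAMPLES):.2e}")
```

A wrong guess shows up as a deviation that does not shrink toward zero; for instance, replacing $\frac{1}{4}\sin(2x)$ by $\frac{1}{2}\sin(2x)$ in the antiderivative of $\cos^2 x$ is detected immediately.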

Exercise 46. Find the following indefinite integrals:

(a) $\int 3\,dx$
(b) $\int (x + 4)\,dx$
(c) $\int (x^2 - 5)\,dx$
(d) $\int \cos 2x\,dx$
(e) $\int (3 + x)^3\,dx$
(f) $\int (3 + 2x)^5\,dx$
(g) $\int \frac{1}{x^3}\,dx$
(h) $\int \csc^2 x\,dx$
(i) $\int (1 + \tan^2 x)\,dx$
(j) $\int \csc x \cot x\,dx$
(k) $\int \sin^2 x\,dx$
(l) $\int \sec^2(3x)\,dx$
(m) $\int e^{x/3}\,dx$
(n) $\int \frac{2x}{x^2 + 1}\,dx$
(o) $\int (4 - 3x)^5\,dx$
(p) $\int \cos(4 - 3x)\,dx$
(q) $\int \frac{2x}{(x^2 + 3)^2}\,dx$
(r) $\int x \sec^2(x^2 + 5)\,dx$

3.8 The Fundamental Theorem of Calculus

Our first result provides us with a large class of functions which have antiderivatives. Continuous functions, defined over intervals, have antiderivatives. More specifically,

Theorem 3.26. Suppose that a function $f$ is defined and continuous over the interval $I$. Let $a \in I$. Then

\[ f(x) = \frac{d}{dx} \int_a^x f(t)\,dt \]

for all $x \in I$.

The major tool for calculating integrals, and the grand conclusion of our discussion of antiderivatives, is the Fundamental Theorem of Calculus.

Theorem 3.27 (Fundamental Theorem of Calculus). Suppose that $f$ is a continuous function over a closed interval $[a, b]$ and that $F$ is an antiderivative of $f$. Then

\[ \int_a^b f(x)\,dx = F(b) - F(a). \]

For example, $F(x) = -\cos x$ is an antiderivative of $f(x) = \sin x$, so that the Fundamental Theorem of Calculus tells us that

\[ \int_0^\pi \sin x\,dx = -\cos(\pi) - (-\cos(0)) = -(-1) - (-1) = 2. \]

As another example, note that $F(x) = \tan x$ is an anti-derivative of $f(x) = \sec^2 x$, so that the Fundamental Theorem of Calculus tells us that

\[ \int_0^{\pi/4} \sec^2 x\,dx = \tan(\pi/4) - \tan(0) = 1. \]

Remark 11 (Notational Convention). One commonly uses the notation

\[ F(x)\Big|_a^b = F(b) - F(a). \]

This is quite convenient. E.g., we write

\[ \sin x\,\Big|_0^\pi = \sin \pi - \sin 0. \]

If there are ambiguities due to the length of the expression to which this construction is applied, we also use the notation shown in the following example:

\[ \left[ x^3 - 5x^2 + 2x - 8 \right]_3^5 = p(5) - p(3), \]

where $p(x) = x^3 - 5x^2 + 2x - 8$. Using this notation, we calculate that

\[ \int_{-2}^3 (x^2 - 2x + 5)\,dx = \left[ \frac{x^3}{3} - x^2 + 5x \right]_{-2}^3 = \frac{95}{3}. \]

Other examples are

\[ \int_0^{\pi/4} \sec x \tan x\,dx = \sec x\,\Big|_0^{\pi/4} = \sqrt{2} - 1 \]

and

\[ \int_{\pi/4}^{\pi/3} \csc x \cot x\,dx = -\csc x\,\Big|_{\pi/4}^{\pi/3} = -\frac{2\sqrt{3}}{3} - (-\sqrt{2}) = \sqrt{2} - \frac{2\sqrt{3}}{3}. \]

The reader is invited to practice a few examples.
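The Fundamental Theorem says that $F(b) - F(a)$ and the limit of the sums from the definition of the integral are the same number. A rough numerical comparison, given here as a sketch under the assumption that a midpoint Riemann sum with many equal subintervals is a good enough stand-in for the integral, looks as follows. The helper names `midpoint_sum` and `evaluate` are made up for this example.

```python
import math


def midpoint_sum(func, a, b, n=10_000):
    """Riemann sum over n equal subintervals, tagged at the midpoints."""
    width = (b - a) / n
    return width * sum(func(a + (k + 0.5) * width) for k in range(n))


def evaluate(F, a, b):
    """The notation F(x) |_a^b = F(b) - F(a)."""
    return F(b) - F(a)


# Each case: (integrand f, antiderivative F, a, b), all taken from the examples.
CASES = [
    (math.sin, lambda x: -math.cos(x), 0.0, math.pi),
    (lambda x: x * x - 2 * x + 5, lambda x: x**3 / 3 - x * x + 5 * x, -2.0, 3.0),
    (lambda x: 1 / math.cos(x) ** 2, math.tan, 0.0, math.pi / 4),
    (lambda x: math.tan(x) / math.cos(x), lambda x: 1 / math.cos(x), 0.0, math.pi / 4),
]

for f, F, a, b in CASES:
    print(f"sum = {midpoint_sum(f, a, b):.6f}   F(b) - F(a) = {evaluate(F, a, b):.6f}")
```

The two columns agree to the displayed precision, with values close to $2$, $95/3$, $1$ and $\sqrt{2} - 1$ respectively.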

Exercise 47. Evaluate the following definite integrals:

(a) $\int_0^1 (3x + 2)\,dx$
(b) $\int_1^2 \frac{6 - t}{t^3}\,dt$
(c) $\int_2^5 2\sqrt{x - 1}\,dx$
(d) $\int_{-1}^0 (t^3 - t^2)\,dt$
(e) $\int_{\pi/6}^{\pi/2} \csc x \cot x\,dx$
(f) $\int_{-1}^1 7x^6\,dx$
(g) $\int_0^\pi \cos x\,dx$
(h) $\int_0^\pi \frac{1}{2}\cos(x/2)\,dx$
(i) $\int_{-2}^2 |x^2 - 1|\,dx$
(j) $\int_0^{\pi/2} \cos^2 x\,dx$
(k) $\int_0^{\pi/4} \sin^2(2x)\,dx$
(l) $\int_0^{\pi/4} \sec^2 x\,dx$

3.8.1 Some Proofs

Because of their importance, we like to prove Theorem 3.26 and the Fundamental Theorem of Calculus.

Proof of the Fundamental Theorem of Calculus. Essentially, the desired result is an easy consequence of Theorem 3.26. Let $F(x)$ be any anti-derivative of $f(x)$ on $I$, and $H(x) = \int_a^x f(t)\,dt$ the one provided by Theorem 3.26. In particular, $F'(x) = H'(x) = f(x)$. Cauchy’s Theorem (see its application in Corollary 2.5) tells us that $F$ and $H$ differ by a constant. For some constant $c$ and all $x \in I$:

\[ H(x) = \int_a^x f(t)\,dt = F(x) + c. \tag{3.12} \]

We can find out the value for $c$ by substituting $x = a$ in this equation. In particular, we find that

\[ \int_a^a f(t)\,dt = 0 = F(a) + c \quad \text{or} \quad c = -F(a). \]

Using this calculation of $c$ and substituting $x = b$ in (3.12), we obtain

\[ \int_a^b f(t)\,dt = F(b) - F(a), \]

as claimed.

Proof of Theorem 3.26. Because we assumed continuity of $f$ on the interval $I$, it follows from Theorem 3.10 that

\[ F(x) = \int_a^x f(t)\,dt \]

exists. So it is our task to show that $F$ is differentiable at $x$, and that $F'(x) = f(x)$. Here we assume that $x$ is not an endpoint of $I$, and that $h$ is small enough so that $x$ and $x + h$ are both in $I$. We omit (leave to the reader) the modifications of the proof which are required in the case where $x$ is an endpoint of $I$.

Using the description of the derivative as a limit of difference quotients (see Section 1.7), after adjusting the notation to fit the current setting, the task becomes to show that

\begin{align*}
f(x) &= \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} \\
&= \lim_{h \to 0} \frac{1}{h} \left[ \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt \right] \\
&= \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt.
\end{align*}

According to the Extreme Value Theorem (see Theorem 1.17) there are points $c$ and $d$ between $x$ and $x + h$, such that

\[ f(c) = m(h) \le f(t) \le f(d) = M(h) \tag{3.13} \]

for all $t$ between $x$ and $x + h$. The points $c$ and $d$ may not be uniquely determined by $h$, but $m$ and $M$ are. It follows from (3.13) and Corollary 3.21 that

\[ m(h) \cdot h = \int_x^{x+h} m(h)\,dt \le \int_x^{x+h} f(t)\,dt \le \int_x^{x+h} M(h)\,dt = M(h) \cdot h, \]

and with this that

\[ m(h) \le \frac{1}{h} \int_x^{x+h} f(t)\,dt \le M(h). \]

Continuity of $f(x)$ implies that

\[ \lim_{h \to 0} m(h) = f(x) = \lim_{h \to 0} M(h). \]

It follows from a pinching argument (see Proposition 1.4) that

\[ \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt = f(x), \]

and this is exactly what we needed to show.
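The heart of the proof is that the average of $f$ over $[x, x + h]$ approaches $f(x)$ as $h$ shrinks. The following sketch observes this for $f(t) = t^2 e^{-t}$ at $x = 1$. The integral is approximated by a midpoint sum, which is an assumption of convenience, and the names `midpoint_sum` and `average` are illustrative.

```python
import math


def f(t):
    return t * t * math.exp(-t)


def midpoint_sum(func, a, b, n=2000):
    """Midpoint Riemann sum; for b < a the widths are negative, as in Definition 3.18."""
    width = (b - a) / n
    return width * sum(func(a + (k + 0.5) * width) for k in range(n))


def average(x, h):
    """(1/h) times the integral of f from x to x + h, approximated numerically."""
    return midpoint_sum(f, x, x + h) / h


x = 1.0
for h in (0.1, 0.01, 0.001, -0.001):
    print(f"h = {h:7.3f}   |average - f(x)| = {abs(average(x, h) - f(x)):.6f}")
```

The deviation shrinks roughly in proportion to $|h|$, and the last line shows that negative $h$ behaves the same way.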

3.9 Substitution

In some cases it is not that easy to ‘see’ an antiderivative of the function one likes to integrate. Substitution is a method which, when applied correctly, will simplify the expression for the function you like to integrate. You hope that you can find an antiderivative for the simplified expression. Sometimes this method is helpful, other times it is not. There are no general rules what substitution must be used, rather success justifies the means. Your success with this method depends greatly on experience, i.e., practice. We explain the method. Working through the examples will teach you how to apply this method in some typical situations. It will give you at least some experience which you may then rely on in similar examples.

The method is based on the chain rule for differentiation. Let $F$ and $g$ be functions which are defined and differentiable on an interval $I$. Set $F' = f$. Then, according to the chain rule,

\[ \frac{d}{dx} F(g(x)) = f(g(x))\,g'(x). \]

Assume that $f$ and $g'$ are continuous on $I$. Then $f(g(x))\,g'(x)$ is continuous as well. We may take antiderivatives of both sides of our previous equation, and conclude that

\[ \int f(g(x))\,g'(x)\,dx = F(g(x)) + c. \tag{3.14} \]

Let us give a few examples to illustrate how this method can be put to use. For example,

\[ \int (2x - 3)^3\,dx = \frac{1}{2} \int (2x - 3)^3 \cdot 2\,dx = \frac{1}{8}(2x - 3)^4 + c. \]

Here we used $g(x) = 2x - 3$, $g'(x) = 2$, $f(u) = u^3$, and $F(u) = \frac{u^4}{4}$.

There is a pattern, a way to use the notation, which can be applied to write down the steps in an integration using substitution efficiently. The variable for the functions $f$ and $F$ is often called $u$, and this means in context that $u = g(x)$. Setting $u = g(x)$ we write $du = g'(x)\,dx$,

instead of $g'(x) = du/dx$.[3] Suppose also that $F$ is an anti-derivative of $f$, so $F' = f$. Then the pattern for calculating an integral via substitution is
\[ \int f(g(x))\,g'(x)\,dx = \int f(u)\,du = F(u) + c = F(g(x)) + c. \tag{3.15} \]
In the first step of this calculation we carry out the substitution, in the second one we find the anti-derivative, and in the third one we reverse the substitution. The equation $du = g'(x)\,dx$ helps us to write down what happens when we perform the substitution as in the first equality in (3.15).

For example, we calculate that
\[ \int x\sqrt{x^2+2}\,dx = \frac{1}{2}\int \sqrt{x^2+2}\cdot 2x\,dx = \frac{1}{2}\int \sqrt{u}\,du = \frac{1}{3}u^{3/2} + c = \frac{1}{3}(x^2+2)^{3/2} + c. \]
We used the substitution $u = x^2+2$. Then $\frac{du}{dx} = 2x$, or $du = 2x\,dx$.

We make use of this notation in our next example. We calculate that
\[ \int t^2(t+1)^7\,dt = \int (u-1)^2u^7\,du = \int (u^2-2u+1)u^7\,du = \int (u^9-2u^8+u^7)\,du \]
\[ = \frac{1}{10}u^{10} - \frac{2}{9}u^9 + \frac{1}{8}u^8 + c = \frac{1}{10}(t+1)^{10} - \frac{2}{9}(t+1)^9 + \frac{1}{8}(t+1)^8 + c. \]
Here we used the substitution $u = t+1$. Then $du = dt$ and $t = u-1$.

We may have to use a substitution and a trigonometric identity to solve

[3] We do not attach any particular meaning to the symbols $dx$ and $du$ in their own right. Thought of as infinitesimals or differentials, these symbols have a meaning, but this is beyond the scope of these notes.

an integration problem:
\[ \int 2x\sin^2(x^2+5)\,dx = \int \sin^2 u\,du = \frac{1}{2}\int [1-\cos 2u]\,du = \frac{1}{2}\left[u - \frac{1}{2}\sin 2u\right] + c = \frac{1}{2}\left[(x^2+5) - \frac{1}{2}\sin[2(x^2+5)]\right] + c. \]
We used the substitution $u = x^2+5$, so that $du = 2x\,dx$, and the identity $\sin^2\alpha = [1-\cos 2\alpha]/2$.

Find the substitution which we used in the following computation, and check the details:
\[ \int \sec^2 x\tan x\,dx = \int \sec x\cdot\sec x\tan x\,dx = \int u\,du = \frac{1}{2}u^2 + c = \frac{1}{2}\sec^2 x + c. \]

Sometimes we have to apply the method of substitution twice, or more often, to work out an integral. Here is an example.
\[ \int (x^2+1)\sin^3(x^3+3x-2)\cos(x^3+3x-2)\,dx = \frac{1}{3}\int \sin^3 u\cos u\,du = \frac{1}{3}\int v^3\,dv \]
\[ = \frac{1}{12}v^4 + c = \frac{1}{12}\sin^4 u + c = \frac{\sin^4(x^3+3x-2)}{12} + c. \]
In the computation we used the substitution $u = x^3+3x-2$. Then $du = 3(x^2+1)\,dx$. In a second substitution we set $v = \sin u$. Then $dv = \cos u\,du$.
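An antiderivative found by substitution can always be checked by differentiating it. The sketch below does this numerically with a central difference quotient for several of the worked examples; the helper names `derivative` and `max_error`, the sample points, and the step size are assumptions made for illustration, not part of the text.

```python
import math

def derivative(F, x, h=1e-5):
    """Central difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Pairs (integrand f, claimed antiderivative F) taken from the worked examples.
checks = [
    (lambda x: x * math.sqrt(x**2 + 2),
     lambda x: (x**2 + 2) ** 1.5 / 3),
    (lambda t: t**2 * (t + 1) ** 7,
     lambda t: (t + 1) ** 10 / 10 - 2 * (t + 1) ** 9 / 9 + (t + 1) ** 8 / 8),
    (lambda x: 2 * x * math.sin(x**2 + 5) ** 2,
     lambda x: 0.5 * ((x**2 + 5) - 0.5 * math.sin(2 * (x**2 + 5)))),
    (lambda x: (x**2 + 1) * math.sin(x**3 + 3 * x - 2) ** 3 * math.cos(x**3 + 3 * x - 2),
     lambda x: math.sin(x**3 + 3 * x - 2) ** 4 / 12),
]

def max_error(f, F, points=(-0.7, 0.3, 1.1)):
    """Largest gap between the numerical F' and f over a few sample points."""
    return max(abs(derivative(F, x) - f(x)) for x in points)

for f, F in checks:
    print(max_error(f, F))   # each value should be tiny
```

A small printed error does not prove that F is an antiderivative of f, but a large one reliably exposes a slip such as a lost factor or sign.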

Here are two examples, which are important in the context of integrating rational functions. In the first example we assume that $a \neq 0$, and we use the substitution $x = au$. Then $dx = a\,du$.
\[ \int \frac{dx}{x^2+a^2} = \int \frac{a\,du}{a^2u^2+a^2} = \frac{1}{a}\int \frac{du}{u^2+1} = \frac{1}{a}\arctan(u) + c = \frac{1}{a}\arctan\frac{x}{a} + c. \]
Adding another idea, we calculate
\[ \int \frac{dx}{x^2+2x+5} = \int \frac{dx}{(x+1)^2+4} = \int \frac{du}{u^2+4} = \frac{1}{2}\arctan\frac{x+1}{2} + c. \]
We used the substitution $u = x+1$, and then we proceeded as in the previous example.

3.9.1 Substitution and Definite Integrals

Let us now explore how substitution is used to calculate definite integrals. Assuming as before that $f$ and $g'$ are continuous on the interval $[a,b]$, we have
\[ \int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du. \tag{3.16} \]
To see this, observe that $f$ has an anti-derivative, which we again denote by $F$. Then
\[ \int_a^b f(g(x))\,g'(x)\,dx = F(g(x))\Big|_a^b = F(u)\Big|_{g(a)}^{g(b)} = \int_{g(a)}^{g(b)} f(u)\,du. \]
The first identity is obtained as a combination of the Fundamental Theorem of Calculus and (3.14). The second one is obvious, and the third one is another application of the Fundamental Theorem of Calculus.

Let us apply this formula in a few examples.
\[ \int_0^1 (x^2-1)(x^3-3x+5)^3\,dx = \frac{1}{3}\int_5^3 u^3\,du = \frac{1}{12}u^4\Big|_5^3 = -\frac{136}{3}. \]

We used the substitution $u = x^3-3x+5$. Then $du = (3x^2-3)\,dx$, and $\frac{1}{3}du = (x^2-1)\,dx$. To obtain the limits for the integral we calculate $u(0) = 5$ and $u(1) = 3$.

Similarly, we calculate
\[ \int_0^2 x(x+1)^6\,dx = \int_1^3 (u-1)u^6\,du = \int_1^3 (u^7-u^6)\,du = \frac{3554}{7}. \]
We use the substitution $u = x+1$. Then $du = dx$ and $x = u-1$. If $x = 0$, then $u = 1$, and if $x = 2$, then $u = 3$.

Incorporating one of our previous techniques,
\[ \int_0^{\sqrt{8}} x^3\sqrt{x^2+1}\,dx = \frac{1}{2}\int_1^9 (u-1)\sqrt{u}\,du = \frac{1}{2}\int_1^9 \left(u^{3/2}-u^{1/2}\right)du = \frac{596}{15}. \]
We use the substitution $u = x^2+1$. Then $\frac{1}{2}du = x\,dx$ and $x^2 = u-1$. For the limits we calculate, if $x = 0$, then $u = 1$, and if $x = \sqrt{8}$, then $u = 9$.

Another example is
\[ \int_0^{\pi/4} \cos^2 x\sin x\,dx = -\int_1^{\sqrt{2}/2} u^2\,du = -\frac{1}{3}u^3\Big|_1^{\sqrt{2}/2} = \frac{1}{3}\left(1 - \frac{\sqrt{2}}{4}\right). \]
We use the substitution $u = \cos x$. Then $-du = \sin x\,dx$. If $x = 0$, then $u = 1$, and if $x = \pi/4$, then $u = \sqrt{2}/2$.

Finally,
\[ \int_0^1 \sqrt{1-x^2}\,dx = \int_0^{\pi/2} \sqrt{1-\sin^2 u}\,\cos u\,du = \int_0^{\pi/2} \cos^2 u\,du = \frac{\pi}{4}. \]
We use the substitution $x = \sin u$. Then $dx = \cos u\,du$. For our given values of $x$, if $x = 0$, then $u = 0$, and if $x = 1$, then $u = \pi/2$. There are other possible values for $u$, but they will lead to the same results.

Remark 12. The graph of $f(x) = \sqrt{1-x^2}$ is the northern part of a circle. Using $x \in [0,1]$ means that we calculated the area under this graph in the first quadrant, i.e., the area of one fourth of the disk of radius 1. You were told a long time ago in school that the area of this unit disk is $\pi$, so that the result of the calculation is hardly surprising. There is a more serious matter. Is the example genuine, or did we assume the answer previously? By definition, $\pi$ is the ratio of the circumference of a circle by its diameter. In our calculation of the derivative of the sine and cosine functions we used the estimate that $|\sin h - h| \le h^2/2$. When we

Exercise 48.10 Areas between Graphs Previously we related the integral to areas of a region under a graph. Calculate the area of the region between the graphs of the √ functions f (x) = x2 and g(x) = 1 − x2 .28. we draw the two graphs. Now you see the region between the two graphs whose area we want to calculate. This means.13. Let us look at an example. A typical proof of the latter inequality starts out by first showing that the area of the unit disk is π. √ After squaring the equation and solving it for x2 .3. . 2 Only the + sign occurs as x2 ≥ 0. we assumed the result in the example. we used that |h| ≤ | tan h| for h ∈ [−π/4. Solution To get a better understanding. We call the region Ω. we solve the equation f (x) = x2 = g(x) = 1 − x2 . we find the x-coordinates of the points where the curves intersect: A=− −1 + 2 √ 5 and B = −1 + 2 √ 5 . Example 3. see Figure 3. AREAS BETWEEN GRAPHS 133 showed this. we find x2 = −1± 5 . π/4]. Find the following integrals: dx 2x + 1 t dt 2 + 9)2 (4t √ t(1 + t2 )3 dt 2s ds 6 − 5s2 b3 x3 √ dx 1 − a4 x4 √ 3 π π/4 (a) (b) (c) (d) (e) (f) 0 x cos x dx √ x2 x + 1 dx x+3 √ dx x+1 sin2 (3x) dx π/2 2 (k) π/6 1/2 sec(2x) tan(2x) dx dx 4 + x2 0 sec2 x √ dx 1 + tan x √ 1 + sin x cos x dx r (g) (h) (i) (j) 0 (l) (m) (n) (o) cos2 x dx 0 r 2 − x2 dx 3. This idea can be generalized to the discussion of areas of regions between two graphs. we did not derive it.10. Taking the square root. To find their x-coordinates. The graphs intersect in two points.

[Figure 3.13: Region between two graphs]

[Figure 3.14: Region between two graphs]

To get the area of the region under the graph of $f(x)$ and $g(x)$ over the interval $[A,B]$ we can calculate the appropriate integrals. To get the area of the region $\Omega$ between the graphs, we take the area of the region under the graph of $g(x)$ and subtract the area of the region under the graph of $f(x)$. Concretely:
\[ \mathrm{Area}(\Omega) = \int_A^B g(x)\,dx - \int_A^B f(x)\,dx = \int_A^B (g(x)-f(x))\,dx \approx 1.06651. \]
The numerical value was obtained by computer. You are invited to work out the integral with the help of the Fundamental Theorem of Calculus to verify the result. ♦

Some problems are a bit more subtle.

Example 3.29. Find the area of the region between the graphs of the functions $f(x) = \cos x$ and $g(x) = \sin x$ for $x$ between 0 and $\pi$.

Solution: The region $\Omega$ between the graphs is shown in Figure 3.14. The region breaks up into two pieces, the region $\Omega_1$ over the interval $[0,\pi/4]$ on which $f(x) \ge g(x)$, and the region $\Omega_2$ over the interval $[\pi/4,\pi]$ where $g(x) \ge f(x)$. We calculate the areas of the regions $\Omega_1$ and $\Omega_2$ separately.

b]. resp. The area of Ω is b Area(Ω) = a |f (x) − g(x)| dx. In some problems a and b are explicitly given. AREAS BETWEEN GRAPHS In each case.. and it incorporates and generalizes Proposition 3. In all cases it is good to graph the functions before calculating the area of the region between them. non-positive. Let Ω be the region between the graphs of f (x) and g(x) for x between a and b. ♦ Our general definition for the area between two graphs is as follows.23 on page 121. (b) y = 8 − x2 and y = x2 (c) y = x2 and y = 3x + 5. Here we did not. Exercise 49. we proceed as in the previous example: π/4 135 Area(Ω1 ) = Area(Ω2 ) = 0 π π/4 (cos x − sin x) dx = (sin x + cos x) π/4 0 π π/4 = √ 2−1 √ 2. (sin x − cos x) dx = −(sin x + cos x) =1+ In summary we find: √ Area(Ω) = Area(Ω1 ) + Area(Ω2 ) = 2 2. When we compared integrals and areas. . Sketch and find the area of the region bounded by the curves: (a) y = x2 and y = x3 . (d) y = sin x and y = πx − x2 .3.7 on page 114.10. (e) y = sin x and y = 2 sin x cos x for x between 0 and π. Having the correct picture in mind helps you to avoid mistakes. Definition 3. This definition generalizes Definition 3. An additional remark may be in place. in others you have to determine them from context. The definition is also consistent with the intuitive idea of the area of a region. Suppose f (x) and g(x) are integrable functions over an interval [a. We took care of this aspect by breaking up the interval into the part where f (x) ≥ g(x) and the part where g(x) ≥ f (x). Typically this problem gets addressed when the integral is calculated.30. Taking the absolute value of the difference of f (x) and g(x) allows us avoid the question where f (x) ≥ g(x) and where g(x) ≥ f (x). we had to take into account where the function is non-negative.

3.11 Numerical Integration

The Fundamental Theorem of Calculus provided us with a highly efficient method for calculating definite integrals. Still, for some functions we have no good expression for its anti-derivative. In such cases we may have to rely on numerical methods for integrating. Let us take such a function, and show some methods for finding an approximate value for the integral.

We describe different ways to find, by numerical means, approximate values for the integral of a function $f(x)$ over the interval $[a,b]$:
\[ \int_a^b f(x)\,dx. \]
In all of the different approaches we partition the interval into smaller ones:
\[ a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b. \]

Left and Right Endpoint Method: In the left endpoint method we find the value of the function at each left endpoint of the intervals of the partition. We multiply it with the length of the associated interval, and then add up the terms. Explicitly, we calculate
\[ I_L = f(x_0)(x_1-x_0) + f(x_1)(x_2-x_1) + \cdots + f(x_{n-1})(x_n-x_{n-1}). \tag{3.17} \]
In the right endpoint method we proceed as we did in the left endpoint method, only we use the value of the function at the right endpoint instead of the left endpoint:
\[ I_R = f(x_1)(x_1-x_0) + f(x_2)(x_2-x_1) + \cdots + f(x_n)(x_n-x_{n-1}). \tag{3.18} \]
Both expressions provide us with specific examples of Riemann sums.

Example 3.31. Use the left and right endpoint method to find approximate values for
\[ \int_0^2 e^{-x^2}\,dx. \]

Solution: Set $f(x) = e^{-x^2}$ and choose the partition:
\[ x_0 = 0 < x_1 = \frac{1}{2} < x_2 = 1 < x_3 = \frac{3}{2} < x_4 = 2. \]

Then $x_k - x_{k-1} = 1/2$ for $k = 1$, 2, 3, and 4. Formula (3.17) for $I_L$ specializes to
\[ I_L = \frac{f(0) + f(1/2) + f(1) + f(3/2)}{2} \approx 1.126039724. \]
Formula (3.18) for $I_R$ specializes to
\[ I_R = \frac{f(1/2) + f(1) + f(3/2) + f(2)}{2} \approx .6351975438. \]
The function and the rectangles whose areas are added to give us $I_L$ and $I_R$ are shown in Figure 3.15 and Figure 3.16.

[Figure 3.15: Use left end points]

[Figure 3.16: Use right end points]

Apparently, $I_L$ and $I_R$ are calculated by combining the areas of certain rectangles. In our case the values of $f(x)$ are all positive and all of the rectangles are above the $x$ axis, so the areas of the rectangles are all added. Note also that our specific function $f(x)$ is decreasing on the interval $[0,2]$, so that $I_L$ is an upper sum for the function $f(x)$ over the interval $[0,2]$, and $I_R$ is a lower sum. In this sense, we have
\[ I_R = .6351975438 \le \int_0^2 e^{-x^2}\,dx \le I_L = 1.126039724. \]
♦

Midpoint and Trapezoid Method: We may try and improve on the endpoint methods. In the midpoint method, we use the value of the function at the midpoints of the intervals of the partition. That should be less biased.

We use the same partition and notation as above. Then the formula for the midpoint method is:
\[ I_M = f\left(\frac{x_0+x_1}{2}\right)(x_1-x_0) + \cdots + f\left(\frac{x_{n-1}+x_n}{2}\right)(x_n-x_{n-1}). \tag{3.19} \]
In the trapezoid method we do not take the function at the average (i.e., midpoint) of the end points of the intervals in the partition, but we average the values of the function at the end points. Specifically, the formula is
\[ I_T = \frac{f(x_0)+f(x_1)}{2}(x_1-x_0) + \cdots + \frac{f(x_{n-1})+f(x_n)}{2}(x_n-x_{n-1}). \tag{3.20} \]
It is quite easy to see that
\[ I_T = \frac{I_L + I_R}{2}. \tag{3.21} \]

Let us explain the reference to the word trapezoid. For simplicity, suppose that $f(x)$ is non-negative on the interval $[a,b]$. Consider the trapezoid of width $(x_1-x_0)$ which has height $f(x_0)$ at its left and $f(x_1)$ at its right edge. The area of this trapezoid is $\frac{f(x_0)+f(x_1)}{2}(x_1-x_0)$. This is the first summand in the formula for $I_T$, see (3.20). We have such a trapezoid over each of the intervals in the partition, and their areas are added to give $I_T$.

Expressed differently, we can draw a secant line through the points $(x_0, f(x_0))$ and $(x_1, f(x_1))$. This gives us the graph of a function $T(x)$ over the interval $[x_0,x_1]$. Over the interval $[x_1,x_2]$ the graph of $T(x)$ is the secant line through the points $(x_1, f(x_1))$ and $(x_2, f(x_2))$. Proceeding in this fashion, we use appropriate secant lines above all of the intervals in the partition to define the function $T(x)$ over the entire interval $[a,b]$. Then
\[ I_T = \int_a^b T(x)\,dx. \]
This integral is easily computed by the formula in (3.20).

Example 3.32. Use the midpoint and trapezoid method to find approximate values for
\[ \int_0^2 e^{-x^2}\,dx. \]

Solution: We use the same partition of $[0,2]$ as in Example 3.31. The formula for $I_M$ (see (3.19)) specializes to
\[ I_M = \frac{f(.25) + f(.75) + f(1.25) + f(1.75)}{2} \approx .8827889485. \]

As for the endpoint methods, $I_M$ is the combined area of certain rectangles. Their heights are the values of $f$ at the midpoints of the intervals of the partition. Their widths are the lengths of the intervals of the partition. You see the rectangles for this calculation in Figure 3.17.

[Figure 3.17: Use midpoints]

[Figure 3.18: Trapezoid Method]

Based on our previous calculations and Formula (3.21) we find
\[ I_T = \frac{I_L + I_R}{2} \approx .8806186341. \]
We illustrated this calculation in Figure 3.18. There you see the function $f(x) = e^{-x^2}$ and five dots on the graph. The dots are connected by straight line segments. These line segments form the graph of a function $T(x)$, and $I_T$ is the area of the region under this graph. So
\[ I_T = \int_0^2 T(x)\,dx. \]
♦

Simpson's Method: In Simpson's method we combine the endpoint and midpoint methods in a weighted fashion. Again, we use the same notation for the function and the partition as above. The specific formula for an approximate value of the integral of $f(x)$ over $[a,b]$ is
\[ I_S = \frac{1}{6}\left[f(x_0) + 4f\left(\frac{x_0+x_1}{2}\right) + f(x_1)\right](x_1-x_0) + \cdots + \frac{1}{6}\left[f(x_{n-1}) + 4f\left(\frac{x_{n-1}+x_n}{2}\right) + f(x_n)\right](x_n-x_{n-1}). \tag{3.22} \]

It is quite easy to see that
\[ I_S = \frac{I_L + 4I_M + I_R}{6} = \frac{I_T + 2I_M}{3}. \]

Let us explain the background to Simpson's method. We define a function $P(x)$ over the interval $[a,b]$ by defining a degree 2 polynomial on each of the intervals of the partition. The polynomial over the interval $[x_{k-1},x_k]$ is chosen so that it agrees with $f(x)$ at the end points and at the midpoint of this interval. With some work one can show that
\[ I_S = \int_a^b P(x)\,dx. \]
Simpson's method is a refinement of the Trapezoid method. In one method we use two points on the graph and connect them by a straight line segment. In the other one we use three points on the graph and construct a parabola through them.

[Figure 3.19: Simpson's Method]

Example 3.33. Use Simpson's method to find an approximate value for
\[ \int_0^2 e^{-x^2}\,dx. \]

Solution: We use the same partition of $[0,2]$ as in Example 3.31. The formula for $I_S$ (see the special case of (3.22)) specializes to
\[ I_S = \frac{I_L + 4I_M + I_R}{6} \approx .8820655104, \]
where $I_L$, $I_M$ and $I_R$ are as above. You see the method illustrated in Figure 3.19. There you see the graphs of two functions, the function $f(x) = e^{-x^2}$ and the function $P(x)$ from the discussion of Simpson's method. Only the thickness of the line suggests that there are two graphs of almost identical functions. ♦

Example 3.34. Compare the accuracy of the various approximate values of
\[ \int_0^2 e^{-x^2}\,dx. \]

Solution: We compare the approximate values for the integral obtained by the different formulas. We partition the interval $[0,2]$ into $n$ intervals of the same length, and vary $n$. We tabulate the results. They should be compared with an approximate value for the integral of 0.882081390762421.

        n = 1        n = 10          n = 100           n = 1000
I_L     2.0000000    0.9800072469    0.891895792451    0.883063050702697
I_R     0.0366313    0.7836703747    0.872262105229    0.881099681980474
I_M     0.7357589    0.8822020700    0.882082611663    0.882081402972833
I_T     1.0183156    0.8818388108    0.882078948840    0.882081366341586
I_S     0.8299445    0.8820809836    0.882081390722    0.882081390762417

Table 3.1: Approximate Values of the Integral

Simpson's method is more accurate than the other ones. E.g., Simpson's method with $n = 4$ gives a result which is better than the left and right endpoint method with $n = 1000$. Even if you use the midpoint and trapezoid method with $n = 1000$, the result is far less accurate than Simpson's method with $n = 100$. ♦
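The five methods are short enough to be written out in a few lines, which makes it easy to reproduce the values of Examples 3.31 to 3.34. This is a sketch under assumptions: it uses a uniform partition into n intervals, as in Example 3.34, and the function name `rules` and the dictionary keys are illustrative, not notation from the text.

```python
import math

def rules(f, a, b, n):
    """Left, right, midpoint, trapezoid and Simpson values on a uniform partition."""
    h = (b - a) / n
    xs = [a + k * h for k in range(n + 1)]          # partition points x_0, ..., x_n
    IL = h * sum(f(x) for x in xs[:-1])             # formula (3.17)
    IR = h * sum(f(x) for x in xs[1:])              # formula (3.18)
    IM = h * sum(f(x + h / 2) for x in xs[:-1])     # formula (3.19)
    IT = (IL + IR) / 2                              # formula (3.21)
    IS = (IT + 2 * IM) / 3                          # same as (IL + 4*IM + IR) / 6
    return {"IL": IL, "IR": IR, "IM": IM, "IT": IT, "IS": IS}

def f(x):
    return math.exp(-x * x)

for n in (1, 4, 10, 100, 1000):
    print(n, rules(f, 0.0, 2.0, n))
```

With n = 4 the printed values should match the ones computed by hand in the examples, and the remaining rows should match Table 3.1 up to rounding.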

Remark 13. It is important that we keep the number $n$ of intervals into which we partition $[a,b]$ small. It does not only keep the number of overall computations small. In each computational step we expect to make a round-off error, and these may add up. The fewer computations we make, the smaller the cumulative round-off error will be.

Exercise 50. Proceed as in Example 3.34 and compare the different methods applied to the calculation of
\[ \int_0^{\pi/2} \sin x\,dx = 1. \]

3.12 Applications of the Integral

In Definition 3.7 and Proposition 3.23 we related definite integrals to areas. Based on the context, this can have a more concrete meaning. Consider a function $f(t)$ on an interval $[a,b]$ and the integral
\[ I = \int_a^b f(t)\,dt. \]
If $f(t)$ stands for the speed with which you travel, then $I$ stands for the total distance which you traveled during the time interval $[a,b]$. If $f(t)$ stands for the rate at which a drug is absorbed, then $I$ is the total amount of the drug which has been absorbed in the time interval $[a,b]$. You are invited to come up with more interpretations.

In addition, the following definition expresses the common notion of the average value of a function.

Definition 3.35. Suppose that $f(t)$ is an integrable function over the interval $[a,b]$. Then the quantity
\[ f_{\mathrm{av}} := \frac{1}{b-a}\int_a^b f(t)\,dt \]
is called the average value of $f(t)$ over the interval $[a,b]$.

For example, the average value of the sine function $f(x) = \sin x$ over the interval $[0,\pi]$ is $2/\pi$.

Let us explore the different aspects of integration in an example.

Example 3.36. The river Little Brook flows into a reservoir, referred to as Beaver Pond by the locals. The amount of water carried by the river depends on the season. As a function of time, it is
\[ g(t) = 2 + \sin\left(\frac{\pi t}{180}\right). \]

The units of $g(t)$ are millions of liters of water per day. We measure time in days, and $t = 0$ corresponds to New Year. Water is released from Beaver Pond at a constant rate of 2 million liters per day. At the beginning of the year, there are 200 million liters of water in the reservoir.
(a) How many liters of water are in Beaver Pond by the end of April?
(b) Suppose $F(t)$ tells how much water there is in the reservoir on day $t$ of the year. Find $F(t)$.
(c) At which rate does the amount of water in the reservoir change at the beginning of September?
(d) On which days will there be 250 million liters of water in Beaver Pond?
(e) At which amount of water will the reservoir crest?
(f) On the average, by how much has the amount of water in Beaver Pond increased per day during the first three months of the year?

Solution: Water enters and leaves the pond. The net rate entering is
\[ f(t) = g(t) - 2 = \sin\left(\frac{\pi t}{180}\right) \]
millions of liters per day. We obtain the total change of the amount of water in the reservoir by integrating $f(t)$. Set
\[ A(T) = \int_0^T f(t)\,dt. \]
On the $T$-th day of the year, the total amount of water in Beaver Pond is
\[ F(T) = 200 + \int_0^T f(t)\,dt = 200 + \frac{180}{\pi}\left[1 - \cos\left(\frac{\pi T}{180}\right)\right] \]
millions of liters. This answers (b). By the end of April, after 120 days, there are
\[ F(120) = 200 + \frac{180}{\pi}\left[1 - \cos\frac{2\pi}{3}\right] \approx 285.9 \]
millions of liters of water in the pond. This answers (a). The rate at which the amount of water in the pond changes is $F'(t) = f(t)$. At the beginning of September, after 240 days, the rate of change is

$f(240) \approx -.866$. The pond is losing water at a rate of 866,000 liters per day. This answers (c).

To answer (d), we like to know for which $T$ we have $F(T) = 250$. We solve the equation for $T$:
\[ 250 = 200 + \frac{180}{\pi}\left[1 - \cos\left(\frac{\pi T}{180}\right)\right] \quad\text{or}\quad \cos\left(\frac{\pi T}{180}\right) = 1 - \frac{5\pi}{18}. \]
We apply the function arccos to both sides of the last equation and find
\[ T = \frac{180}{\pi}\arccos\left(1 - \frac{5\pi}{18}\right) \approx 82.7, \]
or, by symmetry, $T \approx 360 - 82.7 = 277.3$. After about 82.7 days and again after about 277.3 days there will be 250 millions of liters of water in the reservoir.

To find at which amount the reservoir crests, we have to find the maximum value of $F(t)$. This occurs apparently when $\cos(\pi t/180) = -1$ or $t = 180$. The pond crests at mid-year, and then the amount of water in it is about 314.6 millions of liters. This answers (e).

After three months or 90 days there are about 257.3 millions of liters of water in Beaver Pond. Within this time, the amount of water has increased by 57.3 millions of liters. On the average, the amount of water in the reservoir increased by about 640,000 liters per day. ♦

Exercise 51. A pain reliever has been formulated such that it is absorbed at a rate of $600\sin(\pi t)$ (mg/hr) by the body. Here $t$ measures time in hours, $t = 0$ at the time you take the medication, and the absorption process is complete at time $t = 1$.
(a) What is the total amount of the drug which is absorbed?
(b) Find a function $F(t)$, such that $F(t)$ tells how much medication has been absorbed at time $t$.
(c) A total of 150 mg of the medication has to be absorbed before the drug is effective. How long does it take until this threshold is reached?

3.13 The Exponential and Logarithm Functions

In Section 1.10 we introduced the exponential function $\exp(x) = e^x$ and the natural logarithm function $\ln x$. At the time we only stated that they exist because we did not have the tools to properly define them. We will now fill in the details. Many of the routine calculations are formulated as exercises.

. u Using this calculation we deduce that xy ln(xy) = 1 dt = t x 1 dt + t xy x dt = ln x + ln y. We need a short calculation. Then xy x dt = t xy x 1 1 dt = (t/x) x y 1 du = ln y. The natural logarithm of x is defined as x (3. THE EXPONENTIAL AND LOGARITHM FUNCTIONS 145 Definition 3. For any x. Proof.39. Show: (1) ln 1 = 0. According to Theorem 3. t 1 We use the substitution u = x . the third rule in Theorem 1. Theorem 3. Exercise 52.24) ln(xy) = ln x + ln y.10 this means that ln x is defined for all x in (0.37. Here x and y are fixed positive numbers. so that du = x dt. Proposition 3. Let us also verify one of the central equations for calculating with logarithms.23) ln x = 1 dt . t This is exactly our claim. its derivative is ln x = and ln x is increasing on (0. x Proof. The function 1/x is defined and continuous on (0. Let x ∈ (0.26 tells us that ln x = 1/x.38. ∞). t Theorem 3. ∞). The natural logarithm function is differentiable on its entire domain (0.13. For the adjustment of the limits of integration. ∞). (3. ∞). and that t/x = u = y when t = xy. According to Theorem 2. the function is increasing because its derivative ln x > 0 for all x > 0.34.3. 1 .11. (2) ln(1/y) = − ln y for all y > 0. ∞). y > 0. observe that t/x = u = 1 when t = x.

(3) $\ln(x/y) = \ln x - \ln y$ for all $x, y > 0$.

Exercise 53. Show that $\ln 4 > 1$. Hint: Using the partition
\[ 1 = x_0 < 2 = x_1 < 3 = x_2 < 4 = x_3, \]
find a lower sum $S_l$ for the function $1/t$ over the interval $[1,4]$ so that $S_l > 1$.

We can now define the Euler number:

Definition 3.40. The Euler number $e$ is the unique number such that $\ln e = 1$ or, equivalently,
\[ \int_1^e \frac{1}{t}\,dt = 1. \]

For this definition to make sense, we have to show that there is a number $e$ which has the property used in the definition. To see this, observe that $\ln 1 = 0 < 1 < \ln 4$. Because $\ln x$ is differentiable, it follows from the Intermediate Value Theorem (see Theorem 1.16) that there is a number $e$ for which $\ln e = 1$. It also follows that $1 < e < 4$.

Exercise 54. Show that $\ln(a^r) = r\ln a$ for all positive numbers $a$ and all rational numbers $r$, i.e., numbers of the form $r = p/q$ where $p$ and $q$ are integers and $q \neq 0$.

Proposition 3.41. For every real number $x$ there exists exactly one positive number $y$, such that
\[ \ln y = x. \tag{3.25} \]

Proof. Observe that $\ln(e^n) = n$ and $\ln(1/e^n) = -n$ for all natural numbers $n$. So all integers (whole numbers) are values of the natural logarithm function. Every real number $x$ lies between two integers. According to the Intermediate Value Theorem, every real number is a value of the function $\ln y$. We saw that $\ln y$ is an increasing function. This means that, for any given $x$, the equation $\ln y = x$ has at most one solution. Taken together it means that it has a unique solution.

In summary, we have seen that

Corollary 3.42. The natural logarithm function $\ln x$ is a differentiable, increasing function with domain $(0,\infty)$ and range $(-\infty,\infty)$, and $\ln' x = 1/x$.
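The definition of the logarithm as an integral, together with the existence argument for the Euler number, can be imitated numerically. The sketch below is only an illustration under assumptions: the integral of 1/t is approximated with Simpson's rule on a uniform partition, the number with logarithm 1 is located by bisection on the interval [1, 4], and the names `ln`, `lower_sum` and `euler_number` are chosen here, not taken from the text.

```python
import math

def ln(x, n=2000):
    """Simpson approximation of the integral of 1/t from 1 to x (x > 0, n even)."""
    h = (x - 1.0) / n                     # signed step, so 0 < x < 1 works too
    s = 1.0 + 1.0 / x                     # integrand at the two end points
    for k in range(1, n):
        s += (4 if k % 2 else 2) / (1.0 + k * h)
    return s * h / 3

# Lower sum for 1/t over [1, 4] with the partition 1 < 2 < 3 < 4 (hint of Exercise 53).
lower_sum = 1 / 2 + 1 / 3 + 1 / 4

def euler_number(tol=1e-10):
    """Bisection for the solution of ln(y) = 1, using ln(1) < 1 < ln(4)."""
    lo, hi = 1.0, 4.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if ln(mid) < 1:                   # ln is increasing, so the solution is to the right
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(lower_sum > 1)                      # the lower sum already exceeds 1
print(ln(2.0), math.log(2.0))             # integral definition versus library value
print(euler_number(), math.e)
```

Bisection is used because it relies only on the two facts established above: the logarithm is continuous and increasing, and the value 1 lies between its values at 1 and 4.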

We are now ready to define the exponential function.

Definition 3.43. Given any real number $x$, we define $\exp(x)$ to be the unique number for which
\[ \ln(\exp(x)) = x, \tag{3.26} \]
i.e., $y = \exp(x)$ is the unique solution of the equation $\ln(y) = x$. This assignment (mapping $x$ to $\exp(x)$) defines a function, called the exponential function, with domain $(-\infty,\infty)$ and range $(0,\infty)$.

Exercise 55. Show that the exponential function exp and the natural logarithm function ln are inverses of each other. In addition to the equation in (3.26), you need to show that
\[ \exp(\ln(y)) = y \tag{3.27} \]
for all $y \in (0,\infty)$.

Summarizing this discussion, and adding some observations which we have made elsewhere, we have:

Proposition 3.44. The exponential function $\exp(x)$ is a differentiable, increasing function with domain $(-\infty,\infty)$ and range $(0,\infty)$, and the exponential function is its own derivative, i.e., $\exp'(x) = \exp(x)$.

Exercise 56. Show for all real numbers $x$ and $y$ that:
(1) $\exp(0) = 1$
(2) $\exp(1) = e$
(3) $\exp(x)\exp(y) = \exp(x+y)$
(4) $1/\exp(y) = \exp(-y)$
(5) $\exp(x)/\exp(y) = \exp(x-y)$.
Hint: Use the results of Exercise 52, the definition of $e$ in Definition 3.40, and that the exponential and logarithm functions are inverses of each other.

Exercise 57. Show that $\exp(r) = e^r$ for all rational numbers $r$. Hint: Use Exercise 54 and that the exponential and logarithm functions are inverses of each other.

The expression $e^r$ makes sense only if $r$ is a rational number. If $r = p/q$ then we raise $e$ to the $p$-th power and take the $q$-th root of the result. For an arbitrary real number $x$ we set
\[ e^x = \exp(x). \tag{3.28} \]
This is consistent with the meaning of the expression for rational exponents due to Exercise 57, and it defines what we mean by raising $e$ to any real power.

3.13.1 Other Bases

So far we discussed the natural logarithm function and the exponential function with base $e$. We now expand the discussion to other bases.

Definition 3.45. Let $a$ be a positive number, $a \neq 1$. Set
\[ \log_a x = \frac{\ln x}{\ln a} \quad\text{and}\quad \exp_a(x) = \exp(x\ln a). \tag{3.29} \]
We call $\log_a(x)$ the logarithm function with base $a$ and $\exp_a(x)$ the exponential function with base $a$. For the function $\log_a$ we use the domain $(0,\infty)$ and range $(-\infty,\infty)$. For the exponential function $\exp_a$ we use the domain $(-\infty,\infty)$ and range $(0,\infty)$.

Exercise 58. Suppose $a > 0$ and $a \neq 1$. Show
(1) $\ln a > 0$ if $a > 1$ and $\ln a < 0$ if $0 < a < 1$.
(2) $\log_a(x)$ and $\exp_a(x)$ are differentiable functions.
(3) $\log_a(x)$ and $\exp_a(x)$ are increasing functions if $a > 1$.
(4) $\log_a(x)$ and $\exp_a(x)$ are decreasing functions if $0 < a < 1$.

Exercise 59. Suppose $a > 0$ and $a \neq 1$. Show that
(a) $\exp_a(\log_a(y)) = y$ for all $y > 0$.
(b) $\log_a(\exp_a(x)) = x$ for all real numbers $x$.

Taken together, the specifications for the domains and ranges for the functions $\exp_a$ and $\log_a$ and the results from Exercise 59 tell us that

Corollary 3.46. The functions $\exp_a$ and $\log_a$ are inverses of each other.
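Definition 3.45 translates directly into code, and the statements of Exercise 59 can be spot-checked with it. The following is a minimal sketch: the function names `log_a` and `exp_a`, the argument order, and the sample values are assumptions made for illustration, and Python's `math.log` and `math.exp` stand in for the functions ln and exp of the text.

```python
import math

def _check_base(a):
    """The definition requires a positive base different from 1."""
    if a <= 0 or a == 1:
        raise ValueError("base must be positive and different from 1")

def log_a(a, x):
    """Logarithm with base a, defined as ln(x) / ln(a)."""
    _check_base(a)
    return math.log(x) / math.log(a)

def exp_a(a, x):
    """Exponential function with base a, defined as exp(x * ln(a))."""
    _check_base(a)
    return math.exp(x * math.log(a))

a = 2.0
print(exp_a(a, log_a(a, 7.0)))               # should be close to 7
print(log_a(a, exp_a(a, -1.3)))              # should be close to -1.3
print(exp_a(a, 1.5), math.sqrt(a ** 3))      # rational exponent 3/2: both close to 2.828...
```

The last line compares the definition with the elementary meaning of a rational power: raise the base to the third power and take the square root.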

Exercise 60. Suppose $a > 0$ and $a \neq 1$. Show the laws of logarithms:
(a) $\log_a 1 = 0$ and $\log_a a = 1$.
(b) $\log_a(xy) = \log_a x + \log_a y$ for all $x, y > 0$.
(c) $\log_a(1/y) = -\log_a y$ for all $y > 0$.
(d) $\log_a(x/y) = \log_a x - \log_a y$ for all $x, y > 0$.

Exercise 61. Suppose $a > 0$ and $a \neq 1$. Show the exponential laws:
(1) $\exp_a(0) = 1$ and $\exp_a(1) = a$
(2) $\exp_a(x)\exp_a(y) = \exp_a(x+y)$
(3) $1/\exp_a(y) = \exp_a(-y)$
(4) $\exp_a(x)/\exp_a(y) = \exp_a(x-y)$.

Exercise 62. Suppose $a > 0$, $a \neq 1$, and $r$ is a rational number. Show $\log_a(a^r) = r$ and $\exp_a(r) = a^r$.

We rephrase a convention which we made previously for $e$. The expression $a^r$ makes sense if $r$ is a rational number. If $r = p/q$ then we raise $a$ to the $p$-th power and take the $q$-th root of the result. For an arbitrary real number $x$ we set
\[ a^x = \exp_a(x). \tag{3.30} \]
This is consistent with the meaning of the expression for rational exponents due to Exercise 62, and it defines what we mean by raising $a$ to any real power. Equation 3.30 specializes to the one in Equation 3.28 if we set $a = e$. It is also a standard convention to set $1^x = 1$ for any real number $x$ and $0^x = 0$ for any positive real number $x$. Typically $0^0$ is set to 1.

We can now state an equation which is typically considered to be one of the laws of logarithms:

Exercise 63. Suppose $a > 0$, $a \neq 1$, $x > 0$, and $z$ is any real number. Then
\[ \log_a(x^z) = z\log_a(x). \]

We are now ready to fill in the details for one of the major statements which we made in Section 1.10. We are ready to prove

Theorem 3.47. Let $a$ be a positive number, $a \neq 1$. There exists exactly one monotonic function, called the exponential function with base $a$ and denoted by $\exp_a(x)$, which is defined for all real numbers $x$ such that $\exp_a(x) = a^x$ whenever $x$ is a rational number.

Proof. In this section we constructed the function $\exp_a(x)$, and this function has all of the properties called for in the theorem. That settles the existence statement.

We have to show the uniqueness statement, i.e., there is only one such function. Suppose $f(x)$ is any monotonic function and $f(r) = a^r = \exp_a(r)$ for all rational numbers $r$. We have to show that $f(x) = \exp_a(x)$ for all real numbers $x$. Here one uses that $f(x)$ and $\exp_a(x)$ are monotonic, and that $\exp_a(x)$ is continuous. We leave the verification of this assertion to the reader.
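The idea behind the uniqueness statement can be made concrete: a monotonic function that is pinned down at all rational exponents is squeezed at an irrational exponent between its values at nearby rationals. The sketch below illustrates this for base 2 and exponent √2. It is an illustration under assumptions, not the proof left to the reader: the helper names `rational_power` and `bracket` are hypothetical, the rational power is computed in floating point by taking the q-th root first and then the p-th power (to avoid overflow), and decimal truncations of √2 serve as the rational exponents.

```python
import math
from fractions import Fraction

def rational_power(a, r):
    """a**(p/q) for a > 0: take the q-th root of a, then raise it to the p-th power."""
    return (a ** (1.0 / r.denominator)) ** r.numerator

def bracket(k):
    """Rationals lo < sqrt(2) < hi with denominator 10**k and hi - lo = 10**(-k)."""
    scale = 10 ** k
    lo = Fraction(math.isqrt(2 * scale * scale), scale)   # exact floor of sqrt(2) * 10**k
    return lo, lo + Fraction(1, scale)

a = 2.0
target = math.exp(math.sqrt(2) * math.log(a))             # exp_a(sqrt(2)) as in (3.29)

for k in range(1, 7):
    lo, hi = bracket(k)
    # For a > 1 the function is increasing, so the two values enclose the target.
    print(k, rational_power(a, lo), target, rational_power(a, hi))
```

As k grows, the two rational powers close in on the same number, so any increasing function that agrees with them at the rationals has no freedom left at √2.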

Chapter 4

Trigonometric Functions

In this section we discuss the radian measure of angles and introduce the trigonometric functions. These are the functions sine, cosine, tangent, etc. We collect some formulas relating these functions.

Arc Length and Radian Measure of Angles: Consider the unit circle (a circle with radius 1) centered at the origin in the Cartesian plane. It is shown in Figure 4.1. We take a practical approach to measuring the length of an arc on this circle. We imagine that we can straighten it out, and measure how long it is. It requires some work to introduce the idea of the length of a curve in a mathematically rigorous fashion.

Definition 4.1. The number $\pi$ is the ratio between the circumference of a circle and its diameter.

This definition goes back to the Greeks. Observe that the ratio referred to in the definition does not depend on the radius of the circle. Stated differently, it says that the circumference of a circle of radius $r$ is $2\pi r$.

Consider an angle $\alpha$ between the positive $x$-axis and a ray which originates at the origin of the coordinate system and intersects the unit circle in the point $p$. We like to find the radian measure of the angle $\alpha$. Consider an arc on the unit circle which starts out at the point $(1,0)$ and ends at $p$, and suppose its length is $s$. Then
\[ \alpha = \pm s \ \text{(radians)}. \tag{4.1} \]
The $+$ sign is used if the arc goes counter clockwise around the circle. The $-$ sign is used if it proceeds clockwise. We may also consider arcs which wrap around the circle several times before they end at $p$. In this sense, the radian measure of the angle $\alpha$ is not unique, but any two radian measures of the angle differ by an integer multiple of $2\pi$.

Figure 4.1: The unit circle (the marked point is $(\cos t, \sin t)$)

Conversely, let $t$ be any real number. We construct the angle with radian measure $t$. Starting at the point $(1, 0)$ we travel the distance $|t|$ along the unit circle (here $|t|$ denotes the absolute value of $t$). By convention, we travel counterclockwise if $t$ is positive and clockwise if $t$ is negative. In this way we reach a point $p$ on the circle. Let $\alpha$ be the angle between the positive $x$-axis and the ray which starts at the origin and intersects the unit circle in $p$. This angle has radian measure $t$.

Comparison of Angles in Degrees and Radians: We suppose that you are familiar with measuring angles in degrees. The measure of half a revolution (a straight angle) comprises $\pi$ radians and 180 degrees. So, one degree corresponds to $\pi/180 \approx 0.017453293$ radians, and one radian corresponds to $180/\pi \approx 57.29577951$ degrees. We have the conversion formula

\[
  x \ \text{degrees} = \frac{\pi}{180}\, x \ \text{radians}. \tag{4.2}
\]

Trigonometric Functions: Let $t$ be once more a real number. Starting at the point $(1, 0)$ we travel the distance $|t|$ along the unit circle, counterclockwise if $t$ is positive and clockwise if $t$ is negative. In this way we reach a point $p = (x(t), y(t))$ on the circle, and we set

\[
  x(t) = \cos t \quad \text{and} \quad y(t) = \sin t. \tag{4.3}
\]
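The conversion formula (4.2) and the construction of the point $p = (\cos t, \sin t)$ in (4.3) can be tried out numerically. The following is a minimal sketch using Python's standard `math` module; the helper names `degrees_to_radians`, `radians_to_degrees` and `point_on_unit_circle` are hypothetical names introduced only for this illustration.

```python
import math


def degrees_to_radians(x_deg: float) -> float:
    """Formula (4.2): x degrees correspond to (pi / 180) * x radians."""
    return math.pi / 180 * x_deg


def radians_to_degrees(t: float) -> float:
    """Inverse of (4.2): t radians correspond to (180 / pi) * t degrees."""
    return 180 / math.pi * t


def point_on_unit_circle(t: float) -> tuple[float, float]:
    """Formula (4.3): the point reached after travelling the signed
    distance t along the unit circle, starting at (1, 0)."""
    return (math.cos(t), math.sin(t))


# One degree in radians and one radian in degrees, to compare with the
# rounded values quoted in the text.
print(degrees_to_radians(1))
print(radians_to_degrees(1))

# A straight angle (180 degrees) leads to a point close to (-1, 0).
print(point_on_unit_circle(degrees_to_radians(180)))
```

Because $\pi$ is only approximated in floating-point arithmetic, coordinates that are exactly 0 in theory usually come out as very small nonzero numbers.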

This defines the functions $\sin t$ and $\cos t$. You see the construction implemented in Figure 4.1. You can find the graphs of the sine and cosine functions on the interval $[0, 2\pi]$ in Figures 4.2 and 4.3.

Figure 4.2: $f(x) = \sin x$

Figure 4.3: $f(x) = \cos x$

The other trigonometric functions, tangent (tan), cotangent (cot), secant (sec), and cosecant (csc), are defined as follows:

\[
  \tan x = \frac{\sin x}{\cos x}, \qquad
  \cot x = \frac{\cos x}{\sin x}, \qquad
  \sec x = \frac{1}{\cos x}, \qquad
  \csc x = \frac{1}{\sin x}. \tag{4.4}
\]

To make sure you have some idea about the behavior of the tangent and cotangent functions we provide two graphs for each of them. See Figure 4.4 to Figure 4.7. They are drawn over different parts of the domain to show different aspects. You can see the graphs of the secant and cosecant functions in Figures 4.8 and 4.9.

A small table with angles given in degrees and radians, as well as the associated values of the trigonometric functions, is given in Table 4.1. If the functions are not defined at some point, then this is indicated by 'n/a'. Older calculus books may still contain tables with the values of the trigonometric functions, and there are books which were published for the specific purpose of providing these tables. This is really not necessary anymore because any scientific calculator gives those values to you with rather good accuracy.

Trigonometric Functions defined at a right triangle: Occasionally it is more convenient to use a right triangle to define the trigonometric functions. To do this we return to Figure 4.1. You see a right triangle with vertices $(0, 0)$, $(x, 0)$ and $(x, y)$. We may use a circle of any radius $r$. The

Figure 4.4: $\tan x$ on $[-\pi, \pi]$

Figure 4.5: $\tan x$ on $[-1, 1]$

Figure 4.6: $\cot x$ on $[-\pi, \pi]$

Figure 4.7: $\cot x$ on $[\frac{\pi}{2} - 1, \frac{\pi}{2} + 1]$

Figure 4.8: $\sec x$ on $[-\pi, \pi]$

Figure 4.9: $\csc x$ on $[-\pi, \pi]$

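right angle is at the vertex $(x, 0)$ and the hypotenuse has length $r$. Let $\alpha$ be the angle at the vertex $(0, 0)$. In the following the words adjacent and opposing are in relation to $\alpha$. Then

\[
  \sin\alpha = \frac{\text{opposing side}}{\text{hypotenuse}}, \qquad
  \cos\alpha = \frac{\text{adjacent side}}{\text{hypotenuse}},
\]
\[
  \tan\alpha = \frac{\text{opposing side}}{\text{adjacent side}}, \qquad
  \cot\alpha = \frac{\text{adjacent side}}{\text{opposing side}},
\]
\[
  \sec\alpha = \frac{\text{hypotenuse}}{\text{adjacent side}}, \qquad
  \csc\alpha = \frac{\text{hypotenuse}}{\text{opposing side}}.
\]

\[
\begin{array}{cc|cccccc}
\text{degrees} & \text{radians} & \sin x & \cos x & \tan x & \cot x & \sec x & \csc x \\
\hline
0 & 0 & 0 & 1 & 0 & \text{n/a} & 1 & \text{n/a} \\
30 & \pi/6 & 1/2 & \sqrt{3}/2 & \sqrt{3}/3 & \sqrt{3} & 2\sqrt{3}/3 & 2 \\
45 & \pi/4 & \sqrt{2}/2 & \sqrt{2}/2 & 1 & 1 & \sqrt{2} & \sqrt{2} \\
60 & \pi/3 & \sqrt{3}/2 & 1/2 & \sqrt{3} & \sqrt{3}/3 & 2 & 2\sqrt{3}/3 \\
90 & \pi/2 & 1 & 0 & \text{n/a} & 0 & \text{n/a} & 1 \\
120 & 2\pi/3 & \sqrt{3}/2 & -1/2 & -\sqrt{3} & -\sqrt{3}/3 & -2 & 2\sqrt{3}/3 \\
135 & 3\pi/4 & \sqrt{2}/2 & -\sqrt{2}/2 & -1 & -1 & -\sqrt{2} & \sqrt{2} \\
150 & 5\pi/6 & 1/2 & -\sqrt{3}/2 & -\sqrt{3}/3 & -\sqrt{3} & -2\sqrt{3}/3 & 2 \\
180 & \pi & 0 & -1 & 0 & \text{n/a} & -1 & \text{n/a}
\end{array}
\]

Table 4.1: Values of Trigonometric Functions

Trigonometric Identities: There are several important identities for the trigonometric functions. Some of them you should know, others you should be aware of, so that you can look them up whenever needed. From the theorem of Pythagoras and the definitions you obtain

\[
  \sin^2 x + \cos^2 x = 1, \qquad
  \sec^2 x = 1 + \tan^2 x, \qquad
  \csc^2 x = 1 + \cot^2 x. \tag{4.5}
\]

The following identities are obtained from elementary geometric observations using the unit circle.

\begin{align*}
  \sin x &= \sin(x + 2\pi) = \sin(\pi - x) = -\sin(-x) \\
  \cos x &= \cos(x + 2\pi) = -\cos(\pi - x) = \cos(-x) \\
  \cos x &= \sin(x + \tfrac{\pi}{2}) = -\cos(x + \pi) = -\sin(x + \tfrac{3\pi}{2}) \\
  \sin x &= -\cos(x + \tfrac{\pi}{2}) = -\sin(x + \pi) = \cos(x + \tfrac{3\pi}{2})
\end{align*}

You should have seen, or even derived, the following addition formulas in precalculus.

\begin{align}
  \sin(\alpha + \beta) &= \sin\alpha \cos\beta + \cos\alpha \sin\beta \tag{4.6} \\
  \sin(\alpha - \beta) &= \sin\alpha \cos\beta - \cos\alpha \sin\beta \tag{4.7} \\
  \cos(\alpha + \beta) &= \cos\alpha \cos\beta - \sin\alpha \sin\beta \tag{4.8} \\
  \cos(\alpha - \beta) &= \cos\alpha \cos\beta + \sin\alpha \sin\beta \tag{4.9} \\
  \tan(\alpha + \beta) &= \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha \tan\beta} \tag{4.10} \\
  \tan(\alpha - \beta) &= \frac{\tan\alpha - \tan\beta}{1 + \tan\alpha \tan\beta} \tag{4.11}
\end{align}

These formulas specialize to the double angle formulas

\[
  \sin 2\alpha = 2 \sin\alpha \cos\alpha
  \quad \text{and} \quad
  \cos 2\alpha = \cos^2\alpha - \sin^2\alpha. \tag{4.12}
\]

From the addition formulas we can also obtain

\begin{align}
  \sin\alpha \sin\beta &= \tfrac{1}{2}\,[\cos(\alpha - \beta) - \cos(\alpha + \beta)] \tag{4.13} \\
  \sin\alpha \cos\beta &= \tfrac{1}{2}\,[\sin(\alpha - \beta) + \sin(\alpha + \beta)] \tag{4.14} \\
  \cos\alpha \cos\beta &= \tfrac{1}{2}\,[\cos(\alpha - \beta) + \cos(\alpha + \beta)] \tag{4.15}
\end{align}

which specialize to the half-angle formulas

\[
  \sin^2\alpha = \tfrac{1}{2}\,[1 - \cos 2\alpha]
  \quad \text{and} \quad
  \cos^2\alpha = \tfrac{1}{2}\,[1 + \cos 2\alpha]. \tag{4.16}
\]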
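A numerical spot check is a quick way to catch a sign error when copying identities such as the ones above. The sketch below evaluates the difference between the two sides of several of them at given angles. The function name `identity_residuals` and the dictionary keys are labels chosen here for illustration, and such a check is not a proof: it only shows agreement up to floating-point rounding at the sampled angles.

```python
import math


def identity_residuals(a: float, b: float) -> dict[str, float]:
    """Return |left side - right side| for several trigonometric
    identities, evaluated at the angles a and b (in radians)."""
    s, c, t = math.sin, math.cos, math.tan
    return {
        # Pythagorean identities
        "sin^2 + cos^2 = 1": abs(s(a) ** 2 + c(a) ** 2 - 1),
        "sec^2 = 1 + tan^2": abs(1 / c(a) ** 2 - (1 + t(a) ** 2)),
        # addition formulas
        "sin(a + b)": abs(s(a + b) - (s(a) * c(b) + c(a) * s(b))),
        "cos(a + b)": abs(c(a + b) - (c(a) * c(b) - s(a) * s(b))),
        "tan(a + b)": abs(t(a + b) - (t(a) + t(b)) / (1 - t(a) * t(b))),
        # double angle formulas
        "sin 2a": abs(s(2 * a) - 2 * s(a) * c(a)),
        "cos 2a": abs(c(2 * a) - (c(a) ** 2 - s(a) ** 2)),
        # product formulas
        "sin a sin b": abs(s(a) * s(b) - 0.5 * (c(a - b) - c(a + b))),
        "sin a cos b": abs(s(a) * c(b) - 0.5 * (s(a - b) + s(a + b))),
        "cos a cos b": abs(c(a) * c(b) - 0.5 * (c(a - b) + c(a + b))),
        # half-angle formula for sin^2
        "sin^2 a": abs(s(a) ** 2 - 0.5 * (1 - c(2 * a))),
    }


# Print the residual of each identity at one pair of sample angles.
for name, residual in identity_residuals(0.7, -1.3).items():
    print(f"{name:20s} {residual:.2e}")
```

The sample angles should stay away from points where the tangent or secant is undefined, such as odd multiples of $\pi/2$, because the quotients involved become numerically unstable there.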
