
4 — Integrations

4.1 Jordan Measure


You may have learned in high school that a definite integral of a function such as
\[ \int_0^1 x^3 \, dx \]
means the area under the graph $y = x^3$ from $x = 0$ to $x = 1$, or more precisely, the area of the region
\[ \{(x, y) : 0 \le x \le 1 \text{ and } 0 \le y \le x^3\}. \]

You were probably told (without explanation) that a definite integral like this can be evaluated by finding an antiderivative of the function, namely $\frac{d}{dx}\frac{x^4}{4} = x^3$, so we have
\[ \int_0^1 x^3 \, dx = \left[\frac{x^4}{4}\right]_{x=0}^{x=1} = \frac{1^4}{4} - \frac{0^4}{4} = \frac{1}{4}. \]

This is known as the Fundamental Theorem of Calculus, or the Newton-Leibniz formula, which relates the problem of finding areas to differentiation.
We all learned about the concept of area in primary school (if not earlier), and know that the area of a triangle is half of the base times the height, and the area of a circle with radius $r$ is $\pi r^2$. However, it seems that we never learned a rigorous definition of area! What is meant by the area of a region? In this section, we introduce one definition of area, the Jordan measure, which will be used to give the rigorous definition of Riemann integrals.

4.1.1 Simple regions


Before introducing the Jordan measure of an arbitrary region in $\mathbb{R}^2$, we first focus on some simple regions. We first declare the area of a rectangle $\langle a, b\rangle \times \langle c, d\rangle \subset \mathbb{R}^2$ to be $(b - a)(d - c)$. Here $\langle$ means ( or [, and $\rangle$ means ) or ]. Intuitively, the intersection $R_1 \cap R_2$ of two overlapping rectangles $R_1$ and $R_2$ is also a rectangle, whereas the union $R_1 \cup R_2$ need not be a rectangle. However, one can break down $R_1 \cup R_2$ into three smaller rectangles $R_1 \cup R_3 \cup R_4$ so that these three rectangles are disjoint.

It is then sensible to define the area of $R_1 \cup R_2$ to be the sum of the areas of $R_1$, $R_3$ and $R_4$.


Inductively, one can see if we have finitely many rectangles, their union can be decomposed into
smaller rectangles which are disjoint, and hence we can define the area of such a union. To
summarize, we have:
Theorem 4.1 — Simple Regions. A set $E \subset \mathbb{R}^2$ is called a simple region if $E$ is the union of finitely many rectangles, i.e.
\[ E = \bigcup_{i=1}^{N} \langle a_i, b_i\rangle \times \langle c_i, d_i\rangle, \]
where $a_i \le b_i$ and $c_i \le d_i$, and they are all finite real numbers for any $i$.

Remark. We allow $a_i = b_i$ or $c_i = d_i$ in the above definition. In other words, we also count line segments and points as “rectangles”.

Proposition 4.2 Every simple region $E$ can be expressed as the union of finitely many rectangles in $\mathbb{R}^2$ which are mutually disjoint.

Proof. By induction and the law that $\left(\coprod_i A_i\right) \cup B = \bigcup_i (A_i \cup B)$. ∎

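Definition 4.1 — Area of Simple Regions. Let $E \subset \mathbb{R}^2$ be a simple region so that, in view of the above proposition, it can be expressed as
\[ E = \coprod_{i=1}^{N} \langle a_i, b_i\rangle \times \langle c_i, d_i\rangle \]
where $\big(\langle a_i, b_i\rangle \times \langle c_i, d_i\rangle\big) \cap \big(\langle a_j, b_j\rangle \times \langle c_j, d_j\rangle\big) = \emptyset$ whenever $i \ne j$. We define the area of $E \subset \mathbb{R}^2$ to be:
\[ A(E) := \sum_{i=1}^{N} (b_i - a_i)(d_i - c_i). \]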

Remark. A simple region $E$ can be expressed as the union of disjoint rectangles in many different ways! It is possible to prove that the definition of $A(E)$ is independent of how we express $E$ as the union of disjoint rectangles. Again, this is an intuitive fact which is not easy to prove.
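
The decomposition of a simple region into disjoint rectangles can be made concrete with a small computation. The sketch below is only an illustration, not part of the formal development: it assumes closed rectangles given as tuples `(a, b, c, d)` standing for $[a, b] \times [c, d]$, and the function name `union_area` is hypothetical. It cuts the plane along every edge coordinate, so that each resulting grid cell is either inside the union or disjoint from its interior, and then adds up the areas of the cells that are covered.

```python
def union_area(rects):
    """Area of a finite union of rectangles [a, b] x [c, d], given as (a, b, c, d)."""
    # Every edge coordinate becomes a grid line, so the grid cells are disjoint
    # rectangles and each one is either covered by the union or not.
    xs = sorted({x for a, b, _, _ in rects for x in (a, b)})
    ys = sorted({y for _, _, c, d in rects for y in (c, d)})
    total = 0
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            # Test the cell's midpoint against every rectangle.
            mx, my = (x0 + x1) / 2, (y0 + y1) / 2
            if any(a <= mx <= b and c <= my <= d for a, b, c, d in rects):
                total += (x1 - x0) * (y1 - y0)
    return total


overlapping = [(0, 2, 0, 2), (1, 3, 1, 3)]
print(union_area(overlapping))  # 4 + 4 - 1 = 7
```

Degenerate rectangles (line segments and points) contribute nothing, which matches the convention that they have area zero.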

4.1.2 General region in R2


Now consider a general region $\Omega$ in $\mathbb{R}^2$. In primary school, we probably learned how to find its approximate area by counting the number of little squares in a grid. The rigorous definition of the area of $\Omega$ is in fact motivated by the idea of counting squares. We approximate $\Omega$ by simple regions from inside and also from outside (such as $S$ and $T$ in Figure 4.1).

Figure 4.1: Approximation of $\Omega$ by simple regions $S$ and $T$

As $S$ and $T$ are simple regions, it makes sense to talk about their areas $A(S)$ and $A(T)$. It is then sensible to expect that the area of $\Omega$ should be bounded between $A(S)$ and $A(T)$. As we approximate $\Omega$ by a pair of “closer”, more “refined” simple regions $S$ and $T$, we expect the “area” of $\Omega$ should be “something like” $\lim_{S \to \Omega^-} A(S)$ and $\lim_{T \to \Omega^+} A(T)$. We have learned about limits of sequences and of functions, but we have never learned about limits of sets.
Instead of using “limits”, we define the area of $\Omega$ by taking it to be the “best” approximated area by simple regions. If we approximate $\Omega$ by simple regions from inside (such as $S$), then the “best” means the maximum possible $A(S)$ among all possible simple regions $S \subset \Omega$. Similarly, if we approximate $\Omega$ by simple regions from outside (such as $T$), then the “best” means the minimum possible $A(T)$ among all simple regions $T \supset \Omega$. That is exactly what Jordan measure means. Precisely, we have:
Definition 4.2 — Jordan Measure in $\mathbb{R}^2$. Let $\Omega \subset \mathbb{R}^2$ be a non-empty bounded set. We define its inner Jordan measure $\mu_*(\Omega)$ and outer Jordan measure $\mu^*(\Omega)$ to be:
\[ \mu_*(\Omega) := \sup\{A(S) : S \subset \Omega \text{ and } S \text{ is a simple region in } \mathbb{R}^2\} \]
\[ \mu^*(\Omega) := \inf\{A(T) : T \supset \Omega \text{ and } T \text{ is a simple region in } \mathbb{R}^2\} \]
(Reference: Figure 4.2).

If $\mu_*(\Omega) = \mu^*(\Omega)$, then we say $\Omega$ is Jordan measurable, and define its Jordan measure $\mu(\Omega)$ by taking $\mu(\Omega) := \mu^*(\Omega)$.

Remark. Whenever $S$ and $T$ are simple regions such that $S \subset \Omega \subset T$, it is intuitively clear (but not easy to prove) that $A(S) \le A(T)$. Therefore, one must have $\mu_*(\Omega) \le \mu^*(\Omega)$.

Remark. Clearly, if $\mu^*(\Omega) = 0$, then $\Omega$ must be Jordan measurable and $\mu(\Omega) = 0$.
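
To get a feel for the inner and outer approximations, here is a minimal numerical sketch of the “counting squares” idea for the closed unit disc. It is an illustration only and proves nothing about measurability; the function name `disc_grid_areas` and the choice of a uniform grid with cell side $1/n$ are assumptions made for this example. Cells lying entirely inside the disc form an inner simple region, and cells that meet the disc form an outer simple region.

```python
import math
from fractions import Fraction


def disc_grid_areas(n):
    """Areas of inner/outer grid approximations of the closed unit disc.

    Cell (i, j) is the square [i/n, (i+1)/n] x [j/n, (j+1)/n]. All tests are
    done in exact integer arithmetic after scaling lengths by n.
    """
    def nearest_sq(i):
        # Squared distance from 0 to the interval [i, i+1].
        return 0 if i in (-1, 0) else min(i * i, (i + 1) * (i + 1))

    def farthest_sq(i):
        # Squared distance from 0 to the farthest endpoint of [i, i+1].
        return max(i * i, (i + 1) * (i + 1))

    inner = outer = 0
    for i in range(-n, n):
        for j in range(-n, n):
            if farthest_sq(i) + farthest_sq(j) <= n * n:
                inner += 1  # the whole cell lies inside the disc
            if nearest_sq(i) + nearest_sq(j) <= n * n:
                outer += 1  # the cell meets the disc
    return Fraction(inner, n * n), Fraction(outer, n * n)


inner_area, outer_area = disc_grid_areas(100)
print(float(inner_area), math.pi, float(outer_area))
```

As $n$ grows, the two printed bounds close in on $\pi$ from below and from above.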

Let’s first get a sense of Jordan measure by some elementary examples:

Proposition 4.3 Any simple region $E$ in $\mathbb{R}^2$ is Jordan measurable, and $\mu(E) = A(E)$.

Proof. Since $E$ is a simple region, $E$ satisfies the criterion:

$E \subset E$ and $E$ is a simple region in $\mathbb{R}^2$

in the definition of $\mu_*(E)$. Therefore, $A(E)$ belongs to the set:
\[ \{A(S) : S \subset E \text{ and } S \text{ is a simple region in } \mathbb{R}^2\}. \]
An element belonging to a set must be bounded above by its supremum (which is one of its upper bounds), so we have:
\[ A(E) \le \underbrace{\sup\{A(S) : S \subset E \text{ and } S \text{ is a simple region in } \mathbb{R}^2\}}_{A(E) \text{ belongs to this set}} = \mu_*(E). \]

Figure 4.2: Outer and Inner Jordan Measures

The same argument, mutatis mutandis, proves that
\[ A(E) \ge \underbrace{\inf\{A(T) : T \supset E \text{ and } T \text{ is a simple region in } \mathbb{R}^2\}}_{A(E) \text{ belongs to this set}} = \mu^*(E). \]
In conclusion, we have proved:
\[ A(E) \le \mu_*(E) \le \mu^*(E) \le A(E). \]
Therefore, the only possibility is that they are all equal to each other. This proves that $E$ is Jordan measurable and $\mu(E) = \mu^*(E) = A(E)$. ∎

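■ Exercise 4.1 Let $X$ and $Y$ be non-empty bounded sets in $\mathbb{R}$ such that $X \subset Y$. Show that $\sup X \le \sup Y$ and $\inf X \ge \inf Y$. Hence, show that if $\Omega_1$, $\Omega_2$ are bounded regions in $\mathbb{R}^2$ such that $\Omega_1 \subset \Omega_2$, then we have $\mu_*(\Omega_1) \le \mu_*(\Omega_2)$ and $\mu^*(\Omega_1) \le \mu^*(\Omega_2)$.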

The Jordan measure of a simple region can be found directly from the definition. For a general region $\Omega$, we typically calculate its Jordan measure by taking a pair of sequences $\{S_n\}$ and $\{T_n\}$ of simple regions, with $S_n \subset \Omega$ and $T_n \supset \Omega$ for any $n$, such that $\lim_{n\to\infty} A(S_n) = \lim_{n\to\infty} A(T_n)$. Let’s look at the example of a parallelogram:

■ Example 4.1 Consider the parallelogram $\Omega$ with vertices

$O(0, 0)$, $A(b, 0)$, $B(b + c, h)$, $C(c, h)$

where $b, c, h > 0$, i.e. the parallelogram has base length $b$ and height $h$. Show that $\Omega$ is Jordan measurable and $\mu(\Omega) = bh$.

■ Solution For each $n \in \mathbb{N}$, we define the simple region:
\[ T_n := \bigcup_{k=1}^{n} \left[\frac{(k-1)c}{n}, \frac{kc}{n} + b\right] \times \left[\frac{(k-1)h}{n}, \frac{kh}{n}\right]. \]
See Figure 4.3 for the illustration.



Figure 4.3: Parallelogram and its simple region approximations

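The simple region $T_n$ contains the parallelogram $\Omega$, so by the definition of outer Jordan measure (which is the infimum of the areas of all outer simple regions), we have
\[ \mu^*(\Omega) \le A(T_n) = n \cdot \left(b + \frac{c}{n}\right) \cdot \frac{h}{n} = h\left(b + \frac{c}{n}\right). \]
Similarly, for $n$ large enough that $\frac{c}{n} < b$, one can also construct an inner simple region $S_n$:
\[ S_n := \bigcup_{k=1}^{n} \left[\frac{kc}{n}, \frac{(k-1)c}{n} + b\right] \times \left[\frac{(k-1)h}{n}, \frac{kh}{n}\right]. \]
Then, we have
\[ \mu_*(\Omega) \ge A(S_n) = n \cdot \left(b - \frac{c}{n}\right) \cdot \frac{h}{n} = h\left(b - \frac{c}{n}\right). \]
In summary, we have proved that for each sufficiently large $n \in \mathbb{N}$,
\[ h\left(b - \frac{c}{n}\right) \le \mu_*(\Omega) \le \mu^*(\Omega) \le h\left(b + \frac{c}{n}\right). \]
Letting $n \to \infty$, we conclude that:
\[ hb \le \mu_*(\Omega) \le \mu^*(\Omega) \le hb, \]
and therefore $\mu_*(\Omega) = \mu^*(\Omega) = hb$, so $\Omega$ is Jordan measurable and $\mu(\Omega) = hb$.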
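
The squeeze in this example can be checked numerically. The following is a minimal sketch under the assumption that the strips are exactly those in the definitions of $S_n$ and $T_n$; the function name `parallelogram_bounds` is hypothetical. It adds up the strip areas in exact rational arithmetic and returns the pair $(A(S_n), A(T_n))$.

```python
from fractions import Fraction


def parallelogram_bounds(b, c, h, n):
    """Return (A(S_n), A(T_n)) for the parallelogram with base b, shift c, height h."""
    b, c, h = Fraction(b), Fraction(c), Fraction(h)
    inner = outer = Fraction(0)
    for k in range(1, n + 1):
        strip_height = k * h / n - (k - 1) * h / n
        # Outer strip: [(k-1)c/n, kc/n + b]; inner strip: [kc/n, (k-1)c/n + b].
        outer_width = (k * c / n + b) - (k - 1) * c / n
        inner_width = ((k - 1) * c / n + b) - k * c / n
        outer += outer_width * strip_height
        # For small n the inner strip may be empty; it then contributes nothing.
        inner += max(inner_width, Fraction(0)) * strip_height
    return inner, outer


for n in (4, 40, 400):
    print(n, parallelogram_bounds(3, 2, 1, n))
```

With $b = 3$, $c = 2$, $h = 1$ both bounds approach $bh = 3$, and their gap is exactly $2hc/n$.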

■ Exercise 4.2 Show that any straight line segment has Jordan measure zero. Note that the line segment need not be horizontal or vertical.

■ Exercise 4.3 Show that any right-angled triangle with one side vertical and one side horizontal is Jordan measurable and its Jordan measure is given by $\frac{1}{2} \times \text{base} \times \text{height}$.

In general, if there exist sequences of inner simple regions $\{S_n\}$ and outer simple regions $\{T_n\}$ such that $A(T_n)$ and $A(S_n)$ converge to the same limit, we can conclude that the region is Jordan measurable. In fact, the converse is also true. Let’s state it as a proposition:

Proposition 4.4 Let $\Omega$ be a non-empty bounded region in $\mathbb{R}^2$. Then the following are equivalent:
1. there exist sequences of inner simple regions $\{S_n\}$ and outer simple regions $\{T_n\}$ of $\Omega$ such that $\lim_{n\to\infty} A(S_n) = \lim_{n\to\infty} A(T_n) = m$.
2. $\Omega$ is Jordan measurable and $\mu(\Omega) = m$.

Proof. For (1) $\Longrightarrow$ (2), the proof is similar to the parallelogram example. By the definition of $\mu_*$ and $\mu^*$, we have
\[ A(S_n) \le \mu_*(\Omega) \le \mu^*(\Omega) \le A(T_n) \quad \forall n \in \mathbb{N}. \]
Letting $n \to \infty$ and using $\lim_{n\to\infty} A(S_n) = \lim_{n\to\infty} A(T_n) = m$, we conclude that $\mu_*(\Omega) = \mu^*(\Omega) = m$.
For (2) $\Longrightarrow$ (1), we recall that the definition of $\mu^*(\Omega)$ is given by
\[ \mu^*(\Omega) = \inf\{A(T) : T \supset \Omega \text{ and } T \text{ is a simple region in } \mathbb{R}^2\}. \]
“Infimum” means the greatest lower bound, so for any $n \in \mathbb{N}$, there must exist a simple region $T_n \supset \Omega$ such that
\[ \mu^*(\Omega) \le A(T_n) < \mu^*(\Omega) + \frac{1}{n}, \]
otherwise $\mu^*(\Omega) + \frac{1}{n}$ would also be a lower bound of the set
\[ \{A(T) : T \supset \Omega \text{ and } T \text{ is a simple region in } \mathbb{R}^2\}. \]
By the squeeze theorem, it is clear that
\[ \lim_{n\to\infty} A(T_n) = \mu^*(\Omega). \]
Similarly, for each $n \in \mathbb{N}$, there exists a simple region $S_n \subset \Omega$ such that
\[ \mu_*(\Omega) - \frac{1}{n} < A(S_n) \le \mu_*(\Omega). \]
Letting $n \to \infty$, we also have
\[ \lim_{n\to\infty} A(S_n) = \mu_*(\Omega). \]
Since $\Omega$ is Jordan measurable, we have
\[ \lim_{n\to\infty} A(T_n) = \mu^*(\Omega) = m = \mu_*(\Omega) = \lim_{n\to\infty} A(S_n). \]

■ Exercise 4.4 Prove using Proposition 4.4 that the trapezium $\Omega$ with vertices:

$O(0, 0)$, $A(a, 0)$, $B(b + c, h)$, $C(c, h)$

where $a, b, c, h > 0$, is Jordan measurable and $\mu(\Omega) = \frac{1}{2}(a + b)h$.

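■ Exercise 4.5 Prove that for a non-empty bounded region $\Omega$ in $\mathbb{R}^2$, the following is also equivalent to (1) and (2) in Proposition 4.4:
“$\forall \varepsilon > 0$, $\exists$ simple regions $S \subset \Omega$ and $T \supset \Omega$ such that $A(T \setminus S) < \varepsilon$.”
[Hint: You can use the fact that if $X \subset Y \subset Z$ are all simple regions, then $Z \setminus Y$, $Z \setminus X$ and $Y \setminus X$ are simple regions too, with $A(Z \setminus Y) \le A(Z \setminus X)$ and $A(Y \setminus X) \le A(Z \setminus X)$.]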

4.1.3 Finite additivity and isometric invariance


Next we prove two good properties of Jordan measure:
• If $\Omega_1$ and $\Omega_2$ are two disjoint bounded Jordan measurable regions in $\mathbb{R}^2$, then so is $\Omega_1 \cup \Omega_2$ and $\mu(\Omega_1 \cup \Omega_2) = \mu(\Omega_1) + \mu(\Omega_2)$.
• If $\Phi : \mathbb{R}^2 \to \mathbb{R}^2$ is a distance-preserving map (e.g. rotations, reflections, translations and their compositions), then $\mu_*(\Phi(\Omega)) = \mu_*(\Omega)$ and $\mu^*(\Phi(\Omega)) = \mu^*(\Omega)$ for any bounded region $\Omega$ in $\mathbb{R}^2$.
These sound intuitive, but they are not obvious from the definition of Jordan measure (which involves sup and inf).

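Proposition 4.5 — Finite Additivity. Suppose $\Omega_1$ and $\Omega_2$ are two bounded Jordan measurable regions in $\mathbb{R}^2$, then so is $\Omega_1 \cup \Omega_2$. Furthermore, if $\Omega_1 \cap \Omega_2 = \emptyset$, then $\mu(\Omega_1 \cup \Omega_2) = \mu(\Omega_1) + \mu(\Omega_2)$.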

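Proof. We use Proposition 4.4. Given that $\Omega_1$ and $\Omega_2$ are Jordan measurable, there exist sequences of inner simple regions $\{S_n^{(i)}\}_{n=1}^{\infty}$ and outer simple regions $\{T_n^{(i)}\}_{n=1}^{\infty}$, where $i = 1, 2$, such that $S_n^{(i)} \subset \Omega_i \subset T_n^{(i)}$ for each $n \in \mathbb{N}$ and $i \in \{1, 2\}$, and
\[ \lim_{n\to\infty} A\big(S_n^{(i)}\big) = \lim_{n\to\infty} A\big(T_n^{(i)}\big) = \mu(\Omega_i). \]
From elementary set theory, we know
\[ \big(T_n^{(1)} \cup T_n^{(2)}\big) \setminus \big(S_n^{(1)} \cup S_n^{(2)}\big) \subset \big(T_n^{(1)} \setminus S_n^{(1)}\big) \cup \big(T_n^{(2)} \setminus S_n^{(2)}\big). \]
Therefore, we get
\[ A\Big(\big(T_n^{(1)} \cup T_n^{(2)}\big) \setminus \big(S_n^{(1)} \cup S_n^{(2)}\big)\Big) \le A\big(T_n^{(1)} \setminus S_n^{(1)}\big) + A\big(T_n^{(2)} \setminus S_n^{(2)}\big). \]
For two simple regions $X \subset Y$, it is intuitive (yet tricky to prove) that $0 \le A(Y \setminus X) = A(Y) - A(X)$, so it follows that
\[ 0 \le A\big(T_n^{(1)} \cup T_n^{(2)}\big) - A\big(S_n^{(1)} \cup S_n^{(2)}\big) \le \underbrace{A\big(T_n^{(1)}\big) - A\big(S_n^{(1)}\big)}_{\to 0} + \underbrace{A\big(T_n^{(2)}\big) - A\big(S_n^{(2)}\big)}_{\to 0}. \]
By the squeeze theorem, $A\big(T_n^{(1)} \cup T_n^{(2)}\big) - A\big(S_n^{(1)} \cup S_n^{(2)}\big) \to 0$. Note that $T_n^{(1)} \cup T_n^{(2)}$ and $S_n^{(1)} \cup S_n^{(2)}$ are simple regions such that $S_n^{(1)} \cup S_n^{(2)} \subset \Omega_1 \cup \Omega_2 \subset T_n^{(1)} \cup T_n^{(2)}$, so
\[ A\big(S_n^{(1)} \cup S_n^{(2)}\big) \le \mu_*(\Omega_1 \cup \Omega_2) \le \mu^*(\Omega_1 \cup \Omega_2) \le A\big(T_n^{(1)} \cup T_n^{(2)}\big). \]
Hence both sequences of areas converge to the same limit:
\[ \lim_{n\to\infty} A\big(T_n^{(1)} \cup T_n^{(2)}\big) = \lim_{n\to\infty} A\big(S_n^{(1)} \cup S_n^{(2)}\big), \]
and we conclude by Proposition 4.4 that $\Omega_1 \cup \Omega_2$ is Jordan measurable with
\[ \mu(\Omega_1 \cup \Omega_2) = \lim_{n\to\infty} A\big(S_n^{(1)} \cup S_n^{(2)}\big). \]
To find $\mu(\Omega_1 \cup \Omega_2)$ when $\Omega_1 \cap \Omega_2 = \emptyset$, we observe that $S_n^{(1)} \cap S_n^{(2)} = \emptyset$ for any $n$ (while this is not true for the outer simple regions). Therefore,
\[ \mu(\Omega_1 \cup \Omega_2) = \lim_{n\to\infty} A\big(S_n^{(1)} \cup S_n^{(2)}\big) = \lim_{n\to\infty} \Big(A\big(S_n^{(1)}\big) + A\big(S_n^{(2)}\big)\Big) = \mu(\Omega_1) + \mu(\Omega_2). \]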

■ Exercise 4.6 Prove by induction that for any finitely many bounded Jordan measurable regions $\Omega_1, \cdots, \Omega_N$ in $\mathbb{R}^2$, the union $\Omega_1 \cup \cdots \cup \Omega_N$ is also Jordan measurable. Also, if $\Omega_i \cap \Omega_j = \emptyset$ for any $i \ne j$, then
\[ \mu(\Omega_1 \cup \cdots \cup \Omega_N) = \mu(\Omega_1) + \cdots + \mu(\Omega_N). \]

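■ Exercise 4.7 Prove that if $\Omega_1$ and $\Omega_2$ are two bounded Jordan measurable regions in $\mathbb{R}^2$, then so are $\Omega_1 \cap \Omega_2$ and $\Omega_1 \setminus \Omega_2$.
1. Prove that $\mu(\Omega_1 \cup \Omega_2) = \mu(\Omega_1) + \mu(\Omega_2) - \mu(\Omega_1 \cap \Omega_2)$.
2. Assume further that $\Omega_1 \subset \Omega_2$, prove that $\mu(\Omega_2 \setminus \Omega_1) = \mu(\Omega_2) - \mu(\Omega_1)$.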

In Exercise 4.3, we proved that the area of a triangle with its base being horizontal and height being vertical is given by:
\[ \frac{1}{2} \times \text{base} \times \text{height}. \]
Then, how about a general triangle? Using finite additivity, this formula can be extended to triangles with one of the sides being horizontal by considering the following diagrams:

For a general triangle, we need to prove that the Jordan measure is invariant under isometries
(a.k.a. distance-preserving maps) such as rotations, reflections, translations. It suffices to prove
the measure of a rectangle is invariant under isometries. According to the diagram below, finite
additivity of Jordan measures and Exercise 4.3, one can show that the measure of any rectangle
is always base times height. The base and height are preserved under isometry, so is the measure
of the rectangle.

Jordan measure of the yellow rectangle
\[ = \underbrace{(h\cos\theta + b\sin\theta)(h\sin\theta + b\cos\theta)}_{\text{Jordan measure of the blue rectangle}} - 2 \cdot \frac{1}{2} h^2 \sin\theta\cos\theta - 2 \cdot \frac{1}{2} b^2 \sin\theta\cos\theta = hb(\cos^2\theta + \sin^2\theta) = hb. \]
Now the invariance under isometry of Jordan measures can be extended to general Jordan measurable regions in $\mathbb{R}^2$:

Proposition 4.6 — Isometric Invariance. Let $\Phi : \mathbb{R}^2 \to \mathbb{R}^2$ be a distance-preserving map, and $\Omega$ be a bounded Jordan measurable region in $\mathbb{R}^2$. Then $\Phi(\Omega)$ is also Jordan measurable and $\mu(\Phi(\Omega)) = \mu(\Omega)$.

Proof. As per the above discussion, the measure of a rectangle is preserved under such a map $\Phi$, so the measure of simple regions (which are disjoint unions of rectangles) is also preserved. Take a sequence of outer simple regions $\{T_n\}$ with $\Omega \subset T_n$ for any $n$, and $\lim_{n\to\infty} A(T_n) = \mu^*(\Omega)$. Then, by $\Phi(\Omega) \subset \Phi(T_n)$, we have from Exercise 4.1
\[ \mu^*(\Phi(\Omega)) \le \mu^*(\Phi(T_n)) = \mu^*(T_n) = A(T_n). \]
Similarly, take a sequence $\{S_n\}$ of inner simple regions (i.e. $S_n \subset \Omega$) such that $A(S_n) \to \mu_*(\Omega)$ as $n \to \infty$. By Exercise 4.1 and $\Phi(S_n) \subset \Phi(\Omega)$, we also have
\[ A(S_n) = \mu_*(\Phi(S_n)) \le \mu_*(\Phi(\Omega)). \]
To summarize, we have
\[ A(S_n) \le \mu_*(\Phi(\Omega)) \le \mu^*(\Phi(\Omega)) \le A(T_n) \quad \forall n \in \mathbb{N}. \]
Letting $n \to \infty$, we have proved:
\[ \mu_*(\Omega) \le \mu_*(\Phi(\Omega)) \le \mu^*(\Phi(\Omega)) \le \mu^*(\Omega). \]
Since $\Omega$ is Jordan measurable, the above is in fact a chain of equalities. This proves our desired results.

Using finite additivity and isometric invariance, we can find the Jordan measure of polygonal figures by splitting them into disjoint triangles, rectangles, etc.

■ Exercise 4.8 Suppose $\Omega \subset \mathbb{R}^2$ is a bounded region such that there exist sequences $\{E_n\}$ and $\{F_n\}$ of bounded Jordan measurable sets with $E_n \subset \Omega \subset F_n$ for any $n$, and
\[ \lim_{n\to\infty} \mu(E_n) = \lim_{n\to\infty} \mu(F_n) = m. \]
Show that $\Omega$ is also Jordan measurable and $\mu(\Omega) = m$.

■ Exercise 4.9 Using the previous exercises, show that a circle with radius $r$ is Jordan measurable and has measure $\pi r^2$. [Hint: take $E_n$’s and $F_n$’s to be regular polygons.]

4.1.4 Examples of non-measurable sets


There do exist some “strange” sets in $\mathbb{R}^2$ which are not Jordan measurable. Here is one example:
\[ \Omega := \{(x, y) : x, y \in \mathbb{Q},\ 0 \le x, y \le 1\}. \]
Any inner simple region $S$ contained inside $\Omega$ must be a finite set of points, since the only “rectangles” contained inside $\Omega$ are single points. This shows $\mu_*(\Omega) = 0$.
However, for any outer simple region $T$ containing $\Omega$, we claim that the closure $\overline{T}$ (i.e. the union of $T$ and its boundary) contains $[0, 1] \times [0, 1]$. This follows from the density of the rational numbers. For any $(a, b) \in [0, 1] \times [0, 1]$ one can take sequences $x_n \in \mathbb{Q} \cap [0,1]$ and $y_n \in \mathbb{Q} \cap [0,1]$ with $x_n \to a$ and $y_n \to b$ as $n \to \infty$. Then $(x_n, y_n) \in \Omega \subset T$ for any $n$. By the order rule, the limit $(a, b)$ of the points $(x_n, y_n)$ must be in $T$ or on its boundary. Therefore, $[0, 1] \times [0, 1] \subset \overline{T}$, so $A(T) = A(\overline{T}) \ge 1$. This shows that
\[ \mu^*(\Omega) = \inf \underbrace{\{A(T) : T \supset \Omega \text{ and } T \text{ is a simple region in } \mathbb{R}^2\}}_{\text{all elements} \ge 1} \ge 1. \]
Therefore, $\mu_*(\Omega) \ne \mu^*(\Omega)$. The set $\Omega$ is not Jordan measurable.

■ Exercise 4.10 Show that $(\mathbb{Q} \cap [0, 1]) \times [0, 1]$ is not Jordan measurable.

In MATH 3033/3043, we will study an even more important type of measure called Lebesgue measure, which is an improved version of measure that makes these sets of rational points measurable. The Lebesgue measure also enjoys an even better additivity property called countable additivity.

4.2 Riemann Integrals


Given that we have rigorously defined the meaning of area in the previous section, we are now ready to introduce the definition of Riemann integrals. We will use Jordan measure. This approach is not originally due to Riemann, but to Orrin Frink, who related Jordan measure and Riemann integrals in his paper (see footnote 1) published in 1933.
Here we will exclusively discuss bounded functions defined on a closed and bounded interval $[a, b]$. If either one of the boundedness conditions is removed, the integral will be called an improper integral, which will be discussed later. Given such a function $f : [a, b] \to \mathbb{R}$, we define

\[ G^+_{[a,b]}(f) := \{(x, y) : a \le x \le b,\ f(x) \ge 0, \text{ and } 0 \le y \le f(x)\} \]
\[ G^-_{[a,b]}(f) := \{(x, y) : a \le x \le b,\ f(x) < 0, \text{ and } f(x) \le y \le 0\} \]
\[ G_{[a,b]}(f) := G^+_{[a,b]}(f) \cup G^-_{[a,b]}(f) \]

Figure 4.4: $G^+_{[a,b]}(f)$ and $G^-_{[a,b]}(f)$.

■ Exercise 4.11 Show that the following are equivalent:
1. Both $G^+_{[a,b]}(f)$ and $G^-_{[a,b]}(f)$ are Jordan measurable.
2. $G_{[a,b]}(f)$ is Jordan measurable.

Definition 4.3 — Riemann Integrals. Let $f : [a, b] \to \mathbb{R}$ be a bounded function, then we say $f$ is Riemann integrable on $[a, b]$ if and only if $G_{[a,b]}(f)$ is Jordan measurable. In this case, we define
\[ \int_a^b f(x)\,dx = \mu\big(G^+_{[a,b]}(f)\big) - \mu\big(G^-_{[a,b]}(f)\big). \]
See Figure 4.4.

Remark. We assign a negative value to the part below the $x$-axis. One reason for doing so is to guarantee that if $f(x) \le g(x) \le 0$ on $[a, b]$, we still have $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.

1 Orrin Frink, Jordan Measure and Riemann Integration, Annals of Mathematics, Second Series, Vol. 34, No. 3 (July 1933), pp. 518-526



4.2.1 Non-negative functions


Let’s discuss how we could determine whether $G^+_{[a,b]}(f)$ and $G^-_{[a,b]}(f)$ are Jordan measurable. For simplicity, we label them by $G^+$ and $G^-$ respectively.
We first consider bounded functions which are non-negative, so that we only need to consider $G^+$. We need to consider the outer simple regions and inner simple regions of $G^+$. One nice property of a region like $G^+$ is that any outer simple region containing $G^+$ can be shrunk to become a “bar chart” type region like the one below, with each vertical bar barely touching the graph $y = f(x)$:

Recall that the outer measure of $G^+$ is defined as
\[ \mu^*(G^+) := \inf\{A(T) : G^+ \subset T \text{ and } T \text{ is simple}\}. \]
For each simple region $T$ containing $G^+$ there is always a smaller “bar chart” region $T'$ (to be more precisely defined soon) with $G^+ \subset T' \subset T$, so the above infimum can be taken over all “bar chart” regions containing $G^+$ only. This is because dropping the non-bar-chart simple regions will not affect the infimum. Here is an analogy: say in a test, you know that your test score is higher than that of one of your friends, then you know that the lowest score is not yours!
Likewise, any inner simple region contained in $G^+$ can also be expanded to become an inner “bar chart” region like the one below:

The inner measure of $G^+$ is defined as
\[ \mu_*(G^+) := \sup\{A(S) : S \subset G^+ \text{ and } S \text{ is simple}\}. \]
By a similar rationale as for the outer measure, one can simply take the supremum over all “bar chart” regions contained in $G^+$ only.
To describe these “bar chart” regions in a more precise way, we can define a partition of $[a, b]$:
\[ P : a = x_0 < x_1 < \cdots < x_n = b, \]
and its associated outer and inner “bar chart” regions are respectively
\[ T_P := \bigcup_{i=1}^{n} [x_{i-1}, x_i] \times [0, M_i] \quad \text{where } M_i = \sup\{f(x) : x \in [x_{i-1}, x_i]\}, \]
\[ S_P := \bigcup_{i=1}^{n} [x_{i-1}, x_i] \times [0, m_i) \quad \text{where } m_i = \inf\{f(x) : x \in [x_{i-1}, x_i]\}. \]
The areas of $T_P$ and $S_P$ are often called respectively the upper Darboux sum and lower Darboux sum of $f$ with respect to the partition $P$, which are denoted by:
\[ U(P, f) := A(T_P) = \sum_{i=1}^{n} M_i (x_i - x_{i-1}), \]
\[ L(P, f) := A(S_P) = \sum_{i=1}^{n} m_i (x_i - x_{i-1}). \]
As discussed, the outer and inner measures of $G^+$ can be found by taking the inf and sup respectively over all “bar chart” regions (which can be described using partitions of $[a, b]$), so we have:
\[ \mu^*(G^+) = \inf\{U(P, f) : P \text{ is a partition of } [a, b]\}, \]
\[ \mu_*(G^+) = \sup\{L(P, f) : P \text{ is a partition of } [a, b]\}. \]

Definition 4.4 — Upper and Lower Darboux Integrals. Let $f : [a, b] \to \mathbb{R}$ be a bounded function (not necessarily non-negative), we define and denote the upper and lower Darboux integrals as:
\[ \overline{\int_a^b} f(x)\,dx := \inf\{U(P, f) : P \text{ is a partition of } [a, b]\} \]
\[ \underline{\int_a^b} f(x)\,dx := \sup\{L(P, f) : P \text{ is a partition of } [a, b]\} \]

The upper and lower Darboux integrals can be defined for any bounded function $f$ on $[a, b]$, not only for non-negative functions. But if $f$ is non-negative and bounded on $[a, b]$, we then have:
\[ \overline{\int_a^b} f(x)\,dx = \mu^*(G^+), \qquad \underline{\int_a^b} f(x)\,dx = \mu_*(G^+). \]

■ Example 4.2 Show that $f(x) = x^2$ is Riemann integrable on $[0, 1]$ and find $\int_0^1 x^2\,dx$ from the definition.

■ Solution Similar to proving that a region in $\mathbb{R}^2$ is Jordan measurable, we will construct a sequence of partitions $P_n$ of $[0, 1]$ such that $U(P_n, f)$ and $L(P_n, f)$ converge to the same limit.
For each $n \in \mathbb{N}$, consider the partition
\[ P_n : x_0 := 0 < \underbrace{\frac{1}{n}}_{x_1} < \underbrace{\frac{2}{n}}_{x_2} < \cdots < \underbrace{\frac{n-1}{n}}_{x_{n-1}} < 1 =: x_n. \]
Then, for any $i \in \{1, \cdots, n\}$, we have
\[ M_i := \sup_{x \in [\frac{i-1}{n}, \frac{i}{n}]} x^2 = \frac{i^2}{n^2}, \qquad m_i := \inf_{x \in [\frac{i-1}{n}, \frac{i}{n}]} x^2 = \frac{(i-1)^2}{n^2}. \]
Hence,
\[ U(P_n, f) = \sum_{i=1}^{n} M_i (x_i - x_{i-1}) = \sum_{i=1}^{n} \frac{i^2}{n^2} \cdot \frac{1}{n} = \frac{1}{n^3} \cdot \frac{n(n+1)(2n+1)}{6} = \frac{1}{6} \cdot \frac{n+1}{n} \cdot \frac{2n+1}{n}, \]
\[ L(P_n, f) = \sum_{i=1}^{n} m_i (x_i - x_{i-1}) = \sum_{i=0}^{n-1} \frac{i^2}{n^3} = \frac{1}{n^3} \cdot \frac{(n-1)n(2n-1)}{6} = \frac{1}{6} \cdot \frac{n-1}{n} \cdot \frac{2n-1}{n}. \]
It is easy to see that
\[ \lim_{n\to\infty} U(P_n, f) = \lim_{n\to\infty} L(P_n, f) = \frac{1}{3}. \]
By applying the squeeze theorem to the inequality:
\[ L(P_n, x^2) \le \underline{\int_0^1} x^2\,dx = \mu_*(G^+) \le \mu^*(G^+) = \overline{\int_0^1} x^2\,dx \le U(P_n, x^2), \quad \forall n \in \mathbb{N}, \]
we get
\[ \underline{\int_0^1} x^2\,dx = \overline{\int_0^1} x^2\,dx = \frac{1}{3}. \]
Hence, $x^2$ is integrable on $[0, 1]$ and we have
\[ \int_0^1 x^2\,dx = \frac{1}{3}. \]
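
The Darboux sums in this example are easy to evaluate exactly by machine. The sketch below is an illustration under a simplifying assumption: the function is increasing on each subinterval, so the infimum and supremum are attained at the left and right endpoints. The helper names `uniform_partition` and `darboux_sums_increasing` are hypothetical.

```python
from fractions import Fraction


def uniform_partition(a, b, n):
    """Partition points a = x_0 < x_1 < ... < x_n = b with equal spacing."""
    a, b = Fraction(a), Fraction(b)
    return [a + (b - a) * i / n for i in range(n + 1)]


def darboux_sums_increasing(f, points):
    """Return (L(P, f), U(P, f)) for an increasing function f.

    For an increasing f, inf f on [x0, x1] is f(x0) and sup f is f(x1).
    """
    lower = sum(f(x0) * (x1 - x0) for x0, x1 in zip(points, points[1:]))
    upper = sum(f(x1) * (x1 - x0) for x0, x1 in zip(points, points[1:]))
    return lower, upper


for n in (10, 100, 1000):
    lower, upper = darboux_sums_increasing(lambda x: x * x, uniform_partition(0, 1, n))
    print(n, float(lower), float(upper))
```

For $n = 10$ the sums are $57/200$ and $77/200$, in agreement with the closed formulas for $L(P_n, f)$ and $U(P_n, f)$, and both columns approach $1/3$.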

■ Exercise 4.12 Let $f : [a, b] \to \mathbb{R}$ be a non-negative (see note a), bounded function. Suppose there exists a sequence of partitions $P_n$ of $[a, b]$ such that
\[ \lim_{n\to\infty} U(P_n, f) = \lim_{n\to\infty} L(P_n, f) = I, \]
then $f$ is Riemann integrable on $[a, b]$ and $\int_a^b f(x)\,dx = I$.
a We will extend the result to any bounded function later.

■ Exercise 4.13 Show that $e^x$ is Riemann integrable on any closed and bounded interval $[a, b]$, and find $\int_a^b e^x\,dx$.

■ Exercise 4.14 The following classic formula was discovered by Jacob Bernoulli in 1713:
\[ 1^p + 2^p + \cdots + n^p = \frac{1}{p+1} \sum_{j=0}^{p} (-1)^j C^{p+1}_j B_j\, n^{p+1-j}, \quad p \in \mathbb{N} \]
where $B_j$’s are the so-called Bernoulli numbers given by:
\[ B_0 = 1, \quad B_1 = -\frac{1}{2}, \quad B_2 = \frac{1}{6}, \quad \cdots \]
The proof of the above formula can be found in some standard number theory or complex analysis textbooks. Using this formula without proof, show that $x^p$ (where $p \in \mathbb{N}$) is Riemann integrable on $[0, 1]$ and that:
\[ \int_0^1 x^p\,dx = \frac{1}{p+1}, \quad \text{where } p \text{ is a positive integer} \]
from the definition of integrals.

■ Exercise 4.15 First prove the formula:
\[ 2 \sin\frac{x}{2} \cdot (\sin x + \sin 2x + \cdots + \sin nx) = \cos\frac{x}{2} - \cos\left(n + \frac{1}{2}\right)x \]
for any $x \in \mathbb{R}$ and $n \in \mathbb{N}$. Hence, show that $\sin x$ is Riemann integrable on $[0, \pi]$, and find the value of
\[ \int_0^{\pi} \sin x\,dx. \]

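■ Example 4.3 Consider the function $f : [0, 1] \to \mathbb{R}$ by:
\[ f(x) = \begin{cases} 0 & \text{if } x \text{ is irrational} \\ 1 & \text{if } x = 0 \\ \frac{1}{n} & \text{if } x = \frac{m}{n} \in \mathbb{Q} \text{ in the most simplified form } (m, n \in \mathbb{N}) \end{cases} \]
For instance, we have $f(\frac{2}{3}) = \frac{1}{3}$, $f(\frac{8}{14}) = f(\frac{4}{7}) = \frac{1}{7}$. Show that $f$ is Riemann integrable on $[0, 1]$ and $\int_0^1 f(x)\,dx = 0$.

■ Solution Let $P_n$ be the partition $0 < \frac{1}{n} < \frac{2}{n} < \cdots < \frac{n-1}{n} < \frac{n}{n} = 1$ where $n \ge 5$ is prime. For any $i = 0, 1, \ldots, n-1$, the interval $[i/n, (i+1)/n]$ must contain at least one irrational number, so we must have:
\[ \inf_{[i/n, (i+1)/n]} f = 0. \]
This immediately shows $L(P_n, f) = 0$.

Next we estimate $U(P_n, f)$ from above. The rational numbers $r$ in $[0, 1]$ that give the largest outputs $f(r)$ are given in descending order of $f(r)$ by:
\[ \{r_j\}_{j=1}^{\infty} = \left\{0, 1, \frac{1}{2}, \frac{1}{3}, \frac{2}{3}, \frac{1}{4}, \frac{3}{4}, \frac{1}{5}, \frac{2}{5}, \frac{3}{5}, \frac{4}{5}, \cdots\right\} \]
as the outputs are:
\[ \{f(r_j)\}_{j=1}^{\infty} = \left\{1, 1, \frac{1}{2}, \frac{1}{3}, \frac{1}{3}, \frac{1}{4}, \frac{1}{4}, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}, \cdots\right\} \]
The worst scenario for $U(P_n, f)$ is that there is exactly one of $\{r_1, r_2, \cdots, r_n\}$ in each $[i/n, (i+1)/n]$, $i = 0, 1, \cdots, n-1$, and so
\[ U(P_n, f) \le \frac{1}{n}\big(f(r_1) + \cdots + f(r_n)\big). \]
Here is why we want $n$ to be a prime: note that when $n \ge 5$ is a prime, it is impossible for $\frac{i}{n} = r_j$ for any $i = 1, \cdots, n-1$ and $j = 1, 2, \cdots, n$, because $\frac{i}{n}$ is then in lowest terms with denominator $n$, whereas $r_1, \cdots, r_n$ all have denominators smaller than $n$. That prevents any $r_j$, where $1 \le j \le n$, from being contained in both $[(i-1)/n, i/n]$ and $[i/n, (i+1)/n]$.

Let $k = k(n)$ be the denominator of $r_n$, i.e. $k$ is the unique integer such that:
\[ 1 + \varphi(1) + \varphi(2) + \cdots + \varphi(k-1) \le n < 1 + \varphi(1) + \varphi(2) + \cdots + \varphi(k), \]
where $\varphi(j)$ is the number of positive integers not exceeding $j$ that are coprime to $j$. Then, we have
\[ U(P_n, f) \le \frac{1}{n}\left(1 + \frac{\varphi(1)}{1} + \frac{\varphi(2)}{2} + \cdots + \frac{\varphi(k)}{k}\right) \le \frac{1 + \frac{\varphi(1)}{1} + \frac{\varphi(2)}{2} + \cdots + \frac{\varphi(k)}{k}}{1 + \varphi(1) + \varphi(2) + \cdots + \varphi(k-1)}. \]
From number theory, we have the following results:
\[ \frac{\varphi(1)}{1} + \frac{\varphi(2)}{2} + \cdots + \frac{\varphi(k)}{k} = \frac{6k}{\pi^2} + O\big((\log k)^{2/3} (\log\log k)^{4/3}\big), \]
\[ \varphi(1) + \varphi(2) + \cdots + \varphi(k-1) = \frac{3(k-1)^2}{\pi^2} + O\big((k-1)(\log(k-1))^{2/3} (\log\log(k-1))^{4/3}\big). \]
Using these asymptotics, one can easily show that
\[ \lim_{n\to\infty} U(P_n, f) = 0, \]
as the upper bound for $U(P_n, f)$ is of order $\frac{1}{k}$, and $k = k(n) \to \infty$ as $n \to \infty$.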
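
For a concrete check of this example, the upper Darboux sums can be computed exactly for small primes $n$. The sketch below is only an illustration and is not the estimate used in the solution: it assumes the observation that the supremum of $f$ on $[i/n, (i+1)/n]$ equals $1/q$, where $q$ is the smallest denominator of a rational number in that closed interval (with $0 = 0/1$ and $1 = 1/1$, matching $f(0) = f(1) = 1$). The function name `thomae_upper_sum` is hypothetical.

```python
from fractions import Fraction


def thomae_upper_sum(n):
    """U(P_n, f) for the uniform partition of [0, 1] into n subintervals."""
    total = Fraction(0)
    for i in range(n):
        # Find the smallest q such that some p/q lies in [i/n, (i+1)/n],
        # i.e. ceil(i*q/n) <= floor((i+1)*q/n).
        q = 1
        while -(-i * q // n) > ((i + 1) * q) // n:
            q += 1
        total += Fraction(1, q) * Fraction(1, n)
    return total


for n in (5, 11, 101, 1009):
    print(n, float(thomae_upper_sum(n)))
```

For $n = 5$ the suprema on the five subintervals are $1, \frac{1}{3}, \frac{1}{2}, \frac{1}{3}, 1$, giving $U(P_5, f) = \frac{19}{30}$; the printed values decrease slowly as $n$ grows, consistent with $U(P_n, f) \to 0$.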

4.2.2 Non-Riemann integrable function: an example


The function below can be shown to be not Riemann integrable on $[0, 1]$:
\[ \chi_{\mathbb{Q}}(x) := \begin{cases} 1 & \text{if } x \in \mathbb{Q} \\ 0 & \text{otherwise} \end{cases}. \]
To prove this, we consider an arbitrary partition $P$ of $[0, 1]$ with partition points denoted by $x_i$’s. As each closed interval (with positive length) contains a rational number, we have for any $i$
\[ \sup_{[x_{i-1}, x_i]} \chi_{\mathbb{Q}} = 1, \]
which implies
\[ U(P, \chi_{\mathbb{Q}}) = \sum_{i=1}^{n} (x_i - x_{i-1}) = x_n - x_0 = 1 - 0 = 1. \]
However, each closed interval $[x_{i-1}, x_i]$ must also contain an irrational number, so for any $i$ we also have
\[ \inf_{[x_{i-1}, x_i]} \chi_{\mathbb{Q}} = 0, \]
and so $L(P, \chi_{\mathbb{Q}}) = 0$.
This proves
\[ \mu^*\big(G^+(\chi_{\mathbb{Q}})\big) = \overline{\int_0^1} \chi_{\mathbb{Q}}(x)\,dx = \inf\{U(P, \chi_{\mathbb{Q}}) : P \text{ is a partition of } [0, 1]\} = 1 \]
while
\[ \mu_*\big(G^+(\chi_{\mathbb{Q}})\big) = \underline{\int_0^1} \chi_{\mathbb{Q}}(x)\,dx = \sup\{L(P, \chi_{\mathbb{Q}}) : P \text{ is a partition of } [0, 1]\} = 0. \]
Therefore, $\chi_{\mathbb{Q}}$ is not Riemann integrable on $[0, 1]$.

$\int_0^1 \chi_{\mathbb{Q}}(x)\,dx$ is therefore undefined in Riemann’s sense. However, in MATH 3033/3043, we will introduce a more refined type of integrals, called Lebesgue integrals, that would allow us to integrate $\chi_{\mathbb{Q}}$ over $[0, 1]$ in some other sense.

4.2.3 General bounded functions


Now we discuss the definition of the Riemann integral for general bounded functions on $[a, b]$ which are not necessarily non-negative. It is not simply a matter of repeating our treatment for $G^+$ and applying a similar rationale to $G^-$, because the regions $G^+$ and $G^-$ may be too complicated for a general function. Consider the function
\[ f(x) = \begin{cases} \sin\frac{1}{x} & \text{if } x > 0 \\ 0 & \text{if } x = 0 \end{cases}. \]
Both $G^+$ and $G^-$ consist of infinitely many disjoint pieces (try to sketch a graph to see this!).
Consider $f : [a, b] \to \mathbb{R}$ which is bounded, so that one can make sense of $\inf_{x \in [a,b]} f(x) =: m$. Here we assume $m < 0$, otherwise the function $f$ is non-negative, a case we have discussed before. Note that then $f(x) - m \ge 0$ for any $x \in [a, b]$. One good observation is that
\[ \mu\big(G^+(f)\big) - \mu\big(G^-(f)\big) = \mu\big(G^+(f - m)\big) + m(b - a). \tag{4.1} \]
From now on we will abbreviate $G^+_{[a,b]}$ and $G^-_{[a,b]}$ by $G^+$ and $G^-$ if the interval involved is clear from the context.

Equation (4.1) can be proved by first using the translation invariance of Jordan measure, so that
\[ \mu\big(G^+(f - m)\big) = \mu\big(\underbrace{\{(x, y) : a \le x \le b,\ m \le y \le f(x)\}}_{G^+(f - m) + m \,=:\, \Omega}\big). \]
Note that $\Omega = G^+(f) \sqcup \big(([a, b] \times [m, 0)) \setminus G^-(f)\big)$, so we have
\[ \mu(\Omega) = \mu\big(G^+(f)\big) + |m|(b - a) - \mu\big(G^-(f)\big), \]
and so (4.1) follows. Note that $m < 0$ in our case.

As the Riemann integral $\int_a^b f(x)\,dx$ is defined as $\mu\big(G^+(f)\big) - \mu\big(G^-(f)\big)$, and (4.1) relates this integral to $\int_a^b \big(f(x) - m\big)\,dx$ where $f(x) - m \ge 0$, we can carry over many results we proved for non-negative bounded functions to general bounded functions via (4.1).
Given any partition $P : a = x_0 < x_1 < \cdots < x_n = b$, we define just as in the non-negative case the upper and lower Darboux sums by:
\[ U(P, f) := \sum_{i=1}^{n} \sup_{[x_{i-1}, x_i]} f \cdot (x_i - x_{i-1}), \]
\[ L(P, f) := \sum_{i=1}^{n} \inf_{[x_{i-1}, x_i]} f \cdot (x_i - x_{i-1}). \]
Then, by observing that
\[ \sup_I (f - m) = \big(\sup_I f\big) - m \quad \text{and} \quad \inf_I (f - m) = \big(\inf_I f\big) - m, \]
we can easily deduce that
\[ U(P, f - m) = \sum_{i=1}^{n} \Big(\sup_{[x_{i-1}, x_i]} f - m\Big)(x_i - x_{i-1}) = U(P, f) - m \sum_{i=1}^{n} (x_i - x_{i-1}) = U(P, f) - m(b - a), \]
\[ L(P, f - m) = \sum_{i=1}^{n} \Big(\inf_{[x_{i-1}, x_i]} f - m\Big)(x_i - x_{i-1}) = L(P, f) - m \sum_{i=1}^{n} (x_i - x_{i-1}) = L(P, f) - m(b - a). \]

Using (4.1) and the above relations, one can then extend the result of Exercise 4.12 to general bounded functions:
Proposition 4.7 Let $f : [a, b] \to \mathbb{R}$ be a bounded function. Suppose there exists a sequence of partitions $\{P_n\}_{n=1}^{\infty}$ of $[a, b]$ such that
\[ \lim_{n\to\infty} U(P_n, f) = \lim_{n\to\infty} L(P_n, f) = I, \]
then $f$ is Riemann integrable and $\int_a^b f(x)\,dx = I$.

Proof. Denote $m = \inf_{[a,b]} f$. Recall that
\[ U(P_n, f - m) = U(P_n, f) - m(b - a) \quad \text{and} \quad L(P_n, f - m) = L(P_n, f) - m(b - a), \]
so we have
\[ \lim_{n\to\infty} U(P_n, f - m) = \lim_{n\to\infty} L(P_n, f - m) = I - m(b - a). \]
Note that $f(x) - m \ge 0$ for any $x \in [a, b]$, so by Exercise 4.12 we conclude that $f(x) - m$ is Riemann integrable on $[a, b]$, and so $G^+(f - m)$ is Jordan measurable and we have
\[ \int_a^b \big(f(x) - m\big)\,dx = \mu\big(G^+(f - m)\big) = I - m(b - a). \]

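By translational invariance, $\Omega := G^+(f - m) + m$ (the translate of $G^+(f - m)$ by $m$ in the $y$-direction) is also Jordan measurable with $\mu(\Omega) = I - m(b - a)$. Then, $G^+(f) = \Omega \cap ([a, b] \times [0, \sup f])$ is also Jordan measurable (here $\sup f$ means $\sup_{[a,b]} f$). Likewise $G^-(f)$, which differs from $([a, b] \times [m, 0]) \setminus \Omega$ only by points on the graph of $f$, is also Jordan measurable (one can check that this part of the graph has Jordan measure zero, and then use Exercise 4.7). This shows, by the definition of Riemann integrals, that $f$ is Riemann integrable on $[a, b]$. As for the value of the integral, we can use (4.1) to prove that: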
\[ \int_a^b f(x)\,dx = \mu\big(G^+(f)\big) - \mu\big(G^-(f)\big) = \mu\big(G^+(f - m)\big) + m(b - a) = I. \]



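■ Example 4.4 Show that $x^3$ is Riemann integrable on $[-1, \sqrt{2}]$, and find the value of the integral over $[-1, \sqrt{2}]$.

■ Solution Here we choose non-uniform partitions so that we can always make 0 one of the partition points. For each $n \in \mathbb{N}$, we define
\[ P_n : -1 < -1 + \frac{1}{n} < -1 + \frac{2}{n} < \cdots < -1 + \frac{n-1}{n} < 0 < \frac{1}{n}\sqrt{2} < \frac{2}{n}\sqrt{2} < \cdots < \frac{n-1}{n}\sqrt{2} < \sqrt{2}. \]
One can then compute that
\[ U(P_n, x^3) = \frac{1}{n}\left(\Big(-1 + \frac{1}{n}\Big)^3 + \Big(-1 + \frac{2}{n}\Big)^3 + \cdots + \Big(-1 + \frac{n-1}{n}\Big)^3 + 0^3\right) + \frac{\sqrt{2}}{n}\left(\Big(\frac{1}{n}\sqrt{2}\Big)^3 + \Big(\frac{2}{n}\sqrt{2}\Big)^3 + \cdots + \Big(\frac{n-1}{n}\sqrt{2}\Big)^3 + \Big(\frac{n}{n}\sqrt{2}\Big)^3\right) \]
\[ = -\frac{1}{n^4}\big(0^3 + 1^3 + 2^3 + \cdots + (n-1)^3\big) + \frac{\sqrt{2} \cdot 2^{3/2}}{n^4}\big(1^3 + 2^3 + \cdots + n^3\big) \]
\[ = -\frac{1}{n^4} \cdot \frac{(n-1)^2 n^2}{4} + \frac{4}{n^4} \cdot \frac{n^2 (n+1)^2}{4} \to \frac{3}{4} \]
as $n \to \infty$.
Similarly, we have
\[ L(P_n, x^3) = -\frac{1}{n^4}\big(1^3 + 2^3 + \cdots + n^3\big) + \frac{\sqrt{2} \cdot 2^{3/2}}{n^4}\big(0^3 + 1^3 + 2^3 + \cdots + (n-1)^3\big) \]
\[ = -\frac{1}{n^4} \cdot \frac{n^2 (n+1)^2}{4} + \frac{4}{n^4} \cdot \frac{(n-1)^2 n^2}{4} \to \frac{3}{4} \]
as $n \to \infty$.
Using Proposition 4.7, we conclude that $x^3$ is Riemann integrable on $[-1, \sqrt{2}]$ and
\[ \int_{-1}^{\sqrt{2}} x^3\,dx = \frac{3}{4}. \]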
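
The two limits in this example can be sanity-checked in floating point. The sketch below is an illustration only: it builds the same non-uniform partition $P_n$ (with 0 as a partition point) and uses the fact that $x^3$ is increasing, so the infimum and supremum on each subinterval sit at the left and right endpoints. The function name `darboux_sums_cube` is hypothetical.

```python
import math


def darboux_sums_cube(n):
    """Return (L(P_n, x^3), U(P_n, x^3)) on [-1, sqrt(2)] for the partition P_n."""
    root2 = math.sqrt(2)
    # n equal steps from -1 up to 0, then n equal steps from 0 up to sqrt(2).
    points = (
        [-1 + i / n for i in range(n)]
        + [0.0]
        + [i * root2 / n for i in range(1, n + 1)]
    )
    lower = upper = 0.0
    for x0, x1 in zip(points, points[1:]):
        # x^3 is increasing: inf at the left endpoint, sup at the right endpoint.
        lower += x0 ** 3 * (x1 - x0)
        upper += x1 ** 3 * (x1 - x0)
    return lower, upper


for n in (10, 100, 1000):
    print(n, darboux_sums_cube(n))
```

Both sums approach $3/4$, with the lower sum below and the upper sum above that value.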

■ Exercise 4.16 Show that for any $p \in \mathbb{N}$, the function $f(x) := x^p$ is Riemann integrable on $[a, b]$ for any real $a < b$. [Split into the cases $0 \le a < b$, $a < 0 \le b$ and $a < b \le 0$.]

■ Exercise 4.17 Let $f : [a, b] \to \mathbb{R}$ be a bounded function. Prove that the following are equivalent:
1. for any $\varepsilon > 0$, there exists a partition $P$ of $[a, b]$ such that
\[ U(P, f) - L(P, f) < \varepsilon. \]
2. there exists a sequence of partitions $\{P_n\}_{n=1}^{\infty}$ of $[a, b]$ such that
\[ \lim_{n\to\infty} \big(U(P_n, f) - L(P_n, f)\big) = 0. \]
3. there exists a sequence of partitions $\{P_n\}_{n=1}^{\infty}$ of $[a, b]$ so that $U(P_n, f)$ and $L(P_n, f)$ converge, and
\[ \lim_{n\to\infty} U(P_n, f) = \lim_{n\to\infty} L(P_n, f) \]
4. $f$ is Riemann integrable on $[a, b]$ (i.e. $G_{[a,b]}(f)$ is Jordan measurable)
5. $\overline{\int_a^b} f(x)\,dx = \underline{\int_a^b} f(x)\,dx$ (see Definition 4.4)
Suppose any one of the above (and hence all) holds. Show that any sequence of partitions $\{P_n\}$ of $[a, b]$, such that both $U(P_n, f)$ and $L(P_n, f)$ converge to the same limit, must satisfy:
\[ \lim_{n\to\infty} L(P_n, f) = \underline{\int_a^b} f(x)\,dx = \int_a^b f(x)\,dx = \overline{\int_a^b} f(x)\,dx = \lim_{n\to\infty} U(P_n, f). \]

⌅ Exercise 4.18 Show that any monotone bounded function on $[a,b]$ must be Riemann integrable on $[a,b]$.

i In view of Exercise 4.17, some textbooks would take one of (1)-(5) in that exercise to be the
definition of Riemann integrability. The most common one seems to be (5).

4.2.4 Continuous functions


Using the results in Exercise 4.17, one can prove that continuous functions on $[a,b]$ must be Riemann integrable. For that we need to introduce the concept of uniform continuity.
To give some motivation, let’s consider the function $f(x) = e^x$. It is well-known to be continuous at every $a \in \mathbb{R}$, meaning that $\forall \varepsilon > 0$, $\exists \delta > 0$ such that whenever $|x-a| < \delta$, we have $|e^x - e^a| < \varepsilon$. Let’s think about what $\delta$ depends on. Certainly the smaller $\varepsilon$ is, the smaller $\delta$ needs to be. Furthermore, $\delta$ also depends on $a$ because the larger $a$ is, the steeper the graph $y = e^x$ is near $a$, so a smaller $\delta$ is needed. This can be seen using the mean value theorem:
\[ |e^x - e^a| \le e^b\,|x-a| \]
where $b \in (a,x)$ or $(x,a)$. If $|x-a| < \delta$, then we have
\[ e^b\,|x-a| \le e^{\max\{a,x\}}\,\delta \le e^{\max\{a,a+\delta\}}\,\delta. \]
It is impossible to choose $\delta$ independently of $a$ such that $e^{\max\{a,a+\delta\}}\,\delta < \varepsilon$.

Uniform continuity is a stronger notion of continuity in which the choice of $\delta$ does not depend on a specific point in the domain. Precisely, we have:
Definition 4.5 — Uniform Continuity. Let $f : I \to \mathbb{R}$ be a function defined on an interval $I = \langle a, b \rangle$. We say $f$ is uniformly continuous on $I$ if $\forall \varepsilon > 0$, there exists $\delta > 0$ which does not depend on $x, y \in I$, such that whenever $x, y \in I$ and $|x-y| < \delta$, we have $|f(x) - f(y)| < \varepsilon$.

⌅ Example 4.5 Any differentiable function $f : I \to \mathbb{R}$ with bounded $f'$ on $I$ is uniformly continuous on $I$. To prove this, we let $|f'(x)| \le M$ for any $x \in I$, then for any $x, y \in I$ with $x \ne y$, the mean value theorem shows there exists $\xi \in (x,y)$ or $(y,x)$ such that
\[ |f(x) - f(y)| \le |f'(\xi)|\,|x-y| \le M\,|x-y|. \]
$\forall \varepsilon > 0$, we choose $\delta = \frac{\varepsilon}{M+1}$, then whenever $x, y \in I$ and $|x-y| < \delta$, we have
\[ |f(x) - f(y)| \le M\delta = \frac{M\varepsilon}{M+1} < \varepsilon. \]
Note that this $\delta$ does not depend on $x$ and $y$. Therefore, $f$ is uniformly continuous on $I$.

⌅ Example 4.6 The function $f(x) = e^x$ is not uniformly continuous on $\mathbb{R}$. To see this, we assume on the contrary that it is so. Then, by taking $\varepsilon = 1$, there exists $\delta > 0$ such that whenever $|x-y| < \delta$, we have $|e^x - e^y| < 1$. Consider the sequences $x_n = n$ and $y_n = n + \frac{1}{n}$. For any $n > \frac{1}{\delta}$, we have $|x_n - y_n| = \frac{1}{n} < \delta$, and so $\left|e^n - e^{n+\frac{1}{n}}\right| < 1$. By the mean value theorem, there exists $z_n \in (x_n, y_n)$ such that
\[ \left|e^n - e^{n+\frac{1}{n}}\right| = e^{z_n}\cdot\frac{1}{n} \ge \frac{e^n}{n}. \]
However, this would show
\[ \frac{e^n}{n} < 1 \]
for any $n > \frac{1}{\delta}$. It is a contradiction as $\frac{e^n}{n} \to +\infty$ as $n \to \infty$. Therefore, $e^x$ is not uniformly continuous on $\mathbb{R}$.
However, $e^x$ is uniformly continuous on any bounded interval by the previous example, as it has bounded derivative on any bounded interval.
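The failure of uniform continuity can also be seen numerically. The sketch below (the helper name `gap` is an illustrative choice) tabulates $|e^{y_n} - e^{x_n}|$ for the points $x_n = n$, $y_n = n + \frac{1}{n}$ used above: the inputs get closer together while the outputs drift further apart.

```python
import math

def gap(n):
    """|e^(n + 1/n) - e^n| for the points x_n = n and y_n = n + 1/n."""
    return math.exp(n + 1 / n) - math.exp(n)

for n in (1, 5, 10, 20):
    # |x_n - y_n| = 1/n shrinks, yet the gap in function values grows
    print(n, 1 / n, gap(n), math.exp(n) / n)
```

The last column is the lower bound $e^n/n$ from the mean value theorem argument, so no single $\delta$ can serve every point of $\mathbb{R}$.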

i The above example of $e^x$ shows that whether a function is uniformly continuous depends on the domain. A function can be uniformly continuous on a smaller domain but not on a larger one. Therefore, it is crucial to specify the domain, such as “$f$ is uniformly continuous on $(a, b]$”, whenever we mention uniform continuity.

⌅ Exercise 4.19 Show that $x^2$ is uniformly continuous on any bounded interval, but not on $\mathbb{R}$.

One important fact relating Riemann integrals of continuous functions is that continuous
functions on any closed and bounded interval must be uniformly continuous on that interval.
Proposition 4.8 Any continuous function f : [a, b] ! R on a closed and bounded interval [a, b]
must be uniformly continuous on [a, b].

Proof. The proof uses the Bolzano-Weierstrass Theorem. Assume it is not true that $f$ is uniformly continuous on $[a,b]$, then $\exists \varepsilon_0 > 0$ such that $\forall \delta > 0$, there exist $x_\delta, y_\delta \in [a,b]$ with $|x_\delta - y_\delta| < \delta$ but $|f(x_\delta) - f(y_\delta)| \ge \varepsilon_0$.
In particular, for any $n \in \mathbb{N}$, there exist $x_n, y_n \in [a,b]$ with $|x_n - y_n| < \frac{1}{n}$ but $|f(x_n) - f(y_n)| \ge \varepsilon_0$.
As $[a,b]$ is closed and bounded, there exist convergent subsequences $\{x_{n_k}\}_{k=1}^\infty$ and $\{y_{n_k}\}_{k=1}^\infty$ (with the same indices $n_k$, after passing to a further subsequence if necessary). Since $|x_{n_k} - y_{n_k}| < \frac{1}{n_k}$ for any $k$, we have $\lim_{k\to\infty} x_{n_k} = \lim_{k\to\infty} y_{n_k}$. Denote the limit by $L$, and by closedness of $[a,b]$ we have $L \in [a,b]$ too.
Recall that $f$ is continuous on $[a,b]$, and in particular at $L$, so we have
\[ \lim_{k\to\infty} f(x_{n_k}) = f(L) \quad\text{and}\quad \lim_{k\to\infty} f(y_{n_k}) = f(L). \]
However, that would imply
\[ 0 < \varepsilon_0 \le \lim_{k\to\infty} |f(x_{n_k}) - f(y_{n_k})| = |f(L) - f(L)| = 0, \]
which is clearly absurd.
This proves $f$ must be uniformly continuous on $[a,b]$. ⌅

Proposition 4.9 Any continuous function f on a closed and bounded interval [a, b] must be
Riemann integrable on [a, b].

Proof. By Proposition 4.8, $f$ is uniformly continuous on $[a,b]$. Hence, for any $\varepsilon > 0$, there exists $\delta > 0$ such that whenever $x, y \in [a,b]$ and $|x-y| < \delta$, we have $|f(x) - f(y)| < \frac{\varepsilon}{2(b-a)}$.
Now, take a partition $P$ of $[a,b]$, with partition points $\{x_i\}_{i=0}^n$, such that each subdivision $[x_{i-1}, x_i]$ has length $< \delta$. Then for any $x, y \in [x_{i-1}, x_i]$ we have $|f(x) - f(y)| < \frac{\varepsilon}{2(b-a)}$. This shows
\[ \sup_{[x_{i-1},x_i]} f - \inf_{[x_{i-1},x_i]} f \le \frac{\varepsilon}{2(b-a)}. \]
Then, we have
\[ U(P,f) - L(P,f) = \sum_{i=1}^n \left( \sup_{[x_{i-1},x_i]} f - \inf_{[x_{i-1},x_i]} f \right)(x_i - x_{i-1}) < \frac{\varepsilon}{b-a}\sum_{i=1}^n (x_i - x_{i-1}) = \varepsilon. \]
By Exercise 4.17, $f$ is Riemann integrable on $[a,b]$. ⌅

4.2.5 Properties of Riemann integrals


There are several properties of Riemann integrals that we will frequently use:
Proposition 4.10 — Properties of Riemann Integrals. Let $f : [a,b] \to \mathbb{R}$ and $g : [a,b] \to \mathbb{R}$ be bounded functions. Then,
1. Fix any $c \in (a,b)$. If $f$ is Riemann integrable on $[a,c]$ and on $[c,b]$, then $f$ is Riemann integrable on $[a,b]$ and
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]
2. If $f$ is Riemann integrable on $[a,b]$, then so is $cf$ for any $c \in \mathbb{R}$ and
\[ \int_a^b cf(x)\,dx = c\int_a^b f(x)\,dx. \]
3. If $f$ and $g$ are both Riemann integrable on $[a,b]$, then so is $f+g$ and we have
\[ \int_a^b \big(f(x) + g(x)\big)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx. \]
4. If $f$ and $g$ are Riemann integrable on $[a,b]$ and $f(x) \le g(x)$ for any $x \in [a,b]$, then
\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \]
5. If $f$ is Riemann integrable on $[a,b]$, then so is $|f|$ and
\[ \left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx. \]

Proof. (1) follows from the fact that $G^\pm_{[a,b]}(f) = G^\pm_{[a,c]}(f) \cup G^\pm_{[c,b]}(f)$, and so if $G^\pm_{[a,c]}(f)$ and $G^\pm_{[c,b]}(f)$ are Jordan measurable, then the union $G^\pm_{[a,b]}(f)$ is also Jordan measurable. The intersection $G^\pm_{[a,c]}(f) \cap G^\pm_{[c,b]}(f)$ is a line segment, so it has Jordan measure zero. This proves
\[ \mu\big(G^\pm_{[a,b]}(f)\big) = \mu\big(G^\pm_{[a,c]}(f) \cup G^\pm_{[c,b]}(f)\big) = \mu\big(G^\pm_{[a,c]}(f)\big) + \mu\big(G^\pm_{[c,b]}(f)\big), \]
which implies
\begin{align*}
\int_a^b f(x)\,dx &= \mu\big(G^+_{[a,b]}(f)\big) - \mu\big(G^-_{[a,b]}(f)\big) \\
&= \mu\big(G^+_{[a,c]}(f)\big) + \mu\big(G^+_{[c,b]}(f)\big) - \mu\big(G^-_{[a,c]}(f)\big) - \mu\big(G^-_{[c,b]}(f)\big) \\
&= \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.
\end{align*}

(2) follows directly from
\[ U(P, cf) = \begin{cases} cU(P,f) & \text{if } c \ge 0 \\ cL(P,f) & \text{if } c < 0 \end{cases} \quad\text{and}\quad L(P, cf) = \begin{cases} cL(P,f) & \text{if } c \ge 0 \\ cU(P,f) & \text{if } c < 0 \end{cases}, \]
and the results proved in Exercise 4.17.


To prove (3), we make use of the fact that
\[ \sup_I (f+g) \le \sup_I f + \sup_I g, \quad\text{and}\quad \inf_I (f+g) \ge \inf_I f + \inf_I g. \]
[The proof of these is trivial: $\sup_I f + \sup_I g$ is an upper bound of $f+g$, and $\inf_I f + \inf_I g$ is a lower bound of $f+g$.]
This shows for any partition $P$ of $[a,b]$, we have:
\[ U(P, f+g) \le U(P,f) + U(P,g), \]
\[ L(P, f+g) \ge L(P,f) + L(P,g). \]

Given that both $f$ and $g$ are Riemann integrable on $[a,b]$, by Exercise 4.17, there exist sequences of partitions $\{P_n\}_{n=1}^\infty$ and $\{Q_n\}_{n=1}^\infty$ of $[a,b]$ such that
\[ \lim_{n\to\infty} \big(U(P_n,f) - L(P_n,f)\big) = 0 \quad\text{and}\quad \lim_{n\to\infty} \big(U(Q_n,g) - L(Q_n,g)\big) = 0. \]
Consider the sequence of partitions $R_n := P_n \cup Q_n$ (i.e. mixing the partition points of $P_n$ and $Q_n$ to create a more refined partition). One can show that $U(R_n,f) - L(R_n,f) \le U(P_n,f) - L(P_n,f)$ and $U(R_n,g) - L(R_n,g) \le U(Q_n,g) - L(Q_n,g)$. See Exercise 4.20.
Combining with previous results, we get
\begin{align*}
U(R_n, f+g) - L(R_n, f+g) &\le U(R_n,f) + U(R_n,g) - L(R_n,f) - L(R_n,g) \\
&\le \underbrace{U(P_n,f) - L(P_n,f)}_{\to 0} + \underbrace{U(Q_n,g) - L(Q_n,g)}_{\to 0}
\end{align*}
as $n \to \infty$. By Exercise 4.17, $f+g$ is Riemann integrable on $[a,b]$. To prove the additivity of integrals, we take a subsequence $\{R_{n_k}\}_{k=1}^\infty$ of $\{R_n\}_{n=1}^\infty$ such that all of the following converge as $k \to \infty$:
\[ U(R_{n_k}, f),\ L(R_{n_k}, f),\ U(R_{n_k}, g),\ L(R_{n_k}, g),\ U(R_{n_k}, f+g),\ L(R_{n_k}, f+g). \]
It is possible by the Bolzano-Weierstrass Theorem (note that $f$ and $g$ are bounded). According to Exercise 4.17, we have
\[ \int_a^b f(x)\,dx = \lim_{k\to\infty} U(R_{n_k}, f) = \lim_{k\to\infty} L(R_{n_k}, f) \]
and the same for $g$.


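Then by Exercise 4.20 below, we have for any $k$:
\begin{align*}
L(R_{n_k}, f) + L(R_{n_k}, g) &\le L(R_{n_k}, f+g) \\
&\le \underline{\int_a^b} \big(f(x)+g(x)\big)\,dx \le \overline{\int_a^b} \big(f(x)+g(x)\big)\,dx \\
&\le U(R_{n_k}, f+g) \le U(R_{n_k}, f) + U(R_{n_k}, g).
\end{align*}
Letting $k \to \infty$, we get
\[ \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \le \underline{\int_a^b} \big(f(x)+g(x)\big)\,dx \le \overline{\int_a^b} \big(f(x)+g(x)\big)\,dx \le \int_a^b f(x)\,dx + \int_a^b g(x)\,dx. \]
This proves (3) completely.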


(4) is obvious by the fact that if $f(x) \le g(x)$ for any $x \in [a,b]$, then $\sup_I f \le \sup_I g$ for any interval $I \subset [a,b]$ and so $U(P,f) \le U(P,g)$ for any partition $P$ of $[a,b]$.
For (5), note that $G(|f|) = G^+(f) \cup R_x\big(G^-(f)\big)$ where $R_x : \mathbb{R}^2 \to \mathbb{R}^2$ is the reflection about the $x$-axis, which is an isometry. If $f$ is Riemann integrable, then $G^+(f)$ and $G^-(f)$ are Jordan measurable, so $R_x\big(G^-(f)\big)$ and hence $G(|f|)$ are also Jordan measurable. This shows $|f|$ is Riemann integrable on $[a,b]$. The inequality
\[ \left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx \]
follows directly from $-|f(x)| \le f(x) \le |f(x)|$ and the use of (4). ⌅

i Combining (2) and (3) of Proposition 4.10, i.e. by taking $c = -1$ in (2), one can also prove that if $f$ and $g$ are both Riemann integrable on $[a,b]$, then so is $f - g$ and we have
\[ \int_a^b \big(f(x) - g(x)\big)\,dx = \int_a^b f(x)\,dx - \int_a^b g(x)\,dx. \]

i When $a > b$, we would define
\[ \int_a^b f(x)\,dx := -\int_b^a f(x)\,dx. \]
Using this definition, one can easily show that (1) of Proposition 4.10 also holds even if $c$ is not in $(a,b)$.

⌅ Example 4.7 Suppose $f$ is continuous on $[a,b]$, hence Riemann integrable on $[a,b]$. Show that there exists $c \in [a,b]$ such that
\[ f(c) = \frac{1}{b-a}\int_a^b f(x)\,dx. \]

⌅ Solution By the extreme value theorem, $f$ achieves its maximum and minimum on $[a,b]$. Let
\[ M := \sup_{[a,b]} f = f(x_1) \quad\text{and}\quad m := \inf_{[a,b]} f = f(x_2) \]
for some $x_1, x_2 \in [a,b]$.
Then by $f(x) \le M$ for any $x \in [a,b]$, (4) in Proposition 4.10 shows
\[ \int_a^b f(x)\,dx \le \int_a^b M\,dx = M(b-a). \]
Similarly by $f(x) \ge m$ for any $x \in [a,b]$, we have
\[ m(b-a) = \int_a^b m\,dx \le \int_a^b f(x)\,dx. \]
This shows
\[ f(x_2) = m \le \frac{1}{b-a}\int_a^b f(x)\,dx \le M = f(x_1). \]
As $f$ is continuous, the intermediate value theorem shows there exists $c$ between $x_1$ and $x_2$ such that
\[ f(c) = \frac{1}{b-a}\int_a^b f(x)\,dx. \]
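To make the statement concrete, here is a small numerical sketch that locates such a point $c$ for $f(x) = x^2$ on $[0, 3]$. The helpers `midpoint_integral` and `mean_value_point` are illustrative names, and the bisection step assumes $f$ is increasing on $[a,b]$, which is an extra assumption made only for this demonstration and not part of the example.

```python
def midpoint_integral(f, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def mean_value_point(f, a, b, tol=1e-10):
    """Bisection for c with f(c) = average of f on [a, b]; assumes f is increasing."""
    avg = midpoint_integral(f, a, b) / (b - a)
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < avg:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, avg

c, avg = mean_value_point(lambda x: x * x, 0.0, 3.0)
print(c, avg)   # average value is about 3, attained near c = sqrt(3)
```

Here the average value of $x^2$ on $[0,3]$ is $\frac{1}{3}\cdot 9 = 3$, so the point found should be close to $\sqrt{3}$.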

⌅ Exercise 4.20 Show that for any bounded function $f : [a,b] \to \mathbb{R}$ and any partition $P$ of $[a,b]$, if we add one partition point $c$ to $P$, and denote $P' = P \cup \{c\}$, then
\[ L(P,f) \le L(P',f) \le U(P',f) \le U(P,f). \]

⌅ Exercise 4.21 Let $I$ be a closed and bounded interval, and $J \subset I$ be another closed and bounded interval. Show that if $f$ is Riemann integrable on $I$, then it is also Riemann integrable on $J$.
Assume further that $f(x) \ge 0$ on $[a,b]$, show that
\[ \int_\alpha^\beta f(x)\,dx \le \int_\gamma^\eta f(x)\,dx \]
if $[\alpha, \beta] \subset [\gamma, \eta] \subset [a,b]$.

⌅ Exercise 4.22 Show that if $|f(x)| \le M$ for any $x \in [a,b]$, then
\[ \left| f(x)^2 - f(y)^2 \right| \le 2M\,|f(x) - f(y)| \quad \forall x, y \in [a,b]. \]
Hence, show that if $f$ is Riemann integrable on $[a,b]$, then so is $f^2$.
Using this and the properties of Riemann integrals proven, show that if $f, g$ are bounded Riemann integrable functions on $[a,b]$, then so is $fg$.

4.3 Fundamental Theorem of Calculus


4.3.1 Newton-Leibniz’s formula
In previous sections we have established the rigorous definition of Riemann integrals. In particular, we proved that any continuous function on $[a,b]$ must be Riemann integrable. However, it is rather impractical to compute $\int_a^b f(x)\,dx$ via taking a sequence of partitions $\{P_n\}$, as we have seen that even the computation of $\int_0^1 x^p\,dx$ where $p \in \mathbb{N}$ could involve some summation formulae.
The Fundamental Theorem of Calculus links the Riemann integral of a continuous function with its anti-derivatives, and provides us a very effective way of computing the value of the integral.
Theorem 4.11 — Fundamental Theorem of Calculus. Let $f$ be continuous on $[a,b]$, then we have
\[ \frac{d}{dx}\int_a^x f(t)\,dt = f(x) \quad\text{for any } x \in [a,b]. \tag{4.2} \]
Furthermore, if $F$ is a differentiable function such that $F'(x) = f(x)$ for any $x \in [a,b]$, then
\[ \int_a^b f(x)\,dx = F(b) - F(a). \tag{4.3} \]
(4.3) is known as the Newton-Leibniz formula. The function $F$ is called an anti-derivative, or a primitive function, of $f$.

i Note that $\int_a^x f(t)\,dt$ is a function of $x$, not of $t$. We use $t$ inside the integral $\int_a^x f(t)\,dt$ because $x$ has appeared as the upper bound of the integral $\int_a^x$. You can use any other variable too (except $x$). We usually call $t$ the dummy variable.

Proof. To prove (4.2), we consider the definition of derivatives
\begin{align*}
\frac{d}{dx}\int_a^x f(t)\,dt &= \lim_{h\to 0} \frac{\int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt}{h} \\
&= \lim_{h\to 0} \frac{1}{h}\int_x^{x+h} f(t)\,dt.
\end{align*}
The last step follows from (1) of Proposition 4.10.
By the result of Example 4.7, there exists $c$ between $x$ and $x+h$ such that
\[ \frac{1}{h}\int_x^{x+h} f(t)\,dt = \frac{1}{(x+h) - x}\int_x^{x+h} f(t)\,dt = f(c). \]
Note that this $c$ depends on both $x$ and $h$.
Letting $h \to 0$ (keeping $x$ fixed), by $c \in [x, x+h]$ or $[x+h, x]$, we have $c \to x$ and so by continuity of $f$ we get
\[ \lim_{h\to 0} f(c) = f(x). \]
This proves (4.2).


For (4.3), we consider
\[ \frac{d}{dx}\left( \int_a^x f(t)\,dt - F(x) \right) = f(x) - f(x) = 0 \quad\text{for any } x \in [a,b] \]
according to (4.2) and the given condition about $F$.
The only functions with zero derivative on an interval are constant functions (a consequence of the mean value theorem). Therefore, there exists $C \in \mathbb{R}$ such that
\[ \int_a^x f(t)\,dt - F(x) = C \quad\text{for any } x \in [a,b]. \]
In particular, putting $x = a$ we get:
\[ \underbrace{\int_a^a f(t)\,dt}_{=0} - F(a) = C \implies C = -F(a). \]
Therefore, we have
\[ \int_a^x f(t)\,dt = F(x) + C = F(x) - F(a) \quad\text{for any } x \in [a,b], \]
and in particular by putting $x = b$, we get (4.3). ⌅

i We often denote $F(b) - F(a)$ by $[F(x)]_a^b$ or $F(x)\big|_a^b$.

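Using (4.3), we can compute the integrals that appeared in the previous section very easily: simply find an anti-derivative.
\begin{align*}
\frac{d}{dx}\frac{x^{p+1}}{p+1} = x^p \text{ where } p \ge 0 &\implies \int_0^1 x^p\,dx = \left[\frac{x^{p+1}}{p+1}\right]_0^1 = \frac{1^{p+1}}{p+1} - \frac{0^{p+1}}{p+1} = \frac{1}{p+1} \\
\frac{d}{dx}(-\cos x) = \sin x &\implies \int_0^\pi \sin x\,dx = [-\cos x]_0^\pi = (-\cos\pi) - (-\cos 0) = 2 \\
\frac{d}{dx} e^x = e^x &\implies \int_a^b e^x\,dx = [e^x]_a^b = e^b - e^a
\end{align*}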
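The agreement between the partition-based definition and the Newton-Leibniz formula can be checked numerically. The sketch below compares a midpoint Riemann sum (the helper `midpoint_integral` is an illustrative name, not notation from the text) with $F(b) - F(a)$ for the three anti-derivatives listed above, taking $p = 3$ and $[a,b] = [0,1]$ for the exponential as sample choices.

```python
import math

def midpoint_integral(f, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# (label, integrand, a, b, anti-derivative F)
cases = [
    ("x^3 on [0, 1]",  lambda x: x ** 3, 0.0, 1.0,     lambda x: x ** 4 / 4),
    ("sin on [0, pi]", math.sin,         0.0, math.pi, lambda x: -math.cos(x)),
    ("exp on [0, 1]",  math.exp,         0.0, 1.0,     math.exp),
]

for label, f, a, b, F in cases:
    riemann = midpoint_integral(f, a, b)
    newton_leibniz = F(b) - F(a)
    print(label, riemann, newton_leibniz)
```

The two columns should agree to several decimal places, although only the second is obtained without any limiting process.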

Continuity is crucial when applying the Newton-Leibniz formula. The following absurd result would come up if one applies (4.3) blindly to a discontinuous function:
\[ \int_{-1}^1 \frac{1}{x^2}\,dx = \left[-\frac{1}{x}\right]_{-1}^1 = -2 \quad\text{(WRONG!)} \]
Clearly $\frac{1}{x^2} > 0$, so it is absurd for its Riemann integral to be negative! The pitfall is that $\frac{1}{x^2}$ is not continuous at $0$, which lies in the interval $[-1, 1]$. We cannot apply (4.3) directly!
However, it is perfectly fine to use (4.3) on
\[ \int_1^2 \frac{1}{x^2}\,dx = \left[-\frac{1}{x}\right]_1^2 = -\frac{1}{2} - (-1) = \frac{1}{2}, \]
as $\frac{1}{x^2}$ is continuous on $[1, 2]$. We will discuss $\int_0^1 \frac{1}{x^2}\,dx$ later as the function is unbounded on $[0, 1]$. It is an improper integral.

⌅ Exercise 4.23 Find the value of each integral below using the Newton-Leibniz formula:
1. $\displaystyle\int_0^1 \frac{e^x}{1 + e^x}\,dx$
2. $\displaystyle\int_0^\pi x\cos(x^2)\,dx$
3. $\displaystyle\int_a^b \sin(Ax + B)\,dx$ where $A \ne 0$ and $B$ are constants.

4.3.2 More uses of the Fundamental Theorem of Calculus


Let’s discuss more about the use of (4.2). First note that although we stated (4.2) as
\[ \frac{d}{dx}\int_a^x f(t)\,dt = f(x), \]
the lower bound $a$ of the integral can be replaced by any other constant $c$, as
\[ \int_c^x f(t)\,dt = \int_a^x f(t)\,dt - \int_a^c f(t)\,dt \]
and $\int_a^c f(t)\,dt$ is a constant.
However, one should note that the upper bound of the integral must be exactly $x$ (as in $\int^x$); otherwise one should consider using the chain rule. Another issue is that (4.2) requires the integrand $f(t)$ to be independent of the differentiation variable $x$. Let’s see some examples:

⌅ Example 4.8 Find the derivative with respect to $x$ of each function below. Assume that $f$ is continuous on $\mathbb{R}$.
1. $F(x) = \displaystyle\int_0^{x^2} f(t)\,dt$
2. $G(x) = \displaystyle\int_a^x x f(t)\,dt$
3. $H(x) = \displaystyle\int_x^{x^2} f(t)\,dt$

⌅ Solution The upper bound of the integral for $F(x)$ is $x^2$ (as in $\int^{x^2}$), so we should use the chain rule:
\begin{align*}
F'(x) &= \frac{d}{dx}\int_0^{x^2} f(t)\,dt \\
&= \frac{d}{d(x^2)}\int_0^{x^2} f(t)\,dt \cdot \frac{d}{dx}x^2 \\
&= f(x^2)\cdot 2x = 2x f(x^2).
\end{align*}

For $G(x)$, the integrand $x f(t)$ depends on $x$, so one must take it out from the integral first before applying (4.2):
\[ \int_a^x x f(t)\,dt = x\int_a^x f(t)\,dt. \]
The above holds because $x$ is independent of the integration variable $t$. Then,
\begin{align*}
G'(x) &= \frac{d}{dx}\left( x\int_a^x f(t)\,dt \right) \\
&= \frac{dx}{dx}\int_a^x f(t)\,dt + x\,\frac{d}{dx}\int_a^x f(t)\,dt \\
&= \int_a^x f(t)\,dt + x f(x).
\end{align*}
We cannot proceed further because $f$ is not explicitly given.


For $H(x)$, note that the lower bound is also a function of $x$, so we first rewrite the integral as:
\[ H(x) = \int_x^{x^2} f(t)\,dt = \underbrace{\int_0^{x^2} f(t)\,dt}_{F(x)} - \int_0^x f(t)\,dt. \]
You can replace $0$ by any other number provided that $f$ is continuous on the interval of integration. Then we have:
\[ H'(x) = F'(x) - f(x) = 2x f(x^2) - f(x). \]
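The three derivative formulas of this example can be sanity-checked numerically. The sketch below picks $f = \cos$ and lower bound $a = 1$ purely for illustration (neither is specified in the text), approximates each integral by a midpoint sum, and compares a central difference quotient with the formula derived above.

```python
import math

def integral(f, a, b, n=2000):
    """Midpoint Riemann sum over [a, b]; a > b is allowed and flips the sign."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def derivative(func, x, h=1e-5):
    """Central difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

f = math.cos   # hypothetical choice of a continuous f
a = 1.0        # hypothetical lower bound used for G

F = lambda x: integral(f, 0.0, x * x)    # F(x) = integral of f from 0 to x^2
G = lambda x: x * integral(f, a, x)      # G(x) = integral of x f(t) from a to x
H = lambda x: integral(f, x, x * x)      # H(x) = integral of f from x to x^2

x = 0.7
print(derivative(F, x), 2 * x * f(x * x))
print(derivative(G, x), integral(f, a, x) + x * f(x))
print(derivative(H, x), 2 * x * f(x * x) - f(x))
```

Each printed pair should agree up to the discretisation error of the two approximations.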

⌅ Exercise 4.24 Derive a formula for:
\[ \frac{d}{dx}\int_{\beta(x)}^{\alpha(x)} f(t)\,dt \]
where $f$ is continuous on $\mathbb{R}$, and $\alpha, \beta$ are differentiable on $\mathbb{R}$.

⌅ Exercise 4.25 Let $f : \mathbb{R} \to (0, \infty)$ be a continuous function, and consider
\[ g(x) := \frac{\left( \displaystyle\int_0^x t f(t)\,dt \right)^2}{\displaystyle\int_0^x f(t)\,dt}. \]
Prove that $g$ is strictly increasing on $(0, \infty)$.

⌅ Exercise 4.26 — Source: HKAL 1994. Let $f(x) = \displaystyle\int_1^x \sin(\cos t)\,dt$.
(a) Show that $f$ is injective on $[0, \pi/2)$.
(b) Find $\left.\dfrac{d}{dx} f^{-1}(x)\right|_{x=0}$.

⌅ Exercise 4.27 — Source: HKAL 1997. Evaluate
\[ \lim_{x\to 0^+} \left( \frac{1}{x^3}\int_0^x e^{t^2}\,dt - \frac{1}{x^2} \right). \]

⌅ Exercise 4.28 Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function. Show that $f$ satisfies the differential equation
\[ f'(x) = \sin\big(1 + f(x)^2\big) \quad\text{and}\quad f(0) = a \]
if and only if $f$ satisfies the integral equation
\[ f(x) = a + \int_0^x \sin\big(1 + f(t)^2\big)\,dt. \]

Let’s discuss another use of (4.2):
Proposition 4.12 Let $f : [a,b] \to \mathbb{R}$ be a non-negative continuous function. Suppose
\[ \int_a^b f(x)\,dx = 0, \]
then $f(x) \equiv 0$ on $[a,b]$.
Proof. It is quite an expected result since $\int_a^b f(x)\,dx$ is the area under the graph $y = f(x)$ for a non-negative function $f$. If the area is zero, the only possibility is that the function is $0$. As we also assume $f$ is continuous, this rules out functions which are $0$ except at a finite number of points.
To prove it rigorously, we consider the function
\[ F(t) := \int_a^t f(x)\,dx. \]
By (4.2), we have $F'(t) = f(t) \ge 0$. Hence $F$ is increasing on $[a,b]$. However, we also note that
\[ F(a) = \int_a^a f(x)\,dx = 0 \quad\text{and}\quad F(b) = \int_a^b f(x)\,dx = 0 \text{ (given)}. \]
Therefore, $F(t)$ is identically zero on $[a,b]$ since:
\[ 0 = F(a) \le F(t) \le F(b) = 0 \quad \forall t \in [a,b]. \]
This proves $f(t) = F'(t) = 0$ on $[a,b]$. ⌅

⌅ Exercise 4.29 — Source: HKAL 1998. Answer the following questions:
(a) [This part just asked for the proof of Proposition 4.12, hence omitted here.]
(b) Let $g$ be a continuous function on $[a,b]$. Suppose
\[ \int_a^b g(x)u(x)\,dx = 0 \]
for any continuous function $u$ on $[a,b]$, show that $g(x) = 0$ for all $x \in [a,b]$.
(c) Let $h$ be a continuous function on $[a,b]$. Define
\[ A = \frac{1}{b-a}\int_a^b h(t)\,dt. \]
(i) If $v(x) = h(x) - A$ for all $x \in [a,b]$, show that $\int_a^b v(x)\,dx = 0$.
(ii) If $\int_a^b h(x)w(x)\,dx = 0$ for any continuous function $w$ on $[a,b]$ satisfying $\int_a^b w(x)\,dx = 0$, show that $h(x) = A$ for all $x \in [a,b]$.

4.3.3 Indefinite integrals


In view of the Newton-Leibniz formula (4.3), we can evaluate a Riemann integral $\int_a^b f(x)\,dx$ by finding an anti-derivative of $f$. This relates the problem of finding area with (the reverse process of) differentiation. Because of this connection, we introduce the notion of indefinite integrals, which symbolically look like a Riemann integral but are conceptually different:
Definition 4.6 — Indefinite Integrals. Suppose $f$ is a function defined on an interval $I$, then the indefinite integral of $f$ is defined to be:
\[ \int f(x)\,dx := \{F(x) : F'(x) = f(x) \text{ on } I\}. \]
If $F_0$ is a particular anti-derivative of $f$, then any other anti-derivative $F$ of $f$ on $I$ would differ from $F_0$ by a constant, so we also have
\[ \int f(x)\,dx = \{F_0(x) + C : C \text{ is a real constant}\}. \]
Usually, we abbreviate the above by $\int f(x)\,dx = F_0(x) + C$ so that students who are not taking honor calculus could understand the notation.
i Naturally, $\int_a^b f(x)\,dx$ will then be called a definite integral of $f$. It is computationally similar to the indefinite integral $\int f(x)\,dx$ in view of the Newton-Leibniz formula, but as a math major, you should be very clear about their conceptual difference. You should regard $\int f(x)\,dx$ as
\[ \left( \frac{d}{dx} \right)^{-1} f \]
where $\frac{d}{dx}$ is regarded as an operator.

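Here are some examples:
\begin{align*}
\frac{d}{dx}\sin x = \cos x &\implies \int \cos x\,dx = \sin x + C \\
\frac{d}{dx}\left( \frac{x^{p+1}}{p+1} \right) = x^p \text{ where } p \ne -1 &\implies \int x^p\,dx = \frac{x^{p+1}}{p+1} + C \\
\frac{d}{dx}\sin^{-1} x = \frac{1}{\sqrt{1-x^2}} &\implies \int \frac{1}{\sqrt{1-x^2}}\,dx = \sin^{-1} x + C
\end{align*}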
When writing an indefinite integral, we often implicitly assume that the domain of both $f$ and $F$ is an interval $I$, which is connected. Consider a function $f$ defined on a disjoint union of two intervals:
\[ f(x) = \begin{cases} x^3 & \text{if } x \in (0,1) \\ x^4 & \text{if } x \in (2,3) \end{cases}. \]
One anti-derivative of $f$ is certainly
\[ F_0(x) = \begin{cases} \frac{x^4}{4} & \text{if } x \in (0,1) \\ \frac{x^5}{5} & \text{if } x \in (2,3) \end{cases}, \]
but the others may be of the form
\[ F(x) = \begin{cases} \frac{x^4}{4} + C_1 & \text{if } x \in (0,1) \\ \frac{x^5}{5} + C_2 & \text{if } x \in (2,3) \end{cases}, \]
where $C_1$ and $C_2$ are two real constants, so it is not necessarily of the form $F_0(x) + C$. Therefore it would be problematic to say
\[ \int f(x)\,dx = F_0(x) + C. \]
When writing
\[ \int \frac{1}{x^2}\,dx = -\frac{1}{x} + C, \]
we should implicitly assume the domain involved is an interval not containing $0$, such as $(-2, -1)$ or $[1, 3)$, but not $(-1, 1]$.
The indefinite integral $\int \frac{1}{x}\,dx$ deserves some discussion. On the interval $(0, \infty)$, an anti-derivative of $\frac{1}{x}$ is clearly $\log x$, but $\log x$ is undefined on the interval $(-\infty, 0)$. Instead, an anti-derivative of $\frac{1}{x}$ on the interval $(-\infty, 0)$ is $\log(-x)$ because by the chain rule:
\[ \frac{d}{dx}\log(-x) = \frac{d}{d(-x)}\log(-x)\cdot\frac{d(-x)}{dx} = \frac{1}{(-x)}\cdot(-1) = \frac{1}{x}. \]
Therefore, we have $\int \frac{1}{x}\,dx = \log x + C$ when the domain interval in the context is a subset of $(0, \infty)$, while $\int \frac{1}{x}\,dx = \log(-x) + C$ when the domain is a subset of $(-\infty, 0)$. However, we often write it in a unified way:
\[ \int \frac{1}{x}\,dx = \log|x| + C, \]
so that it applies to both interval types. Again, if the domain in the context is an interval like $(-1, 1)$, it does not make sense to talk about $\int \frac{1}{x}\,dx$ as the integrand $\frac{1}{x}$ is undefined at $0$.
Many of you might have known that
\[ \int \tan x\,dx = -\log|\cos x| + C = \log|\sec x| + C. \]
Similarly, when writing this we implicitly assume the interval $I$ involved is one on which either $\cos x > 0$ throughout, or $\cos x < 0$ throughout.
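We should also be careful when the function $f$ is piecewise defined, such as
\[ f(x) = \begin{cases} e^x & \text{if } x \ge 0 \\ 1 & \text{if } x < 0 \end{cases}. \]
It is NOT true that
\[ \int f(x)\,dx = \begin{cases} e^x + C & \text{if } x \ge 0 \\ x + C & \text{if } x < 0 \end{cases}, \text{ where } C \text{ is any real constant (WRONG!)} \]
or
\[ \int f(x)\,dx = \begin{cases} e^x + C_1 & \text{if } x \ge 0 \\ x + C_2 & \text{if } x < 0 \end{cases}, \text{ where } C_1, C_2 \text{ are any real constants (WRONG!)} \]
The function
\[ F(x) = \begin{cases} e^x + C & \text{if } x \ge 0 \\ x + C & \text{if } x < 0 \end{cases} \]
is not even continuous at $0$ as $\lim_{x\to 0^+} F(x) = C + 1$ whereas $\lim_{x\to 0^-} F(x) = C$. The same holds for functions
\[ \begin{cases} e^x + C_1 & \text{if } x \ge 0 \\ x + C_2 & \text{if } x < 0 \end{cases} \]
unless $C_1$ and $C_2$ are some carefully chosen constants.
In fact, one anti-derivative of $f$ is
\[ F_0(x) = \begin{cases} e^x & \text{if } x \ge 0 \\ x + 1 & \text{if } x < 0 \end{cases}, \]
so we should write
\[ \int f(x)\,dx = F_0 + C = \begin{cases} e^x & \text{if } x \ge 0 \\ x + 1 & \text{if } x < 0 \end{cases} + C \]
where $C$ is any real constant.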
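The difference between the correct and the incorrect candidates can be seen numerically at the junction point $0$. The sketch below (function names `F0` and `F_bad` are illustrative) measures the jump of each candidate across $0$ and the one-sided difference quotients of $F_0$ there.

```python
import math

def f(x):
    """The piecewise integrand: e^x for x >= 0, and 1 for x < 0."""
    return math.exp(x) if x >= 0 else 1.0

def F0(x):
    """Candidate anti-derivative glued continuously at 0."""
    return math.exp(x) if x >= 0 else x + 1.0

def F_bad(x, C=0.0):
    """The incorrect candidate: same constant C on both pieces."""
    return math.exp(x) + C if x >= 0 else x + C

eps = 1e-9
print("jump of F0 at 0:   ", F0(eps) - F0(-eps))        # close to 0
print("jump of F_bad at 0:", F_bad(eps) - F_bad(-eps))  # close to 1

h = 1e-6
right = (F0(h) - F0(0.0)) / h     # right-hand difference quotient at 0
left = (F0(0.0) - F0(-h)) / h     # left-hand difference quotient at 0
print(right, left, f(0.0))        # both quotients should be close to f(0) = 1
```

Only the glued candidate is continuous at $0$, and its one-sided slopes there both match $f(0)$.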

⌅ Exercise 4.30 Compute the indefinite integral of the function:
\[ f(x) = \begin{cases} x^2 & \text{if } x \ge 0 \\ \sin x & \text{if } x < 0 \end{cases}. \]
Also, compute $\int |x|\,dx$ (take the domain to be $\mathbb{R}$).

Analogous results of (2) and (3) in Proposition 4.10 for Riemann (i.e. definite) integrals also hold for indefinite integrals, such as
\[ \int c f(x)\,dx = c\int f(x)\,dx \quad\text{and}\quad \int \big(f(x) + g(x)\big)\,dx = \int f(x)\,dx + \int g(x)\,dx. \]
The proof is much easier. To prove the second statement, we take anti-derivatives $F$ of $f$, and $G$ of $g$. Then, we have
\[ \int f(x)\,dx + \int g(x)\,dx = F(x) + C_1 + G(x) + C_2 \]
where $C_1, C_2$ are any real constants. Since $(F+G)' = f + g$ by the linearity of differentiation, $F + G$ is an anti-derivative of $f + g$ and so
\[ \int \big(f(x) + g(x)\big)\,dx = F(x) + G(x) + C_3 \]
where $C_3$ is any real constant. We are only left to show
\[ \{C_1 + C_2 : C_1, C_2 \in \mathbb{R}\} = \{C_3 : C_3 \in \mathbb{R}\} \]
which is trivial (just prove both $\subset$ and $\supset$).



4.4 Integration by Substitutions


In this and the next sections we discuss some common techniques of doing integrations, including method of substitutions and integration by parts. Let’s start with the method of substitutions:
Proposition 4.13 Suppose $u = g(x) : [a,b] \to I$ is a $C^1$ function on $x \in [a,b]$, and $f(u)$ is continuous on $u \in I$ with an anti-derivative $F$, then
\[ \int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du. \]

Proof. By the chain rule, we have
\[ \frac{d}{dx}F(g(x)) = F'(g(x))\,g'(x) = f(g(x))\,g'(x). \]
Therefore, $F(g(x))$ is an anti-derivative of $f(g(x))\,g'(x)$ on $x \in [a,b]$, and we have
\[ \int_{x=a}^{x=b} f(g(x))\,g'(x)\,dx = F(g(b)) - F(g(a)). \]
Moreover,
\[ \int_{u=g(a)}^{u=g(b)} f(u)\,du = F(g(b)) - F(g(a)). \]
Combining the results, we have
\[ \int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du. \]

i An easy way to remember this rule is to regard $g'(x)\,dx$ as $du$ (by virtue of $du = \frac{du}{dx}\,dx$), and $f(g(x))$ as $f(u)$. Also $x = a, b$ corresponds to $u = g(a), g(b)$ respectively.

⌅ Exercise 4.31 Prove the indefinite integral version of the substitution rule:
\[ \int f(g(x))\,g'(x)\,dx = \int f(u)\,du. \]
Here we implicitly assume that $x$ and $u$ lie on some intervals on which the conditions in Proposition 4.13 hold.

⌅ Example 4.9 Consider
\[ \int_0^2 x(2x^2+3)^2\,dx. \]
We let $f(u) = u^2$ and $g(x) := 2x^2 + 3$, then $g'(x) = 4x$, so we write
\[ \int_0^2 x(2x^2+3)^2\,dx = \frac{1}{4}\int_0^2 \underbrace{(2x^2+3)^2}_{=f(g(x))}\cdot\underbrace{4x}_{=g'(x)}\,dx = \frac{1}{4}\int_{g(0)}^{g(2)} \underbrace{u^2}_{=f(u)}\,du = \frac{1}{4}\left[\frac{u^3}{3}\right]_3^{11} = \frac{11^3 - 3^3}{12}. \]

Very often, we would write the above solution without defining so many functions $f$ and $g$ but simply let $u = 2x^2 + 3$. Instead of “creating” a term $g'(x)\,dx$, we compute
\[ du = \frac{du}{dx}\,dx = 4x\,dx \implies x\,dx = \frac{1}{4}\,du, \]
and when $x = 0$, $u = 3$; whereas when $x = 2$, $u = 11$. These combine to give:
\[ \int_0^2 x(2x^2+3)^2\,dx = \int_0^2 \underbrace{(2x^2+3)^2}_{=u^2}\cdot\underbrace{x\,dx}_{=\frac{1}{4}du} = \int_3^{11} \frac{1}{4}u^2\,du = \frac{11^3 - 3^3}{12}. \]

For a simple integral like this example, we may even save the use of the letter $u$ and just write:
\[ x\,dx = \frac{1}{2}\,d(x^2) = \frac{1}{4}\,d(2x^2+3), \]
and so we have
\[ \int_0^2 x(2x^2+3)^2\,dx = \frac{1}{4}\int_{x=0}^{x=2} (2x^2+3)^2\,d(2x^2+3). \]
Then we just regard $2x^2 + 3$ as the integration variable, and simply integrate the square function:
\[ \frac{1}{4}\int_{x=0}^{x=2} (2x^2+3)^2\,d(2x^2+3) = \frac{1}{4}\left[\frac{(2x^2+3)^3}{3}\right]_{x=0}^{x=2} = \frac{11^3 - 3^3}{12}. \]

Comparing the three ways of maneuvering the integration by substitution, the first one is
seldom used – it is only good for giving the precise statement of Proposition 4.13. The second
and third ones are more common, and the third one is often used for simple substitutions.
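As a quick numerical cross-check of this example, the sketch below approximates the original integral in $x$ and the substituted integral in $u$ by midpoint sums (the helper `midpoint_integral` is an illustrative name) and compares both with the closed-form value $\frac{11^3 - 3^3}{12}$.

```python
def midpoint_integral(f, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Original integral in x over [0, 2]
in_x = midpoint_integral(lambda x: x * (2 * x * x + 3) ** 2, 0.0, 2.0)

# After the substitution u = 2x^2 + 3: (1/4) * integral of u^2 over [3, 11]
in_u = 0.25 * midpoint_integral(lambda u: u * u, 3.0, 11.0)

exact = (11 ** 3 - 3 ** 3) / 12
print(in_x, in_u, exact)
```

All three numbers should agree closely, which reflects that the change of the limits from $[0,2]$ to $[3,11]$ is part of the substitution.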

⌅ Exercise 4.32 Compute the following integrals:
1. $\displaystyle\int_a^b x\cos(x^2 + 1)\,dx$
2. $\displaystyle\int_a^b x^3 e^{x^4}\,dx$
3. $\displaystyle\int_a^b \frac{x}{1 + x^2}\,dx$

4.4.1 Trigonometric functions

Many integration formulae of some trigonometric functions are derived using substitutions. Below we will state the indefinite integral version. They can be applied to definite integrals as long as the integrand is continuous on the integration interval.
Proposition 4.14
\begin{align*}
\int \tan x\,dx &= -\log|\cos x| + C = \log|\sec x| + C \\
\int \cot x\,dx &= \log|\sin x| + C \\
\int \sec x\,dx &= \log|\sec x + \tan x| + C \\
\int \csc x\,dx &= -\log|\csc x + \cot x| + C
\end{align*}

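Proof. We prove only the formulae for $\tan x$ and $\sec x$, and leave the other two as an exercise.
\begin{align*}
\int \tan x\,dx &= \int \frac{\sin x}{\cos x}\,dx \\
&= -\int \frac{1}{\cos x}\,d(\cos x) \quad\text{(implicitly letting } u = \cos x\text{)} \\
&= -\log|\cos x| + C \\
&= \log\left|\frac{1}{\cos x}\right| + C = \log|\sec x| + C.
\end{align*}
The formula for $\sec x$ involves a somewhat clever observation:
\begin{align*}
\int \sec x\,dx &= \int \frac{\sec x(\sec x + \tan x)}{\sec x + \tan x}\,dx \\
&= \int \frac{\sec^2 x + \sec x\tan x}{\sec x + \tan x}\,dx \\
&= \int \frac{1}{\sec x + \tan x}\,d(\tan x + \sec x) \\
&= \log|\sec x + \tan x| + C.
\end{align*}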

⌅ Exercise 4.33 Prove that
\begin{align*}
\int \cot x\,dx &= \log|\sin x| + C \\
\int \csc x\,dx &= -\log|\csc x + \cot x| + C
\end{align*}

To apply the above integral formulae to definite integrals, we need to make sure the function is continuous on the interval of integration.
\begin{align*}
\int_0^{\pi/4} \tan x\,dx &= \big[\log|\sec x|\big]_0^{\pi/4} = \log\sqrt{2} \quad\text{(RIGHT)} \\
\int_0^{\pi} \tan x\,dx &= \big[\log|\sec x|\big]_0^{\pi} = 0 \quad\text{(WRONG!)}
\end{align*}
0

4.4.2 Trigonometric substitutions


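Below are more examples of integration formulae:
Proposition 4.15
\begin{align*}
\int \frac{1}{\sqrt{a^2 - x^2}}\,dx &= \sin^{-1}\frac{x}{a} + C \\
\int \frac{1}{a^2 + x^2}\,dx &= \frac{1}{a}\tan^{-1}\frac{x}{a} + C
\end{align*}
where $a > 0$.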

Proof. For the $\frac{1}{\sqrt{a^2 - x^2}}$ integral, we write $x = a\sin u$, which means we let $u = \sin^{-1}\frac{x}{a}$. The range of $u$ is then $(-\pi/2, \pi/2)$. Then, we have
\[ dx = d(a\sin u) = a\cos u\,du. \]
The integrand becomes
\[ \frac{1}{\sqrt{a^2 - x^2}} = \frac{1}{\sqrt{a^2 - a^2\sin^2 u}} = \frac{1}{\sqrt{a^2\cos^2 u}} = \frac{1}{|a\cos u|} = \frac{1}{a\cos u} \]
as $a > 0$ and $\cos u > 0$ when $u \in (-\pi/2, \pi/2)$. These show
\[ \int \frac{1}{\sqrt{a^2 - x^2}}\,dx = \int \frac{1}{a\cos u}\,a\cos u\,du = \int 1\,du = u + C = \sin^{-1}\frac{x}{a} + C. \]
For the $\frac{1}{a^2 + x^2}$ integral, we write $x = a\tan u$, which means $u = \tan^{-1}\frac{x}{a}$. Then, we have
\[ dx = a\sec^2 u\,du \quad\text{and}\quad \frac{1}{a^2 + x^2} = \frac{1}{a^2(1 + \tan^2 u)} = \frac{1}{a^2\sec^2 u}. \]
Combining both, we get:
\[ \int \frac{1}{a^2 + x^2}\,dx = \int \frac{1}{a^2\sec^2 u}\,a\sec^2 u\,du = \int \frac{1}{a}\,du = \frac{1}{a}u + C = \frac{1}{a}\tan^{-1}\frac{x}{a} + C. \]

i The choice of the trigonometric functions above is motivated by the formulae $\sin^2 x + \cos^2 x = 1$ and $1 + \tan^2 x = \sec^2 x$.
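The two formulae of Proposition 4.15 can be spot-checked numerically. The sketch below takes $a = 2$ and the intervals $[0,1]$ and $[0,2]$ as arbitrary sample choices (on which both integrands are continuous), and compares midpoint sums with the values given by the anti-derivatives.

```python
import math

def midpoint_integral(f, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a = 2.0  # sample value of the parameter a > 0

# integral of 1/sqrt(a^2 - x^2) over [0, 1] versus arcsin(x/a) evaluated at the ends
lhs1 = midpoint_integral(lambda x: 1 / math.sqrt(a * a - x * x), 0.0, 1.0)
rhs1 = math.asin(1.0 / a) - math.asin(0.0)

# integral of 1/(a^2 + x^2) over [0, 2] versus (1/a) arctan(x/a) evaluated at the ends
lhs2 = midpoint_integral(lambda x: 1 / (a * a + x * x), 0.0, 2.0)
rhs2 = (math.atan(2.0 / a) - math.atan(0.0)) / a

print(lhs1, rhs1)   # both close to pi/6
print(lhs2, rhs2)   # both close to pi/8
```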

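⌅ Exercise 4.34 Prove the integration formulae below:
\begin{align*}
\int \frac{1}{\sqrt{x^2 + a^2}}\,dx &= \log\left| x + \sqrt{x^2 + a^2} \right| + C \\
\int \frac{1}{x^2 - a^2}\,dx &= \frac{1}{2a}\log\left| \frac{x - a}{x + a} \right| + C
\end{align*}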

⌅ Exercise 4.35 Show that for any $r > 0$, we have
\[ \int_0^r \sqrt{r^2 - x^2}\,dx = \frac{\pi r^2}{4}. \]
This shows the area of the circle with radius $r$ is $\pi r^2$.

4.4.3 More uses of integration by substitutions


One can also use integration by substitution to prove some general results about definite integrals.
Proposition 4.16 Let $f : \mathbb{R} \to \mathbb{R}$ be a periodic continuous function of period $T$, i.e. $f(x + T) = f(x)$ for any $x \in \mathbb{R}$. Then
\[ \int_a^b f(x)\,dx = \int_{a+T}^{b+T} f(x)\,dx. \]
Proof. Let $u = x + T$, then $dx = du$. When $x = a$, $u = a + T$; and when $x = b$, $u = b + T$. Therefore, we have
\[ \int_a^b f(x)\,dx = \int_a^b f((x+T) - T)\,dx = \int_{u=a+T}^{u=b+T} f(u - T)\,du = \int_{u=a+T}^{u=b+T} f(u)\,du. \]
Here we have used the fact that $f(u - T) = f(u - T + T) = f(u)$. Note that the $u$ in the last integral is a dummy variable, so we can change it back to $x$:
\[ \int_{u=a+T}^{u=b+T} f(u)\,du = \int_{x=a+T}^{x=b+T} f(x)\,dx. \]
This completes our proof. ⌅


Proposition 4.17 For any continuous odd function $f$, i.e. $f(-x) = -f(x)$ for any $x \in \mathbb{R}$, we have:
\[ \int_{-a}^a f(x)\,dx = 0 \quad\text{for any } a > 0. \]
For any continuous even function $g$, i.e. $g(-x) = g(x)$ for any $x \in \mathbb{R}$, we have:
\[ \int_{-a}^a g(x)\,dx = 2\int_0^a g(x)\,dx \quad\text{for any } a > 0. \]
Proof. For the odd function result, we need to show
\[ \int_{-a}^0 f(x)\,dx = -\int_0^a f(x)\,dx. \]
We let $u = -x$, then when $x = 0$, $u = 0$; and when $x = -a$, $u = a$. Therefore,
\[ \int_{x=-a}^{x=0} f(x)\,dx = \int_{u=a}^{u=0} f(-u)\,(-du) = \int_a^0 -f(u)\,(-du) = -\int_0^a f(u)\,du = -\int_0^a f(x)\,dx. \]
For the even function result, we need to show
\[ \int_{-a}^0 g(x)\,dx = \int_0^a g(x)\,dx. \]
The proof is very similar: let $u = -x$, then
\[ \int_{x=-a}^{x=0} g(x)\,dx = \int_{u=a}^{u=0} g(-u)\,(-du) = \int_a^0 g(u)\,(-du) = \int_0^a g(u)\,du = \int_0^a g(x)\,dx. \]
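Both propositions are easy to test numerically. The sketch below uses sample functions chosen only for illustration: a $2\pi$-periodic function for Proposition 4.16, and an odd and an even function for Proposition 4.17. Integrals are approximated by midpoint sums via the illustrative helper `midpoint_integral`.

```python
import math

def midpoint_integral(f, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Periodic function of period T = 2*pi (sample choice)
T = 2 * math.pi
periodic = lambda x: math.sin(x) + math.cos(2 * x)
a, b = 0.3, 1.7
shifted_gap = midpoint_integral(periodic, a, b) - midpoint_integral(periodic, a + T, b + T)

# Odd and even sample functions on [-2, 2]
odd = lambda x: x ** 3 - math.sin(x)
even = lambda x: math.cos(x) + x * x
odd_total = midpoint_integral(odd, -2.0, 2.0)
even_gap = midpoint_integral(even, -2.0, 2.0) - 2 * midpoint_integral(even, 0.0, 2.0)

print(shifted_gap, odd_total, even_gap)   # all three should be close to 0
```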

⌅ Example 4.10 — Source: HKAL 2002 Paper II9, excerpt. Let $f : \mathbb{R} \to [0, \infty)$ be a periodic function with period $T$.
(a) Prove that
\[ \int_{a+kT}^{b+kT} e^{-x} f(x)\,dx = e^{-kT}\int_a^b e^{-x} f(x)\,dx \]
for any $k \in \mathbb{N}$.
(b) Let $I_n = \int_0^{nT} e^{-x} f(x)\,dx$. Prove that
\[ I_n = \frac{1 - e^{-nT}}{1 - e^{-T}}\,I_1 \]
for any $n \in \mathbb{N}$.
(c) If $l > 0$ and $n$ is the positive integer such that $nT \le l < (n+1)T$, prove that
\[ \frac{1 - e^{-nT}}{1 - e^{-T}}\,I_1 \le \int_0^l e^{-x} f(x)\,dx \le \frac{1 - e^{-(n+1)T}}{1 - e^{-T}}\,I_1. \]

ˆ b
⌅ Solution (a) Consider the integral e x
f (x) dx. Let u = x + kT , then du = dx; and when
a
74 Integrations

x = a, u = a + kT ; when x = b, u = b + kT . Therefore,
ˆ b ˆ u=b+kT
x (u kT )
e f (x) dx = e f (u kT ) du
a u=a+kT
ˆ b+kT
= ekT e u
f (u) du (since f (u kT ) = f (u)
a+kT
ˆ b+kT
ekT
= |{z} e x f (x) dx
a+kT
constant | {z }
change dummy vars

By rearrangement, we get:
ˆ b+kT ˆ b
x kT x
e f (x) dx = e e f (x) dx.
a+kT a

(b) Note that
\begin{align*}
I_n &= \int_0^{T} e^{-x} f(x)\,dx + \int_T^{2T} e^{-x} f(x)\,dx + \cdots + \int_{(n-1)T}^{nT} e^{-x} f(x)\,dx \\
&= I_1 + \int_{0+T}^{T+T} e^{-x} f(x)\,dx + \cdots + \int_{0+(n-1)T}^{T+(n-1)T} e^{-x} f(x)\,dx \\
&= I_1 + e^{-T}\int_0^T e^{-x} f(x)\,dx + e^{-2T}\int_0^T e^{-x} f(x)\,dx + \cdots + e^{-(n-1)T}\int_0^T e^{-x} f(x)\,dx \\
&= I_1 + e^{-T} I_1 + e^{-2T} I_1 + \cdots + e^{-(n-1)T} I_1 \\
&= \frac{I_1\left(1 - (e^{-T})^n\right)}{1 - e^{-T}}.
\end{align*}
The third equality uses (a), and the last step uses the geometric series formula with common ratio $e^{-T}$. This proves the result in (b).
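(c) Note that $e^{-x} f(x) \ge 0$ as given. Therefore,
\[ \underbrace{\int_0^{nT} e^{-x} f(x)\,dx}_{I_n} \le \int_0^{l} e^{-x} f(x)\,dx \le \underbrace{\int_0^{(n+1)T} e^{-x} f(x)\,dx}_{I_{n+1}}. \]
From (b), we conclude that
\[ \frac{1 - e^{-nT}}{1 - e^{-T}}\, I_1 \le \int_0^{l} e^{-x} f(x)\,dx \le \frac{1 - e^{-(n+1)T}}{1 - e^{-T}}\, I_1 \]
as desired.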
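
The identity in part (b) can be sanity-checked numerically. The sketch below is only an illustration under assumed choices that are not part of the example: the sample function $f(x) = \sin^2 x$ (non-negative, period $T = \pi$), the helper name `midpoint_sum`, and the number of subintervals.

```python
import math

def midpoint_sum(g, a, b, n=20000):
    """Mid-point sum of g over [a, b] with n equal subintervals (helper for this sketch)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

T = math.pi  # period of the assumed sample function below

def f(x):
    # Assumed sample: non-negative and periodic with period pi.
    return math.sin(x) ** 2

def I(n):
    """Numerical approximation of I_n, the integral of e^{-x} f(x) over [0, nT]."""
    return midpoint_sum(lambda x: math.exp(-x) * f(x), 0.0, n * T)

n = 5
lhs = I(n)
rhs = (1 - math.exp(-n * T)) / (1 - math.exp(-T)) * I(1)
print(lhs, rhs)  # the two values should agree up to the quadrature error
```

A numerical agreement of this kind does not replace the proof; it only guards against sign or exponent slips in the formula.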

⌅ Exercise 4.36 — Source: HKAL 2012 Paper II Q8. Answer the following questions:
(a) (i) Prove that $\int_0^{\pi/2} \frac{1}{1 + \sin x}\,dx = 1$.
(ii) Evaluate $\int_0^{\pi/2} \frac{\sin x}{1 + \sin x}\,dx$.
(b) Let $f : [0, \pi] \to \mathbb{R}$ be a continuous function such that $f(\pi - x) = f(x)$ for all $x \in [0, \pi]$. Using integration by substitution, prove that
\[ \int_0^{\pi} f(x)\,dx = 2\int_0^{\pi/2} f(x)\,dx. \]
(c) Let $g : [0, \pi] \to \mathbb{R}$ be a continuous function such that $g(\pi - x) = -g(x)$ for all $x \in [0, \pi]$. Using the substitution $u = \pi - x$, prove that
\[ \int_0^{\pi} g(x) \log(1 + e^{\cos x})\,dx = \frac{1}{2}\int_0^{\pi} g(x) \cos x\,dx. \]
(d) Evaluate $\int_0^{\pi} \frac{\cos x \cdot \log(1 + e^{\cos x})}{(1 + \sin x)^2}\,dx$.

⌅ Exercise 4.37 Consider the integral:
\[ I_a := \int_0^{\pi} \frac{1 - a\cos\theta}{1 - 2a\cos\theta + a^2}\,d\theta \]
where $a \in (0, \infty)\setminus\{1\}$. This integral appears in the calculation of electric flux across a unit sphere with a point charge either inside ($a < 1$) or outside ($a > 1$) the sphere. One elegant way of computing $I_a$ is to use complex analysis. This exercise is about a less elegant, but more elementary, approach of evaluating $I_a$.
(a) Show that for any $a \in (0, \infty)\setminus\{1\}$ and $\theta \in (0, \pi)$,
\[ \frac{1 - a\cos\theta}{1 - 2a\cos\theta + a^2} = \frac{1}{2} + \frac{1-a}{1+a} \cdot \frac{\frac{1}{2}\sec^2\frac{\theta}{2}}{\left(\frac{1-a}{1+a}\right)^2 + \tan^2\frac{\theta}{2}}. \]
(b) Note that $\sec^2\frac{\theta}{2}$ and $\tan^2\frac{\theta}{2}$ are not both well-defined at $\theta = 0, \pi$, but $\frac{1 - a\cos\theta}{1 - 2a\cos\theta + a^2}$ is defined and continuous on the whole interval $[0, \pi]$. Let $\alpha \in (0, \frac{\pi}{2})$ and $\beta \in (\frac{\pi}{2}, \pi)$. Using (a), compute
\[ \int_{\alpha}^{\beta} \frac{1 - a\cos\theta}{1 - 2a\cos\theta + a^2}\,d\theta. \]
(c) Hence, show that
\[ I_a = \begin{cases} \pi & \text{if } a < 1 \\ 0 & \text{if } a > 1 \end{cases} \]

4.5 Integration by Parts


“Integration by parts” is another important technique of doing integrations. It is a consequence of
the product rule.

Proposition 4.18 — Integration by Parts. Let $f, g : [a, b] \to \mathbb{R}$ be two $C^1$ functions, then
\[ \int_a^b f(x) g'(x)\,dx = [f(x)g(x)]_a^b - \int_a^b g(x) f'(x)\,dx. \]

Proof. First recall that
\[ \frac{d}{dx}\big(f(x)g(x)\big) = f'(x)g(x) + g'(x)f(x). \]
By the Newton-Leibniz’s formula (4.3), we have:
\[ \int_a^b \big(f'(x)g(x) + g'(x)f(x)\big)\,dx = [f(x)g(x)]_a^b. \]
The desired result follows immediately by rearrangement. ⌅

i If we let $u = f(x)$ and $v = g(x)$, then $f'(x)\,dx$ can be regarded as $du$, and $g'(x)\,dx$ as $dv$. The integration by parts formula is often expressed as
\[ \int_{x=a}^{x=b} u\,dv = [uv]_{x=a}^{x=b} - \int_{x=a}^{x=b} v\,du. \]

i With almost the same proof, the integration by parts formula has an indefinite integral version:
\[ \int f(x) g'(x)\,dx = f(x)g(x) - \int g(x) f'(x)\,dx. \]
Using Proposition 4.18, we can now integrate $\log x$. Letting $f(x) = \log x$ and $g(x) = x$, on any interval of positive numbers we have
\begin{align*}
\int \log x\,dx &= x\log x - \int x\,d(\log x) \\
&= x\log x - \int x \cdot \frac{1}{x}\,dx \\
&= x\log x - \int 1\,dx \\
&= x\log x - x + C.
\end{align*}

⌅ Exercise 4.38 Using integration by parts, find the integrals:
\[ \int_1^2 \frac{\log x}{x^2}\,dx, \qquad \int \tan^{-1} x\,dx, \qquad \int x^3 e^{x^2}\,dx. \]

⌅ Example 4.11 Let’s integrate $\sec^3 x$. There is a small trick that is often useful when integrating trigonometric functions.
\begin{align*}
\int \sec^3 x\,dx &= \int \sec x \cdot \sec^2 x\,dx \\
&= \int \sec x\,d(\tan x) \\
&= \sec x\tan x - \int \tan x\,d(\sec x) \\
&= \sec x\tan x - \int \tan^2 x\sec x\,dx \\
&= \sec x\tan x - \int (\sec^2 x - 1)\sec x\,dx \\
&= \sec x\tan x - \int \sec^3 x\,dx + \int \sec x\,dx.
\end{align*}
Now we see that $\int \sec^3 x\,dx$ appears again but fortunately with a “good” sign in front. By rearrangement, we get
\[ 2\int \sec^3 x\,dx = \sec x\tan x + \int \sec x\,dx = \sec x\tan x + \log|\sec x + \tan x| + C. \]
We conclude that
\[ \int \sec^3 x\,dx = \frac{1}{2}\sec x\tan x + \frac{1}{2}\log|\sec x + \tan x| + C' \]
where $C'$ is any constant.
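
A quick way to double-check an antiderivative such as the one above is to differentiate it numerically and compare with the integrand. The following is a minimal sketch; the sample points, the step size `h` and the helper names are assumptions chosen for illustration only.

```python
import math

def F(x):
    """Candidate antiderivative: (1/2) sec x tan x + (1/2) log|sec x + tan x|."""
    sec, tan = 1 / math.cos(x), math.tan(x)
    return 0.5 * sec * tan + 0.5 * math.log(abs(sec + tan))

def sec_cubed(x):
    return 1 / math.cos(x) ** 3

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Sample points inside (-pi/2, pi/2), where sec x is defined.
samples = [-1.0, -0.3, 0.2, 0.9, 1.2]
max_gap = max(abs(central_difference(F, x) - sec_cubed(x)) for x in samples)
print(max_gap)  # should be tiny compared with the values of sec^3 x
```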

⌅ Exercise 4.39 Compute the integrals:
\[ \int e^x\cos x\,dx, \qquad \int \tan^3 x\,dx. \]

4.5.1 Irrationality of e, again


Here we give “another” proof of $e$ being irrational using integrals. For each $n \in \mathbb{N}\cup\{0\}$, we define
\[ I_n := \int_0^1 x^n e^{-x}\,dx. \]
First of all, we argue that $0 < I_n < \frac{1}{e}$ for any $n \in \mathbb{N}$. Clearly $I_n \ge 0$. Note that $I_n \neq 0$ since the integrand $x^n e^{-x}$ is non-negative and not identically 0 on $[0, 1]$. The function $x^n e^{-x}$ is increasing on $[0, 1]$ for any $n \in \mathbb{N}$ (as $(x^n e^{-x})' = x^{n-1}(n - x)e^{-x} \ge 0$). Therefore, we have $x^n e^{-x} \le 1^n e^{-1} = \frac{1}{e}$ for any $x \in [0, 1]$ and $n \in \mathbb{N}$, with equality only at $x = 1$, and so
\[ I_n = \int_0^1 x^n e^{-x}\,dx < \int_0^1 \frac{1}{e}\,dx = \frac{1}{e}. \]
Next we derive a relation between $I_{n+1}$ and $I_n$ using integration by parts:
\begin{align*}
I_{n+1} &= \int_0^1 x^{n+1} e^{-x}\,dx = \int_0^1 x^{n+1}\,d(-e^{-x}) \\
&= [-x^{n+1} e^{-x}]_0^1 - \int_0^1 (-e^{-x})(n+1)x^n\,dx \\
&= -\frac{1}{e} + (n+1)\int_0^1 x^n e^{-x}\,dx = -\frac{1}{e} + (n+1) I_n.
\end{align*}
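Let $J_n := \frac{I_n}{n!}$ for any $n \in \mathbb{N}\cup\{0\}$, then
\[ J_{n+1} = \frac{I_{n+1}}{(n+1)!} = \frac{1}{(n+1)!}\left(-\frac{1}{e} + (n+1) I_n\right) = -\frac{1}{(n+1)!\,e} + J_n. \]
This shows
\[ J_n = J_0 + \sum_{k=1}^{n} (J_k - J_{k-1}) = I_0 - \sum_{k=1}^{n} \frac{1}{k!\,e} = 1 - \frac{1}{e} - \sum_{k=1}^{n} \frac{1}{k!\,e} = 1 - \frac{1}{e}\sum_{k=0}^{n} \frac{1}{k!}. \]
Therefore, for any $n \in \mathbb{N}\cup\{0\}$,
\[ I_n = n!\left(1 - \frac{1}{e}\sum_{k=0}^{n} \frac{1}{k!}\right) = \frac{n!}{e}\left(e - \sum_{k=0}^{n} \frac{1}{k!}\right). \]
Assume that $e$ is rational, then there exist $p, q \in \mathbb{N}$ such that $e = \frac{p}{q}$. Recall that $0 < I_n < \frac{1}{e}$ for any $n \ge 1$, and so
\[ 0 < n!\left(e - \sum_{k=0}^{n} \frac{1}{k!}\right) < 1. \]
Take $n > q$, then $n!\,e = n!\,\frac{p}{q} \in \mathbb{N}$. Clearly, $n!\sum_{k=0}^{n} \frac{1}{k!} = \sum_{k=0}^{n} \frac{n!}{k!} \in \mathbb{N}$ too, so
\[ n!\left(e - \sum_{k=0}^{n} \frac{1}{k!}\right) \in \mathbb{Z}, \]
but this is clearly absurd as $(0, 1) \cap \mathbb{Z} = \emptyset$. This shows $e$ is irrational.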
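
The structure of the argument can be made concrete with exact integer arithmetic. The sketch below writes $I_n = \alpha_n - \beta_n/e$ and tracks the integers $\alpha_n, \beta_n$ through the recurrence $I_{n+1} = (n+1)I_n - \frac{1}{e}$; the names `coefficients`, `alpha` and `beta` are introduced here for illustration and do not appear in the text.

```python
from math import factorial

def coefficients(n):
    """Return integers (alpha, beta) such that I_n = alpha - beta / e."""
    alpha, beta = 1, 1  # I_0 = 1 - 1/e
    for k in range(n):
        # I_{k+1} = (k + 1) * I_k - 1/e
        alpha, beta = (k + 1) * alpha, (k + 1) * beta + 1
    return alpha, beta

for n in range(8):
    alpha, beta = coefficients(n)
    # Closed form from the text: alpha = n!, beta = sum_{k=0}^{n} n!/k!
    closed_form_beta = sum(factorial(n) // factorial(k) for k in range(n + 1))
    print(n, alpha, beta, closed_form_beta)
```

Both coefficients are integers for every $n$, which is exactly what makes $n!\left(e - \sum_{k=0}^n \frac{1}{k!}\right)$ an integer once $n!\,e$ is assumed to be one.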

i We put “another” proof in quotes because it is not really different from the proof we have seen in MATH 1023. The integral $I_n$ actually comes from the remainder of the Taylor series of $e^x$. We will discuss this further in Proposition 4.19.

4.5.2 Reduction formulae


In the above proof of the irrationality of e, we derived a recurrence relation for In using
integration by parts. It is also a very common technique for evaluating complicated integrals.

⌅ Example 4.12 For any $m, n \in \mathbb{N}\cup\{0\}$, we define
\[ I_{m,n} := \int \cos^m x\sin^n x\,dx. \]
Show that
\begin{align}
I_{m,n} &= -\frac{1}{m+n}\cos^{m+1}x\sin^{n-1}x + \frac{n-1}{m+n} I_{m,n-2}, \quad \forall m \ge 0,\ n \ge 2 \tag{4.4} \\
I_{m,n} &= \frac{1}{m+n}\cos^{m-1}x\sin^{n+1}x + \frac{m-1}{m+n} I_{m-2,n}, \quad \forall m \ge 2,\ n \ge 0 \tag{4.5}
\end{align}

⌅ Solution We just prove (4.4) and leave (4.5) as an exercise.
\begin{align*}
I_{m,n} &= \int \cos^m x\sin^n x\,dx = \int \cos^m x\sin^{n-1}x\,d(-\cos x) \\
&= -\cos^{m+1}x\sin^{n-1}x + \int \cos x\,d(\cos^m x\sin^{n-1}x) \\
&= -\cos^{m+1}x\sin^{n-1}x \\
&\quad + \int \cos x\left(m\cos^{m-1}x(-\sin x)\sin^{n-1}x + \cos^m x\cdot(n-1)\sin^{n-2}x\cdot\cos x\right)dx \\
&= -\cos^{m+1}x\sin^{n-1}x - m I_{m,n} + (n-1)\int \cos^{m+2}x\sin^{n-2}x\,dx \\
&= -\cos^{m+1}x\sin^{n-1}x - m I_{m,n} + (n-1)\int \cos^m x(1 - \sin^2 x)\sin^{n-2}x\,dx \\
&= -\cos^{m+1}x\sin^{n-1}x - m I_{m,n} + (n-1) I_{m,n-2} - (n-1) I_{m,n}.
\end{align*}
By rearrangement, we get (4.4).

⌅ Exercise 4.40 Prove (4.5).

Using (4.4), one can then compute some complicated integrals such as
\[ \int \cos^4 x\sin^6 x\,dx = I_{4,6} = -\frac{1}{4+6}\cos^5 x\sin^5 x + \frac{6-1}{4+6} I_{4,4}. \]
By applying (4.4) again to $I_{4,4}$, we can reduce it to $I_{4,2}$; applying (4.4) once more, we get $I_{4,0}$. Next we apply (4.5) to $I_{4,0}$ and reduce it to $I_{2,0}$, which can be easily computed by the half-angle formula:
\[ I_{2,0} = \int \cos^2 x\,dx = \int \frac{1 + \cos 2x}{2}\,dx = \frac{x}{2} + \frac{1}{4}\sin 2x + C. \]

⌅ Exercise 4.41 Complete the above reduction procedure and find the full expression of
\[ \int \cos^4 x\sin^6 x\,dx. \]

⌅ Exercise 4.42 For $I_{m,n}$ when at least one of $m$ and $n$ is odd, we could just use one of (4.4) and (4.5). Explain why.

Let’s also see an example of definite integrals.

⌅ Example 4.13 For any $n \in \mathbb{N}\cup\{0\}$, let
\[ I_n := \int_0^1 x^n\sqrt{1 - x}\,dx. \]
Find a recurrence relation for $\{I_n\}$, and deduce its general term.

⌅ Solution Using integration by parts, we can prove that for any $n \ge 1$:
\begin{align*}
I_n &= -\frac{2}{3}\int_0^1 x^n\,d\big((1-x)^{3/2}\big) \\
&= -\frac{2}{3}\left[x^n (1-x)^{3/2}\right]_0^1 + \frac{2}{3}\int_0^1 (1-x)^{3/2}\cdot n x^{n-1}\,dx \\
&= 0 + \frac{2n}{3}\int_0^1 (1-x)\cdot x^{n-1}\sqrt{1-x}\,dx \\
&= \frac{2n}{3}\int_0^1 x^{n-1}\sqrt{1-x}\,dx - \frac{2n}{3}\int_0^1 x^n\sqrt{1-x}\,dx \\
&= \frac{2n}{3} I_{n-1} - \frac{2n}{3} I_n.
\end{align*}
By rearrangement, we get:
\[ I_n = \frac{2n}{2n+3} I_{n-1} \quad \forall n \ge 1. \]
One can then apply this recurrence relation inductively and get for any $n \in \mathbb{N}$:
\[ I_n = \frac{2n}{2n+3} I_{n-1} = \frac{2n}{2n+3}\cdot\frac{2n-2}{2n+1} I_{n-2} = \cdots = \frac{(2n)(2n-2)(2n-4)\cdots(4)(2)}{(2n+3)(2n+1)(2n-1)\cdots(7)(5)} I_0. \]
We can compute that
\[ I_0 = \int_0^1 \sqrt{1-x}\,dx = \left[-\frac{2}{3}(1-x)^{3/2}\right]_0^1 = \frac{2}{3}. \]
This concludes that for any $n \in \mathbb{N}$:
\[ I_n = 2 \times \frac{(2n)!!}{(2n+3)!!}. \]
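
The closed form can be cross-checked against the recurrence using exact rational arithmetic. This is a small sketch; the function names below are chosen for illustration and are not part of the example.

```python
from fractions import Fraction

def double_factorial(k):
    """k!! = k (k - 2) (k - 4) ..., with 0!! = 1!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def I_by_recurrence(n):
    """Start from I_0 = 2/3 and apply I_k = 2k / (2k + 3) * I_{k-1}."""
    value = Fraction(2, 3)
    for k in range(1, n + 1):
        value *= Fraction(2 * k, 2 * k + 3)
    return value

def I_closed_form(n):
    """The general term 2 * (2n)!! / (2n + 3)!!."""
    return Fraction(2 * double_factorial(2 * n), double_factorial(2 * n + 3))

for n in range(6):
    print(n, I_by_recurrence(n), I_closed_form(n))
```

Since `Fraction` is exact, any disagreement between the two columns would indicate an algebra slip rather than rounding.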

⌅ Exercise 4.43 For any $m, n \in \mathbb{N}\cup\{0\}$, we define:
\[ I_{m,n} := \int_0^{\pi} e^{mx}\sin^n x\,dx. \]
Find a recurrence relation between $I_{m,n}$ and $I_{m,n-2}$, and show that:
\[ I_{m,n} = \frac{n!\,(e^{m\pi} - 1)}{m(m^2+4)(m^2+16)\cdots(m^2+n^2)} \]
for any $m \in \mathbb{N}\cup\{0\}$ and even $n \in \mathbb{N}$.

⌅ Exercise 4.44 — Source: HKAL 1996 Paper II Q12. For non-negative integers $k$ and $m$, define
\[ F(k, m) = \int_0^1 u^k (1 - u^2)^m\,du. \]
(a) Show that
\begin{align*}
F(k, 0) &= \frac{1}{k+1} \\
F(k, m) &= \frac{2m}{k+1} F(k+2, m-1) \quad \text{for } m \ge 1.
\end{align*}
(b) Show that
\[ F(k, m) = \frac{2^m (m!)}{(k+1)(k+3)\cdots(k+2m+1)}. \]
(c) Using (b), prove that $\int_0^{\pi/2} \cos^{2m+1}\theta\,d\theta = \frac{2^{2m}(m!)^2}{(2m+1)!}$.
(d) Show that $F(k, m) = \sum_{r=0}^{m} \frac{(-1)^r C^m_r}{2r + k + 1}$.

⌅ Exercise 4.45 — Source: HKAL 2010 Paper II Q9 (restructured). Answer the following questions:
(a) Prove that $\lim_{n\to\infty} \frac{2^n}{n!} = 0$.
(b) For any positive integer $n$, define $I_n := \int_1^e x^{-3}(\log x)^n\,dx$. Prove that for any $n \in \mathbb{N}$:
\[ I_n = n!\left(\frac{1}{2^{n+1}} - \frac{1}{e^2}\sum_{k=0}^{n} \frac{1}{(n-k)!\,2^{k+1}}\right). \]
(c) Prove that $e^{-2}x^{-1}(\log x)^n \le x^{-3}(\log x)^n \le x^{-1}(\log x)^n$ for all $x \in [1, e]$. Hence prove that
\[ \frac{1}{e^2(n+1)} \le I_n \le \frac{1}{n+1}. \]
(d) Using the above results, evaluate $\sum_{k=0}^{\infty} \frac{2^k}{k!}$.

4.5.3 Taylor’s remainder in integral form

Using integration by parts, one can derive a new form of remainder to the Taylor series (in addition to the Cauchy’s and Lagrange’s forms discussed in MATH 1023).

Proposition 4.19 — Taylor’s Remainder in Integral Form. Suppose $f$ is $C^{n+1}$ on an interval $I$ containing $a$, then we have:
\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + \frac{1}{n!}\int_a^x (x-t)^n f^{(n+1)}(t)\,dt \]
for any $x \in I$.

Proof. The key idea is to use integration by parts repeatedly. For each $m, n \in \mathbb{N}$ with $m \ge n$, we have:
\begin{align*}
I_{m,n} &:= \int_a^x (x-t)^m f^{(n+1)}(t)\,dt \\
&= \int_a^x (x-t)^m\,d f^{(n)}(t) \\
&= \left[(x-t)^m f^{(n)}(t)\right]_{t=a}^{t=x} - \int_a^x f^{(n)}(t)\cdot m(x-t)^{m-1}\cdot(-1)\,dt \\
&= -(x-a)^m f^{(n)}(a) + m I_{m-1,n-1}.
\end{align*}
Applying this recurrence relation repeatedly, we get
\begin{align*}
I_{m,n} &= -(x-a)^m f^{(n)}(a) + m\left(-(x-a)^{m-1} f^{(n-1)}(a) + (m-1) I_{m-2,n-2}\right) \\
&= -(x-a)^m f^{(n)}(a) - m(x-a)^{m-1} f^{(n-1)}(a) \\
&\quad + m(m-1)\left(-(x-a)^{m-2} f^{(n-2)}(a) + (m-2) I_{m-3,n-3}\right) \\
&= \cdots \\
&= -(x-a)^m f^{(n)}(a) - m(x-a)^{m-1} f^{(n-1)}(a) \\
&\quad - m(m-1)(x-a)^{m-2} f^{(n-2)}(a) - m(m-1)(m-2)(x-a)^{m-3} f^{(n-3)}(a) \\
&\quad - \cdots - m(m-1)(m-2)\cdots(m-n+2)(x-a)^{m-n+1} f'(a) \\
&\quad + m(m-1)(m-2)\cdots(m-n+1) I_{m-n,0} \\
&= \frac{m!}{(m-n)!} I_{m-n,0} - \sum_{k=1}^{n} \frac{m!}{(m-n+k)!}(x-a)^{m-n+k} f^{(k)}(a).
\end{align*}
In particular, when $m = n$, we have:
\begin{align*}
\frac{1}{n!} I_{n,n} &= \int_a^x f'(t)\,dt - \sum_{k=1}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k \\
&= f(x) - f(a) - \sum_{k=1}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k.
\end{align*}
By rearrangement, we can see that
\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + \frac{1}{n!}\int_a^x (x-t)^n f^{(n+1)}(t)\,dt \]
as desired. ⌅

Combining Proposition 4.19 with Exercise 4.7, one can give another proof of the Cauchy’s remainder theorem. Proposition 4.19 asserts that the remainder $R_n(x) = f(x) - T_n(x)$ is given by:
\[ R_n(x) = \frac{1}{n!}\int_a^x (x-t)^n f^{(n+1)}(t)\,dt, \]
and Exercise 4.7 shows that there exists $c$ between $a$ and $x$ such that
\[ \int_a^x (x-t)^n f^{(n+1)}(t)\,dt = (x-a)\cdot(x-c)^n f^{(n+1)}(c). \]
It shows
\[ R_n(x) = \frac{f^{(n+1)}(c)}{n!}(x-c)^n (x-a) \]
which is exactly the Cauchy’s remainder.

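⌅ Example 4.14 Prove that for any $n \in \mathbb{N}$, we have
\[ \left|\left(e + \frac{1}{e}\right) - 2\sum_{k=0}^{n} \frac{1}{(2k)!}\right| < \frac{2}{(2n)!}. \]

⌅ Solution Applying Proposition 4.19 with $f(x) = e^x$ and with $f(x) = e^{-x}$, $a = 0$ and order $2n$, we have
\begin{align*}
e^x &= 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^{2n-1}}{(2n-1)!} + \frac{x^{2n}}{(2n)!} + \frac{1}{(2n)!}\int_0^x (x-t)^{2n} e^t\,dt, \\
e^{-x} &= 1 - \frac{x}{1!} + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots - \frac{x^{2n-1}}{(2n-1)!} + \frac{x^{2n}}{(2n)!} + \frac{1}{(2n)!}\int_0^x (x-t)^{2n}(-e^{-t})\,dt.
\end{align*}
Putting $x = 1$ and adding the above results, we get:
\[ e + \frac{1}{e} = 2\left(1 + \frac{1}{2!} + \frac{1}{4!} + \cdots + \frac{1}{(2n)!}\right) + \frac{1}{(2n)!}\int_0^1 (1-t)^{2n}(e^t - e^{-t})\,dt. \]
Next we proceed to estimate:
\begin{align*}
\left|\left(e + \frac{1}{e}\right) - 2\sum_{k=0}^{n} \frac{1}{(2k)!}\right|
&= \left|\frac{1}{(2n)!}\int_0^1 (1-t)^{2n}(e^t - e^{-t})\,dt\right| \\
&\le \frac{1}{(2n)!}\int_0^1 (1-t)^{2n}\left|e^t - e^{-t}\right|dt \\
&\le \frac{1}{(2n)!}\int_0^1 (e^t - e^{-t})\,dt \qquad \text{as } (1-t)^{2n} \le 1 \text{ when } t \in [0, 1] \\
&= \frac{1}{(2n)!}\left[e^t + e^{-t}\right]_0^1 \\
&= \frac{1}{(2n)!}(e^1 + e^{-1} - 2).
\end{align*}
It is well-known that $2 < e < 3$, so $e + \frac{1}{e} - 2 < 3 + \frac{1}{2} - 2 < 2$. It proves:
\[ \left|\left(e + \frac{1}{e}\right) - 2\sum_{k=0}^{n} \frac{1}{(2k)!}\right| < \frac{2}{(2n)!} \]
as desired.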
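
The inequality is easy to observe numerically for small $n$. The sketch below is an illustration only; the function name `gap` and the range of $n$ are choices made here, and floating-point rounding limits how large $n$ can meaningfully be taken.

```python
import math

def gap(n):
    """|(e + 1/e) - 2 * sum_{k=0}^{n} 1/(2k)!| evaluated in floating point."""
    partial = sum(1 / math.factorial(2 * k) for k in range(n + 1))
    return abs((math.e + 1 / math.e) - 2 * partial)

for n in range(1, 7):
    bound = 2 / math.factorial(2 * n)
    print(n, gap(n), bound, gap(n) < bound)
```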

⌅ Exercise 4.46 Assume all conditions given in Proposition 4.19. Use the proposition to prove that if $|f^{(n+1)}(x)| \le M$ for all $x$ in an interval $I$ containing $a$, then
\[ |R_n(x)| \le \frac{M}{(n+1)!}|x - a|^{n+1} \]
for any $x \in I$.

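⌅ Exercise 4.47 — Source: HKAL 1993 Paper II Q12 (modified). (a) Show that
\[ \tan^{-1} x = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots + \frac{(-1)^{n-1}}{2n-1}x^{2n-1} + \int_0^x \frac{(-1)^n t^{2n}}{1 + t^2}\,dt \]
for all $x \in \mathbb{R}$ and $n = 1, 2, 3, \cdots$.
(b) Using (a), or otherwise, show that
\[ \left|\tan^{-1} x - \left(x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots + \frac{(-1)^{n-1}}{2n-1}x^{2n-1}\right)\right| \le \frac{x^{2n+1}}{2n+1} \]
for all $x \ge 0$ and $n = 1, 2, 3, \cdots$. Hence find $\sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1}$.
(c) Show that $\tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{3} = \frac{\pi}{4}$. Deduce that
\[ \left|\frac{\pi}{4} + \sum_{k=1}^{n} \frac{(-1)^k}{2k-1}\left(\frac{1}{2^{2k-1}} + \frac{1}{3^{2k-1}}\right)\right| \le \frac{1}{n\cdot 2^{2n+1}} \]
for $n = 1, 2, 3, \cdots$.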

4.5.4 Young’s and Hölder’s inequalities


In this subsection we use integration by parts and by substitutions to derive the Young’s inequality, which is a generalization of the trivial result $ab \le \frac{a^2}{2} + \frac{b^2}{2}$ for $a, b \ge 0$. We first prove the following integral inequality:
Proposition 4.20 Given $c > 0$, let $f : [0, c] \to \mathbb{R}$ be a strictly increasing differentiable function on $[0, c]$ with $f(0) = 0$. Then, for all $a \in [0, c]$ and $b \in [0, f(c)]$,
\[ ab \le \int_0^a f(x)\,dx + \int_0^b f^{-1}(y)\,dy \]
with equality if and only if $b = f(a)$. The geometric meaning of the inequality can be found in Figure 4.5.

Figure 4.5: Graphical meaning of Proposition 4.20

Proof. By the intermediate value theorem, we can take $b = f(\varphi)$ for some $\varphi \in [0, c]$. Consider the second integral and let $z = f^{-1}(y)$, then when $y = 0$, $z = 0$; and when $y = b = f(\varphi)$, $z = \varphi$. Also, we have
\[ f(z) = y \implies f'(z)\,dz = dy. \]
Using integration by substitution and then by parts, we get:
\begin{align*}
\int_0^b f^{-1}(y)\,dy &= \int_{z=0}^{z=\varphi} z f'(z)\,dz = \int_{x=0}^{x=\varphi} x f'(x)\,dx \\
&= [x f(x)]_0^{\varphi} - \int_0^{\varphi} f(x)\,dx = \varphi b - \int_0^{\varphi} f(x)\,dx.
\end{align*}
We first assume $\varphi \le a$, then as $f$ is increasing, we have $f(x) \ge f(\varphi) = b$ for any $x \in [\varphi, a]$, and so
\begin{align*}
\int_0^a f(x)\,dx &= \int_0^{\varphi} f(x)\,dx + \int_{\varphi}^{a} f(x)\,dx \\
&\ge \int_0^{\varphi} f(x)\,dx + \int_{\varphi}^{a} b\,dx = \int_0^{\varphi} f(x)\,dx + b(a - \varphi).
\end{align*}
Adding both results, we get:
\[ \int_0^a f(x)\,dx + \int_0^b f^{-1}(y)\,dy \ge \varphi b + (a - \varphi)b = ab. \]
Equality holds if and only if $f(x) = b$ for all $x \in [\varphi, a]$. However, $f$ is strictly increasing, so it would happen only when $a = \varphi$ (equivalently, $f(a) = f(\varphi) = b$).
We leave it as an exercise for readers to prove the case $\varphi > a$. ⌅

⌅ Exercise 4.48 Complete the proof of the case ' > a. Hint: draw a diagram to get some
geometric idea.

⌅ Exercise 4.49 — Source: MATH1024 Spring 2018 Midterm. Consider a bijective function $f : [a, b] \to [f(a), f(b)]$ where $b > a > 0$ and $f(b) > f(a) > 0$, and given that $f$ is differentiable on $[a, b]$ and $f'(x) > 0$ on $(a, b)$.
(a) By sketching a diagram, guess the value of:
\[ \int_a^b f(x)\,dx + \int_{f(a)}^{f(b)} f^{-1}(y)\,dy \]
in terms of $a$, $b$, $f(a)$, $f(b)$.
(b) Prove your claim in (a) using integration by substitution.
(c) Let $g(x) = 5\sqrt{x} - 6$. Show that the definite integral:
\[ \int_4^9 g(g(g(g(g(x)))))\,dx \]
is a rational number.

Corollary 4.21 — Young’s Inequality. For any $a, b \ge 0$ and $p, q > 1$ such that $\frac{1}{p} + \frac{1}{q} = 1$, we have:
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}. \]
Equality holds if and only if $b = a^{p-1}$.

Proof. Note that the result is trivial if one of $a, b$ is zero. We now assume $a, b > 0$. We just apply Proposition 4.20 to the function $f(x) = x^{p-1}$, whose derivative is $f'(x) = (p-1)x^{p-2} > 0$ on $(0, \infty)$. Since $\frac{1}{p} + \frac{1}{q} = 1$ gives $\frac{1}{p-1} = q - 1$, the inverse function is given by $f^{-1}(x) = x^{1/(p-1)} = x^{q-1}$. It can be shown easily that:
\[ \int_0^a f(x)\,dx = \frac{a^p}{p} \quad \text{and} \quad \int_0^b f^{-1}(x)\,dx = \frac{b^q}{q}. \]

Young’s inequality can be used to prove another (even more important) inequality, the Hölder’s inequality, which plays a crucial role in functional analysis. For simplicity, we first denote for each $p \ge 1$ the $l^p$-norm of a finite sequence $\{x_n\}_{n=1}^{N}$ and the $L^p$-norm of a continuous function $f$ on $[a, b]$ by:
\begin{align*}
\|\{x_n\}\|_p &:= \left(\sum_{n=1}^{N} |x_n|^p\right)^{\frac{1}{p}} \\
\|f\|_p &:= \left(\int_a^b |f(x)|^p\,dx\right)^{\frac{1}{p}}
\end{align*}
It is clear that for any $c \in \mathbb{R}$, $\|\{c x_n\}\|_p = |c|\,\|\{x_n\}\|_p$ and $\|cf\|_p = |c|\,\|f\|_p$, and so $\frac{\{x_n\}}{\|\{x_n\}\|_p}$ and $\frac{f}{\|f\|_p}$ have unit $l^p$- and $L^p$-norms. The Hölder’s inequality is the following:

Proposition 4.22 — Hölder’s Inequality. Given $p, q > 1$ such that $\frac{1}{p} + \frac{1}{q} = 1$, then we have:
\begin{align*}
\|\{x_n y_n\}\|_1 &\le \|\{x_n\}\|_p\,\|\{y_n\}\|_q \\
\|fg\|_1 &\le \|f\|_p\,\|g\|_q
\end{align*}
for any finite sequences $\{x_n\}$ and $\{y_n\}$ and any continuous functions $f$ and $g$ on $[a, b]$.

Proof. We prove the Hölder’s inequality for functions and leave the sequence’s version as an exercise for readers. Using the Young’s inequality with $a = \frac{|f(x)|}{\|f\|_p}$ and $b = \frac{|g(x)|}{\|g\|_q}$, we have:
\[ \frac{|f(x)|\,|g(x)|}{\|f\|_p\,\|g\|_q} \le \frac{1}{p}\left(\frac{|f(x)|}{\|f\|_p}\right)^p + \frac{1}{q}\left(\frac{|g(x)|}{\|g\|_q}\right)^q \quad \forall x \in [a, b]. \]
Therefore, by integrating both sides over $[a, b]$, we get:
\[ \int_a^b \frac{|f(x)|\,|g(x)|}{\|f\|_p\,\|g\|_q}\,dx \le \int_a^b \frac{1}{p}\left(\frac{|f(x)|}{\|f\|_p}\right)^p dx + \int_a^b \frac{1}{q}\left(\frac{|g(x)|}{\|g\|_q}\right)^q dx. \]
Note that the norms are all constants, so we have:
\[ \frac{\|fg\|_1}{\|f\|_p\,\|g\|_q} \le \frac{1}{p\,\|f\|_p^p}\int_a^b |f(x)|^p\,dx + \frac{1}{q\,\|g\|_q^q}\int_a^b |g(x)|^q\,dx. \]
Note that
\[ \|f\|_p^p = \int_a^b |f(x)|^p\,dx \]
and similarly for $g$, so we get
\[ \frac{\|fg\|_1}{\|f\|_p\,\|g\|_q} \le \frac{1}{p} + \frac{1}{q} = 1, \]
and so our desired result holds.
Note that we assumed $f$ and $g$ are not identically 0 in the above proof. If one of them is identically zero, the result is trivial. ⌅

⌅ Exercise 4.50 Prove the sequence version of the Hölder’s inequality.

⌅ Exercise 4.51 When does the equality hold for the Hölder’s inequality?

i Clearly, the Hölder’s inequality is a generalization of the well-known Cauchy-Schwarz’s


inequality. The latter is a special case p = q = 2.

⌅ Exercise 4.52 — Source: HKAL 2002 Paper I Q8. Answer the following questions:
(a) Proof of the Cauchy-Schwarz’s inequality (omitted here).
(b) (i) Prove that
\[ \left(\frac{\sum_{i=1}^{n} x_i}{n}\right)^2 \le \frac{\sum_{i=1}^{n} x_i^2}{n}, \]
where $x_1, x_2, \cdots, x_n$ are real.
(ii) Prove that
\[ \left(\sum_{i=1}^{n} \lambda_i x_i\right)^2 \le \left(\sum_{i=1}^{n} \lambda_i\right)\left(\sum_{i=1}^{n} \lambda_i x_i^2\right), \]
where $x_1, x_2, \cdots, x_n$ are real numbers and $\lambda_1, \lambda_2, \cdots, \lambda_n$ are positive numbers. When does the equality hold?
(iii) Prove that
\[ \left(\frac{y_1}{t} + \frac{y_2}{t^2} + \cdots + \frac{y_n}{t^n}\right)^2 < \frac{y_1^2}{t} + \frac{y_2^2}{t^2} + \cdots + \frac{y_n^2}{t^n}, \]
where $y_1, \cdots, y_n$ are real numbers, not all zero, and $t \ge 2$.

4.5.5 Basel Problem


The Basel Problem is about finding the exact value of
\[ 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots = \sum_{n=1}^{\infty} \frac{1}{n^2}. \]
It was first posed by Pietro Mengoli in 1644, and was first solved by Euler in 1734 using infinite products. He found that the exact value of this infinite sum is $\frac{\pi^2}{6}$. The name of the problem, Basel, is the name of a city in Switzerland near the border with France and Germany. The city is the hometown of Euler and the Bernoulli family.

The original proof of Euler used infinite products, which will not be discussed in the course, but there are other proofs using different techniques. Some use more advanced tools such as Fourier series and complex analysis. Below are two proofs that can be understood with some basic knowledge about integration by parts. They are restructured as two exercises below:

⌅ Exercise 4.53 — Source: MATH1024 Spring 2018 Final Exam. For each integer $n \ge 0$, we define
\[ A_n := \int_0^{\pi/2} \cos^{2n} x\,dx, \qquad B_n := \int_0^{\pi/2} x^2\cos^{2n} x\,dx. \]
(a) Show that for any integer $n \ge 1$, we have $2\left(\frac{B_{n-1}}{A_{n-1}} - \frac{B_n}{A_n}\right) = \frac{1}{n^2}$.
(b) Show that there exists a constant $C > 0$, independent of $n$, such that
\[ B_n \le \frac{C}{n+1} A_n \]
for any integer $n \ge 1$. [Hint: Compare $\sin x$ with a linear function on $0 \le x \le \frac{\pi}{2}$.]
(c) Using the above results, show that $\zeta(2) := \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$.

⌅ Exercise 4.54 Define the sequence of functions $\{f_n(x)\}_{n=1}^{\infty}$ and the function $g(x)$:
\begin{align*}
f_n(x) &:= \frac{1}{2} + \cos x + \cos 2x + \cdots + \cos nx \\
g(x) &:= \begin{cases} \dfrac{x/2}{\sin(x/2)} & \text{if } x \neq 0 \\ 1 & \text{if } x = 0 \end{cases}
\end{align*}
Consider the integral:
\[ E_n := \int_0^{\pi} x f_n(x)\,dx. \]
(a) Show that
\[ E_n = \frac{\pi^2}{4} + \sum_{k=1}^{n} \frac{(-1)^k - 1}{k^2}, \]
and so $E_{2n-1} = \frac{\pi^2}{4} - 2\sum_{k=1}^{n} \frac{1}{(2k-1)^2}$ for any $n \in \mathbb{N}$.
(b) Show that $g$ is a $C^1$ function, and that
\[ E_{2n-1} = \frac{1}{4n-1}\left(2 + 2\int_0^{\pi} g'(x)\cos\frac{(4n-1)x}{2}\,dx\right). \]
(c) Prove that $E_{2n-1} \to 0$ as $n \to \infty$, and show $\zeta(2) := \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$.
[Remark: We may need the fact that rearrangement of a convergent series of positive numbers preserves its value. We will prove it later.]

4.5.6 Irrationality of ⇡
In this subsection we present two proofs of irrationality of ⇡. They are again restructured as two
exercises.
⌅ Exercise 4.55 — Source: Exam at Cambridge University 1945, written by Mary Cartwright. Consider the sequence of functions of $x$:
\[ I_n(x) := \int_{-1}^{1} (1 - z^2)^n\cos(xz)\,dz, \]
where $n \in \mathbb{N}\cup\{0\}$.
(a) Show that for any $n \ge 2$ and $x \in \mathbb{R}$, we have:
\[ x^2 I_n(x) = 2n(2n-1) I_{n-1}(x) - 4n(n-1) I_{n-2}(x). \]
(b) Define $J_n(x) := x^{2n+1} I_n(x)$. Prove that for any $n \in \mathbb{N}$,
\[ J_n(x) = n!\,\big(P_n(x)\sin(x) + Q_n(x)\cos(x)\big) \]
where $P_n$ and $Q_n$ are polynomials with integer coefficients and $\deg P_n, \deg Q_n \le n$.
(c) Now assume $\pi \in \mathbb{Q}$ and write $\pi = \frac{2a}{b}$ where $a, b \in \mathbb{N}$. Note that we do not assume $\frac{2a}{b}$ is in the simplest form, so we can assume that the numerator is even. Verify the following:
(i) For any $n \in \mathbb{N}$, we have
\[ \frac{a^{2n+1}}{n!} I_n(\pi/2) = P_n(\pi/2)\,b^{2n+1}. \]
(ii) Deduce a contradiction by showing that LHS $\to 0$ as $n \to \infty$ whereas RHS is always a positive integer.

⌅ Exercise 4.56 — Source: Exercise in a Bourbaki’s book [a]. For each $n \in \mathbb{N}\cup\{0\}$ and $b \in \mathbb{N}$, we define
\[ A_n(b) := b^n\int_0^{\pi} \frac{x^n(\pi - x)^n}{n!}\sin x\,dx. \]
(a) Prove that for each $b \in \mathbb{N}$, we have $0 < A_n(b) < 1$ for sufficiently large $n$.
(b) Now suppose $\pi$ is rational and $\pi = \frac{a}{b}$ for some $a, b \in \mathbb{N}$. Consider the polynomial $f(x) := \frac{x^n(a - bx)^n}{n!}$, prove that:
\[ A_n(b) = [-f(x)\cos x]_0^{\pi} - [-f'(x)\sin x]_0^{\pi} + \cdots \pm \left[f^{(2n)}(x)\cos x\right]_0^{\pi} \pm \int_0^{\pi} f^{(2n+1)}(x)\cos x\,dx. \]
(c) Hence, by showing $A_n(b)$ is an integer, deduce a contradiction.

[a] Bourbaki is a group of prominent mathematicians, including Cartan and Weil, who co-authored a huge collection of books and treatises in various topics of pure mathematics. There is one related joke: “When did Bourbaki stop writing books? Answer: After they realized that Serge Lang is a single person.”

4.6 Numerical Methods of Integrations


Experience from previous chapters told us that finding the exact value of an integral could be very difficult. While it is easy to integrate $\int_a^b x e^{-x^2}\,dx$, the integral $\int_a^b e^{-x^2}\,dx$ has no closed form in terms of elementary functions for general $a$ and $b$, even though this integral is important in statistics (normal distribution). In view of this, mathematicians have developed various workable ways of finding approximate values of a definite integral.

4.6.1 Left-Hand, Mid-Point, and Right-Hand Sums


The key idea behind left-hand, mid-point and right-hand sums is to approximate the region under the graph $y = f(x)$ by bar-charts (i.e. rectangles). Consider a function $f : [a, b] \to \mathbb{R}$. We define $P_n$ to be the uniform partition of $[a, b]$:
\[ P_n : a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b \]
where $x_i = a + \frac{b-a}{n} i$. We then define
\begin{align*}
L_n &:= \frac{b-a}{n}\sum_{i=1}^{n} f(x_{i-1}) \\
M_n &:= \frac{b-a}{n}\sum_{i=1}^{n} f\left(\frac{x_{i-1} + x_i}{2}\right) \\
R_n &:= \frac{b-a}{n}\sum_{i=1}^{n} f(x_i)
\end{align*}
to be the $n$-th left-hand sum, mid-point sum, and right-hand sum respectively.
Practically, we could choose $n$ to be a large integer (say 100) and compute $L_{100}$, $M_{100}$ and $R_{100}$ directly using, for instance, a spreadsheet app. We leave it for readers to play around with Excel on computing these sums. Our emphasis in this section is to determine how accurate these approximations are.
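
For readers who prefer a script to a spreadsheet, the three sums can be computed as in the following minimal sketch. The function name `riemann_sums` and the test integrand $t^2$ on $[0, 1]$ are choices made here for illustration.

```python
def riemann_sums(f, a, b, n):
    """Return (L_n, M_n, R_n) for f on the uniform partition of [a, b] into n pieces."""
    h = (b - a) / n
    x = [a + h * i for i in range(n + 1)]  # partition points x_0, ..., x_n
    left = h * sum(f(x[i - 1]) for i in range(1, n + 1))
    mid = h * sum(f((x[i - 1] + x[i]) / 2) for i in range(1, n + 1))
    right = h * sum(f(x[i]) for i in range(1, n + 1))
    return left, mid, right

# Sample usage: the exact value of the integral of t^2 over [0, 1] is 1/3.
L, M, R = riemann_sums(lambda t: t * t, 0.0, 1.0, 100)
print(L, M, R)
```

Since $t^2$ is increasing on $[0, 1]$, the left-hand sum underestimates and the right-hand sum overestimates the integral, while the mid-point sum lands much closer to $\frac{1}{3}$.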

Proposition 4.23 Let $f$ be a $C^1$ function on $[a, b]$. Then, the error between the left-hand sum $L_n$ or the right-hand sum $R_n$ (defined previously) and the actual integral $\int_a^b f(x)\,dx$ is bounded by:
\begin{align*}
\left|\int_a^b f(x)\,dx - L_n\right| &\le \frac{(b-a)^2}{2n}\sup_{[a,b]} |f'| \\
\left|\int_a^b f(x)\,dx - R_n\right| &\le \frac{(b-a)^2}{2n}\sup_{[a,b]} |f'|
\end{align*}

Proof. Consider the uniform partition $P_n$ of $[a, b]$ and denote the partition points by $x_i$ where $i = 0, 1, \cdots, n$. By the Newton-Leibniz’s Formula, we get
\[ f(x) = f(x_{i-1}) + \int_{x_{i-1}}^{x} f'(t)\,dt, \quad \forall x \in [x_{i-1}, x_i]. \]
Then, we have
\begin{align*}
\int_a^b f(x)\,dx &= \sum_{i=1}^{n}\int_{x_{i-1}}^{x_i} f(x)\,dx = \sum_{i=1}^{n}\int_{x_{i-1}}^{x_i}\left(f(x_{i-1}) + \int_{x_{i-1}}^{x} f'(t)\,dt\right)dx \\
&= \underbrace{\sum_{i=1}^{n} f(x_{i-1})(x_i - x_{i-1})}_{=L_n} + \sum_{i=1}^{n}\int_{x_{i-1}}^{x_i}\left(\int_{x_{i-1}}^{x} f'(t)\,dt\right)dx.
\end{align*}
The first term is exactly $L_n$, so the second double integral term gives the error between $\int_a^b f(x)\,dx$ and $L_n$. Next we estimate:
\[ \left|\int_{x_{i-1}}^{x} f'(t)\,dt\right| \le \int_{x_{i-1}}^{x} |f'(t)|\,dt \le \int_{x_{i-1}}^{x} \sup_{[a,b]} |f'|\,dt = \sup_{[a,b]} |f'|\cdot(x - x_{i-1}). \]
This shows
\begin{align*}
\left|\int_a^b f(x)\,dx - L_n\right| &= \left|\sum_{i=1}^{n}\int_{x_{i-1}}^{x_i}\left(\int_{x_{i-1}}^{x} f'(t)\,dt\right)dx\right| \\
&\le \sum_{i=1}^{n}\left|\int_{x_{i-1}}^{x_i}\left(\int_{x_{i-1}}^{x} f'(t)\,dt\right)dx\right| \le \sum_{i=1}^{n}\int_{x_{i-1}}^{x_i}\left|\int_{x_{i-1}}^{x} f'(t)\,dt\right|dx \\
&\le \sum_{i=1}^{n}\int_{x_{i-1}}^{x_i} (x - x_{i-1})\sup_{[a,b]} |f'|\,dx \\
&= \sum_{i=1}^{n} \frac{(x_i - x_{i-1})^2}{2}\sup_{[a,b]} |f'| = \sum_{i=1}^{n} \frac{(b-a)^2}{2n^2}\sup_{[a,b]} |f'| \\
&= \frac{(b-a)^2}{2n}\sup_{[a,b]} |f'|
\end{align*}
where we used the fact that $P_n$ is a uniform partition so that $x_i - x_{i-1} = \frac{b-a}{n}$.
The proof for the right-hand sum is similar, mutatis mutandis. ⌅

i In the above proof, one may use the mean-value theorem instead of the integral remainder of the Taylor series. We use the latter because it may result in a sharper estimate in some other error estimations.

⌅ Exercise 4.57 Write up the proof of Proposition 4.23 using mean-value theorem instead.

⌅ Exercise 4.58 Write up the proof of the right-hand sum part in Proposition 4.23. Clearly
point out what are the essential differences from the proof of the left-hand sum.

⌅ Example 4.15 To see how large $n$ needs to be in order to estimate $\int_1^3 e^{-x^2}\,dx$ up to 4 decimal places, we need to find an $n$ so that
\[ \frac{(3-1)^2}{2n}\sup_{[1,3]}\left|(e^{-x^2})'\right| < 0.00001. \]
By straight-forward differentiation, we get
\[ \frac{d}{dx} e^{-x^2} = -2x e^{-x^2} \implies \left|2x e^{-x^2}\right| \le 2\times 3\times e^{-1} \ \ \forall x \in [1, 3] \implies \sup_{[1,3]}\left|(e^{-x^2})'\right| \le \frac{6}{e}. \]
So we need an $n$ such that
\[ \frac{4}{2n}\cdot\frac{6}{e} < 0.00001. \]
It can be achieved when $n \ge 441456$.
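
The threshold on $n$ can be reproduced with a few lines of arithmetic. This is a small sketch; the variable names are introduced here for illustration.

```python
import math

tolerance = 0.00001
sup_abs_derivative = 6 / math.e  # the bound on |(e^{-x^2})'| over [1, 3] used above
a, b = 1.0, 3.0

# Smallest integer n with (b - a)^2 / (2n) * sup|f'| < tolerance.
threshold = (b - a) ** 2 * sup_abs_derivative / (2 * tolerance)
n = math.floor(threshold) + 1
print(n)
```

The value depends on the crude bound $\frac{6}{e}$; a sharper bound on the derivative would give a smaller sufficient $n$.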

ˆ 1
⌅ Exercise 4.59 Find an n such that the left-hand sum Ln gives an approximation of sin(x2 ) dx
2
with accuracy up to 5 decimal places.

i Note that the n that we find above may not be the least possible n.

Next we discuss the error estimation of the mid-point sum. We want to prove a more general result, in which the sample point $x_i^*$ in each sub-interval $[x_{i-1}, x_i]$ of a uniform partition is a weighted average of the end points $x_{i-1}$ and $x_i$ with weights $1 - \lambda : \lambda$.

Proposition 4.24 Let $f(x) : [a, b] \to \mathbb{R}$ be a $C^1$ function defined on a bounded interval $[a, b]$. Fix a constant $\lambda \in [0, 1]$ and a large positive integer $n$. Consider the uniform partition of $[a, b]$:
\[ \{a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b\} \]
and define a numerical approximation of $\int_a^b f(x)\,dx$ by:
\[ A_n := \sum_{i=1}^{n} f(x_i^*)\cdot(x_i - x_{i-1}), \quad \text{where } x_i^* := (1 - \lambda)x_{i-1} + \lambda x_i. \]
Then we have:
\[ \left|\int_a^b f(x)\,dx - A_n\right| \le \frac{(1 - 2\lambda + 2\lambda^2)(b-a)^2}{2n}\sup_{[a,b]} |f'|. \]

Proof. The proof is a modification of that of Proposition 4.23. Since we have demonstrated the use of integral remainder when proving Proposition 4.23, we use the mean-value theorem this time. For any $x \in [x_{i-1}, x_i]$, the mean-value theorem shows there exists $c_i$ (depending on $i$ and $x$) between $x$ and $x_i^*$ such that
\[ f(x) = f(x_i^*) + f'(c_i)(x - x_i^*). \]
Then,
\[ \int_a^b f(x)\,dx = \sum_{i=1}^{n}\int_{x_{i-1}}^{x_i} f(x)\,dx = \underbrace{\sum_{i=1}^{n} f(x_i^*)(x_i - x_{i-1})}_{=A_n} + \sum_{i=1}^{n}\int_{x_{i-1}}^{x_i} f'(c_i)(x - x_i^*)\,dx. \]
The first term is $A_n$, so the second term gives the error between the integral $\int_a^b f(x)\,dx$ and $A_n$. Next we estimate the second term:
\begin{align*}
\left|\sum_{i=1}^{n}\int_{x_{i-1}}^{x_i} f'(c_i)(x - x_i^*)\,dx\right|
&\le \sum_{i=1}^{n}\int_{x_{i-1}}^{x_i} |f'(c_i)|\,|x - x_i^*|\,dx \\
&\le \sum_{i=1}^{n}\sup_{[a,b]} |f'|\int_{x_{i-1}}^{x_i} |x - x_i^*|\,dx.
\end{align*}
Then we need to compute the integral of $|x - x_i^*|$ over $[x_{i-1}, x_i]$. Note that $x - x_i^* \le 0$ on $[x_{i-1}, x_i^*]$ and $x - x_i^* \ge 0$ on $[x_i^*, x_i]$, so we have
\begin{align*}
\int_{x_{i-1}}^{x_i} |x - x_i^*|\,dx &= \int_{x_{i-1}}^{x_i^*} (x_i^* - x)\,dx + \int_{x_i^*}^{x_i} (x - x_i^*)\,dx \\
&= \frac{1}{2}(x_i^* - x_{i-1})^2 + \frac{1}{2}(x_i - x_i^*)^2 \\
&= \frac{1}{2}\lambda^2 (x_i - x_{i-1})^2 + \frac{1}{2}(1 - \lambda)^2 (x_i - x_{i-1})^2 \\
&= \frac{(b-a)^2}{2n^2}(2\lambda^2 - 2\lambda + 1).
\end{align*}
That shows
\[ \left|\int_a^b f(x)\,dx - A_n\right| \le \sum_{i=1}^{n}\sup_{[a,b]} |f'|\cdot\frac{(b-a)^2}{2n^2}(2\lambda^2 - 2\lambda + 1) = \sup_{[a,b]} |f'|\cdot\frac{(b-a)^2}{2n}(2\lambda^2 - 2\lambda + 1) \]
as desired. ⌅

The quadratic function $2\lambda^2 - 2\lambda + 1$ achieves its minimum at $\lambda = \frac{1}{2}$. Therefore, the mid-point sum has the smallest error bound among all $\lambda$-sums.

⌅ Exercise 4.60 — Source: MATH1024 Spring 2018 Midterm. Let $f : [a, b] \to \mathbb{R}$ be a $C^2$ function on $[a, b]$, and let $A_n$ be as in Proposition 4.24. Show that:
\[ \left|\int_a^b f(x)\,dx - A_n\right| \le \frac{|1 - 2\lambda|\,(b-a)^2}{2n}\sup_{[a,b]} |f'| + \frac{(1 - 3\lambda + 3\lambda^2)(b-a)^3}{6n^2}\sup_{[a,b]} |f''|. \]
[Hint: Consider second-order Taylor’s approximation and its remainder.]

4.6.2 Trapezoidal Rule


The trapezoidal rule, as the name implies, approximates the area under the graph of a function
by trapeziums. It typically gives a better approximation than left-hand and right-hand sums
because the trapeziums form a piecewise linear graph that fits the function better than a step
function does.
Given a function $f : [a, b] \to \mathbb{R}$, we again consider uniform partitions $\{x_i = a + i\Delta x\}_{i=0}^{n}$ where $\Delta x = \frac{b-a}{n}$. Then, the total area of these trapeziums (as shown in Figure ??) is given by
\begin{align*}
T_n &= \left(\frac{f(x_0) + f(x_1)}{2} + \frac{f(x_1) + f(x_2)}{2} + \frac{f(x_2) + f(x_3)}{2} + \cdots + \frac{f(x_{n-1}) + f(x_n)}{2}\right)\cdot\Delta x \\
&= \left(\frac{f(x_0) + f(x_n)}{2} + f(x_1) + f(x_2) + \cdots + f(x_{n-1})\right)\cdot\Delta x \\
&= \left(\frac{f(a) + f(b)}{2} + f(x_1) + f(x_2) + \cdots + f(x_{n-1})\right)\cdot\frac{b-a}{n}.
\end{align*}
With this formula, the trapezium sum $T_n$ can be easily computed using spreadsheet apps. As before, we are more interested in its error estimation:
Proposition 4.25 Let $f : [a, b] \to \mathbb{R}$ be a $C^2$ function on $[a, b]$, and $T_n$ be the $n$-th trapezoidal sum of the integral $\int_a^b f(x)\,dx$, then we have
\[ \left|\int_a^b f(x)\,dx - T_n\right| \le \frac{(b-a)^3}{12n^2}\sup_{[a,b]} |f''|. \]

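Proof. Denote the partition points by $x_i = a + i\Delta x$ where $\Delta x = \frac{b-a}{n}$. First, we denote
\[ A_i = -\frac{x_{i+1} + x_i}{2} \quad \text{and} \quad B = -\left(\frac{x_{i+1} - x_i}{2}\right)^2 = -\frac{(\Delta x)^2}{4}. \]
It can be verified easily that:
\begin{align*}
\Big[(x + A_i) f(x)\Big]_{x_i}^{x_{i+1}} &= \frac{f(x_i) + f(x_{i+1})}{2}\cdot\Delta x \\
\Big[\big((x + A_i)^2 + B\big) f'(x)\Big]_{x_i}^{x_{i+1}} &= 0
\end{align*}
for any $i = 0, 1, 2, \ldots, n-1$. Then, observe that
\begin{align*}
&\frac{d}{dx}\left((x + A_i) f(x) - \frac{1}{2}\big((x + A_i)^2 + B\big) f'(x)\right) \\
&= f(x) + (x + A_i) f'(x) - \frac{1}{2}\cdot 2(x + A_i) f'(x) - \frac{1}{2}\big((x + A_i)^2 + B\big) f''(x) \\
&= f(x) - \frac{1}{2}\big((x + A_i)^2 + B\big) f''(x).
\end{align*}
Hence, using the Newton-Leibniz’s Formula and by our choice of $A_i$ and $B$, we get
\begin{align*}
\int_{x_i}^{x_{i+1}} f(x)\,dx - \int_{x_i}^{x_{i+1}} \frac{1}{2}\big((x + A_i)^2 + B\big) f''(x)\,dx
&= \left[(x + A_i) f(x) - \frac{1}{2}\big((x + A_i)^2 + B\big) f'(x)\right]_{x_i}^{x_{i+1}} \\
&= \frac{f(x_i) + f(x_{i+1})}{2}\cdot\Delta x.
\end{align*}
Note that $\frac{f(x_i) + f(x_{i+1})}{2}\cdot\Delta x$ is the area of the $i$-th trapezium, so we conclude that
\begin{align*}
\int_a^b f(x)\,dx &= \sum_{i=0}^{n-1}\int_{x_i}^{x_{i+1}} f(x)\,dx = \sum_{i=0}^{n-1} \frac{f(x_i) + f(x_{i+1})}{2}\cdot\Delta x + \sum_{i=0}^{n-1}\int_{x_i}^{x_{i+1}} \frac{1}{2}\big((x + A_i)^2 + B\big) f''(x)\,dx \\
&= T_n + \sum_{i=0}^{n-1}\int_{x_i}^{x_{i+1}} \frac{1}{2}\big((x + A_i)^2 + B\big) f''(x)\,dx.
\end{align*}
Hence, the integral terms give the error between $\int_a^b f(x)\,dx$ and $T_n$. It is worth noting that
\[ (x + A_i)^2 + B = (x - x_{i+1})(x - x_i) \le 0 \quad \text{on } [x_i, x_{i+1}]. \]
Therefore the error term is bounded by
\begin{align*}
\left|\int_a^b f(x)\,dx - T_n\right| &\le \frac{1}{2}\sum_{i=0}^{n-1}\int_{x_i}^{x_{i+1}} (x_{i+1} - x)(x - x_i)\,|f''(x)|\,dx \\
&\le \frac{1}{2}\sum_{i=0}^{n-1}\int_{x_i}^{x_{i+1}} (x_{i+1} - x)(x - x_i)\,dx\cdot\sup_{[a,b]} |f''| \\
&= \frac{1}{2}\sum_{i=0}^{n-1} \frac{1}{6}(\Delta x)^3\cdot\sup_{[a,b]} |f''| = \frac{(b-a)^3}{12n^2}\sup_{[a,b]} |f''|.
\end{align*}
The third step follows from direct computation of the integral $\int_{x_i}^{x_{i+1}} (x_{i+1} - x)(x - x_i)\,dx$ (left as an exercise for readers). It completes the proof. ⌅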
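
The trapezoidal sum and its error bound can be tried out with a short script instead of a spreadsheet. The sketch below uses assumed choices for illustration: the function name `trapezoidal_sum`, the integrand $\sin x$ on $[0, \pi]$ (exact integral 2, and $\sup|f''| = 1$), and $n = 50$.

```python
import math

def trapezoidal_sum(f, a, b, n):
    """T_n = ((f(a) + f(b)) / 2 + f(x_1) + ... + f(x_{n-1})) * (b - a) / n."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return ((f(a) + f(b)) / 2 + interior) * h

n = 50
approx = trapezoidal_sum(math.sin, 0.0, math.pi, n)
error_bound = math.pi ** 3 / (12 * n ** 2)  # (b - a)^3 / (12 n^2) * sup|f''|, with sup|sin''| = 1
print(approx, abs(approx - 2.0), error_bound)
```

The observed error should sit below the bound, and doubling $n$ should shrink it by roughly a factor of four, in line with the $\frac{1}{n^2}$ rate.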

⌅ Exercise 4.61 From the above proof, what can you say about the integral $\int_a^b f(x)\,dx$ and $T_n$ when $f'' > 0$ on $[a, b]$?

4.6.3 Simpson’s Rule


The Simpson’s rule approximates the graph of a function by quadratic curves. Given a continuous function $f : [a, b] \to \mathbb{R}$, we consider a sequence of uniform partitions $\{P_{2n}\}$ of $[a, b]$ with $2n$ subintervals. Denote the partition points by $\{x_i\}_{i=0}^{2n}$, then on each interval $[x_{2i}, x_{2i+2}]$ where $i = 0, 1, \cdots, n-1$, we approximate $f$ by a quadratic function $Q_i(x)$ so that $Q_i(x_{2i}) = f(x_{2i})$, $Q_i(x_{2i+1}) = f(x_{2i+1})$ and $Q_i(x_{2i+2}) = f(x_{2i+2})$. One can find that such a $Q_i(x)$ can be written as:
\begin{align*}
Q_i(x) &= f(x_{2i})\cdot\frac{(x - x_{2i+1})(x - x_{2i+2})}{(x_{2i} - x_{2i+1})(x_{2i} - x_{2i+2})} + f(x_{2i+1})\cdot\frac{(x - x_{2i})(x - x_{2i+2})}{(x_{2i+1} - x_{2i})(x_{2i+1} - x_{2i+2})} \\
&\quad + f(x_{2i+2})\cdot\frac{(x - x_{2i})(x - x_{2i+1})}{(x_{2i+2} - x_{2i})(x_{2i+2} - x_{2i+1})}.
\end{align*}
One easy way to see this is to observe that
\[ \frac{(x - x_{2i+1})(x - x_{2i+2})}{(x_{2i} - x_{2i+1})(x_{2i} - x_{2i+2})} \]
equals 1 when $x = x_{2i}$, and equals 0 when $x = x_{2i+1}$ or $x_{2i+2}$. The second and third terms behave similarly.
⌅ Exercise 4.62 Given n distinct numbers x1 < x2 < · · · < xn , and a set of n numbers
y1 , · · · , yn (not necessarily distinct), find an n-th degree polynomial P (x) such that P (xi ) = yi
for any i = 1, 2, · · · , n.

$Q_i$ is simply a quadratic function, so the integral below can be found easily:
$$\int_{x_{2i}}^{x_{2i+2}} Q_i(x)\,dx = \frac{b-a}{6n}\big(f(x_{2i})+4f(x_{2i+1})+f(x_{2i+2})\big)$$
using the fact that $P_{2n}$ is a uniform partition, so that $x_{2i+2}-x_{2i+1}=x_{2i+1}-x_{2i}=\frac{b-a}{2n}$. This is left as an exercise for readers.

■ Exercise 4.63 Compute $\int_{x_{2i}}^{x_{2i+2}} Q_i(x)\,dx$.

Summing up the areas under the $Q_i$'s, we get the following approximate value of $\int_a^b f(x)\,dx$:
$$S_n := \sum_{i=0}^{n-1}\int_{x_{2i}}^{x_{2i+2}} Q_i(x)\,dx = \frac{b-a}{6n}\left(f(a)+f(b)+2\sum_{i=1}^{n-1}f(x_{2i})+4\sum_{i=0}^{n-1}f(x_{2i+1})\right).$$
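The formula for $S_n$ translates directly into code. The sketch below is only an illustration: the function name `simpson` and the two test integrands ($x^3$ on $[0,1]$ and $\sin x$ on $[0,\pi]$) are assumptions chosen for the check, not part of the notes.

```python
import math

def simpson(f, a, b, n):
    """S_n built from 2n uniform subintervals, following the formula above."""
    h = (b - a) / (2 * n)                      # width of one subinterval
    even = sum(f(a + 2 * i * h) for i in range(1, n))        # f(x_{2i}), i = 1..n-1
    odd = sum(f(a + (2 * i + 1) * h) for i in range(n))      # f(x_{2i+1}), i = 0..n-1
    return (b - a) / (6 * n) * (f(a) + f(b) + 2 * even + 4 * odd)

# A cubic is integrated exactly, even with a single pair of subintervals.
print(simpson(lambda x: x ** 3, 0.0, 1.0, 1))

# For sin on [0, pi] (exact value 2) the error drops quickly as n doubles.
for n in (2, 4, 8, 16):
    print(n, abs(2.0 - simpson(math.sin, 0.0, math.pi, n)))
```

Doubling $n$ should reduce the printed error by a factor close to 16, which is what an $O(1/n^4)$ error looks like in practice.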

The error of Simpson's rule can be shown to be of order $O(1/n^4)$ provided that $f$ is $C^4$ on $[a,b]$:

Proposition 4.26 Let $f:[a,b]\to\mathbb{R}$ be a $C^4$ function, then there exists a universal constant $C>0$ such that
$$\left|\int_a^b f(x)\,dx - S_n\right| \le \frac{(b-a)^5}{Cn^4}\sup_{[a,b]}\left|f^{(4)}\right|.$$

Outline of Proof. The key idea is to use Lagrange's remainder theorem, which asserts that for each $i=0,1,\cdots,n-1$ and any $x\in[x_{2i},b]$, there exists $h_1(x)\in[x_{2i},x]$ such that:
$$f(x)=\underbrace{f(x_{2i})+f'(x_{2i})(x-x_{2i})+\frac{f''(x_{2i})}{2}(x-x_{2i})^2+\frac{f'''(x_{2i})}{3!}(x-x_{2i})^3}_{=:P_i(x)}+\frac{f^{(4)}(h_1(x))}{4!}(x-x_{2i})^4.$$

That would give an estimate of the error
$$\left|\int_{x_{2i}}^{x_{2i}+2\Delta x} f(x)\,dx-\int_{x_{2i}}^{x_{2i}+2\Delta x} P_i(x)\,dx\right|$$
in terms of $\sup_{[a,b]}\left|f^{(4)}\right|$. Here $\Delta x=\frac{b-a}{2n}$.

Next we estimate the error
$$\left|\int_{x_{2i}}^{x_{2i}+2\Delta x} P_i(x)\,dx-\int_{x_{2i}}^{x_{2i}+2\Delta x} Q_i(x)\,dx\right|.$$

Recall that
$$\int_{x_{2i}}^{x_{2i+2}} Q_i(x)\,dx=\frac{b-a}{6n}\big(f(x_{2i})+4f(x_{2i}+\Delta x)+f(x_{2i}+2\Delta x)\big).$$

Writing
$$f(x_{2i}+\Delta x)=P_i(x_{2i}+\Delta x)+\frac{f^{(4)}(h_1(x_{2i}+\Delta x))}{4!}(\Delta x)^4$$
and similarly for $f(x_{2i}+2\Delta x)$, one can see that there are many cancellations within
$$\int_{x_{2i}}^{x_{2i}+2\Delta x} P_i(x)\,dx-\int_{x_{2i}}^{x_{2i}+2\Delta x} Q_i(x)\,dx,$$
and the only terms left involve the 4th derivatives of $f$.

By considering
$$
\begin{aligned}
&\left|\int_{x_{2i}}^{x_{2i}+2\Delta x} f(x)\,dx-\int_{x_{2i}}^{x_{2i}+2\Delta x} Q_i(x)\,dx\right| \\
&\quad\le \left|\int_{x_{2i}}^{x_{2i}+2\Delta x} f(x)\,dx-\int_{x_{2i}}^{x_{2i}+2\Delta x} P_i(x)\,dx\right|+\left|\int_{x_{2i}}^{x_{2i}+2\Delta x} P_i(x)\,dx-\int_{x_{2i}}^{x_{2i}+2\Delta x} Q_i(x)\,dx\right|,
\end{aligned}
$$
we get the error estimate on each subinterval $[x_{2i},x_{2i+2}]$ in terms of 4th derivatives of $f$. One can sum up these errors to yield the desired result.

■ Exercise 4.64 Fill in the details of the above outline of proof.

i It is interesting to note that if $f(x)$ is a cubic polynomial, then $f^{(4)}\equiv 0$, so Simpson's rule indeed gives the exact value of $\int_a^b f(x)\,dx$. Of course, practically speaking we wouldn't integrate a cubic polynomial in this way.

4.7 Improper Integrals


An integral $\int_a^b f(x)\,dx$ is called an improper integral if $a=-\infty$ or $b=+\infty$, and/or $f$ is unbounded on $[a,b]$.

4.7.1 Integral over an unbounded interval


Let's first discuss the case when the integral is defined over an unbounded interval, but $f$ is locally Riemann integrable on the domain of $f$, meaning that for any closed and bounded interval $[a,b]\subset\operatorname{domain}(f)$, the function $f$ is bounded and Riemann integrable on $[a,b]$. Here the bound can depend on $a$ and $b$. One example is $f(x)=e^x$. It is not bounded on $\mathbb{R}$, but is bounded (by $e^b$) on each closed and bounded interval $[a,b]$. Here are the definitions of such improper integrals:
$$
\begin{aligned}
\int_0^{+\infty} f(x)\,dx &:= \lim_{b\to+\infty}\int_0^b f(x)\,dx \\
\int_{-\infty}^0 f(x)\,dx &:= \lim_{a\to-\infty}\int_a^0 f(x)\,dx \\
\int_{-\infty}^{+\infty} f(x)\,dx &:= \int_0^{+\infty} f(x)\,dx+\int_{-\infty}^0 f(x)\,dx \\
&= \lim_{a\to-\infty}\int_a^0 f(x)\,dx+\lim_{b\to+\infty}\int_0^b f(x)\,dx
\end{aligned}
$$
Similar to an infinite series, we say $\int_0^{+\infty} f(x)\,dx$ converges if the limit $\lim_{b\to+\infty}\int_0^b f(x)\,dx$ exists, and say $\int_0^{+\infty} f(x)\,dx$ diverges if it does not exist (including the case when the limit is infinity).

i The value 0 in the third integral can generally be replaced by any other constant:
$$\int_{-\infty}^{+\infty}=\int_c^{+\infty}+\int_{-\infty}^c.$$

One can show that if $f$ is Riemann integrable on every closed and bounded interval $[a,b]$, then we have
$$\lim_{a\to-\infty}\int_a^c f(x)\,dx+\lim_{b\to+\infty}\int_c^b f(x)\,dx=\lim_{a\to-\infty}\int_a^0 f(x)\,dx+\lim_{b\to+\infty}\int_0^b f(x)\,dx$$
for any $c\in\mathbb{R}$.

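■ Example 4.16 Let $p\in\mathbb{R}$, then
$$\int_1^b\frac{1}{x^p}\,dx=\begin{cases}\left[\dfrac{x^{-p+1}}{-p+1}\right]_{x=1}^{x=b}=\dfrac{b^{1-p}-1}{1-p} & \text{if } p\neq 1\\[2mm] \log b & \text{if } p=1\end{cases}$$

As $b\to+\infty$, we know $b^{1-p}\to 0$ if $p>1$, and $b^{1-p}\to+\infty$ if $p<1$. Also, $\log b\to+\infty$ too. Combining all these, we conclude that
$$\int_1^{+\infty}\frac{1}{x^p}\,dx=\lim_{b\to+\infty}\int_1^b\frac{1}{x^p}\,dx=\begin{cases}\dfrac{1}{p-1} & \text{if } p>1\\[2mm] +\infty & \text{if } p\le 1\end{cases}.$$
We may say, for instance, $\int_1^{+\infty}\frac{1}{\sqrt{x}}\,dx$ diverges.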
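The dichotomy at $p=1$ can be seen by evaluating the closed form of $\int_1^b x^{-p}\,dx$ for growing $b$. This is a small sketch; the helper name `integral_1_to_b` and the sample exponents are illustrative choices only.

```python
import math

def integral_1_to_b(p, b):
    """Closed form of the integral of x^(-p) over [1, b], as in Example 4.16."""
    if p == 1:
        return math.log(b)
    return (b ** (1 - p) - 1) / (1 - p)

for p in (0.5, 1.0, 2.0):
    values = [integral_1_to_b(p, 10.0 ** k) for k in (1, 3, 6)]
    print(p, values)
```

For $p=2$ the values settle near $\frac{1}{p-1}=1$, while for $p=\frac12$ and $p=1$ they keep growing as $b$ increases.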
For an improper integral such as $\int_{-\infty}^{+\infty} f(x)\,dx$, we need both $\int_0^{+\infty} f(x)\,dx$ and $\int_{-\infty}^0 f(x)\,dx$ to converge:

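■ Example 4.17
$$
\begin{aligned}
\int_{-\infty}^{+\infty}\frac{1}{1+x^2}\,dx &= \lim_{a\to-\infty}\int_a^0\frac{1}{1+x^2}\,dx+\lim_{b\to+\infty}\int_0^b\frac{1}{1+x^2}\,dx \\
&= \lim_{a\to-\infty}\left[\tan^{-1}x\right]_a^0+\lim_{b\to+\infty}\left[\tan^{-1}x\right]_0^b \\
&= \lim_{a\to-\infty}\left(-\tan^{-1}a\right)+\lim_{b\to+\infty}\tan^{-1}b \\
&= -\left(-\frac{\pi}{2}\right)+\frac{\pi}{2}=\pi.
\end{aligned}
$$
However, for $\int_{-\infty}^{+\infty}e^{-x}\,dx$, we see that:
$$\int_{-\infty}^0 e^{-x}\,dx=\lim_{a\to-\infty}\left(-e^0+e^{-a}\right)=+\infty.$$

Therefore, $\int_{-\infty}^{+\infty}e^{-x}\,dx$ diverges even though $\int_0^{+\infty}e^{-x}\,dx$ converges.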

i Note that
$$\int_{-\infty}^{+\infty}x\,dx:=\lim_{a\to-\infty}\int_a^0 x\,dx+\lim_{b\to+\infty}\int_0^b x\,dx$$
but NOT:
$$\lim_{a\to+\infty}\int_{-a}^{a}x\,dx,$$
which is a common misconception among many non-math majors (and some math majors too).

For improper integrals involving integration by parts or by substitution, one can carry out the computation as in the proper case before taking limits.

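■ Example 4.18
$$
\begin{aligned}
\int_0^{+\infty}xe^{-x}\,dx &= \lim_{b\to+\infty}\int_0^b xe^{-x}\,dx \\
&= \lim_{b\to+\infty}\left(\int_0^b x\,d(-e^{-x})\right) \\
&= \lim_{b\to+\infty}\left(\left[-xe^{-x}\right]_0^b-\int_0^b(-e^{-x})\,dx\right) \\
&= \lim_{b\to+\infty}\left(-be^{-b}-e^{-b}+1\right)=0+0+1=1.
\end{aligned}
$$

Here we have used L'Hospital's rule to compute the limit of $be^{-b}$ as $b\to+\infty$.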

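■ Example 4.19
$$
\begin{aligned}
\int_2^{+\infty}\frac{1}{x(\log x)^2}\,dx &= \lim_{b\to+\infty}\int_2^b\frac{1}{x(\log x)^2}\,dx \\
&= \lim_{b\to+\infty}\int_{x=2}^{x=b}\frac{1}{(\log x)^2}\,d(\log x) \\
&= \lim_{b\to+\infty}\left[-\frac{1}{\log x}\right]_{x=2}^{x=b} \\
&= \lim_{b\to+\infty}\left(-\frac{1}{\log b}+\frac{1}{\log 2}\right)=\frac{1}{\log 2}.
\end{aligned}
$$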

■ Example 4.20 Let $f:[0,\infty)\to\mathbb{R}$ be the function
$$f(x)=\frac{1}{[x+1]^2}.$$

Then, $f(x)=1$ on $[0,1)$, $f(x)=\frac{1}{2^2}$ on $[1,2)$, and $f(x)=\frac{1}{3^2}$ on $[2,3)$, etc. By sketching the graph, it is natural to expect that $\int_0^{+\infty}f(x)\,dx$ should be equal to the infinite sum $\sum_{n=1}^{\infty}\frac{1}{n^2}$ (which is $\frac{\pi^2}{6}$). Let's verify that this is really the case:
Since $[a]\le a$ for any $a\in[0,\infty)$, we have
$$\int_0^a f(x)\,dx=\int_0^{[a]}f(x)\,dx+\int_{[a]}^a f(x)\,dx.$$

Note that on the interval $[0,[a]]$, the region $G^{+}_{[0,[a]]}(f)$ under the graph of $f$ is a simple region, so by definition we have
$$\int_0^{[a]}f(x)\,dx=1+\frac{1}{2^2}+\cdots+\frac{1}{[a]^2}=\sum_{n=1}^{[a]}\frac{1}{n^2}.$$

Consider the maps
$$\mathbb{R}\to\mathbb{N},\quad a\mapsto[a]\qquad\text{and}\qquad\mathbb{N}\to\mathbb{R},\quad N\mapsto\sum_{n=1}^{N}\frac{1}{n^2}.$$

By applying the composition rule to the above maps, one can then show that
$$\lim_{a\to+\infty}\int_0^{[a]}f(x)\,dx=\lim_{a\to+\infty}\sum_{n=1}^{[a]}\frac{1}{n^2}=\sum_{n=1}^{\infty}\frac{1}{n^2}.$$

For the remaining integral over $[[a],a]$, we consider:
$$\left|\int_{[a]}^a f(x)\,dx\right|\le\int_{[a]}^a|f(x)|\,dx\le(a-[a])\sup_{[[a],a]}|f|\le 1\cdot\frac{1}{([a]+1)^2}\le\frac{1}{a^2}.$$
By the squeeze theorem, we conclude that $\lim_{a\to+\infty}\int_{[a]}^a f(x)\,dx=0$. Hence, we proved
$$\int_0^{+\infty}f(x)\,dx=\lim_{a\to+\infty}\int_0^{[a]}f(x)\,dx+\lim_{a\to+\infty}\int_{[a]}^a f(x)\,dx=\sum_{n=1}^{\infty}\frac{1}{n^2}.$$

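■ Exercise 4.65 Find the value of each of the following improper integrals, or show that it diverges.
(a) $\displaystyle\int_{-\infty}^0 e^x\,dx$
(b) $\displaystyle\int_0^{+\infty}\frac{x\tan^{-1}x}{(1+x^2)^2}\,dx$
(c) $\displaystyle\int_1^{+\infty}\frac{\log x}{x^2}\,dx$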

■ Exercise 4.66 Suppose $\sum_{n=1}^{\infty}a_n$ converges to $L$. Consider the function $f:[0,\infty)\to\mathbb{R}$ where $f(x)=a_{[x]+1}$ for any $x\in[0,\infty)$. Show that
$$\int_0^{+\infty}f(x)\,dx=L.$$

Show also that if $\sum_{n=1}^{\infty}a_n$ diverges (as an infinite series), then $\int_0^{+\infty}f(x)\,dx$ also diverges (as an improper integral).

■ Exercise 4.67 Construct a function $f:[0,\infty)\to\mathbb{R}$ such that $\int_0^{+\infty}f(x)\,dx$ converges, but $\int_0^{+\infty}|f(x)|\,dx$ diverges. [Hint: Use the previous exercise.]
0

4.7.2 Integral of an unbounded function


Next we consider functions which are unbounded, and whose domain could be bounded or unbounded. For Riemann integrals, we focus on functions of the following class:
$$\mathcal{F}:=\{f:[a,b]\to[-\infty,\infty] : f \text{ is locally Riemann integrable on } [a,b]\setminus\{\text{finite set of points}\}\}.$$
Functions beyond this class are better handled using Lebesgue integrals, since the definition of Riemann integrals relies heavily on the use of intervals and sub-intervals.
For any function $f$ which tends to $\pm\infty$ when $x\to a^+$ and $x\to b^-$, but is locally Riemann integrable on $(a,b)$, we define its integral by
$$\int_a^b f(x)\,dx:=\lim_{\alpha\to a^+}\int_\alpha^c f(x)\,dx+\lim_{\beta\to b^-}\int_c^\beta f(x)\,dx$$
where $c$ is any constant in $(a,b)$. If $f$ is locally Riemann integrable on $[a,b)$ and tends to $\pm\infty$ when $x\to b^-$, then we define
$$\int_a^b f(x)\,dx:=\lim_{\beta\to b^-}\int_a^\beta f(x)\,dx,$$
and similarly for a function $f$ which is locally Riemann integrable on $(a,b]$ and tends to $\pm\infty$ when $x\to a^+$.
Now for a general function $f\in\mathcal{F}$, we first locate all points $c_1<c_2<\cdots<c_k$ in $(a,b)$ at which $f$ tends to $\pm\infty$, then we define
$$\int_a^b f(x)\,dx:=\int_a^{c_1}f(x)\,dx+\int_{c_1}^{c_2}f(x)\,dx+\cdots+\int_{c_{k-1}}^{c_k}f(x)\,dx+\int_{c_k}^b f(x)\,dx.$$
If any of the above integrals diverges (even if just one of them does), we say $\int_a^b f(x)\,dx$ diverges.

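■ Example 4.21
$$
\begin{aligned}
\int_0^1\frac{1}{\sqrt{x}}\,dx &= \lim_{a\to 0^+}\int_a^1\frac{1}{\sqrt{x}}\,dx=\lim_{a\to 0^+}\left[2\sqrt{x}\right]_a^1 \\
&= \lim_{a\to 0^+}\left(2-2\sqrt{a}\right)=2
\end{aligned}
$$
$$
\begin{aligned}
\int_0^1\frac{1}{x(1-x)}\,dx &= \int_0^1\left(\frac{1}{x}+\frac{1}{1-x}\right)dx \\
&= \lim_{a\to 0^+}\int_a^{1/2}\left(\frac{1}{x}+\frac{1}{1-x}\right)dx+\lim_{b\to 1^-}\int_{1/2}^b\left(\frac{1}{x}+\frac{1}{1-x}\right)dx \\
&= \lim_{a\to 0^+}\left[\log x-\log(1-x)\right]_a^{1/2}+\lim_{b\to 1^-}\int_{1/2}^b\left(\frac{1}{x}+\frac{1}{1-x}\right)dx.
\end{aligned}
$$

Note that
$$\lim_{a\to 0^+}\left[\log x-\log(1-x)\right]_a^{1/2}=\lim_{a\to 0^+}\big(\log(1/2)-\log a-\log(1/2)+\log(1-a)\big)$$
does not exist (since $\log a\to-\infty$). We conclude that $\int_0^1\frac{1}{x(1-x)}\,dx$ diverges. Note that there is no need to discuss the convergence of $\lim_{b\to 1^-}\int_{1/2}^b\left(\frac{1}{x}+\frac{1}{1-x}\right)dx$: whether or not it converges, the integral $\int_0^1\frac{1}{x(1-x)}\,dx$ would still diverge.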

■ Example 4.22 Consider the improper integral
$$\int_0^{+\infty}\frac{1}{x(x-1)(x-2)}\,dx.$$

The integral is improper in two ways: the integrand is unbounded near 0, 1 and 2; and the interval $[0,\infty)$ of integration is unbounded. Therefore, we should first express:
$$\int_0^{+\infty}\frac{dx}{x(x-1)(x-2)}=\int_0^1\frac{dx}{x(x-1)(x-2)}+\int_1^2\frac{dx}{x(x-1)(x-2)}+\int_2^{+\infty}\frac{dx}{x(x-1)(x-2)}.$$

In each of the three integrals on the RHS, both the lower and upper end-points are "bad" points, so we need to further consider:
$$
\begin{aligned}
\int_0^1\frac{dx}{x(x-1)(x-2)} &= \lim_{a\to 0^+}\int_a^{1/2}\frac{dx}{x(x-1)(x-2)}+\lim_{b\to 1^-}\int_{1/2}^b\frac{dx}{x(x-1)(x-2)} \\
\int_1^2\frac{dx}{x(x-1)(x-2)} &= \lim_{a\to 1^+}\int_a^{3/2}\frac{dx}{x(x-1)(x-2)}+\lim_{b\to 2^-}\int_{3/2}^b\frac{dx}{x(x-1)(x-2)} \\
\int_2^{+\infty}\frac{dx}{x(x-1)(x-2)} &= \lim_{a\to 2^+}\int_a^{3}\frac{dx}{x(x-1)(x-2)}+\lim_{b\to+\infty}\int_{3}^b\frac{dx}{x(x-1)(x-2)}
\end{aligned}
$$
If any one of the above six integrals diverges, it suffices to claim that $\int_0^{+\infty}\frac{1}{x(x-1)(x-2)}\,dx$ diverges. In fact by observing that
$$\frac{1}{x(x-1)(x-2)}=\frac{1}{2x}-\frac{1}{x-1}+\frac{1}{2(x-2)},$$
one can already conclude that
$$\lim_{a\to 0^+}\int_a^{1/2}\frac{1}{x(x-1)(x-2)}\,dx$$
does not exist as $\int_0^{1/2}\frac{1}{2x}\,dx$ diverges, and hence $\int_0^{+\infty}\frac{1}{x(x-1)(x-2)}\,dx$ diverges.

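■ Example 4.23 It is good to keep in mind that
$$\int_0^1\frac{1}{x^p}\,dx=\lim_{a\to 0^+}\int_a^1\frac{1}{x^p}\,dx=\begin{cases}\displaystyle\lim_{a\to 0^+}\left(\frac{1}{1-p}-\frac{a^{1-p}}{1-p}\right)=\frac{1}{1-p} & \text{if } p<1\\[3mm] \displaystyle\lim_{a\to 0^+}\left[\log x\right]_a^1=+\infty & \text{if } p=1\\[3mm] \displaystyle\lim_{a\to 0^+}\left(\frac{1}{1-p}-\frac{a^{1-p}}{1-p}\right)=+\infty & \text{if } p>1\end{cases}$$

This is in contrast to $\int_1^{+\infty}\frac{1}{x^p}\,dx$, which converges when $p>1$.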

■ Example 4.24 The function $\frac{1}{x^2}$ is unbounded as $x\to 0$, and is locally Riemann integrable on $[-1,1]\setminus\{0\}$, so:
$$\int_{-1}^1\frac{1}{x^2}\,dx=\int_{-1}^0\frac{1}{x^2}\,dx+\int_0^1\frac{1}{x^2}\,dx.$$
By the above example, we know that $\int_0^1\frac{1}{x^2}\,dx$ diverges, so $\int_{-1}^1\frac{1}{x^2}\,dx$ diverges too. It is improper to compute the improper integral this way:
$$\int_{-1}^1\frac{1}{x^2}\,dx=\left[-\frac{1}{x}\right]_{-1}^1=-2\quad\text{(WRONG!)}$$

Furthermore, it is improper to compute the integral like this:
$$\int_{-1}^1\frac{1}{x}\,dx=\lim_{a\to 0^+}\left(\int_{-1}^{-a}\frac{1}{x}\,dx+\int_a^1\frac{1}{x}\,dx\right)\quad\text{(WRONG!)},$$
but instead we should follow the definition:
$$\int_{-1}^1\frac{1}{x}\,dx=\int_{-1}^0\frac{1}{x}\,dx+\int_0^1\frac{1}{x}\,dx=\lim_{b\to 0^-}\int_{-1}^b\frac{1}{x}\,dx+\lim_{a\to 0^+}\int_a^1\frac{1}{x}\,dx,$$
which diverges by the previous example.

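■ Exercise 4.68 Determine whether the following improper integrals converge. If so, find their values.
(a) $\displaystyle\int_0^{+\infty}\frac{\log x}{x^2}\,dx$
(b) $\displaystyle\int_0^{1}\frac{\log x}{\sqrt{x}}\,dx$
(c) $\displaystyle\int_{-1}^{1}\frac{1}{\sqrt{1-x^2}}\,dx$
(d) $\displaystyle\int_{-2}^{2}\frac{1}{1-x^2}\,dx$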

4.7.3 Comparison Tests


Very often, we are not concerned about the exact value of an improper integral, but we care about whether it converges or not. In number theory and complex analysis, we study an important function known as the Gamma function:
$$\Gamma(x):=\int_0^{+\infty}t^{x-1}e^{-t}\,dt.$$

Although it can be shown (see Exercise 4.70) that $\Gamma(n)=(n-1)!$ for any $n\in\mathbb{N}$, other values of $\Gamma(x)$ cannot be easily found. However, we can prove that the integral converges when $x>0$ by comparing it with another computable integral (note that $t^{x-1}e^{-t}\le e^{-t/2}$ when $t$ is sufficiently large).
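As a numerical illustration of the definition, the defining integral can be truncated to a finite interval and approximated by Simpson's rule. This is a rough sketch under stated assumptions: the name `gamma_truncated`, the cut-off `upper=60.0` and the number of subintervals are arbitrary choices, and the sketch is restricted to $x\ge 1$ so that the integrand is bounded near $t=0$.

```python
import math

def gamma_truncated(x, upper=60.0, n=3000):
    """Simpson approximation of the integral of t^(x-1) e^(-t) over [0, upper].

    Assumes x >= 1, so the integrand has no singularity at t = 0.
    """
    h = upper / (2 * n)

    def g(t):
        return t ** (x - 1) * math.exp(-t)

    odd = sum(g((2 * i + 1) * h) for i in range(n))
    even = sum(g(2 * i * h) for i in range(1, n))
    return h / 3 * (g(0.0) + g(upper) + 4 * odd + 2 * even)

for k in range(1, 7):
    # Compare with (k-1)! and with the library value math.gamma(k).
    print(k, gamma_truncated(k), math.factorial(k - 1), math.gamma(k))
```

The truncation is harmless here because the tail of the integrand beyond the cut-off is dominated by $e^{-t/2}$, which is exactly the comparison hinted at in the text.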
Proposition 4.27 (Comparison Test for Improper Integrals). Suppose $f,g$ are locally Riemann integrable on $[a,+\infty)$ and $0\le f(x)\le g(x)$ on $[N,+\infty)$ for some sufficiently large $N\ge a$, then
• If $\int_a^{+\infty}g(x)\,dx$ converges, then $\int_a^{+\infty}f(x)\,dx$ converges.
• If $\int_a^{+\infty}f(x)\,dx$ diverges, then $\int_a^{+\infty}g(x)\,dx$ diverges.
Similar comparison tests hold for other types of improper integrals.

Proof. The second result is the contrapositive of the first one, so it suffices to prove the first one only. Define
$$F(x):=\int_N^x f(t)\,dt\quad\text{and}\quad G(x):=\int_N^x g(t)\,dt.$$

By $0\le f(x)\le g(x)$ on $[N,+\infty)$, we know that $F(x)$ and $G(x)$ are both monotonically increasing on $[N,+\infty)$, and $F(x)\le G(x)$ on $[N,+\infty)$. Note that $\int_a^{+\infty}g(t)\,dt$ being convergent means the limit $\lim_{x\to+\infty}\int_a^x g(t)\,dt$ exists. Since $G(x)$ differs from $\int_a^x g(t)\,dt$ by the constant $\int_a^N g(t)\,dt$, the limit $\lim_{x\to+\infty}G(x)$ exists too, so $G(x)$ is bounded on $[N,+\infty)$.
This shows $F(x)$ is also bounded on $[N,+\infty)$. Combining with the fact that $F(x)$ is monotone, we conclude that $\lim_{x\to+\infty}F(x)$ exists, and equivalently $\int_N^{+\infty}f(x)\,dx$ (and hence $\int_a^{+\infty}f(x)\,dx$) converges. ■

■ Example 4.25 On $[2,\infty)$, we have
$$0\le\frac{1}{x^{2/3}}\le\frac{1}{\sqrt[3]{x^2-1}}.$$
Note that $\int_2^{+\infty}\frac{1}{x^{2/3}}\,dx$ diverges, so $\int_2^{+\infty}\frac{1}{\sqrt[3]{x^2-1}}\,dx$ diverges too. On the other hand, on
(0, 12 ],
p
3
1 1 2
0 p  q = .
3
x2 1 3 1 2 x2/3
2x

Since 2
< 1, we know that
3
1/2
p
3
2
ˆ
dx
0 x2/3
1/2
1
ˆ
converges, so dx converges too.
0 x2/3

i The condition $0\le f(x)$ is necessary in Proposition 4.27. Here is a counterexample:
$$-\frac{1}{x}\le\frac{1}{x^2}\quad\text{on } [1,\infty)$$
but $\int_1^{+\infty}\left(-\frac{1}{x}\right)dx$ diverges while $\int_1^{+\infty}\frac{1}{x^2}\,dx$ converges.

Sometimes it may be tricky to handle non-essential terms (such as the $-1$ in $\sqrt[3]{x^2-1}$) when using the comparison test. The limit comparison test below lets us focus on the most important term when doing comparisons:
Proposition 4.28 (Limit Comparison Test for Improper Integrals). Suppose $f,g$ are locally Riemann integrable on $[a,+\infty)$, and $f(x),g(x)\ge 0$ on $[a,+\infty)$. Consider the limit
$$\lim_{x\to+\infty}\frac{f(x)}{g(x)}=:L.$$

Then, we have
• If $L\in(0,+\infty)$, then $\int_a^{+\infty}f(x)\,dx$ converges if and only if $\int_a^{+\infty}g(x)\,dx$ converges.
• If $L=0$, then $\int_a^{+\infty}g(x)\,dx$ converges implies $\int_a^{+\infty}f(x)\,dx$ converges.
• If $L=+\infty$, then $\int_a^{+\infty}f(x)\,dx$ converges implies $\int_a^{+\infty}g(x)\,dx$ converges.
Similar results hold for other types of improper integrals.

Proof. The proof simply follows from Proposition 4.27 and the order rule. When $0<L<+\infty$, then for sufficiently large $x$, we have
$$\frac{L}{2}\le\frac{f(x)}{g(x)}\le 2L\implies\frac{L}{2}\,g(x)\le f(x)\le 2L\,g(x).$$

If $L=0$, then $\lim_{x\to+\infty}\frac{f(x)}{g(x)}=0$ shows $\frac{f(x)}{g(x)}\le 1$ for sufficiently large $x$, and hence $f(x)\le g(x)$. Applying Proposition 4.27 yields the desired result.
The case $L=+\infty$ follows from swapping $f$ and $g$ in the $L=0$ case. ■

■ Example 4.26 Consider
$$\int_3^{+\infty}\frac{|\sin x|}{(x-1)(x-2)}\,dx.$$
First note that 1 and 2 are outside the interval $[3,+\infty)$, so they are not considered as "bad points". On $[3,+\infty)$ we have:
$$0\le\frac{|\sin x|}{(x-1)(x-2)}\le\frac{1}{(x-1)(x-2)}.$$
We next argue that $\int_3^{+\infty}\frac{1}{(x-1)(x-2)}\,dx$ converges. Considering that
$$\lim_{x\to+\infty}\frac{\frac{1}{(x-1)(x-2)}}{\frac{1}{x^2}}=1$$
and $\int_3^{+\infty}\frac{1}{x^2}\,dx$ converges, the limit comparison test shows $\int_3^{+\infty}\frac{1}{(x-1)(x-2)}\,dx$ converges, and by the comparison test,
$$\int_3^{+\infty}\frac{|\sin x|}{(x-1)(x-2)}\,dx$$
converges too.

■ Example 4.27 Consider
$$\int_0^5\frac{1}{\sqrt[3]{5x+2x^4}}\,dx.$$
Note that 0 is the "bad point" for this improper integral. As $x\to 0$, the term $5x$ is much larger than $2x^4$, so we expect $\sqrt[3]{5x+2x^4}$ to behave like a constant multiple of $\sqrt[3]{x}$ as $x\to 0$. This suggests that we should compare the integrand with $\frac{1}{\sqrt[3]{x}}$:
$$\lim_{x\to 0^+}\frac{\frac{1}{\sqrt[3]{5x+2x^4}}}{\frac{1}{\sqrt[3]{x}}}=\lim_{x\to 0^+}\sqrt[3]{\frac{x}{5x+2x^4}}=\lim_{x\to 0^+}\sqrt[3]{\frac{1}{5+2x^3}}=\frac{1}{\sqrt[3]{5}}\in(0,\infty).$$

Recall that $\int_0^5\frac{1}{\sqrt[3]{x}}\,dx$ converges, so by the limit comparison test, we conclude that $\int_0^5\frac{1}{\sqrt[3]{5x+2x^4}}\,dx$ converges too.

■ Example 4.28 Consider the integral:
$$\int_{-\infty}^{+\infty}e^{-x^2}\,dx.$$
In order to show that it converges, we need to show both $\int_{-\infty}^0 e^{-x^2}\,dx$ and $\int_0^{+\infty}e^{-x^2}\,dx$ converge.
On $[1,+\infty)$, we have $x^2\ge x$ and so $0\le e^{-x^2}\le e^{-x}$. Since $\int_0^{+\infty}e^{-x}\,dx$ converges (to 1), by the comparison test, we know that $\int_0^{+\infty}e^{-x^2}\,dx$ converges.
For $\int_{-\infty}^0 e^{-x^2}\,dx$, one can use the change of variables $y=-x$ to show that in fact $\int_{-\infty}^0 e^{-x^2}\,dx=\int_0^{+\infty}e^{-x^2}\,dx$. Alternatively, one can use the comparison $e^{-x^2}\le e^{x}$ on $(-\infty,-1]$.
This shows that
$$\int_{-\infty}^{+\infty}e^{-x^2}\,dx=\int_{-\infty}^0 e^{-x^2}\,dx+\int_0^{+\infty}e^{-x^2}\,dx$$
converges.

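■ Exercise 4.69 Determine whether the following improper integrals converge or not.
(a) $\displaystyle\int_2^{+\infty}\frac{x\tan^{-1}x}{\sqrt{x^5-2x^2+1}}\,dx$
(b) $\displaystyle\int_0^{1}\frac{x\tan^{-1}x}{\sqrt{x^5-2x^2+1}}\,dx$
(c) $\displaystyle\int_0^{1}\frac{1}{\sqrt{x+\sqrt{x+\sqrt{x}}}}\,dx$
(d) $\displaystyle\int_1^{+\infty}\frac{1}{\sqrt{x+\sqrt{x+\sqrt{x}}}}\,dx$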

■ Exercise 4.70 (Gamma Function). Let
$$\Gamma(x):=\int_0^{+\infty}t^{x-1}e^{-t}\,dt.$$

(a) Show that for any $x>0$, the defining integral of $\Gamma(x)$ converges.
(b) Show that $\Gamma(x+1)=x\,\Gamma(x)$ for any $x>0$, and deduce that $\Gamma(n)=(n-1)!$ for any $n\in\mathbb{N}$.

■ Exercise 4.71 (Source: MATH1024 Spring 2018 Midterm). Consider the function
$$f(x)=\frac{1}{(x-1)(x-2)\cdots(x-1024)}.$$
Determine all real numbers $c$ such that $\int_c^{+\infty}f(x)\,dx$ converges. Explain your answer.

4.7.4 Absolute convergence versus conditional convergence


An improper integral such as $\int_a^{+\infty}f(x)\,dx$ is said to converge absolutely if $\int_a^{+\infty}|f(x)|\,dx$ converges. By the proposition below, one can then claim that $\int_a^{+\infty}f(x)\,dx$ converges too:

Proposition 4.29 (Absolute Convergence Test for Improper Integrals). Suppose $f$ is locally Riemann integrable on $[a,+\infty)$, and that $\int_a^{+\infty}|f(x)|\,dx$ converges, then $\int_a^{+\infty}f(x)\,dx$ converges too. Similar results hold for other types of improper integrals.

Proof. The key observation is $0\le f(x)+|f(x)|\le 2|f(x)|$. By the comparison test, we get that
$$\int_a^{+\infty}\big(f(x)+|f(x)|\big)\,dx$$
converges, and then by the convergence of $\int_a^{+\infty}|f(x)|\,dx$, we have that
$$\int_a^{+\infty}f(x)\,dx=\int_a^{+\infty}\big(f(x)+|f(x)|-|f(x)|\big)\,dx=\int_a^{+\infty}\big(f(x)+|f(x)|\big)\,dx-\int_a^{+\infty}|f(x)|\,dx$$
converges too. ■

■ Example 4.29 From Example 4.26, we know that
$$\int_3^{+\infty}\frac{|\sin x|}{(x-1)(x-2)}\,dx$$
converges, so by Proposition 4.29, we conclude that
$$\int_3^{+\infty}\frac{\sin x}{(x-1)(x-2)}\,dx$$
converges too.

However, the converse of Proposition 4.29 is not true. It is possible that
$$\int_a^{+\infty}f(x)\,dx\ \text{converges but}\ \int_a^{+\infty}|f(x)|\,dx\ \text{diverges.}$$
In such a case, we say $\int_a^{+\infty}f(x)\,dx$ converges conditionally.
Here is a counterexample:

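■ Example 4.30 Consider the improper integral:
$$\int_1^{+\infty}\frac{\sin x}{x}\,dx.$$

For any $b>1$, we have
$$
\begin{aligned}
\int_1^b\frac{\sin x}{x}\,dx &= -\int_1^b\frac{1}{x}\,d(\cos x)=\left[-\frac{\cos x}{x}\right]_1^b-\int_1^b\frac{\cos x}{x^2}\,dx \\
&= \cos 1-\frac{\cos b}{b}-\int_1^b\frac{\cos x}{x^2}\,dx.
\end{aligned}
$$

As $b\to+\infty$, we have $\frac{\cos b}{b}\to 0$. For the integral of $\frac{\cos x}{x^2}$, we consider that
$$0\le\left|\frac{\cos x}{x^2}\right|\le\frac{1}{x^2}.$$
Since $\int_1^{+\infty}\frac{1}{x^2}\,dx$ converges, by the comparison test we know $\int_1^{+\infty}\left|\frac{\cos x}{x^2}\right|dx$ converges. Then by the absolute convergence test,
$$\int_1^{+\infty}\frac{\cos x}{x^2}\,dx$$
converges as well. This shows
$$\lim_{b\to+\infty}\int_1^b\frac{\sin x}{x}\,dx$$
exists, and so $\int_1^{+\infty}\frac{\sin x}{x}\,dx$ converges.

However, one can show
$$\int_1^{+\infty}\frac{|\sin x|}{x}\,dx$$
diverges. It suffices to find a sequence $\{b_n\}$ such that $b_n\to+\infty$ and
$$\lim_{n\to\infty}\int_1^{b_n}\frac{|\sin x|}{x}\,dx$$
diverges. We choose $b_n=n\pi$ (noting that $|\sin x|$ has period $\pi$). Then,
$$
\begin{aligned}
\int_1^{n\pi}\frac{|\sin x|}{x}\,dx &\ge \sum_{k=1}^{n-1}\int_{k\pi}^{(k+1)\pi}\frac{|\sin x|}{x}\,dx \\
&\ge \sum_{k=1}^{n-1}\int_{k\pi}^{(k+1)\pi}\frac{|\sin x|}{(k+1)\pi}\,dx.
\end{aligned}
$$

By the periodicity of $|\sin x|$, one can easily see that
$$\int_{k\pi}^{(k+1)\pi}|\sin x|\,dx=\int_0^{\pi}|\sin x|\,dx,$$
so it is independent of $k$. Finally, we get
$$\int_1^{n\pi}\frac{|\sin x|}{x}\,dx\ge\frac{1}{\pi}\int_0^{\pi}|\sin x|\,dx\cdot\sum_{k=1}^{n-1}\frac{1}{k+1}.$$

As $n\to+\infty$, $\sum_{k=1}^{n-1}\frac{1}{k+1}$ diverges to $+\infty$, hence
$$\lim_{n\to+\infty}\int_1^{n\pi}\frac{|\sin x|}{x}\,dx=+\infty.$$

This shows
$$\int_1^{+\infty}\frac{|\sin x|}{x}\,dx=\lim_{b\to+\infty}\int_1^b\frac{|\sin x|}{x}\,dx$$
diverges.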
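The contrast between the two integrals in this example can be observed numerically. The sketch below uses a plain trapezoidal sum; the helper names (`trapezoid`, `signed`, `absolute`), the step density and the sample end-points $b=n\pi$ are assumptions made for illustration.

```python
import math

def trapezoid(f, a, b, steps_per_unit=200):
    """Composite trapezoidal sum on [a, b] with a fixed density of sample points."""
    n = max(1, math.ceil((b - a) * steps_per_unit))
    dx = (b - a) / n
    inner = sum(f(a + i * dx) for i in range(1, n))
    return (0.5 * (f(a) + f(b)) + inner) * dx

def signed(x):
    return math.sin(x) / x

def absolute(x):
    return abs(math.sin(x)) / x

results = {}
for n in (10, 100, 400):
    b = n * math.pi
    # Pair: (integral of sin x / x, integral of |sin x| / x) over [1, b].
    results[n] = (trapezoid(signed, 1.0, b), trapezoid(absolute, 1.0, b))
    print(n, results[n])
```

The first entry of each pair should stabilise as $b$ grows, while the second keeps increasing, at roughly the logarithmic rate suggested by the harmonic-sum lower bound above.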

■ Exercise 4.72 Suppose $f:\mathbb{R}\to\mathbb{R}$ is a continuous periodic function with period $T>0$ (i.e. $f(x+T)=f(x)$ for any $x\in\mathbb{R}$) and that $f(x)\not\equiv 0$. Show that $\int_1^{+\infty}\frac{|f(x)|}{x}\,dx$ diverges.

The method used to determine the convergence of the example $\int_1^{+\infty}\frac{\sin x}{x}\,dx$ can be further generalized. Consider the product $f(x)g(x)$ where $f$ is continuous, and $g$ is $C^1$ on $[a,+\infty)$. Let
$$F(x):=\int_a^x f(t)\,dt.$$

Then, by integration by parts, we have
$$
\begin{aligned}
\int_a^{+\infty}f(x)g(x)\,dx &= \int_a^{+\infty}g(x)\,d(F(x)) \\
&= \lim_{b\to+\infty}\left(\left[F(x)g(x)\right]_a^b-\int_a^b F(x)g'(x)\,dx\right).
\end{aligned}
$$
If both $\lim_{b\to+\infty}F(b)g(b)$ and $\int_a^{+\infty}F(x)g'(x)\,dx$ converge, then $\int_a^{+\infty}f(x)g(x)\,dx$ converges too. These would happen if, for instance, one of the following conditions is met:
• $F(x)$ is bounded and $g'(x)\le 0$ on $[a,+\infty)$, and $\lim_{x\to+\infty}g(x)=0$ (known as the Dirichlet Test);
• $\lim_{x\to+\infty}F(x)$ exists, $g'(x)\le 0$ and $g$ is bounded on $[a,+\infty)$ (known as the Abel Test).

■ Exercise 4.73 Verify that any of the above conditions implies that both $\lim_{b\to+\infty}F(b)g(b)$ and $\int_a^{+\infty}F(x)g'(x)\,dx$ converge.

■ Example 4.31 Suppose $f:\mathbb{R}\to\mathbb{R}$ is a continuous periodic function with period $T>0$, such that
$$\int_0^T f(x)\,dx=0.$$
Show that $\int_1^{+\infty}\frac{f(x)}{x}\,dx$ converges.

■ Solution For any $x>1$, let $k$ be the non-negative integer such that $1+kT\le x<1+(k+1)T$. Then,
$$F(x):=\int_1^x f(t)\,dt=\int_1^{1+T}f(t)\,dt+\int_{1+T}^{1+2T}f(t)\,dt+\cdots+\int_{1+(k-1)T}^{1+kT}f(t)\,dt+\int_{1+kT}^x f(t)\,dt.$$

By the periodicity of $f$, we know
$$\int_1^{1+T}f(t)\,dt=\int_{1+T}^{1+2T}f(t)\,dt=\cdots=\int_{1+(k-1)T}^{1+kT}f(t)\,dt=\int_0^T f(t)\,dt=0,$$
so $F(x)=\int_{1+kT}^x f(t)\,dt$. Next we claim that $F(x)$ is bounded:
$$|F(x)|\le\int_{1+kT}^x|f(t)|\,dt\le\int_{1+kT}^{1+(k+1)T}|f(t)|\,dt=\int_0^T|f(t)|\,dt.$$

Here the last equality follows from the periodicity of the function $|f(t)|$. This shows $F(x)$ is bounded. Clearly $\frac{d}{dx}\frac{1}{x}=-\frac{1}{x^2}\le 0$ on $[1,\infty)$, and $\frac{1}{x}\to 0$ as $x\to+\infty$. Letting $g(x)=\frac{1}{x}$, both $F$ and $g$ fulfill the conditions of the Dirichlet test (note that $F$ is only known to be bounded, so the Abel test does not apply directly). Therefore, we conclude that
$$\int_1^{+\infty}\frac{f(x)}{x}\,dx$$
converges.

■ Exercise 4.74 Show that the following improper integrals converge:
$$\int_1^{+\infty}\frac{\sin x\,\tan^{-1}x}{x}\,dx\quad\text{and}\quad\int_1^{+\infty}\sin(x^2)\,dx.$$

■ Exercise 4.75 Show that the following improper integral converges when $s>0$:
$$\int_1^{\infty}\frac{x-[x]}{x^{s+1}}\,dx.$$

Show also that when $s>1$:
$$\zeta(s):=\sum_{n=1}^{\infty}\frac{1}{n^s}=\frac{s}{s-1}-s\int_1^{\infty}\frac{x-[x]}{x^{s+1}}\,dx.$$

[Remark 1: Although $x-[x]$ is discontinuous at integer points, it is a bounded function. There is no need to break down the above integral into $\int_1^2+\int_2^3+\int_3^4+\cdots$ when showing the convergence, but it is necessary to do so when computing it.]
[Remark 2: Even though $\sum_{n=1}^{\infty}\frac{1}{n^s}$ diverges when $s\in(0,1)$, the RHS
$$\frac{s}{s-1}-s\int_1^{\infty}\frac{x-[x]}{x^{s+1}}\,dx$$
is well-defined when $s\in(0,1)$. We can extend $\zeta(s)$ to a larger domain $(0,+\infty)\setminus\{1\}$ by giving it a more general definition as:
$$\hat{\zeta}(s):=\frac{s}{s-1}-s\int_1^{\infty}\frac{x-[x]}{x^{s+1}}\,dx$$
when $s\in(0,\infty)\setminus\{1\}$. Then, we would have $\hat{\zeta}(s)=\zeta(s)$ when $s>1$. In complex analysis, one can show there is at most one way of extending a holomorphic function (beyond the scope of this course), so one usually simply writes $\zeta$ instead of $\hat{\zeta}$ for the extension. In fact, one can further extend $\zeta$ to the domain $\mathbb{C}\setminus\{1\}$ and show that the extended $\zeta$ takes the value $-\frac{1}{12}$ when $s=-1$. However, we have $\zeta(s)=\sum_{n=1}^{\infty}\frac{1}{n^s}$ only when $\operatorname{Re}(s)>1$, and $\zeta(s)$ is defined as "something else" when $\operatorname{Re}(s)\le 1$. It is a complete misconception to regard $\sum_{n=1}^{\infty}\frac{1}{n^{-1}}=1+2+3+4+\cdots=-\frac{1}{12}$ as a genuine equality.]

■ Exercise 4.76 (a) Show that when $s>1$, the improper integral $\int_0^{+\infty}\frac{t^{s-1}}{e^t-1}\,dt$ converges.
(b) By a suitable substitution, show that for any $n\in\mathbb{N}$ and $s>0$ we have
$$\frac{1}{n^s}\,\Gamma(s)=\int_0^{+\infty}t^{s-1}e^{-nt}\,dt.$$

Here $\Gamma$ is the Gamma function defined in Exercise 4.70.

(c) Finally, show that
$$\zeta(s)\,\Gamma(s)=\int_0^{+\infty}\frac{t^{s-1}}{e^t-1}\,dt$$
for any $s>1$, where $\zeta(s)$ is the zeta function.
[Remark 1: You may assume that it is legitimate to swap $\sum_{n=1}^{\infty}$ with $\int_0^{+\infty}$. This is not always true in general, but it can be justified in our example using results from MATH 3033/3043.]
[Remark 2: Hence we can take $\frac{1}{\Gamma(s)}\int_0^{+\infty}\frac{t^{s-1}}{e^t-1}\,dt$ as a new definition of $\zeta(s)$. With some further work (including the extension of $\Gamma(s)$), one can show that such an expression gives an extension of $\zeta(s)$ to the domain $\mathbb{C}\setminus\{1\}$.]

4.8 Parametric Curves


In $\mathbb{R}^2$ (similarly for $\mathbb{R}^3$), we often represent a curve in parametric form, such as the unit circle:

$x=\cos t$
$y=\sin t$

where $t\in[0,2\pi]$. One may denote these parametric equations in vector form:

$\mathbf{r}(t)=(\cos t,\sin t)=(\cos t)\mathbf{i}+(\sin t)\mathbf{j},\quad t\in[0,2\pi].$

The graph $y=f(x)$ of a single variable function $f:[a,b]\to\mathbb{R}$ can be represented in parametric form by:
$\mathbf{r}(t)=(t,f(t)),\quad t\in[a,b].$
It is helpful to think of $t$ as the time variable, and $\mathbf{r}(t)$ as the position vector of a moving particle at time $t$. Then the curve represented by $\mathbf{r}(t)$ is the path of the particle.

4.8.1 Geometric meaning of $\mathbf{r}'(t)$


In physics, $\mathbf{r}'(t)$ is defined to be the velocity at time $t$, as it is the rate of change of the position vector, and $\mathbf{r}''(t)$ is the acceleration. In mathematics, one can show that $\mathbf{r}'(t)$ (if non-zero) is in fact a tangent vector to the curve $\mathbf{r}(t)$.
To prove this, we first show that it is true when the curve is the graph of a $C^1$ function $f$, i.e. the special case $\mathbf{r}(t)=(t,f(t))$. Then, we have

$\mathbf{r}'(t)=(1,f'(t))=\mathbf{i}+f'(t)\mathbf{j}.$

This vector has slope $f'(t)$, which is the slope of the tangent to the graph $y=f(x)$ at the point $(t,f(t))$. Hence, $\mathbf{r}'(t)$ is a tangent vector to the curve at the point $(t,f(t))$.
Now consider the general case $\mathbf{r}(t)=(f(t),g(t))$ where $f,g$ are $C^1$ functions. If $\mathbf{r}'(t_0)\neq\mathbf{0}$ at a particular time $t_0$, then $f'(t_0)\neq 0$ or $g'(t_0)\neq 0$. WLOG we assume $f'(t_0)\neq 0$. Then, $f$ is strictly monotone near $t_0$, and hence it is locally invertible. From MATH 1023, we learned that the local inverse $f^{-1}:(\tau_0-\varepsilon,\tau_0+\varepsilon)\to(t_0-\delta,t_0+\delta)$ is also $C^1$. Here $f(t_0)=\tau_0$. Next we consider a "new" curve:
$$\gamma(\tau):=\mathbf{r}(f^{-1}(\tau))=\big(\tau,\,g\circ f^{-1}(\tau)\big),\quad\tau\in(\tau_0-\varepsilon,\tau_0+\varepsilon).$$
We put "new" in quotation marks because it is not really a new curve, but the same curve as $\mathbf{r}(t)$ near $t=t_0$ with the particle travelling at a different speed. Now the curve $\gamma(\tau)$ is simply the graph of $y=g\circ f^{-1}(x)$. From the previous paragraph, we know that $\gamma'(\tau_0)$ is a tangent vector to the curve at the point $\gamma(\tau_0)=(f(t_0),g(t_0))$. However, by the chain rule, we also know that
$$\gamma'(\tau)=\frac{d}{d\tau}\mathbf{r}(f^{-1}(\tau))=\mathbf{r}'(f^{-1}(\tau))\,\frac{d}{d\tau}f^{-1}(\tau)\implies\gamma'(\tau_0)=\underbrace{\left.\frac{d}{d\tau}f^{-1}(\tau)\right|_{\tau=\tau_0}}_{\text{scalar}}\mathbf{r}'(t_0).$$

Therefore, $\gamma'(\tau_0)$ and $\mathbf{r}'(t_0)$ are parallel to each other, and so $\mathbf{r}'(t_0)$ is also a tangent vector to the curve at $(f(t_0),g(t_0))$. The case when $g'(t_0)\neq 0$ is similar: just regard $x$ as a function of $y$ near the point $(f(t_0),g(t_0))$.

i The geometric meaning of $\mathbf{r}''(t)$ is related to the curvature. You may learn more about it in MATH 2023 or 4223.

4.8.2 Rectifiable Curves


Next we discuss what is meant by the length of a curve. Given a curve $\mathbf{r}(t):[a,b]\to\mathbb{R}^2$, we first attempt to approximate it by line segments. That is, take a partition $P=\{a=t_0<t_1<\cdots<t_n=b\}$ and consider the sum:
$$l_P:=\sum_{i=1}^{n}|\mathbf{r}(t_i)-\mathbf{r}(t_{i-1})|.$$

It is the total length of the line segments joining the points $\mathbf{r}(t_0),\mathbf{r}(t_1),\cdots,\mathbf{r}(t_n)$. As we take more and more refined partitions $P$, we expect $l_P$ to get larger by the triangle inequality. Therefore, the length of the curve is naturally defined as the supremum of $l_P$ among all partitions $P$.
Definition 4.7 (Rectifiable Curve and Arc Length). Let $\mathbf{r}(t):[a,b]\to\mathbb{R}^2$ be a curve in $\mathbb{R}^2$. We call $\mathbf{r}(t)$ a rectifiable curve if $l_P\le C$ for some constant $C\in(0,\infty)$ independent of partitions $P$ of $[a,b]$. In such a case, we define the arc length of $\{\mathbf{r}(t)\}_{t\in[a,b]}$ to be:
$$\sup_P l_P=\sup\left\{\sum_{i=1}^{n}|\mathbf{r}(t_i)-\mathbf{r}(t_{i-1})| : a=t_0<t_1<\cdots<t_n=b\right\}.$$

i It does not seem easy to check whether a curve is rectifiable, nor to compute the arc length. Fortunately, we can later show that any $C^1$ curve (i.e. $\mathbf{r}(t)=(f(t),g(t))$ where $f,g$ are $C^1$) is rectifiable and its arc length is simply given by the integral $\int_a^b|\mathbf{r}'(t)|\,dt$.

Any straight-line segment $\mathbf{r}(t)=(1-t)\mathbf{r}_0+t\mathbf{r}_1$, $t\in[0,1]$, joining the points with position vectors $\mathbf{r}_0$ and $\mathbf{r}_1$ is rectifiable. To prove this, we compute that for any partition $P:0=t_0<t_1<\cdots<t_n=1$, we have:
$$\sum_{i=1}^{n}|\mathbf{r}(t_i)-\mathbf{r}(t_{i-1})|=\sum_{i=1}^{n}|(t_i-t_{i-1})(\mathbf{r}_1-\mathbf{r}_0)|=\sum_{i=1}^{n}(t_i-t_{i-1})\,|\mathbf{r}_1-\mathbf{r}_0|=|\mathbf{r}_1-\mathbf{r}_0|.$$

In particular, $l_P=|\mathbf{r}_1-\mathbf{r}_0|$ for any partition $P$ of $[0,1]$, hence it is bounded above. This shows the straight-line segment is rectifiable, and its length is given by:
$$\sup_P l_P=\sup_P|\mathbf{r}_1-\mathbf{r}_0|=|\mathbf{r}_1-\mathbf{r}_0|,$$
which is exactly what we expect.


We next show that a unit circle is rectifiable. We want to avoid using any differentiation of the sin and cos functions, because they are based on the limit identity $\frac{\sin x}{x}\to 1$ when $x\to 0$. The proof of this limit identity requires the use of the length of a circular arc, so it is built upon the fact that a circle is rectifiable. To prove that a unit circle is rectifiable without circular reasoning, we parametrize the upper semi-circle by:
$$\mathbf{r}(t):=\left(t,\sqrt{1-t^2}\right),\quad t\in[-1,1].$$

After showing that the lower semi-circle is also rectifiable (mutatis mutandis), we can conclude that the full circle is rectifiable. We need the following observation:

■ Exercise 4.77 Let $P$ be a partition of $[a,b]$, and let $P'=P\cup\{t'\}$ where $t'\in[a,b]$. Show that $l_P\le l_{P'}$. Hence, show that for a continuous curve $\{\mathbf{r}(t)\}_{t\in[a,b]}$, if $\{\mathbf{r}(t)\}_{t\in[a,c]}$ and $\{\mathbf{r}(t)\}_{t\in[c,b]}$ are rectifiable for some $c\in(a,b)$, then $\{\mathbf{r}(t)\}_{t\in[a,b]}$ is also rectifiable.

Recall that we parametrize the upper semi-circle by $\mathbf{r}(t):=\left(t,\sqrt{1-t^2}\right)$, $t\in[-1,1]$. Given any partition $P$ of $[-1,1]$, we may refine $P$ by taking $P':=P\cup\{0\}$, then we must have $l_P\le l_{P'}$. After such a refinement, one can easily see from the diagram below that $l_{P'}\le 4$:

Figure 4.6: diagram to be added

In particular, we have $l_P\le l_{P'}\le 4$ for any partition $P$ of $[-1,1]$. This shows the upper semi-circle is rectifiable, meaning that $\sup_P l_P$ exists in $\mathbb{R}$. We then define
$$\pi:=\sup_P l_P.$$

■ Exercise 4.78 Show that if $\Phi:\mathbb{R}^2\to\mathbb{R}^2$ is a distance-preserving map, and $\{\mathbf{r}(t)\}_{t\in[a,b]}$ is a rectifiable curve, then $\{\Phi\circ\mathbf{r}(t)\}_{t\in[a,b]}$ is also rectifiable and its length is the same as that of $\{\mathbf{r}(t)\}_{t\in[a,b]}$.

4.8.3 An example of a non-rectifiable curve


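A curve could be non-rectifiable if it fluctuates too much, such as:
$$\mathbf{r}(t)=\begin{cases}(0,0) & \text{if } t=0\\ \left(t,\sin\frac{1}{t}\right) & \text{if } 0<t\le\frac{2}{\pi}\end{cases}.$$

To see why it is not rectifiable, we consider the sequence of partitions:
$$P_n: 0<\frac{1}{\frac{\pi}{2}+2n\pi}<\frac{1}{-\frac{\pi}{2}+2n\pi}<\frac{1}{\frac{\pi}{2}+2(n-1)\pi}<\frac{1}{-\frac{\pi}{2}+2(n-1)\pi}<\cdots<\frac{1}{-\frac{\pi}{2}+2\pi}<\frac{1}{\frac{\pi}{2}}.$$

Then, we can easily see that
$$
\begin{aligned}
l_{P_n} &\ge \sum_{k=1}^{n}\left|\mathbf{r}\left(\frac{1}{\frac{\pi}{2}+2k\pi}\right)-\mathbf{r}\left(\frac{1}{-\frac{\pi}{2}+2k\pi}\right)\right| \\
&\ge \sum_{k=1}^{n}\left|\sin\left(\frac{\pi}{2}+2k\pi\right)-\sin\left(-\frac{\pi}{2}+2k\pi\right)\right| \\
&= \sum_{k=1}^{n}2=2n.
\end{aligned}
$$

Since $n\in\mathbb{N}$ can be arbitrarily large, it is impossible to find an upper bound $C$ for $l_{P_n}$. This shows that the curve $\mathbf{r}(t)$ is not rectifiable.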
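The growth of $l_{P_n}$ can be checked directly. The following sketch builds the partition $P_n$ described above and sums the segment lengths; the function names `r`, `partition` and `polygonal_length` are illustrative only.

```python
import math

def r(t):
    """The curve (t, sin(1/t)), extended by (0, 0) at t = 0."""
    return (0.0, 0.0) if t == 0 else (t, math.sin(1.0 / t))

def partition(n):
    """Partition points of P_n in increasing order, from 0 to 2/pi."""
    pts = [0.0]
    for k in range(n, 0, -1):
        pts.append(1.0 / (math.pi / 2 + 2 * k * math.pi))    # sin(1/t) = 1 here
        pts.append(1.0 / (-math.pi / 2 + 2 * k * math.pi))   # sin(1/t) = -1 here
    pts.append(2.0 / math.pi)
    return pts

def polygonal_length(pts):
    """l_P: total length of the segments joining consecutive points r(t_i)."""
    return sum(math.dist(r(s), r(t)) for s, t in zip(pts, pts[1:]))

for n in (1, 5, 50, 500):
    print(n, polygonal_length(partition(n)))
```

The printed lengths grow without bound as $n$ increases, in line with the estimate $l_{P_n}\ge 2n$.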

■ Exercise 4.79 Show that the graph $y=f(x)$, $x\in[0,2/\pi]$, where
$$f(x)=\begin{cases}x\sin\frac{1}{x} & \text{if } x\neq 0\\ 0 & \text{if } x=0\end{cases}$$
is not rectifiable.

4.8.4 Arc-length formula for $C^1$ curves


Now we are ready to derive the formula of arc-length that appears in many calculus textbooks for physics/engineering majors.

Proposition 4.30 Suppose $\mathbf{r}(t)=(f(t),g(t))$, $t\in[a,b]$, is a curve with $f,g$ being $C^1$ on $[a,b]$. Then, $\{\mathbf{r}(t)\}_{t\in[a,b]}$ is rectifiable, and its arc-length is given by:
$$\int_a^b|\mathbf{r}'(t)|\,dt=\int_a^b\sqrt{f'(t)^2+g'(t)^2}\,dt.$$
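Before going through the proof, the statement can be illustrated numerically. The sketch below assumes the sample curve $\mathbf{r}(t)=(t,t^2)$ on $[0,1]$, for which $\int_0^1\sqrt{1+4t^2}\,dt=\frac{2\sqrt{5}+\sinh^{-1}2}{4}$; the function name `polygonal_length` and the use of uniform partitions are illustrative choices.

```python
import math

def polygonal_length(r, a, b, n):
    """l_P for the uniform partition of [a, b] with n subintervals."""
    pts = [r(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def parabola(t):
    # Sample C^1 curve r(t) = (t, t^2); an assumed example, not from the notes.
    return (t, t * t)

# Closed form of the integral of |r'(t)| = sqrt(1 + 4 t^2) over [0, 1].
exact = (2 * math.sqrt(5) + math.asinh(2)) / 4

for n in (1, 10, 100, 1000):
    print(n, polygonal_length(parabola, 0.0, 1.0, n), exact)
```

The polygonal lengths stay below the value of the integral and approach it as the partition is refined, which is the behaviour the proposition predicts for $\sup_P l_P$.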

Proof. The key idea is to use the mean value theorem to relate $\mathbf{r}'(t)$ and $\mathbf{r}(t_i)-\mathbf{r}(t_{i-1})$. First note that $|\mathbf{r}'(t)|$ is a continuous function on $[a,b]$, so it is Riemann integrable on $[a,b]$. Hence, we have
$$\sup_P L(|\mathbf{r}'|,P)=\underline{\int_a^b}|\mathbf{r}'(t)|\,dt=\int_a^b|\mathbf{r}'(t)|\,dt=\overline{\int_a^b}|\mathbf{r}'(t)|\,dt=\inf_P U(|\mathbf{r}'|,P).$$

By the standard $\frac{1}{n}$-trick, one can take a sequence of partitions $\{P_n\}$ of $[a,b]$ such that
$$\lim_{n\to\infty}L(|\mathbf{r}'|,P_n)=\sup_P L(|\mathbf{r}'|,P).$$

Since f 0 (t) and g 0 (t) are continuous on the closed and bounded interval [a, b], they are also
uniformly continuous on [a, b]. Hence, for any n 2 N, there exists n > 0 such that whenever
|t s| < n , we have |f 0 (t) f 0 (s)| < n1 and |g 0 (t) g 0 (s)| < n1 .
Now given any partition $P$ of $[a, b]$, we need to bound $l_P$ by a constant, and show that $\int_a^b |r'(t)|\,dt$
is the least upper bound of $l_P$ among all partitions $P$ of $[a, b]$. By merging $P$ with $P_n$, and adding
enough extra partition points $\{c_1, \cdots, c_k\}$, we can get a refined partition $P_n' = P \cup P_n \cup \{c_1, \cdots, c_k\}$
such that all its subintervals have width less than $\delta_n$. After reordering the partition points, we
write:
\[
P_n' = \{a = t_0 < t_1 < \cdots < t_{n-1} < t_n = b\}.
\]
Then, we have $t_i - t_{i-1} < \delta_n$ for any $i$. Moreover, we still have
\[
\lim_{n \to \infty} L(|r'|, P_n') = \int_a^b |r'(t)|\,dt
\]
since $P_n'$ refines $P_n$, and hence
\[
L(|r'|, P_n) \leq L(|r'|, P_n') \leq \int_a^b |r'(t)|\,dt.
\]
Next, on each $[t_{i-1}, t_i]$, we use the mean value theorem to compare $|r(t_i) - r(t_{i-1})|$ with the
term $\inf_{[t_{i-1}, t_i]} |r'| \,(t_i - t_{i-1})$ in $L(|r'|, P_n')$. By the extreme value theorem and the continuity of $|r'(t)|$,
there exists $s_i \in [t_{i-1}, t_i]$ such that
\[
\inf_{[t_{i-1}, t_i]} |r'| = |r'(s_i)|.
\]

Also, the mean value theorem shows there exist $t_i^*, t_i^{**} \in (t_{i-1}, t_i)$ such that
\[
\begin{aligned}
f(t_i) - f(t_{i-1}) &= f'(t_i^*)(t_i - t_{i-1}), \\
g(t_i) - g(t_{i-1}) &= g'(t_i^{**})(t_i - t_{i-1}).
\end{aligned}
\]

Then, since $P_n'$ refines $P$, we have
\[
l_P \leq l_{P_n'} = \sum_{i=1}^{n} |r(t_i) - r(t_{i-1})| = \sum_{i=1}^{n} \left| \big(f'(t_i^*), g'(t_i^{**})\big) \right| (t_i - t_{i-1}).
\]

Note that $s_i, t_i^*, t_i^{**} \in [t_{i-1}, t_i]$ where $t_i - t_{i-1} < \delta_n$, so we have
\[
|f'(t_i^*) - f'(s_i)| < \frac{1}{n} \quad \text{and} \quad |g'(t_i^{**}) - g'(s_i)| < \frac{1}{n}.
\]
This shows
\[
\left| \big(f'(t_i^*), g'(t_i^{**})\big) - \big(f'(s_i), g'(s_i)\big) \right| = \sqrt{|f'(t_i^*) - f'(s_i)|^2 + |g'(t_i^{**}) - g'(s_i)|^2} < \frac{\sqrt{2}}{n}.
\]
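Then by the (corollary of the) triangle inequality in $\mathbb{R}^2$: $|v| \leq |v - w| + |w|$ for any $v, w \in \mathbb{R}^2$, we
have
\[
\begin{aligned}
\left| \big(f'(t_i^*), g'(t_i^{**})\big) \right| &\leq \left| \big(f'(t_i^*), g'(t_i^{**})\big) - \big(f'(s_i), g'(s_i)\big) \right| + \left| \big(f'(s_i), g'(s_i)\big) \right| \\
&< \frac{\sqrt{2}}{n} + |r'(s_i)| = \frac{\sqrt{2}}{n} + \inf_{[t_{i-1}, t_i]} |r'|.
\end{aligned}
\]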

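Therefore, we conclude that:
\[
\begin{aligned}
l_P \leq l_{P_n'} &= \sum_{i=1}^{n} \left| \big(f'(t_i^*), g'(t_i^{**})\big) \right| (t_i - t_{i-1}) \\
&< \sum_{i=1}^{n} \left( \frac{\sqrt{2}}{n} + \inf_{[t_{i-1}, t_i]} |r'| \right) (t_i - t_{i-1}) \\
&= \frac{\sqrt{2}}{n}(b - a) + L(|r'|, P_n').
\end{aligned}
\]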

Letting $n \to \infty$, we proved that
\[
l_P \leq \lim_{n \to \infty} \left( \frac{\sqrt{2}}{n}(b - a) + L(|r'|, P_n') \right) = \int_a^b |r'(t)|\,dt.
\]

Therefore, $l_P$ is bounded from above by the constant $\int_a^b |r'(t)|\,dt$, which is independent of $P$. This shows the
curve $\{r(t)\}_{t \in [a,b]}$ is rectifiable.
To show that $\sup_P l_P = \int_a^b |r'(t)|\,dt$, we first show that
\[
\lim_{n \to \infty} l_{P_n'} = \int_a^b |r'(t)|\,dt.
\]

Recall that
\[
\left| \big(f'(t_i^*), g'(t_i^{**})\big) - \big(f'(s_i), g'(s_i)\big) \right| < \frac{\sqrt{2}}{n}.
\]
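By $|w| \leq |v - w| + |v|$, we have
\[
\begin{aligned}
\left| \big(f'(t_i^*), g'(t_i^{**})\big) \right| &\geq -\left| \big(f'(t_i^*), g'(t_i^{**})\big) - \big(f'(s_i), g'(s_i)\big) \right| + \left| \big(f'(s_i), g'(s_i)\big) \right| \\
&> -\frac{\sqrt{2}}{n} + |r'(s_i)| = -\frac{\sqrt{2}}{n} + \inf_{[t_{i-1}, t_i]} |r'|.
\end{aligned}
\]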

This shows
\[
l_{P_n'} = \sum_{i=1}^{n} \left| \big(f'(t_i^*), g'(t_i^{**})\big) \right| (t_i - t_{i-1}) > -\frac{\sqrt{2}}{n}(b - a) + L(|r'|, P_n').
\]

Combining with the earlier result, we have
\[
-\frac{\sqrt{2}}{n}(b - a) + L(|r'|, P_n') < l_{P_n'} < \frac{\sqrt{2}}{n}(b - a) + L(|r'|, P_n').
\]
Letting $n \to \infty$, the squeeze theorem gives:
\[
\lim_{n \to \infty} l_{P_n'} = \int_a^b |r'(t)|\,dt.
\]
To summarize, we have shown that:
• $\int_a^b |r'(t)|\,dt$ is an upper bound of $l_P$ over all partitions $P$ of $[a, b]$, and
• there exists a sequence $\{P_n'\}$ such that
\[
\lim_{n \to \infty} l_{P_n'} = \int_a^b |r'(t)|\,dt.
\]
Together, these show that $\sup_P l_P = \int_a^b |r'(t)|\,dt$, so this integral gives the length of
the curve. ■
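
The two-sided estimate used in the proof can be checked numerically on a concrete curve. The following is a minimal Python sketch, assuming the hypothetical test curve $r(t) = (t, t^2/2)$ on $[0, 1]$, for which $f'(t) = 1$ and $g'(t) = t$, so that $\delta_n = \frac{1}{n}$ works in the uniform continuity step. It uses a uniform partition with $n + 1$ subintervals (mesh less than $\delta_n$) rather than the refined partition $P_n'$ built in the proof, and the function names are made up for illustration.

```python
import math

def r(t):
    # Hypothetical test curve r(t) = (t, t^2/2): f'(t) = 1 and g'(t) = t.
    return (t, 0.5 * t * t)

def speed(t):
    # |r'(t)| = sqrt(1 + t^2)
    return math.hypot(1.0, t)

def polygonal_length(ts):
    """l_P: total length of the polygon through r(t_0), ..., r(t_m)."""
    return sum(math.dist(r(s), r(t)) for s, t in zip(ts, ts[1:]))

def lower_sum(ts):
    """L(|r'|, P): |r'| is increasing on [0, 1], so its infimum on [s, t] is at s."""
    return sum(speed(s) * (t - s) for s, t in zip(ts, ts[1:]))

def uniform_partition(a, b, m):
    return [a + (b - a) * i / m for i in range(m + 1)]

a, b = 0.0, 1.0
# Closed form of the integral of sqrt(1 + t^2) over [0, 1].
exact = 0.5 * (math.sqrt(2.0) + math.asinh(1.0))

for n in (1, 10, 100, 1000):
    ts = uniform_partition(a, b, n + 1)      # mesh 1/(n+1) < delta_n = 1/n
    l, L = polygonal_length(ts), lower_sum(ts)
    bound = math.sqrt(2.0) * (b - a) / n
    # First flag: the squeeze inequality; second flag: the upper bound by the integral.
    print(n, L - bound < l < L + bound, l <= exact, exact - l)
```

The last printed column is the gap between the integral and the polygonal length, which shrinks as $n$ grows, in line with the squeeze argument.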

■ Exercise 4.80 Show that if $L$ is an upper bound of $X$, and there exists a sequence $x_n \in X$
such that $\lim_{n \to \infty} x_n = L$, then we have $\sup X = L$.

■ Exercise 4.81 First digest the whole proof of Proposition 4.30. In the proof we considered
$\sup_P L(|r'|, P)$ to extract a sequence $\{P_n\}$ so that $L(|r'|, P_n)$ converges to $\int_a^b |r'(t)|\,dt$. Can
we prove the proposition by considering $\inf_P U(|r'|, P)$ instead? If not, point out why. If yes,
rewrite the whole proof (without looking at the above proof) by considering $\inf_P U(|r'|, P)$.

Using the arc-length formula, one can easily derive that the length of the graph $y = f(x)$ over
$x \in [a, b]$ is given by:
\[
\int_a^b \sqrt{1 + f'(x)^2}\,dx.
\]
This is simply because we can parametrize the graph by $r(t) = (t, f(t))$, $t \in [a, b]$. One can easily check
that $|r'(t)| = |(1, f'(t))| = \sqrt{1 + |f'(t)|^2}$.
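
As a quick sanity check of the graph formula, the following minimal Python sketch compares a polygonal length with a Riemann sum of $\sqrt{1 + f'(x)^2}$. It assumes the illustrative choice $f(x) = \cosh x$ on $[0, 1]$, for which $\sqrt{1 + \sinh^2 x} = \cosh x$ and the integral equals $\sinh 1$; the function names and the number of subintervals are arbitrary choices made for this example.

```python
import math

def graph_length(f, a, b, m=10_000):
    """Polygonal length of y = f(x) over a uniform partition of [a, b] with m pieces."""
    xs = [a + (b - a) * i / m for i in range(m + 1)]
    return sum(math.hypot(x1 - x0, f(x1) - f(x0)) for x0, x1 in zip(xs, xs[1:]))

def integral_length(df, a, b, m=10_000):
    """Midpoint Riemann sum of sqrt(1 + f'(x)^2) over [a, b], where df is f'."""
    h = (b - a) / m
    return sum(math.sqrt(1.0 + df(a + (i + 0.5) * h) ** 2) * h for i in range(m))

poly = graph_length(math.cosh, 0.0, 1.0)      # f(x) = cosh x
integ = integral_length(math.sinh, 0.0, 1.0)  # f'(x) = sinh x
print(poly, integ, math.sinh(1.0))            # three approximations of the same length
```

The polygonal length approaches the integral from below, as expected from the fact that the arc-length is the supremum of $l_P$.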

■ Exercise 4.82 Suppose you ride a bicycle near a farm field and a piece of cow dung sticks to
your wheel. The trajectory of the dung is given by:
\[
r(t) = (rt - r \sin t,\; r - r \cos t),
\]
where $r > 0$ is the radius of the wheel (assuming $r$ is much larger than the diameter of the
dung). Find the distance travelled by the dung after one cycle.

■ Exercise 4.83 Write down a parametrization $r(t)$ of the curve $x^{2/3} + y^{2/3} = 1$, and compute
its arc length.

■ Exercise 4.84 A polar curve is one that is given by an equation $r = f(\theta)$, $\theta \in [\alpha, \beta]$. Here
$(r, \theta)$ denote the polar coordinates and $f$ is a $C^1$ function of $\theta$. Show that the length of the
curve is given by
\[
\int_\alpha^\beta \sqrt{f(\theta)^2 + f'(\theta)^2}\,d\theta.
\]
