Quantitative Methods 3
Academic year 2017-2018
Course Reader
Mathematics Part
Integration
Bachelor
Table of contents
2. Integration
1. The integral
This chapter is an elementary introduction to the theory of integration. Integrals can be used to determine the
area of a region below the graph of a positive function. In section 1.1 we will show that the area of such a region
can be important in an economic context.
Next we introduce for a positive function the so-called area function. We will show that differentiating this area
function leads to the original function. So it is worthwhile to know how to find for a given function f a function
F with derivative f. For obvious reasons, such a function is called an antiderivative, or integral, of f. So the area
of a region below the graph of a positive function has something to do with an antiderivative of that function.
We will explain how to determine an antiderivative of some elementary functions. You will learn how to find an
antiderivative of less elementary functions by using antiderivatives of elementary functions together with some
rules for multiples and sums of functions.
We will also investigate what the integral of an arbitrary (i.e. not necessarily positive) function has to do with the
area of one or more regions.
In this chapter all functions will be continuous. As it happens, this is a sufficient condition for functions to have
an antiderivative.
1.1 Area
By means of two examples we illustrate that the area of a region bounded by the graph of a positive function
may have an economic interpretation.
Suppose the demand function is
$$P = f(Q) = 50 - \tfrac{1}{4}Q$$
and the supply function is
$$P = g(Q) = 10 + \tfrac{1}{12}Q$$
Then the competitive market equilibrium occurs at the intersection point of the graphs of these functions
(𝑄 ∗ , 𝑃∗ ) = (120,20) and what the consumers pay is equal to what the producer gets, namely 120 × 20 = 2400,
which is the area of the grey rectangle in the picture below:
However, according to the demand curve, some consumers are willing to pay a higher price, for example at
𝑄 = 80 the corresponding price on the demand curve is 𝑃 = 30, which means that the first 80 “units” could
have been sold at a price of 30 instead of 20. Refining this procedure we could determine at every value of
𝑄 < 120 the marginal consumer willingness to pay for the last unit output. Then the total amount that
consumers are willing to pay at the equilibrium output 𝑄 ∗ = 120 turns out to be the area below the demand
curve between 𝑄 = 0 and 𝑄 = 120. So they are willing to pay more than what they are actually paying. This is
called the Consumer surplus. It is the area of the light blue region in the figure below, indicated by CS:
The value of this area is easy to determine, as the demand function is linear, so the consumer surplus is here the
area of a triangle: $CS = \tfrac{1}{2} \times 120 \times (50 - 20) = 1800$.
But if the demand function is nonlinear, then in general for the determination of the area below the graph we
need more advanced tools.
For the producer there is a similar reasoning. At equilibrium the producer gets 120 × 20 = 2400, the area of the
grey rectangle. But, according to the supply curve, at a lower output the producer is willing to accept a lower
price, for example at 𝑄 = 60 the producer will accept a price 𝑃 = 15. Thus we can determine, at every value of
𝑄 < 120 the marginal price to be accepted by the producer. So the producer gets more than he is willing to
accept, which is in total the area below the graph of the supply function between 𝑄 = 0 and 𝑄 = 120. What the
producer gets extra is called the Producer surplus and is the pink area below, indicated by PS:
The value of this area is again easy to determine. It is $PS = \tfrac{1}{2} \times 120 \times 10 = 600$. Also in this case we need
more advanced tools in order to determine PS if the supply function is nonlinear.
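The two surplus values can also be checked numerically, which is the route that remains available when the curves are nonlinear. The sketch below is only an illustration: the helper `midpoint_integral` and the names `Q_STAR` and `P_STAR` are assumptions introduced here, not part of the reader.

```python
# Demand and supply functions from the example in section 1.1.
def demand(q):
    return 50 - q / 4

def supply(q):
    return 10 + q / 12

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the area below f between a and b with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

Q_STAR, P_STAR = 120, 20  # equilibrium quantity and price from the text

# Each surplus is the difference between an area below a curve
# and the rectangle Q* x P* that is actually paid.
consumer_surplus = midpoint_integral(demand, 0, Q_STAR) - Q_STAR * P_STAR
producer_surplus = Q_STAR * P_STAR - midpoint_integral(supply, 0, Q_STAR)

print(round(consumer_surplus, 6), round(producer_surplus, 6))
```

For linear curves the midpoint rule is exact up to rounding, so the printed values should agree with the triangle areas 1800 and 600.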
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
where 𝜆 > 0 is a parameter specifying the density function. If the units of time are minutes then the probability
that your waiting time in the queue is less than T minutes turns out to be the area below the graph of 𝑓 between
𝑥 = 0 and 𝑥 = 𝑇. See the figure below, where we took 𝜆 = 0.5 and 𝑇 = 3:
N.B.: $\lambda$ has something to do with the mean of the distribution; to be precise: $\frac{1}{\lambda}$ is the mean, so in our specific
example the mean waiting time is 2 minutes.
We are interested in the area of the region bounded by the graph of this function, the horizontal axis and the two
vertical lines 𝑥 = 𝑎 and 𝑥 = 𝑏. These two lines are called left and right boundary of the region.
In order to determine this area, we use the following clever trick that seems to be useless at first sight: we are
going to vary the right boundary of the region, or in other words:
we consider the area of the region bounded by the graph of the function 𝑓, the horizontal axis, the vertical line
𝑥 = 𝑎 and the vertical line 𝑥 = 𝑡, where 𝑡 is an arbitrary number between 𝑎 and 𝑏.
Since the area of this region depends on the value of 𝑡 it is denoted by 𝐴(𝑡). In this way the area has become a
function of 𝑡, and we call it the Area function.
Area function
For a function 𝑓 that is positive on the interval [𝑎, 𝑏] we define the area function 𝐴 with domain [𝑎, 𝑏] as
follows:
For every 𝑡 in the domain [𝑎, 𝑏], the function value 𝐴(𝑡) is the area of the region bounded by the graph of 𝑓, the
horizontal axis and the vertical lines 𝑥 = 𝑎 and 𝑥 = 𝑡.
In order to get a better feeling for what the area function really is we consider a few examples with simple
functions.
𝑔(𝑥) = 3, with 1 ≤ 𝑥 ≤ 4
Then the area function 𝐴 has at 𝑥 = 𝑡 a value that is the area of the grey rectangle in the figure below:
This area is 𝐴(𝑡) = 3(𝑡 − 1) and the total area below the graph of the function 𝑔 is 𝐴(4) = 9.
1.2.1 Exercise
Find the area function corresponding to the function 𝑓 with domain [0,5], defined by 𝑓(𝑥) = 𝑥. Try to solve this
yourself. The solution is given below.
Then the area function $A$ at $x = t$ is the area of the grey triangle. This area is $A(t) = \tfrac{1}{2} t \times t = \tfrac{1}{2} t^2$ and the total
area below the graph of the function $f$ is $A(5) = 12.5$.
Now we will derive two properties of an area function that will help us to find area functions of all kinds of
functions.
The first property is obvious and follows directly from the definition of area function:
If 𝐴 is the area function corresponding to the positive function 𝑓, with domain [𝑎, 𝑏], then
𝐴(𝑎) = 0
For the second property we consider a small change in the area function 𝐴 corresponding to the function 𝑓 with
domain [𝑎, 𝑏].
𝐴(𝑡) is the area below the graph and between 𝑥 = 𝑎 and 𝑥 = 𝑡. If we increase 𝑡 by a small amount, say Δ𝑡, to
𝑡 + Δ𝑡, then the increase in the area function will be Δ𝐴 = 𝐴(𝑡 + Δ𝑡) − 𝐴(𝑡), as indicated in the figure below.
In this figure the function 𝑓 is decreasing between 𝑡 and 𝑡 + Δ𝑡, so that 𝑓(𝑡) > 𝑓(𝑡 + Δ𝑡). From the figure it is
immediately clear that Δ𝐴 must be between the area 𝑓(𝑡)Δ𝑡 of the larger rectangle with width Δ𝑡 and height 𝑓(𝑡)
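and the area $f(t + \Delta t)\,\Delta t$ of the smaller rectangle with width $\Delta t$ and height $f(t + \Delta t)$. So
$$f(t + \Delta t)\,\Delta t \le \Delta A \le f(t)\,\Delta t$$
and, after dividing by $\Delta t > 0$,
$$f(t + \Delta t) \le \frac{\Delta A}{\Delta t} \le f(t)$$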
What will happen if we take Δ𝑡 smaller and smaller? In other words, what is the limit of the expressions in the
inequalities above if we let Δ𝑡 go to zero (shortly Δ𝑡 → 0)?
The left and the right hand part will be equal to 𝑓(𝑡) if ∆𝑡 → 0, but what about the middle part?
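Well, by definition of the derivative,
$$\lim_{\Delta t \to 0} \frac{\Delta A}{\Delta t} = \lim_{\Delta t \to 0} \frac{A(t + \Delta t) - A(t)}{\Delta t} = A'(t)$$
Since the middle part lies between two expressions that both tend to $f(t)$, we obtain the second property: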
If 𝐴 is the area function corresponding to the positive function 𝑓, with domain [𝑎, 𝑏], then
𝐴′ (𝑡) = 𝑓(𝑡).
N.B.: in the proof above we used that the function 𝑓 is decreasing between 𝑡 and 𝑡 + ∆𝑡, but also if the function
is increasing a similar reasoning is possible.
In words: the derivative of the area function corresponding to a given (positive) function is this function.
From this result we conclude that finding the area below the graph of a (positive) function 𝑓 has something to do
with finding a function of which the derivative is 𝑓.
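A small numerical experiment can make the two properties tangible. The sketch below uses the function $f(x) = x$ on $[0, 5]$ from exercise 1.2.1; the helper names `area` and `difference_quotient` are assumptions made for this illustration.

```python
def f(x):
    return x  # the function from exercise 1.2.1, with domain [0, 5]

def area(t, a=0.0, n=10_000):
    """Area below f between a and t, approximated by n midpoint rectangles."""
    h = (t - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def difference_quotient(t, dt=1e-3):
    """(A(t + dt) - A(t)) / dt, which should be close to f(t) for small dt."""
    return (area(t + dt) - area(t)) / dt

# First property: A(a) = 0. Second property: A'(t) = f(t).
print(area(0.0))
for t in (1.0, 2.5, 4.0):
    print(t, f(t), difference_quotient(t))
```

The difference quotient differs from $f(t)$ only by a term of the order of the step `dt`, which shrinks as `dt` is made smaller.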
1.2.2 Exercise
Check which of the following functions 𝐴 satisfy the properties of the area function of the function
𝑔(𝑥) = 2𝑥 − 2 with domain [2,7].
(1) $A(x) = (x - 1)^2$ (2) $A(x) = x^2 - 2x$ (3) $A(x) = x^2 - 2^2$
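Let $f$ be a function on the interval $[a, b]$. Then a function $F$ defined on $[a, b]$ is called an antiderivative of $f$ if
$F'(x) = f(x)$ for every $x$ in $[a, b]$. If $F$ is an antiderivative of $f$, then so is every function of the form
$$F(x) + c$$
where $c$ is an arbitrary constant.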
For this general form of antiderivative we use a notation with the integral sign:
∫ 𝑓(𝑥) 𝑑𝑥 = 𝐹(𝑥) + 𝑐
$$\int f(x)\,dx = \int x\,dx = \tfrac{1}{2}x^2 + c$$
$$\int f(x)\,dx = \int k\,dx = kx + c$$
$$\int x^2\,dx = \tfrac{1}{3}x^3 + c$$
$$\int x^a\,dx = \frac{1}{a+1}x^{a+1} + c$$
where, obviously, 𝑎 ≠ −1.
This rule holds for any value of 𝑎, except -1, but we should take into account that for 𝑎 < 0 the function 𝑓 is not
defined at 𝑥 = 0.
What if we take $a = -1$ in this last example? Or, what is an antiderivative of $f(x) = \frac{1}{x}$?
EXAMPLE INTEGRAL OF THE FUNCTION $f(x) = \frac{1}{x}$
You should know that for 𝑥 > 0 as an antiderivative for 𝑓 we can take 𝐹(𝑥) = ln 𝑥. But even if 𝑥 < 0, there is
an antiderivative of 𝑓, namely 𝐹(𝑥) = ln(−𝑥). Just take the derivative (with the chain rule) and observe that
also in this case 𝐹 ′ (𝑥) = 𝑓(𝑥). So the function
$$F(x) = \begin{cases} \ln x & \text{if } x > 0 \\ \ln(-x) & \text{if } x < 0 \end{cases}$$
is an antiderivative of $f(x) = \frac{1}{x}$, where we take into account that $x \ne 0$. This can be summarized to:
$$\int \frac{1}{x}\,dx = \ln|x| + c$$
where 𝑥 ≠ 0.
$$\int e^{ax}\,dx = \frac{1}{a}e^{ax} + c$$
where 𝑎 ≠ 0.
In order to find an antiderivative of the function 𝑔(𝑥) = 𝑎 𝑥 (with 𝑎 > 0 and 𝑎 ≠ 1) we need to apply some
rules of exponents and logarithms. First rewrite
$$g(x) = a^x = \left(e^{\ln a}\right)^x = e^{x \ln a}$$
$$\int a^x\,dx = \int e^{x \ln a}\,dx = \frac{1}{\ln a}e^{x \ln a} + c = \frac{1}{\ln a}a^x + c$$
In the former examples we have seen the integrals of all kinds of elementary functions, which are summarized
below:
$$\int f(x)\,dx = \int k\,dx = kx + c$$
$$\int x^a\,dx = \frac{1}{a+1}x^{a+1} + c$$
$$\int \frac{1}{x}\,dx = \ln|x| + c$$
$$\int e^{ax}\,dx = \frac{1}{a}e^{ax} + c$$
$$\int a^x\,dx = \frac{1}{\ln a}a^x + c$$
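Each entry of this summary can be checked by differentiating the right-hand side. The sketch below does this numerically with a central difference; the parameter values `k`, `a` and `base` and the helper names are hypothetical choices made only for this illustration.

```python
import math

def central_difference(F, x, h=1e-6):
    """Numerical derivative of F at x."""
    return (F(x + h) - F(x - h)) / (2 * h)

k, a, base = 3.0, 2.5, 2.0  # hypothetical parameter values

# (antiderivative F, integrand f) pairs from the summary, taking c = 0.
table = {
    "constant": (lambda x: k * x, lambda x: k),
    "power": (lambda x: x ** (a + 1) / (a + 1), lambda x: x ** a),
    "reciprocal": (lambda x: math.log(abs(x)), lambda x: 1 / x),
    "exponential": (lambda x: math.exp(a * x) / a, lambda x: math.exp(a * x)),
    "general exponential": (lambda x: base ** x / math.log(base), lambda x: base ** x),
}

def max_error(F, f, points=(0.5, 1.0, 2.0)):
    """Largest gap between the numerical derivative of F and f on the test points."""
    return max(abs(central_difference(F, x) - f(x)) for x in points)

for name, (F, f) in table.items():
    print(name, max_error(F, f))
```

All reported gaps should be tiny, which reflects the defining property $F'(x) = f(x)$. The reciprocal entry can also be tested at a negative point, where $\ln|x|$ is still an antiderivative of $\frac{1}{x}$.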
In the former section in the example about the area function of a constant function we have seen that the area
function of 𝑔(𝑥) = 3 with 𝐷𝑔 = [1,4] is 𝐴(𝑥) = 3(𝑥 − 1), which is the antiderivative of g with 𝐴(1) = 0.
Similarly in the example about the area function of $f(x) = x$ with $D_f = [0,5]$ we found $A(x) = \tfrac{1}{2}x^2$, which is
the antiderivative of $f$ with $A(0) = 0$.
Then the following property of antiderivatives will not surprise you:
Let $f$ be a function on the interval $[a, b]$ and $F$ be an antiderivative of $f$. Then there exists exactly one
antiderivative $G$ of $f$ with $G(a) = 0$, namely
$$G(x) = F(x) - F(a)$$
APPLICATION FINDING THE COST FUNCTION FROM THE MARGINAL AND FIXED COST
Assume that at production level 𝑥 the marginal cost function is given by
$$C'(x) = 2x^2 + 2x - 5$$
and that the fixed cost (at 𝑥 = 0) is equal to 100. In order to find the cost function 𝐶 we first try to find the
integral of 𝐶′. We know what the integrals are of the different terms in the marginal cost function, so we may
expect that the integral of 𝐶′ is the sum of these integrals:
$$\int (2x^2 + 2x - 5)\,dx = \tfrac{2}{3}x^3 + x^2 - 5x + c$$
So $C(x) = \tfrac{2}{3}x^3 + x^2 - 5x + c$, where $c$ can be any number. However we also know that at $x = 0$ the fixed cost
$C(0) = 100$. Then $c$ is uniquely determined as $c = 100$ and
$$C(x) = \tfrac{2}{3}x^3 + x^2 - 5x + 100$$
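The cost function found here can be verified by checking both conditions it has to satisfy: the fixed cost at $x = 0$ and the given marginal cost as derivative. The following is a minimal sketch; the function names and the central-difference helper are assumptions introduced for the illustration.

```python
def marginal_cost(x):
    return 2 * x**2 + 2 * x - 5

def cost(x, fixed_cost=100.0):
    """Antiderivative of the marginal cost, with the constant fixed by C(0) = fixed_cost."""
    return (2 / 3) * x**3 + x**2 - 5 * x + fixed_cost

def central_difference(F, x, h=1e-5):
    """Numerical derivative of F at x."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Compare the numerical derivative of C with the given marginal cost.
for x in (0.0, 1.0, 4.0):
    print(x, cost(x), marginal_cost(x), central_difference(cost, x))
```

The last two columns should agree up to rounding, and the first row shows the fixed cost of 100.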
Integration rules
In this section we have defined the integral of a function, we have seen integrals of several standard functions
and we formulated two rules of integration.
These rules make it possible to find antiderivatives of more complicated functions. But still there are a lot of
functions whose integral is hard to find. But in general it is easy to check whether a function is an antiderivative:
just take the derivative, applying all the rules of differentiation that you know (sum, product, quotient, chain
rule)!
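In order to find an antiderivative of
$$f(x) = 5\sqrt{x} - \frac{3}{x^2} \quad (\text{with } x > 0)$$
we first determine an antiderivative $G$ of the function $g(x) = \sqrt{x}$ and an antiderivative $H$ of the function
$h(x) = \frac{1}{x^2}$.
Because $\sqrt{x} = x^{1/2}$ and $\frac{1}{x^2} = x^{-2}$, we conclude that
$$G(x) = \frac{1}{\frac{1}{2}+1}x^{\frac{1}{2}+1} = \frac{2}{3}x^{3/2} = \frac{2}{3}x\sqrt{x}, \quad \text{and}$$
$$H(x) = \frac{1}{-2+1}x^{-2+1} = -x^{-1} = -\frac{1}{x}.$$
Hence an antiderivative of $f$ is
$$F(x) = 5 \cdot \frac{2}{3}x\sqrt{x} - 3 \cdot \left(-\frac{1}{x}\right) = \frac{10}{3}x\sqrt{x} + \frac{3}{x} \quad (\text{with } x > 0)$$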
1.3.1 Exercise
Find an antiderivative of the function 𝑓 defined by
6 7 1 2
(a) 𝑓(𝑥) = 2𝑒 𝑥 + + 2 √𝑥 3 , with x > 0. (b) 𝑓(𝑥) = 4√𝑥 (√𝑥 + 2 ) .
𝑥 𝑥
Consider, for example, the function
$$f(x) = \frac{3x + 2}{2\sqrt{x + 1}}$$
Then it is rather hard to find an antiderivative, but it is easy to check that $F(x) = x\sqrt{x + 1}$ is indeed an
antiderivative.
Since $F(0) = 0$, the antiderivative with value 0 at $x = 0$ is $G(x) = F(x) - F(0) = x\sqrt{x + 1}$.
1.3.2 Exercise
Find the area function of the function $f(x) = x^3$ with domain $D_f = [2,12]$ and calculate the area of the region
below the graph of the function.
Hence the area we are looking for is equal to $A(b) = F(b) - F(a)$. This number is also denoted by $\int_a^b f(x)\,dx$
and it is called the (definite) integral of f from a to b. This leads to the following general definition of an integral
for functions that are not necessarily positive:
Let $F$ be an antiderivative of the function $f$ on the interval $[a, b]$. Then the number
$$\int_a^b f(x)\,dx = F(b) - F(a)$$
is called the (definite) integral of 𝑓 from 𝑎 to 𝑏. The number 𝑎 is called the lower limit and 𝑏 the upper limit of
integration.
Often we write $[F(x)]_a^b$ instead of $F(b) - F(a)$. This notation has the advantage that you can first write down
an antiderivative between the square brackets before filling in the values for 𝑏 and 𝑎.
Obviously it is only possible to evaluate an integral if you can find an antiderivative. Fortunately we have
already seen some elementary integrals and rules of integration in the former section, so that we are already able
to determine a lot of antiderivatives, and thus evaluate also a lot of (definite) integrals. The process of evaluating
an integral is called integration.
$$\int_0^{25} 2x\sqrt{x}\,dx$$
$$\int_0^{25} 2x\sqrt{x}\,dx = \left[\tfrac{4}{5}x^2\sqrt{x}\right]_0^{25} = \tfrac{4}{5} \cdot 25^2\sqrt{25} = 2500$$
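As a sanity check, the value obtained from the antiderivative can be compared with a direct numerical approximation of the area. The sketch below is an illustration only; `midpoint_integral` is a helper assumed here, not a method from the reader.

```python
import math

def integrand(x):
    return 2 * x * math.sqrt(x)

def antiderivative(x):
    return 0.8 * x**2 * math.sqrt(x)  # (4/5) x^2 sqrt(x)

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

exact = antiderivative(25) - antiderivative(0)  # [F(x)] between 0 and 25
numeric = midpoint_integral(integrand, 0, 25)

print(exact, numeric)
```

The two numbers should agree to several decimals; the small remaining gap is the discretisation error of the rectangle rule.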
1.4.1 Exercise
Evaluate the integral
$$\int_1^2 \left(x + \frac{1}{x^3}\right)^2 dx$$
We can consider the area of this region as the difference of the area of the region bounded by the graph of the
(positive) function 𝑓, the 𝑥-axis and the lines 𝑥 = 𝑎 and 𝑥 = 𝑏, denoted by
$$\int_a^b f(x)\,dx$$
and the area of the region bounded by the graph of the (positive) function 𝑔, the 𝑥-axis and the lines 𝑥 = 𝑎 and
𝑥 = 𝑏, denoted by
$$\int_a^b g(x)\,dx.$$
If 𝐹 and 𝐺 are antiderivatives of 𝑓 and 𝑔, respectively, then the area between the graphs of 𝑓 and 𝑔 equals
$$\begin{aligned}
\int_a^b f(x)\,dx - \int_a^b g(x)\,dx &= [F(x)]_a^b - [G(x)]_a^b = F(b) - F(a) - G(b) + G(a) \\
&= [F(x) - G(x)]_a^b = \int_a^b \big(f(x) - g(x)\big)\,dx.
\end{aligned}$$
Now this area will be equal to the area below the graph of the (positive) function −𝑓 and the 𝑥-axis and the lines
𝑥 = 𝑎 and 𝑥 = 𝑏, as will be immediately clear from the following figure:
$$\int_a^b -f(x)\,dx = [-F(x)]_a^b = -[F(x)]_a^b = -\int_a^b f(x)\,dx.$$
We conclude that the integral of a negative function is the opposite of the area of the region between the graph of
that function and the 𝑥-axis.
In the last example of this section we show how the integral of a function that lies partly above and partly below
the horizontal axis can be explained in terms of area(s).
$$\begin{aligned}
\int_a^b f(x)\,dx &= [F(x)]_a^b = F(b) - F(a) = F(c) - F(a) + F(b) - F(c) \\
&= [F(x)]_a^c + [F(x)]_c^b = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx \\
&= \text{area } G_1 - \text{area } G_2.
\end{aligned}$$
In this case the integral represents the area of the region that lies above the 𝑥-axis minus the area of the region
lying beneath it.
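The difference between an integral and an area can be illustrated with a function that changes sign. The sketch below uses the hypothetical function $f(x) = 1 - x$ on $[0, 3]$, which is positive on $[0, 1]$ (region $G_1$) and negative on $[1, 3]$ (region $G_2$); the function and the helper are assumptions chosen for this illustration.

```python
def f(x):
    return 1 - x  # hypothetical example: positive before x = 1, negative after

def midpoint_integral(g, a, b, n=10_000):
    """Approximate the integral of g over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

a, c, b = 0.0, 1.0, 3.0  # c is the point where the graph crosses the x-axis

integral = midpoint_integral(f, a, b)   # signed quantity
area_g1 = midpoint_integral(f, a, c)    # region above the x-axis
area_g2 = -midpoint_integral(f, c, b)   # region below the x-axis, counted positively

print(integral, area_g1, area_g2, area_g1 + area_g2)
```

Here the integral equals `area_g1 - area_g2`, whereas the total area of the two regions is `area_g1 + area_g2`.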
i) $\int_a^b \big(f(x) + g(x)\big)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$
If f is a continuous function and c is a real number, then the integral of the multiple 𝑐 ∙ 𝑓 is the multiple of the integral:
ii) $\int_a^b c \cdot f(x)\,dx = c \cdot \int_a^b f(x)\,dx$
Besides, we have some other obvious properties of definite integrals which will be explained below.
As you know, for a nonnegative continuous function f, the integral from a to b is the area of the region bounded by the
graph of the function, the horizontal axis and the vertical lines 𝑥 = 𝑎 and 𝑥 = 𝑏. If we split the interval [𝑎, 𝑏] into two parts,
from a to t and from t to b, then it will be clear that
iii) $\int_a^b f(x)\,dx = \int_a^t f(x)\,dx + \int_t^b f(x)\,dx$
This property also turns out to hold if the integrand function is not a nonnegative function on the interval of integration. See
also the last example of section 1.5.
iv) $\int_a^a f(x)\,dx = 0$
v) $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$, since
$$\int_a^b f(x)\,dx + \int_b^a f(x)\,dx = \int_a^a f(x)\,dx = 0$$
1.6.1 Exercise
Determine the cost function 𝐶(𝑥) if the marginal cost function is given to be
$$C'(x) = 2x^2 + 2x - 5$$
$$\int_0^{Q^*} f(Q)\,dQ$$
The difference between what they are willing to pay and what they really pay is called consumer surplus. In the figure
below it is the area of the blue region below the demand curve, but above the horizontal line 𝑃 = 𝑃 ∗ . So we conclude that
$$\text{Consumer surplus} = \int_0^{Q^*} f(Q)\,dQ - Q^* \cdot P^*$$
In a similar way we can consider the situation from the producer’s viewpoint. The producer will get a revenue of 𝑃∗ ∙ 𝑄∗ ,
the area of the rectangle in the figure below. But, according to the supply curve the producer would also have been content
with the area of the region below the supply curve, which can be found by integrating the supply function
$$\int_0^{Q^*} g(Q)\,dQ$$
The difference between what they get and what they would have been content with is called producer surplus, the blue
region in the figure below, and can be calculated as follows
$$\text{Producer surplus} = Q^* \cdot P^* - \int_0^{Q^*} g(Q)\,dQ$$
Now the total surplus is the sum of consumer and producer surplus and can be described as the area of the region between
demand and supply curve:
$$\text{Total surplus} = \int_0^{Q^*} f(Q)\,dQ - \int_0^{Q^*} g(Q)\,dQ$$
Now, if supply is below the equilibrium, 𝑄 < 𝑄 ∗ , then the total surplus decreases by the so-called deadweight loss (DWL):
$$\text{DWL} = \int_Q^{Q^*} f(Q)\,dQ - \int_Q^{Q^*} g(Q)\,dQ$$
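To see these formulas at work, the sketch below applies them to the linear demand and supply functions from section 1.1, with equilibrium $Q^* = 120$. The helper names and the choice of a restricted output of 80 are assumptions made for this illustration.

```python
def demand(q):
    return 50 - q / 4  # demand function f from section 1.1

def supply(q):
    return 10 + q / 12  # supply function g from section 1.1

def gap(q):
    """Vertical distance between the demand and the supply curve."""
    return demand(q) - supply(q)

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

Q_STAR = 120

total_surplus = midpoint_integral(gap, 0, Q_STAR)

def deadweight_loss(q):
    """Surplus lost when output is restricted to q < Q*."""
    return midpoint_integral(gap, q, Q_STAR)

print(total_surplus, deadweight_loss(80))
```

The total surplus should equal the sum of the consumer and producer surplus found earlier (1800 + 600), and the deadweight loss at an output of 80 is the area of the small triangle between the curves from 80 to 120.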
Now the Gini coefficient GC is defined as twice the area between the line 𝑦 = 𝑥 and the Lorenz curve. See the figure below.
$$GC = 2\int_0^1 \big(x - L(x)\big)\,dx$$
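The Gini coefficient can be approximated directly from this definition. The sketch below uses the hypothetical Lorenz function $L(x) = x^2$, for which the integral can also be done by hand: $2\left(\tfrac{1}{2} - \tfrac{1}{3}\right) = \tfrac{1}{3}$. The function name `gini` and the example Lorenz curve are assumptions for illustration.

```python
def gini(lorenz, n=10_000):
    """Twice the area between the line y = x and the Lorenz curve on [0, 1]."""
    h = 1.0 / n
    midpoints = ((i + 0.5) * h for i in range(n))
    return 2 * h * sum(x - lorenz(x) for x in midpoints)

# Hypothetical Lorenz curve L(x) = x^2; by hand the Gini coefficient is 1/3.
print(gini(lambda x: x**2))

# Perfect equality, L(x) = x, gives a Gini coefficient of 0.
print(gini(lambda x: x))
```

A Lorenz curve that bends further away from the line $y = x$ produces a larger value.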
1.8.1 Exercise
The pdf of the exponential distribution is
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
Determine the value of $\lambda$ if we know that $P(x \le 2) = \tfrac{1}{2}$.
1.8.2 Exercise
Find the following antiderivatives:
i) $\int x\sqrt{x}\,dx$
ii) $\int 2e^{-3x}\,dx$
1.8.3 Exercise
Find F if $F'(x) = \frac{(x-2)^2}{\sqrt{x}}$ and $F(0) = 3$.
1.8.4 Exercise
Show that $\int x^2 \ln x\,dx = \tfrac{1}{3}x^3 \ln x - \tfrac{1}{9}x^3 + c$
1.8.5 Exercise
Compute the area 𝐴 bounded by
The graph of 𝑓(𝑥) = 𝑥, the x-axis, and the vertical lines 𝑥 = −1 and 𝑥 = 2.
The graph of $f(x) = 3(e^x + e^{-2x})$, the x-axis, and the vertical lines $x = -1$ and $x = 1$.
1.8.6 Exercise
Evaluate the integrals:
i) $\int_1^2 y^2\sqrt{y}\,dy$
ii) $\int_{-1}^{1} 2e^{-3a}\,da$
iii) $\int_0^1 \big(3x + (x-2)^2\big)\,dx$
1.8.7 Exercise
1
Find the area between the two parabolas defined by the equations 𝑦 = 𝑥 2 − 2𝑥 and 𝑦 = 𝑥 2 . (Hint: sketch the graph of the
2
parabolas and determine the intersection points)
1.8.8 Exercise
Application: Lorenz-curve, Gini index and wealth distribution. Calculate the Gini index for the following Lorenz functions:
i) $L(x) = x^3$
ii) $L(x) = \tfrac{1}{3}x^2 + \tfrac{2}{3}x^5$
2. Integration techniques
In order to determine an integral we need to be able to find an antiderivative of all kinds of functions. In the
former chapter we have already seen antiderivatives of some elementary functions, and an extension to sums
and multiples of elementary functions with the arithmetic rules. In this chapter we will discuss two
integration techniques for finding antiderivatives of more complicated functions, namely products of
elementary functions.
In the final section we will consider so-called improper integrals, where the interval of integration is an
unbounded interval or where the integrand function is not defined everywhere in the interval of integration.
As you know, sometimes an antiderivative is also called an integral, or, more precisely, an indefinite integral,
in contrast to the (definite) integral as defined in section 4 of chapter 1, with a lower and upper limit of
integration. An indefinite integral is a function, or even better, a set of functions, but a definite integral is a
value.
So for example
$$\int x^2\,dx = \tfrac{1}{3}x^3 + c$$
$$\int_1^3 x^2\,dx = \left[\tfrac{1}{3}x^3\right]_1^3 = 9 - \tfrac{1}{3} = 8\tfrac{2}{3}$$
$$\int 2x\,e^{-x}\,dx$$
As integration is the inverse of differentiation, and the integrand function is a product of two functions, let’s
consider the product rule for differentiation:
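$$\big(f(x) \cdot g(x)\big)' = f'(x) \cdot g(x) + f(x) \cdot g'(x)$$
Taking antiderivatives on both sides gives
$$\int \big(f(x) \cdot g(x)\big)'\,dx = \int f'(x) \cdot g(x)\,dx + \int f(x) \cdot g'(x)\,dx$$
The left hand side is equal to $f(x) \cdot g(x)$ (up to a constant), so we can rearrange this into
$$\int f(x) \cdot g'(x)\,dx = f(x) \cdot g(x) - \int f'(x) \cdot g(x)\,dx$$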
This can be helpful if the integral at the right hand side (RHS) is easier to determine than the LHS integral.
This rule is called integration by parts.
If we want to apply integration by parts to our example $\int 2x\,e^{-x}\,dx$, then we have two options:
i) Take $f(x) = 2x$ and $g'(x) = e^{-x}$, then $f'(x) = 2$ and $g(x) = -e^{-x}$, resulting in
$$\int 2x\,e^{-x}\,dx = -2x\,e^{-x} + \int 2e^{-x}\,dx$$
The RHS integral is clearly easier to find than the LHS integral: $\int 2e^{-x}\,dx = -2e^{-x} + c$, so the final
solution is
$$\int 2x\,e^{-x}\,dx = -2x\,e^{-x} - 2e^{-x} + c$$
ii) What if we take $f(x) = e^{-x}$ and $g'(x) = 2x$? Then $f'(x) = -e^{-x}$ and $g(x) = x^2$, resulting in
$$\int 2x\,e^{-x}\,dx = x^2 e^{-x} + \int x^2 e^{-x}\,dx$$
This leads nowhere: the RHS integral is more complicated than the LHS integral.
$$\int x^2 e^{-x}\,dx$$
$$\int x^2 e^{-x}\,dx = -x^2 e^{-x} + \int 2x\,e^{-x}\,dx$$
Now the RHS integral is the integral from the former example, so finally we get:
$$\int x^2 e^{-x}\,dx = -x^2 e^{-x} - 2x\,e^{-x} - 2e^{-x} + c$$
Of course we could also have taken a definite integral, for example with limits of integration 0 and 1:
$$\int_0^1 x^2 e^{-x}\,dx = \left[-x^2 e^{-x} - 2x\,e^{-x} - 2e^{-x}\right]_0^1 = 2 - \frac{5}{e}$$
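The outcome of a repeated integration by parts is easy to get wrong by a sign, so a numerical cross-check is worthwhile. The sketch below compares the closed form $2 - \frac{5}{e}$ with the antiderivative evaluated at the limits and with a rectangle-rule approximation; the helper `midpoint_integral` is an assumption introduced for this illustration.

```python
import math

def integrand(x):
    return x**2 * math.exp(-x)

def antiderivative(x):
    # -x^2 e^{-x} - 2x e^{-x} - 2 e^{-x}, written with e^{-x} factored out
    return -(x**2 + 2 * x + 2) * math.exp(-x)

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

closed_form = 2 - 5 / math.e
via_antiderivative = antiderivative(1) - antiderivative(0)
numeric = midpoint_integral(integrand, 0, 1)

print(closed_form, via_antiderivative, numeric)
```

All three values should coincide up to rounding and discretisation error.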
$$\int x^2 \ln x\,dx$$
An obvious choice is $f(x) = \ln x$ and $g'(x) = x^2$. Although $g(x) = \tfrac{1}{3}x^3 + c$ is more complicated than
$g'(x)$, the differentiation of $f$ into the simple function $f'(x) = \frac{1}{x}$ outweighs this disadvantage. The result is
$$\begin{aligned}
\int x^2 \ln x\,dx &= \tfrac{1}{3}x^3 \ln x - \int \tfrac{1}{3}x^3 \cdot \frac{1}{x}\,dx \\
&= \tfrac{1}{3}x^3 \ln x - \tfrac{1}{3}\int x^2\,dx = \tfrac{1}{3}x^3 \ln x - \tfrac{1}{9}x^3 + c
\end{aligned}$$
$$\int \ln x\,dx,$$
the method turns out to work well in this case. The integrand function can be considered as a product of two
functions, namely $\int \ln x\,dx = \int 1 \cdot \ln x\,dx$. Then take again $f(x) = \ln x$, and $g'(x) = 1$, resulting in
$$\int 1 \cdot \ln x\,dx = x \cdot \ln x - \int x \cdot \frac{1}{x}\,dx = x \cdot \ln x - \int 1\,dx = x \cdot \ln x - x + c$$
2.1.1 Exercise
Find the following antiderivatives:
i) $\int 3x\,e^{-2x}\,dx$
ii) $\int x \ln x\,dx$
2.1.2 Exercise
Calculate the following integrals:
i) $\int_0^2 x \ln(x + 1)\,dx$
ii) $\int_0^1 x\,3^x\,dx$
iii) $\int_{-1}^{1} x^2 e^{2x}\,dx$
iv) $\int_1^e (\ln x)^2\,dx$
$$\int 2x\,e^{-x^2}\,dx$$
Although this integral only slightly differs from the integral in the first example in the former section, the
way of finding an antiderivative is completely different.
If you try to solve this integral by parts, then whatever you choose for 𝑓 and 𝑔′, the integral will not become
simpler (try it!).
So, we need to use a different technique. We have already transformed several rules of differentiation into
rules of integration: sum rule, multiplying by a constant and, in this chapter, the product rule, which was
transformed into integration by parts.
A rule that we did not yet transform is the chain rule. And exactly the reverse of this rule turns out to be very
suitable for finding the integral of this example.
So, let’s formulate the chain rule and take the antiderivative. Starting from a composite function 𝐹(𝑔(𝑥)) we
have
$$\big(F(g(x))\big)' = F'(g(x)) \cdot g'(x)$$
$$\int \big(F(g(x))\big)'\,dx = \int F'(g(x)) \cdot g'(x)\,dx$$
or, if we assume that $F$ is an antiderivative of $f$:
$$\int f(g(x)) \cdot g'(x)\,dx = F(g(x)) + C$$
In our example we observe the composite function $f(g(x)) = e^{-x^2}$, with $f(x) = e^{-x}$ and $g(x) = x^2$ (inner
function). Besides, we observe that the second factor in the integrand function is exactly the derivative of the
inner function $g'(x) = 2x$. Because an antiderivative of $f$ is easy to find, $F(x) = -e^{-x}$, the rule of
integration by substitution generates immediately the result
$$\int e^{-x^2} \cdot 2x\,dx = \int f(g(x)) \cdot g'(x)\,dx = F(g(x)) + C = -e^{-x^2} + C$$
Of course it is easy to check whether this result is correct, by taking the derivative of the result and observing
that we get the original integrand function.
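This check by differentiation can also be carried out numerically. The sketch below compares a central-difference derivative of the candidate antiderivative $-e^{-x^2}$ with the integrand $2x\,e^{-x^2}$ at a few arbitrarily chosen points; the helper name and the test points are assumptions for illustration.

```python
import math

def integrand(x):
    return 2 * x * math.exp(-x**2)

def candidate(x):
    return -math.exp(-x**2)  # result of the substitution, with C = 0

def central_difference(F, x, h=1e-6):
    """Numerical derivative of F at x."""
    return (F(x + h) - F(x - h)) / (2 * h)

# The derivative of the candidate should reproduce the integrand.
for x in (-1.0, 0.3, 2.0):
    print(x, integrand(x), central_difference(candidate, x))
```

If the two columns agree at every test point, the candidate is very likely a correct antiderivative; an exact proof still requires the chain rule.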
The rule is called integration by substitution, because we actually perform a substitution. Instead of
integrating with regard to the integration variable 𝑥, we are in fact integrating with regard to a new
integration variable, namely 𝑦 = 𝑔(𝑥).
In order to prevent making mistakes it is often useful to perform this substitution explicitly. It works as
follows:
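Writing the new variable as $u = g(x)$, we have $\frac{du}{dx} = g'(x)$, so $g'(x)\,dx = du$, and the integral
$\int f(g(x)) \cdot g'(x)\,dx$ becomes
$$\int f(u)\,du$$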
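$$\int \frac{x}{\sqrt{3x^2 + 1}}\,dx = \int \frac{1}{\sqrt{3x^2 + 1}} \cdot x\,dx$$
We observe a multiplication of a composite function $f(g(x)) = \frac{1}{\sqrt{3x^2 + 1}}$ and $x$. If we take $g(x) = 3x^2 + 1$ as
the inner function, then $g'(x) = 6x$, which is “almost” $x$. So substitution appears to be the way to follow.
Introduce a new variable $y = 3x^2 + 1$, with $\frac{dy}{dx} = 6x$, so $6x\,dx = dy$.
Now perform the substitution, with $x\,dx = \tfrac{1}{6}\,dy$, resulting in
$$\int \frac{1}{\sqrt{3x^2 + 1}} \cdot x\,dx = \frac{1}{6}\int \frac{1}{\sqrt{y}}\,dy = \frac{1}{6}\int y^{-1/2}\,dy = \frac{1}{3}y^{1/2} + c = \frac{1}{3}\sqrt{y} + c$$
Back substitution of $y = 3x^2 + 1$ gives
$$\int \frac{1}{\sqrt{3x^2 + 1}} \cdot x\,dx = \frac{1}{3}\sqrt{3x^2 + 1} + c$$
$$\int_0^1 \frac{x}{\sqrt{3x^2 + 1}}\,dx$$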
Of course we can apply the complete substitution like we did in the former example and plug in the limits of
integration with result:
$$\int_0^1 \frac{x}{\sqrt{3x^2 + 1}}\,dx = \left[\tfrac{1}{3}\sqrt{3x^2 + 1}\right]_0^1 = \tfrac{1}{3}\sqrt{4} - \tfrac{1}{3} = \tfrac{1}{3}$$
Again introduce the new integration variable 𝑦 = 3𝑥 2 + 1 and perform substitution like before:
$$\int_0^1 \frac{x}{\sqrt{3x^2 + 1}}\,dx = \frac{1}{6}\int_{x=0}^{x=1} \frac{1}{\sqrt{y}}\,dy$$
But now the limits of integration do not correspond to $y$, but to the original integration variable $x$. Instead of
first taking an antiderivative and applying back substitution, we can also change the limits of integration, so that
they fit the new integration variable $y$. Now the lower limit of integration $x = 0$ corresponds to $y = 3 \cdot 0^2 + 1 = 1$ and
the upper limit $x = 1$ to $y = 3 \cdot 1^2 + 1 = 4$, and the integral changes into a proper definite integral with
integration variable $y$, and we do not need to apply back substitution anymore!
$$\int_0^1 \frac{x}{\sqrt{3x^2 + 1}}\,dx = \frac{1}{6}\int_1^4 \frac{1}{\sqrt{y}}\,dy = \left[\tfrac{1}{3}\sqrt{y}\right]_1^4 = \tfrac{2}{3} - \tfrac{1}{3} = \tfrac{1}{3}$$
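Carrying over the limits is a common source of mistakes, so it helps to confirm numerically that the integral in $x$ and the integral in $y = 3x^2 + 1$ give the same value. In the sketch below the new limits are computed from the substitution itself; the helper names are assumptions made for this illustration.

```python
import math

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def inner(x):
    return 3 * x**2 + 1  # the substitution y = g(x)

# Original integral in the variable x, from x = 0 to x = 1.
in_x = midpoint_integral(lambda x: x / math.sqrt(inner(x)), 0, 1)

# Transformed integral in the variable y, with limits y = g(0) and y = g(1).
y_low, y_high = inner(0), inner(1)
in_y = midpoint_integral(lambda y: 1 / (6 * math.sqrt(y)), y_low, y_high)

print(y_low, y_high, in_x, in_y)
```

Both approximations should be close to $\tfrac{1}{3}$, the value found with back substitution.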
We call this integration by substitution with carrying over the limits of integration.
2.2.1 Exercise
Find the following antiderivatives:
i) $\int (x + 1)^5\,dx$
ii) $\int \frac{2x + 3}{x^2 + 3x - 10}\,dx$
iii) $\int 4x\,e^{3x^2 - 4}\,dx$
iv) $\int \frac{(\ln x)^2}{x}\,dx$
2.2.2 Exercise
Calculate the following integrals:
i) $\int_0^2 2x\,(x^2 + 1)^3\,dx$
ii) $\int_0^1 \frac{6x}{\sqrt{x^2 + 3}}\,dx$
iii) $\int_0^{\ln 2} e^x (1 + e^x)\,dx$
iv) $\int_1^e \frac{2 + \ln x}{x}\,dx$
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
where $\lambda > 0$ is a parameter of the function (actually $\frac{1}{\lambda}$ is the mean waiting time). This is called the
exponential probability distribution.
Now the probability that the waiting time is less than 𝑏 (time units: e.g. minutes, hours, days…) is described
by
$$P(X \le b) = \int_0^b \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^b = 1 - e^{-\lambda b}$$
Here the integral is taken from 0 to 𝑏. It will be clear that the probability increases with increasing 𝑏, but it
cannot become bigger than 1. If we let 𝑏 increase to infinity, “taking the limit”, then the probability should
go to 1. In the notation we replace the upper limit by ∞, and we conclude indeed
$$\int_0^\infty \lambda e^{-\lambda x}\,dx = \lim_{b \to \infty} \left[-e^{-\lambda x}\right]_0^b = \lim_{b \to \infty} \left(1 - e^{-\lambda b}\right) = 1$$
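The limit can be watched numerically by evaluating $1 - e^{-\lambda b}$ for growing $b$. The sketch below uses $\lambda = 2$ for the table and $\lambda = 0.5$, $b = 3$ for a cross-check against a rectangle-rule integral of the density; the function names are assumptions for this illustration.

```python
import math

def prob_wait_at_most(b, lam):
    """P(X <= b) for the exponential distribution with parameter lam."""
    return 1 - math.exp(-lam * b)

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# The probability increases with b and approaches 1.
for b in (0.5, 1, 2, 5, 10):
    print(b, prob_wait_at_most(b, lam=2))

# Cross-check of the closed form against the area below the density.
lam, T = 0.5, 3
numeric = midpoint_integral(lambda x: lam * math.exp(-lam * x), 0, T)
print(numeric, prob_wait_at_most(T, lam))
```

The first table never exceeds 1, in line with the interpretation of the integral as a probability.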
As the integrand function is a positive function, the integral can be considered as the area below the graph of
the function, between 0 and ∞. See the graph of the exponential pdf with 𝜆 = 2 below:
Of course we can also define an improper integral with unbounded lower limit of integration.
The general definition is as follows:
$$\int_a^\infty f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx, \quad \text{if this limit exists}$$
$$\int_{-\infty}^b f(x)\,dx = \lim_{a \to -\infty} \int_a^b f(x)\,dx, \quad \text{if this limit exists}$$
These improper integrals are called convergent, if the limit exists, otherwise divergent.
$$\int x^a\,dx = \frac{1}{a+1}x^{a+1} + c$$
If we take the infinite interval of integration $[1, \infty)$, then we need to evaluate the limit
$$\int_1^\infty x^a\,dx = \lim_{b \to \infty} \left[\frac{1}{a+1}x^{a+1}\right]_1^b = \lim_{b \to \infty} \frac{1}{a+1}\left(b^{a+1} - 1\right)$$
Now it depends on the value of $a$ whether the integral is convergent. It is obvious that $b^{a+1}$ has limit 0 if
$a + 1 < 0$, so if $a < -1$, and it has no limit if $a > -1$. So the integral is convergent for $a < -1$ and
divergent for $a > -1$.
Note: if 𝑎 = −1, then an antiderivative is the natural logarithm of 𝑥 (for 𝑥 > 0) and the integral is divergent:
$\int_1^\infty x^{-1}\,dx = \lim_{b \to \infty} \ln b$, and this limit does not exist.
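The three cases can be compared by evaluating the integral from 1 to $b$ for increasingly large $b$. The sketch below uses the closed forms derived above; the function name and the sample exponents $-2$, $-1$ and $-0.5$ are assumptions chosen for this illustration.

```python
import math

def integral_from_1_to(b, a):
    """Integral of x^a over [1, b], using the antiderivatives from the text."""
    if a == -1:
        return math.log(b)
    return (b ** (a + 1) - 1) / (a + 1)

# a = -2 settles down (convergent); a = -1 and a = -0.5 keep growing (divergent).
for a in (-2, -1, -0.5):
    print(a, [integral_from_1_to(b, a) for b in (10, 1_000, 1_000_000)])
```

For $a = -2$ the values approach 1, which equals $-\frac{1}{a+1}$; for the other two exponents they grow without bound, although only slowly for $a = -1$.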
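$$\begin{aligned}
\int_{-\infty}^\infty x\,dx &= \int_{-\infty}^0 x\,dx + \int_0^\infty x\,dx = \lim_{a \to -\infty} \int_a^0 x\,dx + \lim_{b \to \infty} \int_0^b x\,dx \\
&= \lim_{a \to -\infty} \left[\tfrac{1}{2}x^2\right]_a^0 + \lim_{b \to \infty} \left[\tfrac{1}{2}x^2\right]_0^b
\end{aligned}$$
Note: do not make the mistake of writing $\int_{-\infty}^\infty x\,dx = \lim_{a \to \infty} \int_{-a}^a x\,dx$ (which turns out to be 0). This is
definitely wrong, as you adopt the strong assumption that the lower and upper limit of integration go to
infinity in the same way.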
2.3.1 Exercise
Determine whether the following integrals are convergent and, if so, determine the value:
i) $\int_0^\infty \frac{4x}{(1 + x^2)^2}\,dx$
ii) $\int_0^\infty \frac{4x}{1 + x^2}\,dx$
iii) $\int_{-\infty}^\infty \frac{2x}{(1 + x^2)^2}\,dx$
$$\int_0^1 \frac{1}{\sqrt{x}}\,dx$$
The integrand function is not defined at $x = 0$, so we have to define what we mean by this notion. The
integrand function is only defined for $x > 0$. That’s why we try to determine the integral $\int_c^1 \frac{1}{\sqrt{x}}\,dx$, where
$c > 0$, and see what happens if we let $c$ go to zero “from above”, or “from the right”. That is, we take the
“right limit” for $c \to 0$, shorthand notation $c \downarrow 0$:
$$\lim_{c \downarrow 0} \int_c^1 \frac{1}{\sqrt{x}}\,dx = \lim_{c \downarrow 0} \left[2\sqrt{x}\right]_c^1 = \lim_{c \downarrow 0} \left(2 - 2\sqrt{c}\right) = 2$$
Because this limit exists we call our original integral convergent with value 2.
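The behaviour of the integral as the lower limit $c$ shrinks can be tabulated directly. The sketch below evaluates $2 - 2\sqrt{c}$ for a few values of $c$ and cross-checks one of them against a rectangle-rule approximation; the helper names and the sample values of $c$ are assumptions for this illustration.

```python
import math

def integral_from_c_to_1(c):
    """Integral of 1/sqrt(x) over [c, 1], from the antiderivative 2*sqrt(x)."""
    return 2 - 2 * math.sqrt(c)

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# As c decreases to 0 the value approaches 2.
for c in (0.1, 0.01, 1e-4, 1e-8):
    print(c, integral_from_c_to_1(c))

# Cross-check for c = 0.01, where the integrand is still well behaved.
numeric = midpoint_integral(lambda x: 1 / math.sqrt(x), 0.01, 1)
print(numeric)
```

Although the integrand becomes arbitrarily large near $x = 0$, the area below its graph stays bounded by 2.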
Note: if we want to take a limit “from the left”, or “from below”, then we use the bottom-up arrow ↑.
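In general:
If the function $f$ is defined on the interval $(a, b]$, with a non-existing limit at $x = a$, then
$$\int_a^b f(x)\,dx = \lim_{c \downarrow a} \int_c^b f(x)\,dx, \quad \text{if this limit exists.}$$
If the function $f$ is defined on the interval $[a, b)$, with a non-existing limit at $x = b$, then
$$\int_a^b f(x)\,dx = \lim_{c \uparrow b} \int_a^c f(x)\,dx, \quad \text{if this limit exists.}$$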
These improper integrals are called convergent, if the limit exists, otherwise divergent.
Note: also in the case of an unbounded integrand function, if we have two improperties in one integral,
meaning that the function is unbounded at the lower as well as the upper limit of integration, then first split
the integral into two with one improperty each.
2.4.1 Exercise
Determine whether the following integrals are convergent and, if so, determine the value:
i) $\int_0^1 \frac{1}{x\sqrt{x}}\,dx$
ii) $\int_0^1 \frac{1}{x}\,dx$
2.4.2 Exercise
Consider the integral with parameter 𝑎 > 0:
$$\int_0^1 \frac{1}{x^a}\,dx$$
For which positive values of 𝑎 is this integral convergent and what is the outcome in these cases?
What if 𝑎 ≤ 0?
2.4.3 Exercise
Show that the integral
$$\int_0^1 \frac{1}{\sqrt{x}} \cdot \frac{1}{1 - \sqrt{x}}\,dx$$
2.5.1 Exercise
Find the following antiderivatives:
i) $\int x(2x^2 - 3)^4\,dx$
ii) $\int x^3 e^{2x}\,dx$
iii) $\int (2 - x^2)e^x\,dx$
iv) $\int x^3 e^{-x^2}\,dx$
2.5.2 Exercise
Calculate the following integrals:
i) $\int_0^{5.5} (0.5 + 0.2x)e^{-0.03x}\,dx$
ii) $\int_0^1 x\sqrt{x + 1}\,dx$
iii) $\int_0^1 \frac{x^3}{(1 + x^2)^2}\,dx$
iv) $\int_1^4 x \ln(x + 1)\,dx$
2.5.3 Exercise
Determine whether the following integrals are convergent and, if so, determine the value:
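i) $\int_0^1 \frac{1}{\sqrt{1 - x}}\,dx$
ii) $\int_0^\infty x\,e^{-2x^2}\,dx$
iii) $\int_0^\infty \frac{1}{\sqrt[3]{x}}\,dx$
iv) $\int_{-\infty}^\infty x\,e^{-2x^2}\,dx$
v) $\int_{-\infty}^0 \frac{e^{2x}}{e^{2x} + 3}\,dx$
vi) $\int_{-\infty}^0 \frac{e^{2x} + 3}{e^{2x}}\,dx$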
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
i) Find an explicit expression for the probability $P(a \le X \le b)$ that $X$ is between $a$ and $b$, where $0 \le a \le b$.
ii) Determine the mean of $X$ using the integral $E(X) = \int_0^\infty x f(x)\,dx$. (*)
The median of a continuous probability distribution is defined as the number $m$ with the property that the
probability for $X$ being smaller than or equal to this number is equal to 0.5, so $P(X \le m) = \tfrac{1}{2}$.
iii) Show that the median of the exponential distribution is equal to $\frac{\ln 2}{\lambda}$.
(*) You will need the limit property $\lim_{x \to \infty} \frac{x}{e^x} = 0$.