
Lecture Notes on

Computational and Applied Mathematics


(MAT 3310, 3 units)

Academic year 2009/2010: second term

Prepared by
Jun ZOU (2006)[a] and Patrick CIARLET (2010)[b]

We have prepared the lecture notes purely for the convenience of our teaching. We gratefully acknowledge the contribution of Prof. I-Liang CHERN (National Taiwan University, Taipei, Taiwan), made during his visit to the Dept of Mathematics of CUHK in 2009.

[a] Department of Mathematics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
[b] Dept of Applied Mathematics, Ecole Nationale Superieure de Techniques Avancees, Paris, France

At some locations, we put a special symbol in the margin. It marks paragraphs that may be omitted on a first reading.

Numerical methods for differential equations

In this chapter, we shall introduce two popular numerical methods for solving boundary
value problems. The first class is the finite difference method. The second class is the
finite element method.

3.1 Finite difference method

3.1.1 Principle

Consider the Sturm-Liouville equation with homogeneous Dirichlet boundary conditions

$$-(c\,u')'(x) + q(x)u(x) = f(x), \qquad a < x < b, \tag{3.1}$$

$$u(a) = 0, \qquad u(b) = 0. \tag{3.2}$$

The finite difference method builds an approximate solution defined at a finite number of grid points: $x_0 < x_1 < \cdots < x_{n+1}$. For simplicity, we choose uniform grids, that is, given $n \in \mathbb{N}$, we define the grid size $h = (b-a)/(n+1)$ and introduce equally spaced grid points $x_i = a + ih$, $i = 0, \ldots, n+1$. In other words, the finite difference solution will

[Figure 3.1: Example of a grid with $n = 4$; for instance $x_3 = 0.6\,(b-a) + a$.]

not be defined over the whole interval $[a,b]$, but at a finite number of points in this interval. However, the location of the points and their number can vary. In particular, they can become arbitrarily close to one another, and their number might be as large as one wants.
Evidently, we note that we have, for the boundary conditions,
$$u(x_0) = 0, \qquad u(x_{n+1}) = 0,$$
so the approximation of these values is straightforward!


The next step is to approximate the differential equation itself. For small $\Delta x > 0$, we know that the derivative of $u$ at $x$ can be approximated by

forward differencing: $\quad u'(x) = \big(u(x + \Delta x) - u(x)\big)/\Delta x + O(\Delta x)$;

backward differencing: $\quad u'(x) = \big(u(x) - u(x - \Delta x)\big)/\Delta x + O(\Delta x)$;

centered differencing: $\quad u'(x) = \big(u(x + \Delta x) - u(x - \Delta x)\big)/(2\Delta x) + O\big((\Delta x)^2\big)$.
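These orders of convergence are easy to observe numerically. The following Python sketch (our illustration, not part of the original notes) compares the three difference quotients on $u(x) = \sin x$:

```python
import numpy as np

u, du = np.sin, np.cos   # test function and its exact derivative
x = 1.0                  # point at which u'(x) is approximated

for dx in (1e-1, 1e-2, 1e-3):
    fwd = (u(x + dx) - u(x)) / dx             # error O(dx)
    bwd = (u(x) - u(x - dx)) / dx             # error O(dx)
    ctr = (u(x + dx) - u(x - dx)) / (2 * dx)  # error O(dx^2)
    print(dx, abs(fwd - du(x)), abs(bwd - du(x)), abs(ctr - du(x)))
```

Dividing $\Delta x$ by ten divides the forward and backward errors by ten, but the centered error by one hundred.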

Indeed, these results can be proved easily by Taylor expansion. With this, we derive an approximation of (3.1) at $x_i$, for $i = 1, \ldots, n$, in two steps. First, we apply centered differencing at $x_i$ with $\Delta x = h/2$, using data at $x_{i+1/2} := a + (i + 1/2)h$ and $x_{i-1/2} := a + (i - 1/2)h$. If $u$ is a solution to (3.1), then we have at $x = x_i$
$$\frac{1}{h}\Big( c(x_{i-1/2})\,u'(x_{i-1/2}) - c(x_{i+1/2})\,u'(x_{i+1/2}) \Big) + q(x_i)u(x_i) = f(x_i) + O(h^2).$$
Next, we approximate the derivatives at $x_{i-1/2}$ and $x_{i+1/2}$ again by centered differencing:
$$u'(x_{i-1/2}) = \frac{1}{h}\big( u(x_i) - u(x_{i-1}) \big) + O(h^2), \qquad u'(x_{i+1/2}) = \frac{1}{h}\big( u(x_{i+1}) - u(x_i) \big) + O(h^2).$$

Then equation (3.1) at $x_i$, for $i = 1, \ldots, n$, can be replaced by
$$\frac{1}{h^2}\Big( c(x_{i-1/2})\big(u(x_i) - u(x_{i-1})\big) - c(x_{i+1/2})\big(u(x_{i+1}) - u(x_i)\big) \Big) + q(x_i)u(x_i) = f(x_i) + O(h). \tag{3.3}$$

Finally, the finite difference method applied to (3.1) and (3.2) consists of using grid values $U_i$, for $i = 0, \ldots, n+1$, to approximate the solution $u(x_i)$, for $i = 0, \ldots, n+1$. So, we choose grid values $(U_i)_{i=0,\ldots,n+1}$ that satisfy
$$\frac{1}{h^2}\Big( c(x_{i-1/2})(U_i - U_{i-1}) - c(x_{i+1/2})(U_{i+1} - U_i) \Big) + q(x_i)U_i = f(x_i), \quad i = 1, \ldots, n, \tag{3.4}$$
$$U_0 = 0, \qquad U_{n+1} = 0. \tag{3.5}$$

The resulting scheme is called a three-point scheme. Indeed, the relations (3.4) involve the three unknowns $U_{i-1}$, $U_i$ and $U_{i+1}$.

[Figure 3.2: Three-point scheme, linking the grid indices $i-1$, $i$ and $i+1$.]

On the other hand, there are $n$ equations for the $n$ unknowns $U_1, \ldots, U_n$. As we have seen in chapter 1, we can rewrite these equations as a square linear system[45]:
$$K\vec{U} + Q\vec{U} = \vec{F}$$
where $K$ is the stiffness matrix and $Q$ is the mass matrix, both in $\mathbb{R}^{n \times n}$, and $\vec{F}$ belongs to $\mathbb{R}^n$, with:

1. $K_{i,i} = \big(c(x_{i-1/2}) + c(x_{i+1/2})\big)/h^2$ for $1 \le i \le n$, $K_{i,i-1} = -c(x_{i-1/2})/h^2$ for $2 \le i \le n$, $K_{i,i+1} = -c(x_{i+1/2})/h^2$ for $1 \le i \le n-1$, and $K_{i,j} = 0$ else;
2. $Q_{i,i} = q(x_i)$ for $1 \le i \le n$, and $Q_{i,j} = 0$ else;
3. $F_i = f(x_i)$ for $1 \le i \le n$.

From the above, we note that $Q$ is a diagonal matrix, whereas $K$ is a tridiagonal matrix, since its non-zero elements are gathered on three diagonals.
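To make the scheme concrete, here is a minimal Python sketch (ours, not part of the original notes) that assembles $K$, $Q$ and $\vec{F}$ for given data functions c, q, f (assumed vectorized) and solves the three-point scheme (3.4)-(3.5):

```python
import numpy as np

def solve_fd(c, q, f, a, b, n):
    """Three-point finite difference scheme (3.4)-(3.5) on a uniform grid."""
    h = (b - a) / (n + 1)
    x = a + h * np.arange(n + 2)   # grid points x_0, ..., x_{n+1}
    xm = x[:-1] + h / 2            # half points x_{1/2}, ..., x_{n+1/2}
    cm = c(xm)
    K = np.zeros((n, n))
    for i in range(n):             # row i corresponds to grid point x_{i+1}
        K[i, i] = (cm[i] + cm[i + 1]) / h**2      # c(x_{i-1/2}) + c(x_{i+1/2})
        if i > 0:
            K[i, i - 1] = -cm[i] / h**2           # sub-diagonal
        if i < n - 1:
            K[i, i + 1] = -cm[i + 1] / h**2       # super-diagonal
    Q = np.diag(q(x[1:-1]))        # diagonal mass matrix
    F = f(x[1:-1])
    U = np.zeros(n + 2)            # boundary values U_0 = U_{n+1} = 0
    U[1:-1] = np.linalg.solve(K + Q, F)
    return x, U
```

For instance, the call solve_fd(lambda x: 1 + 0*x, lambda x: 0*x, lambda x: 1 + 0*x, 0.0, 1.0, 9) corresponds to the model problem of the next subsection with $f = 1$.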
To have more grid points, one increases the value of n. In the process, the grid
size h diminishes, so one can hope to have a better precision, since the terms that we
neglect are in O(h). We study the accuracy of the finite difference method in the next
subsection.
3.1.2 Accuracy of the finite difference method

We consider here a simple example, with $c(x) = 1$ and $q(x) = 0$, for all $x \in\ ]0,1[$:
$$-u''(x) = f(x), \qquad 0 < x < 1, \tag{3.6}$$
$$u(0) = 0, \qquad u(1) = 0. \tag{3.7}$$

For instance, in the particular case where $f(x) = 1$, for all $x$, one can check that the solution is equal to
$$u_0(x) = \frac{1}{2}x(1-x). \tag{3.8}$$
Indeed, $u_0'(x) = \tfrac12 - x$ and $u_0''(x) = -1$, so $-u_0'' = 1$ and $u_0(0) = u_0(1) = 0$. This elementary result will be useful later on...
[45] In principle, we can solve this linear system of equations in $\vec{U}$ by Gaussian elimination or by some iterative method, such as the Jacobi method (see 5.4.2) or the Gauss-Seidel method (see 5.4.3), provided that $K + Q$ is invertible.

Assume that the solution $u$ belongs to $C^4([0,1])$ or, equivalently, that the data $f$ belongs to $C^2([0,1])$.
To analyze the approximation of (3.6)-(3.7) by the finite difference method, let us prove the technical result below, which makes explicit the term in $O(h)$ in (3.3).
Lemma 3.1. Let $x \in\ ]0,1[$ and $h$ such that $[x-h, x+h] \subset [0,1]$. Then there holds
$$\exists \theta \in\ ]-1,1[, \qquad -u''(x) = \frac{-u(x-h) + 2u(x) - u(x+h)}{h^2} + \frac{h^2}{12}\,u^{(4)}(x + \theta h). \tag{3.9}$$

Proof. One uses the Taylor-Lagrange expansion formula, which writes
$$\exists \theta_- \in\ ]-1,0[, \qquad u(x-h) = u(x) - h\,u'(x) + \frac{h^2}{2}u''(x) - \frac{h^3}{6}u'''(x) + \frac{h^4}{24}u^{(4)}(x + \theta_- h),$$
$$\exists \theta_+ \in\ ]0,1[, \qquad u(x+h) = u(x) + h\,u'(x) + \frac{h^2}{2}u''(x) + \frac{h^3}{6}u'''(x) + \frac{h^4}{24}u^{(4)}(x + \theta_+ h).$$
Summing up these two equalities, one finds
$$-u''(x) = \frac{-u(x-h) + 2u(x) - u(x+h)}{h^2} + \frac{h^2}{24}\Big( u^{(4)}(x + \theta_- h) + u^{(4)}(x + \theta_+ h) \Big).$$
To reach formula (3.9), one must remember the intermediate value theorem. Since $u^{(4)}$ is continuous, one can finally replace the sum $u^{(4)}(x + \theta_- h) + u^{(4)}(x + \theta_+ h)$ by $2u^{(4)}(x + \theta h)$, with a parameter $\theta$ that belongs to $[\theta_-, \theta_+] \subset\ ]-1,1[$.
Remark 3.1. The first term in the right-hand side of (3.9) is a good approximation of $-u''(x)$ when $h$ is small. Indeed, thanks to the relation $u^{(4)} = -f''$ (differentiate $-u'' = f$ twice), we know that
$$\Big| \frac{h^2}{12}\,u^{(4)}(x + \theta h) \Big| \le \frac{h^2}{12}\,C_{f,2}, \qquad \text{where } C_{f,2} = \sup_{x \in [0,1]} |f''(x)|.$$
As before, we define the grid size $h = 1/(n+1)$ and introduce equally spaced grid points $x_i = ih$, $i = 0, \ldots, n+1$. According to lemma 3.1, the finite difference method applied to (3.6) and (3.7) consists of using grid values $U_i$, for $i = 0, \ldots, n+1$, that satisfy
$$\frac{1}{h^2}\big( -U_{i-1} + 2U_i - U_{i+1} \big) = f(x_i), \quad i = 1, \ldots, n,$$
$$U_0 = 0, \qquad U_{n+1} = 0.$$


We can rewrite this equivalently as
$$A\vec{U} = \vec{F}, \qquad \text{where } A = \frac{1}{h^2}\begin{pmatrix} 2 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & \ddots & \vdots \\ 0 & \ddots & \ddots & \ddots & 0 \\ \vdots & \ddots & -1 & 2 & -1 \\ 0 & \cdots & 0 & -1 & 2 \end{pmatrix} \in \mathbb{R}^{n \times n}. \tag{3.10}$$

In addition to being tridiagonal, $A$ is also symmetric. We have now to check that there exists one, and only one, solution $\vec{U}$ to the linear system (3.10). In addition, is it possible to compute and/or to bound the pointwise error, that is $|u(x_i) - U_i|$, for $i = 0, \ldots, n+1$? This is the focus of the next series of results. First of all, we check that $A$ is invertible, to obtain existence and uniqueness of $\vec{U}$. Moreover, we build in the process an explicit formula, expressing the error as a function of the data $f(x)$. Then, to bound the error, we study some features of the inverse $A^{-1}$.
Definition 3.1. A vector $\vec{v}$ of $\mathbb{R}^n$ is positive when $v_i \ge 0$, for all $i = 1, \ldots, n$.
A matrix $A$ of $\mathbb{R}^{n \times n}$ is positive when $A_{i,j} \ge 0$, for all $i = 1, \ldots, n$, $j = 1, \ldots, n$.
A matrix $A$ of $\mathbb{R}^{n \times n}$ is monotone when it is invertible, with positive inverse.
Before going back to matrix $A$, let us give a simple (and useful) characterization of monotone matrices.
Lemma 3.2. A matrix $A$ of $\mathbb{R}^{n \times n}$ is monotone if, and only if, the inclusion
$$\{\vec{v} \in \mathbb{R}^n : A\vec{v} \ge 0\} \subset \{\vec{v} \in \mathbb{R}^n : \vec{v} \ge 0\} \tag{3.11}$$
holds.
Proof. Assume that $A$ is monotone.
Let $\vec{v}$ be such that $A\vec{v} \ge 0$; then $\vec{v} = A^{-1}(A\vec{v})$ and one finds that, for $i = 1, \ldots, n$, $v_i = \sum_j (A^{-1})_{i,j}(A\vec{v})_j \ge 0$, since by assumption $(A^{-1})_{i,j}$ and $(A\vec{v})_j$ are positive. It follows that $\vec{v} \ge 0$, and the inclusion (3.11) holds.
Reciprocally, if the inclusion holds, let us show first that $A$ is invertible.
Let $\vec{v}$ be such that $A\vec{v} = 0$: one has $A\vec{v} \ge 0$ and $A(-\vec{v}) \ge 0$, which implies $\vec{v} \ge 0$ and $-\vec{v} \ge 0$, that is $\vec{v} = 0$. Invertibility follows.
Knowing that $A^{-1}$ exists, let us now prove that it is positive...
Denote by $(\vec{e}_i)_{i=1,\ldots,n}$ the canonical orthonormal basis of $\mathbb{R}^n$. Then, the vectors $\vec{f}_i = A^{-1}\vec{e}_i$, for $i = 1, \ldots, n$, are respectively the column vectors of $A^{-1}$. Evidently, one has $\vec{e}_i = A\vec{f}_i$, for $i = 1, \ldots, n$. Then, the inclusion (3.11) says that $\vec{f}_i$ is positive, because $\vec{e}_i$ is positive. In other words, all entries of $A^{-1}$ are positive.
We conclude that the matrix $A$ is monotone.
Lemma 3.3. The matrix $A$ defined at (3.10) is monotone.
Proof. To prove that $A$ is monotone, let us use lemma 3.2. Given $\vec{v}$ such that $A\vec{v} \ge 0$, define $v_k = \min_{i=1,\ldots,n} v_i$ (or, equivalently, $v_k$ such that $v_k \le v_i$, for $i = 1, \ldots, n$). The goal is to prove that $v_k \ge 0$, so that it follows that all components of $\vec{v}$ are positive.
We can write $A\vec{v} \ge 0$ as
$$2v_1 - v_2 \ge 0, \qquad -v_{i-1} + 2v_i - v_{i+1} \ge 0 \ \ (2 \le i \le n-1), \qquad -v_{n-1} + 2v_n \ge 0.$$
If $v_k = v_1$, one uses the first equation and then the definition of $v_k$ to find
$$v_k \ge v_2 - v_k \ge 0.$$
A similar conclusion holds if $v_k = v_n$ (one uses the last equation).
If $k \in \{2, \ldots, n-1\}$, one finds
$$(v_k - v_{k-1}) + (v_k - v_{k+1}) \ge 0.$$
But, we know that $v_k \le v_{k-1}$ and $v_k \le v_{k+1}$, which implies $(v_k - v_{k-1}) + (v_k - v_{k+1}) \le 0$. Combining these inequalities yields
$$v_{k-1} = v_{k+1} = v_k\,!$$
By induction, one concludes easily that $v_1 = \cdots = v_{k-1} = v_k = v_{k+1} = \cdots = v_n$. Then, the first (or the last) equation yields again $v_k \ge 0$.
Since $A$ is monotone, it is also invertible: this property will be useful to study the error between the exact solution and the finite difference solution. Let $\vec{e}$ be the vector of $\mathbb{R}^n$ representing the error, whose components are equal to $e_i = U_i - u(x_i)$, for $i = 1, \ldots, n$. Similarly to the vector $\vec{U}$, we adopt the convention $e_0 = e_{n+1} = 0$ (fully justified by the fact that $u(0) = u(1) = 0$, and $U_0 = U_{n+1} = 0$!). We know that $u$ (resp. $\vec{U}$) is the solution to (3.6)-(3.7) (resp. (3.10)), so, for $i = 1, \ldots, n$:
$$\begin{aligned}
(A\vec{e})_i = \frac{-e_{i-1} + 2e_i - e_{i+1}}{h^2}
&= (A\vec{U})_i - \frac{-u(x_{i-1}) + 2u(x_i) - u(x_{i+1})}{h^2} \\
&= f(x_i) - \frac{-u(x_i - h) + 2u(x_i) - u(x_i + h)}{h^2} \\
&\overset{(3.9)}{=} f(x_i) + u''(x_i) + \frac{h^2}{12}u^{(4)}(x_i + \theta_i h), \quad \text{with } \theta_i \in\ ]-1,1[ \\
&= \frac{h^2}{12}u^{(4)}(x_i + \theta_i h) = -\frac{h^2}{12}f''(x_i + \theta_i h), \quad \text{with } \theta_i \in\ ]-1,1[.
\end{aligned}$$
Similarly to remark 3.1, let us introduce the vector $\vec{\varepsilon}$ of $\mathbb{R}^n$, whose components are equal to $\varepsilon_i = (A\vec{e})_i = -\tfrac{h^2}{12} f''(x_i + \theta_i h)$, for $i = 1, \ldots, n$. By construction, its infinity norm $\|\vec{\varepsilon}\|_\infty := \max_{i=1,\ldots,n}|\varepsilon_i|$ is such that $\|\vec{\varepsilon}\|_\infty \le \tfrac{h^2}{12}\,C_{f,2}$, and the error can be expressed as
$$\vec{e} = A^{-1}\vec{\varepsilon}. \tag{3.12}$$

To carry on, let us use the property that all entries of $A^{-1}$ are positive. Using the formula (3.12) to express the error, one can write
$$|e_i| = \Big| \sum_j (A^{-1})_{i,j}\,\varepsilon_j \Big| \le \sum_j (A^{-1})_{i,j}\,|\varepsilon_j| \le \sum_j (A^{-1})_{i,j}\,\|\vec{\varepsilon}\|_\infty \le \frac{h^2}{12}\,C_{f,2} \sum_j (A^{-1})_{i,j}. \tag{3.13}$$
To be able to bound the error, one has to provide an upper bound of $\sum_j (A^{-1})_{i,j}$ in (3.13), for $i = 1, \ldots, n$.

Lemma 3.4. For each row of $A^{-1}$, the sum of its entries is lower than or equal to $1/8$.
Proof. We remark that $\sum_j (A^{-1})_{i,j} = \sum_j (A^{-1})_{i,j}\,\sigma_j$, when $\sigma_j = 1$, for $j = 1, \ldots, n$. What is the meaning of the corresponding vector $\vec{\sigma}$? Let $\vec{U}_0 = A^{-1}\vec{\sigma}$, or $A\vec{U}_0 = \vec{\sigma}$. The vector $\vec{\sigma}$ can be seen as a special right-hand side of (3.10). It corresponds to $f(x) = 1$, for all $x$, which sends us back to the solution $u_0$ of (3.8). But, in this very special case, one has exactly
$$-u_0''(x) = \frac{-u_0(x-h) + 2u_0(x) - u_0(x+h)}{h^2}, \qquad \forall x \in\ ]0,1[, \ \forall h \text{ s.t. } [x-h, x+h] \subset [0,1],$$
since $u_0$ is a quadratic polynomial. Thus, if we define $\vec{U}_0$ with entries $(U_0)_i = u_0(x_i)$ for $i = 1, \ldots, n$, we find
$$A\vec{U}_0 = \vec{\sigma}, \quad \text{that is} \quad \vec{U}_0 = A^{-1}\vec{\sigma}, \quad \text{or} \quad \sum_j (A^{-1})_{i,j} = u_0(x_i), \quad 1 \le i \le n.$$
Finally, one can check that $\sup_{x \in [0,1]} u_0(x) = u_0(1/2) = 1/8$, thus leading to the conclusion.
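Both properties of $A^{-1}$ are easy to observe numerically; a small Python sketch (our illustration):

```python
import numpy as np

n = 9
h = 1.0 / (n + 1)
A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
Ainv = np.linalg.inv(A)
print((Ainv >= 0).all())        # True: A^{-1} is entrywise positive (lemma 3.3)
print(Ainv.sum(axis=1).max())   # 0.125 = 1/8, attained at the middle row (lemma 3.4)
```

The row sums of $A^{-1}$ are exactly the values $u_0(x_i) = \tfrac12 x_i(1-x_i)$, maximal at the grid point closest to $1/2$.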
Therefore, we have proved the result below.

Theorem 3.1. When the solution $u$ to the boundary value problem (3.6)-(3.7) belongs to $C^4([0,1])$, the pointwise error of the finite difference method can be bounded by
$$\|\vec{e}\|_\infty \le \frac{h^2}{96} \sup_{x \in [0,1]} |f''(x)|, \qquad \text{where } e_i = U_i - u(x_i), \text{ for } i = 1, \ldots, n. \tag{3.14}$$
So, one concludes that the pointwise error goes uniformly to $0$ like $h^2$.
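To illustrate theorem 3.1, one can manufacture a solution and watch the error decrease like $h^2$. A minimal sketch (assuming the solve_fd helper sketched in section 3.1.1; the exact solution $u(x) = \sin(\pi x)$ is our choice, not taken from the notes):

```python
import numpy as np

u_exact = lambda x: np.sin(np.pi * x)
f = lambda x: np.pi**2 * np.sin(np.pi * x)   # so that -u'' = f, u(0) = u(1) = 0

for n in (9, 19, 39, 79):
    x, U = solve_fd(lambda s: 1 + 0*s, lambda s: 0*s, f, 0.0, 1.0, n)
    h = 1.0 / (n + 1)
    print(h, np.abs(U - u_exact(x)).max())   # divided by ~4 each time h is halved
```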
Remark 3.2. When the grid size $h$ decreases, the number of grid points $n$ grows in inverse proportion. In other words, the maximal pointwise error decreases as $h^2$, whereas the number of grid points grows like $1/h$.
However, these estimates hinge crucially on the regularity of $u$, since we used fourth order Taylor expansions. What can one prove when the regularity of $u$ is weaker? Let us give another classical result, stated without proof[46].

Theorem 3.2. When the solution $u$ belongs to $C^2([0,1])$, the pointwise error of the finite difference method can be estimated by
$$\lim_{h \to 0^+} \|\vec{e}\|_\infty = 0. \tag{3.15}$$

Remark 3.3. Similar results hold for (3.1)-(3.2) in the case where $c(x)$ and $q(x)$ are continuous functions of $x$, as long as they fulfill the positivity conditions (2.26), with $]a,b[$ replacing $]0,1[$.
To conclude, let us prove that $A$ is symmetric positive definite (so that one can use the Cholesky factorization to solve the linear system (3.10)).

[46] Prove this result, using a second order Taylor-Lagrange expansion, together with the fact that $u''$ is uniformly continuous over $[0,1]$.

Lemma 3.5. The matrix $A$ defined at (3.10) is symmetric positive definite.

Proof. Clearly, the matrix $A$ is symmetric.
Let us check that it is also positive definite. Using the definition of positive definiteness, we form $h^2(\vec{v}, A\vec{v})_{\mathbb{R}^n}$, for any vector $\vec{v}$ of $\mathbb{R}^n$:
$$\begin{aligned}
h^2(\vec{v}, A\vec{v})_{\mathbb{R}^n} = h^2 \sum_{i=1}^n v_i (A\vec{v})_i
&= v_1(2v_1 - v_2) + \sum_{i=2}^{n-1} v_i(-v_{i-1} + 2v_i - v_{i+1}) + v_n(-v_{n-1} + 2v_n) \\
&= 2\sum_{i=1}^n v_i^2 - 2\sum_{i=1}^{n-1} v_i v_{i+1} \\
&= v_1^2 + v_n^2 + \sum_{i=1}^{n-1}\big( v_i^2 - 2v_i v_{i+1} + v_{i+1}^2 \big) = v_1^2 + v_n^2 + \sum_{i=1}^{n-1}(v_i - v_{i+1})^2.
\end{aligned}$$
Thus, $(\vec{v}, A\vec{v})_{\mathbb{R}^n} \ge 0$. Besides, having $(\vec{v}, A\vec{v})_{\mathbb{R}^n} = 0$ implies that $v_1 = v_n = 0$, and $v_i = v_{i+1}$, for $i = 1, \ldots, n-1$. It follows by induction that $v_i = 0$, for $i = 2, \ldots, n-1$, so that $\vec{v} = 0$.
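Since $A$ is symmetric positive definite, numpy's Cholesky routine succeeds on it; a short sketch (our illustration) of the resulting solve:

```python
import numpy as np

n, h = 9, 0.1
A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
L = np.linalg.cholesky(A)     # would raise LinAlgError if A were not SPD
F = np.ones(n)
y = np.linalg.solve(L, F)     # solve L y = F
U = np.linalg.solve(L.T, y)   # then L^T U = y
```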

3.2 Finite element method: Galerkin's approach

In the above finite difference approach, we have assumed that $u$ and the coefficient $c$ are smooth (that is, of sufficient regularity to be able to perform Taylor expansions). In some models however, the coefficient may not be smooth and the corresponding solution may not belong to $C^2([a,b])$. In this case, a weak formulation is favored. Indeed, as we observed in chapter 2, it requires less regularity of the solution $u$ and of the coefficient.
To solve a given boundary value problem, the Galerkin method consists of three steps:
- the variational formulation of the boundary value problem;
- the construction of a finite dimensional space and of a discrete problem;
- the solution of the discrete problem.
3.2.1 Weak formulation

Let us take the following two-point boundary value problem as an example:
$$-(c\,u')'(x) + q(x)u(x) = f(x), \quad a < x < b, \qquad (cu')|_{x=a} = \alpha, \quad u|_{x=b} = \beta. \tag{3.16}$$

To derive the variational problem, we multiply (3.16) by a test function $v(x)$ satisfying the homogeneous boundary condition[47] $v|_{x=b} = 0$, then integrate over $]a,b[$ to obtain the following variational formulation of the boundary value problem (3.16):
$$\text{Find } u \text{ with } u|_{x=b} = \beta \text{ such that } a(u,v) = g(v) \quad \forall v \text{ with } v|_{x=b} = 0, \tag{3.17}$$
where $a(\cdot,\cdot)$ and $g(\cdot)$ are forms respectively given by
$$a(u,v) = \int_a^b (c\,u'v' + q\,uv)\,dx, \qquad g(v) = \int_a^b f\,v\,dx - \alpha\,v|_{x=a}.$$
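The role of the condition $v|_{x=b} = 0$, and the origin of the boundary term in $g$, become transparent if one carries out the integration by parts explicitly:
$$\int_a^b -(cu')'v\,dx = \big[-cu'v\big]_a^b + \int_a^b cu'v'\,dx = (cu')(a)\,v(a) - (cu')(b)\,v(b) + \int_a^b cu'v'\,dx.$$
Since $v(b) = 0$, the unknown flux $(cu')(b)$ drops out, while $(cu')(a) = \alpha$ is a datum; moving $\alpha\,v(a)$ to the right-hand side yields exactly $a(u,v) = g(v)$.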

We assume that the positivity conditions (2.26) hold, to ensure that $a(\cdot,\cdot)$ is a strictly positive form, so that the existence and uniqueness of the solution $u(x)$ is guaranteed.
Next, we build an approximation of the weak formulation. To that aim, we approximate on the one hand the solution $u(x)$ (see 3.2.2), and on the other hand we define some appropriate sets of test functions (see 3.2.3).
3.2.2 Approximation of functions: trial functions

We want to approximate the exact solution $u$ by some functions called trial functions. That is,
$$u(x) \approx u_h(x) := \sum_i U_i\,\phi_i(x),$$
where the functions $(\phi_i)_i$ form a basis of the trial functions. The index $h$ refers to the approximation. The number of basis functions is finite. Then the problem is reduced to finding the coefficients $(U_i)_i$, which is a finite dimensional problem.
Remark 3.4. There is a fundamental difference between the finite element method and the finite difference method. Indeed, with the finite difference method, we approximate the values of the solution at the grid points only, whereas for the finite element method, we approximate the solution over the whole interval, with $u_h(x)$, $x \in [a,b]$.
A simple set of basis functions is as follows. We divide the interval $[a,b]$ into $(N+1)$ intervals using the points
$$\mathcal{T}_h : \quad a = x_0 < x_1 < \cdots < x_N < x_{N+1} = b.$$

[47] Why?


The points $x_0, \ldots, x_{N+1}$ are called the mesh points. We shall denote the length of the interval $]x_{i-1}, x_i[$ by $h_i$, for $i = 1, \ldots, N+1$. The mesh size $h$ is defined by
$$h := \max_{i=1,\ldots,N+1} h_i,$$
the maximal length of an interval (it is this index $h$ that is used for the approximation).

[Figure 3.3: Mesh points and intervals $]x_0,x_1[$, $]x_1,x_2[$, ..., $]x_N,x_{N+1}[$ between $a$ and $b$.]


Then we can construct one basis function $\phi_i(x)$ for each mesh point $x_i$, $i = 0, \ldots, N+1$, such that
$$\phi_i(x_j) = \delta_{ij} := \begin{cases} 1 & \text{if } j = i \\ 0 & \text{if } j \ne i \end{cases}. \tag{3.27}$$
This generates the finite element space
$$\text{span}\,\{\phi_0, \phi_1, \ldots, \phi_{N+1}\}.$$
It is indeed a finite dimensional vector space, of dimension $N+2$. As a matter of fact, the functions $(\phi_i(x))_{i=0,\ldots,N+1}$ are automatically linearly independent, according to the definition (3.27). Consider a vanishing linear combination, $\sum_{i=0}^{N+1} \lambda_i\,\phi_i(x) = 0$ for $x \in [a,b]$: taking $x = x_j$ successively for $j = 0, \ldots, N+1$ yields $\lambda_j = 0$, for $j = 0, \ldots, N+1$.
There are many choices for the basis functions $(\phi_i(x))_{i=0,\ldots,N+1}$ which satisfy the conditions (3.27). Very popular ones are the continuous, piecewise polynomial basis functions.
Let us discuss a very important point, the regularity issue. We approximate a second order differential equation with solution $u(x)$. So, why is it enough to choose functions that are continuous, and piecewise smooth, to build the approximation $u_h(x)$? The first remark is that we approximate the weak formulation, which is based on zeroth and first order derivatives of the solution, $u(x)$ and $u'(x)$ (and similarly for the test functions). Indeed, we gained one order by integrating by parts, so it is not necessary to approximate the second order derivative $u''(x)$! The second remark is that the weak formulation is made up of integral terms over the interval $]a,b[$. It has a meaning as soon as the term $c(x)u'(x)v'(x) + q(x)u(x)v(x)$ is integrable, which is for instance the case if it is a bounded continuous function of $x$ with a finite number of discontinuities. Approximating $u(x)$ with continuous, piecewise polynomial basis functions, we get an approximation $u_h(x)$ and its derivative $u_h'(x)$ that fall into this category!! Now, with similar choices made for the test functions[48], we conclude that we can build an approximate weak formulation.
Next, we discuss the simplest case of approximation, with piecewise linear[49] basis functions, which yields the continuous, piecewise linear finite element space.
At each interior mesh point $x = x_i$, $i = 1, \ldots, N$, we construct a basis function as follows:
$$\phi_i(x) = \begin{cases} \dfrac{x - x_{i-1}}{h_i}, & x \in [x_{i-1}, x_i] \\[4pt] \dfrac{x_{i+1} - x}{h_{i+1}}, & x \in [x_i, x_{i+1}] \\[4pt] 0, & x \notin [x_{i-1}, x_{i+1}] \end{cases}$$
At the boundary mesh point $x = x_0$:
$$\phi_0(x) = \begin{cases} \dfrac{x_1 - x}{h_1}, & x \in [x_0, x_1] \\[4pt] 0, & x \notin [x_0, x_1] \end{cases}$$
while at the boundary mesh point $x = x_{N+1}$:
$$\phi_{N+1}(x) = \begin{cases} \dfrac{x - x_N}{h_{N+1}}, & x \in [x_N, x_{N+1}] \\[4pt] 0, & x \notin [x_N, x_{N+1}] \end{cases}$$

It is easy to see that the above constructed basis functions $(\phi_i(x))_{i=0,\ldots,N+1}$ satisfy the conditions (3.27). Due to their shape, they are usually called hat functions.

[Figure 3.4: Examples of basis functions.]

[48] And assuming that the model we consider comes with $c(x)$ and $q(x)$ that are also bounded and continuous with a finite number of discontinuities...
[49] By linear function, we mean a first-degree polynomial function (of the variable $x$).

By construction, these functions are continuous over $[a,b]$, piecewise linear, and differentiable everywhere, except at three mesh points for $i = 1, \ldots, N$, and at one mesh point for $i = 0$ and $i = N+1$. In addition, their derivatives are bounded, and continuous except for either three or one discontinuities. Finally, their support is reduced to one ($i = 0$, $i = N+1$) or two intervals ($i = 1, \ldots, N$): their support is local.
Accordingly, these basis functions define a finite dimensional space of trial functions:
$$V_h = \text{span}\,\{\phi_0, \phi_1, \ldots, \phi_{N+1}\},$$
the so-called continuous piecewise linear finite element space.
Moreover, for all $v_h \in V_h$, we can prove[50] that
$$v_h(x) = \sum_{i=0}^{N+1} V_i\,\phi_i(x) \quad \text{for } a \le x \le b, \qquad \text{where } V_i = v_h(x_i) \text{ for } 0 \le i \le N+1.$$
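In code, the hat basis and this nodal representation take only a few lines; a sketch (our illustration, for an arbitrary mesh nodes = $[x_0, \ldots, x_{N+1}]$):

```python
def hat(i, x, nodes):
    """Evaluate the hat function phi_i at the point x."""
    if i > 0 and nodes[i - 1] <= x <= nodes[i]:
        return (x - nodes[i - 1]) / (nodes[i] - nodes[i - 1])   # rising edge
    if i < len(nodes) - 1 and nodes[i] <= x <= nodes[i + 1]:
        return (nodes[i + 1] - x) / (nodes[i + 1] - nodes[i])   # falling edge
    return 0.0

def eval_vh(V, x, nodes):
    """v_h(x) = sum_i V_i phi_i(x), where V holds the nodal values v_h(x_i)."""
    return sum(V[i] * hat(i, x, nodes) for i in range(len(nodes)))
```

Note how eval_vh implements exactly the identity above: the coefficients of $v_h$ in the hat basis are its nodal values.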

For the approximate solution $u_h(x)$, we thus write the ansatz (or educated guess)
$$u_h(x) = \sum_{i=0}^{N} U_i\,\phi_i(x) + \beta\,\phi_{N+1}(x), \tag{3.28}$$
where the term $\beta\,\phi_{N+1}(x)$ appears because of the Dirichlet boundary condition at $x = b$.

[Figure 3.5: Example of a continuous piecewise linear function, with nodal values $U_0, U_1, U_2, \ldots, U_{N+1}$.]
3.2.3 Approximation of functions: discrete test functions

Suppose again we have at hand a finite element space
$$\text{span}\,\{\phi_0, \phi_1, \ldots, \phi_{N+1}\},$$
for instance the one we chose at the end of the previous subsection, that is the continuous piecewise linear finite element space $V_h$.

[50] Check this important result, which can be obtained following the proof of linear independence!

Galerkin's approach. To define discrete test functions, we want them to come with a homogeneous Dirichlet boundary condition at $x = b$. So, we introduce
$$V_h^0 = \big\{ v_h \in V_h : v_h(b) = 0 \big\} = \text{span}\,\{\phi_0, \phi_1, \ldots, \phi_N\}.$$
In Galerkin's approach, the trial functions and the discrete test functions belong to the same finite element space (here $V_h$). This is natural because it will result in a number of unknowns being equal to the number of equations (compare (3.28) to the definition of $V_h^0$, or see below). Notice here that the space of trial functions $V_h$ is slightly different from the space $V_h^0$ to which the discrete test functions belong. It is the same as in (3.17), for the exact weak formulation: the difference comes from the fact that all test functions fulfill the homogeneous Dirichlet boundary condition at $x = b$, whereas the solution does not (except if $\beta = 0$). Then we can approximate the variational formulation (3.17) as follows:
$$\text{Find } u_h \in V_h \text{ with } u_h(b) = \beta \text{ such that } a(u_h, v_h) = g(v_h) \quad \forall v_h \in V_h^0. \tag{3.29}$$
This is the discrete weak formulation.


System of linear algebraic equations. Let us see how to solve the discrete weak formulation (3.29) in practice. Starting from (3.29), we know that we have
$$a(u_h, \phi_i) = g(\phi_i), \qquad i = 0, \ldots, N. \tag{3.30}$$
Interestingly, it is equivalent to have (3.30) or to have (3.29). Indeed, for any $v_h \in V_h^0$, taking $v_h(x) = \sum_{i=0}^N V_i\,\phi_i(x)$, we find by linearity that:
$$a(u_h, v_h) = a\Big(u_h, \sum_{i=0}^N V_i\phi_i\Big) = \sum_{i=0}^N V_i\,a(u_h, \phi_i) \overset{(3.30)}{=} \sum_{i=0}^N V_i\,g(\phi_i) = g\Big(\sum_{i=0}^N V_i\phi_i\Big) = g(v_h).$$

Substituting the ansatz (3.28) for $u_h$ into (3.30) gives (again by linearity)
$$\sum_{j=0}^N U_j\,a(\phi_j, \phi_i) = g(\phi_i) - \beta\,a(\phi_{N+1}, \phi_i), \qquad i = 0, \ldots, N, \tag{3.31}$$
which can be written as a linear system in $\mathbb{R}^{N+1}$:
$$A\vec{U} = \vec{B} \tag{3.32}$$

where the components of $\vec{U}$ are the coefficients $(U_j)_{j=0,\ldots,N}$ that appear in (3.28), and
$$A = \begin{pmatrix} A_{00} & A_{01} & \cdots & A_{0N} \\ A_{10} & A_{11} & \cdots & A_{1N} \\ \vdots & & & \vdots \\ A_{N0} & A_{N1} & \cdots & A_{NN} \end{pmatrix}, \qquad \vec{B} = \begin{pmatrix} B_0 \\ B_1 \\ \vdots \\ B_N \end{pmatrix}.$$
$A_{ij}$ and $B_i$ are respectively given by, for $i = 0, \ldots, N$ and $j = 0, \ldots, N$,
$$A_{ij} = a(\phi_j, \phi_i), \qquad B_i = g(\phi_i) - \beta\,a(\phi_{N+1}, \phi_i).$$
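To see the Galerkin machinery at work, here is a sketch of the assembly and solution of (3.32) for the hat functions of section 3.2.2. This is our illustration, not the notes' reference implementation: the integrals in $a(\phi_j,\phi_i)$ and $g(\phi_i)$ are approximated by the midpoint rule on each interval, and alpha, beta stand for the boundary data $\alpha$, $\beta$ of (3.16).

```python
import numpy as np

def assemble_fem(c, q, f, nodes, alpha, beta):
    """Assemble and solve A U = B of (3.32) with P1 hat functions."""
    n = len(nodes)                        # n = N + 2 mesh points
    Afull = np.zeros((n, n))              # a(phi_j, phi_i) for all i, j
    gfull = np.zeros(n)                   # integral part of g(phi_i)
    for k in range(n - 1):                # element ]x_k, x_{k+1}[
        h = nodes[k + 1] - nodes[k]
        m = nodes[k] + h / 2              # midpoint of the element
        dphi = np.array([-1.0, 1.0]) / h  # slopes of phi_k, phi_{k+1} there
        phi = np.array([0.5, 0.5])        # both hats equal 1/2 at the midpoint
        for a_, i in enumerate((k, k + 1)):
            for b_, j in enumerate((k, k + 1)):
                Afull[i, j] += h * (c(m) * dphi[a_] * dphi[b_]
                                    + q(m) * phi[a_] * phi[b_])
            gfull[i] += h * f(m) * phi[a_]
    gfull[0] -= alpha                     # the term -alpha v(a): only phi_0(a) = 1
    A = Afull[:-1, :-1]                   # rows/columns i, j = 0, ..., N
    B = gfull[:-1] - beta * Afull[:-1, -1]
    U = np.linalg.solve(A, B)
    return np.append(U, beta)             # nodal values of u_h, with U_{N+1} = beta
```

The returned vector contains the nodal values of $u_h$, so $u_h$ itself can be evaluated anywhere with the eval_vh sketch given earlier.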

Let us prove that (3.32), with $A$ and $\vec{B}$ as above, is equivalent[51] to (3.30). One direction is the computation above: (3.30) gives (3.31), hence (3.32). The other way around, suppose that we have $\vec{U} \in \mathbb{R}^{N+1}$ solution to (3.32). Rewriting this linear system row by row, we find (3.31). By linearity, we find that $u_h(x) = \sum_{j=0}^N U_j\,\phi_j(x) + \beta\,\phi_{N+1}(x)$ solves (3.30) (and also the discrete weak formulation (3.29)!).
The system of algebraic equations (3.32) can be solved either by Gaussian elimination, or by the Jacobi method (see 5.4.2), or the Gauss-Seidel method (see 5.4.3). For that, one has to prove that $A$ is invertible. To that aim, we prove the stronger result that $A$ is a positive definite matrix. We form $(\vec{v}, A\vec{v})_{\mathbb{R}^{N+1}}$, for any vector $\vec{v}$ of $\mathbb{R}^{N+1}$:
$$\begin{aligned}
(\vec{v}, A\vec{v})_{\mathbb{R}^{N+1}} = \sum_{i=0}^N v_i(A\vec{v})_i
&= \sum_{i=0}^N v_i\Big( \sum_{j=0}^N A_{ij}v_j \Big) = \sum_{i=0}^N v_i\Big( \sum_{j=0}^N a(\phi_j, \phi_i)v_j \Big) \\
&= \sum_{i=0}^N v_i\,a\Big( \sum_{j=0}^N v_j\phi_j,\ \phi_i \Big) = a\Big( \sum_{j=0}^N v_j\phi_j,\ \sum_{i=0}^N v_i\phi_i \Big) \\
&= a(v_h, v_h) \qquad \text{where } v_h(x) = \sum_{i=0}^N v_i\,\phi_i(x).
\end{aligned}$$
Under the positivity conditions (2.26), $a(\cdot,\cdot)$ is a strictly positive form, so it follows that, if $v_h \ne 0$, one has $a(v_h, v_h) > 0$. But, this happens if, and only if, the vector $\vec{v}$ is not equal to $0$, which proves the positive definiteness of $A$.
3.2.4 Projection: accuracy of the finite element method

Unfortunately, with the tools at hand, we cannot study completely the accuracy of the finite element method. However, we can prove some best approximation properties...

[51] Bearing in mind that the components of vector $\vec{U}$ are the coefficients of $u_h(x)$ in the ansatz (3.28).

Let us define the functional space $V^0$ as the set of functions with homogeneous Dirichlet boundary condition at $x = b$. Obviously, $V_h^0$ is a subspace of $V^0$. We know that $a(\cdot,\cdot)$ is strictly positive[52] and symmetric, so we introduce
$$\|v\|_a := a(v,v)^{1/2}.$$
One can prove that $a(\cdot,\cdot)$ is a scalar product, and that $\|\cdot\|_a$ is a norm[52] of $V^0$.
Let us begin with the particular (and interesting!) case where $\beta = 0$. The exact weak formulation writes:
$$\text{Find } u \in V^0 \text{ such that } a(u,v) = g(v) \quad \forall v \in V^0. \tag{3.33}$$

Whereas the discrete weak formulation writes:
$$\text{Find } u_h \in V_h^0 \text{ such that } a(u_h, v_h) = g(v_h) \quad \forall v_h \in V_h^0. \tag{3.34}$$

So, in both the exact and discrete cases, the solution and the test functions belong to the same functional, or finite dimensional, spaces. We can prove a result on the projection of the solution $u$ to the subspace $V_h^0$ of $V^0$.

Theorem 3.3. Let $u$ be the solution to the exact weak formulation (3.33), and let $u_h$ be the solution to the discrete weak formulation (3.34), then
$$\|u - u_h\|_a = \min_{v_h \in V_h^0} \|u - v_h\|_a. \tag{3.35}$$

Remark 3.5. In other words, the discrete solution $u_h$ is the projection on $V_h^0$ of the exact solution $u$, with respect to the scalar product $a(\cdot,\cdot)$ of $V^0$.
Proof. As $u_h \in V_h^0$, we obviously have that $\|u - u_h\|_a \ge \min_{v_h \in V_h^0} \|u - v_h\|_a$! In particular, if $\|u - u_h\|_a = 0$, we have proved (3.35).
Assuming it is not the case, let us take any $w_h \in V_h^0$. We have
$$a(u, w_h) \overset{(3.33)}{=} g(w_h) \overset{(3.34)}{=} a(u_h, w_h), \quad \text{so} \quad a(u - u_h, w_h) = 0. \tag{3.36}$$

[52] Choosing $V^0$ carefully, one can even prove that $a(\cdot,\cdot)$ is coercive over $V^0$.

Now, let us take any $v_h \in V_h^0$. We have:
$$\begin{aligned}
\|u - u_h\|_a^2 &= a(u - u_h, u - u_h) \\
&= a(u - u_h, u - v_h) + a(u - u_h, v_h - u_h) \\
&= a(u - u_h, u - v_h) \qquad \text{(take } w_h = v_h - u_h \text{ in (3.36))} \\
&\le \|u - u_h\|_a\,\|u - v_h\|_a \qquad \text{(by the Cauchy-Schwarz inequality)}.
\end{aligned}$$
Dividing each side by $\|u - u_h\|_a$, we reach
$$\|u - u_h\|_a \le \|u - v_h\|_a \quad \forall v_h \in V_h^0,$$
which implies finally that
$$\|u - u_h\|_a \le \min_{v_h \in V_h^0} \|u - v_h\|_a.$$
This ends the proof.


Let us conclude with the case where $\beta \ne 0$. The exact weak formulation is (3.17), whereas the discrete weak formulation is (3.29). In both the exact and discrete cases, the solution and the test functions do not belong to the same functional, or finite dimensional, spaces... However, in the same spirit as the previous study, we can make the two remarks:
1) $u^0 := u - \beta\,\phi_{N+1} \in V^0$, and $a(u^0, v) = g(v) - \beta\,a(\phi_{N+1}, v) \quad \forall v \in V^0$;
2) $u_h^0 := u_h - \beta\,\phi_{N+1} \in V_h^0$, and $a(u_h^0, v_h) = g(v_h) - \beta\,a(\phi_{N+1}, v_h) \quad \forall v_h \in V_h^0$.
In other words, one expects, when $\beta \ne 0$, $u^0$ (resp. $u_h^0$) to play the role of $u$ (resp. $u_h$) of the case $\beta = 0$...

Theorem 3.4. Let $u$ be the solution to the exact weak formulation (3.17), and let $u_h$ be the solution to the discrete weak formulation (3.29), then
$$\|u - u_h\|_a = \min_{v_h \in V_h^0} \|u - \beta\,\phi_{N+1} - v_h\|_a. \tag{3.37}$$
Proof. As $u_h^0 \in V_h^0$, we have that $\|u^0 - u_h^0\|_a \ge \min_{v_h \in V_h^0} \|u^0 - v_h\|_a$. But $u^0 - u_h^0$ is equal to $u - u_h$, and $u^0 - v_h$ is equal to $u - \beta\,\phi_{N+1} - v_h$, so it follows that
$$\|u - u_h\|_a \ge \min_{v_h \in V_h^0} \|u - \beta\,\phi_{N+1} - v_h\|_a.$$

If $\|u - u_h\|_a = \|u^0 - u_h^0\|_a = 0$, we have proved (3.37).
Assuming it is not the case, let us take any $v_h \in V_h^0$. Because of remarks 1) and 2), we have $a(u^0 - u_h^0, w_h) = 0$ for all $w_h \in V_h^0$; taking $w_h = u_h^0 - v_h$ yields $\|u^0 - u_h^0\|_a^2 = a(u^0 - u_h^0, u^0 - v_h)$. So, thanks to the Cauchy-Schwarz inequality, we have now
$$\|u^0 - u_h^0\|_a^2 \le \|u^0 - u_h^0\|_a\,\|u^0 - v_h\|_a.$$
Dividing each side by $\|u^0 - u_h^0\|_a$, we reach this time
$$\|u^0 - u_h^0\|_a \le \min_{v_h \in V_h^0} \|u^0 - v_h\|_a.$$
Replacing $u^0 - u_h^0$ by $u - u_h$, and $u^0 - v_h$ by $u - \beta\,\phi_{N+1} - v_h$, ends the proof.
