An Analysis of the Finite Element Method

AN ANALYSIS OF
THE FINITE ELEMENT METHOD
Prentice-Hall
Series in Automatic Computation
George Forsythe, editor
AHO, editor, Currents in the Theory of Computing
AHO AND ULLMAN, Theory of Parsing, Translation, and Compiling, Volume I: Parsing; Volume II: Compiling
(ANDREE)³, Computer Programming: Techniques, Analysis, and Mathematics
ANSELONE, Collectively Compact Operator Approximation Theory and Applications to Integral Equations
ARBIB, Theories of Abstract Automata
BATES AND DOUGLAS, Programming Language/One, 2nd ed.
BLUMENTHAL, Management Information Systems
BRENT, Algorithms for Minimization without Derivatives
BRINCH HANSEN, Operating Systems Principles
COFFMAN AND DENNING, Operating-Systems Theory
CRESS, et al., FORTRAN IV with WATFOR and WATFIV
DANIEL, The Approximate Minimization of Functionals
DESMONDE, A Conversational Graphic Data Processing System
DESMONDE, Computers and Their Uses, 2nd ed.
DESMONDE, Real-Time Data Processing Systems
DRUMMOND, Evaluation and Measurement Techniques for Digital Computer Systems
EVANS, et al., Simulation Using Digital Computers
FIKE, Computer Evaluation of Mathematical Functions
FIKE, PL/1 for Scientific Programmers
FORSYTHE AND MOLER, Computer Solution of Linear Algebraic Systems
GAUTHIER AND PONTO, Designing Systems Programs
GEAR, Numerical Initial Value Problems in Ordinary Differential Equations
GOLDEN, FORTRAN IV Programming and Computing
GOLDEN AND LEICHUS, IBM/360 Programming and Computing
GORDON, System Simulation
HARTMANIS AND STEARNS, Algebraic Structure Theory of Sequential Machines
HULL, Introduction to Computing
JACOBY, et al., Iterative Methods for Nonlinear Optimization Problems
JOHNSON, System Structure in Data, Programs and Computers
KANTER, The Computer and the Executive
KIVIAT, et al., The SIMSCRIPT II Programming Language
LORIN, Parallelism in Hardware and Software: Real and Apparent Concurrency
LOUDEN AND LEDIN, Programming the IBM 1130, 2nd ed.
MARTIN, Design of Man-Computer Dialogues
MARTIN, Design of Real-Time Computer Systems
MARTIN, Future Developments in Telecommunications
MARTIN, Programming Real-Time Computing Systems
MARTIN, Systems Analysis for Data Transmission
MARTIN, Telecommunications and the Computer
MARTIN, Teleprocessing Network Organization
MARTIN AND NORMAN, The Computerized Society
MATHISON AND WALKER, Computers and Telecommunications: Issues in Public Policy
MCKEEMAN, et al., A Compiler Generator
MEYERS, Time-Sharing Computation in the Social Sciences
MINSKY, Computation: Finite and Infinite Machines
NIEVERGELT, et al., Computer Approaches to Mathematical Problems
PLANE AND MCMILLAN, Discrete Optimization: Integer Programming and Network Analysis for Management Decisions
PRITSKER AND KIVIAT, Simulation with GASP II: a FORTRAN-Based Simulation Language
PYLYSHYN, editor, Perspectives on the Computer Revolution
RICH, Internal Sorting Methods: Illustrated with PL/1 Programs
RUSTIN, editor, Algorithm Specification
RUSTIN, editor, Computer Networks
RUSTIN, editor, Data Base Systems
RUSTIN, editor, Debugging Techniques in Large Systems
RUSTIN, editor, Design and Optimization of Compilers
RUSTIN, editor, Formal Semantics of Programming Languages
SACKMAN AND CITRENBAUM, editors, On-Line Planning: Towards Creative Problem-Solving
SALTON, editor, The SMART Retrieval System: Experiments in Automatic Document Processing
SAMMET, Programming Languages: History and Fundamentals
SCHAEFER, A Mathematical Theory of Global Program Optimization
SCHULTZ, Spline Analysis
SCHWARZ, et al., Numerical Analysis of Symmetric Matrices
SHERMAN, Techniques in Computer Programming
SIMON AND SIKLOSSY, Representation and Meaning: Experiments with Information Processing Systems
STERBENZ, Floating-Point Computation
STERLING AND POLLACK, Introduction to Statistical Data Processing
STOUTEMYER, PL/1 Programming for Engineering and Science
STRANG AND FIX, An Analysis of the Finite Element Method
STROUD, Approximate Calculation of Multiple Integrals
TAVISS, editor, The Computer Impact
TRAUB, Iterative Methods for the Solution of Polynomial Equations
UHR, Pattern Recognition, Learning, and Thought
VAN TASSEL, Computer Security Management
VARGA, Matrix Iterative Analysis
WAITE, Implementing Software for Non-Numeric Applications
WILKINSON, Rounding Errors in Algebraic Processes
WIRTH, Systematic Programming: An Introduction
AN ANALYSIS OF THE
FINITE ELEMENT METHOD
GILBERT STRANG
Massachusetts Institute of Technology
GEORGE J. FIX
University of Maryland
PRENTICE-HALL, INC.
ENGLEWOOD CLIFFS, N.J.

Library of Congress Cataloging in Publication Data
STRANG, GILBERT,
An analysis of the finite element method.
(Prentice-Hall series in automatic computation)
Bibliography: p.
1. Finite element method. I. Fix, George J.,
joint author. II. Title.
TA335.S77 515'.624 72-12642
ISBN 0-13-032946-0

© 1973 by Prentice-Hall, Inc., Englewood Cliffs, N.J.
All rights reserved. No part of this book may be

reproduced in any form, by mimeograph or any other means,
without permission in writing from the publisher.
10 9 8 7 6 5 4
Printed in the United States of America.
PRENTICE-HALL INTERNATIONAL, INC., London

PRENTICE-HALL OF AUSTRALIA, PTY. LTD., Sydney
PRENTICE-HALL OF CANADA, LTD., Toronto
PRENTICE-HALL OF INDIA PRIVATE LIMITED, New Delhi
PRENTICE-HALL OF JAPAN, INC., Tokyo
To Jill and Linda

cherchez la f.e.m.
PREFACE
The finite element method has been an astonishing success. It was created
to solve the complicated equations of elasticity and structural mechanics, and
for those problems it has essentially superseded the method of finite
differences. Now other applications are rapidly developing. Whenever flexibility
in the geometry is important, and the power of the computer is needed not
only to solve a system of equations but also to formulate and assemble the
discrete approximation in the first place, the finite element method has
something to contribute.

From a mathematical point of view, the method is an extension of the
Rayleigh-Ritz-Galerkin technique. It therefore applies to a wide class of
partial differential equations. The Ritz technique does not, however, operate
directly with the differential equation; instead, the continuous problem is put
into an equivalent variational form, and the approximate solution is
assumed to be a combination $\sum q_j \varphi_j$ of given trial functions
$\varphi_j(x)$. This is the method of weighted residuals, and the weights $q_j$
are computed from the underlying variational principle. It is this discrete
problem which the computer actually solves.

So far the idea is an old one. What is new is the choice of trial functions:
in the finite element method they are piecewise polynomials. That choice is
responsible for the method's success. Each function $\varphi_j$ is zero over
most of the domain, and enters the computation only in the neighborhood of a
particular node. In that neighborhood $\varphi_j$ is pieced together from
polynomials of low degree, and the computations are as simple as possible. It is
remarkable that simultaneously, and quite independently, piecewise polynomials
have become preeminent in the mathematical theory of approximation of
functions. Apparently it was the right idea at the right time.

Because the mathematical foundations are sound, it is possible to
understand why the method works. This is the real reason for our book. Its
purpose is to explain the effect of each of the approximations that are
essential for the finite element technique to be computationally efficient. We
list here some of these approximations:

(1) interpolation of the original physical data
(2) choice of a finite number of polynomial trial functions
(3) simplification of the geometry of the domain
(4) modification of the boundary conditions
(5) numerical integration of the underlying functional in the variational
    principle
(6) roundoff error in the solution of the discrete system.

These questions are fundamentally mathematical, and so are the authors.
Nevertheless this book is absolutely not intended for the exclusive use of
specialists in numerical analysis. On the contrary, we hope it may help establish
closer communication between the mathematical engineer and the
mathematical analyst. It seems to us that the finite element method provides a
special opportunity for this communication: the theory is attractive, the
applications are growing, and best of all, the method is so new that the gap
between theory and application ought not yet to be insurmountable.

Of course we recognize that there are obstacles which cannot be made to
disappear. One of them is the language itself; we have kept the mathematical
notations to a minimum, and indexed them (with definitions) at the end
of the book. We also know that, even after a norm has been interpreted as a
natural measure of strain energy, and a Hilbert space identified with the class
of admissible functions in a physically derived variational principle, there
still remains the hardest problem: to become comfortable with these ideas,
and to make them one's own. This requires genuine patience and tolerance
on both sides, as well as effort. Perhaps this book at least exhibits the kind of
problems which a mathematician is trained to solve, and those for which he
is useless.

In the last few years a great many numerical analysts have turned to finite
elements, and we are very much in their debt. This is acknowledged explicitly
throughout the book, and implicitly in the bibliography, even though we
have by no means attempted a formal history. Here, before the book begins,
we want to thank two others (engineers rather than mathematicians) for
help that was the most important of all. One is Isaac Fried, whose influence
led us to abandon an earlier (and completed) "Fourier Analysis of the Finite
Element Method," and to study instead the real thing. The other is Bruce
Irons, whose remarkable intuitions are described (and proved correct, as
often as we can) in the book itself.

Chapter 1 is very much longer than the others, and was used by the first
author as the text in an introductory course at M.I.T. The only homework
was to go out and program some finite elements. Where such programs are
already available, students could be asked to combine computational
experiments with a theoretical seminar based on the book.

Chapters 2 to 5 were also written by the first author. The last three chapters
were drafted by the second author, and then revised and "homogenized" by
the first. And the whole was typed by Mrs. Ingrid Naaman, who has
gracefully allowed us to believe that she enjoyed it; thank you.

GILBERT STRANG
GEORGE J. FIX

Cambridge, Massachusetts
CONTENTS

PREFACE

1 AN INTRODUCTION TO THE THEORY
1.1 The Basic Ideas
1.2 A Two-point Boundary-value Problem
1.3 The Variational Form of the Problem
1.4 Finite Difference Approximations
1.5 The Ritz Method and Linear Elements
1.6 The Error with Linear Elements
1.7 The Finite Element Method in One Dimension
1.8 Boundary Value Problems in Two Dimensions
1.9 Triangular and Rectangular Elements
1.10 Element Matrices in Two-dimensional Problems

2 A SUMMARY OF THE THEORY
2.1 Basis Functions for the Finite Element Spaces $S^h$
2.2 Rates of Convergence
2.3 Galerkin's Method, Collocation, and the Mixed Method
2.4 Systems of Equations; Shell Problems; Variations on the Finite Element Method

3 APPROXIMATION
3.1 Pointwise Approximation
3.2 Mean-square Approximation
3.3 Curved Elements and Isoparametric Transformations
3.4 Error Estimates

4 VARIATIONAL CRIMES
4.1 Violations of the Rayleigh-Ritz Code
4.2 Non-conforming Elements
4.3 Numerical Integration
4.4 Approximation of Domain and Boundary Conditions

5 STABILITY
5.1 Independence of the Basis
5.2 The Condition Number

6 EIGENVALUE PROBLEMS
6.1 Variational Formulation and the Min-max Principle
6.2 Some Elementary Examples
6.3 Eigenvalue and Eigenfunction Errors
6.4 Computational Techniques

7 INITIAL-VALUE PROBLEMS
7.1 The Galerkin-Crank-Nicolson Method for the Heat Equation
7.2 Stability and Convergence in Parabolic Problems
7.3 Hyperbolic Equations

8 SINGULARITIES
8.1 Corners and Interfaces
8.2 Singular Functions
8.3 Errors in the Presence of Singularities
8.4 Experimental Results

BIBLIOGRAPHY
INDEX OF NOTATIONS
INDEX

1 AN INTRODUCTION TO THE THEORY
1.1. THE BASIC IDEAS
The finite element method can be described in a few words. Suppose that
the problem to be solved is in variational form: it may be required to find
the function u which minimizes a given expression of potential energy. This
minimizing property leads to a differential equation for u (the Euler equation),
but normally an exact solution is impossible and some approximation is
necessary. The Rayleigh-Ritz-Galerkin idea is to choose a finite number of
trial functions $\varphi_1, \ldots, \varphi_N$, and among all their linear
combinations $\sum q_j \varphi_j$ to find the one which is minimizing. This is
the Ritz approximation. The unknown weights $q_j$ are determined, not by a
differential equation, but by a system of N discrete algebraic equations which
the computer can handle. The theoretical justification for this method is
simple, and compelling: The minimizing process automatically seeks out the
combination which is closest to u. Therefore, the goal is to choose trial
functions $\varphi_j$ which are convenient enough for the potential energy to be
computed and minimized, and at the same time general enough to approximate
closely the unknown solution u.
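
The Ritz procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program: the potential energy is taken to be $I(v) = \int_0^\pi (\tfrac{1}{2} v'^2 - f v)\,dx$ with $f = 1$, the trial functions $x$ and $x^2$ are hypothetical choices that vanish at $x = 0$, and the helper name `ritz_solve` is invented here. Setting the derivatives of $I$ with respect to the weights to zero gives the linear system $Kq = F$.

```python
import numpy as np

def ritz_solve(dphi, phi, f, a, b, n_quad=20):
    """Weights q minimizing I(v) = int_a^b (0.5*v'^2 - f*v) dx over v = sum_j q_j*phi_j."""
    # Gauss-Legendre nodes and weights, mapped from [-1, 1] to [a, b].
    t, w = np.polynomial.legendre.leggauss(n_quad)
    x = 0.5 * (b - a) * t + 0.5 * (b + a)
    w = 0.5 * (b - a) * w
    D = np.array([d(x) for d in dphi])  # derivatives phi_j'(x) at the quadrature nodes
    P = np.array([p(x) for p in phi])   # values phi_j(x) at the quadrature nodes
    K = (D * w) @ D.T                   # K[i, j] = int phi_i' phi_j' dx
    F = (P * w) @ f(x)                  # F[i]    = int f phi_i dx
    return np.linalg.solve(K, F)        # stationarity of I gives K q = F

# Assumed (hypothetical) trial functions x and x^2; both vanish at x = 0.
phi = [lambda x: x, lambda x: x**2]
dphi = [lambda x: np.ones_like(x), lambda x: 2 * x]
q = ritz_solve(dphi, phi, lambda x: np.ones_like(x), 0.0, np.pi)
```

For this particular energy the minimizer over all admissible functions is $u = \pi x - x^2/2$, which happens to lie in the span of the two trial functions, so the computed weights should be close to $(\pi, -1/2)$. With trial functions that do not contain the exact solution, the same system would return the combination closest to u in the energy sense.
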
The real difficulty is the first one, to achieve convenience and
computability. In theory there always exists a set of trial functions which is
complete (their linear combinations fill the space of all possible solutions as
$N \to \infty$, and therefore the Ritz approximations converge) but to be able
to compute with them is another matter. This is what finite elements have
accomplished.
The underlying idea is simple. It starts by a subdivision of the structure,
or the region of physical interest, into smaller pieces. These pieces must be
easy for the computer to record and identify; they may be triangles or
rectangles. Then within each piece the trial functions are given an extremely
simple form: normally they are polynomials, of at most the third or fifth
degree. Boundary conditions are infinitely easier to impose locally, along the
edge of a triangle or rectangle, than globally along a more complicated
boundary. The accuracy of the approximation can be increased, if that is
necessary, but not by the classical Ritz method of including more and more
complex trial functions. Instead, the same polynomials are retained, and the
subdivision is refined. The computer follows a nearly identical set of
instructions, and just takes longer to finish. In fact, a large-scale finite
element system can use the power of the computer, for the formulation of
approximate equations as well as their solution, to a degree never before
achieved in complicated physical problems.

Unhappily none of the credit for this idea goes to numerical analysts.
The method was created by structural engineers, and it was not recognized
at the start as an instance of the Rayleigh-Ritz principle. The subdivision
into simpler pieces, and the equations of equilibrium and compatibility
between the pieces, were initially constructed on the basis of physical
reasoning. The later development of more accurate elements happened in a similar
way; it was recognized that increasing the degree of the polynomials would
greatly improve the accuracy, but the unknowns $q_j$ computed in the discrete
approximation have always retained a physical significance. In this respect
the computer output is much easier to interpret than the weights produced
by the classical method.

The whole procedure became mathematically respectable at the moment
when the unknowns were identified as the coefficients in a Ritz approximation
$u \approx \sum q_j \varphi_j$ and the discrete equations were seen to be
exactly the conditions for minimizing the potential energy. Surely Argyris in
Germany and England, and Martin and Clough in America, were among those
responsible; we dare not guess who was first. The effect was instantly to
provide a sound theoretical basis for the method. As the techniques of
constructing more refined elements have matured, the underlying theory has also
begun to take shape.

The fundamental problem is to discover how closely piecewise polynomials
can approximate an unknown solution u. In other words, we must
determine how well finite elements (which were developed on the basis of
computational simplicity) satisfy the second requirement of good trial
functions, to be effective in approximation. Intuitively, any reasonable function
u can be approached to arbitrary accuracy by piecewise linear functions.
The mathematical task is to estimate the error as closely as possible and to
determine how rapidly the error decreases as the number of pieces (or the
degree of the polynomial within each piece) is increased. Of course, the finite
element method can proceed without the support of precise mathematical
theorems; it got on pretty well for more than 10 years. But we believe it will
be useful, especially in the future development of the method, to understand
and consolidate what has already been done.
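
The claim that piecewise linear functions can approach a reasonable function to arbitrary accuracy can be observed numerically. The sketch below is only an experiment under assumptions made here: the target function is $\sin x$ on $[0, \pi]$, the pieces are of equal length, the approximation is the interpolant at the nodes, and the helper name `interp_error` is hypothetical.

```python
import numpy as np

def interp_error(u, a, b, n_pieces, n_check=2001):
    """Maximum error of the piecewise linear interpolant of u on n_pieces equal pieces."""
    nodes = np.linspace(a, b, n_pieces + 1)  # endpoints of the pieces
    x = np.linspace(a, b, n_check)           # fine grid on which the error is sampled
    return np.max(np.abs(u(x) - np.interp(x, nodes, u(nodes))))

# Halve the piece length three times and compare successive errors.
errors = [interp_error(np.sin, 0.0, np.pi, n) for n in (4, 8, 16, 32)]
ratios = [errors[i] / errors[i + 1] for i in range(3)]
```

If the error behaves like the square of the piece length, each halving should reduce it by a factor near 4, and the entries of `ratios` should drift toward that value as the subdivision is refined.
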
We have attempted a fairly complete analysis of linear problems and the
displacement method. A comparable theory for fully nonlinear equations does
not yet exist, although it would certainly be possible to treat semilinear
equations, in which the difficulties are confined to lower-order terms. We
make a few preliminary comments on nonlinear equations, but this remains
an outstanding problem for the future. In our choice of the displacement
method over the alternative variational formulations described in Chapter
2, we have opted to side with the majority. This is the most commonly used
version of the finite element method. Of course, the approximation theory
would be the same for all formulations, and the duality which is so rampant
throughout the whole subject makes the conversion between displacement
methods and force methods nearly automatic.

Our goal in this chapter is to illustrate the basic steps in the finite element
method:

1. The variational formulation of the problem.
2. The construction of piecewise polynomial trial functions.
3. The computation of the stiffness matrix and solution of the discrete
   system.
4. The estimation of accuracy in the final Ritz approximation.
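
The four steps can be sketched for one small case. Everything specific here is an assumption chosen for illustration: the problem is $-u'' = 1$ on $(0, \pi)$ with $u(0) = 0$ and $u'(\pi) = 0$, the elements are linear on equal pieces, and the function name `linear_fem` is hypothetical. It is a sketch of the general pattern (element matrices, assembly, solution), not a reproduction of any program from the book.

```python
import numpy as np

def linear_fem(n, length=np.pi):
    """Linear elements for -u'' = 1, u(0) = 0, u'(length) = 0 on n equal pieces."""
    h = length / n
    K = np.zeros((n + 1, n + 1))
    F = np.zeros(n + 1)
    k_el = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h  # element stiffness: int phi_i' phi_j' dx
    f_el = np.array([0.5, 0.5]) * h                  # element load: int 1 * phi_i dx
    for e in range(n):                               # assemble element by element
        idx = [e, e + 1]
        K[np.ix_(idx, idx)] += k_el
        F[idx] += f_el
    # Essential condition u(0) = 0: remove node 0. The free end needs no action.
    U = np.zeros(n + 1)
    U[1:] = np.linalg.solve(K[1:, 1:], F[1:])
    return np.linspace(0.0, length, n + 1), U

x, U = linear_fem(8)
exact = np.pi * x - 0.5 * x**2          # exact solution of the assumed problem
nodal_error = np.max(np.abs(U - exact))
```

For this particular one-dimensional problem the nodal values should agree with the exact solution up to roundoff; that is a special feature of the example and should not be expected in general, where the error has to be estimated rather than read off.
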

We take the opportunity, when stating the problem variationally, to
insert some of the key mathematical ideas needed for a precise theory: the
Hilbert spaces $\mathcal{H}^s$ and their norms, the estimates for the solution
in terms of the data, and the energy inner product which is naturally associated
with the specific problem. With these tools, the convergence of finite elements
can be proved even for a very complicated geometry. In fact, the simplicity of
variational arguments permits an analysis which already goes beyond what has
been achieved for finite differences.

1.2. A TWO-POINT BOUNDARY-VALUE PROBLEM

Our plan is to introduce the finite element method, and the mathematics
which lies behind it, in terms of a specific and familiar example. It makes sense
to choose a one-dimensional problem, in order that the construction of
elements shall be simple and natural, and also in order that the mathematical
manipulations shall be straightforward, requiring integration by parts rather
than some general Green's formula. Therefore, our choice falls on the
equation

(1)  $-\dfrac{d}{dx}\left(p(x)\dfrac{du}{dx}\right) + q(x)u = f(x).$

With suitable boundary conditions at the endpoints $x = 0$ and $x = \pi$, this
is a classical Sturm-Liouville problem. It represents a number of different
physical processes: the distribution of temperature along a rod, for example,
or the displacement of a rotating string. Mathematically, the first point to
emphasize is that the equation and boundary conditions arise from a
steady-state problem, and not one which unfolds in time from initial conditions
of displacement and velocity. It would correspond in more space dimensions to
an elliptic boundary-value problem, governed for example by Laplace's
equation.

In order to illustrate the treatment of different types of boundary
conditions, especially in the variational statement of the problem, we fix the
left-hand end of the string and let the other be free. Thus at the end $x = 0$
there is an essential (or kinematic, or restrained, or geometric) boundary
condition, in other words, one of Dirichlet type:

     $u(0) = 0.$

At the right-hand end $x = \pi$, the string is not constrained, and it assumes
a natural (or dynamic, or stress) boundary condition, in other words, one of
Neumann type:

     $u'(\pi) = 0.$
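
A worked instance may make the two boundary conditions concrete. The data $p = 1$, $q = 0$, $f = 1$ are an illustrative choice made here, not one taken from the text; with them equation (1) can be integrated directly, and each boundary condition fixes one constant of integration.

```latex
\begin{aligned}
-u''(x) &= 1, \qquad u(0) = 0, \quad u'(\pi) = 0, \\
u'(x) &= \pi - x \quad \text{(integrating once and using } u'(\pi) = 0\text{)}, \\
u(x) &= \pi x - \tfrac{1}{2}x^{2} \quad \text{(integrating again and using } u(0) = 0\text{)}.
\end{aligned}
```

The solution is pinned to zero at the fixed end and has zero slope at the free end.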

We propose to consider this model problem from four different points
of view, in the following order:

1. Pure mathematics.
2. Applied mathematics.
3. Numerical approximation by finite differences.
4. Numerical approximation by finite elements.

It is essential to recognize the common features of these four approaches
to the same problem; the tools which are useful to the pure mathematician
in proving the existence and uniqueness of the solution, and to the applied
mathematician in understanding its qualitative behavior, ought to be applied
also to the study of the numerical algorithms.

We begin with the pure mathematician, who combines the differential
equation and boundary conditions into a single equation,

     $Lu = f.$

L is a linear operator, acting on a certain class of functions: those which in
some sense satisfy the boundary conditions and can be differentiated twice.
Mathematically, the fundamental question is precisely this: to match such
a space of functions u with a class of inhomogeneous terms f, in such a way that
to each f there corresponds one and only one solution u. Once this
correspondence between f and u has been established, the problem $Lu = f$ is in
an abstract
sense "solved." Of course there is still a little way to go in actually discovering which solution $u$ corresponds to a given $f$. That step is the real subject of this book. But we believe it is worthwhile, and not just useless fussiness, to try first to get these function spaces right. In fact, it is of special importance for the variational principles and their approximation to know exactly which space of functions is admissible. (The references to "spaces" of functions carry the implication that if $u_1$ and $u_2$ are admissible, then so is $c_1u_1 + c_2u_2$; this superposition is a natural property in linear problems.)

We want to consider one specific choice, the one which is perhaps most important to the theory, for the space of inhomogeneous data: Those $f$ are admitted which have finite energy. This means that

$$\int_0^\pi (f(x))^2\,dx < \infty. \tag{2}$$

Any piecewise smooth function $f$ is thereby included, but the Dirac $\delta$-function is not; we shall return later to this case of a "point load." The space of functions satisfying (2) is often denoted by $L_2$; we prefer the notation $\mathcal{H}^0$, indicating by the superscript how many derivatives of $f$ are required to have finite energy (in this case it is only $f$ itself).

For the simplest Sturm-Liouville equation $-u'' = f$, it is not hard to guess the corresponding space of solutions. This solution space is denoted by $\mathcal{H}_B^2$: the subscript $B$ refers to the boundary conditions $u(0) = u'(\pi) = 0$, and the superscript 2 requires that the second derivative of $u$ has finite energy.† The role of the pure mathematician is then to show, under the assumptions $p(x) \ge p_{\min} > 0$ and $q(x) \ge 0$, that $\mathcal{H}_B^2$ is still the solution space for the more general equation $-(pu')' + qu = f$. In fact, his final theorem can be stated in the following way:

The operator $L$ is a one-to-one transformation from $\mathcal{H}_B^2$ onto $\mathcal{H}^0$, so that for each $f$ in $\mathcal{H}^0$ the differential equation (1) has a unique solution $u$ in $\mathcal{H}_B^2$. Furthermore, the solution depends continuously on the data: If $f$ is small, then so is $u$.

The last sentence requires further explanation; we need norms in which to measure the size of $f$ and $u$. The two norms will be different, since the data space and solution space are different. Fortunately, there is a natural choice for the norms in terms of the energy, or rather its square root:

$$\|f\|_0 = \Big[\int_0^\pi (f(x))^2\,dx\Big]^{1/2},$$

$$\|u\|_2 = \Big[\int_0^\pi \big((u''(x))^2 + (u'(x))^2 + (u(x))^2\big)\,dx\Big]^{1/2}.$$

†These spaces are defined again in the index of notations at the end of the book.
With these definitions, the continuous dependence of the solution on the data can be expressed in a quantitative form: There exists a constant $C$ such that

$$\|u\|_2 \le C\,\|f\|_0. \tag{3}$$

The uniqueness of the solution follows immediately from this estimate: If $f = 0$, then necessarily $u = 0$. In fact, it is such estimates which lie at the very center of the modern theory of partial differential equations. A general technique for proving (3), which applies also to boundary-value problems in several space dimensions, has been created only in the last generation. In this book, we shall accept such estimates as proved: For elliptic equations of order $2m$, this means that

$$\|u\|_{2m} \le C\,\|f\|_0. \tag{4}$$

We move now to a more applied question, the actual construction of the solution. If the coefficients $p$ and $q$ are constant, then this can be carried out in terms of an infinite series. The key lies in knowing the eigenvalues and eigenfunctions of $L$:

$$u_n(x) = \sqrt{\tfrac{2}{\pi}}\,\sin\big(n - \tfrac12\big)x, \qquad \lambda_n = p\big(n - \tfrac12\big)^2 + q. \tag{5}$$

It is immediate to check that $Lu_n = -pu_n'' + qu_n = \lambda_n u_n$, that the functions $u_n$ satisfy the boundary conditions and therefore lie in $\mathcal{H}_B^2$, and that they are orthonormal:

$$\int_0^\pi u_n(x)\,u_m(x)\,dx = \delta_{nm}.$$
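The three properties just claimed for the $u_n$ can be spot-checked numerically. The following is only a minimal sketch, not part of the original text; the function names and the midpoint quadrature are illustrative assumptions.

```python
import math

def u(n, x):
    """Eigenfunction u_n(x) = sqrt(2/pi) * sin((n - 1/2) x), as in (5)."""
    return math.sqrt(2.0 / math.pi) * math.sin((n - 0.5) * x)

def du(n, x):
    """First derivative of u_n."""
    return math.sqrt(2.0 / math.pi) * (n - 0.5) * math.cos((n - 0.5) * x)

def inner(n, m, steps=2000):
    """Midpoint-rule approximation of the integral of u_n * u_m over [0, pi]."""
    dx = math.pi / steps
    return sum(u(n, (i + 0.5) * dx) * u(m, (i + 0.5) * dx)
               for i in range(steps)) * dx

# Gram matrix of the first three eigenfunctions; it should be close to the identity.
gram = [[inner(n, m) for m in range(1, 4)] for n in range(1, 4)]

# Boundary values u_n(0) and u_n'(pi); both should vanish up to rounding.
boundary = [(u(n, 0.0), du(n, math.pi)) for n in range(1, 4)]
```

Printing `gram` and `boundary` shows the identity pattern and boundary values at rounding level.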
Suppose the inhomogeneous term is expanded in a series of eigenfunctions:

$$f(x) = \sum_{n=1}^{\infty} a_n \sqrt{\tfrac{2}{\pi}}\,\sin\big(n - \tfrac12\big)x. \tag{6}$$

Then integrating formally, the orthogonality of the $u_n$ gives

$$\|f\|_0^2 = \int_0^\pi f^2\,dx = \sum_{n=1}^{\infty} a_n^2.$$

The functions $f$ in $\mathcal{H}^0$ are exactly those which admit a harmonic expansion of the form (6), with coefficients satisfying $\sum a_n^2 < \infty$. Actually, this ought to seem a little paradoxical, since apparently every $f$ of the form (6) will satisfy $f(0) = 0$, $f'(\pi) = 0$, whereas no boundary conditions were meant to be imposed on $f$: The elements of $\mathcal{H}^0$ are required only to have finite
energy, $\int f^2 < \infty$. The paradox is resolved by the completeness of the eigenfunctions $u_n$ in $\mathcal{H}^0$. Whether $f$ satisfies these spurious boundary conditions or not, its expansion is valid in the mean-square sense,

$$\int_0^\pi \Big(f(x) - \sum_{n=1}^{N} a_n \sqrt{\tfrac{2}{\pi}}\,\sin\big(n - \tfrac12\big)x\Big)^2 dx \to 0 \quad \text{as } N \to \infty.$$

The boundary conditions on $f$ are thus unstable and disappear in the limit as $N \to \infty$. Figure 1.1 shows how a sequence of functions $f_N$, all lying in $\mathcal{H}_B^2$, could still converge to a function $f$ outside that space.
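A concrete case may make the instability visible. The sketch below is an illustration under stated assumptions, not taken from the text: it picks the constant function $f(x) = 1$, which violates $f(0) = 0$, and uses its coefficients $a_n = (f, u_n) = \sqrt{2/\pi}\,/(n - \tfrac12)$. Every partial sum vanishes at $x = 0$, yet the mean-square error still tends to zero.

```python
import math

def coeff(n):
    """a_n = (f, u_n) for the assumed data f(x) = 1: sqrt(2/pi) / (n - 1/2)."""
    return math.sqrt(2.0 / math.pi) / (n - 0.5)

def partial_sum(x, N):
    """f_N(x) = sum over n <= N of a_n * u_n(x)."""
    return sum(coeff(n) * math.sqrt(2.0 / math.pi) * math.sin((n - 0.5) * x)
               for n in range(1, N + 1))

def mean_square_error(N):
    """||f - f_N||_0^2 = ||f||_0^2 - sum of a_n^2 (orthonormality); here ||f||_0^2 = pi."""
    return math.pi - sum(coeff(n) ** 2 for n in range(1, N + 1))

errors = [mean_square_error(N) for N in (10, 100, 1000)]
value_at_zero = partial_sum(0.0, 1000)   # stays 0 although f(0) = 1
```

The errors shrink roughly like $1/N$, while `value_at_zero` never approaches $f(0) = 1$: convergence in the $\mathcal{H}^0$ norm carries no information about point values at the boundary.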
The Sturm-Liouville differential equation $Lu = f$ is now ready to be solved: if $f = \sum a_n u_n$, then $u$ has the expansion

$$u = \sum_{n=1}^{\infty} \frac{a_n}{\lambda_n}\,u_n = \sqrt{\tfrac{2}{\pi}} \sum_{n=1}^{\infty} \frac{a_n \sin\big(n - \tfrac12\big)x}{p\big(n - \tfrac12\big)^2 + q}. \tag{7}$$

With this explicit construction, the estimate $\|u\|_2 \le C\,\|f\|_0$ and the matching of data space $\mathcal{H}^0$ with solution space $\mathcal{H}_B^2$ can be verified directly.
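As a hedged illustration of (7), the sketch below sums the series for one assumed set of data, $p = 1$, $q = 0$, $f(x) = 1$, for which $a_n = \sqrt{2/\pi}\,/(n - \tfrac12)$ and the exact solution of $-u'' = 1$, $u(0) = u'(\pi) = 0$ is $u(x) = \pi x - x^2/2$. The choice of data and the function names are illustrative, not from the text.

```python
import math

def series_solution(x, N, p=1.0, q=0.0):
    """Partial sum of (7) for the assumed data f(x) = 1."""
    total = 0.0
    for n in range(1, N + 1):
        k = n - 0.5
        a_n = math.sqrt(2.0 / math.pi) / k      # coefficient of f = 1
        lam = p * k * k + q                     # eigenvalue lambda_n from (5)
        total += (a_n / lam) * math.sqrt(2.0 / math.pi) * math.sin(k * x)
    return total

def exact(x):
    """Closed-form solution of -u'' = 1 with u(0) = 0 and u'(pi) = 0."""
    return math.pi * x - 0.5 * x * x

points = [0.0, 1.0, 2.0, math.pi]
gaps = [abs(series_solution(x, 200) - exact(x)) for x in points]
```

Because the terms decay like $(n - \tfrac12)^{-3}$, two hundred terms already agree with the closed form to several digits, in contrast to the slow $1/N$ convergence of the expansion of $f$ itself.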
The question of the boundary conditions is more subtle and deserves further comment. We have seen already that even though $f$ can be expanded in terms of the $u_n$, which do satisfy the boundary conditions, still $f$ has absolutely nothing to do with these conditions. Therefore the question is: What is different about $u$? Why does $u$ satisfy the boundary conditions? The answer is that the series expansion for $u$ converges in a much stronger sense than the expansion for $f$: not only does $\sum a_n u_n/\lambda_n$ converge to $u$ in the mean-square sense, but so do its first and second derivatives. More precisely,

$$\Big\|u - \sum_{n=1}^{N} \frac{a_n}{\lambda_n}\,u_n\Big\|_2 \to 0 \quad \text{as } N \to \infty.$$

Fig. 1.1 $f_N$ in $\mathcal{H}_B^2$ approximating a general function $f$.
The point is that when even the second derivatives converge, the boundary conditions are stable; the limit function $u$ is compelled to satisfy the boundary conditions. (Note that in Figure 1.1, the second derivatives of $f_N$ did not converge to those of $f$; therefore the limit $f$ did not have to satisfy the boundary conditions, and was outside the space $\mathcal{H}_B^2$. This is what will not happen for $u$.)

The general rule is this: boundary conditions which involve only derivatives below order $s$ will make sense in the $\mathcal{H}^s$ norm; those involving derivatives of order $s$ or higher will be unstable and will not apply to functions in the space $\mathcal{H}^s$. We shall see that this is the rule which distinguishes between essential boundary conditions, which stay, and natural boundary conditions, which go. The distinction becomes crucial in the variational problem, whose statement is in terms of first derivatives, that is, the $\mathcal{H}^1$ norm. The finite element approximations will be required to satisfy all boundary conditions below order 1 (that means the condition $u(0) = 0$), but they will not be required to satisfy the condition on the first derivative. This leniency at $x = \pi$ will not prevent the finite element approximations from converging in the $\mathcal{H}^1$ norm to a solution $u$ which does satisfy $u'(\pi) = 0$. This is the key to the following section, which extends the "pure mathematics" standpoint to the equivalent variational problem.

1.3. THE VARIATIONAL FORM OF THE PROBLEM

The linear equation $Lu = f$ is related to the quadratic functional

$$I(v) = (Lv, v) - 2(f, v)$$

in the following way: $I(v)$ is minimized at $v = u$ only if its derivative (or first variation) vanishes there, and the condition for the vanishing of this derivative is exactly the Euler equation $Lu = f$. The problems of inverting $L$ and minimizing $I$ are equivalent; they produce the same solution $u$. Therefore, such problems can be investigated either in an operational form, in terms of the linear operator $L$, or in variational form, in terms of the quadratic $I$. The goal in this section is to find the exact variational equivalent of our two-point boundary-value problem.

This equivalence of differential equations with variational problems is basic also to the choice of a computational scheme. The differential equation may be approximated by a discrete system, using finite differences, or the variational integral can be minimized over a discrete class of functions, as in the finite element method. In many applications, particularly in steady-state rather than transient problems, the variational statement is the primary physical principle, and the differential equation only a secondary consequence. Therefore, it is not surprising to find in such applications a
strong movement toward the minimization of the quadratic functional as the fundamental problem to be approximated.

The relationship between the linear and quadratic problems is transparent if $L$, $v$, and $f$ are just real numbers. $I(v) = Lv^2 - 2fv$ is a parabola and, provided that $L$ is positive, its minimum is attained at the point $u$ where

$$\frac{dI}{dv}\Big|_{v=u} = 2(Lu - f) = 0.$$

If $L$ were not positive, the problem of minimization would break down: either the minimum is $-\infty$ or, if $L = 0$, the parabola degenerates into a straight line.

A more interesting case is when $v$ and $f$ are $n$-dimensional vectors and $L$ is a symmetric positive definite matrix of order $n$. The notation $(\ ,\ )$ then stands for the usual inner product (or dot product) of two vectors, so

$$I(v) = \sum_{j,k} L_{jk}\,v_j v_k - 2\sum_j f_j v_j.$$

Applying the symmetry $L_{jk} = L_{kj}$, the Euler equation is

$$\frac{\partial I}{\partial v_m}\Big|_{v=u} = 2\Big(\sum_k L_{mk}\,u_k - f_m\Big) = 0, \qquad m = 1, \ldots, n.$$

These $n$ simultaneous equations make up the vector equation $Lu = f$. The minimum value of $I$, attained at $u = L^{-1}f$, is

$$I(L^{-1}f) = (f, L^{-1}f) - 2(f, L^{-1}f) = -(f, L^{-1}f).$$

Since $L$ is positive definite, so is $L^{-1}$, and this minimum is negative (or zero, when $f = 0$). Geometrically $I(v)$ is represented by a convex surface, a paraboloid opening upward, when $L$ is positive definite.
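The matrix case is small enough to verify by hand or by machine. The following sketch uses a hypothetical $2 \times 2$ symmetric positive definite matrix and load vector (both chosen only for illustration) to confirm that the solution of $Lu = f$ gives the value $-(f, L^{-1}f)$ and that nearby vectors give larger values of $I$.

```python
def matvec(L, v):
    """Matrix-vector product Lv for a matrix stored as a list of rows."""
    return [sum(L[i][k] * v[k] for k in range(len(v))) for i in range(len(L))]

def dot(a, b):
    """Usual inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def I(L, f, v):
    """Quadratic functional I(v) = (Lv, v) - 2 (f, v)."""
    return dot(matvec(L, v), v) - 2.0 * dot(f, v)

# Hypothetical data: a symmetric positive definite matrix and a load vector.
L = [[2.0, -1.0], [-1.0, 2.0]]
f = [1.0, 0.0]

# Solve Lu = f by Cramer's rule (adequate for a 2x2 illustration).
det = L[0][0] * L[1][1] - L[0][1] * L[1][0]
u = [(f[0] * L[1][1] - L[0][1] * f[1]) / det,
     (L[0][0] * f[1] - f[0] * L[1][0]) / det]

minimum = I(L, f, u)                      # equals -(f, u) = -(f, L^{-1} f)
nearby = [I(L, f, [u[0] + dx, u[1] + dy])
          for dx, dy in ((0.1, 0.0), (0.0, -0.1), (-0.2, 0.3))]
```

Here $u = (2/3, 1/3)$ and the minimum is $-2/3$; each entry of `nearby` exceeds it by $(Lw, w)$ for the corresponding perturbation $w$.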
A vector derivation of the minimizing equation $Lu = f$ comes by varying all $n$ components at once. If $I$ has a minimum at $u$, then for all $v$ and $\epsilon$,

$$I(u) \le I(u + \epsilon v) = I(u) + 2\epsilon\,[(Lu, v) - (f, v)] + \epsilon^2 (Lv, v).$$

Since $\epsilon$ can be arbitrarily small and of either sign, its coefficient must vanish:

$$(Lu, v) = (f, v) \quad \text{for every } v. \tag{8}$$

This forces $Lu = f$. We call the equation (8) the weak form, or Galerkin form, of the problem. It no longer requires $L$ to be positive definite or even symmetric, since it deals not necessarily with a minimum but only with a stationary point. In other words, it states only that the first variation vanishes. In this form the problem leads to Galerkin's approximation process.
It is worth mentioning that $(Lv, v) - 2(f, v)$ is not the only quadratic whose minimum occurs at the point where $Lu = f$. It is obvious that the least-squares functional $Q(v) = (Lv - f, Lv - f)$ has its minimum (zero) at the same point. There is one significant difference, however: the Euler equation, which arises by equating the derivatives $\partial Q/\partial v_m$ to zero, is not $Lu = f$, but $L^TLu = L^Tf$. The two are theoretically equivalent, assuming that $L$ is invertible, but in practice the appearance of $L^TL$ is a disadvantage.
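The passage does not say why $L^TL$ is a disadvantage. One reason often given, offered here as an assumption rather than as the authors' statement, is that forming $L^TL$ squares the condition number. A minimal sketch for a hypothetical symmetric $2 \times 2$ matrix:

```python
import math

def sym_eigs(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smallest first."""
    mid = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    return mid - rad, mid + rad

def cond(a, b, d):
    """2-norm condition number of a symmetric positive definite 2x2 matrix."""
    lo, hi = sym_eigs(a, b, d)
    return hi / lo

# Hypothetical example: L = [[2, -1], [-1, 2]], so L^T L = L^2 = [[5, -4], [-4, 5]].
cond_L = cond(2.0, -1.0, 2.0)
cond_LtL = cond(5.0, -4.0, 5.0)
```

For this matrix the eigenvalues 1 and 3 become 1 and 9, so the condition number grows from 3 to 9.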
Consider now the differential equation of the previous section:

$$Lu = \Big[-\frac{d}{dx}\Big(p(x)\frac{d}{dx}\Big) + q(x)\Big]u = f, \qquad u(0) = u'(\pi) = 0.$$

We want to construct $I(v) = (Lv, v) - 2(f, v)$. These inner products now involve functions defined on the interval $0 \le x \le \pi$ rather than vectors with $n$ components, but their definition is completely analogous to the vector case:

$$(f, v) = \int_0^\pi f(x)\,v(x)\,dx.$$

Note that our data $f$ and functions $v$ are assumed to be real, as in almost all applications. This is a convenient assumption, and the modifications to be made in the complex case are well known; in the integrand above, one of the factors should be conjugated.

The computation of $I$ involves an integration by parts (indeed the whole theory of differential operators rests on this rule):

$$(Lv, v) = \int_0^\pi [-(pv')' + qv]\,v\,dx = \int_0^\pi [p(v')^2 + qv^2]\,dx - pv'v\,\Big|_0^\pi. \tag{9}$$

If $v$ satisfies the boundary conditions $v(0) = v'(\pi) = 0$, then the integrated term vanishes and the quadratic functional is

$$I(v) = \int_0^\pi [p(x)(v'(x))^2 + q(x)(v(x))^2 - 2f(x)v(x)]\,dx.$$

This is the functional to be minimized.

The solution of the differential problem $Lu = f$ is expected to coincide with the function $u$ that minimizes $I$. But within what class of functions shall we search for a minimum? Since we are looking for a solution $u$ in $\mathcal{H}_B^2$, certainly we ought to admit as candidates at least all members of $\mathcal{H}_B^2$. This leads to a correct and equivalent variational problem; the minimum of $I(v)$ over all functions $v$ in $\mathcal{H}_B^2$ occurs at the desired point $v = u$. However, one
is struck by the fact that the expression for $I$ involves no second derivatives; because of the integration by parts, it involves only $v$ and $v'$. It follows that $I(v)$ will be well defined if only the first derivative of $v$, rather than the second, is required to have finite energy. Therefore, the possibility arises of enlarging the class of functions which are admissible in the minimization problem to a space bigger than $\mathcal{H}_B^2$.

Our guiding principle will be this: Any function $v$ will be admitted, provided it can be obtained as the limit of a sequence $v_N$ in $\mathcal{H}_B^2$, where by the word "limit" we mean that the quadratic terms in the potential-energy functional $I(v)$ converge:

$$\int_0^\pi p(v' - v_N')^2 + q(v - v_N)^2 \to 0 \quad \text{as } N \to \infty. \tag{10}$$

Notice that such an enlargement of our space cannot actually lower the minimum value of $I$; each new value $I(v)$ is the limit of old values $I(v_N)$. Thus if the minimum of $I$ was already attained for some $u$ in $\mathcal{H}_B^2$, that remains the minimizing function. This is exactly what we know to be the case. However, we now have the enormous advantage, while searching for this minimum, of being permitted to try functions $v$ which were outside the original class $\mathcal{H}_B^2$. In practice, this means that we can now try functions which are continuous but only piecewise linear; they are easy to construct, and their first derivative has finite energy, but they do not lie in $\mathcal{H}_B^2$.

Our problem is now to describe this new and larger admissible space. In other words, we want to discover the properties possessed by a function $v$ which is the limit in the sense of (10) (this is effectively the $\mathcal{H}^1$ norm) of functions $v_N$ which have two derivatives and satisfy all the boundary conditions.

There are two properties to be determined, the smoothness required of the admissible functions and the boundary conditions they must satisfy. The first is comparatively easy: since the requirement (10) notices only convergence of the first derivative, the limiting function $v$ need only lie in $\mathcal{H}^1$. This means that the norm

$$\|v\|_1 = \Big[\int_0^\pi \big(v^2 + (v')^2\big)\,dx\Big]^{1/2}$$

must be finite.

Again the boundary conditions pose a more subtle problem: Since any sequence from $\mathcal{H}_B^2$ will satisfy $v_N(0) = 0$ and $v_N'(\pi) = 0$, will its limit $v$ inherit both these properties? It turns out that the first condition is preserved, and the other is lost. To see that the limit need not satisfy the Neumann condition $v'(\pi) = 0$, suppose for example that $v(x) = x$ and consider the sequence $v_N$ in Fig. 1.2. Since $v - v_N$ is zero except over the small interval at the end, and there $0 \le v' - v_N' \le 1$, the requirement (10) is obviously satisfied. Thus
$v(x) = x$ lies in the limiting space of admissible functions, even though $v'(\pi) \ne 0$.

Fig. 1.2 Convergence in $\mathcal{H}^1$, with $v_N'(\pi) = 0$ but $v'(\pi) \ne 0$.
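The convergence claimed for the sequence of Fig. 1.2 can be made quantitative for one particular choice of $v_N$. The profile below is an assumption made for illustration (the figure itself is not reproduced here): $v_N'$ equals 1 up to $x = \pi - 1/N$ and then descends linearly to 0 at $x = \pi$, so that $v_N$ has a bounded second derivative and satisfies both boundary conditions. With $p = 1$ the derivative part of (10) is then exactly $1/(3N)$.

```python
import math

def dv(x):
    """Derivative of the limit function v(x) = x."""
    return 1.0

def dvN(x, N):
    """Assumed derivative of v_N: slope 1, then a linear descent to 0 on [pi - 1/N, pi]."""
    return min(1.0, N * (math.pi - x))

def derivative_gap(N, steps=200):
    """Midpoint rule for the integral of (v' - v_N')^2; the integrand vanishes left of pi - 1/N."""
    a = math.pi - 1.0 / N
    dx = (1.0 / N) / steps
    return sum((dv(a + (i + 0.5) * dx) - dvN(a + (i + 0.5) * dx, N)) ** 2
               for i in range(steps)) * dx

gaps = {N: derivative_gap(N) for N in (10, 100, 1000)}
end_slopes = [dvN(math.pi, N) for N in (10, 100, 1000)]   # always 0, while v'(pi) = 1
```

The gap shrinks like $1/(3N)$ even though every $v_N$ has zero slope at $x = \pi$ and the limit has slope 1 there.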
On the other hand, the condition $v(0) = 0$ continues to hold in the limit. In fact, $v_N$ will converge at each individual point $x$, since by the Schwarz inequality

$$|v_N(x) - v_M(x)|^2 = \Big|\int_0^x \big(v_N'(y) - v_M'(y)\big)\,dy\Big|^2 \le \int_0^x 1^2\,dy \int_0^x (v_N' - v_M')^2\,dy \to 0.$$

By standard results in analysis, the limiting function $v$ is continuous and the convergence of $v_N$ to $v$ is uniform in $x$. In particular, at the point $x = 0$, $v(0) = \lim v_N(0) = 0$. On the other hand, the first derivatives converge only in mean square, giving no assurance that $v'(\pi) = 0$.

The space of admissible functions $v$ in the minimization is therefore $\mathcal{H}_E^1$; its members have first derivatives with finite energy and satisfy the essential boundary condition $v(0) = 0$ indicated by the subscript $E$. The natural boundary condition $v'(\pi) = 0$ is not imposed. Notice that if our mathematics is consistent, then the function $u$ in $\mathcal{H}_E^1$ which minimizes $I$ should automatically satisfy $u'(\pi) = 0$. This is easy to confirm, since for any $\epsilon$ and any $v$ in $\mathcal{H}_E^1$,

$$I(u) \le I(u + \epsilon v) = I(u) + 2\epsilon \int_0^\pi pu'v' + quv - fv + \epsilon^2 \int_0^\pi p(v')^2 + qv^2.$$
Since this holds for $\epsilon$ on both sides of zero, the linear term (first variation) must vanish:

$$0 = \int_0^\pi pu'v' + quv - fv = \int_0^\pi [-(pu')' + qu - f]\,v + p(\pi)u'(\pi)v(\pi). \tag{11}$$

If the minimizing $u$ has two derivatives, permitting the last integration by parts, this expression will be zero for all $v$ in $\mathcal{H}_E^1$ only if both $-(pu')' + qu = f$ in the interval, and the natural condition $u'(\pi) = 0$ holds at the boundary. Also $u(0) = 0$, because like every other function in $\mathcal{H}_E^1$, $u$ satisfies the essential boundary condition. This completes the cycle: The minimization of $I$ over $\mathcal{H}_E^1$ is equivalent to the solution of $Lu = f$, and the computation of $u$ can be approached from either direction.
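The automatic appearance of the natural condition can be watched in a small computation. The sketch below is an illustration under assumptions not taken from the text: $p = 1$, $q = 0$, $f = 1$, a uniform mesh, and continuous piecewise linear trial functions with only $v(0) = 0$ imposed. The exact solution is $u(x) = \pi x - x^2/2$.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting; adequate for this SPD matrix)."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):
        w = lower[i] / d[i - 1]
        d[i] -= w * upper[i - 1]
        r[i] -= w * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x

def fem_solution(n):
    """Minimize I(v) = int (v')^2 - 2v over piecewise linear v on n equal elements,
    with v(0) = 0 imposed and nothing imposed at x = pi."""
    h = math.pi / n
    # Unknowns are the nodal values at x_1, ..., x_n; the value at x_0 = 0 is fixed at zero.
    diag = [2.0 / h] * (n - 1) + [1.0 / h]
    lower = [0.0] + [-1.0 / h] * (n - 1)   # lower[i] couples unknown i to unknown i-1
    upper = [-1.0 / h] * (n - 1) + [0.0]
    rhs = [h] * (n - 1) + [h / 2.0]        # integrals of f = 1 against each hat function
    return h, [0.0] + solve_tridiagonal(lower, diag, upper, rhs)

# Slope of the discrete minimizer in the last element, for successively finer meshes.
end_slopes = []
for n in (4, 16, 64):
    h, U = fem_solution(n)
    end_slopes.append((U[n] - U[n - 1]) / h)
```

For this particular problem the nodal values coincide with the exact solution, and the slope in the last element equals $h/2$: no trial function was asked to satisfy $v'(\pi) = 0$, yet the minimizers approach it as the mesh is refined.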
Geometrically, the process of completion from $\mathcal{H}_B^2$ to $\mathcal{H}_E^1$ has a simple interpretation. The quadratic $I$ is represented by a convex surface, a paraboloid in infinite dimensions. At first, when $I$ was defined only for $v$ in $\mathcal{H}_B^2$, there were "holes" in this surface; all we have done is to fill them in. The surface has changed neither its shape nor its minimum value; it is just that the pinpricks, corresponding to functions $v$ lying in $\mathcal{H}_E^1$ but not in $\mathcal{H}_B^2$, no longer exist.

To conclude this section, we remark on two singular problems which are of great importance in applications. It is noteworthy that in both cases the variational form which was just established remains completely valid; $I(v)$ is still to be minimized over the admissible space $\mathcal{H}_E^1$, and the minimizing function $u$ is the desired solution. In contrast, the operational form $Lu = f$ becomes more complex, special conditions enter at the singularity $x_0$, and the solution is no longer in $\mathcal{H}_B^2$. From the viewpoint of applied mathematics, this special behavior cannot be ignored; it is probably the very point to be understood. From the viewpoint of finite element approximations, however, it is significant that the algorithm can proceed without complete information on the singularity. Such information will be extremely valuable in speeding the convergence of the approximations, as in Chapter 8 on singularities, but the algorithm will not break down without it.

Remark 1. The coefficient $p(x)$ may be discontinuous at a point $x_0$ where the elastic property of the string (or the diffusivity of the medium in heat flow) changes abruptly. At such a point there appears an internal boundary; the solution $u$ will no longer have two derivatives. To find the "jump condition" at $x_0$, we depend on the variational form of the problem; the first variation $\int pu'v' + quv - fv$ must still vanish for all $v$ if $u$ is minimizing. Assuming no other points of difficulty, integration by parts over the separate
intervals $(0, x_0)$ and $(x_0, \pi)$ yields

$$0 = \int_0^{x_0} [-(pu')' + qu - f]\,v + p_-u'_-v_- + \int_{x_0}^{\pi} [-(pu')' + qu - f]\,v + p(\pi)u'(\pi)v(\pi) - p_+u'_+v_+.$$

The subscripts $-$ and $+$ indicate the limiting values as $x$ approaches $x_0$ from the left and right, respectively. Recall that $v_- = v_+$ for any $v$ in $\mathcal{H}_E^1$, since $v$ is continuous, and in particular that $u_- = u_+$. Varying $v$, it follows that the differential equation holds in each interval, that $u'(\pi) = 0$ at the far end, and that

$$p_-u'_- = p_+u'_+.$$

This is the natural boundary condition at $x_0$, and is a direct consequence of the variational form: $u'$ has a jump, but the combination $pu'$ remains continuous.

Since there is a jump in $u'$, the solution lies in $\mathcal{H}_E^1$ but not in $\mathcal{H}_B^2$. This is a case in which one of the holes in the surface $I(v)$ was actually at the bottom. The minimum value of $I(v)$ would have been the same over the original space $\mathcal{H}_B^2$, but within that space no function attained the minimum. The surface came arbitrarily near to the hole, but it was filled in only by the function satisfying the jump condition.

The standard error estimates for finite elements, which rest on an assumed smoothness of $u$, will degenerate at the discontinuity $x_0$. These estimates can be saved, however, if by placing a node at $x_0$ we permit the trial functions in the approximation to copy the jump condition. Since the condition is natural rather than essential, not every trial function has to satisfy it; as long as the trial functions are not forced to have a continuous derivative, and thereby violate the jump condition, the approximation will be good.
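A small experiment can illustrate how piecewise linear trial functions copy the jump condition when a node sits at $x_0$. The data below are assumptions chosen for illustration: $p = 1$ on $(0, \pi/2)$, $p = 2$ on $(\pi/2, \pi)$, $q = 0$, $f = 1$, for which the exact flux is $pu' = \pi - x$ on both sides of $x_0 = \pi/2$.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting; adequate for this SPD matrix)."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):
        w = lower[i] / d[i - 1]
        d[i] -= w * upper[i - 1]
        r[i] -= w * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x

def interface_slopes(n_half):
    """Linear elements for -(p u')' = 1, u(0) = 0, free end at pi, with p = 1 on
    (0, pi/2) and p = 2 on (pi/2, pi); the jump point x0 = pi/2 is a mesh node."""
    n = 2 * n_half
    h = math.pi / n
    p = [1.0] * n_half + [2.0] * n_half                 # one value of p per element
    diag = [(p[i] + p[i + 1]) / h for i in range(n - 1)] + [p[n - 1] / h]
    lower = [0.0] + [-p[i] / h for i in range(1, n)]
    upper = [-p[i + 1] / h for i in range(n - 1)] + [0.0]
    rhs = [h] * (n - 1) + [h / 2.0]
    u = [0.0] + solve_tridiagonal(lower, diag, upper, rhs)
    slope_left = (u[n_half] - u[n_half - 1]) / h        # element ending at x0
    slope_right = (u[n_half + 1] - u[n_half]) / h       # element starting at x0
    return h, slope_left, slope_right

h, s_left, s_right = interface_slopes(32)
flux_left, flux_right = 1.0 * s_left, 2.0 * s_right
```

The two slopes differ by roughly a factor of two, while the fluxes `flux_left` and `flux_right` differ only by $h$; the trial functions were never told about the jump condition, only allowed a kink at $x_0$.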
Remark 2. Up to now the inhomogeneous term $f$ has been required to come from $\mathcal{H}^0$, thereby excluding the $\delta$-function. Physically this would be an interesting choice, representing a point load or point source, and mathematically the corresponding $u$ is the fundamental solution. Therefore we try harder to include it, by reconsidering the functional

$$I(v) = \int_0^\pi p(v')^2 + qv^2 - 2fv,$$

which is still to be minimized over $\mathcal{H}_E^1$.

Suppose, for the sake of example, that $p = 1$ and $q = 0$. If $f$ has finite energy, then the integral $I$ is finite and its minimization is straightforward.
But also for $f = \delta(x_0)$, $0 < x_0 < \pi$, the integral is finite. In this case

$$I(v) = \int_0^\pi (v')^2\,dx - 2v(x_0),$$

and the minimum occurs at the ramp function $v = u$ shown in Fig. 1.3. Again the solution is not in $\mathcal{H}_B^2$, and the hole corresponding to this ramp function happens to fall at the bottom of the infinite-dimensional paraboloid $I(v)$.
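The point-load functional is simple enough to evaluate directly on piecewise linear functions. The sketch below is illustrative only, with a hypothetical mesh whose fourth node is taken as $x_0$; it checks that the ramp $u(x) = \min(x, x_0)$ gives $I(u) = -x_0$ and that admissible perturbations with $w(0) = 0$ raise the value.

```python
import math

def energy(nodal, h, load_index):
    """I(v) = int (v')^2 dx - 2 v(x0) for the piecewise linear v with the given nodal values."""
    stretch = sum((nodal[i + 1] - nodal[i]) ** 2 / h for i in range(len(nodal) - 1))
    return stretch - 2.0 * nodal[load_index]

n, load_index = 8, 3                 # hypothetical mesh: 8 elements, x0 = 3*pi/8
h = math.pi / n
x0 = load_index * h
ramp = [min(i * h, x0) for i in range(n + 1)]

# Perturb the ramp by multiples of sin(k x / 2), which vanish at x = 0 as required.
bumps = [[ramp[i] + eps * math.sin(k * i * h / 2.0) for i in range(n + 1)]
         for eps in (0.1, -0.1) for k in (1, 3)]

ramp_energy = energy(ramp, h, load_index)
bump_energies = [energy(v, h, load_index) for v in bumps]
```

The linear term cancels for every such perturbation, so each entry of `bump_energies` exceeds `ramp_energy` by exactly the stretching energy of the perturbation.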
One more possibility: Suppose the point load is at the end $x_0 = \pi$, so that $f = \delta(\pi)$. Then the solution is $u(x) = x$, and the ramp function never levels off. In this case the solution violates the natural boundary condition $u'(\pi) = 0$. Looking back at (11) this is perfectly consistent; with $p = 1$, $q = 0$, and $u(x) = x$, the first variation is

$$\int_0^\pi (u'v' - fv)\,dx = \int_0^\pi \left(v' - \delta(\pi)v\right) dx = 0$$

for any $v$ in $\mathcal{H}^1_E$. Thus the first variation vanishes at $u(x) = x$, which is indeed minimizing. The only remarkable point is that the integration by parts in (11), and therefore the subsequent derivation of the natural boundary condition $u'(\pi) = 0$, falls through. Thus if $f$ is allowed to be singular at $x = \pi$, there may no longer be a natural boundary condition at that point.
Fig. 1.3 Fundamental solution for a point load; $f$ and $u$ not in $\mathcal{H}^0$ and $\mathcal{H}^2$.

Once a $\delta$-function has been accepted as a possible choice of $f$, there arises the general question: Which class of inhomogeneous terms can be permitted? More precisely, which space of data matches the solution space $\mathcal{H}^1_E$? Roughly speaking, as long as $I(v)$ remains finite for all $v$ in $\mathcal{H}^1_E$, the minimization can proceed. This rule accepts $\delta$-functions and their linear combinations but not their derivatives. For example, the dipole function $f = \delta'(x_0)$ would give

$$\int_0^\pi fv = -\int_0^\pi v'\delta = -v'(x_0),$$

and this may not be finite; $v'$ can have an unbounded peak at $x_0$ and still have finite energy. The dipole is too singular to be allowed.
The data $f$ which are now accepted come from the space denoted by $\mathcal{H}^{-1}$; $f$'s derivative of order $-1$, that is, its indefinite integral, is in $\mathcal{H}^0$. The second-order operator $L$ takes the space $\mathcal{H}^1_E$ into $\mathcal{H}^{-1}$, just as it took $\mathcal{H}^2_B$ into $\mathcal{H}^0$. A suitable norm in $\mathcal{H}^{-1}$ is

$$\|f\|_{-1} = \max_{v \in \mathcal{H}^1_E} \frac{\left|\int f(x)v(x)\,dx\right|}{\|v\|_1}. \tag{12}$$
There is, however, a peculiar result if $f$ happens to be a $\delta$-function concentrated at the origin, namely that $\int fv = 0$ for every $v$ in the admissible space $\mathcal{H}^1_E$. Such a phenomenon means that the variational principle will treat this $f$ as zero, and the solution will be $u = 0$. (This is just the ramp function of Fig. 1.3, with $x_0 = 0$.) This means that the solution space $\mathcal{H}^1_E$ matches the data space $\mathcal{H}^{-1}$ only under the following proviso: There is to be no distinction between data $f_1$ and $f_2$, if they differ by a multiple of the $\delta$-function at the origin.
Finally, the solution should depend continuously on $f$, using the solution-space norm for $u$ and the data-space norm for $f$. The proof depends on the vanishing of the first variation for every $v$ in $\mathcal{H}^1_E$, in particular for $v = u$:

$$\int_0^\pi \left[p(u')^2 + qu^2\right] dx = \int_0^\pi fu\,dx.$$

The right side, from the definition of the data norm $\|f\|_{-1}$, is bounded by $\|f\|_{-1}\|u\|_1$. The left side is obviously larger than $p_{\min}\|u'\|_0^2$, and this is easily shown to be larger than $\sigma\|u\|_1^2$ for some positive $\sigma$. Therefore,

$$\sigma\|u\|_1^2 \le \|f\|_{-1}\|u\|_1 \quad\text{or}\quad \|u\|_1 \le \frac{1}{\sigma}\|f\|_{-1}.$$

This expresses the continuous dependence of $u$ on $f$.

1.4. FINITE DIFFERENCE APPROXIMATIONS

It is almost an article of faith in numerical analysis that whatever can be solved abstractly can also be solved by concrete numerical computations. "Every continuous mapping admits a convergent series of discrete approximations." The previous sections established two continuous mappings from $f$ to $u$, one for the differential equation ($\|u\|_2 \le C\|f\|_0$) and the other for its variational equivalent ($\|u\|_1 \le \sigma^{-1}\|f\|_{-1}$). Therefore, both these problems should be ripe for numerical solution.
We begin with the differential equation $Lu = f$ and replace the derivatives by difference quotients. The result is a finite linear system $L^hU^h = f^h$, a discrete operational form. There are two key steps in the theoretical analysis of this difference equation:

1. To compute the local truncation error, or discretization error, by means of a Taylor series expansion.
2. To establish that the system is globally stable, in other words, that $U^h$ depends continuously on $f^h$ as the mesh size $h$ approaches zero.

Together, these two steps establish the rate of convergence of $U^h$ to the true solution $u$ as $h \to 0$. The point of our discussion is really to be able to contrast the convergence theory for difference equations with the techniques used in the next section, and throughout the rest of the book, to prove convergence in the variational problem.† It is remarkable how differently the two steps above would be approached in the variational framework: the local truncation error is abandoned in favor of verifying the approximation properties (or completeness) of the trial functions, and stability needs no special proof; it is automatically present for finite elements.
>rm for f The .proof depends on the As usual, the interval [0, n] is divided ~nto equal pieces oflength h = n/N
ry v in 3C1, in particular for v u: by the points x 1 = ih, i = O, 1, ... , N. Then the deriva ti ves in the equation
-(pu')' + qu f are removed in favor of centered difference quotients:
J: fu. h/2)
1e data norm 11 f 11-P is bounded by
trger than Pm 1nl 1u' 11~, and this is easily The result is a discrete equation -!J."(pll."U) + qU J, which we require
1e positive u. Therefore, to hold at the interior mesh points x 1 :
l
r llullt <-11/11-t·
q hl2 [ -p( X¡+ ~ )(Uf+t - Uf)+ P( X;- ~)(Uf- Uf-t)J
(13)
+ q(xJU7 = f(x¡).
1ce ofu onf
Since this is a second-order equafion, it requires a boundary condition at
each end of the interval. At the left end, U~ O is the obvious choi~. At the
~TIONS
tThis section is a digression from our main theme, but the finite element and finite
nerical analysis that whatever can be difference methods are so closely relatedthat we need to be in a position to compare them.
y concrete numerical computations. The reader may discern our own preference and, if he shares it, skip this section.
SEC. 1.4. FIJ'I
18 AN INTRODUCTION TO THE THEORY CHAP. 1
other end,' there is no unique difference rep]acement for u'(n) =O, and we operator,
shall consider two alternatives: a one-sided difference
U7v- Uj._ 1
(l4a)
h
o
and a certered difference
Again, consistency with the true conditi
(14b) u~+l u~-~_ 0 term. The centered difference is of cou1
2h - .
In the first case the difference equation (13) holds for $0 < i < N$, which together with the two boundary conditions produces $N + 1$ equations in $N + 1$ unknowns. In the second case the difference equation applies also at $x_N = \pi$, to compensate for the extra unknown $U^h_{N+1}$. These boundary conditions are easy to compare when $p = 1$ and $q = 0$; after eliminating the unknowns at the extreme ends, the difference equations are, respectively,

$$L^hU^h = \frac{1}{h^2}\begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{pmatrix}\begin{pmatrix} U_1 \\ \vdots \\ \vdots \\ U_{N-1} \end{pmatrix} = \begin{pmatrix} f_1 \\ \vdots \\ \vdots \\ f_{N-1} \end{pmatrix} \tag{15a}$$

and

$$L^hU^h = \frac{1}{h^2}\begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -2 & 2 \end{pmatrix}\begin{pmatrix} U_1 \\ \vdots \\ \vdots \\ U_N \end{pmatrix} = \begin{pmatrix} f_1 \\ \vdots \\ \vdots \\ f_N \end{pmatrix}. \tag{15b}$$
To analyze the difference equation with variable $p$ and $q$, we consider the local truncation error $\tau_h(x)$ which arises when the true solution $u$ is substituted for $U^h$:

$$-\Delta^h(p\,\Delta^hu) + qu = f + \tau_h.$$

This is a completely formal computation, in which $u(x_i \pm h)$ and $p(x_i \pm h/2)$ are expanded in a Taylor series around the central point $x_i$. Because $u$ satisfies the differential equation, the zero-order term vanishes; this cancellation expresses the consistency of the difference and differential equations. The terms which remain are truncation error:

$$\tau_h = -\frac{h^2}{24}\left[(pu')''' + (pu''')'\right] + O(h^4).$$

The same process applies at the boundaries. For the one-sided difference operator,

$$\frac{u(\pi) - u(\pi - h)}{h} = u'(\pi) - \frac{h}{2}u''(\pi) + \cdots = -\frac{h}{2}u''(\pi) + O(h^2).$$

Again, consistency with the true condition $u'(\pi) = 0$ cancelled the zero-order term. The centered difference is of course more accurate:

$$\frac{u(\pi + h) - u(\pi - h)}{2h} = \frac{h^2}{6}u'''(\pi) + O(h^4).$$
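The interior truncation error formula can be checked numerically. The following is only a sketch under stated assumptions: the coefficient $p(x) = 1 + x$, the choice $q = 0$, the test solution $u(x) = \sin(x/2)$, the sample point $x = 1$, and every function name are illustrative choices, not taken from the text. It substitutes $u$ into the difference operator of (13) and compares the residual with $-(h^2/24)[(pu')''' + (pu''')']$.

```python
import math

def p(x):
    return 1.0 + x  # assumed variable coefficient (illustration only)

def u(x):
    return math.sin(x / 2)  # assumed smooth test solution

def f(x):
    # f = -(p u')' = -(p' u' + p u''), with p' = 1 and q = 0
    return -(0.5 * math.cos(x / 2) - (1.0 + x) * 0.25 * math.sin(x / 2))

def truncation_error(x, h):
    """Residual left when u is substituted into the difference equation (13)."""
    flux_right = p(x + h / 2) * (u(x + h) - u(x))
    flux_left = p(x - h / 2) * (u(x) - u(x - h))
    return (-flux_right + flux_left) / h**2 - f(x)

def predicted_leading_term(x, h):
    """-(h^2/24) [(p u')''' + (p u''')'] for p = 1 + x and u = sin(x/2)."""
    u3 = -math.cos(x / 2) / 8   # u'''
    u4 = math.sin(x / 2) / 16   # u''''
    # (p u')''' = 3 u''' + p u''''  and  (p u''')' = u''' + p u''''  when p'' = 0
    bracket = 4.0 * u3 + 2.0 * (1.0 + x) * u4
    return -(h**2 / 24.0) * bracket

x0 = 1.0
for h in (0.08, 0.04, 0.02):
    ratio = truncation_error(x0, h) / predicted_leading_term(x0, h)
    print(f"h = {h:4.2f}   tau_h / predicted leading term = {ratio:.5f}")
```

The printed ratios should approach 1 as $h$ decreases, since the neglected remainder is $O(h^4)$.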
If the difference equation and boundary condition are treated together, rather than separately, the truncation errors look quite different. We take as an example the matrices $L^h$ displayed above, in which the boundary conditions have been used to eliminate the last unknown. Expanding the final row by Taylor series, the one-sided and centered conditions yield, respectively,

$$\frac{u(\pi - h) - u(\pi - 2h)}{h^2} - f(\pi - h) = -\frac{1}{2}u''(\pi) + O(h),$$

$$\frac{2u(\pi) - 2u(\pi - h)}{h^2} - f(\pi) = \frac{h}{3}u'''(\pi) + O(h^2).$$

In this form, it would appear that the overall errors are $O(1)$ and $O(h)$, which is completely wrong.
For an estimate of the error $E^h = U^h - u$, we look at the difference equation which it satisfies:

$$-\Delta^h(p\,\Delta^hE^h) + qE^h = -\tau_h.$$

With the centered difference at $x = \pi$, the boundary conditions in this error equation are

$$E^h_0 = 0, \qquad \frac{E^h_{N+1} - E^h_{N-1}}{2h} = -\frac{h^2}{6}u'''(\pi) + O(h^4).$$

Notice that this difference equation is analogous to the original differential problem, except that the inhomogeneous terms now come from the local truncation error. We therefore expect the leading term in $E^h$, say $h^2e_2$, to arise from the terms which are of order $h^2$ in the local error:

$$-(pe_2')' + qe_2 = \frac{1}{24}\left[(pu')''' + (pu''')'\right],$$

$$e_2(0) = 0, \qquad e_2'(\pi) = -\frac{1}{6}u'''(\pi).$$
This solution $e_2(x)$ is the principal error function. To compute the next term in the error, we substitute $u + h^2e_2$ into the difference problem. This yields a truncation error which starts at $h^4$, and the coefficient of this term is the right-hand side in the equation for $e_4$.

In short, we may recursively determine an expansion

$$U^h = \sum_n h^ne_n, \qquad e_0 = u. \tag{16}$$

The calculation of the error terms $e_n$ is entirely mechanical; it has to be stopped only when there is a break in the smoothness of the solution or when the boundary conditions no longer permit such an expansion. One virtue of (16) is that the error varies with $x$ in the proper way; we have more than just an unrealistic maximum bound over the whole interval. Furthermore, this expansion justifies Richardson's extrapolation to $h = 0$; one computes with two or more choices of $h$, and chooses a combination of the results so as to cancel the leading terms in the expansion. With the centered boundary condition in our example, the linear term $he_1$ vanishes identically, and the combination of $U^h$ and $U^{2h}$ which increases the accuracy from second to fourth order is

$$\frac{4U^h - U^{2h}}{3} = u + O(h^4).$$

This extrapolation technique has been discussed and verified numerically any number of times but has not yet been widely adopted in practice. The accuracy of the boundary conditions is its most severe limitation, particularly in more dimensions when the boundary of the region intersects the mesh in a completely erratic way.
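A small experiment illustrates the extrapolation. The sketch below is an illustration only: it assumes $p = 1$, $q = 0$, and the test solution $u(x) = \sin(x/2)$ (so that $f = \tfrac14\sin(x/2)$), uses the centered closure whose last row is $(-2, 2)/h^2$, and all function names are hypothetical. It solves on meshes $2h$ and $h$ and forms $(4U^h - U^{2h})/3$ at the common nodes.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def solve_centered(N):
    """Solve -u'' = sin(x/2)/4 on (0, pi), u(0) = 0, centered closure at x = pi."""
    h = math.pi / N
    lower = [-1.0 / h**2] * N
    diag = [2.0 / h**2] * N
    upper = [-1.0 / h**2] * N
    lower[-1] = -2.0 / h**2  # last row after eliminating U_{N+1} = U_{N-1}
    rhs = [math.sin((i + 1) * h / 2) / 4 for i in range(N)]
    return [0.0] + solve_tridiagonal(lower, diag, upper, rhs)  # U_0, ..., U_N

N = 16
coarse = solve_centered(N)      # mesh size 2h
fine = solve_centered(2 * N)    # mesh size h
exact = [math.sin(i * math.pi / N / 2) for i in range(N + 1)]

plain_error = max(abs(fine[2 * i] - exact[i]) for i in range(N + 1))
extrapolated = [(4 * fine[2 * i] - coarse[i]) / 3 for i in range(N + 1)]
extrap_error = max(abs(extrapolated[i] - exact[i]) for i in range(N + 1))

print(f"error of U^h at the coarse nodes : {plain_error:.3e}")
print(f"error after extrapolation        : {extrap_error:.3e}")
```

For this smooth example the extrapolated values should be several orders of magnitude more accurate than $U^h$ itself; with a less accurate boundary closure the gain would be smaller.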
Applying the same ideas to the one-sided boundary condition, the first term $he_1$ arises from the $O(h)$ truncation error at the boundary:

$$-(pe_1')' + qe_1 = 0,$$

$$e_1(0) = 0, \qquad e_1'(\pi) = \frac{1}{2}u''(\pi).$$

Naturally the centered difference is to be preferred.
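The difference in accuracy between the two closures is easy to observe. The sketch below rests on assumptions made only for illustration: $p = 1$, $q = 0$, the test solution $u(x) = \sin(x/2)$ with $f = \tfrac14\sin(x/2)$, mesh sizes $N = 64$ and $128$, and hypothetical function names. It assembles the tridiagonal systems of (15a) and (15b) and measures how the maximum nodal error changes when $h$ is halved.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def max_error(N, closure):
    """Maximum nodal error for -u'' = f, u(0) = 0, u'(pi) = 0, u = sin(x/2)."""
    h = math.pi / N
    n = N - 1 if closure == "one-sided" else N  # unknowns U_1..U_{N-1} or U_1..U_N
    lower = [-1.0 / h**2] * n
    diag = [2.0 / h**2] * n
    upper = [-1.0 / h**2] * n
    if closure == "one-sided":
        diag[-1] = 1.0 / h**2    # last row (-1, 1) of (15a)
    else:
        lower[-1] = -2.0 / h**2  # last row (-2, 2) of (15b)
    rhs = [math.sin((i + 1) * h / 2) / 4 for i in range(n)]
    U = solve_tridiagonal(lower, diag, upper, rhs)
    return max(abs(U[i] - math.sin((i + 1) * h / 2)) for i in range(n))

ratio_one_sided = max_error(64, "one-sided") / max_error(128, "one-sided")
ratio_centered = max_error(64, "centered") / max_error(128, "centered")
print(f"one-sided closure: error ratio when h is halved = {ratio_one_sided:.3f}")
print(f"centered closure : error ratio when h is halved = {ratio_centered:.3f}")
```

A ratio near 2 indicates first-order accuracy and a ratio near 4 indicates second-order accuracy, which is the behavior the expansions above predict for the one-sided and centered conditions.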

This first-order accuracy appears also if the difference equation is linked with a variational problem. As always, the positive-definite symmetric system $L^hU^h = f^h$ is the Euler equation for the vanishing of the first derivative, at the minimizing $U^h$, of

$$I^h(V^h) = (V^h)^TL^hV^h - 2(V^h)^Tf^h.$$

With the one-sided boundary condition, this functional is

$$I^h = \sum_{i=1}^{N-1}\left[p_{i-1/2}\left(\frac{V^h_i - V^h_{i-1}}{h}\right)^2 + q_i(V^h_i)^2 - 2f_iV^h_i\right].$$

Obviously $I^h$ is a finite difference analogue of the true quadratic functional

$$I(v) = \int_0^\pi \left[p(v')^2 + qv^2 - 2fv\right] dx.$$

And, almost as obviously, $I^h$ is accurate only to first order. Instead of using the centered trapezoidal rule, which is of second order, the integral $I$ has been replaced by sums biased to one side.

This hints at a technique which lies in between finite differences and finite elements: the integral $I(v)$ is approximated by a sum $I^h$ involving difference quotients, and then minimized. It is an easy way to derive approximations of a low order of accuracy, and deserves more analysis than it has received. Its advantages tend to disappear, however, when higher accuracy is demanded.
Up to this point our analysis of the difference equation has been completely formal, leading to the error expansion $\sum h^ne_n$. There is now a second step to be taken: to prove that $U^h$ converges to $u$ and that the expansion is asymptotically valid. [The expansion itself cannot converge for a finite $h$, since this would require that $U^h$ is an analytic function of $h$, and that the problem is well posed even for complex $h$; all one hopes to prove is that $U^h - \sum_{n=0}^{M} h^ne_n = O(h^{M+1})$ as $h$ decreases to zero.] This second step demands the same kind of estimate as in the differential problem: the solution $U^h$ must depend continuously on the data $f^h$.
We return to the difference equation and ask the first question: Does there exist a unique solution $U^h$ for every $f^h$? Equivalently, is the matrix $L^h$ nonsingular? One of the most effective proofs of the invertibility of $L^h$ leads at the same time to a discrete maximum principle, as follows.

Suppose $L^hU^h = 0$. Let the largest component $|U^h_i|$ be the $n$th, and choose the sign of $U^h$ to make $U^h_n > 0$. Then the difference equation at $x_n$ is

$$\frac{p_{n+1/2}\left(U^h_n - U^h_{n+1}\right)}{h^2} + \frac{p_{n-1/2}\left(U^h_n - U^h_{n-1}\right)}{h^2} + q_nU^h_n = 0.$$

Since each term is nonnegative, all three must vanish. If the zero-order coefficient $q_n$ is positive, it follows immediately that $U^h_n = 0$. In any case, the other terms lead to $U^h_{n+1} = U^h_n = U^h_{n-1}$. Thus these components are also maximal, and the whole argument can be repeated with $n - 1$ or $n + 1$ in place of $n$. Ultimately, after enough repetitions, it follows that $U^h_n = U^h_0 = 0$. Thus $L^hU^h = 0$ holds only if $U^h$ is the zero vector, and $L^h$ must be invertible.

Allowing inhomogeneous boundary conditions, the same argument would show that no component $U^h_i$ can be larger than both $U^h_0$ and $U^h_N$. This is a discrete maximum principle, from which it follows that the discrete Green's function $(L^h)^{-1}$ is a nonnegative matrix.
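The nonnegativity of the discrete Green's function can be observed directly. In the sketch below, the coefficients $p(x) = 1 + x$ and $q(x) = x$, the mesh size $N = 12$, the one-sided closure at $x = \pi$, and the function names are all assumptions made for illustration. Each column of $(L^h)^{-1}$ is obtained by solving the tridiagonal system with a unit vector as data.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def discrete_greens_matrix(N, p, q):
    """Columns of the inverse of L^h built from (13) with U_0 = 0 and U_N = U_{N-1}."""
    h = math.pi / N
    n = N - 1
    lower, diag, upper = [], [], []
    for i in range(1, N):
        x = i * h
        p_left = p(x - h / 2)
        # With the one-sided closure the flux through x_{N-1/2} drops out.
        p_right = p(x + h / 2) if i < N - 1 else 0.0
        lower.append(-p_left / h**2)
        upper.append(-p_right / h**2)
        diag.append((p_left + p_right) / h**2 + q(x))
    columns = []
    for j in range(n):
        unit = [0.0] * n
        unit[j] = 1.0
        columns.append(solve_tridiagonal(lower, diag, upper, unit))
    return columns  # columns[j][i] is entry (i, j) of the inverse

G = discrete_greens_matrix(12, lambda x: 1.0 + x, lambda x: x)
smallest_entry = min(min(column) for column in G)
print(f"smallest entry of the inverse: {smallest_entry:.4e}")
```

Every entry should come out positive, consistent with the maximum principle; this is a numerical observation for one choice of coefficients, not a proof.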
By the way, a similar proof leads to Gerschgorin's theorem in matrix theory: Every eigenvalue $\lambda$ of a matrix $A$ lies in at least one of the circles

$$|\lambda - a_{ii}| \le \sum_{j \ne i}|a_{ij}|.$$

Choosing $A$ to be either of the matrices $h^2L^h$ displayed in (15), all eigenvalues satisfy $|\lambda - 2| \le 2$. Thus the theorem does not rule out the possibility that $\lambda = 0$ is an eigenvalue, that is, that $L^h$ is singular; it was at this point that the repetitions of the argument at $i = n - 1, n - 2, \ldots, 1$ were needed.

Gerschgorin's theorem becomes perfectly useless in a fourth-order problem. The leading coefficients in the simplest case are $A_{ii} = 6$, $A_{i,i\pm1} = -4$, and $A_{i,i\pm2} = 1$, and the Gerschgorin circles are $|\lambda - 6| \le 10$. Since $\lambda = 0$ lies inside these circles, the Gerschgorin argument fails to prove even that $A$ is semidefinite. This difficulty simply reflects the absence of a maximum principle for fourth-order problems. If we compare $u'' = 0$ with $u^{(iv)} = 0$, for example, it is obvious that straight lines attain their extrema at the ends of the interval, whereas cubics need not.
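The Gerschgorin circles for these two matrices can be listed explicitly. The sketch below is illustrative: the matrix size 7, the truncated boundary rows of the fourth-order matrix, and the function names are assumptions. It computes the center and radius of each circle and the leftmost real point covered by their union.

```python
def gerschgorin_discs(A):
    """Return (center, radius) for each row: center a_ii, radius sum of |a_ij|, j != i."""
    return [
        (row[i], sum(abs(a) for j, a in enumerate(row) if j != i))
        for i, row in enumerate(A)
    ]

def second_difference_matrix(n):
    """h^2 L^h of (15b): tridiagonal (-1, 2, -1) whose last row is (..., -2, 2)."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    A[n - 1][n - 2] = -2.0
    return A

def fourth_difference_matrix(n):
    """Pentadiagonal matrix with stencil (1, -4, 6, -4, 1); boundary rows are truncated."""
    stencil = {-2: 1.0, -1: -4.0, 0: 6.0, 1: -4.0, 2: 1.0}
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for offset, value in stencil.items():
            if 0 <= i + offset < n:
                A[i][i + offset] = value
    return A

def leftmost_point(A):
    """Smallest real number contained in the union of the circles."""
    return min(center - radius for center, radius in gerschgorin_discs(A))

second_order_left = leftmost_point(second_difference_matrix(7))
fourth_order_left = leftmost_point(fourth_difference_matrix(7))
print("second-order circles reach down to", second_order_left)
print("fourth-order circles reach down to", fourth_order_left)
```

In the second-order case the circles just touch the origin, so the theorem alone cannot exclude a zero eigenvalue; in the fourth-order case they extend into the negative axis, so the theorem cannot even establish semidefiniteness.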
The maximum principle, when it holds, can be made to yield a simple proof of convergence. We prefer instead to preserve a close analogy between differential and difference equations, by discussing the discrete inequality which corresponds to $\|u\|_2 \le C\|f\|_0$. The statement of such an inequality requires, first of all, a redefinition of the norm to make it apply to grid functions. An obvious choice for the discrete energy is

$$\|f^h\|_0^2 = h\sum_i |f_i|^2.$$

For the square of the 2-norm, we introduce the energy in the function and its first and second forward-difference quotients:

$$\|f^h\|_2^2 = \|f^h\|_0^2 + \|\Delta_+f^h\|_0^2 + \|\Delta_+^2f^h\|_0^2.$$

These sums extend only over the grid points at which they are defined; the forward-difference quotient $\Delta_+f_i = (f_{i+1} - f_i)/h$ makes no sense at the last grid point.

There is one other new feature to be introduced, inhomogeneous boundary conditions. For the two-point boundary-value problem, the continuous dependence of $u$ on $f$ and on the boundary data is expressed by

$$\|u\|_2 \le C\left(\|f\|_0 + |u(0)| + |u'(\pi)|\right). \tag{17}$$

For the finite difference equation, in which $\Delta_\pi$ denotes whichever boundary operator is applied at the right-hand end, the corresponding inequality is

$$\|U^h\|_2 \le C\left(\|f^h\|_0 + |U^h_0| + |\Delta_\pi U^h|\right). \tag{18}$$
For each choice of difference equation, this is the basic estimate to be proved. It is not an automatic consequence of (17), although it implies (17): The continuous inequality is necessary but not sufficient for the discrete inequality. In this sense the theory of difference equations is the more difficult; there is an enormous variety of possible difference schemes, and each requires a more-or-less new proof of (18). As in the continuous problem, we shall accept the inequality as true and concentrate on its implications; the technique of proof in one-dimensional problems is summarized by Kreiss [K7]. The inequality (18) asserts the stability of the difference equation: The discrete solution $U^h$ depends continuously on the discrete data $f^h$, uniformly in $h$.

To establish that $U^h$ converges to $u$, we need the most celebrated theorem of numerical analysis: Consistency and stability imply convergence. This theorem is proved in two steps:

1. The error $E^h$ satisfies the same difference equation (13) as $U^h$, and therefore by stability it depends continuously on its data, which is nothing but the local truncation error:

$$\|E^h\|_2 \le C\left(\|\tau_h\|_0 + |E^h(0)| + |\Delta_\pi E^h|\right).$$

2. By consistency, which was reflected in the Taylor expansions of $\tau_h$, $E^h(0)$, and $\Delta_\pi E^h$, the right side approaches zero as $h \to 0$. Therefore, convergence is proved.
We call attention to one more point. Because $\tau_h$ involves the fourth derivative of $u$, the error estimate in the centered difference case is effectively $\|E^h\|_2 \le Ch^2\|u\|_4$. With a little extra patience this can be reduced to

$$\|E^h\|_0 \le C'h^2\|u\|_2.$$

Thus the convergence is of order $h^2$ whenever $u$ is in $\mathcal{H}^2$, or in other words whenever $f$ is in $\mathcal{H}^0$. This is the same rate of convergence as in the simplest finite element method given in Section 1.6, but the proof there is very much easier.

Altogether, it appears that a satisfactory theory for one-dimensional difference equations is possible but not trivial. In multidimensional problems, that is, for partial differential equations, almost every convergence proof in the numerical analysis literature has depended on a maximum principle. Without this principle there have been a few ad hoc arguments, adequate for special problems, but the general theory now being developed is so delicate that it looks extremely difficult to apply. The problem is that a single differential equation allows an enormous variety of difference approximations, above all at a curved boundary. In contrast, the variational methods are governed by stricter rules, and it is just these restrictions which permit a more complete theory.

We shall now concentrate exclusively on this theory: the construction and convergence of finite elements.

1.5. THE RITZ METHOD AND LINEAR ELEMENTS

In this section we begin on the finite element method itself. The general framework is already established, that there is a choice of approximating either the individual terms in the differential equation or the underlying variational principle. The finite element method chooses the latter. At the same time, the discrete equations which arise in the variational approximation are effectively difference equations.

In variational form, the problem is to minimize the quadratic functional

$$I(v) = \int_0^\pi \left[p(x)(v'(x))^2 + q(x)(v(x))^2 - 2f(x)v(x)\right] dx$$

over the infinite-dimensional space $\mathcal{H}^1_E$. The Ritz method is to replace $\mathcal{H}^1_E$ in this variational problem by a finite-dimensional subspace $S$, or more precisely by a sequence of finite-dimensional subspaces $S^h$ contained in $\mathcal{H}^1_E$. The elements $v^h$ of $S^h$ are called trial functions. Because they belong to $\mathcal{H}^1_E$, they satisfy the essential boundary condition $v^h(0) = 0$. Over each space $S^h$ the minimization of $I$ leads to the solution of a system of simultaneous linear equations; the number of equations coincides with the dimension of $S^h$. Then the Ritz approximation is the function $u^h$ which minimizes $I$ over the space $S^h$:

$$I(u^h) \le I(v^h) \quad \text{for all } v^h \text{ in } S^h.$$

The fundamental problems are (1) to determine $u^h$, and (2) to estimate the distance between $u^h$ and the true solution $u$. This section is devoted to problem 1 and the next section to problem 2.

We begin with two examples which illustrate the classical Ritz method. The subspaces $S^h$ will not have the special form associated with the finite element method; instead, each subspace in the sequence will contain the preceding one.
Suppose in the first example that the coefficients $p$ and $q$ are constant, and let $S^h$ be the subspace spanned by the first $N = 1/h$ eigenfunctions of the continuous problem. The trial functions, the members of $S^h$, are then the linear combinations

$$v^h(x) = \sum_{j=1}^{N} q_j\varphi_j(x) = \sum_{j=1}^{N} q_j\sqrt{\frac{2}{\pi}}\,\sin\left(j - \tfrac{1}{2}\right)x.$$
Clearly these functions lie in $\mathcal{H}^1_E$; they satisfy even the natural boundary condition $(v^h)'(\pi) = 0$, which is not necessary.

The weights $q_j$ are to be determined so as to minimize $I$. From the orthogonality property of the eigenfunctions, the integral has the special form

$$I(v^h) = \sum_{j=1}^{N}\left[q_j^2\lambda_j - 2q_j\int_0^\pi f\varphi_j\,dx\right],$$

where $\lambda_j = p\left(j - \tfrac{1}{2}\right)^2 + q$. The minimizing equation $\partial I/\partial q_j = 0$ leads directly to the optimal values

$$Q_j = \frac{1}{\lambda_j}\int_0^\pi f\varphi_j\,dx, \qquad j = 1, \ldots, N.$$

Therefore, the Ritz approximation is

$$u^h = \sum_{j=1}^{N} \frac{(f, \varphi_j)}{\lambda_j}\,\varphi_j.$$

In this case the system of linear equations $\partial I/\partial q_j = 0$ determining the optimal coordinates $Q_j$ was trivial to solve; its coefficient matrix is diagonal, because the eigenfunctions are orthogonal. Furthermore, $u^h$ is the projection of the true solution

$$u = \sum_{j=1}^{\infty} \frac{(f, \varphi_j)}{\lambda_j}\,\varphi_j$$

onto the first $N$ eigenfunctions.

In this example it is easy to compute how much is gained at each step in the Ritz sequence of approximations. By including the $N$th trial function $\varphi_N$, $u^h$ is improved by the term $\lambda_N^{-1}(f, \varphi_N)\varphi_N$. For smooth $f$ these corrections approach zero very rapidly as $N \to \infty$, and even for an arbitrary $f$ the correction is at most $\lambda_N^{-1}\|f\|_0$. In a more realistic problem the exact eigenfunctions will not be known, but if the geometry remains simple, then the use of sines and cosines as trial functions is still of great importance. (Orszag and others have shown how the fast Fourier transform keeps the computational effort within reasonable bounds.) For more complicated geometries we prefer finite elements.
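The diagonal Ritz system of this first example can be carried out in a few lines. The following sketch makes several assumptions for illustration only: $p = 1$, $q = 0$, the load $f = 1$ (whose exact solution is $u = \pi x - x^2/2$), a midpoint quadrature rule for the inner products, and hypothetical function names.

```python
import math

def phi(j, x):
    """Normalized eigenfunction sqrt(2/pi) * sin((j - 1/2) x), j = 1, 2, ..."""
    return math.sqrt(2.0 / math.pi) * math.sin((j - 0.5) * x)

def inner_product(f, j, panels=2000):
    """Midpoint-rule approximation of the integral of f * phi_j over (0, pi)."""
    h = math.pi / panels
    return h * sum(f((k + 0.5) * h) * phi(j, (k + 0.5) * h) for k in range(panels))

def ritz_coefficients(f, N, p=1.0, q=0.0):
    """Optimal weights Q_j = (f, phi_j) / lambda_j with lambda_j = p (j - 1/2)^2 + q."""
    return [inner_product(f, j) / (p * (j - 0.5) ** 2 + q) for j in range(1, N + 1)]

def ritz_approximation(Q, x):
    return sum(Qj * phi(j, x) for j, Qj in enumerate(Q, start=1))

def max_ritz_error(N):
    """Maximum error on a sample grid for f = 1, where u = pi x - x^2 / 2."""
    Q = ritz_coefficients(lambda x: 1.0, N)
    points = [k * math.pi / 50 for k in range(51)]
    return max(abs(ritz_approximation(Q, x) - (math.pi * x - x * x / 2)) for x in points)

for N in (2, 4, 8):
    print(f"N = {N}: maximum error of the Ritz approximation = {max_ritz_error(N):.2e}")
```

Because each new eigenfunction only adds a term, the coefficients already computed never change as $N$ grows; that is the practical meaning of the diagonal coefficient matrix.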
In the second example we keep $p$ and $q$ constant, and choose trial functions which are polynomials:

$$v^h(x) = \sum_{j=1}^{N} q_jx^j.$$

Again these satisfy the essential boundary condition $v^h(0) = 0$, but this time not the natural boundary condition. In this case

$$I(v^h) = \int_0^\pi \left[p\Big(\sum jq_jx^{j-1}\Big)^2 + q\Big(\sum q_jx^j\Big)^2 - 2f\sum q_jx^j\right] dx.$$

Differentiating $I$ with respect to the parameters $q_j$, we find a system of $N$ linear equations for the optimal parameters $Q_1, \ldots, Q_N$:

$$KQ = F.$$

The unknown vector is $Q = (Q_1, \ldots, Q_N)$, the components of $F$ are the "moments" $F_j = \int fx^j\,dx$, and the coefficient matrix $K$ is a mess:

$$K_{ij} = \int_0^\pi \left[p(ix^{i-1})(jx^{j-1}) + qx^ix^j\right] dx = \frac{p\,ij\,\pi^{i+j-1}}{i + j - 1} + \frac{q\,\pi^{i+j+1}}{i + j + 1}.$$

The matrix whose entries are $(i + j + 1)^{-1}$ is the notorious Hilbert matrix, with which computations are virtually impossible. The eigenvalues of this matrix are so out of scale, and the matrix is so ill-conditioned, that normally even for $N = 6$ roundoff error destroys all information in the data $f$. With the other terms of $K$ included, the situation is even worse. The difficulty is that the powers $x^j$ are nearly linearly dependent; all of them have essentially all their weight in the neighborhood of $x = \pi$. Numerical stability will depend on choosing a more strongly independent basis for the subspace.
The remedy is to cope with the excessive sensitivity of the problem befare
introducing the data, by orthogonalizing the original trial functions rp1
In most cases this means introducing Legendre or Chebyshev polynomials
xi. J(vh) = s: (((V")'
as a new basis for the polynomial space Sh. Such a basis is useful on an ínter- With vh :E qirp1,. this integral is
val, and remains so in more dimensions provided the geometry is very simple. q 1 , • • • , q N• and it can be computec
For a general domain, however, these orthogonal polynomials again become
unworkable.
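The sensitivity of the power basis can be seen without any roundoff at all. The following is a minimal sketch, not taken from the text: it builds the matrix with entries $(i+j+1)^{-1}$ in exact rational arithmetic and measures how much the last unknown moves when the last datum is perturbed. The function names and the choice of perturbation are assumptions made for illustration.

```python
from fractions import Fraction

def power_basis_matrix(n):
    """Exact matrix with entries 1/(i + j + 1), i, j = 1..n (hypothetical helper)."""
    return [[Fraction(1, i + j + 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def solve_exact(a, b):
    """Gaussian elimination in exact rational arithmetic, so no roundoff occurs."""
    n = len(b)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        for r in range(k + 1, n):
            m = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= m * a[k][c]
            b[r] -= m * b[k]
    x = [Fraction(0)] * n
    for k in reversed(range(n)):
        s = sum(a[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (b[k] - s) / a[k][k]
    return x

def data_sensitivity(n):
    """Change in the last unknown per unit change in the last datum."""
    a = power_basis_matrix(n)
    b = [sum(row) for row in a]        # data for which every unknown equals one
    delta = Fraction(1, 10**8)         # assumed size of the perturbation
    b_perturbed = b[:]
    b_perturbed[-1] += delta
    x = solve_exact(a, b)
    x_perturbed = solve_exact(a, b_perturbed)
    return (x_perturbed[-1] - x[-1]) / delta

for n in (2, 4, 6):
    print(n, float(data_sensitivity(n)))
```

Because the arithmetic is exact, the amplification printed here reflects only the intrinsic sensitivity of the matrix to its data; floating-point roundoff would add to it.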
We turn now to the construction of a finite element subspace $S^h$. The domain (in this case the interval $[0, \pi]$) is divided into pieces, and on each piece the trial functions $v^h$ are polynomials. Some degree of continuity is imposed at the boundaries between pieces, but ordinarily this continuity is no more than is required in order that

1. The functions $v^h$ will be admissible in the variational principle.

2. The quantities of physical interest, usually displacements, stresses, or moments, can be conveniently recovered from the approximate solution $u^h$.

It is very difficult to construct piecewise polynomials if too much continuity is required at the boundaries between the pieces.
In our example the admissible space is $\mathcal{H}^1$, whose members are continuous. This rules out piecewise constant functions. Therefore, the simplest choice for $S^h$ is the space of functions which are linear over each interval $[(j-1)h, jh]$, continuous at the nodes $x = jh$, and zero at $x = 0$. The derivative of such a function is piecewise constant and obviously has finite energy; thus $S^h$ is a subspace of $\mathcal{H}^1$. We shall refer to these trial functions as linear elements.

For $j = 1, \ldots, N$, let $\varphi_j^h$ be the function in $S^h$ which equals one at the particular node $x = jh$, and vanishes at all the others (Fig. 1.4). These roof functions constitute a basis for the subspace, since every member of $S^h$ can be written as a combination

$$v^h(x) = \sum_{j=1}^N q_j \varphi_j^h(x).$$

Notice the following important fact about the coefficient $q_j$: It coincides with the value of $v^h$ at the $j$th node $x = jh$. Since the coordinates $q_j$ are nothing but the nodal values of the function, the optimal coordinates $Q_j$ will have direct physical significance; they are the Ritz approximations to the displacement of the string at the nodes. Notice also that the $\varphi_j^h$ form a local basis, or patch basis, because each $\varphi_j^h$ is identically zero except in a region of diameter $2h$. In fact, it is obvious that $\varphi_{j-1}^h$ is orthogonal to $\varphi_{j+1}^h$, since whenever one of these functions doesn't vanish, the other does. Thus although the basis is not quite orthogonal, it is only adjacent elements which will be coupled.

[Fig. 1.4 Piecewise linear basis functions.]

With coefficients normalized to $p = q = 1$, the problem is to minimize

$$I(v^h) = \int_0^\pi \big[\,((v^h)')^2 + (v^h)^2 - 2fv^h\,\big]\,dx.$$

With $v^h = \sum q_j \varphi_j^h$, this integral is a quadratic function of the coordinates $q_1, \ldots, q_N$, and it can be computed over one subinterval at a time. On the
$j$th subinterval, the function $v^h$ goes linearly from $q_{j-1}$ to $q_j$ and $(v^h)' = (q_j - q_{j-1})/h$. (By convention $q_0 = 0$.) Therefore,

$$\int_{(j-1)h}^{jh} ((v^h)')^2\,dx = \frac{(q_j - q_{j-1})^2}{h}.$$

A slightly longer computation gives

$$\int_{(j-1)h}^{jh} (v^h)^2\,dx = \frac{h}{3}\,(q_j^2 + q_j q_{j-1} + q_{j-1}^2).$$

These terms correspond to a single piece of string, with linearly varying displacement. For the whole string, the second-degree term in $I(v^h)$ is the sum

$$\sum_{j=1}^N \Big[\frac{(q_j - q_{j-1})^2}{h} + \frac{h}{3}\,(q_j^2 + q_j q_{j-1} + q_{j-1}^2)\Big].$$

This is not a particularly convenient form for the result. We would prefer to have it in the matrix form $q^TKq$, in other words $(Kq, q)$, because it is the matrix $K$ which we shall eventually need. The reason is this: The expression $I(v^h)$ is quadratic in the parameters $q = (q_1, \ldots, q_N)$, of the form

$$I(v^h) = q^TKq - 2F^Tq.$$

The minimum of such an expression occurs (as we know from the matrix case, Section 1.3, where we set $\partial I/\partial q_m = 0$) at the vector $Q = (Q_1, \ldots, Q_N)$ determined by

$$KQ = F.$$

This is the system we shall have to solve, and therefore all we need to know is the matrix $K$ and the vector $F$.

The best plan is to find the contribution to $K$ from each "element," that is, each piece of the string. Therefore, we go back to

$$\int_{(j-1)h}^{jh} ((v^h)')^2\,dx = \frac{1}{h}\,(q_{j-1}^2 - 2q_{j-1}q_j + q_j^2)$$

and record the right side (the value of the integral) in the matrix form

$$(q_{j-1}\ \ q_j)\,\frac{1}{h}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} q_{j-1} \\ q_j \end{pmatrix} = (q_{j-1}\ \ q_j)\,k_1 \begin{pmatrix} q_{j-1} \\ q_j \end{pmatrix}.$$

The matrix $k_1$ is an element stiffness matrix. It represents a computation which needs to be done only once, since it is independent of the particular differential equation. Similarly, the calculation of the zero-order term $\int (v^h)^2\,dx$ over a
single element is recorded once and for all in terms of the element mass matrix:

$$\frac{h}{3}\,(q_{j-1}^2 + q_{j-1}q_j + q_j^2) = (q_{j-1}\ \ q_j)\,\frac{h}{6}\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} q_{j-1} \\ q_j \end{pmatrix} = (q_{j-1}\ \ q_j)\,k_0 \begin{pmatrix} q_{j-1} \\ q_j \end{pmatrix}.$$

Now the summation over the elements $j = 1, \ldots, N$ is replaced by an assembly of the global stiffness matrix $K$. This means that the element matrices are added together, after being placed into their proper positions in the global array. The matrix associated with $\int_0^\pi ((v^h)')^2\,dx$, recalling that the unknown $q_0$ is to be discarded because of the essential boundary condition, is

$$K_1 = \frac{1}{h}\begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{pmatrix}.$$

Again the relationship between the matrix and the integral is that

$$\int_0^\pi ((v^h)')^2\,dx = q^TK_1q.$$

The integral of the undifferentiated term $(v^h)^2$ is given by $q^TK_0q$, where the mass matrix $K_0$ (later denoted by $M$) is formed by the same assembling process:

$$K_0 = \frac{h}{6}\begin{pmatrix} 4 & 1 & & & \\ 1 & 4 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & 4 & 1 \\ & & & 1 & 2 \end{pmatrix}.$$
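The assembly process can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a routine from the text: the function name `assemble`, the dense list-of-lists storage, and the use of exact fractions for $h$ are all choices made here for clarity.

```python
from fractions import Fraction

def assemble(n, h):
    """Assemble the global stiffness K1 and mass K0 for n linear elements.

    The unknowns are q_1..q_n; q_0 = 0 is discarded because of the
    essential boundary condition at x = 0.
    """
    k1 = [[1 / h, -1 / h], [-1 / h, 1 / h]]      # element stiffness matrix
    k0 = [[h / 3, h / 6], [h / 6, h / 3]]        # element mass, (h/6)[[2,1],[1,2]]
    K1 = [[0] * n for _ in range(n)]
    K0 = [[0] * n for _ in range(n)]
    for j in range(1, n + 1):                    # element j spans [(j-1)h, jh]
        nodes = (j - 1, j)                       # global numbers of its two nodes
        for a, ga in enumerate(nodes):
            for b, gb in enumerate(nodes):
                if ga == 0 or gb == 0:           # row or column of the discarded q_0
                    continue
                K1[ga - 1][gb - 1] += k1[a][b]
                K0[ga - 1][gb - 1] += k0[a][b]
    return K1, K0

K1, K0 = assemble(4, Fraction(1, 4))
for row in K1:
    print([str(entry) for entry in row])
```

Each interior node receives a contribution from the two elements that meet there, which is why the interior diagonal entries are twice the corner entry of the element matrix, while the last node receives only one contribution.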
The required matrix $K$ is now the sum $K_1 + K_0$. These matrices need not be assembled all at one time and stored; instead the entries can be computed when they are required in the process of solving the final system, $KQ = F$.

It remains to compute the term $\int f v^h$, by which the inhomogeneous load $f$ enters the approximation. This integral is linear in the coordinates $q_j$:

$$\int_0^\pi f v^h\,dx = \sum_{j=1}^N q_j \int_0^\pi f \varphi_j^h\,dx = F^Tq,$$

if we define the load vector $F$ to have components

$$F_j = \int_0^\pi f \varphi_j^h\,dx.$$

In practice these numbers are computed, just like the stiffness and mass matrices, by integrating over one element at a time. Suppose that over the $j$th interval

$$\int_{(j-1)h}^{jh} f v^h\,dx = \alpha_j q_{j-1} + \beta_j q_j.$$

When these integrals are summed, the coefficient of a given $q_k$ is $F_k = \beta_k + \alpha_{k+1}$. One sees here in the most rudimentary form the bookkeeping which must be carried out by the computer. A given node $kh$ enters as the right endpoint of the $k$th subinterval, leading to a term $\beta_k q_k$ in the integral, and also as the left endpoint of the next subinterval, leading to $\alpha_{k+1} q_k$. The assembly subroutine must be aware that both these subintervals are incident on the $k$th node, and combine the results. (There seems to be no doubt that, starting from scratch, it takes longer to program finite elements than finite differences for a two-point boundary-value problem; again the real benefit is felt first in more than one dimension.)

For arbitrary $f$ these integrals cannot be computed exactly, and some numerical quadrature will be necessary. One possibility is to approximate $f$ by linear interpolation at the nodes. In other words, $f$ is replaced by its piecewise linear interpolate $f_I = \sum f_k \varphi_k^h(x)$, where $f_k$ is the value of $f$ at the node $x = kh$. Then the integral $\int f_I v^h$ involves the same inner products $\int \varphi_k^h \varphi_j^h$ which were computed earlier in forming the mass matrices $k_0$. Over the $j$th interval

$$\int_{(j-1)h}^{jh} f_I v^h\,dx = (q_{j-1}\ \ q_j)\,k_0 \begin{pmatrix} f_{j-1} \\ f_j \end{pmatrix},$$

since the only difference between this and the integration of $(v^h)^2$ is that one pair of coefficients $q_{j-1}$ and $q_j$ is replaced by the two nodal values of $f$. Again the computer will sum these results from $j = 1$ to $j = N$ by assembling the mass matrix:

$$\int_0^\pi f_I v^h\,dx = (f_0, f_1, \ldots, f_N)\,\tilde K_0\,(q_1, \ldots, q_N)^T.$$

$\tilde K_0$ differs from $K_0$ only in that the zeroth row has to be retained, since in general $f(0) \neq 0$.

This result approximates the true linear term $F^Tq$, that is, the one which would be obtained by exact integrations. We denote the approximation by $\tilde F^Tq$; the mass matrix $\tilde K_0$ gives the coefficients in the approximate load vector $\tilde F$:

$$\tilde F_j = \frac{h}{6}\,\big[f((j-1)h) + 4f(jh) + f((j+1)h)\big].$$

The difference between $\tilde F_j$ and the true value $F_j = \int f \varphi_j^h\,dx$ is not difficult to estimate. If $f$ is linear, then of course the two agree, since the interpolate $f_I$ coincides with $f$ itself. For a quadratic $f$, however, a discrepancy appears. If $f(x) = x^2$, then

$$F_j = \int_{(j-1)h}^{jh} x^2\Big(-j + 1 + \frac{x}{h}\Big)\,dx + \int_{jh}^{(j+1)h} x^2\Big(j + 1 - \frac{x}{h}\Big)\,dx = j^2h^3 + \frac{h^3}{6},$$

whereas

$$\tilde F_j = \frac{h}{6}\,\big[(j-1)^2h^2 + 4j^2h^2 + (j+1)^2h^2\big] = j^2h^3 + \frac{h^3}{3}.$$

Therefore, the error in numerical integration is $h^3/6 = h^3 f''(jh)/12$. For an arbitrary smooth $f$, this will be the leading term in $\tilde F_j - F_j$.

There does exist an integration formula, with the same simplicity as $\tilde F_j$, which gets also the quadratic (and even cubic) terms correct. It is given by $F_j = h(f_{j-1} + 10f_j + f_{j+1})/12$ and is known as Collatz's Mehrstellenverfahren [H3]. It can be derived by a quadratic interpolation of $f$ over the double interval $[(j-1)h, (j+1)h]$, followed by exact integration of $F_j = \int f_I \varphi_j^h\,dx$. This derivation is unnatural, however, for finite element programs. Instead of computing once and for all an integral over $[jh, (j+1)h]$, we have to go over each such interval twice, once to evaluate $F_j$ using the quadratic interpolate at the nodes $x_{j-1}, x_j, x_{j+1}$, and then again to evaluate $F_{j+1}$ using the interpolate at $x_j, x_{j+1}, x_{j+2}$. This is a typical instance in which the most efficient formula in a certain class is not found by the finite element assembly process; given complete freedom to find the best formula in each special case, finite differences can win. But the essential point for complicated problems is that the finite element formula is not far from optimal, and it is generated in a systematic and painless way by the computer.
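The comparison between the exact load, the load obtained from the linear interpolate, and the Collatz formula can be checked in exact arithmetic. The sketch below is illustrative only; the helper names are assumptions, and the exact value is obtained by integrating $x^2\varphi_j^h$ as a polynomial on each of the two subintervals that carry the roof function.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[k] * x**k)."""
    return sum(c * (b ** (k + 1) - a ** (k + 1)) / (k + 1)
               for k, c in enumerate(coeffs))

def exact_load_x2(j, h):
    """F_j for f(x) = x^2 and the roof function centered at the node x = jh."""
    # On [(j-1)h, jh] the roof function is x/h - j + 1; on [jh, (j+1)h] it is j + 1 - x/h.
    left = integrate_poly([0, 0, 1 - j, 1 / h], (j - 1) * h, j * h)
    right = integrate_poly([0, 0, j + 1, -1 / h], j * h, (j + 1) * h)
    return left + right

def interpolated_load(f, j, h):
    """Load from the piecewise linear interpolate of f (weights 1, 4, 1)."""
    return h * (f((j - 1) * h) + 4 * f(j * h) + f((j + 1) * h)) / 6

def collatz_load(f, j, h):
    """Load from the Collatz formula (weights 1, 10, 1)."""
    return h * (f((j - 1) * h) + 10 * f(j * h) + f((j + 1) * h)) / 12

h, j = Fraction(1, 10), 3
f = lambda x: x ** 2
print(interpolated_load(f, j, h) - exact_load_x2(j, h), h ** 3 / 6)
print(collatz_load(f, j, h) - exact_load_x2(j, h))
```

The first printed pair should agree, reproducing the error $h^3/6$ found above, and the second difference should vanish because the Collatz weights integrate quadratic data exactly.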
In practice, the replacement of $f$ by its interpolate $f_I$ has been superseded by direct numerical integration. Over each subinterval, $\int f v^h$ is integrated by a standard quadrature formula

$$\int f v^h\,dx \approx \sum_i w_i\, f(\xi_i)\, v^h(\xi_i).$$

The most common choice, and the most efficient, is Gaussian quadrature, for example with equal weights $w_i$ at the two symmetrically placed evaluation points $\xi_i = (j + 1/2 \pm 1/\sqrt{12})h$. This is again exactly correct for cubic $f$, and therefore it will automatically attain the same accuracy as the Collatz formula described above. The one-point Gauss rule, with $\xi_i$ at the midpoint of each interval, is already sufficient to maintain the accuracy inherent in linear trial functions: It yields $F_j = h(f_{j-1/2} + f_{j+1/2})/2$. [The trapezoidal rule, with equal intervals, gives exactly the right side $F_j = hf(jh)$ of the simple three-point difference equation of Section 1.4.]
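A small sketch makes the element-by-element quadrature concrete. It is only an illustration: the function name, the representation of a rule as (weight, point) pairs on a reference interval $[0, 1]$, and the test function are assumptions introduced here, not notation from the text.

```python
import math

def load_by_quadrature(f, j, h, rule):
    """Approximate F_j = integral of f * phi_j, one element at a time.

    rule is a list of (weight, point) pairs on the reference interval [0, 1].
    """
    total = 0.0
    # Left element [(j-1)h, jh]: the roof function rises, phi = s.
    # Right element [jh, (j+1)h]: the roof function falls, phi = 1 - s.
    pieces = (((j - 1) * h, lambda s: s), (j * h, lambda s: 1.0 - s))
    for left_end, phi in pieces:
        for w, s in rule:
            total += h * w * f(left_end + s * h) * phi(s)
    return total

MIDPOINT = [(1.0, 0.5)]                          # one-point Gauss rule
TRAPEZOID = [(0.5, 0.0), (0.5, 1.0)]
g = 1.0 / math.sqrt(12.0)
GAUSS2 = [(0.5, 0.5 - g), (0.5, 0.5 + g)]        # two-point Gauss rule

h, j = 0.1, 3
for name, rule in (("midpoint", MIDPOINT), ("trapezoid", TRAPEZOID), ("gauss2", GAUSS2)):
    print(name, load_by_quadrature(math.cos, j, h, rule))
```

With the midpoint rule the routine returns $h(f_{j-1/2} + f_{j+1/2})/2$, and with the trapezoidal rule it returns $hf(jh)$, since the roof function equals $1/2$ at the midpoints and is $0$ or $1$ at the nodes.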
The result of any numerical integration is to replace the true linear terms $F^Tq$ by some approximate expression $\tilde F^Tq$, still linear in the unknowns $q_1, \ldots, q_N$.

The same ideas apply to the quadratic terms, if the coefficients $p(x)$ and $q(x)$ in the differential equation actually depend on $x$. The integrals of $p(x)((v^h)')^2$ and $q(x)(v^h)^2$ are again computed on each subinterval by a numerical quadrature. The assembled results are stored as approximate stiffness matrices, whose quadratic forms are close to the true integrals $q^TK_1q$ and $q^TK_0q$. We shall assume for the present that all these integrals are computed exactly, and study in Section 4.3 the effect of errors in numerical integration.
Now we try to assemble our own ideas. Writing $K$ for the sum $K_1 + K_0$, the computations so far have produced the formula

$$I(v^h) = q^TKq - 2F^Tq.$$

This is the discrete expression to be minimized in the Ritz method. Observe that it is in exactly the standard variational form; the minimizing vector $Q$ is determined by the linear equation

$$KQ = F.$$

We refer to this as the finite element equation. Its solution is the central step in the numerical computations, and if $h$ is small, it will be a large system. The matrix $K$ is guaranteed to be positive definite, and therefore invertible; since

$$q^TKq = \int_0^\pi \big[\,p\,((v^h)')^2 + q\,(v^h)^2\,\big]\,dx, \qquad p > 0,$$

can be zero only if $\sum q_j \varphi_j^h$ is identically zero, and this happens only if every $q_j = 0$.
In fact, the global stiffness matrix $K$ is remarkably like the finite difference matrix $L^h$, or rather $hL^h$, of the previous section. With constant coefficients, the leading terms are identical, both being proportional to the second difference with weights $-1, 2, -1$. The zero-order term $qu$ entered only the diagonal entries of $L^h$. In $K$ it appears also in the coupling between adjacent unknowns, and is "smoothed" with the weights $1, 4, 1$, reminiscent of Simpson's rule. We reemphasize that once the approximating subspace $S^h$ is chosen, the discrete forms of different terms in the equation are all determined. The Ritz method is a "package deal," and neither requires nor permits the user to make independent decisions about different parts of the problem.

In particular, the treatment of boundary conditions is fixed, and recalling the alternatives in the difference equation, it is natural to wonder which condition, of which order of accuracy, is chosen "automatically" by the finite element equation. Looking at the last row of $K$, the equation at the boundary $x = \pi$ in the case $p = q = 1$ is

$$\frac{Q_N - Q_{N-1}}{h} + \frac{h}{6}\,(2Q_N + Q_{N-1}) = F_N = \int f \varphi_N^h\,dx.$$

Substituting the true value $u(x_j)$ for $Q_j$, and expanding $u$ and $f$ in Taylor series about $x = \pi$, the truncation error is

$$u' - \frac{h}{2}u'' + \frac{h^2}{6}u''' - \frac{h^3}{24}u^{iv} + \frac{h}{6}\Big(3u - hu' + \frac{h^2}{2}u''\Big) - \frac{h}{2}f + \frac{h^2}{6}f' - \frac{h^3}{24}f'' + \cdots = \frac{h^3}{24}u''(\pi) + \cdots,$$

using the differential equation $-u'' + u = f$ and boundary condition $u'(\pi) = 0$. In terms of difference equations, this means that the boundary condition happens to be third-order accurate.
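The leading term $h^3u''(\pi)/24$ can be checked numerically on a sample function. The following is a sketch under assumptions chosen here for convenience: $u(x) = \cos x$, which satisfies $u'(\pi) = 0$ and $-u'' + u = f$ with $f(x) = 2\cos x$, so that $u''(\pi) = 1$. This $u$ does not vanish at $x = 0$, which is irrelevant to a local check at the other end of the interval.

```python
import math

def boundary_residual(h):
    """Residual of the last finite element equation (p = q = 1) for u(x) = cos x."""
    u_N = math.cos(math.pi)              # true solution at x = pi
    u_Nm1 = math.cos(math.pi - h)        # true solution at x = pi - h
    # F_N = integral of f * phi_N over [pi - h, pi] equals -2 (1 - cos h) / h,
    # written with sin(h / 2) to avoid cancellation for small h.
    F_N = -4.0 * math.sin(h / 2.0) ** 2 / h
    return (u_N - u_Nm1) / h + h / 6.0 * (2.0 * u_N + u_Nm1) - F_N

for h in (0.1, 0.05, 0.025):
    print(h, boundary_residual(h) / h ** 3)   # compare with 1/24
```

For this particular $u$ the printed ratios should settle near $1/24$ as $h$ decreases, in agreement with the expansion above.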
The final step in computing the finite element approximation $u^h$ is to solve the linear system $KQ = F$. We propose to discuss only direct elimination methods, which are preferred to iterative techniques in an overwhelming majority of finite element programs. (It is fascinating to reflect on the rise and fall of iterative methods during the last generation. A tremendous amount of hard work and good mathematics went into the development of overrelaxation and alternating direction methods; this was a dominant theme
in numerical analysis. Now these methods are increasingly squeezed between elimination and devices like the fast Fourier transform; the latter is unquestionably more efficient when the geometry and the equation are appropriate, and otherwise, especially when the same linear system is to be solved for many right-hand sides, as in design problems, elimination is convenient and straightforward.)

We want to discuss very briefly the theory of Gaussian elimination. In applying this familiar algorithm to a general coefficient matrix $K$, the first unknown $Q_1$ is eliminated from the last $N - 1$ equations, then $Q_2$ is eliminated from the last $N - 2$ equations, and finally $Q_{N-1}$ is eliminated from the last equation. The system $KQ = F$ is transformed in this way to an equivalent system

$$UQ = \begin{pmatrix} U_{11} & U_{12} & \cdots & U_{1N} \\ & U_{22} & \cdots & U_{2N} \\ & & \ddots & \vdots \\ & & & U_{NN} \end{pmatrix}\begin{pmatrix} Q_1 \\ Q_2 \\ \vdots \\ Q_N \end{pmatrix} = F'.$$

The unknowns $Q_j$ are now determined by back substitution, solving the last equation for $Q_N$, the next to last for $Q_{N-1}$, and so on.

It is important to understand in matrix terms what has happened. Suppose we begin to undo the elimination process by adding back to the last equation the multiple $l_{N,N-1}$ of the $(N-1)$st equation, which was subtracted off in the process of eliminating $Q_{N-1}$. Next we add back to the last two equations the multiples $l_{N-1,N-2}$ and $l_{N,N-2}$ of equation $N - 2$, which were subtracted in the elimination of $Q_{N-2}$. Finally, we recover the original system $KQ = F$, having added back to each equation $i$ the multiples $l_{ij}$ of the previous equations, $j = 1$ through $j = i - 1$, which were subtracted in the elimination of $Q_1$ through $Q_{i-1}$. In matrix terms the system $KQ = F$ has been recovered by multiplying the transformed system $UQ = F'$ by the matrix

$$L = \begin{pmatrix} 1 & & & & \\ l_{21} & 1 & & & \\ \vdots & & \ddots & & \\ l_{N-1,1} & \cdots & l_{N-1,N-2} & 1 & \\ l_{N1} & \cdots & l_{N,N-2} & l_{N,N-1} & 1 \end{pmatrix}.$$

This means that $LUQ = LF'$ is precisely the same as $KQ = F$; Gaussian elimination is nothing but the factorization of $K$ into a product

$$K = LU$$
of a lower triangular matrix times an upper triangular matrix. Thus $K^{-1}F$, which is the solution $Q$ we want to compute, is identical with $U^{-1}L^{-1}F$, and the two triangular matrices are easy to invert. In fact, $L^{-1}F$ is $F'$, the inhomogeneous term as it stands after elimination, and then $Q$ is $U^{-1}F'$, the result after back substitution. (If there are many systems $KQ = F_n$ to be solved, with different data but the same stiffness matrix $K$, then the factors $L$ and $U$ should be stored.)
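The factorization and its reuse for several right-hand sides can be sketched as follows. This is a minimal illustration with dense storage and no row exchanges; the function names are assumptions, and the routine presumes that every pivot is nonzero.

```python
def lu_factor(K):
    """Gaussian elimination without row exchanges: returns (L, U) with K = L U."""
    n = len(K)
    U = [[float(v) for v in row] for row in K]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        for i in range(j + 1, n):
            L[i][j] = U[i][j] / U[j][j]          # the multiplier l_ij
            for c in range(j, n):
                U[i][c] -= L[i][j] * U[j][c]
    return L, U

def lu_solve(L, U, F):
    """Solve K Q = F from stored factors: first L F' = F, then U Q = F'."""
    n = len(F)
    Fp = [float(v) for v in F]
    for i in range(n):                           # forward elimination of the data
        for j in range(i):
            Fp[i] -= L[i][j] * Fp[j]
    Q = [0.0] * n
    for i in reversed(range(n)):                 # back substitution
        s = sum(U[i][c] * Q[c] for c in range(i + 1, n))
        Q[i] = (Fp[i] - s) / U[i][i]
    return Q

K = [[2, -1, 0], [-1, 2, -1], [0, -1, 1]]
L, U = lu_factor(K)
for F in ([0, 0, 1], [1, 0, 0]):                 # same factors, different data
    print(lu_solve(L, U, F))
```

The factors are computed once and each additional right-hand side costs only the two triangular solves.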
Now we consider what is special about the elimination process when $K$ is known, as in our example, to be symmetric, positive-definite, and tridiagonal. The first point is that the process succeeds: the eliminations can be carried out, and the factorization $K = LU$ exists. It would not have succeeded with $K = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, since $Q_1$ could not be eliminated from the second equation by subtracting a multiple $l_{2,1}$ of the first. The condition for success is that each of the matrices in the upper left corner of $K$,

$$K^{(1)} = (K_{11}), \quad K^{(2)} = \begin{pmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{pmatrix}, \quad \ldots, \quad K^{(N)} = K,$$

should have a nonzero determinant. For a positive-definite matrix, these determinants are all positive, and therefore elimination can be carried out with no exchanges of rows. In fact, the determinant of $K^{(i)}$ equals the product $U_{11}U_{22}\cdots U_{ii}$, so that the pivot elements $U_{ii}$, lying on the main diagonal of $U$, are all positive.

Something more is required for the elimination algorithm to be numerically stable: The pivots $U_{ii}$ must be not only nonzero but also sufficiently large. Otherwise, the information in the original coefficients $K_{ij}$ will be destroyed as the algorithm progresses. We have only partial control over the size of these pivots, and therefore some roundoff error is unavoidable. The intrinsic sensitivity of $K$ to small perturbations is measured by its condition number, roughly the ratio of its largest eigenvalue to its smallest, which is the subject of Chapter 5. This number will depend on the mesh size $h$ and on the order of the differential equation. In some cases, however, numerical difficulties can spring not from an ill-conditioned $K$ but from an ill-chosen algorithm. With a matrix like $K = \begin{pmatrix} \epsilon & 1 \\ 1 & 0 \end{pmatrix}$, for example, the two equations should first be exchanged, or pivoted. If we exchange rows at each stage so as to maximize the pivot element $U_{ii}$, then the Gaussian elimination algorithm becomes as stable as the condition number allows. The troublesome matrix $K$ in this example is, of course, not positive-definite; it is more typical of the matrices which arise in the "mixed" method (Section 2.3).
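The effect of a tiny pivot is easy to reproduce in floating-point arithmetic. The sketch below is an illustration only: the $2 \times 2$ matrix with a small entry in the corner, the value $10^{-20}$, and the right-hand side $(1, 1)$ are assumptions chosen here so that the true solution is very nearly $(1, 1)$.

```python
def solve_2x2(K, F, pivot):
    """Eliminate the first unknown from the second equation, then back-substitute."""
    (a, b), (c, d) = K
    f1, f2 = F
    if pivot and abs(c) > abs(a):        # exchange the two equations
        a, b, c, d = c, d, a, b
        f1, f2 = f2, f1
    l21 = c / a                          # multiplier
    u22 = d - l21 * b                    # second pivot
    q2 = (f2 - l21 * f1) / u22
    q1 = (f1 - b * q2) / a
    return q1, q2

eps = 1e-20
K = [[eps, 1.0], [1.0, 0.0]]
F = [1.0, 1.0]
print("without exchange:", solve_2x2(K, F, pivot=False))
print("with exchange:   ", solve_2x2(K, F, pivot=True))
```

Without the exchange, the multiplier is of order $10^{20}$ and the data in the second equation are swamped, so the first unknown is lost entirely; with the exchange both unknowns are recovered to working accuracy.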
The direct stiffness method-in:which the unknowns are the displacements
,u u, as in the greater part of this boók-automatically yields a positive definite
36 AN INTRODUCTION TO THE THEORY CHAP. 1 SEC. 1.5. THEI
K. In this case, elimination without row exchanges is not only possible but Correspondingly, the vector F' = L-
also numerically stable. To understand this, we first give the factorization from back substitution are
K = LU a more symmetric form, by divfding out the diagonal part of U:
F'.
:._¿
d¡
The important point is that the num

is proportional only to N.
This leaves K = LD( D- 1 U), where all three factors are uniquely deterrni{\ed: In many-dimensional problems, th
Las a lower triangular matrix with unit diagonal, n- 1 U asan upper triangu- · and symmetric positive-definite. Howc
lar matrix with unit diagonal, and D as a positive diagonal matrix. By ordering of the unknowns. In fact, ti
symmetry, n- 1 Umust be the transpose of L. Thus K LDLT, the symmetric subroutines have been developed to yie
form of the factorization. It is even possible to go one step further, by intro- of Gaussian elimination.
ducing a new lower triangular matrix L = LD 112 ; this yields the Cholesky The simplest and most popular cri
factorization K= LLT. Now we can explain why positive-definite symmetric Suppose it is known that only the firs
matrices never require pivoting: The factor L in this decomposition behaves superdiagonals, are nonzero. (For tri
like a square root of K and cannot get out of scale. In fact, the condition this definition is w = l.) Then at eacl
number of L, appropriately defined, is exactly the square root of that of K. information can be used to advantag
This is to be compared with to be eliminated below each pivot, an
tracting a multiple of one row from
(f1 o1) = ( 1 o)1 (fo -e-•

o )(o1 have only 2w + 1 nonzero entries. Tht
1
t- )
t-• 1 .' out elimination and isinherited by t
symmetric matrix, the number of open
where the factors on the right are large even though the matrix on the left is not.

Finally, the fact that $K$ is tridiagonal means an enormous reduction in the number of computations. Since the first unknown $Q_1$ has only to be eliminated from the second equation (it does not appear in the others), all the multiples $l_{3,1}, \ldots, l_{N-1,1}$ of the first equation, normally required in eliminating $Q_1$, are zero. At the second stage, $Q_2$ has only to be eliminated from the third equation, and so on. Thus in the lower triangular matrix $L$, the only nonzero elements lie on the main diagonal and on the first subdiagonal; the factors of a tridiagonal matrix are bidiagonal.

Since the Cholesky factorization involves square roots of the pivots and presents some extra problems in avoiding operations with zeros, the most popular numerical algorithm is based on the decomposition $K = LDL^T$. The two factorizations are equivalent, both mathematically and from the point of view of numerical stability. (We assume no overflow.) For a tridiagonal matrix, the entries of $L$ and $D$ satisfy a simple recursion formula:

$$d_j = K_{j,j} - d_{j-1}\, l_{j,j-1}^2, \qquad d_0 = 0,$$

$$l_{j+1,j} = \frac{K_{j+1,j}}{d_j}.$$
Correspondingly, the vector $F' = L^{-1}F$ and the solution $Q$ which results from back substitution are

$$F'_j = F_j - l_{j,j-1} F'_{j-1}, \qquad F'_0 = 0,$$

$$Q_j = \frac{F'_j}{d_j} - l_{j+1,j}\, Q_{j+1}, \qquad Q_{N+1} = 0.$$

The important point is that the number of arithmetic operations involved is proportional only to $N$.
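The recursion for $d_j$ and $l_{j+1,j}$, followed by the forward and back substitution sweeps, can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name `solve_tridiagonal_ldlt` and the storage of $K$ by its diagonal and subdiagonal are hypothetical choices, indices start at zero, and no pivoting is attempted because $K$ is assumed symmetric positive-definite.

```python
def solve_tridiagonal_ldlt(diag, sub, rhs):
    """Solve K Q = F for a symmetric positive-definite tridiagonal K.

    diag[j] holds K[j][j] and sub[j] holds K[j+1][j] (0-based indices).
    The work is one pass for the factors and one pass each for the
    forward and back substitutions, so it is proportional to N.
    """
    n = len(diag)
    d = [0.0] * n          # pivots d_j
    l = [0.0] * n          # l[j] is the multiplier l_{j, j-1}; l[0] is unused
    d[0] = diag[0]
    for j in range(1, n):
        l[j] = sub[j - 1] / d[j - 1]
        d[j] = diag[j] - d[j - 1] * l[j] ** 2

    # Forward substitution: F' = L^{-1} F
    fp = [0.0] * n
    fp[0] = rhs[0]
    for j in range(1, n):
        fp[j] = rhs[j] - l[j] * fp[j - 1]

    # Back substitution: Q_j = F'_j / d_j - l_{j+1, j} Q_{j+1}
    q = [0.0] * n
    q[n - 1] = fp[n - 1] / d[n - 1]
    for j in range(n - 2, -1, -1):
        q[j] = fp[j] / d[j] - l[j + 1] * q[j + 1]
    return q


if __name__ == "__main__":
    # Second-difference matrix with 2 on the diagonal and -1 beside it.
    print(solve_tridiagonal_ldlt([2.0] * 4, [-1.0] * 3, [1.0] * 4))
```

For the second-difference matrix in the usage lines, with all right-hand sides equal to one, the result is approximately (2, 3, 3, 2).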
In many-dimensional problems, the stiffness matrix $K$ will still be sparse and symmetric positive-definite. However, there will not be such an obvious ordering of the unknowns. In fact, the nodal ordering becomes crucial, and subroutines have been developed to yield an ordering that reduces the expense of Gaussian elimination.

The simplest and most popular criterion is the bandwidth of the matrix. Suppose it is known that only the first $w$ subdiagonals of $K$, and the first $w$ superdiagonals, are nonzero. (For tridiagonal matrices, the bandwidth by this definition is $w = 1$.) Then at each stage in the elimination process this information can be used to advantage. There are only $w$ nonzero elements to be eliminated below each pivot, and furthermore each elimination (subtracting a multiple of one row from another) operates on rows known to have only $2w + 1$ nonzero entries. The band structure is preserved throughout elimination and is inherited by the triangular factors $L$ and $U$. For a symmetric matrix, the number of operations is roughly $Nw^2/2$ rather than the $N^3/3$ required for a full matrix.

The simplest illustration of good and bad orderings, from the point of view of the bandwidth, is furnished in two dimensions by a rectangular array of nodes. If there are fewer nodes in the horizontal than in the vertical direction, the unknowns should be numbered consecutively along each row rather than each column. The bandwidth will be roughly the length of a row, since for a given node we must wait that long for the node above it (which is presumably linked to it by a nonzero entry in the stiffness matrix) to appear in the ordering. In general, finite element matrices are less systematic in structure than those from conventional difference equations, and an optimal ordering is far from self-evident.
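The effect of the ordering on a rectangular array of nodes can be checked directly. The sketch below is ours, not the text's: the function `grid_bandwidth` is a hypothetical helper, and it assumes that each node is linked only to its horizontal and vertical neighbours (a five-point coupling). Elements that also link diagonal neighbours would add one to the counts.

```python
def grid_bandwidth(n_rows, n_cols, row_wise=True):
    """Bandwidth w of a five-point coupling on an n_rows x n_cols array of nodes.

    row_wise=True numbers the nodes consecutively along each row;
    row_wise=False numbers them along each column.
    """
    def index(r, c):
        return r * n_cols + c if row_wise else c * n_rows + r

    w = 0
    for r in range(n_rows):
        for c in range(n_cols):
            # Look at the neighbour to the right and the neighbour above.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < n_rows and cc < n_cols:
                    w = max(w, abs(index(r, c) - index(rr, cc)))
    return w


if __name__ == "__main__":
    # 10 rows of 4 nodes: fewer nodes horizontally than vertically.
    print(grid_bandwidth(10, 4, row_wise=True))
    print(grid_bandwidth(10, 4, row_wise=False))
```

With 10 rows of 4 nodes, numbering along the rows gives $w = 4$ (the length of a row), while numbering along the columns gives $w = 10$.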
There is a second criterion which takes into account the sparseness of a matrix; it is a little more precise and subtle than the bandwidth. It is based on the profile, or skyline, of the matrix. Consider the first nonzero element in the $i$th row. If this occurs in column $j$ and that fact is known to the computer, it will not be necessary to subtract multiples of rows $1, 2, \ldots, j-1$ from row $i$. The multiplying factors $l_{i,1}, \ldots, l_{i,j-1}$ would all be zero anyway, since $Q_1, \ldots, Q_{j-1}$ do not require elimination from the $i$th equation; they are absent in the first place. The profile is formed by locating these first non-zero entries in each row, and like the band structure it is preserved throughout Gaussian elimination and inherited by the factor $L$. In case the profile goes significantly inside the band, so that a good many rows have length well below $2w + 1$ non-zero entries, it may be worthwhile for the computer to know the profile, and even to order the unknowns according to a "profile-reducing" algorithm.
We emphasize that the number of arithmetic operations is not the only criterion in choosing an algorithm; the storage requirements may be at least as important. For a band matrix, a standard procedure is to store successive diagonals of the matrix; this is probably near to optimal, and uses about $Nw$ locations. For a linear or bilinear element on a 50 x 50 square mesh, $N = 2500$ and $w \approx 50$; at about this point, a typical large computer would have to store information outside of core, and the programming and data management become much more complex. Therefore considerable attention is being paid to algorithms which take account of, and maintain as far as possible, the sparseness of a matrix even within its band or profile. At the extreme, we may identify the location of every nonzero entry in $A$ and order the unknowns by a "sparse matrix algorithm" so as to minimize the number of nonzero entries in the lower triangular $L$. It seems to us that for finite elements this is too expensive; it takes no account of the systematic structure of the matrices.
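To see how far the profile can fall inside the band, the two storage counts can be compared on a small pattern of couplings. The following sketch rests on our own conventions, which the text does not fix: the helper `band_and_profile_storage` is hypothetical, only the lower triangle is counted, and the band count is taken as $N(w + 1)$ so that the diagonal is included.

```python
def band_and_profile_storage(n, couplings):
    """Return (w, band_count, profile_count) for a symmetric sparsity pattern.

    couplings is an iterable of index pairs (i, j), 0-based, for which the
    matrix entry is nonzero. The diagonal is assumed nonzero throughout.
    """
    first = list(range(n))           # column of the first nonzero in each row
    for i, j in couplings:
        lo, hi = min(i, j), max(i, j)
        first[hi] = min(first[hi], lo)

    w = max(i - first[i] for i in range(n))               # bandwidth
    band_count = n * (w + 1)                              # lower band, diagonal included
    profile_count = sum(i - first[i] + 1 for i in range(n))
    return w, band_count, profile_count


if __name__ == "__main__":
    # A chain of six unknowns with one extra link between unknowns 1 and 5.
    chain = [(i, i + 1) for i in range(5)] + [(1, 5)]
    print(band_and_profile_storage(6, chain))
```

In this example a single long-range link forces $w = 4$ and a band count of 30 locations, while the profile needs only 14, since every row but the last starts next to the diagonal.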
We prefer, if the problem is so large that the standard band or profile algorithm goes outside of core, to follow the analysis given in a series of papers by Alan George. His goal, for finite elements with $N$ unknown parameters in the plane, is to achieve $O(N^{3/2})$ arithmetic operations in elimination and $O(N \log N)$ non-zero entries in $L$. This would be essentially optimal. (There are some special direct methods, analogous to the fast Fourier transform, which need only $O(N \log N)$ operations and $O(N)$ total storage locations; but they are restricted to simple problems on rectangles.) His goal is achieved by an ordering [G3] which is like the minimum degree algorithm: at each stage, the unknown to be eliminated should be the one currently connected to the fewest unknowns. This is very different from Irons' frontal method, and it appears to demand a large program overhead, probably too large. A more recent suggestion is given in George's paper "An efficient band-oriented scheme for solving n by n grid problems." He divides the domain into thin strips, and applies a band algorithm to the corresponding submatrices (the unknowns inside the strip having been numbered to reduce the bandwidth). Between two strips will come a line of unknowns, and in George's ordering they should appear after the strips which they separate. It is then the comparatively small number of unknowns on these separating lines which contribute to a filling out of the band; the bigger submatrices corresponding to the link between one strip and another are empty. With a number of strips proportional to $h^{-1/2}$, the storage requirement turns out to be $O(N^{5/4})$: not optimal, but better by the substantial factor $N^{1/4}$ than a straightforward band scheme using $Nw = N^{3/2}$ locations. The saving is the same in three dimensions.

Whatever the algorithm (this will obviously remain an area of active research), the eventual result of the elimination process is the solution of $KQ = F$. With this, the finite element approximation is determined.

1.6. THE ERROR WITH LINEAR ELEMENTS

How close is the Ritz approximation $u^h$ to the true solution $u$? As close as possible, according to the following theorem, in the sense that the energy in the error $u - u^h$ is a minimum. The Ritz method is therefore optimal, assuming that the energy is measured in the natural way. This measure must be one which is associated with the particular problem at hand, and in fact it is specified by the functional $I(v)$ itself: The energy in $v$ is the second-degree term in $I(v)$. (This definition differs by a factor $\tfrac{1}{2}$ from the one which is physically correct; it is convenient just to ignore this factor.) Thus if the functional is written in the form

$$I(v) = a(v, v) - 2(f, v), \tag{19}$$

the energy in the function $v$ is given by $a(v, v)$.
The energy therefore coincides with the term which up to now has arisen as $(Lv, v)$ and been integrated by parts. This integration has produced a more symmetric expression, and this symmetry is emphasized by the notation $a(v, v)$. In particular, if $(Lv, w)$ is integrated by parts, the result is the symmetric form

$$a(v, w) = \int_0^{\pi} \bigl(p(x)v'(x)w'(x) + q(x)v(x)w(x)\bigr)\, dx.$$

This is the energy inner product. It is defined for all $v$ and $w$ in the admissible space $\mathcal{H}^1_E$ and represents the inner product that is intrinsic to the given problem.

Our goal in this section is first to prove the theorem described above (that the energy in the error is minimized by the Ritz method) and then to apply this theorem in establishing bounds for the error with linear elements.

THEOREM 1.1

Suppose that $u$ minimizes $I(v)$ over the full admissible space $\mathcal{H}^1_E$, and $S^h$ is any closed subspace of $\mathcal{H}^1_E$. Then:

(a) The minimum of $I(v^h)$ and the minimum of $a(u - v^h, u - v^h)$, as $v^h$ ranges over the subspace $S^h$, are achieved by the same function $u^h$. Therefore

$$a(u - u^h, u - u^h) = \min_{v^h \in S^h} a(u - v^h, u - v^h). \tag{20}$$
(b) With respect to the energy inner product, $u^h$ is the projection of $u$ onto $S^h$. Equivalently, the error $u - u^h$ is orthogonal to $S^h$:

$$a(u - u^h, v^h) = 0 \quad \text{for all } v^h \text{ in } S^h. \tag{21}$$

(c) The minimizing function satisfies

$$a(u^h, v^h) = (f, v^h) \quad \text{for all } v^h \text{ in } S^h. \tag{22}$$

In particular, if $S^h$ is the whole space $\mathcal{H}^1_E$, then

$$a(u, v) = (f, v) \quad \text{for all } v \text{ in } \mathcal{H}^1_E. \tag{23}$$

COROLLARY

It follows from (21) that $a(u - u^h, u^h) = 0$, or $a(u, u^h) = a(u^h, u^h)$, and the Pythagorean theorem holds: The energy in the error equals the error in the energy,

$$a(u - u^h, u - u^h) = a(u, u) - a(u^h, u^h).$$

Furthermore, since the left side is necessarily positive, the strain energy in $u^h$ always underestimates the strain energy in $u$:

$$a(u^h, u^h) \le a(u, u). \tag{24}$$
This whole theorem is fundamental to the Ritz theory, and its three parts are very much interdependent. We can deduce (b) immediately from (c): If (23) holds for all $v$, it holds for every $v^h$ in $S^h$, and subtracting (22), the result is (21).

Also (b) follows from (a): In an inner product space, the function in a subspace $S^h$ closest to a given $u$ is always the projection of $u$ onto $S^h$. In the opposite direction, to show that (b) implies (a), we compute

$$a(u - u^h - v^h, u - u^h - v^h) = a(u - u^h, u - u^h) - 2a(u - u^h, v^h) + a(v^h, v^h).$$

If (21) holds, then

$$a(u - u^h, u - u^h) \le a(u - u^h - v^h, u - u^h - v^h).$$

Equality occurs only if $a(v^h, v^h) = 0$, in other words, only if $v^h = 0$. Thus $u^h$ is the unique minimizing function in (20), and (a) is proved.

The problem is now to establish (c), since that will imply (b) and therefore (a). If $u^h$ minimizes $I$ over $S^h$, then for any $\epsilon$ and $v^h$,

$$I(u^h) \le I(u^h + \epsilon v^h).$$
The right side is

$$a(u^h + \epsilon v^h, u^h + \epsilon v^h) - 2(f, u^h + \epsilon v^h) = I(u^h) + 2\epsilon\,[a(u^h, v^h) - (f, v^h)] + \epsilon^2 a(v^h, v^h).$$

Therefore,

$$0 \le 2\epsilon\,[a(u^h, v^h) - (f, v^h)] + \epsilon^2 a(v^h, v^h).$$

Since this is true for small $\epsilon$ of either sign, it follows that $a(u^h, v^h) = (f, v^h)$. This equation expresses the vanishing of the first variation of $I$ at $u^h$, in the direction of $v^h$. In particular, $a(u, v) = (f, v)$, so that at $u$ the first variation vanishes in every direction $v$. This is the equation (11) derived earlier. Thus (c) is established; it is the equation of virtual work.

The choice $v = u$ in this equation leads to an interesting result, that at the minimum, the strain energy is the negative of the potential energy:

$$I(u) = a(u, u) - 2(f, u) = -a(u, u). \tag{25}$$

Similarly, $I(u^h) = -a(u^h, u^h)$. In every case $I(u) \le I(u^h)$, since $u$ is minimizing over a larger class of functions, and therefore a change of sign reproduces the result stated in the corollary, that the strain energy is always underestimated:

$$a(u^h, u^h) \le a(u, u).$$
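The Pythagorean identity of the corollary and the underestimate (24) can be observed numerically. The sketch below is an illustration under assumptions of our own choosing: the model problem is $-u'' = \sin(x/2)$ on $(0, \pi)$ with $u(0) = 0$ and $u'(\pi) = 0$ (so $p = 1$, $q = 0$ and $u = 4\sin(x/2)$), the load integrals are approximated by a three-point Gauss rule, and the names `integrate`, `ritz_linear` and `energy_check` are hypothetical.

```python
import math

GAUSS3 = ((-math.sqrt(0.6), 5 / 9), (0.0, 8 / 9), (math.sqrt(0.6), 5 / 9))


def integrate(g, a, b):
    """Three-point Gauss-Legendre rule on the interval [a, b]."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    return half * sum(w * g(mid + half * t) for t, w in GAUSS3)


def ritz_linear(n, f):
    """Linear elements for -u'' = f on (0, pi), u(0) = 0, u'(pi) = 0."""
    h = math.pi / n
    load = []
    for j in range(1, n + 1):
        xj = j * h
        val = integrate(lambda x: f(x) * (x - (xj - h)) / h, xj - h, xj)
        if j < n:  # the last roof function is only a half roof
            val += integrate(lambda x: f(x) * ((xj + h) - x) / h, xj, xj + h)
        load.append(val)
    # Tridiagonal stiffness matrix: 2/h on the diagonal (1/h at the free end),
    # -1/h beside it; solved by elimination without pivoting.
    d = [2.0 / h] * (n - 1) + [1.0 / h]
    off = -1.0 / h
    rhs = load[:]
    for j in range(1, n):
        m = off / d[j - 1]
        d[j] -= m * off
        rhs[j] -= m * rhs[j - 1]
    q = [0.0] * n
    q[-1] = rhs[-1] / d[-1]
    for j in range(n - 2, -1, -1):
        q[j] = (rhs[j] - off * q[j + 1]) / d[j]
    return h, q, load


def energy_check(n):
    """Return a(u, u), a(u^h, u^h) and a(u - u^h, u - u^h) for the model problem."""
    du = lambda x: 2.0 * math.cos(x / 2)          # derivative of u = 4 sin(x/2)
    h, q, load = ritz_linear(n, lambda x: math.sin(x / 2))
    energy_u = 2.0 * math.pi                      # integral of (u')^2 over (0, pi)
    energy_uh = sum(qj * fj for qj, fj in zip(q, load))   # Q^T K Q = Q^T F
    nodes = [0.0] + q
    energy_err = 0.0
    for j in range(n):
        slope = (nodes[j + 1] - nodes[j]) / h
        energy_err += integrate(lambda x: (du(x) - slope) ** 2, j * h, (j + 1) * h)
    return energy_u, energy_uh, energy_err


if __name__ == "__main__":
    print(energy_check(8))
```

Up to quadrature and rounding error, the third number returned equals the difference of the first two, and the second never exceeds the first.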
The proof of the theorem is now complete, except for one point: Neither the existence nor the uniqueness of $u^h$ (or, in case $S^h$ is the whole space $\mathcal{H}^1_E$, of $u$ itself) has been properly established. To a functional analyst, such an omission means that the proof has hardly begun. We shall try to mollify him by pointing to the key word in the hypothesis of the theorem: The subspace $S^h$ is required to be closed. This means that the subspace must contain its own limiting functions. If there is a sequence $v_N$ in $S^h$ such that

$$a(v_N - v_M, v_N - v_M) \to 0 \quad \text{as } N, M \to \infty,$$

then there must exist in $S^h$ a limit $v$, such that

$$a(v - v_N, v - v_N) \to 0 \quad \text{as } N \to \infty.$$

This will always be true if $S^h$ is finite-dimensional, which is the case contemplated in the Ritz method. In general, one cannot guarantee that there exists a function $u^h$ in $S^h$ which is absolutely the closest to $u$, unless the subspace is closed. We cite the example $S^h = \mathcal{H}^2_B$, which is not a closed subspace. It contains functions arbitrarily close to $u(x) = x$, but there is no closest one; the "projection" breaks down.
To prove the existence of $u$, defined to be the minimizing function over the whole space $\mathcal{H}^1_E$, we need to see that $\mathcal{H}^1_E$ is itself closed. This is exactly what was achieved in the completion process, enlarging $\mathcal{H}^2_B$ to the full admissible space $\mathcal{H}^1_E$. In particular, it was in completing (or closing) the admissible space that the natural boundary condition $u'(\pi) = 0$ was dropped. There was one technical point in that process which we hurried by: the space was completed in the natural energy norm, $a(v - v_N, v - v_N) \to 0$ as in (10), and yet we have described the completed space in terms of the $\mathcal{H}^1$ norm. This step is justified by the equivalence of the two norms: there exist constants $\sigma$ and $K$ such that

$$a(v, v) \le K \|v\|_1^2, \tag{26a}$$

$$a(v, v) \ge \sigma \|v\|_1^2. \tag{26b}$$

The last inequality also yields the uniqueness of $u$ and $u^h$, since it means that the energy is positive definite: $a(v, v) = 0$ if and only if $v = 0$. The surface $I(v)$ is strictly convex, and can have only one stationary point, at the minimum.

The first inequality is easy, since

$$\int \left[p(v')^2 + qv^2\right] dx \le \max(p(x), q(x)) \int \left[(v')^2 + v^2\right] dx.$$

Therefore, $K$ can be chosen as $\max(p, q)$.

The inequality (26b) in the other direction starts in the same way, since $p$ is bounded below by a positive constant $p_{\min}$:

$$\int p(v')^2\, dx \ge p_{\min} \int (v')^2\, dx. \tag{27}$$

The difficulty comes with the undifferentiated, or zero-order, terms, since $q$ need not be bounded away from zero; in fact, we may have $q = 0$. Therefore, we need an inequality of Poincaré type, bounding $v$ in terms of $v'$. With the boundary condition $v(0) = 0$, the natural idea is to write

$$v(x_0) = \int_0^{x_0} v'(x)\, dx,$$

and apply the Schwarz inequality:

$$|v(x_0)|^2 \le \left(\int_0^{x_0} 1^2\, dx\right) \left(\int_0^{x_0} (v'(x))^2\, dx\right) \le \pi \int_0^{\pi} (v'(x))^2\, dx.$$

Integration over $0 \le x_0 \le \pi$ yields a Poincaré inequality:

$$\int_0^{\pi} v^2\, dx \le \pi^2 \int_0^{\pi} (v')^2\, dx.$$
Now by borrowing part of the right side of (27),

$$\int p(v')^2 + qv^2 \ge p_{\min} \int (v')^2 \ge \frac{p_{\min}}{2\pi^2} \int (v')^2 + v^2.$$

This is the required inequality $a(v, v) \ge \sigma \|v\|_1^2$, which asserts that the problem is elliptic. Together with (26a), it implies that whether $\mathcal{H}^2_B$ is completed in the standard norm $\|v\|_1$ or in the energy norm $\sqrt{a(v, v)}$, the completed space is the same, $\mathcal{H}^1_E$.

Note that with a natural boundary condition at both ends, the situation is entirely different. The Poincaré inequality depended on pinning down $v(0) = 0$, and will be violated by every constant function. With $v$ = constant, and with $q = 0$, the energy $a(v, v) = \int p(v')^2$ can be zero without implying that $v = 0$. Equivalently, the differential equation

$$-(pu')' = f, \qquad u'(0) = u'(\pi) = 0$$

has no unique solution; $u$ is determined only up to a constant function. Thus a pure Neumann problem (physically, a problem in which rigid body motions are possible) may lead to technical difficulties: the quadratic form $a(v, v)$ will be indefinite.

So much for the general theory. The goal is to apply it to the finite element example, with $S^h$ composed of broken-line functions, and emerge with an estimate of the error $e^h = u - u^h$. The key lies in the minimizing property (20):

$$a(e^h, e^h) \le a(u - v^h, u - v^h) \quad \text{for all } v^h \text{ in } S^h.$$

Of course, the function $u$ is not known; we can be certain only that if $f$ is in $\mathcal{H}^0$, then $u$ is in $\mathcal{H}^2_B$. Therefore, the question is: How closely can an arbitrary function $u$ in $\mathcal{H}^2_B$ be approximated by members of $S^h$? We need not attempt to work only with the Ritz approximation $u^h$; it will be sufficient to find in $S^h$ a good approximation to $u$, since $u^h$ will always be better. Thus the estimation of the error $e^h$ becomes a straightforward problem in approximation theory: How far away are functions in $\mathcal{H}^2_B$ from $S^h$, in the natural norm $\sqrt{a(v, v)}$?

The most convenient choice of a function in $S^h$ which is close to $u$ is its interpolate $u_I$. The two functions agree at every node $x = jh$, and $u_I$ is linear in between. It can be expanded in terms of the roof functions as

$$u_I(x) = \sum_{j=1}^{N} u(jh)\, \varphi_j(x).$$

At any node, only the one corresponding basis function $\varphi_j$ is nonzero.

We compare $u$ with $u_I$ first on the basis of a simple Taylor-series argument, leading to a pointwise estimate of their difference.

THEOREM 1.2

If $u''$ is continuous, then

$$\max |u(x) - u_I(x)| \le \tfrac{1}{8} h^2 \max |u''(x)| \tag{28}$$

and

$$\max |u'(x) - u_I'(x)| \le h \max |u''(x)|. \tag{29}$$

Proof. Consider the difference $\Delta(x) = u(x) - u_I(x)$ over a typical interval $(j-1)h \le x \le jh$. Since $\Delta$ vanishes at both ends of the interval, there must be at least one point $z$ where $\Delta'(z) = 0$. Then for any $x$,

$$\Delta'(x) = \int_z^x \Delta''(y)\, dy.$$

But $\Delta'' = u''$, since $u_I$ is linear, and (29) follows immediately:

$$|\Delta'(x)| = \left| \int_z^x u''(y)\, dy \right| \le h \max |u''|.$$

The maximum of $|\Delta(x)|$ will occur at a point where the derivative vanishes, $\Delta'(z) = 0$. We look to see in which half of the interval $z$ lies; suppose, for example, that it is closer to the right-hand endpoint, so that $jh - z \le h/2$. Then expanding in a Taylor series about $z$,

$$\Delta(jh) = \Delta(z) + (jh - z)\Delta'(z) + \tfrac{1}{2}(jh - z)^2 \Delta''(w),$$

where $z < w < jh$. Since $\Delta$ vanishes at the endpoint $jh$, and $\Delta'' = u''$,

$$|\Delta(z)| = \tfrac{1}{2}(jh - z)^2\, |u''(w)| \le \tfrac{1}{8} h^2 \max |u''|.$$

This constant $\tfrac{1}{8}$ is the best possible, not only for the error in linear interpolation, but even if we allow an arbitrary piecewise linear approximation. The most difficult function to approximate has $u''$ alternating between $+1$ and $-1$ on successive intervals (Fig. 1.5). The best piecewise linear approximation in this extreme case is identically zero, and the error is $h^2/8$.

Fig. 1.5 The extreme case in piecewise linear approximation, $u(x) = x(h - x)/2$.
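The pointwise bounds of Theorem 1.2 are easy to test by sampling. The sketch below is ours and only illustrative: the helper `interpolation_errors` is hypothetical, the maxima are taken over a finite set of sample points rather than the whole interval, and the two test functions ($\sin x$ on $(0, \pi)$ and the extreme function of Fig. 1.5 with $h = 1$) are our own choices.

```python
import math


def interpolation_errors(u, du, a, b, n, samples=200):
    """Sampled maxima of |u - u_I| and |u' - u_I'| for linear interpolation.

    u_I interpolates u at n + 1 equally spaced nodes on [a, b]; du is u'.
    Returns the mesh width h and the two sampled maxima.
    """
    h = (b - a) / n
    max_err = 0.0
    max_derr = 0.0
    for j in range(n):
        x0, x1 = a + j * h, a + (j + 1) * h
        slope = (u(x1) - u(x0)) / h            # u_I' is constant on each interval
        for k in range(samples + 1):
            x = x0 + (x1 - x0) * k / samples
            ui = u(x0) + slope * (x - x0)
            max_err = max(max_err, abs(u(x) - ui))
            max_derr = max(max_derr, abs(du(x) - slope))
    return h, max_err, max_derr


if __name__ == "__main__":
    # A smooth function with max |u''| = 1.
    print(interpolation_errors(math.sin, math.cos, 0.0, math.pi, 10))
    # The extreme function u = x(h - x)/2 on a single interval with h = 1.
    print(interpolation_errors(lambda x: x * (1 - x) / 2,
                               lambda x: (1 - 2 * x) / 2, 0.0, 1.0, 1))
```

For $\sin x$ both sampled errors stay below $h^2/8$ and $h/2$ respectively, and for the extreme function they reach $h^2/8$ and $h/2$, in agreement with the remark that these constants cannot be improved.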
A more careful proof of (29) would improve it to $\max |\Delta'| \le \tfrac{1}{2} h \max |u''|$, and the extreme function illustrated in Fig. 1.5 also shows this constant $\tfrac{1}{2}$ to be the best possible.

It follows from the theorem that if $u''$ is continuous, then

$$a(u - u_I, u - u_I) = \int \left[p(\Delta')^2 + q\Delta^2\right] dx \le C h^2 \max |u''|^2,$$

with a constant $C$ depending only on the coefficients $p$ and $q$. Since the Ritz approximation $u^h$ is at least as close as $u_I$, the error in energy with linear elements satisfies

$$a(u - u^h, u - u^h) \le C h^2 \max |u''|^2.$$
This is almost the result we want. The factor $h^2$ is perfectly correct, and reflects the observed rate of decrease of the error as the mesh is refined. The imperfection is in the other factor, $\max |u''|^2$. It is unsatisfactory to have to assume that $u''$ is continuous, or even that it is bounded, when the conclusion is an estimate of the error in the energy. It should be sufficient to assume only that $u''$ has finite energy in the $\mathcal{H}^0$ norm, in other words that $\int (u'')^2\, dx < \infty$. The proof of this sharper result, in the special instance of linear interpolation, will be based on Fourier series rather than Taylor series. The proper error estimate is given in (34) below.

THEOREM 1.3

If $u''$ lies in $\mathcal{H}^0$, then

$$\|u - u_I\|_0 \le \frac{1}{\pi^2}\, h^2 \|u''\|_0, \tag{30}$$

$$\|u' - u_I'\|_0 \le \frac{1}{\pi}\, h \|u''\|_0, \tag{31}$$

$$a(u - u_I, u - u_I) \le \left( \frac{p_{\max} h^2}{\pi^2} + \frac{q_{\max} h^4}{\pi^4} \right) \|u''\|_0^2. \tag{32}$$

Proof. Consider any subinterval of length $h$, for example the first one, $0 \le x \le h$. The difference $\Delta(x) = u(x) - u_I(x)$ vanishes at both endpoints and we represent it as a Fourier sine series:

$$\Delta(x) = \sum_{n=1}^{\infty} a_n \sin \frac{n\pi x}{h}.$$

By direct computation

$$\int_0^h (\Delta')^2\, dx = \frac{h}{2} \sum_{n=1}^{\infty} \left(\frac{n\pi}{h}\right)^2 a_n^2,$$

$$\int_0^h (\Delta'')^2\, dx = \frac{h}{2} \sum_{n=1}^{\infty} \left(\frac{n\pi}{h}\right)^4 a_n^2.$$
Since $n \ge 1$,

$$\left(\frac{n\pi}{h}\right)^2 \le \frac{h^2}{\pi^2} \left(\frac{n\pi}{h}\right)^4.$$

Therefore, summing on $n$,

$$\int_0^h (\Delta')^2 \le \frac{h^2}{\pi^2} \int_0^h (\Delta'')^2 = \frac{h^2}{\pi^2} \int_0^h (u'')^2. \tag{33}$$

Here $\Delta'' = u''$ because $u_I$ is linear. Equality holds if and only if every coefficient $a_n$ after the first is zero; $\Delta$ must have the form $\sin \pi x/h$.

The conclusion (33) holds equally well over each subinterval, say from $(j-1)h$ to $jh$, and we can sum over all these subintervals:

$$\sum_{j=1}^{N} \int_{(j-1)h}^{jh} (\Delta')^2 \le \frac{h^2}{\pi^2} \sum_{j=1}^{N} \int_{(j-1)h}^{jh} (u'')^2.$$

We want to simplify this to

$$\int_0^{\pi} (\Delta')^2 \le \frac{h^2}{\pi^2} \int_0^{\pi} (u'')^2.$$

This step, which looks completely obvious, is justified only because there is no trouble at the points where the subintervals meet. Notice that if $\Delta''$ were still on the right side, as in (33), the equality

$$\sum_{j=1}^{N} \int_{(j-1)h}^{jh} (\Delta'')^2 = \int_0^{\pi} (\Delta'')^2$$

would have been completely false, the right side being in reality infinite. ($\Delta''$ is a $\delta$-function at the nodes.) This point will recur when we consider the difference between "conforming" and "nonconforming" elements; when the trial functions are not smooth enough to lie in the admissible space, $I(v)$ cannot be computed element by element.

By a similar argument,

$$\int_0^h \Delta^2\, dx = \frac{h}{2} \sum_{n=1}^{\infty} a_n^2 \le \frac{h^4}{\pi^4} \int_0^h (u'')^2.$$

Therefore,

$$a(\Delta, \Delta) = \int \left[p(\Delta')^2 + q\Delta^2\right] dx \le \left( \frac{p_{\max} h^2}{\pi^2} + \frac{q_{\max} h^4}{\pi^4} \right) \int_0^{\pi} (u'')^2.$$

This completes the proof.
As an alternative to Fourier series, the key inequality (33) could have been derived by solving a variational problem: Maximize $\int (\Delta')^2$, subject to $\Delta(0) = \Delta(h) = 0$, $\int (\Delta'')^2 = 1$. The stationary points are $\Delta = \sin n\pi x/h$, and the extremum falls naturally at $\sin \pi x/h$.

The interpolate $u_I$ is not to be confused with the Ritz approximation $u^h$. Both are piecewise linear, but $u^h$ is determined variationally while $u_I$ is simply a convenient choice close to $u$. Theorem 1.1, which asserts that $u^h$ is closer still, yields the first of the following inequalities.

COROLLARY

The error $e^h = u - u^h$ in the finite element method satisfies

$$a(e^h, e^h) \le a(u - u_I, u - u_I) \le C_1 h^2 \|u''\|_0^2 \le C_2 h^2 \|f\|_0^2. \tag{34}$$

The last inequality comes from (3), bounding the solution in terms of the data. The leading terms in the constants are $p_{\max}/\pi^2$ for $C_1$, and $p_{\max}/\pi^2 p_{\min}^2$ for $C_2$.

The final result is thus a bound of order $h^2$ for the error in energy. Computation shows this bound to be completely realistic, and in practice the error $a(u - u^h, u - u^h)$ is almost exactly proportional to $h^2$, beginning even at very crude meshes (h = ! or i ?). Such regularity can be explained by establishing an asymptotic expansion for the error, of the kind introduced for difference equations in (16).
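The $h^2$ behaviour of the error in energy can be watched on a sequence of meshes. What follows is a sketch under our own assumptions, not a computation from the text: the model problem is $-u'' = \sin(x/2)$ on $(0, \pi)$ with $u(0) = 0$, $u'(\pi) = 0$ (so $p = 1$, $q = 0$, $u = 4\sin(x/2)$, $a(u, u) = 2\pi$ and $\|u''\|_0^2 = \pi/2$), the load is integrated by a three-point Gauss rule, the error is obtained from the Pythagorean identity $a(e^h, e^h) = a(u, u) - a(u^h, u^h)$, and the names `gauss3` and `energy_error` are hypothetical.

```python
import math


def gauss3(g, a, b):
    """Three-point Gauss-Legendre rule on [a, b]."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    r = math.sqrt(0.6)
    return half * (5 * g(mid - half * r) + 8 * g(mid) + 5 * g(mid + half * r)) / 9


def energy_error(n):
    """Return (a(e^h, e^h), h) for linear elements on n equal intervals."""
    h = math.pi / n
    f = lambda x: math.sin(x / 2)
    load = []
    for j in range(1, n + 1):
        xj = j * h
        val = gauss3(lambda x: f(x) * (x - xj + h) / h, xj - h, xj)
        if j < n:  # the roof function at the free end is only a half roof
            val += gauss3(lambda x: f(x) * (xj + h - x) / h, xj, xj + h)
        load.append(val)
    # Tridiagonal elimination for K Q = F with K = (1/h) tridiag(-1, 2, -1),
    # last diagonal entry 1/h because of the natural condition at x = pi.
    d = [2.0 / h] * (n - 1) + [1.0 / h]
    off = -1.0 / h
    rhs = load[:]
    for j in range(1, n):
        m = off / d[j - 1]
        d[j] -= m * off
        rhs[j] -= m * rhs[j - 1]
    q = [0.0] * n
    q[-1] = rhs[-1] / d[-1]
    for j in range(n - 2, -1, -1):
        q[j] = (rhs[j] - off * q[j + 1]) / d[j]
    energy_uh = sum(qj * fj for qj, fj in zip(q, load))    # Q^T K Q = Q^T F
    return 2.0 * math.pi - energy_uh, h


if __name__ == "__main__":
    previous = None
    for n in (4, 8, 16, 32):
        err, h = energy_error(n)
        ratio = previous / err if previous else float("nan")
        print(n, err, ratio)
        previous = err
```

Each halving of $h$ reduces the computed error by a factor close to 4, and the values stay below $h^2 \|u''\|_0^2 / \pi^2$, the bound with the leading constant $p_{\max}/\pi^2$ quoted for $C_1$.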
Our convergence proofs have so far assumed two derivatives for $u$, thereby excluding the case in which $f$ is a $\delta$-function and $u$ is a piecewise-linear "ramp." An easy computation shows that the resulting error in energy is of order $h$, unless a node is placed right at the discontinuity in $f$. [It is also $O(h)$ for a line discontinuity in two dimensions.] In general, given only that $u$ lies in $\mathcal{H}^1_E$, it is impossible to say anything about the rate of convergence; it may be arbitrarily slow as $h \to 0$. However, convergence itself is easy to prove.

THEOREM 1.4

For any solution $u$ in $\mathcal{H}^1_E$ (in other words, for any data $f$ in the matching data space) the finite element method converges in the energy norm:

$$a(e^h, e^h) \to 0 \quad \text{as } h \to 0.$$

Proof. Since $\mathcal{H}^1_E$ was constructed in the first place by completing $\mathcal{H}^2_B$, there is a sequence $v_N$ in $\mathcal{H}^2_B$ converging in the energy norm to $u$. For each fixed $N$, the finite element approximations $v_N^h$ converge to $v_N$ as $h \to 0$, by Theorem 1.3. Therefore, choosing $N$ large and then $h$ small, there is a function $v_N^h$ in $S^h$ arbitrarily close to $u$. Since the projection $u^h$ will be even closer, the sequence $u^h$ must converge to $u$.
This argument applies unchanged to all such minimization problems and need not be repeated in every case. The necessary and sufficient condition for convergence in the Ritz method is obviously this, that for every admissible $u$, the distance to the trial spaces $S^h$ (measured by the energy) should approach zero as $h \to 0$. The point of the previous theorem is that once this convergence has been verified for a "dense" subspace (one whose completion in the energy norm yields all admissible functions), then convergence is automatic for every $u$. Therefore, the interesting problem is to establish the rate of convergence in energy when $u$ is sufficiently smooth.
It is equally interesting, and somewhat harder, to find this rate of convergence in a different norm. The result of the corollary is that the strains, that is, the first derivatives $(u^h)'$, are in error by $O(h)$. We inquire next about the error in the displacement. How quickly does $e^h = u - u^h$ decrease when measured by $\|e^h\|_0$?
The crudest answer is to apply Poincaré's inequality $|e^h(x_0)| \le \sqrt{\pi}\, \|e^h\|_1$, derived above. This bounds the error at every point $x_0$ uniformly by $O(h)$. One might expect, however, to improve this estimate to $O(h^2)$. Such an improvement appears almost obvious from (30), where the error in the interpolating element $u_I$ is of second order. This at least proves that $S^h$ contains a function which is within $O(h^2)$ of $u$ in displacement. The difficulty is that in the $\mathcal{H}^0$ norm the Ritz approximation $u^h$ is not minimizing. There is no assurance that $u^h$ is as close to $u$ as $u_I$ is, and we shall later find fourth-order problems in which the error in displacement is no better than the error in slope.
In the present example, however, the displacement error is indeed $O(h^2)$. One possible proof is simply to forget the variational derivation of the finite element equations $KQ = F$, and to compute their truncation error as difference equations. (At the boundary $x = \pi$ this has already been done.) Applying the maximum principle, the result is actually a pointwise bound $|e^h(x)| = O(h^2)$, which is optimal. Nevertheless, this approach by way of finite differences is not entirely satisfactory, because its extension to irregular finite elements in two-dimensional problems becomes extremely difficult. Therefore, it is essential to look for an argument which will establish variationally the rate of convergence of the displacement error $\|e^h\|_0$.
The following trick is remarkably successful. Let $z$ be the solution to the original variational problem over $\mathcal{H}_E^1$, when $e^h = u - u^h$ is chosen as the data. Then by the vanishing of the first variation,

(35)  $a(z, v) = (e^h, v)$ for all $v$ in $\mathcal{H}_E^1$.

In particular, we may choose $v = e^h$:

(36)  $a(z, e^h) = (e^h, e^h) = \|e^h\|_0^2$.
On the other hand, Theorem 1.1 asserts that $a(v^h, e^h) = 0$ for all $v^h$ in $S^h$. Subtracting from (36), this gives

(37)  $a(z - v^h, e^h) = \|e^h\|_0^2$.

To the left side we apply the Schwarz inequality in the energy norm, that is,

  $|a(v, w)| \le (a(v, v))^{1/2} (a(w, w))^{1/2}$

with $v = z - v^h$, $w = e^h$. By the corollary to Theorem 1.2,

  $(a(e^h, e^h))^{1/2} \le C h \|u''\|_0$.

Choosing $v^h$ as the Ritz approximation to $z$, the same corollary gives

  $(a(z - v^h, z - v^h))^{1/2} \le C h \|z''\|_0$.

Thus the Schwarz inequality applied to (37) yields

  $\|e^h\|_0^2 \le C^2 h^2 \|u''\|_0 \|z''\|_0$.

Finally, we can bound the solution $z$ by its data $e^h$; according to (3),

  $\|z''\|_0 \le \|z\|_2 \le \rho \|e^h\|_0$.

Here is the key point: to estimate the Ritz error $e^h$ in the norm of $\mathcal{H}^0$, which is variationally unnatural, one needs an equally unnatural bound for solutions in the norm of $\mathcal{H}^2$. The latter bound is entirely normal, however, from the viewpoint of differential equations; in fact, the central result of the theory was exactly this, to estimate the solution in $\mathcal{H}^2$ in terms of the data in $\mathcal{H}^0$. Substituting this bound into the previous inequality and cancelling the common factor $\|e^h\|_0$, this argument (known in the numerical analysis literature as Nitsche's trick) has established an $h^2$ estimate for the error in displacement:

THEOREM 1.5
The piecewise linear finite element approximation $u^h$, derived by the Ritz method, satisfies

(38)  $\|u - u^h\|_0 \le C^2 \rho h^2 \|u''\|_0$.

It is interesting that this bound was derived without appealing directly to the fact that approximation to order $h^2$ is possible in the $\mathcal{H}^0$ norm. This fact is therefore another consequence of the theorem: If $S^h$ achieves $O(h)$ approximation in $\mathcal{H}^1$, then it can achieve $O(h^2)$ approximation in $\mathcal{H}^0$.
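The two rates discussed in this section can be observed numerically. The following is a minimal sketch, not taken from the text: it assumes the model problem $-u'' = f$ on $[0, \pi]$ with $u(0) = u(\pi) = 0$ (that is, $p = 1$, $q = 0$), the test solution $u = \sin x$, and a uniform mesh. The function names `solve_linear_fem` and `errors` are hypothetical. Halving $h$ should divide the $\mathcal{H}^0$ error by about 4 and the square root of the energy error by about 2.

```python
import numpy as np

def solve_linear_fem(f, n):
    """Ritz solution with piecewise linear elements for -u'' = f on [0, pi],
    u(0) = u(pi) = 0, on a uniform mesh with n subintervals (assumed setup)."""
    h = np.pi / n
    x = np.linspace(0.0, np.pi, n + 1)
    # Stiffness matrix K = (1/h) * tridiag(-1, 2, -1) on the interior nodes.
    K = (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h
    # Load vector F_j = integral of f * phi_j, by 4-point Gauss quadrature.
    gp, gw = np.polynomial.legendre.leggauss(4)
    s, w = 0.5 * (gp + 1.0), 0.5 * gw      # points and weights on [0, 1]
    F = np.zeros(n + 1)
    for i in range(n):
        fx = f(x[i] + h * s)
        F[i] += h * np.sum(w * fx * (1.0 - s))
        F[i + 1] += h * np.sum(w * fx * s)
    q = np.zeros(n + 1)
    q[1:-1] = np.linalg.solve(K, F[1:-1])
    return x, q

def errors(u, du, f, n):
    """Return (||e||_0, sqrt(a(e, e))) for e = u - u^h, with a(v, v) = int (v')^2."""
    x, q = solve_linear_fem(f, n)
    h = np.pi / n
    gp, gw = np.polynomial.legendre.leggauss(4)
    s, w = 0.5 * (gp + 1.0), 0.5 * gw
    e0 = e1 = 0.0
    for i in range(n):
        xs = x[i] + h * s
        uh = q[i] * (1.0 - s) + q[i + 1] * s     # u^h inside the element
        duh = (q[i + 1] - q[i]) / h              # its constant slope
        e0 += h * np.sum(w * (u(xs) - uh) ** 2)
        e1 += h * np.sum(w * (du(xs) - duh) ** 2)
    return np.sqrt(e0), np.sqrt(e1)

# Observed convergence orders between successive mesh refinements.
prev = None
for n in (8, 16, 32, 64):
    cur = errors(np.sin, np.cos, np.sin, n)
    if prev is not None:
        print(n, np.log2(prev[0] / cur[0]), np.log2(prev[1] / cur[1]))
    prev = cur
```

The errors are integrated element by element rather than sampled at the nodes, since for this particular problem the nodal values happen to be exact and would hide the rate.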
We have remarked that the rates of decrease predicted for the error ($h^2$ for displacement and $h$ for strain) have been repeatedly confirmed by numerical experiment. Some experimenters have calculated only the errors at individual mesh points, instead of the mean-square errors over the interval, again with the same rates of convergence. [To predict these pointwise errors we must either return to the maximum principle, or assume more mean-square continuity of the data and refine the variational estimate. In a few important problems the Ritz solution is actually more accurate at the nodal points than elsewhere; for $-u'' = f$, $u(0) = u(\pi) = 0$, $u^h$ agrees identically with $u$ at the nodes, and the accuracy is infinite.] There have been a few cases, however, in which the expected convergence has not been confirmed, because of a simple flaw in the experiment: The computer has worked with the quantity

  $E^h = \max_j |e^h(jh)|$.

This expression introduces, with every decrease in $h$, a new and larger set of mesh points. In particular, points nearer and nearer to the boundary, where the error is often greatest, will determine the computed value $E^h$. It is unreasonable to expect that this error, occurring at a point which varies with $h$, will still display the optimal $h^2$ rate of decrease.
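The remark about exactness at the nodes is easy to check. The sketch below is an illustration under stated assumptions, not the authors' experiment: it takes $f(x) = x$, for which the exact solution of $-u'' = f$, $u(0) = u(\pi) = 0$ is $u = x(\pi^2 - x^2)/6$ and the load integrals $\int f \varphi_j\,dx = h x_j$ are known in closed form. The names `ritz_nodal_values` and `exact` are hypothetical.

```python
import numpy as np

def ritz_nodal_values(n):
    """Linear-element Ritz solution of -u'' = x on [0, pi], u(0) = u(pi) = 0."""
    h = np.pi / n
    x = np.linspace(0.0, np.pi, n + 1)
    K = (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h
    F = h * x[1:-1]                 # exact value of int f * phi_j for f(x) = x
    q = np.zeros(n + 1)
    q[1:-1] = np.linalg.solve(K, F)
    return x, q

def exact(x):
    """Exact solution u = x (pi^2 - x^2) / 6."""
    return x * (np.pi ** 2 - x ** 2) / 6.0

for n in (4, 8, 16):
    x, q = ritz_nodal_values(n)
    nodal_error = np.max(np.abs(q - exact(x)))
    # Error at the element midpoints, where u^h is the average of its end values.
    xm = 0.5 * (x[:-1] + x[1:])
    midpoint_error = np.max(np.abs(0.5 * (q[:-1] + q[1:]) - exact(xm)))
    print(n, nodal_error, midpoint_error)
```

The nodal error stays at rounding level on every mesh, while the midpoint error decreases like $h^2$; a study based only on $E^h$ would therefore say nothing about the true rate in this problem.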
Finally, there is the error which arises when the load $f$ is replaced by its linear interpolate $f_I$. This change, which was introduced to simplify the integrations $\int f \varphi_j\,dx$, alters the load vector from $F$ to $\tilde F$. It leads to approximate finite element solutions $\tilde Q = K^{-1}\tilde F$ and $\tilde u^h = \sum \tilde Q_j \varphi_j$, which are the exact finite element approximations for the problem with data $f_I$. Therefore, we have only to consider the change in the Ritz solution due to change in the data.

THEOREM 1.6
If $f$ is replaced by its interpolate $f_I$, then the induced error $u^h - \tilde u^h$ in the finite element approximation satisfies

  $a(u^h - \tilde u^h, u^h - \tilde u^h) \le \frac{K\rho^2}{\pi^4} h^4 \|f''\|_0^2$.

Proof. The exact solution $u - \tilde u$, corresponding to data $f - f_I$, is bounded by

(39)  $\|u - \tilde u\|_2 \le \rho \|f - f_I\|_0 \le \frac{\rho}{\pi^2} h^2 \|f''\|_0$.

The last inequality gives the error in linear interpolation; it is copied from Theorem 1.3. Now it follows that

  $a(u - \tilde u, u - \tilde u) \le K \|u - \tilde u\|_1^2 \le K \|u - \tilde u\|_2^2$.

[We have not been able to use the strongest part of (39), that even the second derivative of $u - \tilde u$ is $O(h^2)$.]
The proof is completed by applying the corollary to Theorem 1.1: $u^h - \tilde u^h$ is the projection onto $S^h$ of $u - \tilde u$, and this projection cannot increase energy,

  $a(u^h - \tilde u^h, u^h - \tilde u^h) = a(u - \tilde u, u - \tilde u) - a((u - \tilde u) - (u^h - \tilde u^h), (u - \tilde u) - (u^h - \tilde u^h))$
  $\le a(u - \tilde u, u - \tilde u) \le \frac{K\rho^2}{\pi^4} h^4 \|f''\|_0^2$.

We conclude that the error due to interpolating the data is smaller ($h^4$ in the sense of energy) than the $h^2$ error inherent in basing the Ritz approximation on linear elements.
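The $h^4$ behaviour of Theorem 1.6 can be observed directly by solving with both load vectors. This is a sketch under assumptions that are not in the text: the model problem $-u'' = f$ with $u(0) = u(\pi) = 0$, the load $f = e^x$, and a 6-point Gauss rule standing in for the "exact" load integrals. The names `load_vectors` and `energy_of_difference` are hypothetical. Halving $h$ should reduce the energy of $u^h - \tilde u^h$ by a factor near 16.

```python
import numpy as np

def load_vectors(f, n):
    """Interior load vectors: F from (nearly exact) quadrature of f * phi_j,
    and F_tilde from the linear interpolate f_I of f."""
    h = np.pi / n
    x = np.linspace(0.0, np.pi, n + 1)
    gp, gw = np.polynomial.legendre.leggauss(6)
    s, w = 0.5 * (gp + 1.0), 0.5 * gw
    F = np.zeros(n + 1)
    for i in range(n):
        fx = f(x[i] + h * s)
        F[i] += h * np.sum(w * fx * (1.0 - s))
        F[i + 1] += h * np.sum(w * fx * s)
    fn = f(x)
    # int f_I * phi_j = (h/6) (f_{j-1} + 4 f_j + f_{j+1}) at interior nodes.
    F_tilde = h * (fn[:-2] + 4.0 * fn[1:-1] + fn[2:]) / 6.0
    return F[1:-1], F_tilde

def energy_of_difference(f, n):
    """a(u^h - u~^h, u^h - u~^h) for a(v, v) = int (v')^2."""
    h = np.pi / n
    K = (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h
    F, F_tilde = load_vectors(f, n)
    d = np.linalg.solve(K, F - F_tilde)   # nodal values of u^h - u~^h
    return d @ K @ d

for n in (16, 32, 64):
    print(n, energy_of_difference(np.exp, n))
```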
1.7. THE FINITE ELEMENT METHOD IN ONE DIMENSION

This section extends the previous ones in three directions: it introduces inhomogeneous boundary conditions, elements which are quadratic or cubic rather than linear, and differential equations of order 4 as well as 2. The error estimates for different finite elements will be stated rather than proved, since they later fall within the theory which it is our main object to describe. The steps in the method itself are exactly the same as before: the variational statement of the problem, the construction of a piecewise polynomial subspace of the admissible space, and the assembly and solution of the linear equations $KQ = F$. The pattern will therefore be more or less complete, in one dimension.
We begin by retaining the same differential equation $-(pu')' + qu = f$ but with the more general boundary conditions

  $u(0) = g$,  $u'(\pi) + \alpha u(\pi) = b$.

The first of these conditions is again essential, and must be satisfied by every function $v$ in the admissible space $\mathcal{H}_E^1$. Therefore, the difference $v_0 = v_1 - v_2$ between any two admissible functions will satisfy the homogeneous condition $v_0(0) = 0$. We denote by $V_0$ the space of these differences $v_0$; it was the admissible space when the essential conditions were homogeneous.
The boundary condition at the other end is of a new kind, involving both $u'$ and $u$, and therefore the functional $I(v)$ must be recomputed. Physically, the system represents a string which is neither fixed nor completely free at $x = \pi$; instead it is connected to a spring. The new functional is

  $I(v) = \int_0^\pi (p(v')^2 + qv^2)\,dx + \alpha p(\pi) v^2(\pi) - 2\int_0^\pi fv\,dx - 2bp(\pi)v(\pi)$.
Thus the new condition has introduced a boundary term both in the linear part of the functional and in the energy

  $a(v, v) = \int (p(v')^2 + qv^2)\,dx + \alpha p(\pi) v^2(\pi)$.

The last term represents the energy in the spring.
We now verify that the vanishing of the first variation in every direction $v_0$ leads to the same conditions on the minimizing $u$ as the differential equation and boundary conditions. [Note that $u$ is perturbed by functions $v_0$ in $V_0$, which assures that the essential condition $(u + \epsilon v_0)(0) = g$ is satisfied. The perturbations are not the functions $v$ in $\mathcal{H}_E^1$.] The coefficient of $2\epsilon$ in $I(u + \epsilon v_0)$ is

  $\int [pu'v_0' + quv_0] + \alpha p(\pi)u(\pi)v_0(\pi) - \int fv_0 - bp(\pi)v_0(\pi)$
  $= \int [-(pu')' + qu - f]v_0 + p(\pi)[u'(\pi) + \alpha u(\pi) - b]v_0(\pi)$.

This expression vanishes for all $v_0$ if and only if $u$ satisfies the differential equation and the new boundary condition. Therefore, this boundary condition is natural for the modified functional $I(v)$.
In the general Ritz method, it no longer makes sense to ask that $S^h$ be a subspace of $\mathcal{H}_E^1$. $\mathcal{H}_E^1$ itself is not a vector space; it has been shifted away from the origin. Therefore, we ask $S^h$ to have the same form. The trial functions $v^h$ need not lie in the admissible space $\mathcal{H}_E^1$, but the difference of any two trial functions must be in the homogeneous space $V_0$. These differences $v_0^h = v_1^h - v_2^h$ form a finite-dimensional space $S_0^h$, which is required to be a subspace of $V_0$.
For linear finite elements, the procedure is clear. There is no constraint at $x = \pi$, where the boundary condition is natural. The trial space $S^h$ will therefore consist of all piecewise linear functions which satisfy $v^h(0) = g$. (This essential condition can be imposed exactly in one dimension, since it constrains $v^h$ only at a point. In two or more dimensions, a boundary condition $v^h(x, y) = g(x, y)$ cannot be satisfied by a polynomial, and $S^h$ is not contained in $\mathcal{H}_E^1$.) $S_0^h$ is the same piecewise linear trial space introduced in previous sections, vanishing at $x = 0$, and every $v^h$ can be expanded as

  $v^h(x) = g\varphi_0^h(x) + \sum_{j=1}^{N} q_j \varphi_j^h(x)$.

Thus the coefficient of $\varphi_0^h$ is fixed: $q_0 = g$.
The potential energy $I(v^h)$ is a quadratic in the unknowns $q_1, \ldots, q_N$, and its minimization leads again to a linear system $KQ = F$. In the interior of the interval (that is, for all but the first and last rows of the matrix) this linear system will be identical to the one in the previous sections. The first row of the matrix, corresponding to the left end of the interval, looks just like every other row except that $q_0 = g$.† Therefore the first equation in the system (with coefficients taken to be $p = q = 1$) is

  $\frac{-g + 2Q_1 - Q_2}{h} + \frac{h}{6}(g + 4Q_1 + Q_2) = \int f\varphi_1\,dx$.

Shifting the terms involving $g$ to the other side, the first row of $K$ is exactly as before. The inhomogeneous condition has altered only the first component $F_1$ of the load vector, by the amount $g/h - gh/6$.
At the other end the computation is almost as simple. The new terms in $I(v^h)$ are

  $\alpha p(\pi) q_N^2 - 2bp(\pi) q_N$.

Therefore, in the last equation $\partial I/\partial q_N = 0$, after deleting the factor 2, there is an extra $bp(\pi)$ in the load component $F_N$ and an extra $\alpha p(\pi)$ in the stiffness entry $K_{NN}$. Again the local nature of the basis was responsible for the simplicity of this change; the only basis function which is nonzero at $x = \pi$, and is therefore coupled to the boundary condition, is the last one.
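The two boundary modifications can be sketched in a few lines. The following is an illustration under assumptions, not the authors' program: it takes $p = q = 1$, a uniform mesh on $[0, \pi]$, and Gauss quadrature for the load; the function name `solve_spring_bc` is hypothetical. The essential condition is imposed only after assembly, by fixing $q_0 = g$ and shifting its column to the right side, while the natural condition adds $\alpha$ to $K_{NN}$ and $b$ to $F_N$.

```python
import numpy as np

def solve_spring_bc(f, g, alpha, b, n):
    """Linear elements for -u'' + u = f on [0, pi] with
    u(0) = g (essential) and u'(pi) + alpha*u(pi) = b (natural)."""
    h = np.pi / n
    x = np.linspace(0.0, np.pi, n + 1)
    A = np.zeros((n + 1, n + 1))
    F = np.zeros(n + 1)
    k_el = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h       # int (v')^2 part
    m_el = np.array([[2.0, 1.0], [1.0, 2.0]]) * h / 6.0   # int v^2 part
    gp, gw = np.polynomial.legendre.leggauss(4)
    s, w = 0.5 * (gp + 1.0), 0.5 * gw
    for i in range(n):
        A[i:i + 2, i:i + 2] += k_el + m_el
        fx = f(x[i] + h * s)
        F[i] += h * np.sum(w * fx * (1.0 - s))
        F[i + 1] += h * np.sum(w * fx * s)
    # Natural condition: only the last basis function is nonzero at x = pi.
    A[n, n] += alpha          # extra alpha * p(pi) in K_NN, with p = 1
    F[n] += b                 # extra b * p(pi) in F_N
    # Essential condition: q_0 is given its prescribed value after assembly.
    Q = np.empty(n + 1)
    Q[0] = g
    Q[1:] = np.linalg.solve(A[1:, 1:], F[1:] - A[1:, 0] * g)
    return x, Q

# Manufactured test: u = 1 + sin x gives f = 1 + 2 sin x, g = 1,
# and with alpha = 2 the natural condition has b = u'(pi) + 2 u(pi) = 1.
x, Q = solve_spring_bc(lambda t: 1.0 + 2.0 * np.sin(t), 1.0, 2.0, 1.0, 64)
print(np.max(np.abs(Q - (1.0 + np.sin(x)))))
```

Since `A[1, 0]` equals $-1/h + h/6$, the shift `- A[1:, 0] * g` changes only the first load component, by exactly the amount $g/h - gh/6$ found above.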
The error estimates for this problem will be the same as in the previous section, namely

  $a(u - u^h, u - u^h) \le a(u - u_I, u - u_I) = O(h^2)$  and  $\|u - u^h\|_0 = O(h^2)$.

The first estimate again depends on the variational theorem: $u$ is closer to $u^h$ than it is to $u_I$. In turn, this depends on the presence of the linear interpolate $u_I$ in the trial space $S^h$. Therefore, we may again appeal to the approximation Theorem 1.3 for the distance between a function and its interpolate.
The second extension of the method is the introduction of elements which are more "refined" than the piecewise linear functions. A given $u(x)$ is better approximated by quadratic or cubic than by linear interpolation, and there will be a corresponding improvement in the accuracy of $u^h$. Therefore, it is natural to construct trial spaces $S^h$ which are composed of polynomials of higher degree.
To start, let $S^h$ consist of all piecewise quadratic functions which are continuous at the nodes $x = jh$ and satisfy $v^h(0) = g$. Our first object is to compute the dimension of $S^h$ (the number of free parameters $q_j$) and to determine a basis. Note that as $x$ passes a node, continuity imposes only one constraint on the parabola which begins at that node; two parameters of the parabola remain free. Therefore, the dimension must be twice the number of parabolas, or $2N$.

†This corresponds to the way in which essential boundary conditions are introduced in practice. They are ignored until the matrices are assembled, and only then is the unknown (in this case $q_0$) given the value which is prescribed by the boundary condition.
A basis can be constructed by introducing the midpoints $x = (j - \frac{1}{2})h$ as nodes, in addition to the endpoints $x = jh$. There are then $2N$ nodes, since $x = 0$ is excluded and $Nh = \pi$; we shall denote them by $z_j$, $j = 1, \ldots, 2N$. To each node there corresponds a continuous piecewise quadratic which equals one at $z_j$ and zero at $z_i$, $i \ne j$: $\varphi_j(z_i) = \delta_{ij}$. These functions are of two kinds, depending on whether $z_j$ is an endpoint (Fig. 1.6a) or a midpoint (Fig. 1.6b). Notice that both are continuous and therefore in $\mathcal{H}^1$. The function $\varphi_2$, which is nonzero only over one subinterval, is not intrinsically determined by the subspace; the internal node could have been chosen anywhere in the interval. The choice of the midpoint affected the basis but not the space itself.

[Fig. 1.6 The basis functions for piecewise quadratic elements. Curve labels on $0 \le x \le h$: $1 - 3(x/h) + 2(x/h)^2$ in panel (a), $4[(x/h) - (x/h)^2]$ in panel (b).]

The element stiffness matrix $k_1$, corresponding to the integral of $(v')^2$ over $0 \le x \le h$, is computed from the quadratic which equals $q_0$ at $x = 0$, $q_{1/2}$ at the midpoint $x = h/2$, and $q_1$ at $x = h$. This quadratic is

  $v^h(x) = q_0 + (-3q_0 + 4q_{1/2} - q_1)\frac{x}{h} + (2q_0 - 4q_{1/2} + 2q_1)\frac{x^2}{h^2}$.

To relate this to the basis functions, collect the coefficient of each $q$:

  $v^h(x) = q_0\left(1 - 3\frac{x}{h} + 2\frac{x^2}{h^2}\right) + q_{1/2}\left(4\frac{x}{h} - 4\frac{x^2}{h^2}\right) + q_1\left(-\frac{x}{h} + 2\frac{x^2}{h^2}\right)$.

These three coefficients are exactly the three parabolas in Fig. 1.6. The coefficient of $q_0$ is the right half of $\varphi_1$ in Fig. 1.6a, a parabola equal to 1 at $x = 0$ and to zero at $x = h/2$, $x = h$. The coefficient of $q_{1/2}$ is the parabola $\varphi_2$ in Fig. 1.6b, and the coefficient of $q_1$ is the first half of $\varphi_1$.
The element matrix $k_1$ is computed by integrating $(dv^h/dx)^2$ and writing the result as $(q_0\; q_{1/2}\; q_1)^T k_1 (q_0\; q_{1/2}\; q_1)$. Notice that $k_1$ is a $3 \times 3$ matrix, since three of the parameters $q$ appear in any given interval; a parabola is determined by three conditions. In the calculation it is every man for himself, and we give only the final result:

(40)  $k_1 = \frac{1}{3h} \begin{pmatrix} 7 & -8 & 1 \\ -8 & 16 & -8 \\ 1 & -8 & 7 \end{pmatrix}$.

Notice that $k_1$ is singular; applied to the vector $(1, 1, 1)$ it yields zero. This vector $q_0 = 1$, $q_{1/2} = 1$, $q_1 = 1$ corresponds to a parabola $v^h$ which is actually a horizontal line $v^h = 1$, so its derivative is zero. This singularity of $k_1$ is a useful check; $k_0$ would not be singular.
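The calculation that the text leaves to the reader can be delegated to a few lines of polynomial arithmetic. This is a minimal sketch, with the hypothetical function name `quadratic_stiffness`: it integrates products of derivatives of the three parabolas $1 - 3\xi + 2\xi^2$, $4\xi - 4\xi^2$, $-\xi + 2\xi^2$ in the local variable $\xi = x/h$, and then applies the singularity check.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def quadratic_stiffness(h):
    """Element stiffness matrix for the quadratic element on [0, h],
    with unknowns ordered (q_0, q_1/2, q_1)."""
    # Shape functions in xi = x/h; coefficients in increasing powers of xi.
    shapes = [np.array([1.0, -3.0, 2.0]),    # coefficient of q_0
              np.array([0.0, 4.0, -4.0]),    # coefficient of q_1/2
              np.array([0.0, -1.0, 2.0])]    # coefficient of q_1
    k1 = np.zeros((3, 3))
    for i, a in enumerate(shapes):
        for j, b in enumerate(shapes):
            integrand = P.polymul(P.polyder(a), P.polyder(b))
            anti = P.polyint(integrand)
            # d/dx = (1/h) d/dxi twice and dx = h dxi: net factor 1/h.
            k1[i, j] = (P.polyval(1.0, anti) - P.polyval(0.0, anti)) / h
    return k1

h = 0.5
k1 = quadratic_stiffness(h)
print(3.0 * h * k1)               # the integer pattern of the matrix
print(k1 @ np.ones(3))            # applied to (1, 1, 1): the rigid motion
```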
These ideas extend directly to cubic elements. Continuity of the trial functions imposes one constraint at each of the nodes $x = jh$, leaving three free parameters in the cubic, and the dimension of $S^h$ is $3N$. To construct a basis we place two nodes within each interval, say at a distance $h/3$ from the endpoints. Together with the endpoints this gives $3N$ nodes $z_j = jh/3$. The functions which satisfy $\varphi_j(z_i) = \delta_{ij}$ constitute a basis and are of three kinds (Fig. 1.7). The element matrices will be of order 4.

[Fig. 1.7 Cubics which are only continuous at the nodes.]

There is another cubic element which is better in almost every respect. It is constructed by imposing continuity not only on the function $v^h$ but also on its first derivative. This means that the trial space is actually a subspace of the preceding one, with one new constraint at each of the $N - 1$ internal nodes $x = h, 2h, \ldots, \pi - h$. Therefore, the dimension of this new space will be $3N - (N - 1) = 2N + 1$, a reduction of one-third in the number of parameters to be calculated. The only way in which this new cubic space fails to improve on the old one is in the approximation of solutions $u$ which do not have a continuous derivative. We saw in Section 1.3 that this is the case ($u$ is in $\mathcal{H}^1$ but not in $\mathcal{H}^2$) for a point load or for a discontinuous coefficient $p(x)$ in the differential equation. In these cases of internal singularities it is essential not to require excess smoothness of the trial functions; the singularity $x_0$ should be chosen as a node, and across that node the cubic should only be continuous. This will preserve the order of convergence.
The placement of nodes for the smoother cubics becomes more interesting: There is a double node at each point $x = jh$. Instead of being determined by its values at the four distinct points $0$, $h/3$, $2h/3$, and $h$, the cubic is now determined by its values and its first derivatives at the two endpoints, that is, by $v_0$, $v_0'$, $v_1$, and $v_1'$. Both $v_1$ and $v_1'$ will be shared by the cubic in the next subinterval, assuring continuity of $v$ and $v'$. The basis functions are, therefore, of two kinds (Fig. 1.8). These basis functions have a double zero at the ends $(j \pm 1)h$. They are called Hermite cubics, since they interpolate both function value and derivative; the earlier examples might have been associated with the name of Lagrange.

[Fig. 1.8 Hermite cubics: $v$ and $v'$ continuous. Curves $\psi(x/h)$ and $\omega(x/h)$ on $-h \le x \le h$, with $\psi(0) = 1$ and $\omega(x) = x(|x| - 1)^2$.]

The cubic polynomial on $0 \le x \le h$ which takes on the four prescribed values $v_0$, $v_0'$, $v_1$, and $v_1'$ is

(41)  $v^h(x) = v_0\,\psi\!\left(\frac{x}{h}\right) + v_0' h\,\omega\!\left(\frac{x}{h}\right) + v_1\,\psi\!\left(\frac{x - h}{h}\right) + v_1' h\,\omega\!\left(\frac{x - h}{h}\right)$
      $= v_0 + v_0' x + (3v_1 - 3v_0 - v_1' h - 2v_0' h)\frac{x^2}{h^2} + (2v_0 - 2v_1 + hv_1' + hv_0')\frac{x^3}{h^3}$.

The element matrices, which are of order 4, are computed by the following integrations, where $q$ is the column vector $(v_0, v_0', v_1, v_1')^T$:
- CHAP. 1 THE FINITE ELEMENT METHOD IN ONE DIMENSION 57
1. In these cases of interna! singularities

smoothness of tHe trial functions; the Mass matrix k 0 : J: (v") 2 qTk 0 q,
1 node, and across that node the cubic
'iii preserve the order of convergenee.
Stiffness matrix k 1 : J: ((v")') 2 qrk 1 q,
10other cubics becomes more interesting:
x = jh. Instead of being determined by
Bending matrix 1<;2 : J: ((v")") 2
=· qTk2q.
, h/3, 2hf3, and h, the;:~ubic is now deter-

'atives at the two endpoints, that is, by Notice that because v" is in 3e2, it can be us~d in fourth-order problems;
ill be shared by the cubic in the next at that point the bending matrix will be needed. .
nd v'. The basis functions are, therefore, There are several ways in which to organize the computat10n of the mass
Jnctions have a double zero at the ends matrix k 0 • One of the best is to form the matrix H ~hich cortnects the four)
nodal parameters in the vector q to the four coeffictents A = (ao, al, a2, a3
w(x/h)
of the cubic polynomial v": A = Hq, or from (41 ),
ao o o o Vo
al o 1 o o V~
J 2 3 1
-h2 h2 --¡¡ v.
2 2 1 v~.
h3 h2 -h3
The integration of ( v") 2 (a 0 + a x + a 2x 2 + a

1 3 x 3 ) 2 is completely trivial,
h2 h3
w(x) = x(!xl-1) 2 h
2
es: v and v' ct;mtinuous. h2 h3 h4
~es, since they interpolate both function J: (11')' dx (aoata2a3) h3

2
h4
4
hs
at
a2
•les might háve been associated with the 3 4
: h which takes on the four prescribed
Denoting this matrix by N 0 , the result beco mes

X _:_ h) + v 1cv (X-h-h)
+- V 1 rp ( -h- 1
2
'o - v'1h - 2v~h)~ 2
Therefore, the element mass matrix must be
1 x3
+- hvo'Jj¡J ·
·der 4, are computed by the following All this is easily programmed.

~ctor (v0 , v~, v~' v'1)T: For the stiffness matrix, the l:ink H between the nodal parameters q and
1.7. THE FINITE ELE!\
58 AN INTRODUCTJON TO THE THEORY CHAP. 1 SEC.
the coefficient vector $a$ is unchanged. The only difference now is that

$$\int_0^h ((v^h)')^2\,dx = \int_0^h (a_1 + 2a_2x + 3a_3x^2)^2\,dx = a^TN_1a = a^T\begin{pmatrix}0 & 0 & 0 & 0\\ 0 & h & h^2 & h^3\\ 0 & h^2 & 4h^3/3 & 3h^4/2\\ 0 & h^3 & 3h^4/2 & 9h^5/5\end{pmatrix}a.$$

The element stiffness matrix is $k_1 = H^TN_1H$. The results of these computations, and of a similar one for $k_2$, are the following matrices (which are to be completed by symmetry):

$$k_0 = \frac{h}{420}\begin{pmatrix}156 & 22h & 54 & -13h\\ & 4h^2 & 13h & -3h^2\\ & & 156 & -22h\\ & & & 4h^2\end{pmatrix},$$

$$k_1 = \frac{1}{30h}\begin{pmatrix}36 & 3h & -36 & 3h\\ & 4h^2 & -3h & -h^2\\ & & 36 & -3h\\ & & & 4h^2\end{pmatrix},$$

$$k_2 = \frac{1}{h^3}\begin{pmatrix}12 & 6h & -12 & 6h\\ & 4h^2 & -6h & 2h^2\\ & & 12 & -6h\\ & & & 4h^2\end{pmatrix}.$$

The matrix $k_0$ is positive definite, whereas $k_1$ has a zero eigenvalue corresponding to the constant function $v^h = 1$, that is, to $q = (1, 0, 1, 0)$. The matrix $k_2$ should be doubly singular, since every linear $v^h$ will have $(v^h)'' = 0$. The new null vector, corresponding to $v^h(x) = x$, is $q = (0, 1, h, 1)$.
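The products $H^TN_dH$ can be checked mechanically. The following is a minimal sketch in exact rational arithmetic, not the authors' program: the helper names (`falling`, `moment_matrix`, `matmul`, `element_matrix`, `matvec`) are illustrative, and the explicit entries of $H$ are an assumption obtained by solving $v(0) = v_0$, $v'(0) = v_0'$, $v(h) = v_1$, $v'(h) = v_1'$ for the monomial coefficients.

```python
from fractions import Fraction


def falling(i, d):
    """i(i-1)...(i-d+1): the factor produced by differentiating x**i d times."""
    out = 1
    for t in range(d):
        out *= i - t
    return out


def moment_matrix(h, d):
    """N_d[i][j] = integral over [0, h] of the d-th derivatives of x**i and x**j."""
    N = [[Fraction(0)] * 4 for _ in range(4)]
    for i in range(d, 4):
        for j in range(d, 4):
            p = i + j - 2 * d  # power of x remaining after differentiation
            N[i][j] = falling(i, d) * falling(j, d) * h ** (p + 1) / (p + 1)
    return N


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def element_matrix(h, d):
    """k_d = H^T N_d H for the Hermite cubic on [0, h] (d = 0, 1, 2)."""
    h = Fraction(h)
    # Assumed form of H: maps q = (v0, v0', v1, v1') to coefficients (a0, a1, a2, a3).
    H = [[1, 0, 0, 0],
         [0, 1, 0, 0],
         [-3 / h**2, -2 / h, 3 / h**2, -1 / h],
         [2 / h**3, 1 / h**2, -2 / h**3, 1 / h**2]]
    Ht = [list(col) for col in zip(*H)]
    return matmul(matmul(Ht, moment_matrix(h, d)), H)


def matvec(K, q):
    return [sum(a * b for a, b in zip(row, q)) for row in K]


h = Fraction(1, 2)
k0, k1, k2 = (element_matrix(h, d) for d in range(3))
print(matvec(k1, [1, 0, 1, 0]))   # constant state: no strain
print(matvec(k2, [0, 1, h, 1]))   # linear state: no curvature
```

Exact fractions are used so that the comparison with the closed forms and the null vectors is an identity rather than a floating-point approximation.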
It is sometimes valuable to change $k_1$ and $k_2$ from the above singular element matrices to natural element matrices. This means that the matrices are made nonsingular by removing the rigid body motions, which are the cases $(v^h)' = 0$ and $(v^h)'' = 0$. The orders of the matrices are reduced to 3 and 2, respectively, and they now carry no redundant information; the fact that $(1, 0, 1, 0)$ is a null vector of $k_1$, which is automatic because it corresponds to a state with no strain, is taken for granted and not displayed. It appears that these "natural" matrices may make it simpler for a single program to accept a wide variety of elements, which is a tremendous asset; the proliferation of new elements, and of combinations between our present displacement method and force methods, has been overwhelming. In the book we keep the stiffness matrices in their singular form, because they display so clearly the role of all four nodal parameters $v_0$, $v_0'$, $v_1$, and $v_1'$.
For application to the problem $-(pu')' + qu = f$, these element matrices must be assembled into the global stiffness matrix $K$. Assuming constant coefficients, a typical row (or rather pair of rows, since there are two unknowns $u_j$ and $u_j'$ at the mesh point $x_j = jh$) of the assembled $K$ is

$$\frac{p}{30h}\left(-36u_{j-1} - 3hu'_{j-1} + 72u_j - 36u_{j+1} + 3hu'_{j+1}\right) + \frac{qh}{420}\left(54u_{j-1} + 13hu'_{j-1} + 312u_j + 54u_{j+1} - 13hu'_{j+1}\right) = F_j = \int f(x)\,\psi\!\left(\frac{x}{h} - j\right)dx, \tag{42a}$$

$$\frac{p}{30}\left(3u_{j-1} - hu'_{j-1} + 8hu'_j - 3u_{j+1} - hu'_{j+1}\right) + \frac{qh^2}{420}\left(-13u_{j-1} - 3hu'_{j-1} + 8hu'_j + 13u_{j+1} - 3hu'_{j+1}\right) = F_j' = \int f(x)\,\omega\!\left(\frac{x}{h} - j\right)dx. \tag{42b}$$

It is interesting to make sense of this as a difference equation. Suppose $p = 1$, $q = 0$, and $f = 1$, so that the differential equation is $-u'' = 1$. The finite element equations above become

$$-\frac{6}{5}\,\frac{u_{j+1} - 2u_j + u_{j-1}}{h^2} + \frac{1}{5}\,\frac{u'_{j+1} - u'_{j-1}}{2h} = 1,$$

$$-\frac{1}{5}\,\frac{u_{j+1} - u_{j-1}}{2h} + \frac{1}{5}u'_j - \frac{1}{30}\left(u'_{j+1} - 2u'_j + u'_{j-1}\right) = 0.$$

After Taylor expansions, the first equation is consistent with $-u'' = 1$, and the second with $-h^2u'''/15 = 0$, which can be derived by differentiating the first. This is exactly the point at which finite element equations have contributed a new and valuable idea to the established finite difference technique. Instead of operating only with the unknown $u_j$, so that there is always one equation per mesh point, the finite element difference equations may couple both displacements and slopes as unknowns, the equation for the slope being formally consistent with the original equation differentiated once. The effect is that high accuracy can be achieved, and derivatives of high order can be approximated, without abandoning the strictly local nature of the difference equation. The more irregular the mesh and the more curved the boundaries,
the more important this innovation becomes. It should be taken seriously by the developers of difference schemes, since there, without being limited by having to derive discrete analogues from polynomial displacements and the Ritz method, even greater efficiency might be possible.
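The consistency of the coupled equations for $-u'' = 1$ can be checked by hand. The following is a sketch under stated assumptions: the two difference equations are taken in the form obtained from (42) with $p = 1$, $q = 0$, $f = 1$ after division by $h$, and $u_j$, $u_j'$ are replaced by the values at $jh$ of a smooth function $u$ and its derivative. The expansions about $jh$ are

$$\begin{aligned}
\frac{u_{j+1} - 2u_j + u_{j-1}}{h^2} &= u'' + \frac{h^2}{12}u'''' + O(h^4), &
\frac{u'_{j+1} - u'_{j-1}}{2h} &= u'' + \frac{h^2}{6}u'''' + O(h^4),\\
\frac{u_{j+1} - u_{j-1}}{2h} &= u' + \frac{h^2}{6}u''' + O(h^4), &
u'_{j+1} - 2u'_j + u'_{j-1} &= h^2u''' + O(h^4).
\end{aligned}$$

Substituting into the displacement equation gives

$$-\frac{6}{5}\left(u'' + \frac{h^2}{12}u''''\right) + \frac{1}{5}\left(u'' + \frac{h^2}{6}u''''\right) = -u'' - \frac{h^2}{15}u'''' + O(h^4),$$

and into the slope equation,

$$-\frac{1}{5}\left(u' + \frac{h^2}{6}u'''\right) + \frac{1}{5}u' - \frac{h^2}{30}u''' = -\frac{h^2}{15}u''' + O(h^4).$$

The leading term of the first is $-u''$, and the leading term of the second is a multiple of $u'''$, which vanishes when $-u'' = 1$ is differentiated once.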
Suppose we apply Taylor expansions to test the order of accuracy of the Hermite difference equations (42). Starting with the assumptions

$$u_j = u(jh) + \sum_n h^n e_n(jh), \qquad u_j' = u'(jh) + \sum_n h^n E_n(jh), \tag{43}$$

and expanding $u_{j\pm1}$ and $u'_{j\pm1}$ about the central point $jh$, it is possible to use the original differential equation and its differentiated forms to check that $E_n = e_n'$, and that these terms disappear for $n = 1, 2, 3$. In other words, the coupled difference equation is fourth-order accurate. This matches exactly the error estimates to be derived variationally, just as both the Taylor series and the variational bounds on $u - u^h$ were $O(h^2)$ in the linear case. However, an unanticipated and slightly distressing point emerged in [M8]: The asymptotic expansions (43) are spoiled by a mismatch at the boundaries, and the finite element errors seem to be less systematic as $h \to 0$ than a simple power series in $h$. The errors are, however, definitely of order $h^4$.
There is still another important cubic space, formed from those functions for which even the second derivative is continuous at the nodes. A piecewise cubic which has continuous second derivatives is called a cubic spline. This space is again a subspace of the preceding one, the Hermite cubics, with one new constraint at each of the $N - 1$ internal nodes. Therefore, the dimension of the spline subspace is $3N - 2(N - 1) = N + 2$. This means that there is one unknown per mesh point, including the extra points $x_0 = 0$ (where $v_0 = 0$ but the slope $v_0'$ may be regarded as the free parameter) and $x_{N+1} = \pi + h$ (or we may prefer that the last parameter be $v_N'$). At internal mesh points the displacement $v_j$ is the unknown, and the finite element equations will once more look exactly like conventional finite difference equations.

It is no longer obvious which four nodal parameters determine the shape of the cubic over a given subinterval, say $(j - 1)h \le x \le jh$. The nodes $x_{j-1}$ and $x_j$, which belong to the interval, account for only two conditions, and the other two must come from outside. Therefore, cubic splines cannot have a strictly local basis, and the shape of $v^h$ within an element must be affected by displacements outside the element. In fact, the spline which vanishes at all nodes except one (a cardinal spline) will be nonzero over every subinterval between nodes.

To compute with these splines, we need to construct the one which equals 1 at the origin and which drops to zero as rapidly as possible (Fig. 1.9). This function is known as a basic spline, or B-spline. It is fundamental to the whole theory of splines, and was among the many remarkable discoveries of I. J.
Schoenberg. In particular, Schoenberg has proved that every cubic spline on $[0, \pi]$ can be written as a linear combination of B-splines:

$$v^h(x) = \sum_{j=-1}^{N+1} q_j\,\varphi_j^h(x).$$

Fig. 1.9 The cubic B-spline $\varphi(x)$: continuous second derivatives at the nodes.

These basis functions $\varphi_j^h$ are formed from the B-spline in the figure by rescaling the independent variable from $x$ to $x/h$, and translating the origin to lie at the node $jh$:

$$\varphi_j^h(x) = \varphi\!\left(\frac{x}{h} - j\right).$$
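The rescaling and translation can be written out directly. The sketch below assumes the standard piecewise-cubic formula for the B-spline on $[-2, 2]$, multiplied by $3/2$ so that it equals 1 at the origin as the text requires; the function names `bspline` and `basis` are illustrative and the formula is not quoted from the figure.

```python
def bspline(x):
    """Cubic B-spline supported on [-2, 2], scaled so that bspline(0) == 1.

    Assumed formula: the usual cardinal cubic B-spline times 3/2.
    """
    t = abs(x)
    if t <= 1:
        return 1 - 1.5 * t**2 + 0.75 * t**3
    if t <= 2:
        return 0.25 * (2 - t) ** 3
    return 0.0


def basis(j, h, x):
    """phi_j^h(x) = phi(x/h - j): the B-spline rescaled to mesh width h, centered at j*h."""
    return bspline(x / h - j)


# Each basis function is nonzero on the four intervals around its own node only.
h = 0.5
print([basis(3, h, k * h) for k in range(0, 7)])
```

The two cubic pieces agree in value, slope, and second derivative at $|x| = 1$, and all three vanish at $|x| = 2$, which is the continuity the caption of Fig. 1.9 describes.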
In the expression for $v^h$, the constraint that it vanish at the origin has not been included. If $\varphi_*^h$ and $\varphi_{**}^h$ are combinations of the B-splines $\varphi_j^h$ nearest the origin which satisfy this essential condition, the admissible cubic splines can be characterized exactly as

$$v^h(x) = q_*\varphi_*^h(x) + q_{**}\varphi_{**}^h(x) + \sum_{j=2}^{N+1} q_j\,\varphi_j^h(x).$$

In this form the $N + 2$ unknowns in the finite element system $KQ = F$ would be $q_*, \ldots, q_{N+1}$. The unknown $q_{N+1}$ appears because the natural boundary condition imposes no constraint in the variational formulation; it would be interesting to see the effect on $u^h$ of satisfying this condition and thereby removing the last unknown. Presumably this effect is exponentially small.

Two final remarks on splines:

1. It may be that their greatest importance comes in approximation rather than minimization. A given relationship between data is well and conveniently approximated by a spline, while the search for an unknown spline by minimization is less convenient.
2. With unequally spaced nodes they are still nonzero over four intervals (the minimum for a cubic spline), and if every node $x_{2N-1}$ approaches $x_{2N}$, the B-splines degenerate into the Hermite basis functions $\psi$ and $\omega$.

It is easy to summarize the approximation properties of these more refined spaces. If the polynomials are of degree $k - 1$ ($k = 3$ for quadratics, $k = 4$ for cubics), then any smooth function $u$ differs from its interpolate by

$$\|u - u_I\|_0 \le Ch^k\|u^{(k)}\|_0.$$

The errors in the derivatives lose one power of $h$ for every differentiation:

$$\|u - u_I\|_s \le Ch^{k-s}\|u^{(k)}\|_0. \tag{44}$$

This estimate makes sense only if $u_I$ is known to possess $s$ derivatives, in other words, only if the piecewise polynomials lie in $\mathcal{H}^s$. Therefore, there is a limit $s \le q$ on the inequality (44): $q = 1$ for the $\mathcal{C}^0$ quadratics and cubics, which are in $\mathcal{H}^1$, $q = 2$ for Hermite cubics, and $q = 3$ for splines. For $s > q$ the estimate can still be proved element by element, but $\delta$-functions appear at the nodes.

These approximation results lead to the expected rates of convergence for the finite element method, provided that the derivative $u^{(k)}$ has finite energy: the slopes are in error by $O(h^{k-1})$, the strain energy by $O(h^{2(k-1)})$, and the displacements $u - u^h$ by $O(h^k)$. Since these rates are confirmed numerically, there is good reason to compute with refined finite elements.
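The interpolation rate for $k = 4$ is easy to observe numerically. The sketch below is an illustration under assumptions, not an experiment from the text: it interpolates $u = \sin x$ on $[0, \pi]$ by Hermite cubics, replaces the norm $\|\cdot\|_0$ by a root-mean-square over sample points, and uses the standard Hermite shape functions on a unit interval. Halving $h$ should divide the error by roughly $2^4 = 16$.

```python
import math


def hermite_interp(u, du, a, b, n, x):
    """Piecewise Hermite cubic interpolate of u on n equal subintervals of [a, b]."""
    h = (b - a) / n
    j = min(int((x - a) / h), n - 1)   # index of the subinterval containing x
    x0 = a + j * h
    x1 = x0 + h
    s = (x - x0) / h                   # local coordinate in [0, 1]
    return (u(x0) * (1 - s) ** 2 * (1 + 2 * s)
            + h * du(x0) * s * (1 - s) ** 2
            + u(x1) * s ** 2 * (3 - 2 * s)
            + h * du(x1) * s ** 2 * (s - 1))


def rms_error(n, samples=3200):
    """Discrete stand-in for the L2 norm of u - u_I, with u = sin on [0, pi]."""
    total = 0.0
    for i in range(samples):
        x = math.pi * (i + 0.5) / samples
        diff = math.sin(x) - hermite_interp(math.sin, math.cos, 0.0, math.pi, n, x)
        total += diff * diff
    return math.sqrt(total / samples)


ratio = rms_error(16) / rms_error(32)
print(ratio)  # observed reduction factor when h is halved
```

The same harness can be pointed at other smooth functions; only the pair `(u, du)` changes.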
The final step in this section is to admit differential equations of fourth order,

$$Lu = (ru'')'' - (pu')' + qu = f. \tag{45}$$

The operator $L$ is still formally self-adjoint, since $u'''$ and $u'$ are not allowed to occur by themselves, and it is positive definite if $r \ge r_{\min} > 0$, $p \ge 0$, $q \ge 0$.

The associated energy inner product is

$$a(u, v) = \int (ru''v'' + pu'v' + quv)\,dx,$$

and the Euler equation for the minimization of $I(v) = a(v, v) - 2(f, v)$ is $Lu = f$. For the Ritz method to apply, the trial functions $v^h$ must have finite energy, which means that $v^h$ must lie in $\mathcal{H}^2$. The Hermite and spline cubics are therefore applicable, but not the polynomials which are only continuous. They are "nonconforming," and to use them by ignoring the $\delta$-functions in their second derivatives at the nodes would in one dimension guarantee disaster.
Equation (45) represents the bending of a beam. If it is clamped at $x = 0$, the boundary conditions at that end are

$$u(0) = u'(0) = 0,$$

and these are essential: Every trial function must have a double zero at the origin. To see which are the natural conditions at $x = \pi$, we integrate by parts in the equation $a(u, v) = (f, v)$ for the vanishing of the first variation. The result is that for every $v$ in the admissible space $\mathcal{H}_E^2$,

$$\int_0^\pi \left[(ru'')'' - (pu')' + qu - f\right]v\,dx + ru''v'\big|_\pi + \left(pu' - (ru'')'\right)v\big|_\pi = 0. \tag{46}$$

Thus if no conditions are imposed on $v$ at $x = \pi$, the natural boundary conditions on $u$ are those which physically correspond to a free (or cantilevered) end:

$$u''(\pi) = 0, \qquad (pu' - ru''')(\pi) = 0.$$
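The boundary terms in (46) come from two integrations by parts of the bending term and one of the stretching term. As a sketch, assuming $r$, $p$, and $u$ are smooth enough for the derivatives shown to exist:

$$\begin{aligned}
\int_0^\pi ru''v''\,dx &= ru''v'\Big|_0^\pi - (ru'')'v\Big|_0^\pi + \int_0^\pi (ru'')''v\,dx,\\
\int_0^\pi pu'v'\,dx &= pu'v\Big|_0^\pi - \int_0^\pi (pu')'v\,dx.
\end{aligned}$$

The contributions at $x = 0$ drop out because every admissible $v$ satisfies $v(0) = v'(0) = 0$. Adding $\int (qu - f)v\,dx$ and collecting the terms at $x = \pi$ gives (46). Since $v(\pi)$ and $v'(\pi)$ are independent and arbitrary for a free end, each of their coefficients must vanish separately; and once $u''(\pi) = 0$, the quantity $(ru'')' = r'u'' + ru'''$ reduces to $ru'''$ at that point.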
There is an additional case of great importance, in which the beam is simply supported: $u(\pi) = 0$. This becomes an essential condition, which the trial functions $v$ must satisfy. Therefore, the last term in the first variation (46) is automatically zero, with no condition on $u$ at $\pi$. However, the other integrated term is zero for all $v$ only if $u''(\pi) = 0$, and this natural condition must remain. Thus a simply supported end combines one essential and one natural boundary condition:

$$u(\pi) = u''(\pi) = 0.$$

Other combinations of boundary conditions could be imagined, such as $u'(\pi) = u''(\pi) = 0$. But these are weird, both physically and variationally.
1.8. BOUNDARY-VALUE PROBLEMS IN TWO DIMENSIONS

This section introduces several problems in the plane, or rather in a plane domain $\Omega$ bounded by a smooth curve $\Gamma$. Our purpose is first of all to match the differential forms of these problems, in which the Laplace operator $\Delta$ and the biharmonic operator $\Delta^2$ appear, with the equivalent variational forms. This means that in the variational statement we have to identify the admissible spaces in which the solution is sought. Naturally these spaces depend on the boundary conditions, and as in the two-point boundary-value problem, there will be a distinction between Dirichlet conditions (essential conditions) and Neumann conditions (natural conditions). The examples are extremely
conventional, but they are the simplest models for plane stress and plate bending, and it seems worthwhile to illustrate once more the basic ideas:

1. The equivalence of differential and variational problems, the admissible space for the latter being constructed by completion in the energy norm.
2. The vanishing of the first variation, giving the weak form $a(u, v) = (f, v)$ of the equation, and leading to Galerkin's method.
3. The Ritz process of minimization over a subspace.

Then the next section introduces the finite element method in earnest, including many of the most important choices of piecewise-polynomial "elements."
The requirement that the boundary curve $\Gamma$ be smooth introduces one difficulty while preventing another. On the one hand, it is certain that the interior $\Omega$ cannot be subdivided into polygons, say triangles, without leaving a "skin" around the boundary. The approximation theory in Chapter 3 will reflect this difficulty. On the other hand, the smoothness of the boundary makes it reasonable to suppose that the solution is itself smooth. This property follows rigorously from the theory of elliptic boundary-value problems, if the coefficients in the equation and the data are also smooth.

In contrast, consider the problem $u_{xx} + u_{yy} = 1$ in a polygon, with $u = 0$ on the boundary. For the unit square, the solution behaves radially like $r^2\log r$ near a corner, and the second derivatives break down. (Some breakdown is obvious, since at the corner $u_{xx}$ and $u_{yy}$ both vanish, while their sum is one just inside.) In this problem $u$ does have second derivatives in the mean-square sense; $u$ lies in $\mathcal{H}^2$ but not in $\mathcal{H}^3$. This can be verified directly by constructing $u$ as a Fourier series. Therefore, the error estimates for piecewise linear approximation will apply to this problem, but the accuracy associated with higher-order elements cannot be attained without either refining the mesh at the corners, or introducing special trial functions which possess the same kind of singularity as the solution $u$. For a nonconvex polygon, for example an L-shaped region, $u$ fails even to have second derivatives in a mean-square sense. The solution must lie in $\mathcal{H}^1$, which is the admissible space (or rather $\mathcal{H}^1$ contains the admissible space, depending on the boundary conditions), but since $u$ behaves like $r^{2/3}$ at the interior angle of the L, it is excluded from $\mathcal{H}^2$. Such singularities, arising when $\Gamma$ is not smooth, are studied in Chapter 8.
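The membership claims can be checked by counting powers of $r$ near the vertex. This is a heuristic sketch, assuming the radial behavior quoted above governs the derivatives and ignoring the smooth angular factor. If $u \sim r^\alpha$, then derivatives of order $m$ behave like $r^{\alpha - m}$, and in polar coordinates

$$\iint |D^mu|^2\,dx\,dy \sim \int_0 r^{2(\alpha - m)}\,r\,dr < \infty \quad\text{if and only if}\quad \alpha > m - 1.$$

For the re-entrant corner, $\alpha = 2/3$: first derivatives give $\int_0 r^{1/3}\,dr$, which is finite, while second derivatives give $\int_0 r^{-5/3}\,dr$, which diverges, so $u$ lies in $\mathcal{H}^1$ but not in $\mathcal{H}^2$. For the square, $u \sim r^2\log r$: second derivatives grow only like $\log r$ and are square integrable, while third derivatives behave like $1/r$ and give $\int_0 r^{-1}\,dr$, which diverges, so $u$ lies in $\mathcal{H}^2$ but not in $\mathcal{H}^3$.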
We begin this section by studying the Dirichlet problem, governed inside the domain by Poisson's equation

$$-\Delta u = f \quad\text{in } \Omega,$$

and on the boundary by the Dirichlet condition

$$u = 0 \quad\text{on } \Gamma.$$
The sign in the differential equation is chosen so that the operator

$$L = -\Delta$$

will be associated with a positive, rather than a negative, quadratic form $(Lu, u)$.

As in the one-dimensional example, we need to prescribe some norm for the inhomogeneous term, in other words to choose some set of data $f$ for which the Dirichlet problem is to be solved. Our choice of norm will be the same as before:

$$\|f\|_0 = \left(\iint_\Omega |f(x, y)|^2\,dx\,dy\right)^{1/2}.$$

If this norm is finite, then $f$ belongs to the data space $\mathcal{H}^0(\Omega)$. As in one-dimensional problems, this space includes any piecewise continuous $f$ and excludes $\delta$-functions.
Jf elliptic boundary-value problems, As possible solutions to the differeniial equation we admit alJ functions
1e data are also smooth. u which vanish on the boundary r and have derivatives up to second order
-"x + u:r:r 1 in a polygon, with u O in Q. The natural norm for this solution space is
·e, the solution behaves radially like
lerivatives break down. (Sorne break-
' and u:r:r both vanish, while their sum
11 ull2 = [J J(u 2+u;+ u;+ u;x + u;:r + u;:r) dxdyJ12 ·
n
s ha ve second deriva ti ves in the mean-
:. This can be verified direct1y by con- The space of all functions for which this norm is finite-functions whose
ue, the error estimates for piecewise second derivatives have finite energy_!_is X 2 (Q). The solution space Xi
problem, but the accuracy associated for the Dirichlet problem is then a subspace of X 2 , determined by the bound-
attained without either refining the ary condition u= o on r.
pedal trial functions which possess It should be dear from the definitions of the norms that the Laplace
lution u. For a nonconvex polygon, operator L = -A is a bounded operator from Xi to X 0 :
even to have second derivatives in a
e in X 1 , whích is the admissible space IILullo < Kllull2·
space, depending on the boundary
3 The crucial point of the Dirichlet theory goes in the opposite directioQ: that
' at the interior angle of the L, it is
the inverse of L-which is given by the Green's function of th~ problem-
arísing when r is riot smooth, are
yields a solution u which depends continuously on the data f. This means
that to el'ery f there corresponds one and only one u, and for some constant p,
e Diríchlet problem, governed inside
(47) 11 U llz P 11 f llo ·
in n, The construction of this solution could be attacked directly by finite
difference methods, most commonly by the .five-point difference equation
mdition
on r. (48)
Near the boundary this equation must be modified, but it is possible to preserve both the second-order accuracy of the scheme and the discrete maximum principle which is obvious from (48): If $f = 0$, then $U_{i,j}$ cannot exceed all four of the neighboring values $U_{i\pm1,j}$, $U_{i,j\pm1}$. (We do not know whether there is a theoretical limit to the order of accuracy of schemes satisfying a maximum principle. It is certain that as the accuracy increases, the boundary conditions on the difference equation will become extremely complicated.)

Without going into details, we note that the equation implicitly requires $f$ to be continuous, in order for $f_{i,j} = f(i\,\Delta x, j\,\Delta y)$ to be well defined. When $f$ is less smooth, some averaging process (which will be built into the variational methods!) may be applied. The best estimate for the five-point scheme in a square seems to be

$$\max_{i,j}|U_{i,j} - u_{i,j}| \le Ch^2|\ln h|\max|f|.$$

Here $h = \Delta x = \Delta y$. The extra factor $|\ln h|$, which was not needed in one dimension, is required by the $r^2\log r$ behavior at a corner.
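The second-order accuracy of the five-point scheme can be observed on a small example. The sketch below is illustrative only: it assumes the standard five-point equation with $\Delta x = \Delta y = h$ on the unit square, solves it by Gauss-Seidel sweeps (a choice made here for brevity, not a method prescribed by the text), and uses the manufactured solution $u = \sin\pi x\,\sin\pi y$ with $f = 2\pi^2u$. This $u$ is smooth, so the $|\ln h|$ factor caused by corner behavior does not show up.

```python
import math


def solve_poisson(n, sweeps=5000, tol=1e-12):
    """Gauss-Seidel for the five-point scheme on the unit square; h = 1/n, U = 0 on the boundary."""
    h = 1.0 / n
    f = lambda x, y: 2 * math.pi ** 2 * math.sin(math.pi * x) * math.sin(math.pi * y)
    U = [[0.0] * (n + 1) for _ in range(n + 1)]
    for _ in range(sweeps):
        change = 0.0
        for i in range(1, n):
            for j in range(1, n):
                # (4 U_ij - sum of four neighbors) / h^2 = f_ij, solved for U_ij
                new = 0.25 * (U[i + 1][j] + U[i - 1][j] + U[i][j + 1] + U[i][j - 1]
                              + h * h * f(i * h, j * h))
                change = max(change, abs(new - U[i][j]))
                U[i][j] = new
        if change < tol:
            break
    return U


def max_error(n):
    """Maximum nodal error against the assumed exact solution sin(pi x) sin(pi y)."""
    U = solve_poisson(n)
    h = 1.0 / n
    return max(abs(U[i][j] - math.sin(math.pi * i * h) * math.sin(math.pi * j * h))
               for i in range(n + 1) for j in range(n + 1))


e8, e16 = max_error(8), max_error(16)
print(e8, e16, e8 / e16)  # halving h should reduce the error by about 4
```

Because the update for $U_{i,j}$ is an average of its four neighbors when $f = 0$, the same loop also makes the discrete maximum principle visible: no interior value can then exceed all of its neighbors.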
For us the important formulation of Dirichlet's problem is the variational one: the admissible functions $v$ vanish on the boundary $\Gamma$, and the solution $u$ is the one which minimizes the quadratic functional

$$I(v) = \iint_\Omega \left(v_x^2 + v_y^2 - 2fv\right)dx\,dy.$$

The first step is to check that a solution $u$ to the differential problem does minimize $I$. Perturbing $u$ in any direction $v$,

$$I(u + \epsilon v) = I(u) + 2\epsilon\iint_\Omega \left(u_xv_x + u_yv_y - fv\right)dx\,dy + \epsilon^2\iint_\Omega \left(v_x^2 + v_y^2\right)dx\,dy.$$

We have only to show that the coefficient of $\epsilon$ is zero. Then since the coefficient of $\epsilon^2$ is positive unless $v$ is constant, which would imply by the boundary condition that $v = 0$, $u$ will be the unique function which minimizes $I$. The vanishing of the coefficient of $\epsilon$, that is, of the first variation, means that

$$\iint_\Omega \left(u_xv_x + u_yv_y - fv\right)dx\,dy = 0 \quad\text{for all admissible } v. \tag{49}$$

To prove that the solution to the Dirichlet problem satisfies this identity, we apply Green's theorem; or in other words, we integrate by parts over $\Omega$:

$$-\iint_\Omega \left(u_{xx} + u_{yy} + f\right)v\,dx\,dy + \int_\Gamma u_nv\,ds = 0, \tag{50}$$

where $u_n$ is the derivative in the direction of the outward normal. Since $v = 0$
on $\Gamma$, and $u$ satisfies Poisson's equation in $\Omega$, the first variation vanishes. Therefore, $u$ is minimizing.

Of course, this process works equally well in the opposite direction. The Poisson equation $-\Delta u = f$ is put into its weak form by multiplying by any $v$ which vanishes on the boundary, integrating over $\Omega$, and transforming the left side by Green's theorem. The weak form is exactly the equation (49). It is to this form that Galerkin's method applies; this equation is imposed on a subspace $S^h$ and the solution $u^h$ is also in $S^h$. As we see, when the form $a(v, v)$ is self-adjoint and positive definite, the Ritz minimization and Galerkin's vanishing of the first variation are equivalent. The weak equation $a(u, v) = (f, v)$ retains its meaning, however, even without self-adjointness.
This argument depended on the theory of elliptic equations to produce $u$. Suppose, instead, that we begin with the quadratic $I(v)$ and try to find its minimum directly. Then the first task is to decide exactly which functions $v$ are admissible in the minimization.

The reasoning runs parallel to the one-dimensional example of Section 1.5: any smooth $v$ which vanishes on $\Gamma$ must be made admissible, since if $f$ should happen to be $-\Delta v$, this function $v$ would be the minimizing solution we want. On the other hand, the requirement that $v$ be smooth, say that $v$ lie in $\mathcal{H}_B^2$, is more than is necessary for $I(v)$ to be defined. Therefore, we complete the admissible space: if $a(v - v_N, v - v_N) \to 0$ for some sequence $v_N$ in $\mathcal{H}_B^2$, then $v$ will also be admissible. This completion process leads to the class which is intuitively natural: $v$ must vanish on $\Gamma$ (the Dirichlet condition is an essential one), but it has only to possess first derivatives in the mean-square sense. In other words, the quadratic term $a(v, v)$ in the functional $I(v)$ should make sense,

$$a(v, v) = \iint_\Omega \left(v_x^2 + v_y^2\right)dx\,dy < \infty. \tag{51}$$

The functions satisfying (51) lie in $\mathcal{H}^1(\Omega)$; the subspace which satisfies also the Dirichlet condition $v = 0$ on $\Gamma$ is denoted by $\mathcal{H}_0^1(\Omega)$. (Our own notation would be $\mathcal{H}_E^1$, but in the special case of the Dirichlet problem we accept $\mathcal{H}_0^1$; the notation $\mathring{\mathcal{H}}^1$ also appears in the literature.) This is the space of admissible functions for the Dirichlet problem.
It is important to emphasize that the boundary condition $v = 0$ was retained for the whole admissible space, not by fiat, but only because it was stubborn enough to remain valid in the limit. In other words, the Dirichlet condition $v = 0$ is stable in the $\mathcal{H}^1$ norm: If the sequence $v_N$ vanishes on $\Gamma$ and converges in the natural strain energy norm to $v$, then the limit $v$ also vanishes on $\Gamma$.

This stability of the boundary condition is not present for the equation

$$-\Delta u + qu = f \quad\text{in } \Omega \tag{52}$$
with a natural boundary condition

$$u_n = 0 \quad\text{on } \Gamma.$$

The functional for this Neumann problem is

$$I(v) = a(v, v) - 2(f, v) = \iint_\Omega \left(v_x^2 + v_y^2 + qv^2 - 2fv\right)dx\,dy.$$

The differential equation looks for a solution $u$ in $\mathcal{H}_B^2$, that is, possessing two derivatives and satisfying the Neumann condition $u_n = 0$. However, every function $v$ in $\mathcal{H}^1$ is the limit of a sequence taken from $\mathcal{H}_B^2$; $\mathcal{H}_B^2$ is dense in $\mathcal{H}^1$. After completion, therefore, the admissible space for the Neumann variational problem is the whole space $\mathcal{H}^1(\Omega)$. As a result, no boundary conditions are imposed on the trial functions $v^h$ in the Ritz method; any subspace $S^h \subset \mathcal{H}^1$ is acceptable. In practical applications of the finite element method, this means that the values of $v^h$ at boundary points are not constrained, yielding a modest simplification in comparison with the Dirichlet problem. (In fact, the increased numerical instability due to natural boundary conditions leads us to doubt that the advantages are entirely on the side of Neumann.)
Of course the minimizing function u (not the approximate mínima uh)
must somehow automatically satisfy the Neumann condition when enough
smoothness is present. This is confirmed l?Y equation (50) for the first varia- Thus, ellipticity holds even for q = O in
tion, which now vanishes for all v in the admissible space X 1(Q). In the city simply means that q exceeds Amax' ·
first place, W = Uxx + Uyy + j must V~nish throughout Q, since V may be operator A. In the Neumann case Amax =
chosen equal to w in any small circle inside r and zero elsewhere, giving q = O fails to achieve ellipticity. In the 1
JJ w2 O. Once w O is established, it follows easily that un = O on the lem would even remain elliptic for a rar
boundary. Therefore, the Neumann conditiori holds for u, even though it It is, of course, possible to impose u
is not imposed on all admissible v. say rl, and to have un o on r2 r
If q =O in (52), then obviously no solution u is unique; u+ e will be mixed prob/em contains those v in X 1 '
a solution for any constant c. This single degree of freedom suggests, if will ha ve a singularity at the junction b
Fredholm 's theorem of the alterna ti ve is to hold, that there must be a single A further possibility is that the bom
constraint on the data f Integrating both si des of the differential equation of an oblique derivative:
- Au f o ver Q, the left si de vanishes by Green's formula (50) with v· l.
Therefore, the constraint onjis: No solution to the Neumann problem with un+ c(x, y)u$ =
q = O can exist unless JJf dx dy = O. This nonexistence can be verified
explicitly for the one-dimensional Neumann problem where us is the tangential derivative. TI
associated with the inner product
u'' 2, u'(O) = u'(l) O.
The parabola u = x 2 + Ax + B, which gives the general solution to u" = 2,

cannot be made to satisfy the bouhdary conditions. Correspondingly,
This inner product yields Poisson's equ
uxvx + uyvy. In fact, Green's identity tr
J u" u'(l) - u'(O) = O =1= J 2.
In contrast, the solution to the Dirichlet problem is unique. - JJ(Au + f)v + J
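The same obstruction shows up in a discrete Neumann problem. The following is a minimal sketch, assuming linear elements on a uniform mesh of $[0, 1]$ and a constant load; the function name `neumann_system` and the mesh size are illustrative choices, not taken from the text. With no boundary constraints, the stiffness matrix annihilates the constant vector, so $KQ = F$ can be solvable only if the entries of $F$ sum to zero.

```python
def neumann_system(n, f):
    """Assemble K and F for -u'' = f on [0, 1] with u'(0) = u'(1) = 0, f constant."""
    h = 1.0 / n
    K = [[0.0] * (n + 1) for _ in range(n + 1)]
    F = [0.0] * (n + 1)
    for e in range(n):                      # element e joins nodes e and e + 1
        for a in (0, 1):
            F[e + a] += f * h / 2.0         # integral of f * phi_a over the element
            for b in (0, 1):
                K[e + a][e + b] += (1.0 if a == b else -1.0) / h
    return K, F

K, F = neumann_system(8, f=-2.0)            # u'' = 2 is the same as -u'' = -2
row_sums = [sum(row) for row in K]          # K applied to the constant vector
total_load = sum(F)                         # discrete version of the integral of f
```

Here every entry of `row_sums` vanishes while `total_load` equals the integral of $f$, which is $-2$ rather than zero, so the discrete system inherits the nonexistence of the continuous one.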
This difference between the Dirichlet and Neumann conditions must appear also in the theory, at the point of verifying that the energy norm $\sqrt{a(v, v)}$ is equivalent to the standard norm $\|v\|_1$ on the admissible space. Thus the crucial question for existence and uniqueness is whether or not the problem is elliptic; is there a constant $\sigma > 0$ such that
$$a(v, v) \ge \sigma \|v\|_1^2 \quad \text{for all admissible } v? \tag{53}$$
For the equation $-\Delta u + qu = f$, this is the same as
$$\iint (v_x^2 + v_y^2 + qv^2) \ge \sigma \iint (v^2 + v_x^2 + v_y^2).$$
With $q > 0$ this ellipticity is obvious, and there is a unique minimizing $u$ (or $u^h$, in the Ritz method). With $q = 0$ ellipticity fails for the Neumann problem; if $v = 1$, the left side is zero and the right side is not. In the Dirichlet problem, the rigid body motion $v = 1$ is not admissible, and an inequality of Poincaré type does hold:
$$\iint v_x^2 + v_y^2 \ge \sigma' \iint v^2 \quad \text{for } v \text{ in } \mathcal{H}_0^1.$$
Thus, ellipticity holds even for $q = 0$ in the Dirichlet problem. In fact, ellipticity simply means that $q$ exceeds $\lambda_{\max}$, the largest eigenvalue of the Laplace operator $\Delta$. In the Neumann case $\lambda_{\max} = 0$, with $v = 1$ as eigenfunction, and $q = 0$ fails to achieve ellipticity. In the Dirichlet case $\lambda_{\max} < 0$, and the problem would even remain elliptic for a range of negative values of $q$.
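The Poincaré-type inequality can be checked numerically in one dimension. The sketch below assumes the interval $[0, 1]$, where the smallest value of the ratio $\int v'^2 / \int v^2$ over functions vanishing at both ends is $\pi^2$; the helper `energy_ratio` and the sample functions are illustrative and not part of the text.

```python
import math

def energy_ratio(v, dv, n=20000):
    """Midpoint-rule estimate of (integral of v'^2) / (integral of v^2) on [0, 1]."""
    h = 1.0 / n
    num = sum(dv((i + 0.5) * h) ** 2 for i in range(n)) * h
    den = sum(v((i + 0.5) * h) ** 2 for i in range(n)) * h
    return num / den

# Functions vanishing at both ends are admissible in the Dirichlet problem.
r_sine = energy_ratio(lambda x: math.sin(math.pi * x),
                      lambda x: math.pi * math.cos(math.pi * x))
r_poly = energy_ratio(lambda x: x * (1.0 - x), lambda x: 1.0 - 2.0 * x)

# The constant v = 1 is admissible only in the Neumann problem; its energy is zero.
r_const = energy_ratio(lambda x: 1.0, lambda x: 0.0)
```

Both Dirichlet samples stay at or above $\pi^2$, while the constant drives the ratio to zero, which is the failure of ellipticity for the Neumann problem with $q = 0$.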
It is, of course, possible to impose $u = 0$ on only a part of the boundary, say $\Gamma_1$, and to have $u_n = 0$ on $\Gamma_2 = \Gamma - \Gamma_1$. The admissible space for this mixed problem contains those $v$ in $\mathcal{H}^1$ which vanish on $\Gamma_1$, and the solution will have a singularity at the junction between $\Gamma_1$ and $\Gamma_2$.
A further possibility is that the boundary condition express the vanishing of an oblique derivative:
$$u_n + c(x, y)\, u_s = 0 \quad \text{on } \Gamma,$$
where $u_s$ is the tangential derivative. This is the natural boundary condition associated with the inner product

[displayed formula for $a(u, v)$ missing from the source]

This inner product yields Poisson's equation just as surely as the integral of $u_x v_x + u_y v_y$. In fact, Green's identity transforms $a(u, v) = (f, v)$ into
$$-\iint (\Delta u + f)\, v + \int_\Gamma (u_n + c\, u_s)\, v\, ds = 0.$$
This illustrates that there is no unique way of integrating $(Lv, v)$ by parts to obtain an energy norm $a(v, v)$. Different manipulations of $(Lv, v)$ lead to different forms for $a(v, v)$ and to correspondingly different natural boundary conditions. This is illustrated again below, where Poisson's ratio enters the boundary condition and the energy form $a(v, v)$ but not the operator $L = \Delta^2$.
Inhomogeneous boundary conditions for the equation $-\Delta u = f$ are of two kinds: either a prescribed displacement
$$u = g(x, y) \quad \text{on } \Gamma$$
or a prescribed traction, which can be included in the general case of Newton's boundary condition
$$u_n + d(x, y)\, u = b(x, y) \quad \text{on } \Gamma.$$
Variationally these two are completely different. The first is an inhomogeneous Dirichlet condition and must be imposed on the trial functions; the solution $u$ minimizes $\iint v_x^2 + v_y^2 - 2fv$ over the class of functions $v$ in $\mathcal{H}^1$ satisfying $v = g$ on the boundary. Notice that this admissible class $\mathcal{H}_E^1$ is no longer a "space" of functions; the sum of two admissible functions would equal $2g$ on the boundary and would be inadmissible. However, the difference of any two admissible functions vanishes on the boundary and lies exactly in the space $\mathcal{H}_0^1$. The easiest description of $\mathcal{H}_E^1$ is therefore the following: Choose any one member, say $G(x, y)$, which agrees with $g$ on the boundary, and then every admissible $v$ equals $G + v_0$, where $v_0$ lies in $\mathcal{H}_0^1$. In short,
$$\mathcal{H}_E^1 = G + \mathcal{H}_0^1.$$
In the Ritz method it is not required that the trial functions agree exactly with $g$ on the boundary. It is enough if the trial functions are of the form $v^h = G^h(x, y) + \sum q_j \varphi_j(x, y)$, where the $\varphi_j$ lie in $\mathcal{H}_0^1$ and $G^h$ nearly agrees with $g$. This means that the trial class is
$$G^h(x, y) + S_0^h$$
and that $S_0^h$ is a subspace of $\mathcal{H}_0^1$. Section 4.4 verifies that the basic Ritz theorem 1.1 applies to this case (and ultimately we remove even the hypothesis that $S_0^h \subset \mathcal{H}_0^1$, but this leads outside of the Ritz framework).
Finally we consider the boundary condition $u_n + du = b$. This will be a natural boundary condition, provided the basic functional $\iint v_x^2 + v_y^2 - 2fv$ is altered to take account of $d$ and $b$. Both alterations appear as boundary terms:
$$I(v) = \iint v_x^2 + v_y^2 - 2fv + \int_\Gamma (dv^2 - 2bv)\, ds.$$
We emphasize that the first new term enters the energy
$$a(v, v) = \iint (v_x^2 + v_y^2) + \int_\Gamma dv^2\, ds,$$
and thus contributes to the stiffness matrix $K$ in the Ritz method. The new linear term $-2bv$ adds a boundary contribution to the load vector $F$. The admissible space is the whole of $\mathcal{H}^1$, and therefore any piecewise polynomial which is continuous between elements can be used as a trial function.
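How the two boundary terms enter $K$ and $F$ can be seen in a one-dimensional analogue. The sketch below is an assumed setup, not the book's: it takes $-u'' = f$ on $[0, 1]$ with the essential condition $u(0) = 0$ and the natural condition $u'(1) + d\,u(1) = b$, uses linear elements on a uniform mesh, and solves the small system by plain Gaussian elimination. The name `solve_newton_bc` is hypothetical.

```python
def solve_newton_bc(n, d, b, f=0.0):
    """Linear elements for -u'' = f on [0, 1], u(0) = 0, u'(1) + d*u(1) = b."""
    h = 1.0 / n
    # Unknowns are the nodal values at x = h, 2h, ..., 1 (node 0 is constrained).
    K = [[0.0] * n for _ in range(n)]
    F = [0.0] * n
    for e in range(n):                       # element e joins nodes e and e + 1
        for a in (0, 1):
            i = e + a - 1                    # index among the free nodes
            if i < 0:
                continue
            F[i] += f * h / 2.0
            for c in (0, 1):
                j = e + c - 1
                if j >= 0:
                    K[i][j] += (1.0 if a == c else -1.0) / h
    K[n - 1][n - 1] += d                     # boundary term d*v^2 enters the stiffness
    F[n - 1] += b                            # boundary term -2*b*v enters the load
    # Gaussian elimination without pivoting (K is symmetric positive definite here).
    for k in range(n):
        for i in range(k + 1, n):
            m = K[i][k] / K[k][k]
            for j in range(k, n):
                K[i][j] -= m * K[k][j]
            F[i] -= m * F[k]
    Q = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = F[i] - sum(K[i][j] * Q[j] for j in range(i + 1, n))
        Q[i] = s / K[i][i]
    return Q

Q = solve_newton_bc(n=10, d=3.0, b=2.0)      # with f = 0 the exact solution is b*x/(1 + d)
```

No constraint is placed on the node at $x = 1$; the condition there is satisfied only through the extra entries in $K$ and $F$, and since the exact solution is linear the trial space reproduces it.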
We want to propose also three problems of fourth order, all governed by the biharmonic equation
$$\Delta^2 u = f \quad \text{in } \Omega. \tag{54}$$
This equation is a model for the transverse displacement $u$ of a thin plate bent by the body force $f(x, y)$, with bending stiffness normalized to $D = 1$. As usual, the number $m$ of boundary conditions will be half the order of the equation, so that $m = 2$.
A first possibility is to impose Dirichlet conditions, leading physically to the problem of a clamped plate:
$$u = 0 \quad \text{and} \quad u_n = 0 \quad \text{on } \Gamma.$$
An alternative of great interest is the problem of a simply supported plate, in which one boundary condition is essential (or forced) and the other is natural. As in the oblique derivative problem described above for Poisson's equation, the form of the natural condition will depend on the form of the variational integral $I(v)$. In elasticity this natural condition involves Poisson's ratio $\nu$, which measures the change in width when the material is stretched lengthwise; $\nu = 0.3$ is a common choice. The boundary conditions determined by physical reasoning are
$$u = 0 \quad \text{and} \quad \nu \Delta u + (1 - \nu)\, u_{nn} = 0 \quad \text{on } \Gamma. \tag{55}$$
The cases $\nu = 0$ and $\nu = 1$, which may arise in other applications, will of course be included. Note that on a straight boundary the tangential derivatives are all zero, $\Delta u = u_{nn}$, and $\nu$ disappears from the boundary condition; the remarkable and apparently paradoxical consequences of this disappearance, for polygonal approximation of a circle, are discussed in Section 4.4.
Finally, there is the pure Neumann problem, corresponding to a free boundary. The second of the boundary conditions (55) remains; it is often written as
$$-\frac{M_{nn}}{D} = u_{nn} + \nu(\varphi_s u_n + u_{ss}) = 0,$$
where $\varphi$ is the angle between the normal and the $x$-axis. The condition $u = 0$, which fixes the edge, no longer applies. To fill its place one has to compute the coefficient of $\delta u$, when the energy functional in (56) below is varied. This coefficient is given by Landau and Lifshitz explicitly in terms of $u$ and $\varphi$; it is normally written in the contracted Kirchhoff form $Q_n + \partial M_{ns}/\partial s = 0$. In practical problems it is unusual for the whole boundary to be free.
All these conditions can be coped with somehow by directly replacing derivatives by finite differences. The construction of accurate equations near the boundary becomes fantastically complicated, however, and a much better starting point is furnished by the variational form.
Variationally, the problem is to minimize
$$I(v) = a(v, v) - 2(f, v) = \iint_\Omega \left(v_{xx}^2 + v_{yy}^2 + 2\nu v_{xx} v_{yy} + 2(1 - \nu) v_{xy}^2 - 2fv\right) dx\, dy \tag{56}$$
subject to the appropriate boundary conditions. In the Neumann problem there are no constraints at the boundary, and the space of admissible $v$ is exactly $\mathcal{H}^2(\Omega)$. For the clamped plate, the Dirichlet conditions $v = 0$ and $v_n = 0$ are imposed; the subspace satisfying these constraints is $\mathcal{H}_0^2(\Omega)$. In the intermediate case of the simply supported plate, only the condition $v = 0$ is essential; we denote the corresponding subspace by $\mathcal{H}_{SS}^2$, noting that $\mathcal{H}_0^2 \subset \mathcal{H}_{SS}^2 \subset \mathcal{H}^2$. Of course, it is Green's theorem which yields the equivalence of these variational problems with their differential counterparts.
We emphasize that the fourth derivatives which appear in the term $\Delta^2 u$ in Green's theorem are not required for the variational formulation (the theorem of minimum potential energy) to be valid. In fact, the opposite is true. Any limitation to solutions with four continuous derivatives would be awkward in the extreme, and the whole idea of completion is to arrive at the admissible space under minimum restrictions, requiring only that the essential boundary conditions hold and that the energy $a(v, v)$ be finite. The solution $u$ then satisfies the equation in its weak form $a(u, v) = (f, v)$.
For the theory of the Ritz method, the key point is that the energy norm $\sqrt{a(v, v)}$ should be equivalent to the standard norm $\|v\|_2$. This is again the condition of ellipticity: For some $\sigma > 0$ and all admissible $v$,
$$a(v, v) \ge \sigma \|v\|_2^2$$
or
$$\iint v_{xx}^2 + v_{yy}^2 + 2\nu v_{xx} v_{yy} + 2(1 - \nu) v_{xy}^2 \ge \sigma \iint (v^2 + v_x^2 + v_y^2 + v_{xx}^2 + v_{xy}^2 + v_{yy}^2).$$
With all edges free ellipticity must fail, since the solution is unique only up to a linear function $a + bx + cy$; with the differential problem altered to $\Delta^2 u + qu = f$, corresponding to a plate under steady rotation with $q = \rho\omega^2 > 0$, this case is also elliptic.
Ellipticity can become difficult to prove when, as in problems of linear elasticity, there are two or three unknowns $u_i$, and the strain energy involves only certain combinations $\varepsilon_{ij} = \tfrac{1}{2}(u_{i,j} + u_{j,i})$ of their derivatives. It is Korn's inequality which establishes that this strain energy dominates the $\mathcal{H}^1$ norm,
$$\sum \int \varepsilon_{ij}\varepsilon_{ij} \ge \sigma \sum \int u_{i,j} u_{i,j}.$$
Before proceeding to the construction of finite elements for the solution of these problems, we want to comment briefly on the function spaces $\mathcal{H}^s(\Omega)$. In one dimension they were deceptively simple to describe; $v$ belongs to $\mathcal{H}^s[0, 1]$ if it is the $s$-fold integral of a function $f$ with $\int f^2\, dx < \infty$. This guarantees that $v$ and its first $s - 1$ derivatives are continuous; only the $s$th derivative, that is, the original function $f$, might have jumps or worse.
In the plane it is possible for $v$ to be discontinuous and at the same time differentiable. One such function in $\mathcal{H}^1$ is $v = \log \log 1/r$, on the circle $r \le \tfrac{1}{2}$:
$$\iint v_x^2 + v_y^2 = \iint \left[v_r^2 + \left(\frac{v_\theta}{r}\right)^2\right] r\, dr\, d\theta = \iint (\log r)^{-2}\, d(\log r)\, d\theta = \frac{2\pi}{\log 2}.$$
Thus the derivatives have finite energy, and so does $v$ itself, but the function is not continuous at the origin. The general rule in $n$ dimensions is the following: If $v$ is in $\mathcal{H}^s$ and $s > n/2$, then $v$ is continuous and
$$\max |v(x_1, \ldots, x_n)| \le C \|v\|_s. \tag{57}$$
This is the essence of the celebrated Sobolev inequality, relating the two properties of continuity and finite energy of the derivatives. If $v$ lies in $\mathcal{H}^s$ with $s \le n/2$, it cannot be guaranteed that $v$ is continuous.
By duality, Sobolev's rule will also decide when the $n$-dimensional $\delta$-function lies in $\mathcal{H}^{-s}$. Of course, this is impossible unless $-s < 0$; it is only after enough integrations that the $\delta$-function might have finite energy. The norm on $\mathcal{H}^{-s}$ is defined as in (12):
$$\|w\|_{-s} = \max_v \frac{|(w, v)|}{\|v\|_s}. \tag{58}$$
If $w$ is a $\delta$-function, say at the origin, this gives
$$\|\delta\|_{-s} = \max_v \frac{|v(0)|}{\|v\|_s}.$$
According to Sobolev's inequality (57), this is finite and the $\delta$-function is in $\mathcal{H}^{-s}$ if and only if $s > n/2$.
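The borderline case $s = n/2$ can be explored numerically with the function $v = \log \log 1/r$ in the plane. The sketch below is an illustration under stated assumptions: it integrates the energy over the annulus $\varepsilon < r < \tfrac{1}{2}$ by the midpoint rule after the substitution $t = \log r$; the name `annulus_energy` and the sample values of $\varepsilon$ are arbitrary.

```python
import math

def annulus_energy(eps, n=200000):
    """Energy of v = log log(1/r) over eps < r < 1/2, via the substitution t = log r."""
    t0, t1 = math.log(eps), -math.log(2.0)
    dt = (t1 - t0) / n
    # v_r = 1/(r log r), so v_r^2 * r dr = dt / t^2; the angle contributes 2*pi.
    radial = sum(1.0 / (t0 + (i + 0.5) * dt) ** 2 for i in range(n)) * dt
    return 2.0 * math.pi * radial

limit = 2.0 * math.pi / math.log(2.0)        # energy over the whole disk r < 1/2
for eps in (1e-2, 1e-4, 1e-8):
    v_at_eps = math.log(math.log(1.0 / eps)) # value of v on the inner circle r = eps
    print(eps, annulus_energy(eps), v_at_eps)
```

As $\varepsilon$ shrinks the energy stays below the finite limit $2\pi/\log 2$, while the value of $v$ on the inner circle keeps growing, so finite energy in $\mathcal{H}^1$ does not control the maximum of $v$ in two dimensions.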
In particular, the $\delta$-function in the plane is not in $\mathcal{H}^{-1}$, and correspondingly the fundamental solution of Laplace's equation, $u = \log r$, is not in $\mathcal{H}^1$:
$$\iint u_r^2\, r\, dr\, d\theta = \iint r^{-1}\, dr\, d\theta = \infty.$$
Therefore a point-loaded membrane is, strictly speaking, inadmissible in the variational problem.
For a plate the situation is different, since the differential equation is of order 4. The solution space is $\mathcal{H}^2$ (with boundary conditions) and the data space is four derivatives lower, that is, $\mathcal{H}^{-2}$. In the plane the $\delta$-function does lie in $\mathcal{H}^{-2}$, and a point-loaded plate is acceptable.
There is one additional question about these function spaces $\mathcal{H}^s$ which is fundamental for finite elements: What is the condition for an element to be conforming? In other words, given a differential equation of order $2m$ in $n$ independent variables, which piecewise polynomials lie in the admissible space $\mathcal{H}_E^m$? It is easy to check on the essential boundary conditions; the only question is how smooth the element must be to lie in $\mathcal{H}^m$.
The standard conforming condition is well known: The trial function and its first $m - 1$ derivatives should be continuous across the element boundaries. This condition is clearly sufficient for admissibility, since the $m$th derivatives have at worst a jump between elements, and their energy is finite. On the other hand, the example $\log \log 1/r$ may seem to put in doubt the necessity of this condition for conformity; there exist functions which do not have $m - 1$ continuous derivatives, but which nevertheless lie in $\mathcal{H}^m$ and are admissible. Fortunately, however, such troublesome functions cannot be piecewise polynomials. If a function $v$ is a polynomial (or a ratio of polynomials) on each side of an element boundary, then $v$ lies in $\mathcal{H}^m$ if and only if every derivative of order less than $m$ is continuous across the boundary. The success of the finite element method lies in the construction of such elements, retaining both a convenient basis and a high degree of approximation.

1.9. TRIANGULAR AND RECTANGULAR ELEMENTS

This section describes some of the most important finite elements in the plane. Their construction has been going on for 30 years, if we include Courant's early paper on piecewise linear elements, and it seems to have been one of the most enjoyable occupations in applied mathematics. It involves not much more than high-school algebra, and at the same time the results are of
great importance; that is a rare and happy combination. The goal is to choose piecewise polynomials which are determined by a small and convenient set of nodal values, and yet have the desired degree of continuity and of approximation.
There are a great many competing elements, and it is not clear at this writing whether it is more efficient to subdivide the region into triangles or into quadrilaterals. Triangles are obviously better at approximating a curved boundary, but there are advantages to quadrilaterals (and especially to rectangles) in the interior: there are fewer of them, and they permit very simple elements of high degree. These remarks already suggest that the best choice may be a mixture of the two possibilities, provided they can be united by nodes which ensure the required continuity across the junction.
We begin by subdividing the basic region $\Omega$ into triangles (Fig. 1.10).

Fig. 1.10 Subdivision of the polygon $\Omega^h$ into irregular triangles.

The union of these triangles will be a polygon $\Omega^h$, and in general (if $\Gamma$ is a curved boundary) there will be a nonempty "skin" $\Omega - \Omega^h$. The longest edge of the $j$th triangle will be denoted by $h_j$, and $h = \max h_j$. We assume for convenience that $\Omega^h$ is a subset of $\Omega$ and that no vertex of one triangle lies part way along the edge of another. In practice, since the computer eventually has to know the location of every vertex $z_j = (x_j, y_j)$, the triangulation should be carried out as far as possible by the computer itself. A fully automatic subroutine can begin by covering $\Omega$ with a regular triangular grid and then make the necessary adjustments at the boundary. If the grid is to be very coarse in one part of $\Omega$ and more refined in another, or if the domain has corners and other points of singularity, it may be necessary to establish the triangulation by hand. In any case, George has argued [G2] that the refinement of a coarse mesh, which can be achieved simply by subdividing each triangle into four similar triangles, is a mechanical procedure for which
the computer is ideally suited. This allows the user to begin by describing a coarse mesh to the computer, at a minimum cost in time and effort, and then to work eventually with a fine mesh. Furthermore, he can more easily carry out the one check on his program which is the most convincing and highly recommended of all: to test whether a reduction in $h$ leads to an acceptable or an unacceptable change in the numerical solution.
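The refinement step described above, in which each triangle is cut into four similar triangles, might be sketched as follows. This is one possible implementation under assumed data structures (a list of vertex coordinates and a list of index triples); the function name `refine` and the two-triangle starting mesh are illustrative only.

```python
def refine(vertices, triangles):
    """Split every triangle into four similar ones by joining its edge midpoints."""
    vertices = list(vertices)
    midpoint_index = {}                      # edge (i, j) with i < j -> new vertex index

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint_index:        # a shared edge gets a single new vertex
            (x1, y1), (x2, y2) = vertices[i], vertices[j]
            vertices.append(((x1 + x2) / 2.0, (y1 + y2) / 2.0))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    refined = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        refined += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, refined

coarse_vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
coarse_triangles = [(0, 1, 2), (0, 2, 3)]
fine_vertices, fine_triangles = refine(coarse_vertices, coarse_triangles)
```

Storing each midpoint once per edge keeps the refined mesh free of vertices lying part way along a neighbor's edge, and calling `refine` repeatedly halves $h$ each time, which is convenient for the convergence check recommended above.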

Given a triangulation, we describe now the simplest and most basic of all trial functions. It is linear inside each triangle ($v^h = a_1 + a_2 x + a_3 y$) and continuous across each edge. Thus the graph of $v^h(x, y)$ is a surface made up of flat triangular pieces, joined along the edges. This is an obvious generalization of broken-line functions in one dimension. The subspace $S^h$ composed of these piecewise linear functions was proposed by Courant [C11] for the solution of variational problems; it is a subspace of $\mathcal{H}^1$, since the first derivatives are piecewise constant. Its independent development by Turner and others was the beginning of finite elements, and the trial functions are sometimes referred to as Turner triangles.
The effect of continuity is to avoid $\delta$-functions in the first derivatives at interelement boundaries; without this constraint the functions are not admissible, and the (infinite) energy over the domain $\Omega$ cannot be found by adding the separate contributions from within each element.
The simplicity of Courant's space lies in the fact that within each triangle, the three coefficients of $v^h = a_1 + a_2 x + a_3 y$ are uniquely determined by the values of $v^h$ at the three vertices. This means that the function can be conveniently described by giving its nodal values; equivalently, $S^h$ will have a convenient basis. Furthermore, along any edge, $v^h$ reduces to a linear function of one variable, and this function is obviously determined by its values at the two endpoints of the edge. The value of $v^h$ at the third vertex has no effect on the function along this edge, no matter whether this third vertex belongs to the triangle on one side or the other. Therefore, the continuity of $v^h$ across the edge is assured by continuity at the vertices.
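The determination of $a_1, a_2, a_3$ from the three nodal values, and the resulting continuity across a shared edge, can be illustrated with a small sketch. The helper names `linear_coefficients` and `evaluate`, the coordinates, and the nodal values are all made up for the example.

```python
def linear_coefficients(p1, p2, p3, v1, v2, v3):
    """Return (a1, a2, a3) with a1 + a2*x + a3*y matching v_k at vertex p_k."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)   # twice the signed area
    if det == 0.0:
        raise ValueError("degenerate triangle")
    a2 = ((v2 - v1) * (y3 - y1) - (v3 - v1) * (y2 - y1)) / det
    a3 = ((x2 - x1) * (v3 - v1) - (x3 - x1) * (v2 - v1)) / det
    a1 = v1 - a2 * x1 - a3 * y1
    return a1, a2, a3

def evaluate(coeffs, x, y):
    a1, a2, a3 = coeffs
    return a1 + a2 * x + a3 * y

# Two triangles sharing the edge from A to B; their third vertices and values differ.
A, B = (0.0, 0.0), (1.0, 1.0)
left = linear_coefficients(A, B, (0.0, 1.0), 2.0, 5.0, -7.0)
right = linear_coefficients(A, B, (1.0, 0.0), 2.0, 5.0, 11.0)
mid_left = evaluate(left, 0.5, 0.5)      # value at the midpoint of the shared edge
mid_right = evaluate(right, 0.5, 0.5)
```

The two linear pieces have different coefficients, yet they agree at the midpoint of the common edge, because along that edge each depends only on the values at A and B.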
In case of an essential boundary condition, say $u = 0$ on $\Gamma$, the simplest subspace $S^h \subset \mathcal{H}_0^1$ is formed by requiring the functions to vanish on the polygonal boundary $\Gamma^h$. Extended to be zero on the skin $\Omega - \Omega^h$, such a function $v^h$ is continuous over the whole domain $\Omega$; it lies in $\mathcal{H}_0^1$, and it is admissible in the Dirichlet problem.
The dimension of the space $S^h$ (the number $N$ of free parameters in the functions $v^h$) coincides with the number of unconstrained nodes. (A boundary node at which $v^h$ is required to vanish, or to equal some other prescribed displacement, is constrained and does not add to the dimension of the subspace.) For proof, let $\varphi_j(x, y)$ be the trial function which equals 1 at the $j$th node and zero at all other nodes. Then these pyramid functions $\varphi_j$ form a basis for the trial space $S^h$. An arbitrary $v^h$ in $S^h$ can be expressed in one and only one way
SEC. 1.9. TRIANGULAR ANO RECTANGULAR ELEMENTS 77 .
as a linear combination

$$v^h(x, y) = \sum_{j=1}^{N} q_j \varphi_j(x, y).$$

In this context the coordinate $q_j$ has direct physical significance, as the displacement $v^h$ at the $j$th node $z_j = (x_j, y_j)$. This is an important feature of the elements which have been developed for engineering computations: that each $q_j$ can be associated with the value of $v^h$ or one of its derivatives at a specific nodal point in the domain.

The optimal coordinates $Q_j$ are now determined by minimizing $I(v^h) = I(\sum q_j \varphi_j)$, which is quadratic in $q_1, \ldots, q_N$. We emphasize that the minimizing function $u^h = \sum Q_j \varphi_j$ is independent of the particular basis; the effect of choosing a basis is simply to put the problem of minimization into the computable (or operational) form $KQ = F$. On the other hand, the actual numerical calculations are strongly dependent on this choice. The stiffness matrix $K'$ from a different choice of basis would be congruent to the first: $K' = SKS^T$ for some $S$, with $F$ replaced by $F' = SF$. Therefore, the object is to choose a basis in which $K$ is sparse and well conditioned, while keeping the entries of $K$ and $F$ as easy to compute as possible.

It is here that the finite element choice $\varphi_j$, based on interpolating nodal values, is an effective compromise. The matrix $K$ is reasonably well conditioned and reasonably sparse, since two nodes are coupled only if they belong to the same element. Furthermore, the inner products $K_{ij} = a(\varphi_i, \varphi_j)$ and $F_j = (f, \varphi_j)$ can be found by a very fast and systematic algorithm. The trick is to compute, not one inner product after another, but the contributions to all the inner products from one triangle after another. This means that the integrals are computed over each triangle, yielding a set of element stiffness matrices $k_i$. Each $k_i$ involves only the nodes of the $i$th triangle of the partition; its other entries are all zero. Then the global stiffness matrix $K$, containing all inner products, is assembled from these pieces. This process was carried out on an interval in Section 1.5 and will be extended to a triangular mesh in the next section.
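The triangle-by-triangle assembly can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' program: it takes the energy inner product to be $a(u, v) = \iint \nabla u \cdot \nabla v$ (Laplace's equation), uses linear trial functions on each triangle, and stores $K$ as a dense list of lists. The names `element_stiffness` and `assemble` are hypothetical.

```python
def element_stiffness(p1, p2, p3):
    """3x3 element matrix of a(u, v) = integral of grad u . grad v on one linear triangle."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # The gradient of the i-th pyramid function is the constant (b[i], c[i]) / (2 * area).
    b = [y2 - y3, y3 - y1, y1 - y2]
    c = [x3 - x2, x1 - x3, x2 - x1]
    area = 0.5 * abs(x1 * b[0] + x2 * b[1] + x3 * b[2])
    return [[(b[i] * b[j] + c[i] * c[j]) / (4.0 * area) for j in range(3)]
            for i in range(3)]


def assemble(nodes, triangles):
    """Add each element matrix into the global matrix, one triangle after another."""
    n = len(nodes)
    K = [[0.0] * n for _ in range(n)]
    for tri in triangles:
        k = element_stiffness(*(nodes[v] for v in tri))
        for a, i in enumerate(tri):
            for b, j in enumerate(tri):
                K[i][j] += k[a][b]
    return K


# Unit square cut by its northeast diagonal: four nodes, two triangles.
nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
K = assemble(nodes, triangles)
```

With no boundary condition imposed, every row of this $K$ sums to zero, since a constant displacement has no strain energy. The two ends of the diagonal are not coupled, because the diagonal lies opposite a right angle in both triangles.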
Courant's triangles yield an interesting stiffness matrix $K$. For Laplace's equation, the standard five-point difference scheme is produced if the triangles are formed in a regular way, starting with a square mesh and drawing all the diagonals in the northeast direction. (The more accurate nine-point scheme can be constructed in a similar way from bilinear elements [F6], but this is largely a curiosity.) The appearance of such a simple and systematic stiffness matrix means that the fast Fourier transform can be used to solve $KQ = F$; this is extremely successful on a rectangle, and non-rectangular applications are being rapidly developed by Dorr, Golub, and others. Mathematically, a key property of the 5-point scheme is the maximum principle:
all off-diagonal terms $K_{ij}$ in the stiffness matrix are negative, they are dominated by the diagonal entries $K_{ii}$, and as a consequence the inverse matrix $K^{-1}$ is non-negative. A simple calculation shows that this property holds for linear elements on any triangulation, if no angle exceeds $\pi/2$. (The precise condition is that the two angles subtended by any given edge add to no more than $\pi$.) A similar result holds for linear elements in $n$ dimensions. Because $K^{-1}$ and all $\varphi_j(x)$ are non-negative, the finite element approximation $u^h$ obeys the same physical law as the true displacement $u$: if the load $f$ is everywhere positive, then so is the resulting displacement. Fried has asked whether this holds also for elements of higher degree; it will not be true that all off-diagonal $K_{ij}$ are negative, but this is not necessary for a positive $K^{-1}$.
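Both claims, the five-point stencil and the positivity of the discrete solution, can be checked numerically on a small mesh. The sketch below is an illustration under assumptions of our own choosing: a unit square divided into $4 \times 4$ small squares with northeast diagonals, the inner product $a(u, v) = \iint \nabla u \cdot \nabla v$, the load $f = 1$ (so that $F_j = h^2$ at each interior node), and a plain Gaussian elimination in place of any production solver. All function names are hypothetical.

```python
def element_stiffness(p1, p2, p3):
    """3x3 element matrix for a linear triangle and the Laplace energy."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    b = [y2 - y3, y3 - y1, y1 - y2]
    c = [x3 - x2, x1 - x3, x2 - x1]
    area = 0.5 * abs(x1 * b[0] + x2 * b[1] + x3 * b[2])
    return [[(b[i] * b[j] + c[i] * c[j]) / (4.0 * area) for j in range(3)]
            for i in range(3)]


def regular_mesh(m):
    """(m+1)^2 nodes on the unit square; each small square cut by its northeast diagonal."""
    h = 1.0 / m
    idx = lambda i, j: j * (m + 1) + i
    nodes = [(i * h, j * h) for j in range(m + 1) for i in range(m + 1)]
    triangles = []
    for j in range(m):
        for i in range(m):
            triangles.append((idx(i, j), idx(i + 1, j), idx(i + 1, j + 1)))
            triangles.append((idx(i, j), idx(i + 1, j + 1), idx(i, j + 1)))
    return nodes, triangles, idx


def assemble(nodes, triangles):
    K = [[0.0] * len(nodes) for _ in nodes]
    for tri in triangles:
        k = element_stiffness(*(nodes[v] for v in tri))
        for a, i in enumerate(tri):
            for b, j in enumerate(tri):
                K[i][j] += k[a][b]
    return K


def solve(K, F):
    """Gaussian elimination without pivoting (K is symmetric positive definite here)."""
    n = len(F)
    A = [row[:] + [F[i]] for i, row in enumerate(K)]
    for p in range(n):
        for r in range(p + 1, n):
            factor = A[r][p] / A[p][p]
            for col in range(p, n + 1):
                A[r][col] -= factor * A[p][col]
    Q = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(A[r][col] * Q[col] for col in range(r + 1, n))
        Q[r] = (A[r][n] - tail) / A[r][r]
    return Q


m = 4
nodes, triangles, idx = regular_mesh(m)
K = assemble(nodes, triangles)

# Row of K at the central node: diagonal entry, axis neighbours, diagonal neighbours.
c = idx(2, 2)
stencil = {
    "center": K[c][c],
    "east": K[c][idx(3, 2)], "west": K[c][idx(1, 2)],
    "north": K[c][idx(2, 3)], "south": K[c][idx(2, 1)],
    "northeast": K[c][idx(3, 3)], "southwest": K[c][idx(1, 1)],
}

# Dirichlet problem: keep the interior nodes only and load them with f = 1.
interior = [idx(i, j) for j in range(1, m) for i in range(1, m)]
K_int = [[K[a][b] for b in interior] for a in interior]
F = [(1.0 / m) ** 2] * len(interior)
Q = solve(K_int, F)
```

The stencil has 4 at the center, $-1$ at the four axis neighbours, and 0 along the diagonal, independent of $h$ in two dimensions. Every component of `Q` is positive, as the maximum principle predicts for a positive load.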
In the Neumann problem there is no constraint on $v^h$ at the boundary nodes, and the dimension of $S^h$ becomes the total number of interior and boundary nodes. The basis functions $\varphi_j$ again equal one at a single node and zero at the others. In this case, however, defining the trial functions to be zero on the skin would not leave them all continuous. Instead, we may simply continue into each piece of $\Omega - \Omega^h$ the linear function defined in the adjacent triangle.

There are other possibilities, of which we mention one: to ignore the skin entirely and study the Neumann problem only within the approximate domain $\Omega^h$. Of course, the approximate solution may be significantly altered, producing some loss in accuracy in comparison with minimizing the integral over $\Omega$. Such errors, introduced by modifying the boundary-value problem, are estimated in Chapter 4.
We turn now to a more accurate and more refined element. This was a decisive step in the finite element technique, to generalize the basic idea behind Courant's simple trial functions. Rather than assuming a linear function within each triangle, $v^h$ will be permitted to be a quadratic:

$$v^h = a_1 + a_2 x + a_3 y + a_4 x^2 + a_5 xy + a_6 y^2. \tag{59}$$

In order to lie in $\mathcal{H}^1$, $v^h$ must again be continuous across the edge between adjacent triangles.

To compute with this subspace, a different description is required; we need a basis for the space. Therefore, we must find a set of continuous piecewise quadratic functions $\varphi_j$ such that every member of $S^h$ has a unique expansion as

$$v^h = \sum_j q_j \varphi_j(x, y).$$

There is a beautiful construction of such a basis, if we add to the vertices a further set of nodes, placed at the midpoints of the edges (as in the first triangle of Fig. 1.11). With each node, whether it is a vertex or the midpoint of an edge, we associate the function $\varphi_j$ which equals 1 at that node and zero
at all others. This rule specifies $\varphi_j$ at six points in each triangle, the three vertices and three midpoints, thereby determining the six coefficients $a_i$ in (59).

[Fig. 1.11 Nodal placement for quadratic and cubic elements. Panels: $C^0$ quadratic, $C^0$ cubic, $Z_3$ cubic.]

The crucial point to decide is whether or not the piecewise quadratic, determined by this rule in each separate triangle, is continuous across the edges between triangles. The proof is deceptively simple. Along such an edge $v^h$ is a quadratic in one variable, and there are three nodes on the edge (one at the midpoint and two at the ends). Therefore, the quadratic is determined by these three nodal values, shared by the two triangles which meet at that edge. The nodal values elsewhere in the two triangles have no effect on $v^h$ along the edge, and continuity holds. For an edge lying on the outer boundary $\Gamma^h$ in the Dirichlet problem, the three nodal values (and therefore the whole quadratic) are zero.

Any $v^h$ in $S^h$ has a unique expansion into $\sum q_j \varphi_j$, where the coordinate $q_j$ is the value of $v^h$ at the $j$th interior node (vertex or midpoint). Therefore, these $\varphi_j$ form a basis for $S^h$, and the dimension $N$ equals the number of unconstrained nodes.
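The six quadratic basis functions on one triangle have a compact closed form in barycentric (area) coordinates $\lambda_1, \lambda_2, \lambda_3$. The text does not introduce these coordinates at this point, so the formulas below are a standard choice stated here as an assumption: $\lambda_i(2\lambda_i - 1)$ at a vertex and $4\lambda_i\lambda_j$ at a midpoint. The sketch checks the defining property (1 at its own node, 0 at the other five) and the edge argument used in the continuity proof. The names are hypothetical.

```python
from fractions import Fraction as Fr

# Six nodes in barycentric coordinates: the three vertices, then the three midpoints.
NODES = [
    (Fr(1), Fr(0), Fr(0)), (Fr(0), Fr(1), Fr(0)), (Fr(0), Fr(0), Fr(1)),
    (Fr(1, 2), Fr(1, 2), Fr(0)), (Fr(0), Fr(1, 2), Fr(1, 2)), (Fr(1, 2), Fr(0), Fr(1, 2)),
]


def quadratic_basis(lam):
    """Values of the six quadratic basis functions at the barycentric point lam."""
    l1, l2, l3 = lam
    return [
        l1 * (2 * l1 - 1), l2 * (2 * l2 - 1), l3 * (2 * l3 - 1),  # vertex functions
        4 * l1 * l2, 4 * l2 * l3, 4 * l3 * l1,                    # midpoint functions
    ]


def nodal_table():
    """Row i holds the six basis values at node i; the result should be the identity."""
    return [quadratic_basis(node) for node in NODES]


# A point on the edge joining vertices 1 and 2 (lambda_3 = 0).
edge_values = quadratic_basis((Fr(1, 3), Fr(2, 3), Fr(0)))
```

On the edge $\lambda_3 = 0$ only the functions attached to the three nodes of that edge are nonzero, which is the continuity argument in the text expressed in coordinates.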
For continuous piecewise cubics, a basis can be constructed in the same way. A cubic in the two variables $x$, $y$ is determined by 10 coefficients, and in particular by its values at the 10 nodes of the cubic triangle in the previous figure. Again the 4 nodes on each edge determine the cubic uniquely on that edge, and continuity is assured. This is the triangular analogue of the one-dimensional cubic element constructed in Section 1.6 (that is, the one which was only continuous, and had two nodes inside each interval).
The same construction extends to polynomials of any degree $k - 1$ in any number of variables $x_1, \ldots, x_n$, provided the basic regions are always simplexes: intervals for $n = 1$, triangles for $n = 2$, and tetrahedra for $n = 3$. In fact, we can obtain discrete analogues of arbitrarily high accuracy in $n$-space. Unfortunately, there is for practical applications a fatal flaw; the dimension of $S^h$, which equals the total number of interior nodes, grows enormously fast with $k$ and $n$. This is an essential problem with finite elements, to impose further constraints on the trial functions (thereby reducing the dimension of $S^h$) without destroying either the approximation properties or the simplicity of a local basis.
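The growth with degree and dimension is easy to quantify for a single simplex: a complete polynomial of degree $k - 1$ in $n$ variables has $\binom{k - 1 + n}{n}$ coefficients. The helper below (a hypothetical name) simply evaluates this binomial coefficient; it counts coefficients per element, not the global dimension $N$.

```python
from math import comb


def polynomial_dimension(degree, n):
    """Number of coefficients of a complete polynomial of the given degree in n variables."""
    return comb(degree + n, n)


# Coefficients per triangle (n = 2) for linear, quadratic, cubic and quintic polynomials.
per_triangle = [polynomial_dimension(deg, 2) for deg in (1, 2, 3, 5)]
# Coefficients per tetrahedron (n = 3) for the same degrees.
per_tetrahedron = [polynomial_dimension(deg, 3) for deg in (1, 2, 3, 5)]
```

The values for $n = 2$ agree with the counts 3, 6, 10 and 21 quoted in this section for linear, quadratic, cubic and quintic polynomials.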
The key is to increase properly the level of continuity required. In the cubic case, we may add the condition that the first derivatives of $v^h$ should be continuous at every vertex. This is clearly a subspace of the previous cubic case, in which only continuity of $v^h$ was required. If $m$ triangles meet at a given interior vertex, the continuity of $v^h_x$ and $v^h_y$ imposes new constraints on the trial functions, and the dimension $N$ is correspondingly reduced. To construct a basis we remove the midedge nodes and concentrate a triple node at each vertex. In other words, the 10 coefficients in the cubic are determined by $v$, $v_x$, and $v_y$ at each vertex, together with $v$ at the centroid; this tenth node has nowhere else to go. A unique cubic is determined by these 10 nodal values, and along each edge it is determined by a subset of 4 (the values of $v$ and its edge derivative at each end), a combination which again assures matching along the edge. The result is an extremely useful cubic trial space, which we denote by $Z_3$.
Piecewise polynomial elements like $Z_3$ are easy to describe in the interior of the domain; at the boundary we have to be more careful, and there are various alternatives. In case of an essential condition $u = 0$, there is first of all the constraint $v^h = 0$. If $v^h$ is to vanish along all of $\Gamma^h$, then also the derivatives along both chords must be set to zero, leaving no parameters free at that vertex. A more satisfactory approximation is obtained by the isoparametric method of Section 3.3, or, if it is preferred to stay with the $x$-$y$ variables, by imposing only the condition that the derivative of $v^h$ vanish in the direction tangent to $\Gamma$. In the latter case we must give up the Dirichlet condition that $v^h$ vanish on $\Gamma^h$ and by extension on the true boundary $\Gamma$. Instead, the functions will actually violate the essential condition for admissibility, and in this version of the cubic $Z_3$, the trial space will not be a subspace of the Dirichlet space $\mathcal{H}_0^1$. Nevertheless, it is possible to give a rigorous estimate of the error induced by such inadmissible elements; this will be a key result of Chapter 4. We may expect $v^h$ to be nearly zero on $\Gamma$, and if we go so far as to compute over the curved triangles of $\Omega$ rather than the real triangles of $\Omega^h$, the numerical results will be good.
There is an additional point of practical importance about the cubic elements: The unknowns $Q_c$ corresponding to the centroid nodes can be eliminated immediately from the finite element system $KQ = F$, leaving a system with effectively three unknowns per mesh point. This early elimination of unknowns is known as static condensation. It depends on the fact that the corresponding basis functions $\varphi_c$ are nonzero only inside a single triangle; each $Q_c$ is coupled only to the other nine parameters of its own triangle. Thus the $c$th equation can be solved for $Q_c$ in terms of these nine nearby $Q_j$, and $Q_c$ can be eliminated from the system without increasing its bandwidth. The number of arithmetical operations is directly proportional to the number of centroids which are removed, which is exceptional; in general two-dimensional problems, it is impossible to eliminate $n$ nodes with only $\alpha n$ operations. Physically, the optimal displacement $Q_c$ at the centroid is determined exclusively by the nine parameters which establish the displacement along the boundary of the element.
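In matrix terms, static condensation is one step of block elimination: the $c$th equation is solved for $Q_c$ and substituted into the others. The sketch below illustrates this on a made-up $3 \times 3$ symmetric positive definite system (the numbers are not from the text), using exact rational arithmetic so that the condensed and uncondensed solutions can be compared exactly. The function names are hypothetical.

```python
from fractions import Fraction as Fr


def solve(K, F):
    """Gaussian elimination without pivoting (adequate for the small SPD example below)."""
    n = len(F)
    A = [row[:] + [F[i]] for i, row in enumerate(K)]
    for p in range(n):
        for r in range(p + 1, n):
            factor = A[r][p] / A[p][p]
            for col in range(p, n + 1):
                A[r][col] -= factor * A[p][col]
    Q = [Fr(0)] * n
    for r in reversed(range(n)):
        tail = sum(A[r][col] * Q[col] for col in range(r + 1, n))
        Q[r] = (A[r][n] - tail) / A[r][r]
    return Q


def condense(K, F, c):
    """Eliminate unknown c; return the reduced system and the indices that remain."""
    keep = [i for i in range(len(F)) if i != c]
    K_red = [[K[i][j] - K[i][c] * K[c][j] / K[c][c] for j in keep] for i in keep]
    F_red = [F[i] - K[i][c] * F[c] / K[c][c] for i in keep]
    return K_red, F_red, keep


def recover(K, F, c, keep, Q_red):
    """Back-substitute into the c-th equation to obtain the eliminated unknown."""
    coupled = sum(K[c][j] * q for j, q in zip(keep, Q_red))
    return (F[c] - coupled) / K[c][c]


# Illustrative data only: unknown 2 plays the role of a centroid parameter.
K = [[Fr(4), Fr(-1), Fr(-1)], [Fr(-1), Fr(4), Fr(-1)], [Fr(-1), Fr(-1), Fr(3)]]
F = [Fr(1), Fr(2), Fr(3)]

Q_full = solve(K, F)
K_red, F_red, keep = condense(K, F, 2)
Q_red = solve(K_red, F_red)
Q_c = recover(K, F, 2, keep, Q_red)
```

In a finite element code the same elimination would be applied to each element matrix before assembly, since $Q_c$ couples only to the parameters of its own triangle.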
Mathematically, this process may be regarded as the orthogonalization of the basis functions $\varphi_j$ with respect to each centroid basis function $\varphi_c$. Of course, the Gram-Schmidt process could always be used to orthogonalize the whole basis, leaving the trivial stiffness matrix $K = I$, but this would be insane; it is quicker just to solve $KQ = F$ directly. The special orthogonalization against $\varphi_c$ is possible since it changes only the nine adjacent $\varphi_j$, and only within the given triangle. It appears that in general the elimination of unknowns in other than the normal Gaussian order will increase the width of a band matrix, except in special circumstances, such as the odd-even reduction, which systematically eliminates every other node in the five-point Laplace difference equation.
There is an important variant of the cubic space $Z_3$, in which this static condensation is avoided: the centroid is not a node at all, and there are only the nine nodal parameters given by $v$, $v_x$, and $v_y$ at each vertex. Correspondingly, one degree of freedom must be removed from the cubic polynomials, and a familiar device is to require equal coefficients in the $x^2 y$ and $xy^2$ terms:

$$v^h = a_1 + a_2 x + a_3 y + a_4 x^2 + a_5 xy + a_6 y^2 + a_7 x^3 + a_8 (x^2 y + xy^2) + a_9 y^3.$$
This restriction damages the accuracy of the element; we shall connect the rate of convergence to the degree of polynomial which can be completely reproduced in the trial space, and this degree has dropped from 3 to 2. Nevertheless, this restricted cubic is among the most important elements, and is preferred by many engineers to the full $Z_3$. Its nodal parameters are excellent. (However, we observe in Section 4.2 that the remaining nine degrees of freedom are not invariant with respect to rotation in the $x$-$y$ plane, and for some orientations they are not even uniquely determined by the nine nodal parameters. Anderheggen has proposed $\iint v^h = 0$ as an alternative constraint, and Zienkiewicz [Z2, p. 187] analyzes a third, and very attractive, possibility.)
None of the spaces described so far can be applied to the biharmonic equation, since they are not subspaces of $\mathcal{H}^2$. (Or rather, their use would be illegal; the restricted $Z_3$ is frequently used for shells. Elements which are nonconforming along interior edges are discussed in Section 4.2.) Therefore, denoting by $C^k$ the class of functions with continuous derivatives through order $k$, we want now to construct elements which belong to $C^1$. The essential new condition is that the normal slope be continuous across interelement boundaries. A function in $C^1$ is automatically in $\mathcal{H}^2$ and therefore admissible for fourth-order problems; the integrals over $\Omega$ can be computed an element at a time, with no $\delta$-functions at the boundaries.
Several possibilities exist. One is to modify the cubic polynomials in $Z_3$ by rational functions, chosen to cancel the edge discontinuities in the normal slopes without affecting the nodal quantities $v$, $v_x$, and $v_y$. This makes $v^h$ piecewise rational instead of piecewise polynomial, and again the accuracy is reduced; the space does not contain an arbitrary cubic. We are opposed to rational functions, however, on other and more important grounds; they are hard to integrate, and even numerical integration has very serious difficulties (Section 4.3).
If we prefer to work exclusively with polynomials, we are led to one of the most interesting and ingenious of all finite elements: the quintics (Fig. 1.12). A polynomial of fifth degree in $x$ and $y$ has 21 coefficients to be determined, of which 18 will come from the values of $v$, $v_x$, $v_y$, $v_{xx}$, $v_{xy}$, and $v_{yy}$ at the vertices. These second derivatives represent "bending moments" of physical interest, which will now be continuous at the vertices and available as output from the finite element process; they are given directly by the appropriate weights $Q_j$. Furthermore, with all second derivatives present there is no difficulty in allowing an irregular triangulation. The quintic along the edge between two triangles is the same from both sides, since three conditions are shared at each end of the edge ($v$ and its edge derivatives $v_s$ and $v_{ss}$), which are computable from the set of six parameters at the vertex.
It remains to determine three more constraints in such a way that also the normal derivative $v_n$ is continuous between triangles. One technique is to add the value of $v_n$ at the midpoint of each edge to the list of nodal parameters. Then since $v_n$ is a quartic in $s$ along the edge, it will be uniquely fixed by this parameter together with the values of $v_n$ and $v_{ns}$ at each end. (The normal $n$ at the middle of an edge points into one triangle and out of the adjacent one, adding somewhat to the bookkeeping.) This construction produced the complete $C^1$ quintic, which was devised independently in at least four papers. Its accuracy in the displacement will be $O(h^6)$, if the boundary is successfully approximated; comparable accuracy with finite differences has apparently never been achieved for fourth-order problems in the plane.
[Fig. 1.12 Two quintic triangles and a cubic macro-triangle. Panels: full quintic; reduced quintic ($v_n$ is a cubic on each edge).]

An equally effective way of disposing of the three remaining constraints is to require that $v_n$ reduce to a cubic along each edge, in other words, that the leading coefficient in what would otherwise be a quartic polynomial be made to vanish. With this constraint the nodal values of $v_n$ and $v_{ns}$ determine the cubic $v_n$ along the edge, and the element is in $C^1$. The complete quintic is lost, reducing the displacement error in $u - u^h$ from $h^6$ to $h^5$. At the same time, however, the dimension of $S^h$ is significantly reduced (by a factor of 9 : 6; see Table 1.1) and this more than compensates for the loss of three quintic terms. In fact, a series of careful numerical experiments [C14] has given first prize to this remarkable element. Admittedly, it is not always simple to work with (the continuity of second derivatives must be relaxed at corners in the domain, where the true solution $u$ is less continuous) but it is highly accurate.†

†There are important problems in which the physics introduces side conditions. For incompressible fluids, for example, it is useful to impose $\operatorname{div} v^h = 0$ on all the trial functions. The French school (Crouzeix, Fortin, Glowinski, Raviart, Témam) has found that although the Courant triangles are completely inadequate, convergence can be established with the use of
1. standard quadratics in the plane and cubics in 3-space;
2. these $C^1$ quintics as stream functions in two dimensions, with a divergence-free velocity field of quartic polynomials constructed by differentiating the quintics;
3. a non-conforming element, linear on triangles but continuous only at the midpoints of the edges (cf. Section 4.2).

Note that to achieve a $C^1$ triangular element, it was necessary to prescribe
even the second derivatives at the vertices. Zenisek has found a remarkable theorem [Z1] in this direction: To achieve piecewise polynomials of class $C^q$ on an arbitrary triangulation, the nodal parameters must include all derivatives at the vertices of order less than or equal to $2q$. He has constructed such elements in $n$ dimensions by using polynomials of degree $2^n q + 1$, conjectured to be the minimum possible.
Fortunately there is a way to avoid this severe limitation and still achieve a conforming $C^1$ element. It consists of building up a "macroelement" out of several standard elements. One of the best known is the Clough-Tocher triangle, formed by combining different cubic polynomials in three subtriangles (Fig. 1.12). The final nodal parameters for the large triangle are $v$, $v_x$, and $v_y$ at the vertices and $v_n$ at the midpoints of the edges, 12 in all. These guarantee that even the normal slope will match that of the adjoining macrotriangle, since the slope is a quadratic and is therefore determined along an edge exclusively by the parameters on that edge: $v_n$ is given directly at the midpoint, and indirectly at the two vertices as a combination of $v_x$ and $v_y$. Since each of the three cubics has 10 degrees of freedom, and the final macrotriangle has 12 nodal parameters, 18 constraints may be imposed. It turns out that this is just right to achieve $C^1$ continuity internally in the triangle. Requiring a common value of $v$, $v_x$, and $v_y$ furnishes 3 constraints at each external vertex and 6 at the internal one, and agreement of $v_n$ at the midpoints of the three internal edges makes 18.
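The bookkeeping in this count can be written out in one line. It restates the numbers given in the paragraph and adds nothing new:

$$\underbrace{3 \times 10}_{\text{three cubics}} - \underbrace{12}_{\text{nodal parameters}} = 18 = \underbrace{3 \times 3}_{\text{external vertices}} + \underbrace{6}_{\text{internal vertex}} + \underbrace{3}_{\text{internal midpoints}}.$$

At the internal vertex three cubics must share $v$, $v_x$, and $v_y$, which costs $2 \times 3 = 6$ conditions; at each external vertex two cubics share the same three quantities, which costs 3.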
Assuming that a given triangulation allows the triangles to be combined three at a time into macrotriangles, we have just found a basis for the space $S^h$ of all piecewise cubics of class $C^1$. This is one case of an important and apparently very difficult problem: to determine a basis for the space of piecewise polynomials $v(x_1, \ldots, x_n)$ of degree $k - 1$ which have continuity $C^q$ between simplexes. We have no idea (for $n > 1$) even of the dimension of this space. According to Table 1.1, the $C^1$ cubics have $M = 6$ parameters for each pair of macrotriangles, and therefore an average of one unknown for each of the original triangles.†
†Added in proof: We now have a conjecture about the dimension, but no inkling of a basis, for the space of polynomials of degree $k - 1$ and continuity $C^q$.

To summarize the polynomials described in this section, we shall tabulate their essential properties. The column headed $d$ gives the number of parameters required to determine the polynomial within each region, that is, the number of degrees of freedom if adjacent elements imposed no constraints. The integer $k - 1$ is the degree of the highest "complete polynomial" which is present; this means that some polynomial of degree $k$ cannot be found in the trial space, and consequently (as we shall prove) that the error in $u - u^h$ is of order $h^k$. Finally, $N$ is the dimension of the trial space $S^h$, assuming that $\Omega$ is a square which has been partitioned into $n^2$ small squares and then into
ces. Zenisek has found a remarkable Table 1.1 TRIANGULAR ELEMENTS

eve piecewise polyhomials of class eq
arameters must include al! derivatives Element Type Continuity d k N Mn2
l to 2q. He has constructed such ele-
mials of degree 2nq 1, conjectured Linear eo 3 2 n2
Quadratic eo 6 3 4n 2
Cubic eo lO 4 9n2
his severe limitationJt.nd stil1 achieve Cubic Z3 eo, vx, vy continous
:- building up a "macroelement" out at vertices 10 4 5n2
1e best known is the Clough-Tocher Restricted Z 3 As above, plus equal
coefficients of x2y and xy 2 9 3 3n2
cubic polynomials in three subtri-
Quintic e t, 'Vxx, 'Vxy, 'Vyy 21 6 9n 2
te~ers for the large triangle are v, vx, continuous at vertices
1omts of the edges- 12 in aH. These Reduced quintic et, 'Vxx, 'Vxy, 'Vyy
l match that ofthe adjoining macro- continuous at vertices 18 5 6n2
an~ is therefore ~etermined along Cubic macrotriangle et 12 4 M=6
m that edge-vn,lis/ given directly at
vertices as a combination of v and 2n 2 triangles, each of the small squares being cut in half by the diagonal
) degrees of freedom, and thex final which has slope + 1. On1y the leading term N Mn 2 is given; there will
18 constraints · may be imposed. It be correction terms depénding on the constraints imposed at the boundary,
e e 1 continuity internally in th/tri- but the important constant is M. The coefficient Mis p 3q + 2r for an
'x• and vY furnishes 3 constraints at element in which each vertex is a p-tuple node, each edge has q nodes, and
al one, and agreement of v at the the interior.of each triangle contains r nodes. The reason is that in any tri-
kes 18. n
angulation there are twice as many triang1es as vertices; each triangle accounts
t11ows the triangles to be combined for 180° in angles and each vertex for 360°. Furthermore, the ratio of edges· to
ave just found a basis for the space triangles 'is 3: 2, since each edge is shared by two triangles. This explains 'the
is is one case of an important and 1-2-3 weighting of p, r, and q. (Second prqof: lf one new vertex is introduced
rmine a basis for the space of piece- insid.e an existing triangle, it brings three new edges and a net increase
e k ~·1 which have continuity eª oftwo triangles.) This 1-2-3 rule holds on any set oftriangles, and it means
r n > 1) even of the dimension of that on average there are M/2 unknowhs per triangle. lt follows that multiple
e 1 cubics have M= 6 parameters nodes are desirable; an increase in p is economical in its contribution both to
ore an average of one unknown for dimension and to bandwidth. Interna! nodes are next best, especially since
they can be removed by static condensation. Nodes along edges are the
Jed in this section, we sha11 tabulate· worst in respect of computation time.
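The 1-2-3 rule can be checked by direct counting. The following is a minimal sketch, not taken from the text: the helper names (`mesh_counts`, `M_triangle`, `exact_unknowns`) are illustrative, and the $(p, q, r)$ triples are our reading of the nodal descriptions in Table 1.1. It counts vertices, edges, and triangles when the unit square is cut into $2n^2$ triangles and compares the exact parameter count with $Mn^2$.

```python
def mesh_counts(n):
    """Vertices, edges, triangles for an n-by-n grid of squares, each cut by one diagonal."""
    vertices = (n + 1) ** 2
    edges = 2 * n * (n + 1) + n * n   # horizontal and vertical edges, plus one diagonal per square
    triangles = 2 * n * n
    return vertices, edges, triangles

def M_triangle(p, q, r):
    """Leading coefficient in N ~ M n^2: p per vertex, q per edge, r per triangle interior."""
    return p + 3 * q + 2 * r

def exact_unknowns(n, p, q, r):
    """Exact parameter count before any boundary conditions are imposed."""
    v, e, t = mesh_counts(n)
    return p * v + q * e + r * t

# (p, q, r) for some rows of Table 1.1 (assumed from the nodal descriptions).
elements = {
    "linear": (1, 0, 0),
    "quadratic": (1, 1, 0),
    "cubic": (1, 2, 1),
    "cubic Z3": (3, 0, 1),
    "quintic": (6, 1, 0),
    "reduced quintic": (6, 0, 0),
}

n = 200
for name, (p, q, r) in elements.items():
    print(name, M_triangle(p, q, r), exact_unknowns(n, p, q, r) / n**2)
```

The ratio in the last column approaches $M$ as $n$ grows; the difference is the boundary correction mentioned above.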
Let us remark that in any theoretical comparison of two finite elements, the one of higher degree $k - 1$ will always be the more efficient, asymptotically as $h \to 0$. The error decreases like $Ch^k$, with a constant $C$ which depends both on the particular element and on the $k$th derivatives of $u$. Therefore, the only theoretical limitation on the rate of convergence is the smoothness of $u$, and even this can be avoided (Section 3.2) by grading the mesh.

The balance is altered, however, if one fixes the accuracy required and asks for the element which achieves that accuracy at least expense. The mesh width $h$ will be finite, not infinitesimal. Furthermore, there is the question whether the calculation is to be done only a few times, so that convenience in programming is a necessity, or whether the programming and preparation
costs can be amortized over a long program lifetime. It is our judgment that elements of limited order, like those in Table 1.1 (together with others constructed to have similar characteristics, such as quadrilateral elements, three-dimensional elements, and shell elements), will provide the basic range of selection for the practical application of the finite element method.

We want now to discuss rectangular elements, which are rapidly increasing in popularity. They are of special value in three dimensions, where each cube occupies the same volume as six very elusive tetrahedra. (An irregular decomposition of 3-space into tetrahedra is hard even on the computer.) Furthermore, a number of important problems in the plane are defined on a rectangle or on a union of rectangles. The boundary of a more general region cannot be described sufficiently well without using triangles, but it will often be possible to combine rectangular elements in the interior with triangular elements near the boundary.

The simplest construction, in analogy with the linear element for triangles, is based on functions which are piecewise bilinear: $v^h = a_1 + a_2x + a_3y + a_4xy$ in each rectangle. These four coefficients are determined by the values of $v^h$ at the vertices. The overall trial function is continuous and may be applied to differential equations of second order. There is an obvious interpolating basis: $\varphi_j$ equals one at the $j$th node and zero at the others. Its graph looks like a pagoda, or at least like our conception of one, and $\varphi_j$ will be called a pagoda function. It is a product $\psi(x)\psi(y)$ of the basic piecewise linear roof functions in one variable, so that the space $S^h$ is a tensor product of two simpler spaces. This is a valuable element.
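The product structure of the pagoda function is easy to exhibit numerically. The sketch below is illustrative only: the function names `roof` and `pagoda`, the uniform mesh width `h`, and the 3-by-3 patch of nodes are assumptions made for the example.

```python
import numpy as np

def roof(t, node, h):
    """Piecewise linear roof function: 1 at `node`, 0 at distance h or more."""
    return np.maximum(0.0, 1.0 - np.abs(t - node) / h)

def pagoda(x, y, xj, yj, h):
    """Bilinear basis function as a product of two one-dimensional roof functions."""
    return roof(x, xj, h) * roof(y, yj, h)

h = 0.5
nodes = [(i * h, j * h) for i in range(3) for j in range(3)]
center = (0.5, 0.5)
values = [float(pagoda(x, y, center[0], center[1], h)) for (x, y) in nodes]
print(values)   # one at the centre node, zero at the other eight nodes

# Inside one rectangle the basis functions of its four corners sum to one.
x, y = 0.2, 0.35
corners = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
total = sum(float(pagoda(x, y, cx, cy, h)) for cx, cy in corners)
print(total)
```

The second check reflects the fact that constants belong to the bilinear space on each rectangle.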
It is important to notice that for arbitrary quadrilaterals, these piecewise bilinear functions would not be continuous from one element to the next. Suppose that two quadrilaterals are joined by a line $y = mx + b$. Then along that edge, the bilinear function reduces to a quadratic; it is linear only if the edge is horizontal or vertical. A quadratic cannot be determined from the two nodal values at the endpoints of the edge, and in fact the other nodes do affect the value of $v^h$. Therefore, bilinear elements may be used only on rectangles. Given a more general quadrilateral, however, it is still possible to change coordinates in such a way that it becomes a rectangle, and then bilinear functions in the new coordinates are admissible. In fact, this coordinate change can itself be described by a bilinear function, so that the same polynomials are used for the transformation of coordinates as for the shape functions within each element. This is the simplest of the isoparametric elements, which are discussed in detail in Section 3.3.
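The reduction to a quadratic along a slanted edge can be written out in one line. This is a short worked computation in the notation of the paragraph above, added for illustration:

```latex
v^h(x,\, mx + b) = a_1 + a_2 x + a_3 (mx + b) + a_4\, x (mx + b)
                 = (a_1 + a_3 b) + (a_2 + a_3 m + a_4 b)\, x + a_4 m\, x^2 .
```

The term $a_4 m x^2$ disappears for every choice of $a_4$ only when $m = 0$, that is, on a horizontal edge; a vertical edge $x = \text{const}$ is handled in the same way with the roles of $x$ and $y$ exchanged.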
The bilinear element on rectangles can be merged with Courant's linear elements on triangles, since both are completely determined by the value of $v^h$ at the nodes. Other combinations are also possible: A bilinear function on a triangle, with a midpoint node along one edge, is sometimes merged with a full quadratic on a neighboring element. In general, bilinear trial functions seem slightly superior to linear ones, since they exactly reproduce the twist term $xy$. Nevertheless, a square "macroelement" can be formed from two triangles and have the same element stiffness matrix. For the Laplacian $L = -\Delta$, the assembled bilinear stiffness matrix $K$ is proportional to 8 along the main diagonal and to $-1$ at points corresponding to its eight neighbors in the plane. The extension to a trilinear function $a_1 + a_2x + a_3y + a_4z + a_5xy + a_6xz + a_7yz + a_8xyz$ in 3-space is clear; again there is one unknown at each corner.
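The 8 and $-1$ pattern can be reproduced by assembling four square elements around one interior node. This is a sketch under stated assumptions: the energy is taken to be the Dirichlet integral of $v_x^2 + v_y^2$, the mesh is uniform, and the names `bilinear_stiffness` and `assemble` are ours.

```python
import numpy as np

def bilinear_stiffness(h=1.0):
    """Element stiffness for the Dirichlet integral on an h-by-h square.

    Nodes are ordered (0,0), (h,0), (h,h), (0,h). A 2x2 Gauss rule integrates
    the products of the shape-function gradients exactly.
    """
    g = 0.5 / np.sqrt(3.0)
    pts = [0.5 - g, 0.5 + g]            # Gauss points on [0, 1], weight 1/2 each
    k = np.zeros((4, 4))
    for s in pts:
        for t in pts:
            # gradients of the four shape functions at the point (s*h, t*h)
            dN = np.array([[-(1 - t), -(1 - s)],
                           [(1 - t), -s],
                           [t, s],
                           [-t, (1 - s)]]) / h
            k += 0.25 * h * h * dN @ dN.T
    return k

def assemble(n, h=1.0):
    """Assemble K on an n-by-n grid of square elements (no boundary conditions)."""
    def idx(i, j):
        return i * (n + 1) + j
    K = np.zeros(((n + 1) ** 2, (n + 1) ** 2))
    ke = bilinear_stiffness(h)
    for i in range(n):
        for j in range(n):
            dofs = [idx(i, j), idx(i + 1, j), idx(i + 1, j + 1), idx(i, j + 1)]
            K[np.ix_(dofs, dofs)] += ke
    return K

K = assemble(2)
stencil = 3 * K[4].reshape(3, 3)   # row of the single interior node, laid out on the grid
print(stencil)
```

With this scaling the constant of proportionality is 1/3, independent of the mesh width in two dimensions.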
Just as the bilinear element on rectangles matches the linear element on triangles, there are biquadratics and bicubics which correspond to the $C^0$ quadratics and cubics on triangles. (Ultimately there are polynomials of degree $k - 1$ in each of the variables $x_1, \ldots, x_n$, possessing $k^n$ degrees of freedom within each "box" element.) The biquadratics are often used and are easy to describe. Figure 1.13 shows the arrangement of nodes. The nine coefficients of a biquadratic are determined by its nodal values, and the three nodes on each edge assure continuity between rectangles. Clearly the biquadratic can be merged with the quadratic element on triangles, since the same vertices and midpoints are nodes. Similar remarks apply to the bicubics.

Fig. 1.13 Nodes for biquadratic and Hermite bicubic.

A simple modification of the biquadratic is to remove the interior node, reducing the number of nodes in each element to eight. To compensate, the biquadratic is restricted by the removal of the $x^2y^2$ term, whose contribution to the approximation of $u$ was very marginal anyway. This simple and pleasant element belongs to the "serendipity class" described by Zienkiewicz. It is particularly useful for quadrilaterals with curved sides, after a coordinate change transforms the region into a square (Section 3.3); in this context it becomes, along with the bilinear element, the most valuable isoparametric element in the plane.
The corresponding three-dimensional element is the 20-point brick, which again is especially valuable for isoparametrics because it has no internal or multiple nodes. Its nodes lie at the eight vertices and at the midpoints of the twelve edges of the brick. This total matches the number of terms $x^\alpha y^\beta z^\gamma$ which are of second degree in at most one of the variables: $x^2yz$ is allowed, but not $x^2y^2z$ or $x^3$.

The higher-order serendipity elements in the plane have $4p$ nodes equally spaced along the perimeter of the rectangle, including four at the corners. The shape functions include all terms $x^\alpha y^\beta$ in which neither exponent exceeds $p$, and the smaller exponent is 0 or 1. The first missing term is therefore $x^2y^2$, and $k = 4$. These are again especially useful in coordinate changes, where the boundaries of the rectangle can be transformed into arbitrary polynomial curves of degree $p$, and there are no annoying internal nodes.
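Both term counts can be confirmed by enumeration. The sketch below simply lists the admissible exponents; the function names are illustrative.

```python
from itertools import product

def brick_terms():
    """Monomials x^a y^b z^c with exponents at most 2 and at most one exponent equal to 2."""
    return [(a, b, c) for a, b, c in product(range(3), repeat=3)
            if [a, b, c].count(2) <= 1]

def serendipity_terms(p):
    """Monomials x^a y^b with both exponents at most p and the smaller exponent 0 or 1."""
    return [(a, b) for a, b in product(range(p + 1), repeat=2) if min(a, b) <= 1]

print(len(brick_terms()))                        # compare with the 8 + 12 nodes of the brick
for p in range(1, 6):
    print(p, len(serendipity_terms(p)), 4 * p)   # term count against the 4p perimeter nodes
```

The planar count is $(p+1)^2 - (p-1)^2 = 4p$, since the excluded terms are exactly those with both exponents between 2 and $p$.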
There are several fascinating elements (due to Clough, Felippa, and others) in which a quadrilateral is produced as the union of two or more triangular pieces, so that the polynomial varies from piece to piece within the quadrilateral macroelement. The nodal arrangement and continuity between pieces can become rather subtle. However, we shall describe only one other element, the Hermite bicubic. The trial functions are again cubic in each variable separately, $v^h = \sum a_{ij}x^iy^j$ for $0 \le i, j \le 3$, yielding 16 degrees of freedom in each rectangle. The parameters are determined by the values of $v$, $v_x$, $v_y$, and $v_{xy}$ at the four vertices.† Thus the dimension of $S^h$ is greatly reduced in comparison with the ordinary bicubics described above, which are based on 16 distinct nodes.

This element should be understood as a natural extension of the fundamental one-dimensional element, the Hermite cubic, which was discussed in Section 1.7. In that case $v = a_1 + a_2x + a_3x^2 + a_4x^3$ was determined over each subinterval by the values of $v$ and $v_x$ at the endpoints. This construction left $v_x$ continuous at the nodes, and therefore everywhere, so that the element was in $C^1$. The standard basis in one dimension contained two kinds of functions, $\psi_j(x)$ and $\omega_j(x)$, which interpolated function values and derivatives, respectively:

$$
\psi_j = \begin{cases} 1 & \text{at node } x = x_j, \\ 0 & \text{at other nodes,} \end{cases}
\qquad
\frac{d\psi_j}{dx} = 0 \ \text{at all nodes,}
$$
$$
\omega_j = 0 \ \text{at all nodes,}
\qquad
\frac{d\omega_j}{dx} = \begin{cases} 1 & \text{at node } x = x_j, \\ 0 & \text{at other nodes.} \end{cases}
\tag{60}
$$

These functions span all piecewise cubics of class $C^1$.
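On a single interval the conditions (60) pin down four cubics. The sketch below writes them on the reference interval $[0, 1]$ and checks the nodal conditions; the reference interval, the variable names, and the helper `interpolate` are assumptions for illustration (on an interval of length $h$ the two slope functions would be scaled by $h$).

```python
import numpy as np
from numpy.polynomial import Polynomial

# Cubics on the reference interval [0, 1]; coefficients run from the constant term up.
psi0 = Polynomial([1, 0, -3, 2])     # value 1 at t = 0
omega0 = Polynomial([0, 1, -2, 1])   # slope 1 at t = 0
psi1 = Polynomial([0, 0, 3, -2])     # value 1 at t = 1
omega1 = Polynomial([0, 0, -1, 1])   # slope 1 at t = 1

basis = [psi0, omega0, psi1, omega1]
# Rows: basis function. Columns: value at 0, slope at 0, value at 1, slope at 1.
table = np.array([[f(0.0), f.deriv()(0.0), f(1.0), f.deriv()(1.0)] for f in basis])
print(table)

def interpolate(v0, s0, v1, s1):
    """Cubic with prescribed end values v0, v1 and end slopes s0, s1."""
    return v0 * psi0 + s0 * omega0 + v1 * psi1 + s1 * omega1
```

The printed table is the identity matrix, which is the statement of (60) restricted to one subinterval.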
The Hermite bicubic space is a product of two such cubic spaces, and the four parameters at a typical node $z = (x_j, y_l)$ lead to four corresponding basis functions:

$$
\Phi_1 = \psi_j(x)\psi_l(y), \quad \Phi_2 = \psi_j(x)\omega_l(y), \quad
\Phi_3 = \omega_j(x)\psi_l(y), \quad \Phi_4 = \omega_j(x)\omega_l(y).
\tag{61}
$$

†This element was introduced in the engineering literature by Bogner, Fox, and Schmit.
We emphasize that these nodes must lie on a rectangular grid; the lines $x = x_j$ in one direction and $y = y_l$ in the other can be spaced arbitrarily, but their intersections absolutely determine the two-dimensional array of nodes. Therefore, the bicubic element is restricted to rectangles (or, after a simple linear transformation of the plane, to parallelograms).

On a sufficiently rectangular region the Hermite bicubic is one of the best elements. Its degree of continuity follows immediately from the basis given in (61); since $\psi_j$ and $\omega_j$ are both in $C^1$, all their products inherit this property. Therefore, this bicubic may be applied to fourth-order equations; the trial functions lie in $\mathcal{H}^2$. Furthermore, even the cross derivatives $\partial^2 v/\partial x\,\partial y$ are all continuous. (This suggests a basis-free characterization of the Hermite space $S^h$; it consists of all piecewise bicubics $v$ such that $v$, $v_x$, $v_y$, and $v_{xy}$ are continuous; we say that $v$ is in $C^{1,1}$.) The remarkable thing is that this extra degree of continuity does not seem to follow from the usual arguments. The function $v_{xy}$ is quadratic along each edge and is the same for the two rectangles which share that edge, yet only the values of $v_{xy}$ at the two endpoints are automatically held in common, and two values cannot determine a quadratic!

The Hermite idea can be extended to elements of higher degree $2q - 1$. In one dimension there would be $q$ different functions, corresponding to $\psi_j$ and $\omega_j$ of the cubic case; all their derivatives of order less than $q$ will vanish at the nodes, except that the $p$th function $\omega_p(x)$ has $(\partial/\partial x)^{p-1}\omega_p = 1$ at the origin. This yields the most natural basis for the space of polynomials of degree $2q - 1$ having $q - 1$ continuous derivatives. In two dimensions we need all products $\omega_p(x)\omega_{p'}(y)$, which means that there are $q^2$ unknowns at every node; many of them are cross derivatives of high order, and the construction becomes too inefficient after bicubics or biquintics. As always, the number of unknowns can be reduced by imposing additional continuity. The ultimate in this direction is achieved by the splines, which are as continuous as possible without becoming a single polynomial over the whole domain. There is only one unknown for each node, with basis function given by the B-spline $\varphi(x)$ on the line and by $\varphi(x)\varphi(y)$ in the plane. The difficulty, of course, is that nodes from different elements become coupled in the construction; the boundary conditions become more complicated; and isoparametric transformations (to get away from rectangles) are impossible.
Because all these spaces are spanned by products of one-dimensional basis functions, the stiffness matrices $K$ may also admit a decomposition into one-dimensional operators. Roughly speaking, this occurs when separation of variables can be applied to the differential operator $L$. A case of practical importance occurs in parabolic problems, where Galerkin's method leads to implicit difference schemes, with a two-dimensional matrix $M$ (the mass matrix, or Gram matrix, formed from the inner products of the $\varphi_j$) to be inverted at each time step. For difference equations it was this kind of difficulty which gave rise to the alternating direction methods, in which the inversion is approximated by two one-dimensional computations. For any space formed from the products of one-dimensional elements, the same technique can be applied, with the usual qualification that if the domain is not precisely rectangular, then the success of the alternating direction method is observed but not proved.
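The decomposition of the mass matrix is simplest to see for bilinear elements on a square, where it is exact. The sketch below is not the alternating direction method itself; it only illustrates, under the assumption of a uniform mesh of roof functions on the unit square, that the two-dimensional Gram matrix is a Kronecker product and can be inverted by two sweeps of one-dimensional solves. All names are illustrative.

```python
import numpy as np

def mass_1d(n, h):
    """Gram matrix of the n + 1 roof functions on a uniform mesh of width h."""
    M = np.zeros((n + 1, n + 1))
    me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])   # element mass matrix, linear elements
    for e in range(n):
        M[e:e + 2, e:e + 2] += me
    return M

n, h = 4, 0.25
Mx = mass_1d(n, h)
My = mass_1d(n, h)
M2 = np.kron(Mx, My)     # Gram matrix of the products of roof functions in x and y

rng = np.random.default_rng(0)
F = rng.standard_normal((n + 1, n + 1))

# Direct two-dimensional solve of M2 u = f ...
U_direct = np.linalg.solve(M2, F.reshape(-1)).reshape(n + 1, n + 1)
# ... against two sweeps of one-dimensional solves for Mx U My^T = F.
U_sweeps = np.linalg.solve(My, np.linalg.solve(Mx, F).T).T

print(np.max(np.abs(U_direct - U_sweeps)))
```

The two results agree to rounding error; the one-dimensional sweeps involve only tridiagonal matrices.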
In Table 1.2, $M = p + 2q + r$ for an element with $p$ parameters at each vertex, $q$ along each edge, and $r$ inside each rectangle. The weights 1, 2, 1 come from the combinatorics of an arbitrary quadrilateral subdivision: The number of vertices equals the number of quadrilaterals equals half the number of edges. There are $n$ subintervals in each coordinate direction.

Table 1.2 RECTANGULAR ELEMENTS

Element type                     Continuity      d      k    N = Mn^2
Bilinear                         C^0             4      2    n^2
Biquadratic                      C^0             9      3    4n^2
Restricted biquadratic           C^0             8      3    3n^2
Ordinary bicubic                 C^0             16     4    9n^2
Hermite bicubic                  C^{1,1}         16     4    4n^2
Splines of degree k - 1          C^{k-2,k-2}     k^2    k    n^2
Hermite, degree k - 1 = 2q - 1   C^{q-1,q-1}     k^2    k    q^2 n^2
Serendipity, p > 2               C^0             4p     4    (2p - 1)n^2
1.10. ELEMENT MATRICES IN TWO-DIMENSIONAL PROBLEMS
In this section we shall outline the sequence of operations performed by the computer in assembling the stiffness matrix $K$, that is, in setting up the discrete finite element system $KQ = F$. The calculation of the consistent load vector $F$ will also be briefly described. Our intention is not to give the intimate details which a programmer would require, but rather to clarify the crucial point in the success of the finite element method: polynomial elements like those of the previous section are astonishingly convenient in the practical application of the Ritz method. Perhaps uniquely so.
We recall first the source of the matrix $K$. The functional $I(v)$ to be minimized has as leading term a quadratic expression $a(v, v)$, representing in many cases the strain energy (or strictly speaking, twice the strain energy). The Ritz method restricts $v$ to a subspace $S$ by introducing as trial functions a finite-dimensional family $v = \sum q_j\varphi_j$. (The superscript $h$ will be dropped in this section, while the Ritz method is applied to a fixed choice of elements.) The substitution of the trial functions $v$ into the strain energy yields a quadratic in the coordinates $q_j$, describing the energy on the subspace $S$:

$$a(v, v) = q^TKq. \tag{62}$$

The entries of $K$ are the energy inner products $K_{jk} = a(\varphi_j, \varphi_k)$.
In practice these inner products are calculated only indirectly. The basic computation is the evaluation of the energy integral $a(v, v)$ over each element $e$, in other words, over the separate subdomains into which $\Omega$ is partitioned. Each such fraction of the overall energy takes the form

$$a_e(v, v) = q_e^Tk_eq_e \tag{63}$$

in analogy with (62). The vector $q_e$ now contains only those parameters $q_j$ which actually contribute to the energy in the particular region $e$. (Of course, the full set of coordinates $q_j$ could be included, at the cost of inserting a host of zeros into the element stiffness matrix $k_e$. It is much neater to suppress those functions $\varphi_j$ which vanish in the particular region $e$, and to maintain the order of $k_e$ equal to the number of degrees of freedom $d$ of the polynomial function within $e$.) For linear functions $v = a_1 + a_2x + a_3y$ on triangles, for example, the matrix $k_e$ will be of order $d = 3$.† With the usual basis, the three components of $q_e$ will be the values of $v$ at the vertices of $e$. For quadratics $d = 6$, and for the quintic element with 18 degrees of freedom,

$$q_e = (v^1, v_x^1, v_y^1, v_{xx}^1, v_{xy}^1, v_{yy}^1, v^2, v_x^2, \ldots, v_{yy}^3).$$

Here, for example, $v_x^2$ is the $x$-derivative at the second vertex of the triangle; it equals the weight $q_j$ attached to the basis function $\varphi_j$ whose $x$-derivative at this vertex equals one and whose other nodal parameters equal zero.
†The nine entries of the element stiffness matrix for Laplace's equation come directly from the dot products of the sides of the triangle: $k_{ij} = s_i \cdot s_j/2A$, $A$ = area of the element. Every off-diagonal $k_{ij} \le 0$ unless the triangle is obtuse.

In the following we shall use these quintic functions on triangles as the basic example, since they illustrate most of the difficulties that can arise.

There are now two problems, first to compute the element matrices $k_e$, and second to assemble them into the overall strain energy

$$a(v, v) = q^TKq = \sum_e q_e^Tk_eq_e.$$

The latter is a question of efficient bookkeeping and depends in part on the relative size of the computer and the problem. For problems too large to be handled in core, one possible mode of organization is the frontal method, in which the elements (subregions) are ordered rather than the unknowns. The matrices for each element in turn are assembled, and as soon as every element containing some particular unknown $Q_n$ has been accounted for, elimination is carried out in the corresponding row of $K$ and the results are stored. In this way, the only unknowns in core at a given moment are those belonging to several elements, some of which have already been assembled and some of which have not.
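The bookkeeping in the sum over elements amounts to adding each $k_e$ into the rows and columns of $K$ that carry that element's unknowns. The following is a minimal in-core sketch, not the frontal method: it assumes linear functions on triangles and the Dirichlet integral of $v_x^2 + v_y^2$ as the energy, and the names `element_matrix` and `assemble` are ours.

```python
import numpy as np

def element_matrix(xy):
    """k_e for the Dirichlet integral of a linear function on one triangle.

    xy is a 3x2 array of vertex coordinates; the gradients of the three
    nodal basis functions are constant on the element.
    """
    G = np.column_stack([np.ones(3), xy])   # rows (1, x_i, y_i)
    area = 0.5 * abs(np.linalg.det(G))
    grads = np.linalg.inv(G)[1:, :]         # column i holds the gradient of basis function i
    return area * grads.T @ grads

def assemble(nodes, triangles):
    """Add each k_e into the rows and columns of K named by the element's unknowns."""
    K = np.zeros((len(nodes), len(nodes)))
    for tri in triangles:
        K[np.ix_(tri, tri)] += element_matrix(nodes[tri])
    return K

# Unit square cut into two triangles by the diagonal of slope +1.
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
triangles = np.array([[0, 1, 2], [0, 2, 3]])
K = assemble(nodes, triangles)

q = nodes[:, 0]          # nodal values of the trial function v = x
energy = q @ K @ q       # should reproduce the integral of |grad x|^2 over the square
print(energy)
```

The quadratic form $q^TKq$ evaluated on the nodal values of $v = x$ returns the area of the square, as the energy integral requires.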
In this section we shall concentrate on the computation of the element matrices, extending to two dimensions the technique used in Section 1.7 for the Hermite cubic calculations. At the end we comment on the use of numerical integration.

The essential point is that with a polynomial basis $\varphi_j$ and polygonal (not curved) elements, the energy is a weighted sum of integrals of the form

$$P_{rs} = \iint_e x^ry^s\,dx\,dy.$$

These integrals will depend on the location of the element $e$ and on constants which can be tabulated. Therefore, the problem is to find a convenient coordinate system in which to describe the geometry of $e$, and to relate the nodal parameters $q_e$ to the coefficients in the polynomial $v$. There is general agreement that the "global" $x$-$y$ coordinates are inappropriate, but there seems to be much less agreement on which local system is the best. We shall therefore describe two or three of the possibilities.
The first is to translate the coordinates so that the origin lies at the centroid $(x_0, y_0)$ of the triangle $e$. Bell [8] refers to this as the local-global system. Since the transformation is linear, a quintic polynomial in $x$, $y$ remains a quintic in the new variables $X = x - x_0$, $Y = y - y_0$:

$$v = a_1 + a_2X + a_3Y + a_4X^2 + \cdots + a_{21}Y^5.$$

It is easy to find the nodal parameters of $q_e^{18}$ in terms of these coefficients $a_i$: If the new coordinates of the vertices are $(X_i, Y_i)$, then

$$
\begin{aligned}
v^1 &= a_1 + a_2X_1 + a_3Y_1 + a_4X_1^2 + \cdots + a_{21}Y_1^5, \\
v_x^1 &= a_2 + 2a_4X_1 + \cdots,
\end{aligned}
\tag{64}
$$

and so on. For the 21-degree element the three midedge slopes have also to be computed; the nodal parameters for this element are $q_e^{21} = (q_e^{18}, v_n^1, v_n^2, v_n^3)$. Their relationship with the coefficient vector $A = (a_1, \ldots, a_{21})$ can be written in matrix form as

$$q_e^{21} = GA, \tag{65}$$

where the first 18 rows of $G$ are given by (64); the last three rows involve not only the coordinates $X_i$, $Y_i$ of the vertices but also the orientation of the triangle. Inverting (65), we have $A = G^{-1}q_e^{21} = Hq_e^{21}$.
For the 18-degree element, the last three rows of $G$ are changed to homogeneous constraints on the coefficients $a_i$, in order to ensure that the normal slope $v_n$ is a cubic along each edge. This alters the relationship to

$$\begin{pmatrix} q_e^{18} \\ 0 \end{pmatrix} = G_0A.$$

Inverting, the matrix $H$ which connects $q_e$ to $A$ this time has 21 rows and 18 columns:

$$A = Hq_e^{18}.$$

For any element, the first step is to compute this connection matrix $H$ between the nodal parameters $q$ and the polynomial coefficients $a$.

Now we calculate the energy integral in terms of the coefficients $a_i$. For the plate-bending example in Section 1.8, this integral is

$$a_e(v, v) = \iint_e \left(v_{XX}^2 + v_{YY}^2 + 2\nu v_{XX}v_{YY} + 2(1 - \nu)v_{XY}^2\right) dX\,dY.$$

Substituting the polynomial $v$, this is easily put in the form

$$a_e(v, v) = A^TNA, \tag{66}$$

where the matrix $N$ requires the evaluation of the integrals $P_{rs}$. Once these are computed, the calculation of the stiffness matrix is complete. From (66), with $A = Hq_e$,

$$a_e(v, v) = q_e^TH^TNHq_e,$$

so that finally

$$k_e = H^TNH. \tag{67}$$

Thus the element stiffness matrix involves two separate computations, the connection matrix $H$ between the nodal parameters and the polynomial coefficients, and a matrix $N$ giving the integrals of polynomials.
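The factorization into $H$ and $N$ can be tried on the smallest possible case. The sketch below replaces the quintic plate element by an assumed stand-in, the linear function $v = a_1 + a_2X + a_3Y$ with energy $\iint (v_X^2 + v_Y^2)$, so that $G$, $H$, and $N$ are 3-by-3; the function name `ke_via_HNH` is ours.

```python
import numpy as np

def ke_via_HNH(XY):
    """k_e = H^T N H for v = a1 + a2 X + a3 Y and the integral of v_X^2 + v_Y^2."""
    G = np.column_stack([np.ones(3), XY])   # q_e = G A: row i evaluates v at vertex i
    H = np.linalg.inv(G)                    # A = H q_e
    area = 0.5 * abs(np.linalg.det(G))
    # a_e = (a2^2 + a3^2) * P_00, and P_00 is the area of the element.
    N = area * np.diag([0.0, 1.0, 1.0])
    return H.T @ N @ H

XY = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
XY = XY - XY.mean(axis=0)                   # origin at the centroid, as in the local-global system
ke = ke_via_HNH(XY)

q = np.array([1.0, 3.0, 0.0])               # arbitrary nodal values
A = np.linalg.solve(np.column_stack([np.ones(3), XY]), q)
energy_from_ke = q @ ke @ q
energy_from_A = (A[1] ** 2 + A[2] ** 2) * 1.0   # the area of this triangle is 1
print(energy_from_ke, energy_from_A)
```

For the quintic element the same two steps apply, with $N$ built from the higher integrals $P_{rs}$ and $H$ of size 21 by 21 or 21 by 18.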
A different set of local coordinates was recommended by Cowper, Kosko, Lindberg, and Olson [C14]. In their system the geometrical quantities $a$, $b$, $c$, and $\theta$ in Fig. 1.14 become paramount; they are easily computed from the coordinates $(x_i, y_i)$ of the vertices. The trial functions will still be quintic polynomials in the rotated coordinates $\xi$ and $\eta$. Therefore, the computation of a new matrix $N$, giving the integrals of polynomials over the triangle, depends on the existence of a convenient formula:

$$\iint_e \xi^r\eta^s\,d\xi\,d\eta = c^{s+1}\left(a^{r+1} - (-b)^{r+1}\right)\frac{r!\,s!}{(r + s + 2)!}.$$

Fig. 1.14 Another choice of local coordinates.

We also need a new connection matrix $H$, and this is computed in two steps. First, entirely in the $\xi$-$\eta$ variables, $H'$ connects the nodal parameters to the polynomial coefficients. Then there is a rotation matrix $R$ to link the local and the global coordinates, and $H = H'R$. The details are clearly explained in [C14] and leave the impression that a cubic variation of the normal slope is simplest to impose in this system.
The computation of $P_{rs} = \iint X^r Y^s \, dX\, dY$ can be carried out without any rotations by introducing area coordinates. These are the most natural parameters and are also known to engineers as triangular coordinates and to mathematicians as barycentric coordinates. In this system each point has actually three coordinates, whose sum is always $\zeta_1 + \zeta_2 + \zeta_3 = 1$. The vertices of e become the points (1, 0, 0), (0, 1, 0), and (0, 0, 1). If the same vertices have rectangular coordinates $(X_1, Y_1)$, $(X_2, Y_2)$, and $(X_3, Y_3)$, then by linearity the coordinates (X, Y) and $(\zeta_1, \zeta_2, \zeta_3)$ of an arbitrary point P are related by
\[ \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix} = \begin{pmatrix} X_1 & X_2 & X_3 \\ Y_1 & Y_2 & Y_3 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} \zeta_1 \\ \zeta_2 \\ \zeta_3 \end{pmatrix} = B\zeta. \tag{68} \]
Fig. 1.15 Area coordinates for a triangle. (The figure records that $\zeta_1$ = length of PQ / length of AQ = area of BPC / area of BAC, that $\zeta_1 + \zeta_2 + \zeta_3$ = area of (BPC + CPA + APB) / area of BAC = 1, and that the vertex B is the point (0, 1, 0).)
Geometrically, $\zeta_1$ is the fraction $A_1/A$ of the area in Fig. 1.15 and also the fraction $z_1/h_1$ of the distance to the opposite side. In area coordinates, according to (68), the integrals $P_{rs}$ become
\[ P_{rs} = \iint (X_1\zeta_1 + X_2\zeta_2 + X_3\zeta_3)^r (Y_1\zeta_1 + Y_2\zeta_2 + Y_3\zeta_3)^s \, dX\, dY, \]
and we need only the formulas for integration in area coordinates (Holand and Bell [8], p. 84):
\[ \iint \zeta_1^m \zeta_2^n \zeta_3^p \, dX\, dY = \frac{m!\, n!\, p!}{(m + n + p + 2)!} \det B. \]
This takes into account the Jacobian of the coordinate change; with m = n = p = 0, the coefficient 2! appears in the denominator, and the determinant of the matrix B in (68) is recognized as twice the area A of the triangular element.
Holand and Bell [8] have given explicit formulas for the integrals $P_{rs}$, keeping the origin of the X-Y system at the centroid of the triangle. It is remarkable that for r + s < 6, these integrals are all of the simple form
\[ P_{rs} = c_{r+s}\, A\, (X_1^r Y_1^s + X_2^r Y_2^s + X_3^r Y_3^s). \]
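The integration rule in area coordinates, $\iint \zeta_1^m \zeta_2^n \zeta_3^p \, dX\, dY = m!\,n!\,p!\,\det B/(m+n+p+2)!$, lends itself to an exact check in rational arithmetic. The sketch below expands $X^r Y^s$ in area coordinates, integrates term by term, and compares the result with the centroid form $c_{r+s} A \sum_i X_i^r Y_i^s$ using the constants $c_1 = 0$, $c_2 = 1/12$, $c_3 = c_4 = 1/30$, $c_5 = 2/105$. The triangle, the counterclockwise vertex ordering, and every function name are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial

def det_b(verts):
    """Determinant of B = [[X1, X2, X3], [Y1, Y2, Y3], [1, 1, 1]].

    Equals twice the area when the vertices are listed counterclockwise.
    """
    (x1, y1), (x2, y2), (x3, y3) = verts
    return Fraction(x1 * (y2 - y3) - x2 * (y1 - y3) + x3 * (y1 - y2))

def area_coordinate_integral(m, n, p, det):
    """Integral of zeta1^m zeta2^n zeta3^p over the triangle."""
    return Fraction(factorial(m) * factorial(n) * factorial(p),
                    factorial(m + n + p + 2)) * det

def trinomial_terms(power, coeffs):
    """Terms of (c1*zeta1 + c2*zeta2 + c3*zeta3)**power as ((i, j, k), value)."""
    for i in range(power + 1):
        for j in range(power - i + 1):
            k = power - i - j
            mult = factorial(power) // (factorial(i) * factorial(j) * factorial(k))
            yield (i, j, k), mult * coeffs[0] ** i * coeffs[1] ** j * coeffs[2] ** k

def monomial_integral(r, s, verts):
    """P_rs = integral of X^r Y^s over the triangle, via area coordinates."""
    det = det_b(verts)
    xs = [Fraction(v[0]) for v in verts]
    ys = [Fraction(v[1]) for v in verts]
    total = Fraction(0)
    for a, ca in trinomial_terms(r, xs):
        for b, cb in trinomial_terms(s, ys):
            m, n, p = a[0] + b[0], a[1] + b[1], a[2] + b[2]
            total += ca * cb * area_coordinate_integral(m, n, p, det)
    return total

# Hypothetical triangle whose centroid is the origin (vertex sums are zero).
VERTS = [(2, -1), (-1, 2), (-1, -1)]
C = {1: Fraction(0), 2: Fraction(1, 12), 3: Fraction(1, 30),
     4: Fraction(1, 30), 5: Fraction(2, 105)}

def centroid_formula(r, s, verts):
    """c_{r+s} * A * sum_i X_i^r Y_i^s, intended for 1 <= r + s < 6."""
    area = det_b(verts) / 2
    return C[r + s] * area * sum(Fraction(x) ** r * Fraction(y) ** s
                                 for x, y in verts)
```

Comparing `monomial_integral(r, s, VERTS)` with `centroid_formula(r, s, VERTS)` for every pair with 1 <= r + s < 6 exercises all the tabulated constants at once.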
The constants are $c_1 = 0$, $c_2 = 1/12$, $c_3 = c_4 = 1/30$, $c_5 = 2/105$. From these formulas Bell derives the stiffness matrix $k_e$ for the complete quintic applied to the plate-bending problem.
We mention that another local coordinate system could be based entirely on the natural coordinates $\zeta_i$. In this case the parameters $q_e$, which involve nodal derivatives like $v_{xy}$, must be converted into derivatives with respect to the $\zeta_i$. It is our impression that algebraic manipulations in this system take a considerable amount of time (and practice!), although Argyris and his coworkers did succeed with this system in forming $H = G^{-1}$ analytically. This natural coordinate system really comes into its own with numerical quadrature.
Next we turn to the computation of the mass matrix M. This is the matrix which arises from an undifferentiated term $\iint v^2 \, dx\, dy$ in the energy. In other words, writing $v = \sum q_j \varphi_j$, M is the Gram matrix† formed from the inner products of the basis functions $\varphi_j$:
\[ M_{jk} = (\varphi_k, \varphi_j) = \iint_\Omega \varphi_k \varphi_j \, dx\, dy. \]
This matrix plays a central part in eigenvalue calculations.
Just as with K, the mass matrix is assembled from the separate elements:
\[ \iint_\Omega v^2 = q^T M q = \sum_e q_e^T m_e q_e = \sum_e \iint_e v^2. \tag{69} \]
The element mass matrix, by our previous computations, can be written as
\[ m_e = H^T Z H. \]
Here H is the same matrix connecting the parameters $q_e$ to the coefficients of the polynomial
\[ v = \sum_{i=1}^{d} a_i X^{m_i} Y^{n_i}. \]
Z comes directly from the inner products of these monomials:
\[ Z_{jk} = \iint_e X^{m_j + m_k} Y^{n_j + n_k} \, dX\, dY = P_{m_j + m_k,\, n_j + n_k}. \]
These numbers are tabulated, and any of the local coordinate systems should be acceptable. Thus Z plays the role for mass matrices which N played for stiffness matrices. In both cases we have tacitly assumed constant coefficients over the elements; if there are variable material properties which must be taken into account, then numerical integration is recommended.
†We have tried to think of a pun on the coincidence of mass matrix and Gram matrix, but the editor says it will have to wait.
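The role of Z as a table of monomial inner products can be seen in a one-dimensional analogue. The sketch below assumes a linear element on [0, h] with monomials 1 and x, so that $Z_{jk} = \int_0^h x^{j+k}\,dx$, and forms $H^T Z H$; the element and all names are illustrative assumptions rather than the plate element of the text.

```python
from fractions import Fraction

def linear_element_mass(h):
    """Form m_e = H^T Z H for v = a0 + a1*x on [0, h] (illustrative 1D case)."""
    h = Fraction(h)
    # H maps the nodal values (v(0), v(h)) to the coefficients (a0, a1).
    H = [[Fraction(1), Fraction(0)],
         [-1 / h, 1 / h]]
    # Z[j][k] = integral over [0, h] of x^(j+k) dx = h^(j+k+1) / (j+k+1).
    Z = [[h ** (j + k + 1) / (j + k + 1) for k in range(2)] for j in range(2)]
    ZH = [[sum(Z[i][k] * H[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    # Multiply by H^T on the left: (H^T)[i][k] = H[k][i].
    return [[sum(H[k][i] * ZH[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

m_e = linear_element_mass(1)
```

For this assumed element the result is (h/6)[[2, 1], [1, 2]], the Gram matrix of the two roof functions restricted to one interval.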
Finally, there is the calculation of the load vector F, which is the inhomogeneous term in the finite element equations KQ = F. This, too, is done an element at a time:
\[ \iint_\Omega f v = q^T F = \sum_e q_e^T F_e = \sum_e \iint_e f v. \tag{70} \]
Again the calculation starts with $A = Hq_e$, connecting the nodal parameters of the element to the coefficients $a_i$ in the polynomial. In terms of these coefficients,
\[ \iint_e f v = \iint_e f \sum_i a_i X^{m_i} Y^{n_i} \, dX\, dY = A^T \sigma, \tag{71} \]
where the components of the d-dimensional vector $\sigma$ are just
\[ \sigma_i = \iint_e f X^{m_i} Y^{n_i} \, dX\, dY. \]
Matching (71) with (70), the required element load vector is
\[ F_e = H^T \sigma. \]
Two special cases in the calculation of the $\sigma_i$ are pointed out in [C14]. The first is that of a uniform load $f(x, y) = f_0$, in which case the tabulated integrals appear again:
\[ \sigma_i = f_0 P_{m_i n_i}. \]
The second is that of a point load $f(x, y) = f_0 \delta$, concentrated at the point $(X_0, Y_0)$. If this point is outside e, then of course $F_e = 0$; if it is inside, then
\[ \sigma_i = f_0 X_0^{m_i} Y_0^{n_i}. \]
Again these calculations can be done in any local system.
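Both special cases can be tried on a one-dimensional analogue. The sketch below assumes a linear element on [0, h] with monomials 1 and x, writes `sigma` for the vector of load integrals against those monomials, and applies the transpose of the connection matrix H; the element and all names are illustrative assumptions.

```python
from fractions import Fraction

def apply_h_transpose(h, sigma):
    """Return H^T sigma, where H maps (v(0), v(h)) to the coefficients (a0, a1)."""
    h = Fraction(h)
    H = [[Fraction(1), Fraction(0)],
         [-1 / h, 1 / h]]
    return [sum(H[k][i] * sigma[k] for k in range(2)) for i in range(2)]

def uniform_load_vector(h, f0):
    """Uniform load f = f0: sigma_i = f0 * integral over [0, h] of x^i dx."""
    h = Fraction(h)
    sigma = [f0 * h, f0 * h ** 2 / 2]
    return apply_h_transpose(h, sigma)

def point_load_vector(h, f0, x0):
    """Point load f0 at x0: sigma_i = f0 * x0^i inside the element, zero outside."""
    h, x0 = Fraction(h), Fraction(x0)
    if not 0 <= x0 <= h:
        return [Fraction(0), Fraction(0)]
    sigma = [f0 * x0 ** 0, f0 * x0]
    return apply_h_transpose(h, sigma)
```

In this assumed setting the uniform load is shared equally between the two nodes, and the point load is split in proportion to the values of the two roof functions at $x_0$.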

For a more general loading f, the integrals $\sigma_i$ may be computed either by a quadrature formula over each element, or else by interpolating f by an element which lies in the subspace $S^h$. In the latter case we compute the nodal parameters $f_j$ (the values of f and its derivatives at the nodes) and construct the interpolating element
\[ f_I = \sum_j f_j \varphi_j. \tag{72} \]
The change from f to $f_I$ gives a perturbed load vector $\tilde F$ with components
\[ \tilde F_k = \iint f_I \varphi_k = \sum_j f_j \iint \varphi_j \varphi_k, \quad \text{or} \quad \tilde F = M f'. \]
M is the mass matrix described above and f' is the vector formed from the nodal parameters $f_j$. This is a comparatively easy computation.
So far we have considered only analytical computation of the element matrices, based on exact integration formulas for polynomials over polygons. In more and more cases, the element matrix computations are being carried out approximately, by some form of Gaussian quadrature. For curved elements, arising from shell problems or from curved boundaries in planar problems, this numerical quadrature is virtually a necessity. In this case there will be a perturbation not only from F to $\tilde F$ but also from K to $\tilde K$. This means that the corresponding $\tilde Q$ is the finite element solution for a perturbed problem. In a later chapter we shall estimate the error $Q - \tilde Q$; it must depend on the accuracy of the quadrature formula.
We regret to report that these inexact numerical integrations have even been shown in some cases to improve the quality of the solution. This is one instance (nonconforming elements are another) in which computational experiments yield results which are frustrating to the mathematical analyst but nevertheless numerically valid and important. The improvement for finite h is due partly to the following effect: The strict Ritz procedure always corresponds to an approximation which is too stiff, and the quadrature error reduces this excess stiffness.
Stiffness is an intrinsic property of the Ritz method. By limiting the displacements v to a finite number of modes $\varphi_1, \ldots, \varphi_N$, instead of allowing all admissible functions, the numerical structure is more constrained than the real one. In eigenvalue problems, the result of this constraint is that $\lambda_1^h$ always lies above the true $\lambda_1$. In static problems, the potential energy $I(u^h)$ exceeds $I(u)$, since $u^h$ is obtained by minimization over the subspace spanned by $\varphi_1, \ldots, \varphi_N$. This overestimate of I corresponds to an underestimate of the strain energy a, as proved in the corollary to the fundamental theorem 1.1:
\[ a(u^h, u^h) \le a(u, u). \tag{73} \]
In the special case of a point load $f = \delta(x_0)$, where the displacement $u(x_0)$ is proportional to the strain energy, this displacement is also underestimated by the Ritz method; the overstiff numerical structure has less "give" under a point load than the true structure. For distributed loads, the tendency is the same: The finite element displacement $u^h(x)$ is generally below the true displacement $u(x)$. This is not a rigorous theorem, because the Ritz method minimizes energy, and its link to displacement is not strictly monotone.
In other words, $u^h$ may exceed u in some part of the structure and still have smaller derivatives in the mean-square sense. Nevertheless, one-sided estimates of both displacement and slope are common with finite elements. This can be changed either by a fundamental alteration of the Ritz process (the stress, mixed, and hybrid methods are described in the next chapter) or by intentionally committing numerical errors in the right direction.
The latter effect is the one produced by Gaussian quadrature. Algebraically, this effect appears as a decrease in the positive definiteness of K; the approximate $\tilde K$ satisfies
\[ q^T \tilde K q \le q^T K q \quad \text{for all } q. \tag{74} \]
It may seem paradoxical that this leads to an increase in the strain energy of the Ritz solution, but the paradox is easily explained. The new solution is $\tilde Q = \tilde K^{-1} F$. Its strain energy is $\tilde Q^T \tilde K \tilde Q = F^T \tilde K^{-1} F$, whereas the old strain energy was $Q^T K Q = F^T K^{-1} F$. The inequality (74) on the stiffness matrices is equivalent (after some argument) to the opposite inequality on their inverses,
\[ F^T \tilde K^{-1} F \ge F^T K^{-1} F \quad \text{for all } F, \]
and it follows that the strain energy is increased.
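The reversal of the inequality under inversion is easy to observe on a small example. The sketch below assumes a 2 by 2 stiffness matrix K and a reduced matrix obtained by subtracting a positive semidefinite diagonal perturbation; both matrices and all names are invented for illustration and do not come from any particular element.

```python
from fractions import Fraction

def inverse_2x2(k):
    """Invert a 2 x 2 matrix given as a list of two rows."""
    (a, b), (c, d) = k
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def strain_energy(k, f):
    """Return F^T K^{-1} F, the strain energy Q^T K Q of the solution Q = K^{-1} F."""
    kinv = inverse_2x2(k)
    q = [kinv[0][0] * f[0] + kinv[0][1] * f[1],
         kinv[1][0] * f[0] + kinv[1][1] * f[1]]
    return f[0] * q[0] + f[1] * q[1]

# Hypothetical stiffness matrix and a reduced version K_tilde = K - E,
# where E = diag(1/2, 1/4) is positive semidefinite and K_tilde stays definite.
K = [[Fraction(2), Fraction(-1)], [Fraction(-1), Fraction(2)]]
K_TILDE = [[Fraction(3, 2), Fraction(-1)], [Fraction(-1), Fraction(7, 4)]]

LOADS = [(1, 0), (0, 1), (1, 1), (1, -1), (3, 2)]
energies = [(strain_energy(K, f), strain_energy(K_TILDE, f)) for f in LOADS]
```

The loads (1, 0) and (0, 1) are unit point loads at the two nodes, so the corresponding energies are the diagonal entries of the inverses; in this example they grow from 2/3 to 14/13 and 12/13.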
In case of a point load p at the jth node, this leads as in the continuous case to a similar conclusion about the displacement. The only nonzero component of the load vector F is $F_j = p$, and the displacement at the jth node is $Q_j = (K^{-1}F)_j = p(K^{-1})_{jj}$. If K is decreased to $\tilde K$, this displacement is increased to $p(\tilde K^{-1})_{jj}$.
It remains to see why Gaussian quadrature tends to have this desirable effect of reducing K. Obviously K will be reduced if every element matrix $k_e$ is reduced. A rigorous proof of the latter effect was found by Irons and Razzaque [16] in the one-dimensional case, by imagining that $(v^h)'$ is expanded in terms of Legendre polynomials. The strain energy over [-1, 1] is
\[ q^T k_e q = \int_{-1}^{1} ((v^h)')^2 = \int_{-1}^{1} (\alpha_0 + \alpha_1 P_1(x) + \cdots + \alpha_n P_n(x))^2 = 2\Bigl(\alpha_0^2 + \frac{\alpha_1^2}{3} + \cdots + \frac{\alpha_n^2}{2n + 1}\Bigr). \]
The effect of n-point Gaussian quadrature is to preserve all the terms through $\alpha_{n-1}^2$, since they come from polynomials of degree less than 2n, and to annihilate the term in $\alpha_n^2$, since $P_n$ vanishes at the Gauss points. (This is how they are defined.) Therefore in this special case the integral is clearly reduced, and the same tendency reappears in more general problems.
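The one-dimensional argument can be replayed numerically. The sketch below assumes the standard 2-point and 3-point Gauss-Legendre rules on [-1, 1], evaluates the Legendre polynomials by their three-term recurrence, and compares the exact energy $2\sum_k \alpha_k^2/(2k+1)$ with its quadrature value; the coefficients used with it are arbitrary, and the function names are illustrative.

```python
import math

# Standard Gauss-Legendre nodes and weights on [-1, 1] for n = 2 and n = 3.
GAUSS_RULES = {
    2: [(-1 / math.sqrt(3), 1.0), (1 / math.sqrt(3), 1.0)],
    3: [(-math.sqrt(0.6), 5 / 9), (0.0, 8 / 9), (math.sqrt(0.6), 5 / 9)],
}

def legendre(k, x):
    """Evaluate P_k(x) by the three-term recurrence."""
    if k == 0:
        return 1.0
    p_prev, p = 1.0, x
    for j in range(1, k):
        p_prev, p = p, ((2 * j + 1) * x * p - j * p_prev) / (j + 1)
    return p

def exact_energy(alpha):
    """Exact integral over [-1, 1] of (sum_k alpha_k P_k)^2."""
    return sum(2 * a * a / (2 * k + 1) for k, a in enumerate(alpha))

def gauss_energy(alpha, n):
    """The same integral evaluated by n-point Gaussian quadrature."""
    total = 0.0
    for x, w in GAUSS_RULES[n]:
        slope = sum(a * legendre(k, x) for k, a in enumerate(alpha))
        total += w * slope * slope
    return total
```

With coefficients $\alpha_0, \ldots, \alpha_n$ and the n-point rule, the quadrature value falls short of the exact energy by exactly $2\alpha_n^2/(2n+1)$, the term that the rule annihilates.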
The least stiffness matrix criterion is a valuable basis for the comparison of different elements. Mathematically, it must reflect the approximation qualities of these elements, and in particular the numerical constants which enter approximation of polynomials one degree higher than those reproduced exactly by the element. Computationally, it is a clearly visible effect, and a number of theoretically useful elements have been discarded as too stiff. The criterion raises the possibility of optimizing the stiffness matrix under the constraint of reproducing polynomials of degree k - 1.
2 A SUMMARY OF THE THEORY
2.1. BASIS FUNCTIONS FOR THE FINITE ELEMENT SUBSPACES Sh
In this chapter we collect in one place some of the main results of analyzing the finite element method. The goal is to describe a general framework for the analysis, in which each of the various error estimates has its place. Then succeeding chapters will take up these estimates one by one, in more detail.
The first step is to decide which subspaces $S^h$ to study. Therefore, we look at the examples already given in Chapter 1 and try to draw out the properties which are mathematically essential.
One general description which fits these examples is the following, which we call the nodal finite element method. It will be the foundation for our whole analysis. Every trial function $v^h$ is determined by its nodal parameters, which are the unknowns $q_j$ of the discrete problem. Each of these nodal parameters is the value, at a given node $z_j$, of either the function itself or one of its derivatives. Thus the unknowns are
\[ q_j = D_j v^h(z_j), \]
where the differential operator $D_j$ is of order zero ($D_j v^h = v^h$) in case the parameter is just the function value, and otherwise it may be $\partial/\partial x$, $\partial/\partial y$, $\partial/\partial n$, $\partial^2/\partial x\,\partial y$, and so on.
To each of these nodal parameters $q_j$ we associate a trial function $\varphi_j$, determined by the following property: At $z_j$ the value of $D_j\varphi_j$ is 1, whereas all other nodal parameters of $\varphi_j$ are zero. Note that it is possible for the node $z_j$ to be shared by several parameters, in other words to be a multiple node; it is not $z_j$ but the pair $(z_j, D_j)$ which uniquely defines the parameter $q_j$. Thus the key property is that each $\varphi_j$ satisfies $D_j\varphi_j(z_j) = 1$ but gives zero when matched with a different pair $(z_i, D_i)$:
\[ D_i \varphi_j(z_i) = \delta_{ij}. \tag{1} \]
These functions $\varphi_j$ form an interpolating basis for the trial space, since every trial function can be expanded as
\[ v^h = \sum_j q_j \varphi_j. \]
By applying $D_i$ to both sides and evaluating at $z_i$, the parameters $q_i$ are exactly what they should be:
\[ D_i v^h(z_i) = \sum_j q_j D_i \varphi_j(z_i) = q_i. \]
The whole object of the Ritz method is to find the optimum values $Q_j$ for these parameters, by minimizing $I(v^h)$. Then the particular trial function which possesses these optimal parameters is the finite element approximation
\[ u^h = \sum_j Q_j \varphi_j. \]
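The interpolating property, that applying $D_i$ to $\varphi_j$ at $z_i$ gives 1 when i = j and 0 otherwise, can be checked directly for the Hermite cubics on a single interval. The sketch below assumes the reference interval [0, 1], stores each cubic by its four coefficients, and treats each nodal parameter as a pair (node, derivative order); the data layout and names are illustrative assumptions.

```python
# Coefficients (c0, c1, c2, c3) of the cubics c0 + c1*x + c2*x^2 + c3*x^3 on [0, 1].
HERMITE_BASIS = [
    (1, 0, -3, 2),   # unit value at z = 0
    (0, 1, -2, 1),   # unit slope at z = 0
    (0, 0, 3, -2),   # unit value at z = 1
    (0, 0, -1, 1),   # unit slope at z = 1
]
# Each nodal parameter is a pair (z, order): order 0 means D = I, order 1 means D = d/dx.
PARAMETERS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def apply_operator(order, coeffs, z):
    """Evaluate the cubic (order 0) or its derivative (order 1) at z."""
    c0, c1, c2, c3 = coeffs
    if order == 0:
        return c0 + c1 * z + c2 * z ** 2 + c3 * z ** 3
    return c1 + 2 * c2 * z + 3 * c3 * z ** 2

def nodal_matrix():
    """Matrix with entries D_i phi_j(z_i); the identity for an interpolating basis."""
    return [[apply_operator(order, phi, z) for phi in HERMITE_BASIS]
            for (z, order) in PARAMETERS]

def nodal_parameters(q):
    """Recover q_i = D_i v(z_i) from the combination v = sum_j q_j phi_j."""
    combo = [sum(qj * phi[m] for qj, phi in zip(q, HERMITE_BASIS))
             for m in range(4)]
    return [apply_operator(order, combo, z) for (z, order) in PARAMETERS]
```

The node z = 0 appears in two pairs, (0, I) and (0, d/dx), so it is a double node in the sense used above.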
Let us fit two or three examples into this general framework. For the basic piecewise linear functions on triangles (Turner triangles, or Courant triangles) the nodes $z_j$ are the vertices in the triangulation, and all derivatives $D_j$ are of order zero, $D_j v^h = v^h$. The unknowns are $q_j = v^h(z_j)$, and the basis is formed from the pyramid functions determined by $\varphi_j(z_i) = \delta_{ij}$. The same is true for bilinear functions on quadrilaterals and for quadratics on triangles, this time with midedge nodes included in the $z_j$. For the Hermite cubics in one dimension, the derivatives do enter: Every node $z_j$ appears in both pairs $(z_j, I)$ and $(z_j, d/dx)$. We distinguished the two kinds of basis functions $\varphi_j$ by $\psi_j$ and $\omega_j$. For Hermite bicubics there are four parameters per node, corresponding to $v$, $v_x$, $v_y$, and $v_{xy}$, that is, to $D = I$, $\partial/\partial x$, $\partial/\partial y$, and $\partial^2/\partial x\,\partial y$. For the cubic space $Z_3$ on triangles, the vertex nodes are triple and the centroids are only simple.
To complete the description of the nodal finite element method, we say something about the geometry. The domain $\Omega$ is partitioned into a union of closed subdomains, or elements, overlapping only at the interelement boundaries. Each element e contains no more than d of the nodal points $z_j$, and all the basis functions $\varphi_j$, except those which correspond to these d nodes, are zero throughout e. Thus $\varphi_j$ is also a local basis. This framework seems to be sufficiently general to include most of the finite element spaces in current use. We note, however, that cubic splines are not included, since their basis functions (the B-splines) are nonzero over several elements.
We call attention to one additional property of the $\varphi_j$ which can be put to use in the analysis. This property emerges only when the elements are formed in a geometrically regular pattern, where the domain can be thought of as covered by a regular mesh of width h, and the nodes fall in the same positions in each mesh square. The property of the basis is then one of translation invariance: If a basis function $\varphi$ is associated with a pair (z, D), and this node z is translated to lie at the new point $z^* = z + lh$, then the basis function $\varphi^*$ associated with $(z^*, D)$ is just the translate of $\varphi$:
\[ \varphi^*(x) = \varphi(x - lh). \tag{2} \]
In n dimensions, $l = (l_1, \ldots, l_n)$ is a vector of integers, that is, a multiinteger. Clearly $D\varphi^*(z + lh) = D\varphi(z) = 1$; the interpolating function $\varphi$ is just shifted to give a function $\varphi^*$ interpolating at $z^*$. Of course, this pattern of translation will have to break down at the boundary of $\Omega$ unless the boundary conditions happen to be periodic; in that case we can imagine that the basis has a completely periodic pattern.
Let us take as an example the space of continuous piecewise quadratic functions which was described in Section 1.7. The nodes fall on a regular triangular mesh, as indicated in Fig. 2.1, where the mesh width is normalized to unity.
Fig. 2.1 Nodes for $C^0$ quadratics on a square mesh.
Note that we can associate with each mesh point a unique set of four nodes; the four associated with the origin are circled and those for the mesh point (1, 0) are crossed. Note also that this number M = 4, which gives the number of basis functions $\varphi_j$ and unknown weights $q_j$ per mesh square, differs from the number d of degrees of freedom; d = 6 in each triangle. M is the coefficient in the last column of the table in Section 1.9, giving the dimension of the space $S^h$. On a unit square, with periodicity, this dimension is $M/h^2$.
We shall denote by $\Phi_1$, $\Phi_2$, $\Phi_3$, and $\Phi_4$ the basis functions which interpolate at the nodes $z_1$ to $z_4$, respectively. They are piecewise quadratic, and vanish at every node except their own; this determines them completely. The basis functions at the four nodes associated with the mesh point $(l_1, l_2)$ are found just by translation as described above; these shifted functions are $\Phi_i(x - l_1, y - l_2)$, i = 1, 2, 3, 4. If the mesh width is reduced to h, this just rescales the independent variables; the four basis functions associated with the mesh point $(l_1 h, l_2 h)$ are $\Phi_i(x/h - l_1, y/h - l_2)$.
This pattern of translation is so useful and important that we shall make it the foundation for a second general description of the finite element method. This description applies on a regular mesh, and we call it the abstract finite element method. In n dimensions it begins with the choice of M functions $\Phi_1(x), \ldots, \Phi_M(x)$. These will eventually lead to M unknowns per mesh cube, and the finite element equations KQ = F will take the form of M coupled finite difference equations.
To form the basis functions associated with one particular grid point, the one at the origin, we simply rescale the variable $x = (x_1, \ldots, x_n)$, obtaining $\Phi_1(x/h), \ldots, \Phi_M(x/h)$. To form the basis functions associated with a different grid point $lh = (l_1, \ldots, l_n)h$, we translate the functions just constructed. Thus to denote all the basis functions constructed in this way, by scaling and translation, we need two subscripts i and l:
\[ \varphi^h_{i,l}(x) = \Phi_i(x/h - l), \quad i = 1, \ldots, M. \]
We must add the requirement that the original functions $\Phi_i(x)$ shall vanish outside some sphere $|x| < R$. Then $\varphi^h_{i,l}$ will vanish outside the local sphere $|x - lh| < Rh$, and we shall again have a local (but possibly not a strictly local) basis.
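The construction by scaling and translation, in which the basis function at the grid point lh is the generating function evaluated at x/h - l, can be sketched in one dimension with a single generating function. The example assumes M = 1 and takes the piecewise linear roof function as the generator, which vanishes outside |x| < 1 (so R = 1); the choice of generator and the function names are illustrative assumptions.

```python
def hat(x):
    """Generating function Phi(x): the piecewise linear roof function on [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def basis(l, h):
    """Return the function x -> Phi(x/h - l), centered at the grid point l*h."""
    return lambda x: hat(x / h - l)

h = 0.25
phi_3 = basis(3, h)           # basis function attached to the grid point 0.75
values = [phi_3(0.5), phi_3(0.75), phi_3(1.0)]
```

The shifted function equals 1 at its own grid point, vanishes at the neighboring grid points, and is zero outside |x - lh| < Rh with R = 1, which is the locality requirement stated above.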
Note that in the abstract finite element method, we do not require the $\Phi_i$ to interpolate at some node $z_i$, and we do not require them to be piecewise polynomials. (We prove, however, that the latter are the most efficient.) The interpolating property will be possessed by every basis which falls also within the scope of the nodal finite element method, but our theory in the abstract case will not use this property. As an example we consider the case of cubic splines in one dimension. In this case M = 1, and a suitable choice for $\Phi_1$ is the B-spline of Fig. 1.9. The space $S^h$ of all combinations $\sum q_l \varphi^h_{1,l}$ is then exactly the spline space of all twice differentiable piecewise cubic functions with joints at $x = 0, \pm h, \pm 2h, \ldots$. It is essential to see that the B-spline $\Phi_1$ is not interpolating; it is nonzero over four intervals instead of the two allowed in the nodal method, and in particular it is nonzero at three nodes instead of one.
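That the cubic B-spline is not interpolating can be read off from its nodal values. The sketch below assumes the common normalization in which the B-spline with joints at the integers is supported on [-2, 2] and its integer translates sum to one; the normalization used in Fig. 1.9 may differ by a constant factor, and the function name is illustrative.

```python
from fractions import Fraction

def cubic_bspline(x):
    """Cubic B-spline with joints at the integers, supported on [-2, 2].

    Normalized (by assumption) so that its integer translates sum to one.
    """
    t = abs(Fraction(x))
    if t >= 2:
        return Fraction(0)
    if t >= 1:
        return (2 - t) ** 3 / 6
    return ((2 - t) ** 3 - 4 * (1 - t) ** 3) / 6

# Nodes at which the B-spline centered at the origin does not vanish.
nonzero_nodes = [z for z in range(-3, 4) if cubic_bspline(z) != 0]
nodal_values = [cubic_bspline(z) for z in nonzero_nodes]
```

Under this normalization the values at the three nodes -1, 0, 1 are 1/6, 2/3, 1/6, so a coefficient $q_l$ is a weight rather than a nodal value of the spline.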
The spline subspaces $S^h$ included in the abstract finite element method have been used with some success in one-dimensional applications. The assembly of the equation KQ = F and the treatment of the boundary require modifications in the technique which is standard in the nodal case, but the ground rules are still those common to any form of the Ritz method. The chief advantage in splines is that the extra continuity reduces the dimension of the trial space, without destroying the degree of approximation. In the nodal case, the Hermite cubic polynomials are determined over each subinterval by v and v' at the endpoints; this means M = 2 parameters for each mesh point. Their generating functions $\Phi_1$ and $\Phi_2$ are the $\psi$ and $\omega$ of Fig. 1.8.
~neral description of the finf~e element Experiments have been carried out with splines in two dimensions, where
on a regular mesh, and we ""call it the the reduction in Mis still greater, but the boundary is almost forced to be
· dimensions it begin~ with the choice of regular if there is to be any comparison with the simplicity of the nodal
.ese will eventually le:ad to M unknowns method. The bandwidth of the stiffness matrix K is of course not necessarily
nt equations KQ = Fwill take the form decreased by a reduction in M; for cubic splines in one variable there will be
1ns. seven nonzero en tries in each row of K, since the B-spline el> 1 stretches over
ciated with one particular grid point, the four intervals. This is virtually the same bandwidth as in the Hermite case,
~ the Variable X = (x 1 , ••• , Xn), obtain- where the usual ordering of unknowns gives a matrix of the form
1 the basis functions associated with a
)h, we translate the functions just con-
;is functions constructed in this way, by
' subscripts i and /: "(~ ~) (~ ~) (~ ~)
K=
/), i = 1, ... , M . (~-~) (~ ~) (~ ~)
.t the original functions «l>¡(x) shall vanish
Ptrwill vanish outside the local sphere The order of the matrix K is of course directly affected by the number M,
have a local (but possibly not a strictly and splines do appear to be more efficient (at least in solution time) when the
domain n is conveniently rectangular. This is confirmed in the calculations
!ment method, we do not require the «l>i of Chapter 8.
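The two bandwidth counts follow from the supports of the basis functions alone. The sketch below assumes a uniform mesh of width 1 and counts, for one interior row, how many basis functions have supports overlapping that of the row's own function on an interval of positive length; this gives the number of entries that can be nonzero, and the helper names are illustrative.

```python
def nonzeros_per_row(supports, row):
    """Count basis functions whose support overlaps that of `row` on a set of positive length."""
    lo, hi = supports[row]
    return sum(1 for (a, b) in supports if min(hi, b) - max(lo, a) > 0)

def spline_supports(n_points):
    """One cubic B-spline per mesh point l, supported on [l - 2, l + 2]."""
    return [(l - 2, l + 2) for l in range(n_points)]

def hermite_supports(n_points):
    """Two Hermite functions (value and slope) per node l, each supported on [l - 1, l + 1]."""
    return [(l - 1, l + 1) for l in range(n_points) for _ in range(2)]

n_points = 21
spline_row = nonzeros_per_row(spline_supports(n_points), 10)    # interior B-spline
hermite_row = nonzeros_per_row(hermite_supports(n_points), 20)  # value function at node 10
orders = (len(spline_supports(n_points)), len(hermite_supports(n_points)))
```

The counts are seven for the splines and six for the Hermite cubics, while the order of the matrix is halved in the spline case.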
.ve do not require them to be piecewise We frankly believe that the nodal method allows such flexibility in the
at the latter are the most efficient.) The geometry that it will continue to domínate the use of splines. Since the inter-
ed by every basis which falls also within polatihg property is not essential to part of the analysis, however, it will be
method, but our theory in the abstract convenient when on a regular mesh to be able to use the more general descrip-
t example we consider the case of cubic tion in terms of functions el> 1 , • • • , el> M· Since everything in the construction
M= 1, anda suitable choice for Cl> 1 is of Sh depends on these functions, they must contain in themselves the answers
of all combinations .L; q1rpf, 1 is then ex- to all our questions about approximation and numerical stability. There-
rentiable piecewise cubic functions with fore, this abstract approach turns a significant part of the finite element
essential to see that the B-spline el> 1 is analysis into an agreeable problem in function theory.
'our intervals instead of the two allowed
ar it is nonzero at three nodes instead
2.2. RATES OF CONVERGENCE
Suppose that $u$ is the solution to an $n$-dimensional elliptic variational problem of order $m$. This means that $u$ minimizes $I(v)$ over an admissible class $\mathcal{H}^m_E$, restricted by homogeneous or inhomogeneous essential boundary conditions, and (by ellipticity) that the strain energy is positive definite: $a(v, v) \ge \sigma \|v\|_m^2$. Suppose also that $u^h$ is the minimizing function over the trial class $S^h$ and that $\tilde u^h$ is the solution when the problem is perturbed by numerical integration; the coordinates of $\tilde u^h$ satisfy $\tilde K \tilde Q = \tilde F$. Suppose, finally, that $\bar u^h$ is the solution which is actually computed, differing from $\tilde u^h$ because of roundoff error in the numerical solution. Obviously the three approximations $u^h$, $\tilde u^h$, $\bar u^h$ include progressively more sources of error. We want to summarize, for problems with smooth solutions and for typical finite elements, the orders of magnitude of these errors.

The starting point is always the same: If $S^h$ is a subspace of $\mathcal{H}^m_E$, then by the fundamental Theorem 1.1

$$a(u - u^h, u - u^h) = \min_{v^h \in S^h} a(u - v^h, u - v^h). \tag{3}$$

The theorem still holds with inhomogeneous essential conditions, where the difference of any two functions in $\mathcal{H}^m_E$ lies in $V_0$ and the difference of any two functions in $S^h$ lies in $S^h_0$, as long as $S^h_0$ is a subspace of $V_0$. (For proof see Section 4.4.) Therefore, the strain energy in $u - u^h$ is a question in pure approximation theory, to estimate the distance between $u$ and $S^h$. The basic hypotheses will be, first, that $S^h$ is of degree $k - 1$ (it contains in each element the complete polynomial of this degree, restricted only at the boundary by essential conditions) and second, that its basis is uniform as $h \to 0$. The latter is effectively a geometrical condition on the element regions: If the diameter of $e_i$ is $h_i$, then $e_i$ should contain a sphere of radius at least $\tau h_i$, where $\tau$ is bounded away from zero. This forbids arbitrarily small angles in triangular elements. Angles near $\pi$ must also be forbidden in quadrilaterals. Under these conditions, the distance between $u$ and $S^h$ is

$$a(u - u^h, u - u^h) \le C^2 h^{2(k-m)} |u|_k^2. \tag{4}$$

Therefore, $h^{2(k-m)}$ is the rate of convergence in strain energy.
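The rate in (4) can be observed on the simplest model problem. The following sketch is an illustration under stated assumptions, not a computation from the text: it takes $-u'' = f$ on $(0, 1)$ with $u(0) = u(1) = 0$, the manufactured solution $u = \sin \pi x$, linear elements on a uniform mesh ($k = 2$, $m = 1$) and exactly integrated loads. All function names are hypothetical. The energy in the error is computed from the identity $a(u - u^h, u - u^h) = a(u, u) - a(u^h, u^h)$, which holds for the Ritz solution.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n          # modified super-diagonal
    d = [0.0] * n          # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def ritz_solution(n_elements):
    """Nodal values of u^h for -u'' = pi^2 sin(pi x), u(0) = u(1) = 0."""
    h = 1.0 / n_elements
    n = n_elements - 1                      # number of interior nodes
    xs = [(i + 1) * h for i in range(n)]
    # Stiffness matrix K = (1/h) * tridiag(-1, 2, -1).
    lower = [-1.0 / h] * n
    diag = [2.0 / h] * n
    upper = [-1.0 / h] * n
    # Exact load F_i = integral of f times the hat function at x_i.
    scale = 2.0 * (1.0 - math.cos(math.pi * h)) / h
    load = [scale * math.sin(math.pi * x) for x in xs]
    return xs, solve_tridiagonal(lower, diag, upper, load), load


def energy_error(n_elements):
    """a(u - u^h, u - u^h) = a(u, u) - a(u^h, u^h), with a(u^h, u^h) = Q.F."""
    _, q, load = ritz_solution(n_elements)
    energy_h = sum(qi * fi for qi, fi in zip(q, load))
    energy_exact = math.pi ** 2 / 2.0       # integral of (u')^2 on (0, 1)
    return energy_exact - energy_h


def observed_rate(n_elements):
    """Power of h seen when the mesh width is halved."""
    return math.log2(energy_error(n_elements) / energy_error(2 * n_elements))


if __name__ == "__main__":
    for n in (8, 16, 32):
        print(n, energy_error(n), observed_rate(n))
```

With $k = 2$ and $m = 1$ the predicted exponent is $2(k - m) = 2$, and the observed rate should approach 2 as the mesh is refined.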
The form of this error bound is typical of the results of numerical analysis. There are three factors. The power of $h$ is the simplest to find, since it depends only on the degree of the polynomials, and it indicates the rate of convergence as the mesh is refined; this effect should be clearly observable numerically. The constant $C$ depends on the construction of the element and its nodal parameters. For regular geometries a good asymptotic value of $C$ can be computed as the error in approximating polynomials of the next higher degree $k$ (Section 3.2). The third factor, $|u|_k$, reflects the properties of the problem itself: the degree to which the solution is smooth, and therefore easy to approximate accurately. This norm is the mean-square value of the $k$th derivatives of $u$ and therefore, according to partial-differential-equation theory, directly related to the derivatives of order $k - 2m$ of the data $f$.

Notice that convergence occurs if and only if $k > m$; this is the constant strain condition, that the elements should reproduce exactly any solution which is a polynomial of degree $m$. This requirement for convergence appeared very gradually in the engineering literature, partly developing from intuition and partly from the numerical failures which were observed when it was violated. (Most notably in the biharmonic case $m = 2$ of plate bending, when the twist term $xy$ was left out of $S^h$.) We shall describe a rigorous proof (as far as we know, the first proof) that this condition is necessary for convergence in the case of a regular mesh. Such a theorem fits naturally into the abstract finite element theory, which admits the most general trial functions on a regular subdivision.

We have stated the constant strain condition as if it were necessary also for irregular meshes, but it is not. At least it is not quite necessary. $\Omega$ could be mapped into another domain $\Omega'$ by a fixed smooth invertible transformation $T$ which does not preserve polynomials, and then elements which satisfy the constant strain condition on $\Omega$ will not do so in the new variables. However, convergence on $\Omega$ implies convergence of the corresponding problem on $\Omega'$, since we can go freely back and forth. The effect of the Jacobian on errors for isoparametric elements is considered in Section 3.3.

Convergence in strain energy is essentially convergence of the $m$th derivatives of $u^h$ to those of $u$. This derivative is therefore special, because the Ritz method is minimizing in energy. For the $s$th derivative, where $s$ may be smaller or larger than $m$, convergence cannot be faster than $O(h^{k-s})$; this is the order of the best approximation to $u$ from a space $S^h$ of degree $k - 1$. This rate of convergence will normally be attained by the Ritz solution $u^h$. Using the "Nitsche trick," which gave the $O(h^2)$ rate in displacement for one-dimensional linear elements in Section 1.6, we shall show that

$$\|u - u^h\|_s = O(h^{k-s} + h^{2(k-m)}). \tag{5}$$

In almost all cases the first exponent is smaller, and approximation theory governs the convergence rate. [For $s = -1$, the left side is effectively the mean error in displacement over an element, and we see that it may well be one order better ($h^{k+1}$) than the displacement itself.] There are, however, cases in which the $h^{2(k-m)}$ term would enter: If we imagined cubic splines applied to a sixth-order problem, or more realistically a plate-bending (fourth-order equation) element complete only through quadratics, then the rate would be limited to $2(k - m) = 2$ even in displacement.
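The competition between the two exponents in (5) is easy to tabulate. The helper below is a hypothetical convenience, not part of the theory; it simply returns the smaller of $k - s$ and $2(k - m)$ and reproduces the cases mentioned above.

```python
def convergence_exponent(k, m, s):
    """Power of h predicted by estimate (5): min(k - s, 2(k - m)).

    k - 1 is the degree of the trial space, 2m the order of the equation,
    and s the order of the derivative (or norm) in which error is measured.
    """
    if k <= m:
        raise ValueError("the constant strain condition k > m is violated")
    return min(k - s, 2 * (k - m))


if __name__ == "__main__":
    cases = [
        ("linear elements, 2nd order, displacement", 2, 1, 0),
        ("cubics, 2nd order, displacement", 4, 1, 0),
        ("cubics, 2nd order, s = -1", 4, 1, -1),
        ("cubic splines, 6th order, displacement", 4, 3, 0),
        ("quadratics, plate bending, displacement", 3, 2, 0),
    ]
    for label, k, m, s in cases:
        print(f"{label}: h^{convergence_exponent(k, m, s)}")
```

Only in the last two cases does the term $h^{2(k-m)}$ decide the rate; in the others approximation theory governs.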
The convergence rate at individual points can be expected to be the same, provided $u$ has $k$ derivatives at every point. (With singularities, the order of differentiability and therefore the rate of convergence are quite different in the mean-square and pointwise senses. We shall not give a detailed proof of optimum pointwise error bounds.) At special points the error may actually converge faster than it does in the mean. With linear elements for $-u'' = f$, for example, the nodes themselves were special: $u^h = u_I$, and the Ritz solution is exact at the nodes. This is true in general if the elements are solutions to the homogeneous differential equation [H6, T4]. For the heat equation, Thomée noticed a special convergence rate at the nodes of splines; Douglas and Dupont are extending this principle to their collocation methods.

The rate of convergence is maintained with inhomogeneous boundary conditions (Section 4.4), provided the boundary data are interpolated (or approximated) by polynomials of at least the same degree $k - 1$.

For a domain $\Omega$ of irregular shape, a new type of approximation error may enter. It will generally be necessary to approximate the boundary $\Gamma$ by a piecewise polynomial boundary $\Gamma^h$. In the simplest case $\Gamma^h$ is piecewise linear; $\Omega$ is replaced by a polygon $\Omega^h$. Such a polygon can be carved into triangles, and the finite element method may proceed by ignoring the skin $\Omega - \Omega^h$ between the true boundary and the polygon. Therefore, it is as if the original differential problem were moved to $\Omega^h$. In Section 4.4 we investigate the effect of this change of domain. Briefly, the $m$th derivative is in error by $O(h)$ at the boundary, but this error decays rapidly in the interior. There is a boundary-layer effect, and the mean error is $O(h^{3/2})$. Since the strain energy depends on the square of this derivative, the error in energy due to computing on $\Omega^h$ (with natural or essential boundary conditions) is $O(h^3)$. This estimate applies to $\Omega$ as well as $\Omega^h$ if the finite element solution is extended in the natural way, by extending each polynomial out to $\Gamma$. Otherwise all the energy in the skin will be lost, and this amounts to $O(h^2)$, proportional to the volume of the skin.
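The $O(h^2)$ volume of the skin can be seen in the simplest geometry. As an illustration only, and assuming the unit disk approximated by inscribed regular polygons (a choice made for this sketch, with hypothetical function names), the area between circle and polygon is compared with the side length $h$.

```python
import math


def side_length(n_sides):
    """Side h of the regular n-gon inscribed in the unit circle."""
    return 2.0 * math.sin(math.pi / n_sides)


def skin_area(n_sides):
    """Area of the unit disk minus the area of the inscribed n-gon."""
    polygon = 0.5 * n_sides * math.sin(2.0 * math.pi / n_sides)
    return math.pi - polygon


def skin_rate(n_sides):
    """Observed power of h when the number of sides is doubled."""
    area_ratio = skin_area(n_sides) / skin_area(2 * n_sides)
    h_ratio = side_length(n_sides) / side_length(2 * n_sides)
    return math.log(area_ratio) / math.log(h_ratio)


if __name__ == "__main__":
    for n in (16, 32, 64):
        print(n, side_length(n), skin_area(n), skin_rate(n))
```

The observed power stays close to 2, so energy that is simply discarded in the skin cannot be better than $O(h^2)$.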
Note that the $h^3$ error due to change of domain will dominate when the finite elements used in the interior are of degree higher than $m$. If the polynomials are of the minimum degree $m$ which is required for convergence, then the change of domain effect will be submerged (at least in the interior of $\Omega^h$) by the $h^{2(k-m)} = h^2$ error which arises from ordinary approximation theory.

If the boundary $\Gamma$ is approximated by piecewise polynomials of higher degree $l$, there is a corresponding reduction in the change of domain error. The error in the $m$th derivative (the strain) at the boundary is $O(h^l)$, and in the overall strain energy in $\Omega^h$ it is $O(h^{2l+1})$. This assumes, in case of an essential condition $u = 0$, that the condition is exactly satisfied by the polynomial trial functions on the approximate boundary. Mitchell has found a neat construction of cubic elements which vanish on a boundary made up of piecewise hyperbolas, with $h^5$ error in strain energy and $h^3$ in displacement.

For an essential condition on a curved boundary, say $u = 0$, an alternative is to use any of the standard elements and to require that they interpolate the conditions at boundary nodes. In this case the trial functions will not satisfy the essential condition along the whole boundary; each trial function may vanish along some curve close to $\Gamma$, but this zero curve will vary from one function to the next. As a result, the Ritz theory will not apply; the trial functions are not in $\mathcal{H}^m_E$, either on the exact $\Omega$ or on an approximate $\Omega^h$. It no longer follows that $u^h$, which is still chosen to minimize $I(v)$, is the closest trial function to $u$. Nevertheless, it is possible to estimate the error, taking into account that each $v^h$ is nonvanishing but small on the boundary. The best error bound on the strain energy, which surprised us by being rather low, is normally of order $h^3$. Roughly speaking, if we work with cubics on triangles and interpolate the essential boundary conditions, the worst of these cubics will still violate the conditions by $O(h^{3/2})$ between nodes; in Section 4.4, we deduce from this the $O(h^3)$ error in energy.

There is still another alternative, which is undoubtedly the most popular. Curved elements can be straightened out by a change of coordinates. Such a transformation may even be necessary to achieve continuity between quadrilaterals whose sides are already straight, unless they happen to be rectangles. These coordinate changes are a central technique in the finite element method.

In theory we can straighten almost any boundary curve, but of course in practice that is absurd. Piecewise polynomials are the best element boundaries for the same reasons that they are the best approximations to the displacements: they can be handled efficiently by the computer. In fact, we may describe the coordinate changes by the same class of polynomials that are used as trial functions; this is the method of isoparametric transformations. It is a brilliant idea. Coordinate changes lead to the same difficulties as for trial functions: The mapping must be continuous across element boundaries, so that elements which are adjacent in the original $x$-$y$ plane will remain adjacent in the $\xi$-$\eta$ plane. If the transformations $x(\xi, \eta)$, $y(\xi, \eta)$ are constructed from nodal parameters in the standard way, and if we are assured of continuity in $\xi$ and $\eta$ (as we are for the standard rectangular or triangular elements), then the isoparametric mappings will succeed even for elements whose boundaries are polynomials of degree $k - 1$ in $x$ and $y$. This technique raises new questions of approximation theory, since polynomials in $\xi$ and $\eta$ are no longer polynomials in $x$ and $y$. Nevertheless, isoparametric transformations need not decrease the order of accuracy; the full order $h^{k-s}$ in the $s$th derivative is achieved, provided these transformations are uniformly smooth (Section 3.3). In this sense the isoparametric technique is the best one for second-order equations and curved boundaries. An essential boundary condition $u = g$ can be handled naturally and efficiently, with no loss in the fundamental order of accuracy.
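The continuity requirement on the mapping can be illustrated with the lowest-order case. The sketch below assumes bilinear nodal functions on the reference square $[-1, 1]^2$ and two straight-sided quadrilaterals sharing an edge; the corner coordinates and function names are invented for the example. Because the map along an edge depends only on the two nodes of that edge, both elements send the shared edge to the same points.

```python
def bilinear_shape(xi, eta):
    """Bilinear nodal functions on the reference square, corners ordered
    counterclockwise starting from (-1, -1)."""
    return [
        0.25 * (1 - xi) * (1 - eta),
        0.25 * (1 + xi) * (1 - eta),
        0.25 * (1 + xi) * (1 + eta),
        0.25 * (1 - xi) * (1 + eta),
    ]


def iso_map(corners, xi, eta):
    """Map (xi, eta) to (x, y) with the same functions used as trial functions."""
    weights = bilinear_shape(xi, eta)
    x = sum(w * cx for w, (cx, _) in zip(weights, corners))
    y = sum(w * cy for w, (_, cy) in zip(weights, corners))
    return x, y


# Two hypothetical quadrilaterals sharing the edge from (1, 0) to (1.2, 1).
LEFT = [(0.0, 0.0), (1.0, 0.0), (1.2, 1.0), (0.0, 1.0)]
RIGHT = [(1.0, 0.0), (2.0, 0.0), (2.0, 1.5), (1.2, 1.0)]


def edge_mismatch(samples=5):
    """Largest gap between the two images of the shared edge."""
    worst = 0.0
    for i in range(samples):
        eta = -1.0 + 2.0 * i / (samples - 1)
        xl, yl = iso_map(LEFT, 1.0, eta)      # right edge of the left element
        xr, yr = iso_map(RIGHT, -1.0, eta)    # left edge of the right element
        worst = max(worst, abs(xl - xr), abs(yl - yr))
    return worst


if __name__ == "__main__":
    print("mismatch along shared edge:", edge_mismatch())
```

For curved elements the same construction is used with higher-degree nodal functions, which is where the questions of Section 3.3 arise.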
The errors enumerated so far have all contributed to $u - u^h$; we have assumed the Ritz approximation to be calculated exactly. In practice, however, there will also be errors in numerical integration (yielding $\tilde u^h$ instead of $u^h$) and in the solution of the final linear system (yielding $\bar u^h$ instead of $\tilde u^h$). It is essential to know the scale of the integration errors, so that a quadrature rule can be chosen which is neither wasteful of time nor destructive of accuracy. We mentioned in the example of Chapter 1 the two leading possibilities: (1) the inhomogeneous data and any material coefficients which vary with $x$ can be replaced by interpolating polynomials and the integrals for the resulting problem computed exactly; or (2) integrations over all elements can be carried out from the beginning by a standard quadrature rule, say Gaussian quadrature. For a simple problem like $-(pu')' + qu = f$, the user has some freedom of choice, and we mention one result of the analysis: Both interpolation by polynomials of degree $k - 1$, and Gauss quadrature with $k - 1$ points, yield errors in the strains of order $h^k$. For more complicated problems, method (2) is strongly indicated, and in fact numerical integration has become one of the major components of the finite element method. It produces a solution $\tilde u^h$ whose convergence to $u$ requires a certain accuracy in the numerical quadrature: The $m$th derivatives of all trial functions must be integrated exactly. For each additional degree of exactness, the error $\tilde u^h - u^h$ due to numerical quadrature improves by a power of $h$. The proof depends on an identity given in Section 4.3, and we also determine the extra accuracy required for isoparametric elements.
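The "degree of exactness" of a quadrature rule can be checked directly. As a sketch under stated assumptions, the code below hard-codes the standard Gauss-Legendre rules with 1, 2 and 3 points on $[-1, 1]$ and finds the highest degree up to which every monomial is integrated exactly; the function names and the tolerance are choices made for this illustration.

```python
import math

# Standard Gauss-Legendre nodes and weights on [-1, 1].
GAUSS_RULES = {
    1: ([0.0], [2.0]),
    2: ([-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)], [1.0, 1.0]),
    3: ([-math.sqrt(0.6), 0.0, math.sqrt(0.6)], [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]),
}


def gauss(f, n_points):
    """Approximate the integral of f over [-1, 1] with an n-point Gauss rule."""
    nodes, weights = GAUSS_RULES[n_points]
    return sum(w * f(x) for x, w in zip(nodes, weights))


def exact_monomial(p):
    """Integral of x**p over [-1, 1]."""
    return 0.0 if p % 2 else 2.0 / (p + 1)


def degree_of_exactness(n_points, max_degree=10, tol=1e-12):
    """Largest d such that all monomials of degree <= d are integrated exactly."""
    for p in range(max_degree + 1):
        if abs(gauss(lambda x: x ** p, n_points) - exact_monomial(p)) > tol:
            return p - 1
    return max_degree


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "points: exact through degree", degree_of_exactness(n))
```

An $n$-point Gauss rule is exact through degree $2n - 1$, which is why so few points are needed to integrate the derivatives of low-degree trial functions exactly.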
Finally we come to the roundoff error. This has a character completely different from the others; it is proportional to a negative power of $h$. As $h$ decreases there is a region of crossover, prior to which roundoff is negligible and uninteresting, and after which it is all-important. The roundoff does not depend too strongly either on the degree of the polynomials used or on the number of dimensions; the key factor $h^{-2m}$ is set by the mesh width and the order of the equation itself. For second-order equations the crossover usually occurs below the mesh widths which are typical in practical problems, but for equations of order 4 it is not so delayed; accurate calculations may well require double precision. We discuss in Chapter 5 both a priori estimates of the condition number and a posteriori estimates of the roundoff actually committed in a given problem.
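The factor $h^{-2m}$ can be seen in the condition number of the simplest stiffness matrix. The sketch below assumes the one-dimensional second-order model ($m = 1$) with linear elements, for which $K = h^{-1}\,\mathrm{tridiag}(-1, 2, -1)$ and the eigenvalues are known in closed form; it is an illustration, not the estimate of Chapter 5, and the function names are hypothetical.

```python
import math


def condition_number(n_elements):
    """Ratio of extreme eigenvalues of K = (1/h) tridiag(-1, 2, -1).

    The matrix has n = n_elements - 1 rows (interior nodes), and its
    eigenvalues are (2 - 2 cos(j pi / (n + 1))) / h for j = 1, ..., n.
    """
    h = 1.0 / n_elements
    n = n_elements - 1
    eigenvalues = [
        (2.0 - 2.0 * math.cos(j * math.pi / (n + 1))) / h for j in range(1, n + 1)
    ]
    return max(eigenvalues) / min(eigenvalues)


def growth_exponent(n_elements):
    """Observed power p in cond ~ h**(-p) when the mesh width is halved."""
    return math.log2(condition_number(2 * n_elements) / condition_number(n_elements))


if __name__ == "__main__":
    for n in (16, 32, 64):
        print(n, condition_number(n), growth_exponent(n))
```

Halving $h$ multiplies the condition number by about 4 here; for a fourth-order problem the corresponding factor $h^{-4}$ would be 16, which is why the crossover arrives so much sooner.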
These are the principal errors to be analyzed for linear static problems $Lu = f$. In every case, the analysis is based on the fundamental variational equation, the vanishing of $a(u, v) - (f, v)$. In comparison with finite differences, there can be no dispute that this produces a more coherent and satisfactory mathematical theory. In part, the same techniques can be extended to nonlinear equations; this must surely be at present the outstanding theoretical problem, to isolate classes of nonlinear problems which are both physically important and mathematically amenable, and to study the dependence of the approximation on the trial space, domain, and coefficients. Lions has made tremendous progress in this direction (see [9]).

Two classes have already been isolated, as yielding the simplest generalizations of linear elliptic equations. We cannot say how comprehensive are their applications in engineering and physics, but they appear to be very natural. One is the class of strictly monotone operators, satisfying

$$\int_\Omega [M(u) - M(v)](u - v)\, dx \ge \sigma \|u - v\|_m^2.$$

The other (closely related) class contains the potential operators, those for which $M$ is the gradient of some nonquadratic convex functional $I(v)$. We refer to the book of Vainberg [18] for an introduction to the theory.

Galerkin's method applies directly to these operators; we seek a function $u^h = \sum Q_j \varphi_j$ in the subspace $S^h$ such that

$$(M(u^h), \varphi_k) = 0 \quad \text{for all } \varphi_k.$$

The number of (nonlinear) equations equals the number of unknown coefficients; that is, it equals the dimension of the trial space $S^h$. For the two classes described above, it is possible to prove the existence of such a solution $u^h$ and its convergence to $u$, provided a suitable continuity condition is imposed on the operator. In fact, one possible existence proof for the solution $u$ follows these lines; the existence of $u^h$ is demonstrated in finite dimensions, as well as an a priori bound which establishes that all $u^h$ lie in some compact set. Then the $u^h$ must have a limit point as $h \to 0$, and this is $u$. Ciarlet, Schultz, and Varga [C4] have shown that the error estimates differ very little between linear and monotone nonlinear problems.

The finite element method is in constant use for nonlinear problems, for example in elastic-plastic materials or in thermo-viscoelasticity. In addition to the text by Oden [15], there is a rapidly growing engineering literature; we mention among many others the early survey articles by Martin [M2] and Marcal [M1]. It seems fruitless for us to repeat here the possible finite element formulations, distinguishing between the techniques for geometric nonlinearities due to large deflections and those for material nonlinearities, without doing something substantial about the mathematics. It is absolutely clear that the convergence problems are well posed, extremely interesting, and ripe for solution. We hope that ultimately this mathematical theory will become sufficiently complete to justify a book (by someone else).

We do want to include one warning about nonlinear equations. It applies to problems such as nonlinear elasticity, and we take as a simple model the minimization of a functional like

$$I(v) = \int [p(v, v_x)\, v_x^2 - 2fv]\, dx.$$

Since $I(v)$ is not quadratic in $v$, the vanishing of the first variation will not be a linear equation for $u$. Therefore some iteration will be considered, the simplest being the method of successive substitution: evaluate the nonlinear coefficient at the $n$th approximation $u^n$, and determine $u^{n+1}$ as the solution of a linear problem.

Our warning is this: if such an iteration is applied to the variational problem, so that $u^{n+1}$ minimizes

$$\int [p(u^n, u^n_x)\, v_x^2 - 2fv]\, dx,$$

then the iteration will converge to the wrong answer. The reader can easily verify that the limit $u^*$ of such an iteration satisfies

$$-\frac{d}{dx}\left( p(u^*, u^*_x)\, \frac{du^*}{dx} \right) = f,$$

which is not the equation of virtual work for $I(v)$. (Take $p = v_x$.) The mistake was to perform successive substitutions before taking the first variation; if we first establish the nonlinear equation for the minimizing function $u$, and then solve that equation by iteration, the limit will be the right one.
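The suggested choice $p = v_x$ makes the discrepancy explicit. The following short derivation is a worked check, assuming only that the admissible variations $w$ vanish at the ends of the interval so that the boundary terms drop out.

```latex
% Model functional with p = v_x:
%   I(v) = \int [ v_x^3 - 2 f v ] dx .
%
% Correct procedure: take the first variation of I at u in the direction w.
\frac{d}{d\epsilon} I(u + \epsilon w)\Big|_{\epsilon = 0}
  = \int \bigl[ 3 u_x^2 \, w_x - 2 f w \bigr] \, dx = 0
\quad\Longrightarrow\quad
-\frac{3}{2}\,\frac{d}{dx}\bigl( u_x^2 \bigr) = f .

% Successive substitution inside the functional: u^{n+1} minimizes
%   \int [ u^n_x \, v_x^2 - 2 f v ] dx ,
% whose Euler equation is linear in u^{n+1}:
-\frac{d}{dx}\bigl( u^n_x \, u^{n+1}_x \bigr) = f .

% If the iterates converge, u^n \to u^*, the limit therefore satisfies
-\frac{d}{dx}\bigl( (u^*_x)^2 \bigr) = f ,

% which differs from the correct equation by the factor 3/2.
```

The missing factor comes from the derivative of the coefficient $p$ with respect to $v_x$, which is lost when $p$ is frozen before the variation is taken.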
There is one particular nonlinear problem, describing the deformation of an elastic-plastic material, which illustrates both the possibilities and the difficulties in nonlinear convergence proofs. It might be useful to present some of the details. For simplicity of notation we consider a one-dimensional model, with strain $du/dx$; the same arguments apply to the system of strains $\epsilon_{ij}$ occurring in two- or three-dimensional elasticity. The deformation is a nonlinear function of the external forces, and cannot be determined from a knowledge only of the final loading. Instead it depends on the history of the problem, that is, on the chronological order in which the forces are applied over the domain. This introduces an artificial "time" parameter, and at a given instant $t$ the rate of change $\dot u$ of the deformation is the function which minimizes

$$I(v) = \int [p\,(v_x)^2 - 2\dot f v]\, dx.$$

The crucial quantity is the elastic modulus $p(x)$. If this coefficient is independent of the stresses within the material, then there is no need for the artificial time; the final deformation $u(T)$ can be determined by a single minimization, as in the rest of this book, using the final load $f(T)$. In the case of a nonlinear stress-strain law, the coefficient $p$ at time $t$ depends on the stresses, and not only on their values at the given instant: it depends on the whole stress history. This "path-dependence" enters physically in several ways. For example, once the elastic limit has been exceeded, a loading followed by an equal unloading leaves a net change in the state of stress in the material.
This phenomenon actually creates a nonlinear problem for $v$ at each instant of time, since the elastic modulus then depends not only on the past history but also on the current rate of change; $p$ is influenced by the sign of $u_x v_x$, and has one value for loading and another for unloading. For simplicity we shall avoid this extra difficulty, and assume there is no unloading. We do not require, however, that the stress be a single-valued function of the strain, only that it can be computed from a knowledge of the strains $u_x(\tau)$ at all times $\tau < t$.

In the Ritz approximation, Ü' is the function in the trial suhspace Sh
Finally, an integrati~n with respect ·
which minimizes

$$I^h(\dot v^h) = \int \left[ p^h\,(\dot v^h_x)^2 - 2\dot f\,\dot v^h \right] dx.$$

At a given $t$, the coefficient $p^h(x)$ will not have the same value as the true modulus $p$. Instead it will depend on the history of the Ritz approximations $u_x^h(\tau)$, $\tau < t$. To prove convergence we must assume that if these approximate strains have remained close to the true ones, then the coefficient $p^h$ at the given instant is close to the true $p$:

$$\max_x |p(x) - p^h(x)| \le C \int_0^t \| \dot u_x(\tau) - \dot u_x^h(\tau) \| \, d\tau.$$

To prove convergence, we must estimate how the function $\dot u^h$ which minimizes $I(\dot v^h)$, and its $x$-derivative, the rate of change of strain, depend on the coefficient $p$ in the variational principle. Our plan is to split the error at time $t$ into two parts,

$$\dot u_x(t) - \dot u_x^h(t) = \bigl(\dot u_x(t) - w_x^h(t)\bigr) + \bigl(w_x^h(t) - \dot u_x^h(t)\bigr);$$

$w^h$ is the function which minimizes $I(\dot v^h)$ over all trial functions in $S^h$, using the true elastic coefficient $p$ at time $t$. In other words, the first error is due to approximation, and the second to a change in the elastic coefficient. For the first we can appeal to the standard approximation theory; it is a linear problem at each instant of time, and the error in the first derivatives for a space of degree $k - 1$ is

$$\| \dot u_x(t) - w_x^h(t) \| \le C' h^{k-1}.$$

For the second part we must turn to Section 4.3. The corollary in that section asserts that the effect of a change in coefficient is bounded by

$$\| w_x^h(t) - \dot u_x^h(t) \| \le C'' \max_x |p(x) - p^h(x)|.$$

Substituting these last two bounds, and writing $e = \dot u_x - \dot u_x^h$,

$$\| e(t) \| \le C' h^{k-1} + CC'' \int_0^t \| e(\tau) \| \, d\tau.$$

This is exactly the situation in which to apply the argument which leads to Gronwall's lemma: dividing through by the right side of the inequality, multiplying by $CC''$, and integrating with respect to $t$,

$$\log\Bigl( C' h^{k-1} + CC'' \int_0^t \| e(\tau) \| \, d\tau \Bigr) - \log C' h^{k-1} \le CC'' t.$$

Taking the exponential, and substituting into the previous inequality,

$$\| e(t) \| \le C' h^{k-1} \exp(CC'' t).$$

Finally, an integration with respect to $t$ yields

$$\| u_x(t) - u_x^h(t) \| \le \int_0^t \| e(\tau) \| \, d\tau \le C''' h^{k-1}.$$
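The Gronwall step can be spelled out as a short worked derivation. This is only a sketch: the shorthand $a$, $b$ and $g$ is introduced here for convenience and does not appear in the text, and the starting inequality is assumed to have the form $\|e(t)\| \le a + b\int_0^t \|e(\tau)\|\,d\tau$.

```latex
% Shorthand (not in the original text): a = C' h^{k-1}, b = C C''.
\[
  g(t) = a + b \int_0^t \|e(\tau)\| \, d\tau ,
  \qquad \|e(t)\| \le g(t), \qquad g(0) = a .
\]
% Differentiate g and use the hypothesis \|e\| \le g:
\[
  g'(t) = b \, \|e(t)\| \le b \, g(t)
  \quad\Longrightarrow\quad
  \frac{g'(t)}{g(t)} \le b .
\]
% Integrate from 0 to t:
\[
  \log g(t) - \log a \le b t
  \quad\Longrightarrow\quad
  \|e(t)\| \le g(t) \le a \, e^{b t} = C' h^{k-1} \exp(C C'' t) .
\]
```

The constant grows exponentially in $t$, but on a fixed interval $0 \le t \le T$ it does not affect the power of $h$.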

Therefore the rate of convergence in $h$ is the same for this nonlinear plastic problem as it is for linear elasticity.

Notice that we have assumed a Ritz approximation computed continuously in time: the only discretization so far has been the change from the full admissible space to its subspace $S^h$. This is in keeping with the presentation of initial-value problems in Chapter 7, where the Ritz error is separated from the error due to a finite difference method (or other procedure) in the time direction. For the nonlinear problem there has been considerable discussion of the best "incremental method," but we believe all of the leading possibilities to be convergent. They should simply contribute a new error, proportional to a power of $\Delta t$ in the case of a difference equation.

There is one technical difficulty in the proof, however, which an overzealous conscience will not allow us to ignore. It is a question of the choice of norm: if the double bars represent a mean-square norm, then the pointwise bound given for $p - p^h$ is not strictly valid. On the other hand, to use maximum norms throughout requires a fresh examination of the $h^{k-1}$ estimate for $\dot u_x - w_x^h$. This bound followed from mean-square approximation theory, and perhaps the simplest remedy is to establish instead a pointwise estimate of the Ritz error in static linear problems. Another possibility is to use an idea proposed by the first author (for subsequent applications see [W3]) which permits switching back and forth between these two norms when the solution is smooth; such a technique is frequently required in nonlinear problems when the error estimates are global but instabilities can arise locally. Or a third possibility is to improve the corollary in Section 4.3, to depend instead on the mean-square norm of the perturbation $p - p^h$. We are confident that the basic proof is correct, and that a combination of experiment and analysis will soon lead to a much fuller understanding of nonlinear errors.

This summary of the theory must include also eigenvalue problems and initial-value problems. The finite element method applies directly and successfully to both. For self-adjoint eigenvalue problems, the computation of upper bounds by minimizing the Rayleigh quotient over a subspace is a classical technique; it leads to a discrete eigenvalue problem $KQ = \lambda MQ$, where $K$ and $M$ are precisely the same stiffness and mass matrices already encountered. We devote Chapter 6 to deriving this discrete form and to estimating the errors in eigenvalue and eigenfunction which depend on approximation theory; they are due to replacing the true admissible space $\mathcal{H}^m_E$ by $S^h$. The results are simple to describe: $\lambda - \lambda^h$ is of order $h^{2(k-m)}$, and for $k \ge 2m$ the eigenfunction errors in the $s$th derivative are of the maximum order $h^{k-s}$ permitted by approximation theory.
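As an illustration of the discrete problem $KQ = \lambda MQ$, the following sketch assembles $K$ and $M$ for piecewise linear elements ($k = 2$, $m = 1$) on a model problem chosen here for convenience, $-u'' = \lambda u$ on $(0, \pi)$ with $u(0) = u(\pi) = 0$, whose smallest eigenvalue is $1$. The function name `fem_eigenvalues` and the model problem are assumptions made for the example, not taken from the text.

```python
import numpy as np

def fem_eigenvalues(n_elements):
    """Eigenvalues of K Q = lambda M Q for -u'' = lambda u on (0, pi),
    u(0) = u(pi) = 0, using piecewise linear elements on a uniform mesh."""
    h = np.pi / n_elements
    n = n_elements - 1                      # number of interior nodes
    off = np.ones(n - 1)
    # Stiffness matrix: (1/h) * tridiag(-1, 2, -1)
    K = (2.0 * np.eye(n) - np.diag(off, 1) - np.diag(off, -1)) / h
    # Consistent mass matrix: (h/6) * tridiag(1, 4, 1)
    M = (4.0 * np.eye(n) + np.diag(off, 1) + np.diag(off, -1)) * h / 6.0
    # Reduce to a symmetric standard problem using M = C C^T (Cholesky).
    C = np.linalg.cholesky(M)
    A = np.linalg.solve(C, np.linalg.solve(C, K).T)   # C^{-1} K C^{-T}
    return np.sort(np.linalg.eigvalsh(A))

if __name__ == "__main__":
    for n_el in (8, 16, 32):
        lam = fem_eigenvalues(n_el)[0]
        print(n_el, lam, lam - 1.0)
```

In this model problem the computed eigenvalue lies above the true one, and halving $h$ should reduce the error by roughly a factor of four, in line with the order $h^{2(k-m)} = h^2$ quoted above.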
For initial-value problems $u_t + Lu = f$, the position is equally favorable. The finite element solution has the form $u^h(t, x) = \sum Q_j(t)\varphi_j(x)$; the time variable is left continuous, while the dependence on $x$ is discretized in terms of the standard piecewise polynomial basis functions $\varphi_j$. The coefficients $Q_j(t)$ are determined by a system of $N$ ordinary differential equations, expressing
Galerkin's rule: The residual $u_t^h + Lu^h - f$ will not be identically zero unless the true solution $u$ happens to lie in the trial space $S^h$, but its component in $S^h$ should vanish. Thus the original equation holds "on the subspace."

In practice, time must also be made discrete. This is assumed to be done by a finite difference method. The Crank-Nicolson scheme, for example, is centered at the time $t_{n+1/2}$ when $u^h(t_{n+1})$ is computed from $u^h(t_n)$ and is therefore accurate to order $\Delta t^2$. Thus the final computed approximation includes this error as well as the Galerkin error due to discretization in $x$. The latter is the one which we analyze in detail, in order to show that for $k \ge 2m$, it, too, is of the optimal order $h^{k-s}$ in the $s$th derivative. This result applies to equations of parabolic type, for example the heat equation; $L$ is the same kind of elliptic operator that occurs in static problems. In the case of hyperbolic equations, which include no dissipative term, the power of the finite element method is somewhat diminished; the price to be paid in comparison with explicit difference methods may be too high. Nevertheless, even in this case the comparatively automatic treatment of boundaries is a tremendous advantage, and we include in Chapter 7 a sketch of the hyperbolic finite element theory.
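The separation between the Galerkin error in $x$ and the time-stepping error can be seen in a small experiment. The sketch below is an assumption-laden illustration rather than the authors' procedure: it takes the heat equation $u_t - u_{xx} = 0$ on $(0, \pi)$ with $u = \sin x$ initially, linear elements in $x$, and Crank-Nicolson in $t$. The function names are hypothetical. The time error is measured against the exact solution of the semi-discrete system $MQ' + KQ = 0$, which is available here because $\sin x_j$ is an eigenvector of $M^{-1}K$.

```python
import numpy as np

def heat_crank_nicolson(n_elements, n_steps, t_final=1.0):
    """Linear finite elements in x, Crank-Nicolson in t, for
    u_t - u_xx = 0 on (0, pi), u = 0 at both ends, u(0, x) = sin x."""
    h = np.pi / n_elements
    n = n_elements - 1
    x = h * np.arange(1, n + 1)
    off = np.ones(n - 1)
    K = (2.0 * np.eye(n) - np.diag(off, 1) - np.diag(off, -1)) / h
    M = (4.0 * np.eye(n) + np.diag(off, 1) + np.diag(off, -1)) * h / 6.0
    dt = t_final / n_steps
    lhs = M + 0.5 * dt * K        # acts on Q^{n+1}
    rhs = M - 0.5 * dt * K        # acts on Q^n
    Q = np.sin(x)
    for _ in range(n_steps):
        Q = np.linalg.solve(lhs, rhs @ Q)
    return x, Q

def time_error(n_elements, n_steps, t_final=1.0):
    """Error due to time stepping only: compare with the exact solution
    exp(-lam_h * t) * sin(x_j) of the semi-discrete system M Q' + K Q = 0."""
    h = np.pi / n_elements
    lam_h = 6.0 / h**2 * (1.0 - np.cos(h)) / (2.0 + np.cos(h))
    x, Q = heat_crank_nicolson(n_elements, n_steps, t_final)
    return np.max(np.abs(Q - np.exp(-lam_h * t_final) * np.sin(x)))

if __name__ == "__main__":
    for steps in (10, 20, 40):
        print(steps, time_error(16, steps))
```

With the mesh in $x$ held fixed, doubling the number of time steps should cut this error by about four, consistent with accuracy of order $\Delta t^2$.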
For want of space, the analysis of change of domain, numerical integration, and roundoff will be limited to the static case $Lu = f$. The results for eigenvalue and initial-value problems are certain to be very similar, and in the case of quadrature errors these extensions of the theory have been carried out by Fix (Baltimore Finite Element Symposium).

In the final chapter, we present the results of a rather extensive series of numerical experiments. One of the most interesting concerns a problem with strong singularities, produced by a crack which runs into the material. It is a classical problem in fracture mechanics to compute the stress intensity factor at the head of the crack; around that point the stress varies like $r^{-1/2}$. Therefore, a number of questions arise:

1. Do our error estimates yield the observed rate of convergence, both pointwise and in mean-square?
2. The singularity reduces the smoothness of $u$ and therefore the rate of convergence; does a reduced rate apply even where the solution is smooth, so that the singularity pollutes the whole computation?
3. Is it possible, by grading the mesh or by introducing special trial functions at the singularity, to recover the normal rate of convergence of the method?

The answer in each case is "yes," and the numerical results are very convincing. Both for static and for eigenvalue problems, an excellent remedy for singularities which are introduced by sharp angles in the domain is the introduction of trial functions which mirror the singularity correctly.

Straight interfaces between materials create a slightly different problem.
There is a jump in the derivatives of $u$ across the interface, and we strongly recommend the following simple solution: relax any continuity imposed on the derivatives of the trial functions, so that $u^h$ is free to copy the singularity in $u$. We do not believe that in normal circumstances the jump condition, or any other natural boundary condition, should be imposed.

Finally, we have chosen to give the theoretical background for one comparatively new computational technique: the Peters-Wilkinson algorithm for the matrix eigenvalue problem $Kx = \lambda Mx$. The solution of the linear system $KQ = F$ is of course a still more fundamental problem, and it is subject to considerable refinement in the ordering of the unknowns or in the choice of a gradient procedure; but it is comparatively well understood. The eigenvalue problem is more subtle, and without an efficient algorithm the number of unknowns will be artificially limited, below the number required to represent the physics of the problem. Therefore, we have described the Peters-Wilkinson idea (as well as some more established algorithms) in Chapter 6, and applied it to the numerical experiments of Chapter 8.

2.3. GALERKIN'S METHOD, COLLOCATION, AND THE MIXED METHOD

As we have described it, the Ritz technique applies only to problems of the classical variational type, in which a convex functional is to be minimized. The corresponding Euler differential equation is self-adjoint and elliptic. It is well known, however, that equations of quite general type can also be written in a weak form, and that this form suggests a generalization from the Ritz to the Galerkin method. Applied to initial-value problems, this is the subject of Chapter 7. Here we discuss two types of static problems, first those in which derivatives of odd order spoil the self-adjointness of an elliptic equation, and then those in which the associated functional is not positive definite, so that the problem is to find a stationary point rather than a minimum of $I(v)$. This arises in the Hellinger-Reissner principle in elasticity, and in the corresponding mixed method for finite elements. It leads to some difficult mathematical questions in the proof of convergence.

First some comments on the weak forms of a differential equation $Lu = f$. There are several, but they all share the following basic principle: The equation is multiplied by test functions $v(x)$ and integrated over the domain $\Omega$, to yield

$$(Lu, v) = (f, v).$$

This is to hold for each function $v$ in some test space $V$, and everything hinges on the choice of $V$. If $V$ includes all $\delta$-functions, then the equation $Lu = f$ will have to hold in the most classical (some would say old-fashioned) sense,
at every point. The discrete form of this test space leads to the collocation method, discussed below. At the other extreme, $V$ may contain only infinitely smooth functions which vanish in a boundary strip. A formal integration by parts shifts all the derivatives off $u$ and onto $v$, leading to the equation

$$(u, L^*v) = (f, v), \qquad L^* = \text{formal adjoint of } L.$$

In this weakest form, $u$ satisfies the equation only "in the sense of distributions." No doubt a discrete form has been studied.

Between these extremes there lie a great many possibilities. If $V = \mathcal{H}^0(\Omega)$, the equation is said to hold in the strong sense, or sometimes in the $L_2$ sense. More generally the choice $V = \mathcal{H}^s(\Omega)$ permits $s$ of the derivatives to be shifted from $u$ to $v$; if $L$ is of order $2m$, then presumably the solution $u$ will be sought in $\mathcal{H}^{2m-s}$. The case $s = m$ is of prime importance, and it has already been given the notation

$$a(u, v) = (f, v).$$

In all these manipulations, the boundary conditions must play their part. With the strong form ($s = 0$), the full set of boundary conditions is imposed on $u$; the solution must lie in $\mathcal{H}^{2m}_B$. As $s$ increases and $u$ needs only $2m - s$ derivatives to qualify for the solution space, the only boundary conditions which make sense are those of order less than $2m - s$; for $s = m$, these are the essential boundary conditions,† and $u$ lies in $\mathcal{H}^m_E$. At the same time, the number of conditions which are imposed on $v$ is increasing. These are governed by the derivatives of order less than $s$ which appear in Green's formula; for $s = m$, $v$ too lies in $\mathcal{H}^m_E$. Thus the Ritz case is symmetric between the trial space and the test space.
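A one-dimensional instance may make the hierarchy of weak forms concrete. The choice $Lu = -u''$ on $(0, 1)$ with $u(0) = u(1) = 0$ (so $m = 1$) is an assumption made here purely for illustration.

```latex
% s = 0 (strong form): no derivatives are moved; u carries two derivatives
% and both boundary conditions, v need only be square-integrable.
\[
  \int_0^1 (-u'')\, v \, dx = \int_0^1 f\, v \, dx .
\]
% s = m = 1 (Ritz form): one integration by parts; the boundary term
% -u'v vanishes because v(0) = v(1) = 0, and u and v appear symmetrically.
\[
  a(u, v) = \int_0^1 u'\, v' \, dx = \int_0^1 f\, v \, dx .
\]
% s = 2m = 2 (weakest form): both derivatives are on v, which is smooth
% and vanishes in a strip near the boundary; here L^* = L.
\[
  \int_0^1 u\, (-v'') \, dx = \int_0^1 f\, v \, dx .
\]
```

Each integration by parts lowers the smoothness asked of $u$ by one derivative and raises that asked of $v$ by one; only in the middle case do the two spaces coincide.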

Galerkin's method is the obvious discretization of the weak form. In general it involves two families of functions: a subspace $S^h$ of the solution space and a subspace $V^h$ of the test space $V$. Then the Galerkin solution $u^h$ is the element of $S^h$ which satisfies

$$(Lu^h, v^h) = (f, v^h) \quad \text{for all } v^h \text{ in } V^h. \tag{5}$$

†In mathematical terms, an essential condition $Bv = g$ is one specified by a bounded linear operator $B$ on the space $\mathcal{H}^m$ of all functions with finite strain energy. The functions which satisfy $Bv = 0$ form a closed subspace. Natural conditions are those which are not, and cannot be, imposed on every $v$; but because of the special form given to $I(v)$, they are satisfied by the minimizing $u$. The usual test for essential conditions is that $B$ must involve only derivatives of order less than $m$, but this condition is neither necessary nor sufficient. In a two-dimensional problem, for example with $m = 1$, we cannot impose that $v = 0$ at a given single point $P$. The value at $P$ is not a bounded functional (it may be arbitrarily large, as with $\log \log r$, while the function has unit strain energy) and the functions which satisfy $v(P) = 0$ do not form a closed subspace. In fact, they come arbitrarily close in strain energy to trial functions which do not vanish at $P$, and to minimize over them would give the same result as to minimize over all of $\mathcal{H}^1$.

The left side will need $s$ integrations by parts, as in the continuous problem.

When $S^h$ and $V^h$ are of equal dimension $N$, this Galerkin equation goes into operational form in the usual way: If $\varphi_1, \ldots, \varphi_N$ is a basis for $S^h$ and $\psi_1, \ldots, \psi_N$ is a basis for $V^h$, the solution $u^h = \sum Q_j\varphi_j$ satisfies

$$\Bigl( \sum Q_j L\varphi_j, \psi_k \Bigr) = (f, \psi_k), \qquad k = 1, \ldots, N.$$

In matrix form this is $GQ = F$, with

$$G_{kj} = (L\varphi_j, \psi_k), \qquad F_k = (f, \psi_k).$$

The Ritz case is of course the one in which $S^h = V^h \subset \mathcal{H}^m_E$, $\varphi_j = \psi_j$, and $(L\varphi_j, \varphi_k) = a(\varphi_j, \varphi_k)$; $G$ becomes the stiffness matrix $K$.

Suppose that $S^h$ is a finite element space of degree $k - 1$, and $V^h$ a finite element space of degree $l - 1$. Then the expected rate of convergence of the $s$th derivatives in the Galerkin method might be

$$\| u - u^h \|_s = O\bigl( h^{k-s} + h^{k+l-2m} \bigr). \tag{6}$$

As in (4), the first exponent of $h$ reflects the best order of approximation which is possible from $S^h$, and the second exponent is influenced also by approximation in the test subspace and the order $2m$ of the differential equation. In theory it appears possible to make some economies by choosing $l < k$, say $k = 4$ (cubic splines) and $l = 2$ (linear test functions) in a second-order problem ([B21]). The convergence rate is as good as with $l = 4$ and the bandwidth is reduced; however, $G$ is no longer symmetric, even in self-adjoint problems, and we are dubious.

A related possibility is to "lump" some terms in the discretization, thereby departing from the Ritz equations $KQ = F$ and $KQ = \lambda MQ$, by using subspaces of lower degree on the lower-order terms in the equation. In the past this was useful in eigenvalue problems, in order to replace the consistent mass matrix $M$ by a diagonal lumped mass matrix. In the newest algorithms for the eigenvalue problem (Chapter 6) it is no longer so important that the mass matrix be diagonal, and we shall refer to [T8] for an error analysis of the lumping process.

In the collocation method, the test subspace $V^h$ has a basis of $\delta$-functions: $\psi_j(x) = \delta(x - x_j)$. The Galerkin equation (5) then demands that the differential equation hold at each node: $Lu^h(x_j) = f(x_j)$. In the error bound (6), the method will normally act as if $l = 0$ and converge at the rate $h^{k-2m}$. There are special collocation points, however, which increase this order of convergence and make the method very interesting [07, B24]. $G$ has a smaller bandwidth than $K$, and there are no inner products or element matrices to compute; for complicated nonlinear problems, these advantages may well compensate for the heavier approximation and smoothness demands on $S^h$.
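To show how $\delta$-function test functions turn $GQ = F$ into pointwise equations, here is a minimal collocation sketch. For brevity it uses global trial functions $\varphi_j(x) = \sin(j\pi x)$ for $-u'' = f$ on $(0, 1)$, which is an assumption of this example and not a finite element space; in particular $G$ is full here, so the bandwidth remark above does not apply to it. The function name `collocation_sine` is hypothetical.

```python
import numpy as np

def collocation_sine(f, n):
    """Collocation for -u'' = f on (0, 1), u(0) = u(1) = 0.

    Trial functions: phi_j(x) = sin(j*pi*x), j = 1..n (illustrative choice).
    Test functions:  delta(x - x_k) at the points x_k = k/(n+1), so the
    k-th equation is simply (L u^h)(x_k) = f(x_k).
    """
    j = np.arange(1, n + 1)
    x = j / (n + 1.0)                           # collocation points x_k
    # G[k, j] = (L phi_j)(x_k) = (j*pi)^2 * sin(j*pi*x_k); no integrals needed.
    G = (j * np.pi) ** 2 * np.sin(np.pi * np.outer(x, j))
    F = f(x)                                    # F[k] = f(x_k)
    return np.linalg.solve(G, F)

if __name__ == "__main__":
    # f = pi^2 sin(pi x) has the exact solution u = sin(pi x) = phi_1.
    Q = collocation_sine(lambda x: np.pi**2 * np.sin(np.pi * x), 5)
    print(Q)
```

Because the exact solution of the test problem lies in the trial space, the computed weights should reproduce it up to roundoff; the point of the sketch is only that $G$ and $F$ are formed by evaluation, with no inner products.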

Problems which are not positive definite symmetric.

We propose to analyze in more detail two cases in which the trial space $S^h$ coincides with the test space $V^h$ (there are $m$ integrations by parts, so the weak form is essentially the $a(u, v) = (f, v)$ of the Ritz method) but the problem departs from the classical one of minimizing a positive-definite functional. The first departure arises when the equation is not self-adjoint, as in the constant-coefficient example

$$Lu = -pu'' + \tau u' + qu = f.$$

The adjoint operator $L^*$ will have the opposite sign multiplying the derivative of odd order:

$$\int (-pu'' + \tau u' + qu)\,v = \int u\,(-pv'' - \tau v' + qv).$$

This suggests an energy inner product

$$a(u, v) = \int pu'v' + \tau u'v + quv,$$

which itself is not symmetric: $a(u, v) \ne a(v, u)$. In fact, the new term is skew-adjoint; it corresponds to the imaginary part of the operator $L$, whose real part $-pu'' + qu$ is as positive as ever. This appears most clearly if the inner product is extended to complex functions,

$$a(u, v) = \int pu'\bar v' + \tau u'\bar v + qu\bar v.$$

The real part of $a(u, u)$ is $\int p\,|u'|^2 + q\,|u|^2$, and the new term is purely imaginary.

We want to show that the rate of convergence in slope can be rigorously established as $h^{k-1}$, just as before, but that for large $\tau$ the finite element method may be very poor. The argument is quite general. The convergence proof when the real part (self-adjoint part) is elliptic includes the Ritz method as a special case, but the Galerkin method may be unsatisfactory when the odd-derivative term (the imaginary or skew-adjoint part) is of significant size.

THEOREM 2.1
Suppose that $|a(u, v)| \le K \|u\|_m \|v\|_m$ and that the real part of the problem is elliptic: $\operatorname{Re} a(u, u) \ge \sigma \|u\|_m^2$ for $u$ in $\mathcal{H}^m_E$. Let $u^h$ be the function in $S^h \subset \mathcal{H}^m_E$ which satisfies the Ritz-Galerkin equation $a(u^h, v^h) = (f, v^h)$ for all $v^h$ in $S^h$. Then the order of convergence in energy (equivalently, the order of convergence
of the $m$th derivatives) equals the best possible order of approximation which can be achieved by $S^h$:

$$\| u - u^h \|_m \le \frac{K}{\sigma} \min_{v^h \in S^h} \| u - v^h \|_m. \tag{7}$$

For a finite element space of degree $k - 1$, this order is $h^{k-m}$.

Proof. Since $a(u, v) = (f, v)$ for all $v$ in $\mathcal{H}^m_E$, subtraction yields $a(u - u^h, v^h) = 0$ for all $v^h$. Therefore,

$$\sigma \| u - u^h \|_m^2 \le \operatorname{Re} a(u - u^h, u - u^h) = \operatorname{Re} a(u - u^h, u) = \operatorname{Re} a(u - u^h, u - v^h) \le K \| u - u^h \|_m \| u - v^h \|_m.$$

Dividing by the common factor $\| u - u^h \|_m$, the proof is complete. Nitsche's method could now be used to establish the usual rate of convergence (4) for the displacement.

In spite of this convergence, the Galerkin method may in practice be a bad choice. Suppose that $S^h$ is the standard piecewise linear subspace. Then in our example the Galerkin equations for $Q_j = u^h(x_j)$ will be just the difference equations

$$-p\,\frac{Q_{j+1} - 2Q_j + Q_{j-1}}{h^2} + \tau\,\frac{Q_{j+1} - Q_{j-1}}{2h} + q\,\frac{Q_{j+1} + 4Q_j + Q_{j-1}}{6} = \frac{1}{h}(f, \varphi_j).$$

Notice that the first derivative is replaced by a centered difference, regardless of the sign of $\tau$. The true solution, however, may depend very strongly on this sign; as $|\tau| \to \infty$, the advection term $\tau u'$ dominates the second derivative, and the solution $u$ is of boundary-layer type. Over most of the interval $u$ is essentially the solution to an initial-value problem, for which the centered difference is completely inappropriate; at the far end there is a rapid variation in order to satisfy the other boundary condition, and an extremely fine mesh is required. The need for one-sided (upstream) differences is well known in chemical engineering. Mathematically the dominance of $\tau$ is reflected in a large value for $K/\sigma$ in the error estimate (7).
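The failure of the centered difference for large $\tau$ is easy to reproduce. The sketch below makes several assumptions for the sake of a small example: $q = 0$, $f = 0$, $\tau > 0$, and boundary values $u(0) = 0$, $u(1) = 1$, so that the boundary layer sits at $x = 1$. It compares the centered (Galerkin, piecewise linear) equations with a one-sided upstream difference; the function name is hypothetical.

```python
import numpy as np

def solve_advection_diffusion(p, tau, n, upwind=False):
    """Nodal values for -p u'' + tau u' = 0 on (0, 1), u(0) = 0, u(1) = 1,
    with n interior nodes and tau > 0.

    upwind=False: centered difference for u' (the linear Galerkin equations).
    upwind=True:  one-sided upstream difference (Q_j - Q_{j-1}) / h.
    """
    h = 1.0 / (n + 1)
    d = p / h**2
    if upwind:
        lower, diag, upper = -d - tau / h, 2.0 * d + tau / h, -d
    else:
        lower, diag, upper = -d - tau / (2.0 * h), 2.0 * d, -d + tau / (2.0 * h)
    A = diag * np.eye(n) + lower * np.eye(n, k=-1) + upper * np.eye(n, k=1)
    b = np.zeros(n)
    b[-1] = -upper            # known boundary value u(1) = 1 moved to the right side
    return np.linalg.solve(A, b)

if __name__ == "__main__":
    # Cell Peclet number tau*h/(2p) = 5 > 1: the centered scheme oscillates.
    print(solve_advection_diffusion(1.0, 100.0, 9))
    print(solve_advection_diffusion(1.0, 100.0, 9, upwind=True))
```

When $\tau h / 2p$ exceeds one, the centered solution alternates in sign from node to node, while the upstream version stays between the boundary values at the cost of first-order accuracy.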

A second departure from the classical Ritz formulation arises when the potential-energy functional $I(v)$ is not convex, so that $a(u, u)$ is not positive definite, and the task is to find a stationary point rather than a minimum. This occurs naturally in the mixed method, when both the displacement and its derivatives are taken as independent unknowns. The potential energy involves products which are just as likely to be negative as positive; it is like going
CHAP.2 SEC. 2.3. GALERKIN'S METHOD, COLLOCATION, AND THE MIXED METHOD 121
possible order of approximation which from x 2 + y 2 to xy, which has a saddle point instead of a mínimum at the
origin.
A good example is given by the equation wuv> f(x), for the bending of
a beam under loading. Suppose that the moment M = w" is introduced as a
new unknown. Then the original equation becomes M" J, and we have a
- 1, this order is hk-m. system of two second-order equations:
l v in X'E, subtractioh.yields a(~

(8)
This reduction of order brings several advantages. The trial functions in the variational form need only be continuous between elements, whereas the fourth-order equation was associated with the potential energy $I(v) = \int [(v'')^2 - 2fv]$, and this is finite only if the slope $v'$ is also continuous. Furthermore, the condition number of the stiffness matrix is completely altered by a reduction in the order of the differential equations; the fourth differences are replaced by second differences, and the condition number goes from $O(h^{-4})$ to $O(h^{-2})$. It may seem miraculous that such an improvement can be achieved by a purely formal introduction of the derivative M as a new unknown, but the improvement is quite genuine.
We examine this roundoff error in two ways. Consider a simply supported beam, with w and M vanishing at both ends. Then the equations $M'' = f$ and $w'' = M$ can be solved separately, first for M and then for w, using either finite differences or finite elements. Suppose that the approximate solution of $M'' = f$ includes a roundoff error $E_1$, which will normally be of order $h^{-2} 2^{-t}$ if the computer word length is t. Then the approximate solution of $w'' = M$ will include first of all its own roundoff error $E_2$, of this same order, and also an inherited error $E_3$. This error satisfies $E_3'' = E_1$, or rather it satisfies exactly the discretization of this equation which is being used, and therefore $E_3$ is also of order $h^{-2}$. The roundoff error is not compounded to $h^{-4}$.
As a second check, we shall compute the condition number of the discrete system. In the finite difference case, the matrix form is

$$\begin{pmatrix} -I & \delta^2 \\ \delta^2 & 0 \end{pmatrix} \begin{pmatrix} M \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ f \end{pmatrix}, \tag{9}$$

and the eigenvalues $\mu$ of this block matrix are related to the eigenvalues $\lambda$ of $\delta^2$ by

$$\mu^2 + \mu = \lambda^2.$$

We know that the eigenvalues $\lambda$ of the second difference operator $\delta^2$ range from $O(1)$ to $O(h^{-2})$. Solving the quadratic, the eigenvalues $\mu$ fall in the same range, and the condition number $\mu_{\max}/\mu_{\min}$ of the coupled system is indeed of order $h^{-2}$.
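The two growth rates can be checked numerically. The following is a minimal sketch, not the authors' computation: it assumes the interval $(0, \pi)$, zero boundary values, and the block form $\begin{pmatrix} -I & \delta^2 \\ \delta^2 & 0 \end{pmatrix}$ for the coupled system, which is the finite difference analogue of the matrix $\begin{pmatrix} -G & D \\ D^* & 0 \end{pmatrix}$ used later in this section. The function names are hypothetical.

```python
import numpy as np

def second_difference(n, h):
    """n x n second difference matrix delta^2 with zero boundary values."""
    return (np.eye(n, k=1) - 2.0 * np.eye(n) + np.eye(n, k=-1)) / h**2

def condition_numbers(n):
    """Return (cond of the fourth difference, cond of the coupled block system)."""
    h = np.pi / (n + 1)                            # assumed interval (0, pi)
    A = second_difference(n, h)
    fourth = A @ A                                 # fourth difference matrix
    block = np.block([[-np.eye(n), A],
                      [A, np.zeros((n, n))]])      # assumed block form
    return np.linalg.cond(fourth), np.linalg.cond(block)

c4_coarse, c2_coarse = condition_numbers(15)       # h = pi/16
c4_fine, c2_fine = condition_numbers(31)           # h = pi/32

growth_fourth = c4_fine / c4_coarse   # near 16 if the condition number is O(h^-4)
growth_block = c2_fine / c2_coarse    # near 4 if the condition number is O(h^-2)

# Check the quadratic mu^2 + mu = lambda^2 linking the two spectra.
n = 15
A = second_difference(n, np.pi / (n + 1))
lam = np.linalg.eigvalsh(A)
mu = np.linalg.eigvalsh(np.block([[-np.eye(n), A], [A, np.zeros((n, n))]]))
relation_holds = np.allclose(np.sort(mu**2 + mu), np.sort(np.repeat(lam**2, 2)))
```

Halving h should multiply the first condition number by roughly 16 and the second by roughly 4, in line with the orders $h^{-4}$ and $h^{-2}$ quoted in the text.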
We wish to offer an explanation for this miracle. It rests on the observation that the usual measure of condition number for these matrices is unnatural. We are regarding them for numerical purposes as transforming Euclidean space (discrete $\mathcal{H}^0$) into itself, and therefore we use the same norm for both the residual in the equation and the resulting error in the solution. This is completely contrary to what is done in the differential problem, or for that matter in estimating discretization error; in such a case f is measured in $\mathcal{H}^0$, M and its error in $\mathcal{H}^2$, and w and its error in $\mathcal{H}^4$. (In the variational problem, these may be $\mathcal{H}^{-2}$, $\mathcal{H}^0$, and $\mathcal{H}^2$, respectively.) In fact, the operator $L = d^2/dx^2$, with any of the usual boundary conditions, is perfectly conditioned as a map from $\mathcal{H}^2$ to $\mathcal{H}^0$. This was the essential point of Section 1.2, that both $L$ and $L^{-1}$ are bounded. We can show that the same is true of the difference operator $\delta^2$, and of any reasonable finite element analogue, provided these natural norms are retained. Therefore, there ought to exist an algorithm for solving $KQ = F$ which reflects this property, and in this case the miracle would disappear; the errors in M and w would be appropriate to their station. As long as a standard elimination is used, however, there will be a difference in roundoff error between fourth-order and second-order problems.
Before going further, the coupled differential equations of the mixed method should be put into variational form. Multiplying the first equation in (8) by M and the second by w, and integrating by parts,

$$\int (w''M + M''w - M^2 - 2fw)\,dx = -\int (2w'M' + M^2 + 2fw)\,dx + \bigl[w'M + M'w\bigr]_0^{\pi}.$$
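The part played by the integrated term can be seen from the first variation. The computation below is a sketch under the assumption that the functional is $I = -\int (2w'M' + M^2 + 2fw)\,dx$, as in the identity above; $\delta M$ and $\delta w$ denote arbitrary admissible variations, a notation introduced here for illustration.

```latex
\begin{aligned}
\delta I &= -2\int \bigl( w'\,\delta M' + M'\,\delta w' + M\,\delta M + f\,\delta w \bigr)\,dx \\
         &= 2\int \bigl[ (w'' - M)\,\delta M + (M'' - f)\,\delta w \bigr]\,dx
            \;-\; 2\,\bigl[\, w'\,\delta M + M'\,\delta w \,\bigr]_{\text{ends}} .
\end{aligned}
```

Setting $\delta I = 0$ for variations that vanish at the ends recovers $w'' = M$ and $M'' = f$. If $\delta w$ is required to vanish at the ends while $\delta M$ is left free there, the remaining boundary term forces $w' = 0$ at each end, so that this condition is a natural one.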
In the simply supported case w and M vanish at each end (the admissible space satisfies full Dirichlet conditions) and the integrated term disappears. For the clamped beam a remarkable thing occurs: The condition $w = 0$ is imposed at each end, but the vanishing of $w'$ yields a natural boundary condition for M. The stationary point of

$$I(v) = -\int (2w'M' + M^2 + 2fw),$$

if the admissible space V contains all pairs $(M, w)$ with M in the Neumann space $\mathcal{H}^1$ and w in the Dirichlet space $\mathcal{H}^1_0$, solves exactly the problem of the clamped beam. Roughly speaking, if M is unconstrained at the endpoints and the first variation of I is to vanish, then the factor $w'$ which multiplies M must be zero. We emphasize that as in the Reissner principle of elasticity, the functional $I(v)$ has no definite sign; the problem is one of a stationary point, and not a minimum.

Let us note the computational consequences of this variational form. With piecewise linear elements and $Nh = \pi$, $M^h$ and $w^h$ are expressed by

$$M^h = \sum_{0}^{N+1} z_j \varphi_j(x), \qquad w^h = \sum_{1}^{N} q_j \varphi_j(x).$$

(Because M is unconstrained, it requires two extra basis functions.) There are now $2N + 2$ unknowns, but the system can no longer be solved in series, first for $M^h$ and then for $w^h$. With $N = 2$ the coefficient matrix is

$$\begin{pmatrix} -G & D \\ D^* & 0 \end{pmatrix}, \quad \text{where} \quad G = \frac{h}{6}\begin{pmatrix} 2 & 1 & 0 & 0 \\ 1 & 4 & 1 & 0 \\ 0 & 1 & 4 & 1 \\ 0 & 0 & 1 & 2 \end{pmatrix}, \quad D = \frac{1}{h}\begin{pmatrix} -1 & 0 \\ 2 & -1 \\ -1 & 2 \\ 0 & -1 \end{pmatrix}.$$

G is the mass matrix (or Gram matrix) of the $\varphi_i$, and D is a rectangular second difference matrix. For large N the unknowns should be ordered by $z_0, q_1, z_1, q_2, \ldots$ to reduce the band width. The condition number is again $O(h^{-2})$; with G replaced by the identity, the eigenvalues are $\mu = -1$, twice, and the roots of $\mu^2 + \mu = \lambda$, where $\lambda$ is an eigenvalue of the usual fourth difference matrix $D^*D$. The errors in $M^h$ and $w^h$ will be of order $h^2$, and their derivatives will be accurate to order h. Of course the second derivative of $w^h$ will not equal $M^h$.
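The eigenvalue statement for the coupled matrix can be verified directly for a small N. This is a sketch with hypothetical names (`mixed_matrices`, `coupled`); it assumes hat functions $\varphi_0, \ldots, \varphi_{N+1}$ on a uniform mesh, with G their mass matrix and D holding the integrals of $\varphi_i'\varphi_j'$ for the interior nodes $j = 1, \ldots, N$. The spacing chosen is only illustrative.

```python
import numpy as np

def mixed_matrices(N, h):
    """Mass matrix G for phi_0..phi_{N+1} and the (N+2) x N matrix D."""
    G = (h / 6.0) * (4.0 * np.eye(N + 2)
                     + np.eye(N + 2, k=1) + np.eye(N + 2, k=-1))
    G[0, 0] = G[-1, -1] = 2.0 * h / 6.0        # half hats at the two ends
    D = np.zeros((N + 2, N))
    for j in range(1, N + 1):                  # one column per interior node
        D[j - 1 : j + 2, j - 1] = np.array([-1.0, 2.0, -1.0]) / h
    return G, D

def coupled(G, D):
    """Assemble the indefinite block matrix [[-G, D], [D^T, 0]]."""
    N = D.shape[1]
    return np.block([[-G, D], [D.T, np.zeros((N, N))]])

N = 8
h = np.pi / (N + 1)                            # assumed spacing, for illustration
G, D = mixed_matrices(N, h)

# With G replaced by the identity: mu = -1 twice, plus the roots of
# mu^2 + mu = lambda for each eigenvalue lambda of D^T D.
mu = np.linalg.eigvalsh(coupled(np.eye(N + 2), D))
lam = np.linalg.eigvalsh(D.T @ D)
roots = np.concatenate([(-1.0 + np.sqrt(1.0 + 4.0 * lam)) / 2.0,
                        (-1.0 - np.sqrt(1.0 + 4.0 * lam)) / 2.0])
predicted = np.sort(np.concatenate([[-1.0, -1.0], roots]))
spectrum_matches = np.allclose(mu, predicted)

D_small = mixed_matrices(2, 1.0)[1]            # the 4 x 2 case N = 2, with h = 1
```

The two eigenvalues at $-1$ come from the two-dimensional null space of $D^*$, which exists because M carries two more unknowns than w.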
For two-dimensional problems, Hellan and Herrmann have developed simple elements which are appropriate for the mixed method. Rather than catalogue these elements, however, we prefer to return to what is mathematically the main point: the difficulty of establishing convergence when the functional is indefinite. The problem is easiest to explain in terms of an infinite symmetric system $Lu = b$ of simultaneous linear equations. Suppose that $S^h$ is a finite-dimensional subspace and $P_h$ the symmetric projection operator onto $S^h$: $P_h v$ is the component of v in $S^h$. Then the Galerkin method (5) is identical with the problem of finding the approximate solution $u^h$ in $S^h$ which satisfies

$$P_h L P_h u^h = P_h b. \tag{10}$$

The operator $P_h L P_h$ of the discrete problem, which is just our stiffness matrix K, will be positive definite if L is; this is the Ritz case. In fact, $P_h L P_h$ will be even more positive than L itself; the minimum eigenvalue increases when the problem is constrained to a subspace. This is easy to see: If v is in the subspace $S^h$, then $(P_h L P_h v, v) = (L P_h v, P_h v) = (Lv, v)$. Suppose L is positive definite on the whole space: $(Lv, v) \ge \sigma (v, v)$ for all v. Then this positive definiteness is inherited by $P_h L P_h$ on the subspace $S^h$, and with no decrease in $\sigma$.
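A finite-dimensional illustration of this inheritance, as a sketch only: a random symmetric positive definite matrix stands in for L, and the subspaces are spanned by the first k coordinate directions, so that the projected operator is the leading k by k block. The matrix and its size are arbitrary choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((8, 8))
L = B @ B.T + 0.5 * np.eye(8)            # symmetric positive definite stand-in
sigma = np.linalg.eigvalsh(L)[0]         # (Lv, v) >= sigma (v, v) for all v

# On the span of the first k coordinates, the projected operator
# is the leading k x k block of L.
min_eigs = [np.linalg.eigvalsh(L[:k, :k])[0] for k in range(1, 9)]
inherited = all(m >= sigma - 1e-12 for m in min_eigs)
```

Every entry of `min_eigs` is at least `sigma`, and the last entry (k = 8, the whole space) equals it.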

If L is symmetric but indefinite, then the same is to be expected of the Galerkin operator $P_h L P_h$. (This assumes that the test space $V^h$ and trial space $S^h$ are the same. Otherwise, if $Q_h$ is the projection onto $V^h$, the Galerkin equation is $Q_h L P_h u^h = Q_h f$, and $Q_h L P_h$ is not even symmetric.) It is natural to hope that with increasing $S^h$, $u^h$ will approach u. This convergence is not automatic, however, and in searching for the right hypotheses, we are led back to the fundamental theorem of numerical analysis: Consistency and stability imply convergence.

THEOREM 2.2
Suppose that Galerkin's method is
(a) consistent: for every v, $\|v - P_h v\| \to 0$, and
(b) stable: the discrete operators are uniformly invertible, $\|(P_h L P_h)^{-1}\| \le C$.
Then the method converges: $\|u - u^h\| \to 0$.

Proof. Denote $(P_h L P_h)^{-1}$ by R. Since $Lu = f$, we have

$$P_h L P_h u + P_h L (u - P_h u) = P_h f,$$

or

$$P_h u + R P_h L (u - P_h u) = R P_h f.$$

Subtracting $u^h = R P_h f$,

$$P_h u - u^h = -R P_h L (u - P_h u),$$

and therefore

$$\|u - u^h\| \le \|u - P_h u\| + C \|L\| \, \|u - P_h u\| \to 0.$$
Note that the rate of convergence depends both on the stability constant C and on the approximation properties of $S^h$, exactly as in Theorem 2.1. (That was effectively a special case of the present theorem, with $C = 1/\sigma$ and $\|L\| = K$. A more general result is given by Babuska [B4].) We could extend the theory to show that consistency and stability are also necessary for convergence, and that the existence of the solution u need not be assumed; Browder and Petryshyn have shown how to deduce the invertibility of the original operator L.
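The estimate behind the theorem, $\|u - u^h\| \le (1 + C\|L\|)\,\|u - P_h u\|$, can be observed on a small indefinite system. This sketch assumes a random symmetric matrix in place of L and a coordinate subspace for $S^h$; the sizes `n` and `k` and the random seed are arbitrary choices, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 5
B = rng.standard_normal((n, n))
L = B + B.T                               # symmetric, generally indefinite
f = rng.standard_normal(n)
u = np.linalg.solve(L, f)                 # solution of L u = f

Lk = L[:k, :k]                            # P_h L P_h restricted to S^h
uh = np.zeros(n)
uh[:k] = np.linalg.solve(Lk, f[:k])       # Galerkin solution in S^h

Pu = np.zeros(n)
Pu[:k] = u[:k]                            # P_h u, the component of u in S^h

C = np.linalg.norm(np.linalg.inv(Lk), 2)  # stability constant for this subspace
error = np.linalg.norm(u - uh)
bound = (1.0 + C * np.linalg.norm(L, 2)) * np.linalg.norm(u - Pu)
```

The bound is only useful when C stays moderate as the subspace grows, which is exactly the stability hypothesis (b).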
We point out again the special role of positive definiteness: It makes stability automatic. That is why the Ritz method is so safe. In the indefinite case, suppose that $S^h$ is the subspace formed from the first N coordinate directions; $P_h v$ is given by the first N components of v. Then $P_h L P_h$ is the Nth principal minor of L (the submatrix in the upper left corner of L) and stability means that these $N \times N$ submatrices are uniformly invertible. It sounds as if this might follow from the invertibility of the whole matrix L, but it does not. The following invertible matrix

$$L = \begin{pmatrix} 0 & 1 & & & \\ 1 & 0 & & & \\ & & 0 & 1 & \\ & & 1 & 0 & \\ & & & & \ddots \end{pmatrix}$$

is a good example; for every odd N, the leading principal minor is singular. Its last row is zero, and for $f = (1, 1/2, 1/4, \ldots)$, the functional $(Lv, v) - 2(f, v)$ on the subspace $S^h$ will not have a saddle (stationary) point. (We thank C. McCarthy for his help with these questions.) Even in the 2 by 2 case, the indefinite quadratic $2xy$ breaks down completely on the subspace $x = 0$.
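The failure of the odd-order principal minors is easy to confirm on a truncation. The sketch assumes that the example matrix is block diagonal with 2 by 2 blocks $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, which is consistent with the properties stated in the text (invertible, zero last row in every odd leading block); the helper name is hypothetical.

```python
import numpy as np

def block_swap_matrix(n_blocks):
    """Block diagonal matrix with 2 x 2 blocks [[0, 1], [1, 0]] (assumed example)."""
    return np.kron(np.eye(n_blocks), np.array([[0.0, 1.0], [1.0, 0.0]]))

L = block_swap_matrix(4)                       # 8 x 8 truncation
ranks = {N: np.linalg.matrix_rank(L[:N, :N]) for N in range(1, 9)}
singular_sizes = [N for N, r in ranks.items() if r < N]

# Reordering the unknowns: odd-numbered ones first, then even-numbered ones.
perm = [0, 2, 4, 6, 1, 3, 5, 7]
reordered = L[np.ix_(perm, perm)]
```

Here `singular_sizes` lists the odd orders only, and `reordered` has zero diagonal blocks and identity off-diagonal blocks.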
Note that a reordering of the unknowns yields $L = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}$, which is very close to the coupled system of the mixed method example. We claimed convergence in that example, but this is actually justified only for a proper choice of the subspaces $S^h$. Apparently the finite element construction does yield such a choice. C. Johnson has just succeeded in proving convergence for two of the most important mixed elements, in which the moments are respectively constant and linear in each triangle. His proof establishes a special property, after the displacement unknowns have been eliminated and the moment unknowns are seen to be determined as the minimizing functions for a positive definite expression (the complementary energy). This property is exactly the familiar Ritz condition, that the trial moments in the discrete case are contained in the space of admissible moments (those in equilibrium with the prescribed load) in the full continuous problem. Therefore, as in the Ritz method, convergence depends on approximation theory and can be proved.†

Convergence for mixed and hybrid elements is becoming very much clearer, thanks to the work of Brezzi and others; the essential extra hypothesis is that of stability.
To achieve numerical stability in the mixed method, where the coefficient matrix is indefinite, pivoting (row and column exchanges) must be permitted during the Gauss elimination process. Computational results have borne out the predicted decrease in condition number and roundoff error, although Connor and Will have reported some unsatisfactory results with elements of high degree: the mixed method has enjoyed only mixed success. Nevertheless reduction of order is such a valuable property that the development of the idea must continue.

† We believe that for other mixed (and hybrid) elements, the natural proof is a direct verification of consistency (or approximability) and stability. Then Babuska's general theorem [6, B4] yields convergence. Brezzi has established stability for one hybrid element, and his technique needs to be generalized; these methods have recently been discussed by Miyoshi, Kikuchi, and Ando in Japan.
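The need for pivoting with an indefinite matrix shows up already in the 2 by 2 case. The routine below is a deliberately naive illustration (a hypothetical helper, not a production solver) of Gauss elimination with no exchanges. For comparison, `numpy.linalg.solve` calls a LAPACK routine that uses partial pivoting, that is, row exchanges only, which is enough for this small example.

```python
import numpy as np

def eliminate_without_pivoting(A, b):
    """Gauss elimination with no row exchanges; stops at a zero pivot."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        if A[k, k] == 0.0:
            raise ZeroDivisionError(f"zero pivot in column {k}")
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):            # back substitution
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

K = np.array([[0.0, 1.0], [1.0, 0.0]])        # indefinite, with a zero leading pivot
b = np.array([1.0, 2.0])

failed_without_pivoting = False
try:
    eliminate_without_pivoting(K, b)
except ZeroDivisionError:
    failed_without_pivoting = True

x_pivoted = np.linalg.solve(K, b)             # row exchanges allowed

# On a positive definite matrix the naive routine is safe.
x_definite = eliminate_without_pivoting(np.array([[2.0, 1.0], [1.0, 3.0]]),
                                        np.array([3.0, 5.0]))
```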
2.4. SYSTEMS OF EQUATIONS; SHELL PROBLEMS; VARIATIONS ON THE FINITE ELEMENT METHOD
It may be objected that our finite element theory deals always with a single unknown, while most applications involve a system of r equations for a vector unknown $u = (u_1, \ldots, u_r)$. Fortunately, the distinction is often not essential. The variational principle for a system again minimizes a quadratic $I(v)$, and the error estimates depend exactly as before on the approximation properties of $S^h$.

A typical example is furnished by two- or three-dimensional elasticity. The unknowns at each point are the displacements in the coordinate directions, and the finite element solution $u^h$ is again the trial function closest to the true solution in the sense of strain energy, which is now a quadratic function involving all the unknowns. The approximation theory in the next chapter will show that, independent of the number of dimensions and unknowns, the rate of convergence depends on the degree of the finite elements and on the approximation to the domain.
Some new and much more difficult problems arise for shells. The theory is normally constructed as a limiting case of three-dimensional elasticity, in which the domain $\Omega$ becomes very thin in one direction (normal to the surface of the shell). The result of the limiting process is to introduce second derivatives of the transverse displacement w into the potential-energy functional; the differential equations are of fourth order in w, and of second order in the in-plane displacements. This imbalance is the price to be paid for the reduction from a three-dimensional to a two-dimensional problem.
Let us note immediately the increasing popularity of three-dimensional elements, in which this reduction is not made. It is by no means automatic, when a limiting process simplifies the search for exact solutions to problems with special symmetries (as it certainly does in shell theory), that the same process will simplify the numerical solution of more general problems. (The same question arises with the Airy stress functions of plate bending; is it numerically sensible to reduce the number of unknowns and increase the order of the equations? We doubt it.) Obviously a thin shell will never present a typical three-dimensional problem; there will always be difficulties with a nearly degenerate domain. Not only the isoparametric technique but also special choices of nodal unknowns and of reduced integration formulas in the normal direction are being tried experimentally. From a theoretical viewpoint, it will become necessary to estimate the effect of a small thickness parameter t (Fried has done so, in respect of numerical stability and the condition number), but in general the approximation-theoretic approach can proceed in the usual way.
There are also a number of curved shell elements, in two independent variables, to which our analysis can be directly applied, even though they involve derivatives of different orders. These elements are constructed on the basis of a fully compatible shell theory: The surface of the shell is described parametrically by a set of three equations $x_i = x_i(\theta_1, \theta_2)$, and the independent variables in the shell equations are $\theta_1$ and $\theta_2$, which simply vary over a standard plane domain $\Omega$. The curvature of the shell appears only through the derivatives of the $x_i$, which enter as variable coefficients in the differential equations. Numerical integration will almost certainly be required for the evaluation of the element stiffness matrices, but the problem is in every essential respect like any other problem in the plane. Cowper and Lindberg have constructed a conforming and highly accurate element known as CURSHL [C12], by combining the $\mathcal{C}^1$ reduced quintics and $\mathcal{C}^0$ complete cubics of Section 1.9 for the normal and in-plane displacements, respectively. Since the difference $k - m$ equals 3 for all components, the rate of convergence in strain energy will be $h^6$.
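The exponent can be traced through a short count. This is a sketch resting on conventions assumed from the earlier theory: $k - 1$ is the degree of the complete polynomials contained in the element, $2m$ is the order of the differential equation for that component, and the error in strain energy is of order $h^{2(k-m)}$.

```latex
\begin{aligned}
\text{normal displacement (reduced quintics, complete through degree 4):}\quad & k = 5,\; m = 2,\; k - m = 3,\\
\text{in-plane displacements (complete cubics):}\quad & k = 4,\; m = 1,\; k - m = 3,\\
\text{strain energy error:}\quad & h^{2(k-m)} = h^{6}.
\end{aligned}
```

The two components are balanced: neither the fourth-order part nor the second-order part limits the rate more than the other.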
Such a construction, whose accuracy has been confirmed by numerical experiments, ought in theory to end the search for good shell elements, but it has not. For most applications and most programs, CURSHL appears to be quite complicated; it has not been widely accepted. On the one hand, there are many problems with special symmetries, in which cylindrical or shallow shell elements can be used [O4]. On the other hand, it is possible to approximate a general shell by an assemblage of flat pieces. Each of these pieces becomes effectively a simple plate element, and the bending and stretching deformations are coupled only by the fitting together of these plates. Some variant of this approach (which inevitably involves nonconforming elements from the viewpoint of pure shell theory) appears to be surviving as the simplest and most practical way to deal with complicated shells.
We would very much like to analyze the convergence or nonconvergence of these combinations of flat plate elements. It has been conjectured that under reasonable conditions the deformation of a polyhedral shell (such as a geodesic dome) approaches that of a genuinely curved shell, but we are unable to verify that conjecture. The mathematical problems are novel and extremely interesting.
In the one-dimensional case, when a curved arch is replaced by a polygonal frame, we can be more explicit. The strain energy for the arch is a sum of bending and extensional energies, and can be normalized to

$$a(v, v) = C \int \left[ \left( \frac{dv_1}{ds} + \frac{v_2}{r} \right)^2 + \frac{t^2}{12} \left( \frac{1}{r}\frac{dv_1}{ds} - \frac{d^2 v_2}{ds^2} \right)^2 \right] ds.$$

The radius of curvature is r, the thickness is t, and the admissibility conditions are that $v_1$ and $v_2$ lie in $\mathcal{H}^1$ and $\mathcal{H}^2$, respectively; the first derivative $dv_2/ds$ is effectively the rotation of the vector normal to the arch, and this is not permitted to have a jump discontinuity. The pair $v_1 = u$ and $v_2 = w$, which minimize the potential energy [$I(v) = a(v, v)$ + linear terms], are the tangential and normal displacements of the arch. Note that they are determined by differential equations, the Euler equations of $I(v)$, which are of different orders.
For an approximating frame, made up of straight lines, the radius of curvature becomes $r = \infty$. This uncouples the two components of v in the strain energy, and therefore leads to simple and independent differential equations for u and w. The coupling reappears, however, at the joints of the frame, in order to prevent the structure from coming apart in its deformed state. In other words, each trial function $v = (v_1, v_2)$, as $s = s_0$ is approached from the left and from the right, must move the joint to the same place and thereby leave the frame continuous; this represents two conditions relating both components of $v_-$ and $v_+$ to the joint angle $\theta$. (The conditions are essential, not natural.) Furthermore, there is an essential continuity condition on the rotation $dv_2/ds$, preserving the angle at each joint. (Note that $v_2$ itself need not be continuous; it will include $\delta$-functions at the joints, whose derivatives from left and right are both zero.)
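The uncoupling on a straight segment can be written out. The display is a sketch that assumes the normalized arch energy has the form $C \int [ (v_1' + v_2/r)^2 + \tfrac{t^2}{12} ( v_1'/r - v_2'' )^2 ]\,ds$, with primes denoting $d/ds$; letting $1/r \to 0$ gives

```latex
a(v, v) \;=\; C \int \left[ \left( \frac{dv_1}{ds} \right)^{2}
      \;+\; \frac{t^{2}}{12} \left( \frac{d^{2} v_2}{ds^{2}} \right)^{2} \right] ds .
```

The first term is the stretching energy of a bar, with a second-order Euler equation for $v_1$, and the second is the bending energy of a beam, with a fourth-order Euler equation for $v_2$. On each straight piece the two displacements interact only through the joint conditions.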
With finite elements these conditions constrain the polynomials meeting at the nodes. Typically $v_2$ is a modified Hermite cubic: the continuity of the rotation is left unchanged, but discontinuities in function value are allowed, and related to those in $v_1$. Obviously such trial functions are inadmissible in the arch problem, but since the strain energy is also modified by the removal of r, the convergence question remains open. Direct evidence against convergence was provided by Walz, Fulton, and Cyrus (Wright-Patterson Conference II) for the case of a circular arch and a regular polygon. The finite element equations are simply difference equations, and they were found to be consistent with the wrong differential equation. The leading terms were correct (the radius of curvature reappears through the agency of the angle $\theta$ in the frame continuity conditions) but there were, for the particular element chosen, also unwanted terms of zero order in h.† This suggests the possibility of a convergence condition resembling the patch test of Section 4.2.
† These unwanted terms are, however, comparable to others which appear in relating the arch equations to a full two-dimensional elasticity theory. Thus the leading terms in all theories, including the frame approximation, may coincide, and the same may apply to shells.

For a shell the problems are inevitably more complex. Even with a polyhedron built from flat plates, the continuity requirements are difficult to impose on polynomial elements; continuity of the derivatives is always difficult to achieve with anything less than quintics. One useful but mathematically cloudy approximation has been the discrete Kirchhoff hypothesis: The rotations (the rate of change of normal displacement in the two coordinate directions on the shell) are taken to be independent unknowns, and their true relationship to the normal displacement is imposed only as a constraint at the nodes, not over the entire surface. Approximations of this kind will continue to be practical and valuable; there are too many kinds of shell problems, and too many alternative elements, to decide that a single approach is best. Even conforming elements such as CURSHL have had their detractors, on the ground that rigid body motions are not reproduced exactly. The argument is that a tall structure leaning in the wind might have such large motions, and as a consequence such large finite element errors, as to obscure the internal bending, which is of prime importance. We are inclined to trust in the accuracy, and to be concerned rather about the convenience, of high-degree elements.

Obviously shells have provided finite elements with a very severe test, of all linear problems perhaps the most severe. The same is true of the mathematical analysis. But these are problems which were virtually inaccessible by earlier techniques, and we believe they are now on their way to being understood.
In addition to systems of equations, there is another major omission in the finite element analysis described so far; we have concentrated only on the displacement method, in which an expression $\sum q_j \varphi_j$ is assumed for the displacement, and the optimal coefficients $Q_j$ are determined by the Ritz-Galerkin principle. The numerical effectiveness of this method is well established, and this is largely a period of perfecting the elements and automating the input and output; rapid developments in computer graphics have been required to make the output intelligible. There remain, however, some imperfections which can be removed only by modifying the mathematical method itself. In the remainder of this section we want to describe some of the proposed changes, still remaining in the framework of the method of weighted residuals. This means that the approximate solution will still be a combination of trial functions, but the rule for choosing among the possible coefficients $q_j$ may be different, or the unknown itself may be altered.
lbling the patch test of Section 4.2. The most important variant is theforce method, in which the strains-the
derivatives of u, rather than u itself-are taken as unknowns. In many prob-
Jl~ more ~omplex. Even with a poly-
tmty reqUirements are difficult to im- lems these are the quantities of greatest importance, and it is natural to
is
t~ of the derivatives always difficult
approximate them directly. The result is quite different from the mixed meth-
od, in which both displacements and strains are unknowns, and the functional
ttcs. O~e useful but mathematically
crete Ktrchhoff hypothesis: The rota-
is indefinite. Here there is a complementary energy functional to be inini-
Iacement in the two coordina te direc- mized, and apart from a change in the admissibility conditions-the trial
functions ha ve new interelement and boundary constraints-the mathematics
is the same. The order of approximation achieved by the trial subspace is
par~b.Ie to others which appear in relating still decisive.
:lastJcity t?e~ry. Thus the leading terms in
1, may comc1de-and the same may apply
As an example we take Laplace's equation
Jiu o in n, u= g on r,
.,
in which $u$ is determined by minimizing $I(v) = \iint (v_x^2 + v_y^2)$. In the
complementary energy principle these derivatives $v_x = \epsilon_1$ and
$v_y = \epsilon_2$ are taken as the primary dependent variables. The pair
$(\epsilon_1, \epsilon_2)$ is no longer constrained to be the gradient of some
$v$; in other words the cross-derivative identity
$(\epsilon_1)_y = (\epsilon_2)_x$, or compatibility condition, is no longer
imposed. (It follows, of course, that when approximations $\epsilon_1^h$ and
$\epsilon_2^h$ are determined, there is no unique way of integrating to find a
corresponding $u^h$. In the exact application of the complementary principle
the optimal $\epsilon_1$ and $\epsilon_2$ will be the derivatives of the true
displacement $u$, but in the discrete approximation this link between
gradient and displacement is lost.)

In the Laplace example, the trial functions for $\epsilon_1$ and $\epsilon_2$
need only to lie in $\mathcal{H}^0$ for the quadratic functional to be finite.
There is a constraint of equilibrium, however, provided by the differential
equation itself: $u_{xx} + u_{yy} = 0$ leads to

(11)    $(\epsilon_1)_x + (\epsilon_2)_y = 0.$

This actually means that the pair $(\epsilon_2, -\epsilon_1)$ will be the
gradient of some function $w$: $w_x = \epsilon_2$ and $w_y = -\epsilon_1$. We
may interpret the complementary process in this special example as follows:
Instead of looking for the harmonic function $u$, we are computing the
associated stream function $w$. The function $w$ is conjugate to $u$, meaning
that $u + iw$ is analytic.
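The relation between the equilibrium constraint (11) and the existence of a potential $w$ can be checked numerically on a single example. The sketch below is illustrative only: the harmonic polynomial $u = x^3 - 3xy^2$, the candidate $w = y^3 - 3x^2 y$, the helper names, and the sample points are all assumptions made for this check. It uses the sign convention of the text ($w_x = \epsilon_2$, $w_y = -\epsilon_1$) and central differences in place of exact derivatives.

```python
# Assumed example: u = x^3 - 3*x*y^2 is harmonic, since u_xx + u_yy = 6x - 6x = 0.

def eps1(x, y):
    # eps1 = u_x
    return 3.0 * x**2 - 3.0 * y**2

def eps2(x, y):
    # eps2 = u_y
    return -6.0 * x * y

def w(x, y):
    # candidate potential with w_x = eps2 and w_y = -eps1
    return y**3 - 3.0 * x**2 * y

def ddx(f, x, y, d=1e-5):
    # central difference in x
    return (f(x + d, y) - f(x - d, y)) / (2.0 * d)

def ddy(f, x, y, d=1e-5):
    # central difference in y
    return (f(x, y + d) - f(x, y - d)) / (2.0 * d)

def equilibrium_residual(x, y):
    # left side of (11): (eps1)_x + (eps2)_y, which should vanish
    return ddx(eps1, x, y) + ddy(eps2, x, y)

def potential_mismatch(x, y):
    # how far grad w is from the rotated pair (eps2, -eps1)
    return max(abs(ddx(w, x, y) - eps2(x, y)),
               abs(ddy(w, x, y) + eps1(x, y)))

SAMPLE_POINTS = [(0.3, -0.7), (1.2, 0.4), (-0.5, 0.9)]

if __name__ == "__main__":
    for (x, y) in SAMPLE_POINTS:
        print(x, y, equilibrium_residual(x, y), potential_mismatch(x, y))
```

Both residuals are at the level of finite-difference rounding error for this example; a field that violated (11) would not admit such a potential.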
This leaves the two principles very nearly parallel in the interior of
$\Omega$. On the boundary, however, there is a marked distinction; the
Dirichlet condition on $u$ is exchanged for a Neumann condition on $w$. To see
this, recall that along any curve, $u$ and its stream function are related by
$w_n = -u_s$. (This follows from the Cauchy-Riemann equation $w_x = -u_y$ when
the curve is vertical, from $w_y = u_x$ when it is horizontal, and from a
linear combination in the general case.) Therefore, on $\Gamma$ the essential
condition $u = g$ is replaced by $w_n = -g_s$. As in any inhomogeneous Neumann
problem, this means that the admissible space is not constrained at the
boundary, but the functional $I$ is changed to

    $I'(w) = \iint (w_x^2 + w_y^2)\,dx\,dy + 2\int_\Gamma g_s\,w\,ds.$

Integrating the last term by parts to obtain $-2\int_\Gamma g\,w_s\,ds$, $w$
can be eliminated in favor of its derivatives $w_x = \epsilon_2$ and
$w_y = -\epsilon_1$. The complementary principle now asserts that the pair
$(\epsilon_1, \epsilon_2)$, which minimizes $I'$, is the gradient of the
displacement $u$ which minimizes $I$.

In finite element approximation the trial functions for $w$ must lie in
$\mathcal{H}^1$, in order for $I'$ to be finite. What does this imply about the
variables $\epsilon_1$ and $\epsilon_2$? First, since the trial functions for
$w$ are continuous across interelement boundaries, so are their derivatives in
the direction along the edge. Therefore,
the tangential component of the gradient $(\epsilon_2, -\epsilon_1)$, which is
the normal component of $(\epsilon_1, \epsilon_2)$, must be continuous. It is
this constraint which imposes the greatest difficulty in approximating the
complementary principle. The separate functions $\epsilon_1$ and $\epsilon_2$
need not be continuous, but the normal component of the vector
$(\epsilon_1, \epsilon_2)$ must be.

It is important to establish this constraint directly from the equilibrium
condition (11), without arguing by way of a stream function, since in a more
general problem the idea of a stream function is not so relevant. First we
have to attach some meaning to equation (11),
$(\epsilon_1)_x + (\epsilon_2)_y = 0$, when $\epsilon_1$ and $\epsilon_2$ lie
only in $\mathcal{H}^0$. They may have no derivatives, and therefore the
condition must be understood in a weak sense; we multiply by a smooth function
$z$ which vanishes on $\Gamma$, and integrate by parts. The proper constraint
is then given by this weak form of equation (11):

    $\iint \left( \epsilon_1 \frac{\partial z}{\partial x} + \epsilon_2 \frac{\partial z}{\partial y} \right) dx\,dy = 0$  for all such $z$.

For a trial space of piecewise polynomials, Green's theorem can be applied
to this form of the constraint, taking it one element at a time. Then inside
each element the original condition (11) must hold, and on each interelement
boundary, there must be a cancellation of the boundary integrals or tractions,
arising from the two adjacent regions. It is precisely this cancellation which
requires the continuity of the normal component of $(\epsilon_1, \epsilon_2)$.

The literature on the complementary method apparently begins with
Trefftz [T9], who established that it yields a lower bound to the energy
integral. Friedrichs [F19] then discovered that as in the Laplace example, the
underlying idea was exactly the Ritz method applied to a conjugate problem,
and that boundary conditions which are essential in one problem become
natural in its conjugate. The orthogonality of the admissible spaces for the
two problems was developed by Synge [17] into the hypercircle method, after
a fundamental paper with Prager [P9]. A combination of the direct and
conjugate principles leads to upper as well as lower bounds for the strain
energy, and also for the displacement $u$; an abstract account is given by
Aubin and Burchard [A10], and the idea is applied to some model problems by
Weinberger [W2]. It has been programmed by Mote and Young.

The finite element method was first applied to the stresses, that is, to the
complementary principle, by de Veubeke [F11]. By now there is an enormous
literature.†

†The analysis of the force method proceeds from the same principles, and in
fact the same approximation theorems, that are applied in this book to the
displacement method. There is not space to develop this whole theory in
parallel.

Always the interelement conditions present practical difficulties, and this
has led to the construction of multiplier and hybrid methods. For example,
Anderheggen [A3], in applying to a fourth-order plate problem the standard
Ritz method of minimizing the potential energy functional $I(v)$, proposes
to use cubic polynomials for which the normal slope will ordinarily be
discontinuous between elements. Imposing the constraint of slope continuity
associates with each element edge a Lagrange multiplier and alters the Ritz
method into a constrained minimization. The required programming changes
are very simple. The stiffness matrix becomes indefinite, however, and
(because of the edge unknowns) the computation time for cubics seems to be
comparable to the usual stiffness method for the reduced quintics proposed
in Section 1.9.

A second interesting modification is the hybrid method pioneered by Pian
and Tong [P4, P5]. This copes with the interelement difficulties in an
ingenious way, by constructing a family of approximations both to the stress
field within each element and to the displacement on the element boundaries.
The stress fields satisfy the differential equation within each element (so
that the homogeneous case $f = 0$ is by far the simplest) and the
displacements, which are given by an independent set of piecewise polynomials,
are continuous. For each displacement pattern the complementary energy is
first minimized within the elements separately. This leads to a family of
displacements $v^h$, defined now not only on the element boundaries but
throughout the region $\Omega$, to which the Ritz method can be applied; the
final hybrid approximation is found by minimizing over these $v^h$. The energy
generally lies between the lower bound provided by the pure Ritz method and
the upper bound of its conjugate, and in many cases will yield a substantially
better approximation than either.

We return once more to the basic Ritz method and reconsider the following
question: Can the variational principle be altered in such a way that
essential boundary conditions need not be imposed? The complementary
principle is one possible answer, but there are others. In fact, there is now
a standard device for dealing with unsatisfied constraints: to insert a
penalty function into the expression to be minimized. (This was the main theme
of Courant's remarkable lecture [C11]; the finite element method was an
afterthought!) For $-\Delta u = f$, with $u = g$ on $\Gamma$, $I(v)$ is altered
to

    $I_h(v) = \iint_\Omega (v_x^2 + v_y^2 - 2fv) + C_h \int_\Gamma (v - g)^2\,ds.$

The exact minimum over the unconstrained admissible space $\mathcal{H}^1$ is
achieved by the function $U^h$, which satisfies

    $-\Delta U^h = f$ in $\Omega$,  $C_h^{-1} \frac{\partial U^h}{\partial n} + U^h = g$ on $\Gamma$.

Therefore, if $C_h \to \infty$, these solutions $U^h$ converge to the solution
$u$, which equals $g$ on $\Gamma$.
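A one-dimensional model makes the effect of the penalty constant concrete. The sketch below is an illustration under stated assumptions, not the computation in the text: it takes $-u'' = 0$ on $(0, 1)$ with boundary values $a$ and $b$, so that the penalized minimizer is itself linear and can be found from a 2-by-2 system. The function names and the sample data are hypothetical.

```python
def penalized_minimizer(a, b, C):
    """Minimize I_C(v) = int_0^1 v'(x)^2 dx + C*[(v(0)-a)^2 + (v(1)-b)^2]
    over linear functions v(x) = alpha + beta*x.

    With v linear, I_C = beta^2 + C*[(alpha-a)^2 + (alpha+beta-b)^2], and the
    stationarity conditions are
        d/d alpha:  2*alpha + beta = a + b
        d/d beta :  C*alpha + (1 + C)*beta = C*b
    """
    beta = C * (b - a) / (2.0 + C)
    alpha = 0.5 * (a + b - beta)
    return alpha, beta

def boundary_error(a, b, C):
    # Largest violation of the essential conditions U(0) = a, U(1) = b.
    alpha, beta = penalized_minimizer(a, b, C)
    return max(abs(alpha - a), abs(alpha + beta - b))

def robin_residuals(a, b, C):
    # Residuals of C^{-1} dU/dn + U = g at x = 0 (outward normal -x) and x = 1.
    alpha, beta = penalized_minimizer(a, b, C)
    left = (-beta) / C + alpha - a
    right = beta / C + (alpha + beta) - b
    return left, right

if __name__ == "__main__":
    for C in (1.0, 1e2, 1e4, 1e6):
        print(C, boundary_error(1.0, 3.0, C))
```

For this model the boundary error works out to $|b - a| / (2 + C)$, so it decays like $1/C$; the penalized solution meets the Robin-type condition exactly and approaches the Dirichlet data only in the limit.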
The Ritz method now minimizes the functional $I_h$, including the penalty
term, over a space $S^h$ with no boundary constraints. We suppose it to be a
piecewise polynomial space, containing the complete polynomial of degree
$k - 1$. Then there will be a balance between the error $u - U^h$, due to the
unsatisfied boundary condition and the penalty, and the error $U^h - u^h$
involved in minimizing only over a subspace. This balance led Babuska [B5]
to determine an optimal dependence of $C_h$ on $h$: $C_h = c\,h^{1-k}$.

Babuska has also given rigorous error estimates for the related method of
Lagrange multipliers, in which again the trial functions are unrestricted at
the boundary. For Poisson's equation, this method searches for a stationary
point $(u(x, y), \lambda(s))$ of the indefinite functional

    $F(v, \Lambda) = \iint_\Omega (v_x^2 + v_y^2 - 2fv)\,dx\,dy - 2\int_\Gamma \Lambda (v - g)\,ds.$

The Lagrange multiplier runs over all admissible functions defined on
$\Gamma$, and at the true stationary point it is related to the solution by
$\lambda = \partial u / \partial n$.† The error in the stationary point
$(u^h, \lambda^h)$ over a finite element subspace can be estimated [B6].

†We believe this method to be a promising variant of the standard displacement
method, because it attacks directly the problem of computing the solution (or
rather its normal derivative) at the boundary. Often it is exactly this
information which is desired, and to find it from an approximate solution in
the interior is numerically unstable and not entirely satisfactory.

The essential boundary conditions can also be made to fall away by a
quite different approach, one with a long history: the method of least squares.
Instead of minimizing $I(v) = (Lv, v) - 2(f, v)$, this method minimizes the
residual $\|Lw - f\|^2$. Ignoring boundary conditions for a moment, the
underlying functional is

    $\|Lw - f\|^2 = (Lw, Lw) - 2(f, Lw) + (f, f)$
    $\qquad\qquad\;\, = (L^*Lw, w) - 2(L^*f, w) + (f, f).$

The last term is independent of $w$, and therefore the least-squares method
actually minimizes $I' = (L^*Lw, w) - 2(L^*f, w)$, which is exactly the Ritz
functional for the problem $L^*Lu = L^*f$. This new problem is automatically
self-adjoint, but we note that the order of the equation has doubled.

With boundary conditions, the least-squares method was properly analyzed
for the first time in the recent work of Bramble and Schatz [B30, B31].
Taking as an example the Dirichlet problem $-\Delta u = f$ in $\Omega$,
$u = g$ on $\Gamma$, they introduce the functional

(12)    $I''(w) = \iint_\Omega (\Delta w + f)^2\,dx\,dy + c\,h^{-3} \int_\Gamma (w - g)^2\,ds.$
The factor $h^{-3}$ is not related to the degree $k$ of the subspace $S^h$, as
it was in the penalty method. Instead, it achieves a natural balance between
the dimensions of the interior and boundary terms. The minimization of $I''$
becomes now a problem of simultaneous approximation in $\Omega$ and on the
lower-dimensional manifold $\Gamma$.

It is reasonable to expect piecewise polynomial spaces to give the same
degree of approximation on $\Gamma$ as in $\Omega$. If $\Gamma$ were straight,
to take the simplest case, then a complete polynomial in $x_1, \ldots, x_n$
reduces on $\Gamma$ to a complete polynomial in the boundary variables
$s_1, \ldots, s_{n-1}$. With a curved boundary we believe the approximation
theory to be effectively the same; $u$ differs from its interpolate on
$\Gamma$ by $O(h^{k-s})$ in the $s$th derivative. However, given only that
approximation of degree $k$ can be achieved in $\Omega$, simultaneous
approximation
with coefficient $h^{-3}$ is a much more subtle problem; its solution by
Bramble and Schatz showed that the balance of powers in (12) is exactly right.
Provided that $k \ge 4m$, their error estimate for the least-squares solution
$u^h_{LS}$ is the optimal

(13)    $\| u - u^h_{LS} \|_0 \le C h^k \| u \|_k.$
The practical difficulties in the least-squares method are exactly those
associated with an increase in the order of the differential equation from
$2m$ to $4m$. The trial functions must lie in $\mathcal{H}^{2m}$ in order for
the new functional to be finite. This means that the admissibility condition
is continuity of all derivatives through order $2m - 1$, which is difficult to
achieve. Furthermore, the bandwidth of $K$ is increased and its condition
number is essentially squared, going from $O(h^{-2m})$ to $O(h^{-4m})$.†
Therefore, the numerical solution must inevitably proceed more slowly. The
rate of convergence in $\mathcal{H}^s$ will normally be $h^p$, where $p$ is the
smaller of $k - s$ and $2k - 4m$.
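The squaring of the condition number can be seen in the simplest discrete analogue. The sketch below is a stand-in under explicit assumptions, not the finite element matrix $K$ of the text: it uses the second-difference matrix for $-u''$ on a uniform mesh (so $m = 1$), whose eigenvalues are known in closed form, and compares it with the normal-equations matrix $K^T K = K^2$ that a least-squares formulation would produce. The function names are hypothetical.

```python
import math

def second_difference_eigenvalues(n):
    """Eigenvalues of the n-by-n matrix tridiag(-1, 2, -1) / h^2 with
    h = 1/(n+1): lambda_j = (4/h^2) * sin^2(j*pi*h/2), j = 1..n."""
    h = 1.0 / (n + 1)
    return [4.0 / h**2 * math.sin(j * math.pi * h / 2.0) ** 2
            for j in range(1, n + 1)]

def condition_numbers(n):
    """Return (cond(K), cond(K^T K)) for the symmetric positive definite K
    above; K^T K = K^2 has the squared eigenvalues."""
    lam = second_difference_eigenvalues(n)
    cond_k = max(lam) / min(lam)
    squared = [value * value for value in lam]
    cond_normal = max(squared) / min(squared)
    return cond_k, cond_normal

if __name__ == "__main__":
    for n in (15, 31, 63):
        cond_k, cond_normal = condition_numbers(n)
        print(f"h = 1/{n + 1}: cond(K) = {cond_k:.1f}, cond(K^T K) = {cond_normal:.1f}")
```

Halving $h$ multiplies the first condition number by about 4 and the second by about 16, the $h^{-2}$ and $h^{-4}$ growth rates that correspond to $m = 1$.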
†This objection has been met by a modification due to Bramble and Nitsche.

Finally, we mention three of the techniques recently invented for problems
which have homogeneous differential equations, say $\Delta u = 0$, and
inhomogeneous boundary conditions. There are a number of important
applications in which $u$ and $\partial u / \partial n$ are wanted only on the
boundary, and it seems rather inefficient to compute the solution everywhere
in $\Omega$.

One possibility is to find a family of exact solutions
$\varphi_1, \ldots, \varphi_N$ to the differential equation, and choose the
combination $\sum Q_j \varphi_j$ which satisfies the boundary conditions as
closely as possible. This means that some expression for the boundary error
is to be minimized. With least squares this leads to a linear equation for the
$Q_j$; Fox, Henrici, and Moler [F10] had remarkable success by using instead a
minimax principle, minimizing the error over a discrete set of boundary
points. If the $\varphi_j$ are eigenfunctions of the differential operator,
there may be enormous simplifications at some points;
again the serious difficulties are at the boundary. We refer to Mikhlin's books
[11-13] for a clear and detailed discussion of this more classical circle of
ideas.

A second way to use exact information about the homogeneous equation
is to introduce its Green's function and transform the problem into an
integral equation over the boundary $\Gamma$. In some initial trials the
approximate solution on $\Gamma$ has been taken as a piecewise polynomial and
its coefficients determined by collocation. And a third idea, suggested by
Mote, is to use the old-fashioned global trial functions in the interior of
$\Omega$, with finite elements at the boundary. This makes sense, because the
smoothness of $u$ in the interior permits good approximation by comparatively
few global functions, and the hard work is done in a "boundary layer". There
appears to exist almost no theoretical analysis of this idea, but its time
will come.
3

APPROXIMATION

3.1. POINTWISE APPROXIMATION

This section begins our discussion of the question which lies mathematically
at the center of the whole finite element analysis: approximation by the
spaces $S^h$. We start with approximation in a pointwise sense, where the
special role of polynomials will be easy to recognize. Then in the remainder
of the chapter, the same pattern is extended to the spaces
$\mathcal{H}^s(\Omega)$, in other words, to the approximations in energy norms
on which the Ritz-Galerkin-finite element method is based.

Suppose to start with that we are given a smooth function $u = u(x)$,
defined at every point $x = (x_1, \ldots, x_n)$ in the $n$-dimensional domain
$\Omega$. Suppose also that $S$ is a nodal finite element space, spanned by
$\varphi_1(x), \ldots, \varphi_N(x)$. As in Section 2.1, this means that to
each $\varphi_j$ there is associated a node $z_j$ and a derivative $D_j$ such
that

(1)    $D_i \varphi_j(z_i) = \delta_{ij}.$

Suppose, finally, that the space has degree $k - 1$: Every polynomial in
$x_1, \ldots, x_n$ of total degree less than $k$ is a combination of the basis
functions $\varphi_j$, and therefore lies in $S$. (The total degree of
$x_1 x_2$, for example, is 2; its presence is required in a space of degree
$k - 1 = 2$, but not in a space of first degree even though it is linear in
$x_1$ and $x_2$ separately.) If $P(x)$ is such a polynomial, and the linear
combination which produces it is $P = \sum p_j \varphi_j$, then by the
interpolation property (1) the weights are just the nodal values of the
polynomial:

    $p_j = D_j P(z_j).$

The nodal derivatives $D_j$ should be of order $< k$.
The degree of $S$ is usually trivial to compute. For linear or bilinear
approximation the degree is one (in other words, $k = 2$), for cubics and
bicubics $k = 4$, for the reduced quintic $k$ is only 5, and so on.

From these hypotheses we want to deduce the order of approximation
achieved by $S$. Recalling that the domain $\Omega$ is partitioned into
elements $e_1, e_2, \ldots$, we shall use the distances

    $h_i$ = diameter of $e_i$

in measuring the accuracy of approximation. We have in mind a sequence
of nodal subspaces $S^h$, parameterized by $h$, and we hope to establish that
the approximation error decreases like a power of $h$. This will require a
uniformity assumption on the functions $\varphi_j^h$, which can be expressed as
follows: The basis functions $\varphi_j^h$ are uniform to order $q$ provided
there exist constants $c_s$ such that for all $h$, $i$, and $j$,

(2)    $\max_{x \in e_i,\ |\alpha| = s} |D^\alpha \varphi_j^h(x)| \le c_s\, h_i^{|D_j| - s}.$

This condition is imposed on all derivatives
$D^\alpha = \partial^{|\alpha|} / \partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}$
up to order $q$; that is, it holds for all $\alpha$ such that
$|\alpha| = \alpha_1 + \cdots + \alpha_n \le q$. Recall that $|D_j|$ is the
order of the derivative which is interpolated by the basis function
$\varphi_j^h$ at its node $z_j^h$; $|D_j| = 0$ if $\varphi_j^h$ is associated
with a function value at a vertex, $|D_j| = 1$ if it is associated with $v_x$
or $v_y$ or $v_n$, and so on.

A good example in one dimension is furnished by the Hermite cubic
polynomials ($k = 4$), determined over each interval by their values and their
derivatives at the endpoints. On a unit interval the two basis functions
$\psi$ and $\omega$ are displayed in Fig. 1.8. Over the reduced interval
$[0, h]$ these become $\psi^h = \psi(x/h)$ and $\omega^h = h\,\omega(x/h)$,
respectively; one sees the extra factor $h$ appearing in $\omega^h$, in order
to keep its slope equal to 1 at the origin. This is the reason for the term
$h_i^{|D_j|}$ in (2), to make the inequality dimensionally correct. These basis
functions are uniform up to $q = 2$, but for third derivatives a
$\delta$-function enters at the nodes and uniformity breaks down. This is
typical of finite elements; the $q$th derivatives involve step functions, which
we accept in (2), but higher derivatives are disallowed. Thus $q$ is the
parameter associated with the smoothness of the subspace; $S^h$ is contained
in the space $\mathcal{C}^{q-1}$ of functions with $q - 1$ continuous
derivatives, and more importantly, it is contained in the space
$\mathcal{H}^q$ of functions with $q$ derivatives in the mean-square sense.
The conforming condition, in order to apply the Ritz method to a differential
equation of order $2m$, is simply $q \ge m$.
ue JUSt the nodal values of the poly- The uniformity condition becomes especially significant in two or more
dimensions. Normally it can be translated directly into a géometrical con-
P(zi). straint on the elemental regions e¡. First one considers a standard element,
say the right triangle T with vertices at (0, 0), (1, 0), and (O, 1). With respect
Jrder <k. to T the basis functions and their derivatives up to order q shou]d be
¿
bounded. Then a change of coordinates, taking $T$ into the given triangle
$e_i$, and involving the Jacobian, leads to the following result: The basis is
uniform provided that as $h \to 0$, all angles in the triangulation exceed some
lower bound $\theta_0$. In this case it is not difficult to find constants
$c_s$ bounded by

    $\dfrac{\text{constant}}{(\sin \theta_0)^s}.$

We emphasize that the effect of the element geometries on approximation
is entirely contained in this estimate. For quadrilaterals the angles must also
be bounded away from 180°, to avoid degeneration into triangles.
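For linear elements the angle condition can be tested directly, since the gradient of each pyramid function is the reciprocal of the corresponding altitude. The sketch below is an illustration under stated assumptions: the bound with constant 2 for $s = 1$ is an elementary estimate worked out for this example rather than a constant quoted from the text, and the family `thin_triangle` is an assumed isosceles triangle with base $h$ on the $y$-axis and base angles $\theta$, chosen to resemble a thin element. All names are hypothetical.

```python
import math

def side_lengths(tri):
    # Side j is opposite vertex j.
    a = math.dist(tri[1], tri[2])
    b = math.dist(tri[0], tri[2])
    c = math.dist(tri[0], tri[1])
    return a, b, c

def twice_area(tri):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

def hat_gradient_norms(tri):
    # |grad phi_j| = 1 / altitude_j = (opposite side) / (2 * area)
    area2 = twice_area(tri)
    return [s / area2 for s in side_lengths(tri)]

def min_angle(tri):
    a, b, c = side_lengths(tri)

    def angle(opposite, s1, s2):
        cos_value = (s1 * s1 + s2 * s2 - opposite * opposite) / (2.0 * s1 * s2)
        return math.acos(max(-1.0, min(1.0, cos_value)))

    return min(angle(a, b, c), angle(b, a, c), angle(c, a, b))

def scaled_gradient(tri):
    # h * max_j |grad phi_j|, with h the diameter (longest side) of the triangle
    return max(side_lengths(tri)) * max(hat_gradient_norms(tri))

def thin_triangle(h, theta):
    # Assumed geometry: base of length h on the y-axis, base angles theta.
    return [(0.0, h / 2.0), (0.0, -h / 2.0), (0.5 * h * math.tan(theta), 0.0)]

if __name__ == "__main__":
    for degrees in (40, 20, 5, 1):
        tri = thin_triangle(0.1, math.radians(degrees))
        print(degrees, scaled_gradient(tri), 2.0 / math.sin(min_angle(tri)))
```

In this family the scaled gradient equals $2 / \tan\theta$, which stays below $2 / \sin\theta_0$ but grows without bound as the smallest angle shrinks.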

We should note a situation in which this bound on $c_s$ is not valid. It will certainly fail if the polynomial is not uniquely determined within the element by the nodal parameters, that is, if the matrix $H$ in Section 1.10, which links the polynomial coefficients $a_i$ to the nodal parameters $q_j$, becomes unbounded for some element configuration. Such singular cases, once recognized, have been avoided in the literature. The danger is greatest when $e_i$ is originally curved, and is mapped into a polygon in order to use one of the standard finite element constructions. If the Jacobian of this mapping is allowed to vanish (see Section 3.3, for example), the mathematical properties of the construction are lost.

[Fig. 3.1 Linear shape function in a thin triangle. Labeled points: $z_j = ((h/2)\tan\theta,\ 0)$ on the x-axis, the vertex $(0, -h/2)$, and the point $P$; axes $x$ and $y$.]

The geometrical condition is easy to understand for linear functions on triangles. The pyramid $\varphi_j$ equals 1 at the $j$th vertex and zero at the others. In between one always has $|\varphi_j(x)| \le 1$, so that uniformity holds for the zeroth
derivative: $c_0 = 1$. Consider, however, the x-derivative in Fig. 3.1. Since $\varphi_j$ equals 1 at $z_j$ and zero at $P$, its slope between these points is

$$\frac{\partial \varphi_j}{\partial x} = \frac{1}{(h/2)\tan\theta} = \frac{2}{h\tan\theta}.$$

Comparing this with the uniformity condition (2), $c_1$ will be at least $2/\tan\theta$. Thus, if the triangles are allowed to degenerate into thin "needles," the basis will not be uniform.

On such a triangle the linear interpolate can lead to significant errors. Consider, for example, the function $u(x, y) = y^2$. The interpolate $u_I$ vanishes at $z_j$ and equals $h^2/4$ at the other two nodes (and therefore at $P$). Consequently the derivative along the x-axis, which should vanish since $u = y^2$ is independent of $x$, may be seriously in error for the interpolate:

$$\frac{\partial}{\partial x}(u - u_I) = \frac{h^2/4}{(h/2)\tan\theta} = \frac{h}{2\tan\theta}.$$

This derivative error is $O(h)$ only if the angle $\theta$ is bounded below. We note that $u - u_I = O(h^2)$ regardless of $\theta$; the difficulty comes from the dimensionless factor "$h_{\max}/h_{\min}$" introduced by the derivative.
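
The thin-triangle example can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes the third vertex of the triangle in Fig. 3.1 is $(0, h/2)$, the mirror image of the labeled vertex $(0, -h/2)$, and the function names `linear_gradient` and `x_derivative_error` are ours.

```python
import math

def linear_gradient(vertices, values):
    """Gradient (gx, gy) of the linear function that takes `values` at `vertices`."""
    (x1, y1), (x2, y2), (x3, y3) = vertices
    u1, u2, u3 = values
    # Solve the 2x2 system for the gradient by Cramer's rule.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    gx = ((u2 - u1) * (y3 - y1) - (u3 - u1) * (y2 - y1)) / det
    gy = ((x2 - x1) * (u3 - u1) - (x3 - x1) * (u2 - u1)) / det
    return gx, gy

def x_derivative_error(h, theta):
    """d/dx (u - u_I) for u = y**2 on the thin triangle of Fig. 3.1."""
    # Assumed geometry: z_j on the x-axis, the other two nodes at (0, +h/2) and (0, -h/2).
    vertices = [((h / 2) * math.tan(theta), 0.0), (0.0, h / 2), (0.0, -h / 2)]
    values = [y * y for (_, y) in vertices]
    gx, _ = linear_gradient(vertices, values)
    return 0.0 - gx  # u = y**2 has zero x-derivative

if __name__ == "__main__":
    h = 0.1
    for theta in (math.pi / 4, math.pi / 16, math.pi / 64):
        # Computed error next to the predicted value h / (2 tan(theta)).
        print(theta, x_derivative_error(h, theta), h / (2 * math.tan(theta)))
```

For fixed $h$ the printed error grows without bound as $\theta$ shrinks, which is the failure of uniformity described above.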

There is also another reason to avoid degenerate triangles of any kind: They may injure the numerical stability of the Ritz method, as reflected in the condition number of $K$. Therefore, several algorithms have been constructed to triangulate an arbitrary domain $\Omega$, keeping all the angles greater than $\theta_0 \sim \pi/8$. For a more general n-dimensional element, the geometrical requirement can be expressed in terms of inscribed spheres rather than angles: $e_i$ should contain a sphere of radius at least $\nu h_i$, $\nu$ a fixed constant.

We are now ready for pointwise approximation in the nodal method. The given function $u$ will be approximated by its interpolate in the space $S^h$, in other words, by the function $u_I$ which shares the same nodal parameters as $u$ itself:

$$u_I(x) = \sum_j (D_j u)(z_j)\, \varphi_j(x).$$

We suppose that $u$ has $k$ derivatives in the ordinary pointwise sense, and estimate the $s$th derivatives of $u - u_I$.

THEOREM 3.1
Suppose that $S^h$ is of degree $k - 1$ and its basis satisfies the uniformity condition (2). Then for $s \le q$,

$$\max_{x \in e_i,\ |\alpha| = s} |D^\alpha u(x) - D^\alpha u_I(x)| \le C_s h_i^{k-s} \max_{x \in e_i,\ |\beta| = k} |D^\beta u(x)|. \tag{3}$$

Proof. We choose an arbitrary point $x_0$ in $e_i$, and expand $u$ in a Taylor
series:

$$u(x) = P(x) + R(x),$$

where $P$ is a polynomial of degree $k - 1$ and $R$ is the remainder term. We shall need the standard estimate for the Taylor remainder $R$ and its derivatives,

$$\max_{x \in e_i,\ |\alpha| = s} |D^\alpha R(x)| \le C h_i^{k-s} \max_{x \in e_i,\ |\beta| = k} |D^\beta u(x)|. \tag{4}$$

This can be proved by expressing $R$ as a line integral from $x_0$ to $x$.

The interpolate of $u$ decomposes, by linearity, into the sum of two interpolates:

$$u_I(x) = P_I(x) + R_I(x).$$

The crucial point is that $P_I$ is identical with $P$; any polynomial of degree $k - 1$ is exactly reproduced by its interpolate. This is where polynomials play a special role. In other words, $P - P_I$ coincides in $e_i$ with a trial function for which all the determining nodal parameters are zero; therefore, $P = P_I$ in $e_i$.

This means that $u - u_I = R - R_I$, and the only terms left to estimate are the derivatives of $R_I$:

$$D^\alpha R_I(x) = \sum_j (D_j R)(z_j)\, D^\alpha \varphi_j(x).$$

At most $d$ nonzero terms can appear in this sum, since the other basis functions vanish in the element. Therefore, if (4) is combined with the uniformity condition (2),

$$|D^\alpha R_I(x)| \le d\, C h_i^{k-|D_j|} \max |D^\beta u| \cdot c_s h_i^{|D_j|-s} \le C' h_i^{k-s} \max |D^\beta u|.$$

Thus $R_I$ is bounded exactly as $R$ was, and it follows immediately that

$$|D^\alpha (u - u_I)| \le |D^\alpha R| + |D^\alpha R_I| \le (C + C')\, h_i^{k-s} \max |D^\beta u|.$$

This is the estimate (3), and the proof is complete.†

†The uniformity of the basis is not a necessary condition for the approximation theorems to hold. For bilinear or bicubic approximation on rectangles, the choice of a very fine mesh in one coordinate direction will spoil the uniformity but not the order of approximation. Even for triangles, Synge's condition in [17] is that the largest angle be bounded below $\pi$, i.e., that the second smallest angle in each triangle exceed some $\theta_0$.

Note that this is a completely local estimate; $u_I$ imitates the properties of $u$ within each element. This is a property which we cannot expect of the
finite element solution $u^h$, since it is obtained by minimizing a global function $I(v)$, the potential energy over the whole domain $\Omega$. In fact, a singularity in $u$ is indeed propagated by $u^h$, sometimes with frustratingly slow decay, throughout the entire domain. This is analyzed in Chapter 8.

There is a similar approximation theorem for the abstract finite element method. We shall take time to prove this second theorem, even though the two overlap for the usual nodal method on a regular mesh. Splines were not covered by the previous theorem, because they lack a local interpolating basis, yet their approximation properties are of considerable importance. In fact, the special regularity of the mathematical structure allows a more elegant result. Approximation on a regular mesh hinges on the presence of a trial function $\psi$ with the following remarkable property: For any polynomial $P$ of degree less than $k$,

$$P(x) = \sum_l P(l)\, \psi(x - l). \tag{5}$$

Such a superfunction $\psi$ can be found in the trial subspace $S$ if and only if $S$ is of degree $k - 1$. $\psi$ will be nonzero over a small patch of elements.

In the simplest one-dimensional example, $S^h$ is made up of continuous piecewise linear functions. There is one unknown per mesh interval ($M = 1$), and the basis is generated by the roof function $\Phi_1$. In other words, the basis functions are $\varphi_l^h(x) = \Phi_1(x/h - l)$, and $S^h$ is of degree 1. In this example $\Phi_1$ itself will serve as the function $\psi$: For any linear polynomial $P(x) = \alpha x + \beta$,

$$P(x) = \sum_l P(l)\, \Phi_1(x - l).$$

The two sides agree at every node and are linear in between, so they must be identical. The same will be true in two dimensions both in the case of linear functions on triangles, where the graph of $\Phi_1 = \psi$ is a pyramid, and of bilinear functions on rectangles, where the graph is a "pagoda." In these cases the approximation $u_Q$ in the following theorem coincides with the interpolate $u_I$.

The construction of $\psi$ is not so trivial for cubics. The two generating functions for Hermite cubics, as well as the B-spline $\Phi_1$, were sketched in Section 1.7. None of these functions satisfies the identity (5) up to $k = 4$, which is required of $\psi$. However, there exists a cubic spline which does, and it is nonzero over six intervals (Fig. 3.2). This $\psi$ is a combination of the B-spline and its two neighbors, and also (since every spline is automatically a Hermite cubic) it will serve as the superfunction for the Hermite space, too. We must emphasize here that it is never essential to know what $\psi$ is; what matters is only that somewhere in the trial space lies a function which is responsible, by itself, for the approximation properties of the space.
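
The identity (5) for cubics can be tested directly. The sketch below assumes the standard cubic B-spline on a unit mesh (support $[-2, 2]$, value $2/3$ at the center) as the generating function, and takes the combination printed with Fig. 3.2, $\psi(x) = -\tfrac16\varphi(x+1) + \tfrac43\varphi(x) - \tfrac16\varphi(x-1)$. The helper names are ours.

```python
import math

def bspline(x):
    """Cubic B-spline on a unit mesh, centered at 0 and supported on [-2, 2]."""
    t = abs(x)
    if t < 1.0:
        return 2.0 / 3.0 - t * t + 0.5 * t ** 3
    if t < 2.0:
        return (2.0 - t) ** 3 / 6.0
    return 0.0

def psi(x):
    """Candidate superfunction from Fig. 3.2; supported on [-3, 3] (six intervals)."""
    return -bspline(x + 1) / 6.0 + 4.0 * bspline(x) / 3.0 - bspline(x - 1) / 6.0

def lattice_sum(generator, poly, x, reach=4):
    """Sum over integers l near x of poly(l) * generator(x - l)."""
    base = math.floor(x)
    return sum(poly(l) * generator(x - l) for l in range(base - reach, base + reach + 1))

def cubic(x):
    """An arbitrary test polynomial of degree 3 (degree less than k = 4)."""
    return 2 * x ** 3 - x ** 2 + 3 * x - 5

if __name__ == "__main__":
    for x in (0.0, 0.25, 0.7, 1.9):
        # psi reproduces the cubic; the B-spline alone does not even reproduce x**2.
        print(x, lattice_sum(psi, cubic, x) - cubic(x),
              lattice_sum(bspline, lambda t: t * t, x) - x * x)
```

The second printed column is zero to rounding error, while the third is the constant $1/3$: the B-spline by itself misses $x^2$ by its second moment, and the two neighboring splines in $\psi$ cancel that defect.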

Given the existence of $\psi(x)$, vanishing for $|x| > \rho$ and satisfying the superfunction identity, we approximate a given function $u$ by the following member of $S^h$:

$$u_Q(x) = \sum_l u(lh)\, \psi\!\left(\frac{x}{h} - l\right).$$

[Fig. 3.2 The superfunction for cubics: $\psi(x) = -\tfrac{1}{6}\varphi(x+1) + \tfrac{4}{3}\varphi(x) - \tfrac{1}{6}\varphi(x-1)$.]

We refer to $u_Q$ as a quasi-interpolate based on $\psi$. It depends as $u_I$ did on the local properties of $u$, but does not quite interpolate $u$ at the nodes. Its advantage is that it can be written down so easily and explicitly, whereas interpolation by splines requires the solution of simultaneous equations (several B-splines are nonzero at each node). Since approximation only demands that there exists some function in $S^h$ which is close to $u$, we are free to work with a convenient choice.

THEOREM 3.2
Suppose the abstract finite element space $S^h$ is of degree $k - 1$ and $\psi$ has bounded derivatives up to order $q$. Then for any derivative $D^\alpha$ of order $|\alpha| \le q$, with $s = |\alpha|$,

$$\max_{0 \le x_i \le h} |D^\alpha u(x) - D^\alpha u_Q(x)| \le c_s h^{k-s} \max_{|\beta| = k} |D^\beta u|. \tag{6}$$

Proof. The argument is nearly the same as in Theorem 3.1. A Taylor expansion around the origin gives $u(x) = P(x) + R(x)$, where $P$ is of degree $k - 1$ and $R$ is the remainder. Splitting $u_Q$ into $P_Q + R_Q$, we note first that $P_Q$ coincides with $P$:

$$P(x) = P_Q(x) = \sum_l P(lh)\, \psi\!\left(\frac{x}{h} - l\right).$$
This is the identity (5) with $x$ scaled by the factor $h$; that is, (5) is applied to the polynomial $p(x) = P(xh)$, and then $x \to x/h$.

The Taylor remainder $R$ was estimated in the previous theorem, and this leaves only $R_Q$. For $x$ in the mesh cube $0 \le x_i \le h$,

$$|D^\alpha R_Q(x)| = \Bigl|\sum_l R(lh)\, D^\alpha \psi\!\left(\frac{x}{h} - l\right)\Bigr| \le C' h^{-|\alpha|} \sum_l |R(lh)| \le C' h^{-|\alpha|} \cdot c\, h^k \max |D^\beta u|,$$

where $C'$ bounds the derivatives of $\psi$, the factor $h^{-|\alpha|}$ comes from the differentiation $D^\alpha$, and the rest is an upper bound for $R(lh)$. As in Theorem 3.1, it is essential that the sum is finite; since $\psi$ is nonzero only for $|x| < \rho$, only finitely many values of $l$ actually appear. We now have the required estimate, with $|\alpha| = s$:

$$|D^\alpha (u - u_Q)| = |D^\alpha (R - R_Q)| \le c_s h^{k-s} \max |D^\beta u|.$$

The same result follows in any other mesh cube by expanding $u$ around one of the vertices.
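
The rate $h^{k-s}$ in (6) can be observed for cubic splines ($k = 4$, $s = 0$). The following sketch is an illustration under our own assumptions: the generating function is the standard cubic B-spline on a unit mesh, $\psi$ is the combination printed with Fig. 3.2, and the test function $\sin x$ and the interval $[0, 1]$ are arbitrary choices.

```python
import math

def bspline(x):
    """Cubic B-spline on a unit mesh, centered at 0 and supported on [-2, 2]."""
    t = abs(x)
    if t < 1.0:
        return 2.0 / 3.0 - t * t + 0.5 * t ** 3
    if t < 2.0:
        return (2.0 - t) ** 3 / 6.0
    return 0.0

def psi(x):
    """Superfunction for cubics as printed with Fig. 3.2."""
    return -bspline(x + 1) / 6.0 + 4.0 * bspline(x) / 3.0 - bspline(x - 1) / 6.0

def quasi_interpolate(u, h, x):
    """u_Q(x) = sum over l of u(l*h) * psi(x/h - l); only nearby l contribute."""
    base = math.floor(x / h)
    return sum(u(l * h) * psi(x / h - l) for l in range(base - 4, base + 5))

def max_error(u, h, samples=400):
    """Largest |u - u_Q| over equally spaced sample points of [0, 1]."""
    points = [i / samples for i in range(samples + 1)]
    return max(abs(u(x) - quasi_interpolate(u, h, x)) for x in points)

if __name__ == "__main__":
    coarse = max_error(math.sin, 1 / 8)
    fine = max_error(math.sin, 1 / 16)
    # Halving h should reduce the error by roughly 2**4 = 16.
    print(coarse, fine, coarse / fine)
```

No linear system is solved anywhere: each value $u_Q(x)$ is an explicit local sum, which is the practical advantage of the quasi-interpolate over spline interpolation.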

3.2. MEAN-SQUARE APPROXIMATION

We want to prove that also in the mean-square norms, the same degree of approximation is possible under the same hypotheses on $S^h$: that it has degree $k - 1$ and a uniform basis. It is the variational principle which makes it natural and inevitable to work with these $\mathcal{H}^s$ norms; the strain energy itself is exactly such an expression, an integral of the square of derivatives of $u$. Since $u^h$ is the element closest to $u$ in this energy norm, the key error estimate in the whole subject is this one.

In general, the $\mathcal{H}^m$ norm involves all derivatives $D^\alpha$ of order $|\alpha| \le m$:

$$\|v\|_{m,\Omega} = \Bigl( \sum_{|\alpha| \le m} \int_\Omega |D^\alpha v(x)|^2 \, dx \Bigr)^{1/2}.$$

We shall have use also for the seminorm $|v|_{m,\Omega}$, for which only the derivatives of order exactly $|\alpha| = m$ are included in the sum. This is a seminorm, because it shares the properties $|cv| = |c|\,|v|$ and $|v + w| \le |v| + |w|$ of a norm, but it fails to be positive definite: For a true norm, $\|v\| = 0$ can happen only if $v = 0$, but for these seminorms, $|P|_m = 0$ whenever $P$ is a polynomial of degree less than $m$. Obviously $\|v\|_m^2 = |v|_0^2 + \cdots + |v|_m^2$.

In one sense it is completely trivial to extend the pointwise results to these
new norms. Using the interpolate $u_I$ of the previous section, we may simply square the pointwise error and integrate: If $|\alpha| = s$,

$$\int_\Omega |D^\alpha u - D^\alpha u_I|^2 \, dx \le (\text{vol } \Omega) \max |D^\alpha u - D^\alpha u_I|^2 \le C_s^2\, (\text{vol } \Omega)\, h^{2(k-s)} \max_{x \in \Omega,\ |\beta| = k} |D^\beta u(x)|^2.$$

Thus if the $k$th derivatives of $u$ are bounded, the error is of the same order $h^{k-s}$ as before. This settles the rate of convergence to smooth solutions.

Such an estimate, however, can never be entirely satisfactory. It involves two quite different norms, and we have made a pointwise assumption on the $k$th derivatives in order to obtain a mean-square error bound on the $s$th derivatives. One has a right to expect a more symmetric theorem, in which only norms of the same kind appear. Furthermore, such an improvement is made necessary by singularities. For a plate with a fracture, the solution $u$ increases only like the square root of the distance from the end of the crack. The function $r^{1/2}$ is not even differentiable; the pointwise error will behave like $h^{1/2}$ or perhaps $h^{1/2} \log h$. However, the solution does lie in $\mathcal{H}^1$; it must. In fact, $u$ nearly has "one and a half derivatives" in the mean-square sense; the $s$th derivative behaves like $r^{(1/2)-s}$, and for any $s < \tfrac{3}{2}$,

$$\iint \bigl| r^{(1/2)-s} \bigr|^2 \, r \, dr \, d\theta < \infty.$$

In short, we would like to prove that the mean-square error is roughly of order $h^{3/2}$, and such a result can only follow from a theorem in which hypotheses are made on the mean-square differentiability of $u$.
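
The threshold $s < \tfrac32$ can be made explicit by carrying out the radial integration. As an illustration only, we take the region near the crack tip to be the unit disc $r < 1$; any bounded neighborhood of the tip gives the same threshold.

```latex
\int_0^{2\pi}\!\!\int_0^{1} \bigl| r^{1/2 - s} \bigr|^2 \, r \, dr \, d\theta
  = 2\pi \int_0^1 r^{\,2 - 2s} \, dr
  = \frac{2\pi}{3 - 2s}
  \qquad \text{for } s < \tfrac{3}{2}.
```

The exponent $2 - 2s$ exceeds $-1$ exactly when $s < \tfrac32$; at $s = \tfrac32$ the integrand is $1/r$ and the integral diverges logarithmically.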

THEOREM 3.3
Suppose that $S^h$ is of degree $k - 1$, and its basis is uniform to order $q$. Suppose also that the derivatives $D_j$ associated with the nodal parameters are all of order $|D_j| < k - n/2$. Then for any function $u(x_1, \ldots, x_n)$ which has $k$ derivatives in the mean-square sense, and any derivative $D^\alpha$ of order $s \le q$,

$$\int_{e_i} |D^\alpha u(x) - D^\alpha u_I(x)|^2 \, dx \le C_s^2\, h_i^{2(k-s)}\, |u|_{k,e_i}^2. \tag{7}$$

Since the integral over $\Omega$ is the sum of the integrals over $e_i$,

$$|u - u_I|_{s,\Omega} \le C_s h^{k-s} |u|_{k,\Omega}. \tag{8}$$

Remark 1. The assumption $|D_j| < k - n/2$ is necessary in order that the interpolate $u_I$ be defined. Sobolev's lemma (mentioned in Section 1.8) assures
that since $u(x_1, \ldots, x_n)$ has $k$ mean-square derivatives, the derivative $D_j u$ is well defined at any point $z_j$:

$$|D_j u(z_j)| \le C\, \|u\|_k. \tag{9}$$

Fortunately, this assumption $|D_j| < k - n/2$ is fulfilled for all practical elements, and the theorem leads directly to the rate of convergence of the finite element method.

To find a more awkward case, we turn to approximation by piecewise constants ($k = 1$) in the plane ($n = 2$). Then, as for the function $\log \log r$ in Section 1.8, Sobolev does not guarantee that the interpolate of $u$ makes sense; if a node happened to fall at the origin, what would be $\log \log 0$? Even in this case, however, $S^h$ does contain a function which gives the correct order $h$ of least-square approximation; $u$ can be suitably smoothed before it is interpolated. The same construction succeeds in the general case [S6]: If $S^h$ has degree $k - 1$ and a uniform basis, then for any dimension $n$,

$$\|u - \tilde{u}_I\|_s \le C_s h^{k-s} \|u\|_k, \qquad \tilde{u} = \text{smoothed } u.$$

Thus each finite element space $S^h$ contains a function, usually $u_I$ but if necessary $\tilde{u}_I$, which approximates $u$ to the expected order. The bound for $u_I$ is much simpler to prove and is sufficient for all practical purposes; or it would be, if it were extended also to fractional derivatives, as required by the example of a cracked plate. This extension is actually an immediate corollary of the theorem, given the theory of "interpolation spaces," which we omit as being too technical.

Remark 2. The proof even of the simpler theorem does still require one technical lemma. The Taylor expansion on which pointwise approximation was based will not work here; $u$ may have enough derivatives for its interpolate to be defined, but not for a Taylor expansion with remainder $h^k$. Therefore, we must rely on functional analysis to produce a polynomial which is as close to $u$ in the mean-square sense as the leading Taylor-series terms were in the pointwise sense. The key result, simplified from [14] and [B27], is this:

In every element there exists a polynomial $P_{k-1}$ such that the remainder $R = u - P_{k-1}$ satisfies

$$|R|_{s,e_i} \le c\, h_i^{k-s}\, |u|_{k,e_i}, \qquad s \le k, \tag{10a}$$

and at every node $z_j$,

$$|D_j R(z_j)| \le c'\, h_i^{k-|D_j|-n/2}\, |u|_{k,e_i}. \tag{10b}$$

On a region of diameter $h_i = 1$, (10a) is a standard lemma [14]; (10b) follows from (10a) and Sobolev's inequality (9). Then the given powers of $h_i$ appear when the independent variables are rescaled so as to shrink the region down to $e_i$.† The constants $c$ and $c'$ will depend on the smallest angle of $e_i$; it is simplest to assume this angle to be bounded below, as is normally required for the basis to be uniform.

Proof of the theorem. The argument will be exactly the same as in the pointwise case; that is the happy effect of the lemma. In each element $e_i$, we write $u = P_{k-1} + R$; $R$ satisfies the inequalities (10). The interpolate $u_I$ is, by linearity, the sum of the two interpolates:

$$u_I = (P_{k-1})_I + R_I = P_{k-1} + R_I,$$

since the polynomial $P_{k-1}$ is exactly reproduced by interpolation. Therefore $u - u_I = R - R_I$, and we consider

$$R_I(x) = \sum_j (D_j R)(z_j)\, \varphi_j(x).$$

Only a finite number $d$ of these terms can be nonzero in $e_i$. Applying the uniformity of the basis, as well as (10b), any derivative of order $|\alpha| \le s$ gives

$$|D^\alpha R_I(x)| \le d\, c'\, h_i^{k-|D_j|-n/2}\, |u|_{k,e_i} \cdot c_s h_i^{|D_j|-s}.$$

Now if we square, integrate over $e_i$, and take the square root,

$$\Bigl( \int_{e_i} |D^\alpha R_I(x)|^2 \, dx \Bigr)^{1/2} \le c''\, h_i^{k-s}\, |u|_{k,e_i}.$$

This is exactly the estimate given for $R$ by (10a), and exactly the same as the right side of (7). Therefore, the proof is complete. The technique, depending on (10) and on the special role of polynomials, is known as the Bramble-Hilbert lemma.

The same theorem and proof hold for abstract finite elements on a regular mesh, in particular for splines; we again use $u_Q$ instead of $u_I$ [S5].

The theory of partial differential inequalities leads to questions of one-sided approximation. We have established for the standard linear elements that, given $u \ge 0$, the estimates of Theorems 3.1-3.3 continue to hold if the approximate $v^h$ is required to satisfy $0 \le v^h \le u$. (The interpolate $v^h = u_I$ is of course no use, since it need not lie below $u$.) We note that Duvaut-Lions were able to formulate as variational inequalities several important physical problems, including elastic-plastic phenomena, which in differential form lead to extremely awkward elliptic-hyperbolic systems separated by an unknown free boundary. Mosco and Strang, and independently Falk, have confirmed the usual $h^2$ error in energy for linear approximation of the St. Venant torsion problem, which is typical of the class of variational inequalities known as obstacle problems.

†$h_i^k$ arises from rescaling the $k$th derivatives, and $h_i^{n/2}$ from the square root of the volume of $e_i$.

An additional question arises if the elements contain a few polynomial terms of the next higher degree $k$: Are all derivatives of this order needed on the right side of (7)? Bilinear elements, for example, reproduce the twist term $xy$, and therefore it seems superfluous to include the cross derivative $u_{xy}$ in the error bound $c\, h^{2-s} |u|_2$. This point was settled by Bramble and Hilbert [B28]: It is indeed sufficient to include only $u_{xx}$ and $u_{yy}$ in the error bound. This will have valuable consequences in the following section for the theory of rectangular isoparametric elements.

It may well happen that (7) holds inside each element for derivatives of all orders $s \le k$. The limitation to $s \le q$ in (8) arises when $u_I$ has only $q - 1$ derivatives across element boundaries.

It would be useful to know something about the constants $c_s$ in the theorem. They are a direct indication of the properties of the particular element; if they are larger for one element than for another of the same degree, then the first element is comparatively inaccurate or "stiff." For piecewise linear approximation on the line, with equally spaced nodes $x_j = jh$, these optimal constants can be computed. There are two functions of special interest: the first trigonometric polynomial $f(x) = \sin(\pi x/h)$ that vanishes at every meshpoint, and the first algebraic polynomial $g(x) = x^2$ not reproduced by its linear interpolate. In the case of $f$, both its interpolate and its best linear approximation are identically zero. Therefore, the error is $f$ itself, and an easy computation gives

$$|f|_0 = \frac{h^2}{\pi^2}\, |f|_2, \qquad |f|_1 = \frac{h}{\pi}\, |f|_2.$$

It follows from the proof of Theorem 1.1 in Section 1.6 that these constants are optimal, since for any $u$ the interpolate $u_I$ was shown to satisfy

$$|u - u_I|_0 \le \frac{h^2}{\pi^2}\, |u|_2, \qquad |u - u_I|_1 \le \frac{h}{\pi}\, |u|_2.$$

Therefore, $C_0 = 1/\pi^2$ and $C_1 = 1/\pi$ for linear approximation.
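
The "easy computation" for $f(x) = \sin(\pi x/h)$ can be reproduced numerically. This is a sketch under our own choices: the seminorms are taken over two mesh intervals $[0, 2h]$, and a midpoint rule (the helper `midpoint_integral` is ours) stands in for exact integration.

```python
import math

def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def sine_seminorms(h, n_intervals=2):
    """|f|_0, |f|_1, |f|_2 for f(x) = sin(pi x / h) over [0, n_intervals * h]."""
    w = math.pi / h
    a, b = 0.0, n_intervals * h
    f0 = math.sqrt(midpoint_integral(lambda x: math.sin(w * x) ** 2, a, b))
    f1 = math.sqrt(midpoint_integral(lambda x: (w * math.cos(w * x)) ** 2, a, b))
    f2 = math.sqrt(midpoint_integral(lambda x: (w * w * math.sin(w * x)) ** 2, a, b))
    return f0, f1, f2

if __name__ == "__main__":
    h = 0.25
    f0, f1, f2 = sine_seminorms(h)
    # Ratios to compare with C_0 = 1/pi**2 and C_1 = 1/pi.
    print(f0 / (h * h * f2), 1 / math.pi ** 2)
    print(f1 / (h * f2), 1 / math.pi)
```

Both ratios are independent of $h$, since each differentiation of $f$ simply multiplies its mean-square size by $\pi/h$.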

What role is left for the quadratic $g(x) = x^2$? It yields, for each fixed function $u$, the constant which is asymptotically correct as $h \to 0$. In the previous case we fixed $h$ and sought out the worst function $\sin(\pi x/h)$. In this case we fix $u$ and ask for the limits

$$c_0 = \lim_{h \to 0} \frac{\min |u - v^h|_0}{h^2\, |u|_2} \qquad \text{and} \qquad c_1 = \lim_{h \to 0} \frac{\min |u - v^h|_1}{h\, |u|_2}. \tag{11}$$

The minimum is taken over all $v^h$ in $S^h$, that is, over all piecewise linear functions. For every $h$ these ratios are bounded by $1/\pi^2$ and $1/\pi$, respectively, and therefore the limits cannot exceed these constants; we may expect them to be smaller, since a fixed $u$ cannot resemble all the oscillatory functions $\sin(\pi x/h)$ at once. These new constants $c_0$ and $c_1$ (if they exist) seem to be more natural in assessing the improvement to be expected from mesh refinement in a practical problem, since it is the solution which is fixed and $h$ which changes.

Consider first the particular function $u = g = x^2$. On the interval $[-1, 1]$, the best linear approximations to displacement and slope minimize

$$\int_{-1}^{1} (x^2 - a_1 - a_2 x)^2 \, dx \qquad \text{and} \qquad \int_{-1}^{1} (2x - a_2)^2 \, dx,$$

respectively. These expressions equal

$$\frac{2}{5} - \frac{4a_1}{3} + 2a_1^2 + \frac{2a_2^2}{3} \qquad \text{and} \qquad \frac{8}{3} + 2a_2^2.$$

In both cases $a_2$ should be zero; the best approximation to an even function on a symmetric interval $[-1, 1]$ is even. The optimal value of $a_1$ is $\tfrac13$, and the ratios which enter $c_0$ and $c_1$ (with an interval of length $h = 2$) become

$$c_0 = \frac{1}{12\sqrt{5}} \qquad \text{and} \qquad c_1 = \frac{1}{2\sqrt{3}}.$$

The latter happens not to be very different from $1/\pi$; it is smaller, as it should be.
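
The two constants can be recomputed with exact rational arithmetic. The sketch below follows the text in setting $a_2 = 0$ and uses the fact that the best constant $a_1$ is then the mean value of $x^2$ over $[-1, 1]$; the helper functions are ours.

```python
from fractions import Fraction
import math

def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of the polynomial sum(coeffs[i] * x**i)."""
    total = Fraction(0)
    for i, c in enumerate(coeffs):
        total += Fraction(c) * (Fraction(b) ** (i + 1) - Fraction(a) ** (i + 1)) / (i + 1)
    return total

def poly_mul(p, q):
    """Coefficient list of the product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

h = 2                                                    # length of the interval [-1, 1]
a1 = integrate_poly([0, 0, 1], -1, 1) / h                # mean of x**2, the best constant
error = [-a1, 0, 1]                                      # x**2 - a1
disp_err_sq = integrate_poly(poly_mul(error, error), -1, 1)     # displacement error squared
slope_err_sq = integrate_poly(poly_mul([0, 2], [0, 2]), -1, 1)  # slope error (2x)**2
u2_sq = integrate_poly([4], -1, 1)                       # (u'')**2 = 4 integrated

c0 = math.sqrt(disp_err_sq / u2_sq) / h ** 2
c1 = math.sqrt(slope_err_sq / u2_sq) / h

if __name__ == "__main__":
    print(a1, disp_err_sq, slope_err_sq)
    print(c0, 1 / (12 * math.sqrt(5)))
    print(c1, 1 / (2 * math.sqrt(3)), 1 / math.pi)
```

The last line places $c_1$ next to $1/\pi$, the worst-case constant, for comparison.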

Notice that $x^2 - \tfrac13$ is the second Legendre polynomial; it is the error after least-square approximation by linear functions of the quadratic $g(x) = x^2$. Notice also that this error has the same value $\tfrac23$ at the two ends of the interval. This makes it easy to connect it with the best approximation to $x^2$ on the next interval $1 \le x \le 3$. On this interval the error function after optimal approximation is just $(x - 2)^2 - \tfrac13$, the same Legendre polynomial translated two units along the axis. This pattern continues: On every interval $[2n - 1, 2n + 1]$, the optimal error function $x^2 - v^h$ is given by $(x - 2n)^2 - \tfrac13$, and the error function is periodic with period $h = 2$ (Fig. 3.3). The constants $1/(12\sqrt{5})$ and $1/(2\sqrt{3})$ are the same on any union of these intervals.
Furthermore, these eonstants are dimensionally eorreet, and they will not u but rather to all choices. Similarly, e
ehange if the independent variable is resealed to give an arbitrary initial ínter val element error u - uh iri the second-01
[-h/2, h/2], and then translated to give an arbitrary posltion of the origin. e 1 will be dec_isive e ven in displaeemen
The x-mf;ls in the figure is scaled by a factor of h/2: 1 and the error function minimize the strain energy of the ern
on the y-axis is scaled by (h/2)2: l. It is easy to check that the ratios are in va~ er
indicates how rather than e0 enters
riant; the reason x 2 is so special is that translation to (x x 0 )2 alters'it by Theorem 3.4 will be proved first f1
a linear expression 2xx 0 - xfi, which can be taken up by the trial functions. of length h, with midpoint X 0 ,
Note that .the optimal approximation is by no means the linear interpolate;
the interpolate yields the right exponent of h but too large a constant. The
Ritz procedure achieves the best constant, beca use it minimizes o ver Sh.
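The two constants for $u = x^2$ on $[-1, 1]$ can be checked with exact rational arithmetic. The following is only a sketch of that arithmetic; the helper `moment` and the variable names are introduced here for illustration and do not come from the text.

```python
from fractions import Fraction
from math import isclose, sqrt


def moment(n):
    """Exact integral of x**n over [-1, 1]."""
    return Fraction(0) if n % 2 else Fraction(2, n + 1)


# Normal equations for minimizing  int (x^2 - a1 - a2*x)^2 dx  over [-1, 1].
# The odd moments vanish, so the 2x2 system is diagonal.
a1 = moment(2) / moment(0)
a2 = moment(3) / moment(2)

# Squared norms of the error e(x) = x^2 - a1 (a2 turns out to be zero).
err0_sq = moment(4) - 2 * a1 * moment(2) + a1**2 * moment(0)  # int e^2
err1_sq = 4 * moment(2)                                       # int (2x)^2
semi2_sq = 4 * moment(0)                                      # int (u'')^2

h = 2  # length of the interval [-1, 1]
c0 = sqrt(err0_sq / semi2_sq) / h**2
c1 = sqrt(err1_sq / semi2_sq) / h

print("a1 =", a1, " a2 =", a2, " int e^2 =", err0_sq)
print("c0 =", c0, " 1/(12*sqrt(5)) =", 1 / (12 * sqrt(5)))
print("c1 =", c1, " 1/(2*sqrt(3))  =", 1 / (2 * sqrt(3)))
```

The exact value $8/45$ for the squared mean-square error is what produces the factor $\sqrt5$ in $c_0$.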
[Fig. 3.3 Error in linear approximation to $x^2$: the periodic error function, equal to $x^2 - 1/3$ on $[-1, 1]$ and $(x - 2)^2 - 1/3$ on $[1, 3]$, with minimum value $-1/3$.]

The remarkable thing is that the asymptotic constants are independent of the function $u$.

THEOREM 3.4
For an arbitrary function $u(x)$ in $\mathcal{H}^2$, the limiting values of the ratios as $h \to 0$ are the same as for the special function $x^2$: $c_0 = 1/12\sqrt5$ and $c_1 = 1/2\sqrt3$.

This theorem means that in the limit as $h \to 0$, every function behaves like a piecewise quadratic; the error function after linear approximation locally resembles Fig. 3.3. In other words, the more closely you look at a function, the more it looks like a polynomial. This is the underlying foundation for Taylor-series expansions. The constant $c_0$ is therefore asymptotically correct for least-square approximation, not just to some extreme choice of $u$ but rather to all choices. Similarly, $c_1$ is asymptotically correct for the finite element error $u - u^h$ in the second-order example of Chapter 1. In general, $c_1$ will be decisive even in displacement error for $u - u^h$, since $u^h$ is chosen to minimize the strain energy of the error; Nitsche's argument in Theorem 1.5 indicates how $c_1^2$ rather than $c_0$ enters the displacement error.
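The claim of Theorem 3.4 can be illustrated numerically. The sketch below fits a separate least-squares line on each subinterval of $[0, 1]$ (so the approximation is discontinuous at the nodes, which by the theorem's proof does not change the limiting constant) and prints the ratio $|u - L|_0 / h^2 |u|_2$. The function name `ratio_c0`, the midpoint-rule sampling, and the test function $e^x$ are choices made here, not part of the text.

```python
import math


def ratio_c0(u, d2u, n_intervals, samples=200):
    """Return |u - L|_0 / (h^2 |u|_2) on [0, 1], where L is the
    interval-by-interval least-squares line and d2u is u''."""
    h = 1.0 / n_intervals
    dx = h / samples
    err_sq = 0.0
    semi2_sq = 0.0
    for j in range(n_intervals):
        x0 = (j + 0.5) * h  # midpoint of the interval
        xs = [j * h + (i + 0.5) * dx for i in range(samples)]
        us = [u(x) for x in xs]
        # Discrete least squares on span{1, x - x0}; these two functions
        # are orthogonal on sample points placed symmetrically about x0.
        a = sum(us) / samples
        b = (sum(ui * (x - x0) for ui, x in zip(us, xs))
             / sum((x - x0) ** 2 for x in xs))
        err_sq += sum((ui - a - b * (x - x0)) ** 2
                      for ui, x in zip(us, xs)) * dx
        semi2_sq += sum(d2u(x) ** 2 for x in xs) * dx
    return math.sqrt(err_sq / semi2_sq) / h**2


c0 = 1.0 / (12.0 * math.sqrt(5.0))
for n in (4, 16, 64):
    print(n, ratio_c0(math.exp, math.exp, n), c0)
```

As the mesh is refined, the printed ratio should settle near $1/12\sqrt5 \approx 0.0745$ for any smooth $u$ with $u'' \not\equiv 0$.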
Theorem 3.4 will be proved first for smooth functions $u$. In each interval of length $h$, with midpoint $x_0$,
$$u(x) = u(x_0) + (x - x_0)u'(x_0) + \frac{(x - x_0)^2}{2}u''(x_0) + O(h^3). \tag{12}$$

The optimal (least-squares) linear approximation to the first three terms is

$$l(x) = u(x_0) + (x - x_0)u'(x_0) + \frac13\Big(\frac h2\Big)^2\frac{u''(x_0)}{2}.$$

Using these linear functions $l$ in each subinterval, we are effectively back to the quadratic case. The ratios in each interval, taking account of the $O(h^3)$ error from the remainder in the Taylor expansion, still satisfy

$$\frac{|u - l|_0}{h^2\,|u|_2} = \frac{1}{12\sqrt5}\,(1 + O(h)), \qquad \frac{|u - l|_1}{h\,|u|_2} = \frac{1}{2\sqrt3}\,(1 + O(h)).$$

Squaring, and summing over all the intervals, the piecewise linear function $L$ formed from all these pieces $l$ satisfies

$$\frac{|u - L|_0}{h^2\,|u|_2} = \frac{1 + O(h)}{12\sqrt5}. \tag{13}$$

There remains one difficulty: $L$ is not continuous at the nodes. The linear approximations $l$ depend on Taylor expansions in their own interval and cannot be expected to connect with their neighbors. The discrepancy, however, is only of the order

$$\frac13\Big(\frac h2\Big)^2\Big(\frac{u''(x_0)}{2} - \frac{u''(x_0 + h)}{2}\Big) = O(h^3),$$

and therefore an $O(h^3)$ alteration in each linear piece $l$ will produce a continuous $L$. (In the 1-norm the alteration is $O(h^2)$.) After such an alteration, both (13) and its analogue in the 1-norm continue to hold for $L$. Therefore, as $h \to 0$, the ratios approach the constants $1/12\sqrt5$ and $1/2\sqrt3$. It is clear that no other choice of $v^h$ in (11) could yield a smaller constant, because these constants were already correct when $L$ was formed from the optimal $l$ on each subinterval. The theorem is therefore proved when $u$ is smooth enough to permit the Taylor expansion (12).
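The constant term in the optimal linear approximation can be verified in one line. The following worked step is a sketch; the shorthand $t = x - x_0$ is introduced here for convenience.

```latex
% Best constant approximation to t^2 on |t| <= h/2, with t = x - x_0:
\[
  \frac{1}{h}\int_{-h/2}^{h/2} t^2 \, dt
  = \frac{1}{h}\cdot\frac{2}{3}\Big(\frac{h}{2}\Big)^3
  = \frac{1}{3}\Big(\frac{h}{2}\Big)^2 .
\]
% Since t^2 is even, its projection on the odd function t vanishes, so the
% quadratic Taylor term (t^2/2) u''(x_0) is replaced by the constant
\[
  \frac{1}{3}\Big(\frac{h}{2}\Big)^2 \frac{u''(x_0)}{2},
\]
% while the constant and linear Taylor terms are reproduced exactly.
```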
The extension to all $u$ in $\mathcal{H}^2$ is a standard technical problem. We define the linear operator $P_s^h$, from $\mathcal{H}^2$ to $\mathcal{H}^s$, by

$$h^{2-s} P_s^h u = \text{component of } u \text{ perpendicular to } S^h.$$

Two properties of this operator are already proved,

1. $|P_s^h u|_s \le C_s\,|u|_2$ (Theorem 3.3, with $k = 2$).
2. $|P_s^h u|_s \to c_s\,|u|_2$ for smooth $u$ (see previous paragraph).

By a standard density argument, which is uninteresting to an engineer and
boring to a mathematician, property 2 holds for all $u$ and Theorem 3.4 is proved.

We point out now (and shall do so again) that the zeros of the function $x^2 - 1/3$ are special points. Of course, they are the zeros of the Legendre polynomial, and therefore appear in Gauss quadrature; on an interval $[jh, (j + 1)h]$ they move to $(j + 1/2 \pm 1/\sqrt{12})h$. For finite element purposes, however, they are special in another way: The error in the best approximation to a quadratic vanishes at these points, and $u^h$ is likely to be of exceptional accuracy. (In collocation this is known to be the case; see Section 2.3.) We refer to them as displacement points. There are also stress points, discovered by Barlow, which are of still greater importance. These are the points at which the derivatives of the error function vanish ($x = 0$ in our simple example) and we show in Section 3.4 that the stress errors at these points are smaller by an extra power of $h$.
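A small numerical experiment makes the displacement and stress points concrete. The sketch below fits a least-squares line to an arbitrary quadratic on one interval $[jh, (j+1)h]$ and evaluates the error at the points named above; the quadratic, the interval, and the helper `fit_line` are illustrative assumptions.

```python
import math


def fit_line(u, a, b, samples=2000):
    """Least-squares line c0 + c1*(x - xm) for u on [a, b] (midpoint rule)."""
    xm, dx = 0.5 * (a + b), (b - a) / samples
    xs = [a + (i + 0.5) * dx for i in range(samples)]
    c0 = sum(u(x) for x in xs) / samples
    c1 = (sum(u(x) * (x - xm) for x in xs)
          / sum((x - xm) ** 2 for x in xs))
    return lambda x: c0 + c1 * (x - xm)


def u(x):
    """An arbitrary quadratic, chosen only for illustration."""
    return 1.7 * x * x - 0.4 * x + 2.0


j, h = 3, 0.25
line = fit_line(u, j * h, (j + 1) * h)


def error(x):
    return u(x) - line(x)


displacement_pts = [(j + 0.5 + s / math.sqrt(12.0)) * h for s in (-1, 1)]
stress_pt = (j + 0.5) * h
eps = 1e-6
slope_error = (error(stress_pt + eps) - error(stress_pt - eps)) / (2 * eps)

print([error(x) for x in displacement_pts])  # both close to zero
print(slope_error)                           # close to zero
print(error(j * h), error((j + 1) * h))      # equal values at the two ends
```

For a quadratic the error vanishes at the two Gauss points up to quadrature error, and its slope vanishes at the midpoint.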
The previous theorem extends to any finite element on a regular mesh in $n$ dimensions and more generally to any example of the abstract finite element method. In $n$ variables, there are several derivatives $D^\beta$ of order $|\beta| = k$, possibly associated with different error constants in approximation. In fact, if $x^\beta$ happened to be present in $S^h$, the corresponding constant would vanish. Locally, we imagine the function $u$ as expanded in a Taylor series through terms of degree $k$; those of degree $k - 1$ are reproduced exactly by the trial space, and the approximation depends asymptotically only on the derivatives $D^\beta u$ of order $k$. This generalization of Theorem 3.4 can be expressed in the following way, with matrices $K_s$ in place of the scalar constants $c_s$.

THEOREM 3.5
If $S^h$ is of degree $k - 1$ on a regular mesh, there exist nonnegative definite matrices $K_s$ such that for every $u$ in $\mathcal{H}^k$,

$$h^{2(s-k)} \min_{S^h} |u - v^h|_s^2 \to \sum_{|\alpha| = |\beta| = k} K_s^{\alpha\beta} \int (D^\alpha u)(D^\beta u)\,dx. \tag{14}$$

The diagonal entries $K_s^{\beta\beta}$ can be determined by approximation of the monomial $u = x^\beta = x_1^{\beta_1} \cdots x_n^{\beta_n}$, for which $D^\beta u$ is a constant and the other derivatives $D^\alpha u$ of order $k$ are zero.

It was convenient to square the expressions which occurred in Theorem 3.4; thus $K_0 = c_0^2 = 1/720$ and $K_1 = c_1^2 = 1/12$. For linear approximation in two dimensions, the matrices $K_s$ will be of order 3, corresponding to the three derivatives $\partial^2/\partial x^2$, $\partial^2/\partial x\,\partial y$, and $\partial^2/\partial y^2$ of order $k = 2$. For spline spaces of arbitrary degree the constants $K_s^{\alpha\beta}$ have been computed [S5] in terms of the Bernoulli numbers; we believe they are the same for the Hermite spaces of corresponding degree. Also the minimal constants $C_s$, for a fixed $h$ and varying $u$, are known for splines (Babuska). In this case the extreme
functions are again sines of wavelength $2h$, their best approximations are identically zero, and the constants are $C_s = \pi^{s-k}$.

It is possible to use Theorem 3.5 for a quantitative comparison of two different polynomial elements, or of the same element with two different geometrical configurations. Consider two regular triangulations of the plane, one involving diagonals in both directions, the other in only one direction (Fig. 3.4). Combinatorially the two are quite different. In (A) some nodes are connected to four neighbors, and others to eight; in (B) every node has six neighbors. Consider Courant's space $S^h$ of continuous piecewise linear functions on these triangles. Since (A) has twice as many nodes as (B), the dimension of $S_A^h$ will be twice that of $S_B^h$. Furthermore, the space $S_A^h$ contains $S_B^h$ and is therefore at least as good in approximation. The question is whether it is twice as good, to compensate for having twice the number of parameters.

[Fig. 3.4 Two possible triangulations, (a) and (b).]

In two dimensions, the three quadratic monomials are $x^2$, $xy$, and $y^2$. For each of them we propose to find the element $u^h$ of $S^h$ which minimizes $|u - u^h|_1$. This $u^h$ will be exactly the finite element solution for Poisson's equation, when the true solution $u$ is a quadratic.

We begin with $u = x^2$, on a square of side $h = 2$ centered at the origin. From considerations of symmetry, the optimal $u^h$ takes the same value at the four vertices, say $\alpha$ in (A) and $\beta$ in (B); at the center, it must equal $\beta$ in case (B) but may have a different value $\gamma$ in (A). This means that in configuration (B), $u^h$ is a constant on this particular square, the same constant $\beta = 1/3$ as in the one-dimensional case computed above. The error over the mesh square is

$$|u - u^h|_1^2 = \iint (x^2 - \beta)_x^2 + (x^2 - \beta)_y^2 = \iint (2x)^2\,dx\,dy = \frac{16}{3}.$$

This equals $h^2 |u|_2^2 / 12$, exactly as in one dimension: $K_1^{11} = 1/12$. In case (A) there is a minimization with respect to $\gamma$; the constant must be smaller, and
turns out to be

$$|u - u^h|_1^2 = \frac{h^2\,|u|_2^2}{18}. \tag{15}$$

On this basis, the two configurations can be compared. The dimension in case (A) is twice as great; in other words, it effectively works with squares of side $h/\sqrt2$ instead of $h$. With this change in (15), the constant becomes $1/9$, and configuration (B) is better by a ratio of $12 : 9$. The coefficient of $x^2$ in the error, with the number of free parameters equalized, will be smaller in case (B) by a factor of $\sqrt3/2$. By symmetry this is true also of the $y^2$ coefficient. For the twist term $xy$, the two configurations turn out to be equally effective, and there is nothing to choose.
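The constants $1/12$ and $1/18$ for $u = x^2$ can be reproduced by direct quadrature over the square $[-1, 1]^2$. The sketch below relies on the symmetry argument of the text: in (B) the best $u^h$ is constant on the square, and in (A) it is $\gamma + d\,w$, where $w = \max(|x|, |y|)$ is the piecewise linear function that is 0 at the center and 1 on the boundary. The function name, the coordinates $(r, t)$ used to integrate over each triangle, and the grid size are assumptions made here.

```python
def energy_constants(n=200):
    """Return (K_B, K_A): the squared energy error of the best piecewise
    linear fit to u = x^2 on [-1, 1]^2, divided by h^2 |u|_2^2 with h = 2."""
    dr = 1.0 / n   # step in r, 0 < r < 1 (distance from center along an axis)
    dt = 2.0 / n   # step in t, -1 < t < 1 (position across a triangle)
    s0 = 0.0       # integral of |grad u|^2 = (2x)^2 over the square
    b = 0.0        # integral of grad u . grad w
    c = 0.0        # integral of |grad w|^2 (equals the area, 4)
    for i in range(n):
        r = (i + 0.5) * dr
        for k in range(n):
            t = -1.0 + (k + 0.5) * dt
            area = r * dr * dt  # area element for the point (r, r*t)
            # Right and left triangles: (x, y) = (+-r, r*t), grad w = (+-1, 0).
            s0 += 2 * (2 * r) ** 2 * area
            b += 2 * (2 * r) * area
            # Top and bottom triangles: (x, y) = (r*t, +-r), grad w = (0, +-1),
            # which is orthogonal to grad u = (2x, 0).
            s0 += 2 * (2 * r * t) ** 2 * area
            c += 4 * area
    scale = 2.0**2 * 16.0          # h^2 |u|_2^2, with |u|_2^2 = int 4 = 16
    energy_b = s0                  # u^h is constant, so the error is |u|_1^2
    energy_a = s0 - b * b / c      # minimum over d of s0 - 2*d*b + d^2*c
    return energy_b / scale, energy_a / scale


k_b, k_a = energy_constants()
print("K_B =", k_b, "(1/12 =", 1 / 12, ")")
print("K_A =", k_a, "(1/18 =", 1 / 18, ")")
print("equalized ratio 2*K_A / K_B =", 2 * k_a / k_b)  # about 12/9
```

Doubling $K_A$ accounts for the change from $h$ to $h/\sqrt2$ when the number of parameters is equalized, which is where the ratio $12 : 9$ comes from.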
These calculations are confirmed by numerical experiments, reported in the engineering literature, which favor the configuration (B). For elements of higher order a computer could determine the constants $K^{\alpha\beta}$ by solving a finite element problem, for which the true solution is $u = x^\beta$. It will at the same time be computing the leading terms in the truncation error for the finite difference scheme which arises on a regular mesh (see Sections 1.3 and 3.4).

Theorem 3.5 has also an important theoretical consequence.

COROLLARY
To achieve approximation of order $h^{k-s}$ to the $s$th derivative, a trial space $S^h$ on a regular mesh must be of degree at least $k - 1$. Therefore, the finite element method converges, in case of a differential equation of order $2m$, only if $k > m$; this is the constant strain condition, that all polynomials of degree $m$ must be present in $S^h$.

This corollary is a converse, on a regular mesh, to Theorem 3.2. To prove it, suppose that $S^h$ were only of degree $l - 1$, $l < k$. Let $x^\alpha$ be of degree $l$ and not in $S^h$. Then by Theorem 3.4,

$$h^{s-l} \min |x^\alpha - v^h|_s \to \text{constant} \ne 0.$$

Therefore, the order of approximation to $x^\alpha$ is only $l - s$, not $k - s$, and the corollary is proved. Clearly the theorem is much stronger than its corollary; the former applies to all $u$, while for the latter it is only necessary to consider approximation of the lowest-degree polynomial which is not in the trial space.

The conclusion is rather remarkable: everything depends on the presence of polynomials. It implies that piecewise polynomials are the best trial functions, not only for their convenience but also for their approximation properties.
From the beginning, the finite element method has worked with subspaces of the optimal kind.

The result of the corollary can be proved directly [S10], and because of its importance we shall do so in the simplest case. We show that for approximation of order $h$, the constant function 1 must lie in the trial space. Suppose, for example, that the roof function were replaced by the cosine (Fig. 3.5).

[Fig. 3.5 A non-polynomial shape function: the roof function and the Cos function.]

The trial space contains all combinations $v^h = \sum q_j \operatorname{Cos}[\pi(x - jh)/2h]$, the notation Cos implying that the cosine is truncated outside its central arch $|\theta| \le \pi/2$. It might appear that, in the limit as $h \to 0$, there would be little difference in comparison with piecewise linear approximation.

Suppose, however, that we attempt to approximate $u = 1$ on a unit interval. In the linear case, this function lies in the space. In the new case it does not, and the error function $E^h$ will be periodic of period $h$. We ignore boundary conditions, or rather assume them to be periodic, since in any case they have only a secondary effect on approximation in the interior of the interval. Thus the error is

$$\min_{S^h} \|1 - v^h\|_0^2 = \int_0^1 (E^h(x))^2\,dx.$$

This integral covers $1/h$ periods, and over each period its value is $K_0 h$, where $K_0$ is a constant independent of $h$. If $h$ is changed, the error function (just as in Fig. 3.3) is simply scaled to its new period. Therefore, the approximation error is a constant $K_0$ (as predicted in Theorem 3.5) and does not decrease with $h$. The Cosine space has no approximation properties whatever.
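The constant $K_0$ can be estimated numerically. The sketch below assumes periodic end conditions and equal coefficients $q_j = q$ (an assumption made here, motivated by the translation symmetry of the problem), and evaluates the least-squares error in approximating $u = 1$ for several mesh widths; the function name and sample counts are illustrative.

```python
import math


def cos_space_error(n_cells, samples=400):
    """Squared L2 error of the best approximation to u = 1 on [0, 1] by
    v = q * sum_j Cos(pi*(x - j*h)/(2*h)), with h = 1/n_cells.
    Returns (error, q). Equal coefficients q_j = q are assumed."""
    h = 1.0 / n_cells
    dx = h / samples
    int_s = 0.0   # integral of s(x), the sum of the truncated cosines
    int_s2 = 0.0  # integral of s(x)^2
    for cell in range(n_cells):
        for i in range(samples):
            x = cell * h + (i + 0.5) * dx
            # Only the arches centred at the two ends of this cell reach x.
            left = math.cos(math.pi * (x - cell * h) / (2 * h))
            right = math.cos(math.pi * (x - (cell + 1) * h) / (2 * h))
            s = left + right
            int_s += s * dx
            int_s2 += s * s * dx
    q = int_s / int_s2  # least-squares coefficient
    return 1.0 - 2 * q * int_s + q * q * int_s2, q


for n in (4, 16, 64):
    err, q = cos_space_error(n)
    print(n, err, q)
```

Under these assumptions the error does not decrease as $h \to 0$; the integrals can also be done by hand, giving $K_0 = 1 - 16/\pi(\pi + 2) \approx 0.0095$. With roof functions in place of the truncated cosines the same sum would equal 1 identically and the error would be zero.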
Of course, this adverse conclusion does not apply to the ordinary cosines, which are among the most valuable trial functions in the Ritz method. In a sense they are of infinite accuracy, $k = \infty$, since they take advantage of each additional degree of smoothness of the solution $u$ to be approximated. Since each cosine is nonzero over the whole interval, this case does not fall in the finite element framework and the restriction that the space must contain polynomials to be effective in approximation is no longer in force.
We conclude this section with some historical remarks on the case of a regular mesh, in other words, on approximation theory in the abstract finite element method. It is natural to approach this problem by use of the Fourier transform. The mean-square norms $\int |D^\alpha u|^2\,dx$ can be converted by Parseval's formula to $\int |\xi^\alpha \hat u|^2\,d\xi$, and the condition that $\varphi$ generate polynomials produces zeros in its Fourier transform. The roof function, for example, generates all linear polynomials; its transform is $\hat\varphi(\xi) = (\sin(\xi/2)/(\xi/2))^2$, which has zeros of order $k = 2$ at all the points $\xi = \pm2\pi, \pm4\pi, \ldots$. For the B-spline of arbitrary degree $k - 1$, the exponent in $\hat\varphi$ simply becomes $k$; it is the result of convolving the box function with itself $k$ times. The connection between polynomials of degree $k - 1$ in $x$ and zeros of order $k$ at $\xi = 2\pi n$ was discovered by Schoenberg in his first paper on splines [S2] and rediscovered three times in the finite element context [G5, B2, F6].
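The transform of the roof function follows in two steps from that of the box function. The worked step below is a sketch under two assumptions made here: unit mesh width, and the unnormalized convention $\hat f(\xi) = \int f(x)\,e^{-i\xi x}\,dx$; the symbol $\chi$ for the box function is also introduced here.

```latex
% Box function: \chi(x) = 1 for |x| <= 1/2, and zero otherwise.
\[
  \hat\chi(\xi) = \int_{-1/2}^{1/2} e^{-i\xi x}\,dx
                = \frac{\sin(\xi/2)}{\xi/2}.
\]
% The roof function is the convolution \varphi = \chi * \chi, and the
% transform of a convolution is the product of the transforms:
\[
  \hat\varphi(\xi) = \hat\chi(\xi)^2
                   = \Big(\frac{\sin(\xi/2)}{\xi/2}\Big)^2 .
\]
% \sin(\xi/2) has simple zeros at \xi = 2\pi n, n \ne 0, so \hat\varphi has
% double zeros there; a k-fold convolution gives zeros of order k.
```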
In the papers [S5, S10] and in Aubin's book [4], the Fourier analysis of the abstract finite element method in $n$ dimensions is pursued very seriously. The corollary stated earlier, that on a regular mesh $S^h$ must be of degree $k - 1$ to achieve approximation of order $h^k$, was first proved by Fourier methods, together with the existence of the superfunction $\psi$ referred to in Section 3.1. We confess to a profound regret that this Fourier analysis cannot be included in full detail, but it is only realistic for us to select those results which are unique to the case of a regular mesh and at the same time important to the general theory: the asymptotic Theorem 3.5 and its corollary, the finite difference aspect of $KQ = F$ described in Section 3.4, and the discussion of condition number in Chapter 5.

Permit us to return to the main result of this section, that in the element $e_i$ the difference between $u$ and its interpolate satisfies

$$|u - u_I|_{m,e_i} \le C\,h_i^{k-m}\,|u|_{k,e_i}. \tag{16}$$

What happens if there is a singularity in the solution $u$, which prevents it from belonging to the space $\mathcal{H}^k$? With a regular mesh, the order of convergence will definitely be reduced; if $u$ possesses only $r$ derivatives in the mean-square sense, the error in energy will decrease like $h^{2(r-m)}$ instead of $h^{2(k-m)}$. And the pointwise error in the strains is visibly worse. The question is whether any improvement is possible by "grading" the mesh, in other words, by varying the mesh width $h$ to produce a finer mesh near the singularity.

A useful rule of thumb, when it is inconvenient to introduce special singular trial functions, is suggested by formula (16): The grading should be done so as to keep $h_i^{k-m}\,|u|_{k,e_i}$ roughly the same from one element to the next. In one dimension, with an $x^\alpha$ singularity at the origin, this means that $h^{k-m+1/2}\,x^{\alpha-k}$ should be approximately constant; the larger $x$ is, the larger $h = \Delta x$ is allowed to be. It appears that this rule has a remarkable consequence: the same order of accuracy can be achieved for a singular as for a regular solution $u$,
by properly grading the mesh. In other words, suppose that for an irregular $n$-dimensional mesh with $N$ elements we compute an average mesh size $\bar h$ from the formula $N\bar h^n = \operatorname{vol}\,\Omega$. Then the correct grading can achieve $|u - u_I|_{m,\Omega} = O(\bar h^{k-m})$, even though $u$ is singular and $|u|_{k,\Omega} = \infty$.
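The one-dimensional grading rule can be turned into an explicit mesh. Requiring $h^{k-m+1/2}x^{\alpha-k}$ to be constant means $h \propto x^p$ with $p = (k - \alpha)/(k - m + 1/2)$, and nodes $x_j = (j/N)^{1/(1-p)}$ have this spacing to leading order. The sketch below builds such a mesh on $[0, 1]$ and compares the indicator with that of a uniform mesh; the function names and the example values $k = 2$, $m = 1$, $\alpha = 3/2$ are assumptions for illustration.

```python
def graded_nodes(n_elements, k, m, alpha):
    """Nodes on [0, 1] graded so that h^(k-m+1/2) * x^(alpha-k) is roughly
    constant, i.e. h ~ x^p with p = (k - alpha) / (k - m + 1/2)."""
    p = (k - alpha) / (k - m + 0.5)
    if not 0.0 <= p < 1.0:
        raise ValueError("this closed form needs m - 1/2 < alpha <= k")
    beta = 1.0 / (1.0 - p)  # x_j = (j/N)^beta gives dx/dj proportional to x^p
    return [(j / n_elements) ** beta for j in range(n_elements + 1)]


def indicator(nodes, k, m, alpha):
    """h_i^(k-m+1/2) * x_mid^(alpha-k) on each element."""
    values = []
    for a, b in zip(nodes, nodes[1:]):
        h, x_mid = b - a, 0.5 * (a + b)
        values.append(h ** (k - m + 0.5) * x_mid ** (alpha - k))
    return values


k, m, alpha, n = 2, 1, 1.5, 50
graded = indicator(graded_nodes(n, k, m, alpha), k, m, alpha)
uniform = indicator([j / n for j in range(n + 1)], k, m, alpha)

print("graded : max/min =", max(graded) / min(graded))
print("uniform: max/min =", max(uniform) / min(uniform))
```

On the graded mesh the indicator is nearly the same on every element, whereas on the uniform mesh it is much larger in the elements next to the singularity.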

3.3. CURVED ELEMENTS AND ISOPARAMETRIC TRANSFORMATIONS

The basic idea is simple. Suppose it is intended to use a standard polynomial element, for example one of those defined on triangles or rectangles in Section 1.8. Suppose also that the regions into which $\Omega$ is subdivided are not of the proper shape; they may have one or more curved sides, or they may be nonrectangular quadrilaterals. By changing to a new $\xi$-$\eta$ coordinate system, the elements can be given the correct shape. The element stiffness matrices are then evaluated by integrations in the new variables, over triangles or rectangles, and minimization leads to a finite element solution $u^h(\xi, \eta)$ which can be transformed back to $x$ and $y$.†

†The isoparametric technique is equally important in three dimensions. It is simpler to discuss examples in the plane, but there is no difference in the theory.

There are several points to watch. First, since a typical two-dimensional element integral is transformed by

$$\iint_{e_i} p(x, y)\,(v_x)^2\,dx\,dy \;\to\; \iint_{E_i} p(x(\xi, \eta), y(\xi, \eta))\,(v_\xi \xi_x + v_\eta \eta_x)^2\,J(\xi, \eta)\,d\xi\,d\eta, \tag{17}$$

the coordinate change and its derivatives must be easily computable. Furthermore, the change must not distort the element excessively, or the Jacobian determinant $J = x_\xi y_\eta - x_\eta y_\xi$ may vanish within the region of integration; it is surprisingly easy for this to happen. Excessive distortion will also destroy the accuracy built into the polynomial element. Polynomials in the new variables do not correspond to polynomials in the old, and the requirement in order to preserve the approximation theory is that the coordinate changes should be uniformly smooth. Finally, in order that conforming elements in $\xi$-$\eta$ shall be conforming in $x$-$y$, there is a global continuity condition on the coordinate change: If the energy involves $m$th derivatives, then the coordinate change must be of class $\mathcal{C}^{m-1}$ between elements. For the present we discuss only the case $m = 1$, arising from second-order differential equations, where the mapping must be continuous between elements: A point common to $e_i$ and $e_j$ must not split into two separate points when $e_i \to E_i$, $e_j \to E_j$.

How are these properties, especially the requirement of computability,
to be satisfied? The isoparametric technique consists in choosing piecewise polynomials to define the coordinate transformations $x(\xi, \eta)$ and $y(\xi, \eta)$. Strictly speaking, isoparametric means that the same polynomial elements are chosen for coordinate change as for the trial functions themselves; subparametric means that a subset of lower-degree polynomials is used. In either case we require continuity between elements and nonvanishing of the Jacobian.

The fundamental isoparametric example is the bilinear transformation from a square to a quadrilateral (Fig. 3.6). The coordinate change between $S$ and $Q$ is given by

$$\begin{aligned}
x(\xi, \eta) &= x_1 + (x_2 - x_1)\xi + (x_3 - x_1)\eta + (x_4 - x_3 - x_2 + x_1)\xi\eta, \\
y(\xi, \eta) &= y_1 + (y_2 - y_1)\xi + (y_3 - y_1)\eta + (y_4 - y_3 - y_2 + y_1)\xi\eta.
\end{aligned} \tag{18}$$

[Fig. 3.6 Isoparametric mappings to quadrilaterals: the unit square $S$ in the $\xi$-$\eta$ plane and its images in the $x$-$y$ plane.]

Each edge of the square $S$ goes linearly into the corresponding edge of $Q$. For example, if $\eta = 0$ and $\xi$ varies from 0 to 1, then the point $(x, y)$ moves linearly from one corner $(x_1, y_1)$ to the next corner $(x_2, y_2)$. On this boundary, the location of the other vertices $(x_3, y_3)$ and $(x_4, y_4)$ has absolutely no effect. This guarantees that any conforming element in $\xi$ and $\eta$ (say a bilinear or bicubic Hermite element, for which the mapping is isoparametric or subparametric, respectively) will still be conforming in $x$ and $y$.

We must check that the mapping (18) is invertible, in other words, that each point $(x, y)$ in $Q$ corresponds to one and only one pair $(\xi, \eta)$ in $S$. Solving the equations (18) for $\xi$ and $\eta$ in terms of $x$ and $y$ will introduce complicated square roots and lead nowhere. Instead, since it is already verified that the boundaries of $S$ and $Q$ correspond, we have only to show that the Jacobian is nonvanishing inside $S$:

$$J = x_\xi y_\eta - x_\eta y_\xi = \begin{vmatrix} x_2 - x_1 + A\eta & x_3 - x_1 + A\xi \\ y_2 - y_1 + B\eta & y_3 - y_1 + B\xi \end{vmatrix},$$
where $A = x_4 - x_3 - x_2 + x_1$ and $B = y_4 - y_3 - y_2 + y_1$. The Jacobian is actually linear, and not bilinear, since the coefficient of $\xi\eta$ in this $2 \times 2$ determinant is $AB - AB = 0$. Therefore, if $J$ has the same sign at all four corners of $S$, it cannot vanish inside. At the typical corner $\xi = 0$, $\eta = 0$, the Jacobian is

$$J(0, 0) = \begin{vmatrix} x_2 - x_1 & x_3 - x_1 \\ y_2 - y_1 & y_3 - y_1 \end{vmatrix}.$$

By the cross-product formula this equals $l\,l' \sin\theta$, where the lengths $l$ and $l'$ and the angle $\theta$ are shown in the figure. Therefore, $J$ is positive at this corner provided the interior angle $\theta$ is less than $\pi$. The same is true at every other corner. Consequently $J$ is nonvanishing if and only if the quadrilateral $Q$ is convex: All its angles must be less than $\pi$. Otherwise, as for $Q''$ in the figure, $J$ will change sign somewhere inside $S$. In this case the coordinate change is illegal.
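The corner test is easy to put into code. The sketch below evaluates the Jacobian of the bilinear map (18) and checks its sign at the four corners of $S$; the function names and the two sample quadrilaterals are illustrative choices, and the vertex ordering follows (18), with $(x_3, y_3)$ the image of $(0, 1)$ and $(x_4, y_4)$ the image of $(1, 1)$.

```python
def jacobian(quad, xi, eta):
    """J = x_xi*y_eta - x_eta*y_xi for the bilinear map (18).
    quad = [(x1, y1), (x2, y2), (x3, y3), (x4, y4)], the images of the
    corners (0,0), (1,0), (0,1), (1,1) of the unit square S."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = quad
    A = x4 - x3 - x2 + x1
    B = y4 - y3 - y2 + y1
    x_xi, x_eta = x2 - x1 + A * eta, x3 - x1 + A * xi
    y_xi, y_eta = y2 - y1 + B * eta, y3 - y1 + B * xi
    return x_xi * y_eta - x_eta * y_xi


def is_invertible(quad):
    """J is linear in (xi, eta), so one sign at all four corners suffices."""
    corners = [jacobian(quad, xi, eta) for xi in (0, 1) for eta in (0, 1)]
    return all(j > 0 for j in corners) or all(j < 0 for j in corners)


unit = [(0, 0), (1, 0), (0, 1), (1, 1)]
convex = [(0, 0), (2, 0), (0, 1), (1.5, 1.5)]
reentrant = [(0, 0), (1, 0), (0, 1), (0.2, 0.2)]  # interior angle > pi at vertex 4

for name, quad in (("unit", unit), ("convex", convex), ("reentrant", reentrant)):
    print(name, is_invertible(quad),
          [jacobian(quad, xi, eta) for xi in (0, 1) for eta in (0, 1)])
```

For the reentrant quadrilateral the Jacobian is positive at $(0, 0)$ and negative at $(1, 1)$, so it must vanish along a line inside $S$.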
Notice that even though polynomials in $x$ and $y$ do not generally transform into polynomials in $\xi$ and $\eta$, the linear polynomials 1, $x$, and $y$ are special. The coordinate transformation itself expresses $x$ and $y$ as bilinear functions of $\xi$ and $\eta$, and of course constant functions remain constant. Since these three polynomials lie in the trial space, the convergence condition is guaranteed to hold: all solutions $u = \alpha + \beta x + \gamma y$ of constant strain are reproduced exactly in $S^h$. This is always the case for isoparametric mappings, and convergence is assured. The subparametric case is even better: if the trial space contains all biquadratics or bicubics in $\xi$ and $\eta$, then it contains all quadratics or cubics in $x$ and $y$, and $S^h$ has degree 2 or 3, respectively. Therefore, assuming that the angles of $Q$ are bounded away from 0 and $\pi$, approximation is possible to the full degree and the error in the strain is $O(h^{k-1})$, as it should be.
The most important example for triangles is one with a curved edge, such as might arise at the boundary $\Gamma$ (Fig. 3.7). The simplest curve possible is one of second degree, and a natural choice of element is the quadratic. On the $\xi$-$\eta$ triangle, the trial function $v^h = a_1 + a_2\xi + a_3\eta + a_4\xi^2 + a_5\xi\eta + a_6\eta^2$ is determined by its values at the six nodes of the figure, three at the vertices and three at the midpoints.

Fig. 3.7 Mappings of a curved triangle.
We imagine two mappings of the original curved triangle in the $x$-$y$ plane. First a simple linear map normalizes the curved triangle, putting the two straight edges onto the coordinate axes in the $x'$-$y'$ plane. The Jacobian is a constant, and this step is simply for convenience. The important step is to connect $x'$-$y'$ to $\xi$-$\eta$, associating an arbitrarily specified boundary point $(X', Y')$ with the midpoint $(\tfrac12, \tfrac12)$. In practice, $(X, Y)$ and therefore $(X', Y')$ will often be chosen halfway along the curve, but this is not required. The mapping is bilinear:

$$x' = \xi + (4X' - 2)\,\xi\eta, \qquad y' = \eta + (4Y' - 2)\,\xi\eta. \tag{19}$$
It is easy to verify that the straight edges are preserved, and continuity with adjacent elements is assured. The Jacobian is simply

$$J = \begin{vmatrix} 1 + (4X' - 2)\eta & (4X' - 2)\xi \\ (4Y' - 2)\eta & 1 + (4Y' - 2)\xi \end{vmatrix} = 1 + (4X' - 2)\eta + (4Y' - 2)\xi.$$

Again the Jacobian is linear. It equals 1 at $\xi = 0$, $\eta = 0$, and will be nonzero in the triangle if and only if it is positive at the other two vertices. This condition, first shown to us by Mitchell, is simply

At (0, 1): $4X' - 2 > -1$, or $X' > \tfrac14$,
At (1, 0): $4Y' - 2 > -1$, or $Y' > \tfrac14$.
Therefore, $(X', Y')$ may lie anywhere in the quadrant formed by the dashed lines, and correspondingly $(X, Y)$ should lie in the sector shown in the figure. Notice that even if the original triangle were straight, the point $(X, Y)$ must lie in the middle half of its edge or shifting it to the midpoint would produce a vanishing Jacobian. (Of course in this case there would be no reason to shift it; on a straight triangle we could use quadratic elements in the $x$-$y$ variables, even with arbitrarily placed midedge nodes. The mapping to $\xi$-$\eta$ is really intended for the case when a curved edge needs to be straightened.)
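The admissibility condition on the boundary point can be checked directly from the mapping (19). The following Python sketch is illustrative only; the function names are hypothetical and the sample points are chosen for the example.

```python
def map19(Xp, Yp, xi, eta):
    """The mapping (19) from the xi-eta triangle to the x'-y' plane."""
    return (xi + (4.0 * Xp - 2.0) * xi * eta,
            eta + (4.0 * Yp - 2.0) * xi * eta)


def jacobian19(Xp, Yp, xi, eta):
    """J = 1 + (4X' - 2) eta + (4Y' - 2) xi, the Jacobian of (19)."""
    return 1.0 + (4.0 * Xp - 2.0) * eta + (4.0 * Yp - 2.0) * xi


def boundary_point_is_admissible(Xp, Yp):
    """J is linear and equals 1 at (0, 0), so it is positive on the whole
    triangle exactly when it is positive at the vertices (0, 1) and (1, 0)."""
    return jacobian19(Xp, Yp, 0.0, 1.0) > 0 and jacobian19(Xp, Yp, 1.0, 0.0) > 0
```

For instance, a point with $X' = 0.2$ violates $X' > \tfrac14$ and is rejected, while any $(X', Y')$ is carried to by the midpoint $(\tfrac12, \tfrac12)$, as the construction requires.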
In this example the curved side was a parabola. In the general isoparametric case, either with triangles or quadrilaterals, the mappings $x(\xi, \eta)$, $y(\xi, \eta)$ are given by the same type of polynomial elements as are used for displacements, and all edges may be polynomials of degree $k - 1$. The constraints will be the same as those for the element itself: when the unknowns include some derivatives at a node, this means that the corresponding derivatives of the boundary curves must be continuous there. The Lagrange case is therefore much the simplest for isoparametric transformations, since the only unknowns are function values and the only constraint is the continuity between elements, which is required in any case. In fact, it is especially simple if, as in the serendipity rectangular element in Fig. 3.8, there are no internal nodes. The mapping between the boundaries then determines the entire coordinate change, which is otherwise highly sensitive to the movement of internal nodes.

Fig. 3.8 Common isoparametric elements.
We must emphasize that the whole isoparametric technique depends on the use of numerical integration (in the $\xi$-$\eta$ variables) to compute the entries of $K$ and $F$. From the change of variables in the element integral (17), it is obvious even for an isotropic material ($p$ constant) that the mathematical equivalent of variable material properties is introduced by the functions $\xi_x$, $\eta_x$, and $J(\xi, \eta)$. In general the first two are rational functions and the last is a polynomial, with smoothness depending on the distortion of the element. We shall establish in Section 4.3 how the numerical integration error affects the final result and what order of accuracy is required.
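As a rough illustration of integrating in the $\xi$-$\eta$ variables, the sketch below applies tensor-product Gauss quadrature on the unit square to an integral over a straight-sided quadrilateral. It is not the element integral (17) itself: the integrand `f` is a generic placeholder, the bilinear map $x = x_1 + (x_2 - x_1)\xi + (x_3 - x_1)\eta + A\xi\eta$ (and similarly for $y$) is an assumption inferred from the Jacobian entries quoted earlier, and all function names are hypothetical.

```python
import numpy as np


def gauss_rule_unit_interval(n):
    """n-point Gauss-Legendre rule, shifted from [-1, 1] to [0, 1]."""
    t, w = np.polynomial.legendre.leggauss(n)
    return 0.5 * (t + 1.0), 0.5 * w


def integrate_over_quadrilateral(f, corners, n=2):
    """Approximate the integral of f(x, y) over the quadrilateral by an
    n-by-n Gauss rule in xi-eta, with the Jacobian as weight factor.

    corners are assumed to be the images of (0,0), (1,0), (0,1), (1,1).
    """
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = corners
    A = x4 - x3 - x2 + x1
    B = y4 - y3 - y2 + y1
    pts, wts = gauss_rule_unit_interval(n)
    total = 0.0
    for xi, w_xi in zip(pts, wts):
        for eta, w_eta in zip(pts, wts):
            x = x1 + (x2 - x1) * xi + (x3 - x1) * eta + A * xi * eta
            y = y1 + (y2 - y1) * xi + (y3 - y1) * eta + B * xi * eta
            J = ((x2 - x1 + A * eta) * (y3 - y1 + B * xi)
                 - (x3 - x1 + A * xi) * (y2 - y1 + B * eta))
            total += w_xi * w_eta * f(x, y) * abs(J)
    return total
```

With `f` identically 1 the rule returns the area of the quadrilateral; since $J$ is linear here, even a low-order rule is exact for that case. Integrands that involve $\xi_x$ or $\eta_x$ are rational, and for those the rule is only approximate.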
Here we deal with the question of approximation: How closely can isoparametric elements match the true solution $u(x, y)$? The answer must depend on the size of the derivatives in the coordinate change $F$:

$$F(\xi, \eta) = (x(\xi, \eta),\ y(\xi, \eta)).$$

Let $\hat u(\xi, \eta)$ denote the solution $u(x, y)$ transformed to the new coordinates. Then if $S^h$ is of degree $k - 1$ in $\xi$ and $\eta$, the interpolate $\hat u_I$ satisfies

$$\|\hat u - \hat u_I\|_{s,E_i} \le C\,h_i^{k-s}\,\|\hat u\|_{k,E_i}. \tag{20}$$

Changing to the $x$-$y$ variables, each derivative of $\hat u$ is transformed into a sum
of terms, for example,

$$\hat u_\xi = u_x x_\xi + u_y y_\xi.$$

In general a derivative of order $|\beta| \le k$ will be bounded by

$$|D^\beta \hat u(\xi, \eta)| \le \|F\|_k \sum_{|\alpha|=1}^{k} |D^\alpha u(x, y)|,$$

where the constant $\|F\|_k$ is computed from powers of the derivatives $x_\xi$, $y_\xi$, $x_\eta$, ... within the element up to order $k$. Ciarlet and Raviart [C7], whose argument we are following, write out this expression in detail. The reciprocal of the Jacobian $J$ also enters the transformation of the integral:

$$\int_{E_i} |D^\beta \hat u|^2 \, d\xi\, d\eta \le \frac{\|F\|_k^2}{\min |J(\xi, \eta)|} \int_{e_i} \sum_{|\alpha|=1}^{k} |D^\alpha u|^2 \, dx\, dy.$$

This holds for all $|\beta| \le k$, and therefore

$$\|\hat u\|_{k,E_i}^2 \le \frac{\|F\|_k^2}{\min |J|}\, \|u\|_{k,e_i}^2. \tag{21}$$

This yields an upper bound for the right side of (20).
For the left side we want a lower bound, and therefore the argument which led to (21) is reversed:

$$\|\hat u - \hat u_I\|_{s,E_i}^2 \ge \frac{\|u - u_I\|_{s,e_i}^2}{\|F^{-1}\|_s^2 \, \max |J|}. \tag{22}$$

This time the factor $\|F^{-1}\|_s$ depends on powers of derivatives up to order $s$ of the inverse mapping $F^{-1}$, taking $x$, $y$ to $\xi$, $\eta$. If the Jacobian determinant $J$ is nonvanishing, as we assume, then derivatives of $F^{-1}$ can be bounded in terms of derivatives of $F$. For the first derivatives, this is expressed by the identity for the Jacobian matrices

$$\begin{pmatrix} \xi_x & \xi_y \\ \eta_x & \eta_y \end{pmatrix} = \begin{pmatrix} x_\xi & x_\eta \\ y_\xi & y_\eta \end{pmatrix}^{-1}.$$

For higher derivatives this identity is differentiated, and the essential question again becomes the boundedness of the derivatives of $F$.
Now we assemble the results. Substituting (21) and (22), the original inequality $\|\hat u - \hat u_I\|_s \le C h_i^{k-s} \|\hat u\|_k$ on the element $E_i$ is converted to a similar inequality on $e_i$:

$$\|u - u_I\|_{s,e_i} \le C'\,h_i^{k-s}\,\|u\|_{k,e_i}, \tag{23}$$

where the constant is now

$$C' = C \left( \frac{\max |J|}{\min |J|} \right)^{1/2} \|F\|_k \, \|F^{-1}\|_s.$$
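The assembly can be written as a single chain of inequalities. The display below is a reader's sketch, not a quotation: it assumes the reference-element estimate (20) in the form $\|\hat u - \hat u_I\|_{s,E_i} \le C h_i^{k-s} \|\hat u\|_{k,E_i}$, and uses the area factors $dx\,dy = |J|\,d\xi\,d\eta$ that produce $\max|J|$ in one direction and $\min|J|$ in the other.

```latex
\begin{aligned}
\|u - u_I\|_{s,e_i}
  &\le \bigl(\max |J|\bigr)^{1/2}\, \|F^{-1}\|_s \, \|\hat u - \hat u_I\|_{s,E_i}
  && \text{(back to the reference element)} \\
  &\le \bigl(\max |J|\bigr)^{1/2}\, \|F^{-1}\|_s \; C\, h_i^{k-s}\, \|\hat u\|_{k,E_i}
  && \text{(estimate (20))} \\
  &\le C \left(\frac{\max |J|}{\min |J|}\right)^{1/2} \|F\|_k \, \|F^{-1}\|_s \; h_i^{k-s}\, \|u\|_{k,e_i}
  && \text{(upper bound (21))}.
\end{aligned}
```

The product of the three factors in the last line is exactly the constant $C'$, which makes clear why both the size of the derivatives of $F$ and the ratio $\max|J| / \min|J|$ must stay bounded.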
This is the fundamental result. The order of approximation is the same with isoparametrics as with the standard polynomials, provided $C'$ remains bounded. Note that $h_i$ is still the diameter of the element in the $\xi$-$\eta$ plane, even though the approximation inequality (23) is now in $x$-$y$. We may, however, suppose the diameters to be the same in the two coordinate systems. (A change of scale in $\xi$-$\eta$ leaves the inequality unchanged, as it must; if $h_i$ becomes $\alpha h_i$, then the norms of $F$ and $F^{-1}$ are multiplied by $\alpha^{-k}$ and $\alpha^{s}$, respectively.) This will be a convenient normalization, since with equal diameters the isoparametric approximation problem reduces exactly to the boundedness of $F$ and its derivatives, and the nonvanishing of the Jacobian $J$, uniformly as $h \to 0$.
There is an important modification in the quadrilateral case. It is needed even for the map defined by

$$x(\xi, \eta) = \xi, \qquad y(\xi, \eta) = \eta - \frac{\xi\eta}{2h}.$$

This transformation takes the three points $(0, 0)$, $(h, 0)$ and $(0, h)$ into themselves and moves the fourth corner $(h, h)$ of the square to a new point $(h, h/2)$. In other words, this is a completely typical map, transforming the square in the $\xi$-$\eta$ plane into a quadrilateral of comparable size and shape. Nevertheless, the cross derivative $y_{\xi\eta}$, which enters the norm $\|F\|_k$ in the approximation inequality, is of order $1/h$. This would destroy the order of approximation. Ciarlet and Raviart were able, however, to make use of the presence of the twist term $\xi\eta$ in the trial space; bilinear interpolation reproduces it exactly, in addition to the linear terms $a_1 + a_2\xi + a_3\eta$ of the triangular case. Their result is that for quadrilaterals, the cross derivatives $x_{\xi\eta}$ and $y_{\xi\eta}$ do not appear in the factor $\|F\|_k$, and the expected order of approximation is possible. A similar conclusion holds for biquadratics and bicubics.
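A short numerical check of this example map, written as an illustrative Python sketch (the function names are hypothetical): it confirms where the corners go and that the cross derivative $y_{\xi\eta} = -1/(2h)$ grows like $1/h$ as the square shrinks.

```python
def twisted_map(xi, eta, h):
    """The example map x = xi, y = eta - xi*eta/(2h) on the square of side h."""
    return xi, eta - xi * eta / (2.0 * h)


def cross_derivative_y(h):
    """y_{xi eta} as the mixed difference over the square of side h.

    The mixed difference is exact here because y is bilinear in (xi, eta).
    """
    def y(xi, eta):
        return twisted_map(xi, eta, h)[1]
    return (y(h, h) - y(h, 0.0) - y(0.0, h) + y(0.0, 0.0)) / h**2
```

Halving $h$ doubles the magnitude of the cross derivative, even though the shape of the image quadrilateral is unchanged.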
The consequences of the inequality (23) are summarized in the following

THEOREM 3.6

Suppose the $\xi$-$\eta$ trial space is of degree $k - 1$, and the element transformations $F$ into $x$-$y$ are uniform as $h \to 0$. Then if $u$ belongs to $\mathcal{H}^k(\Omega)$, its interpolate $u_I$ in $S^h$ satisfies

$$\left( \int_\Omega |u - u_I|^2 \, dx \right)^{1/2} \le C' h^{k} \|u\|_k, \qquad \left( \int_\Omega |\operatorname{grad}(u - u_I)|^2 \, dx \right)^{1/2} \le C' h^{k-1} \|u\|_k.$$

The constant $C'$ is given in (23). The rate of convergence of the finite element solution $u^h$, for a second-order differential equation, is therefore the full $h^{2(k-1)}$ in strain energy.

The inequalities in the theorem apply only to the function and its first derivatives, that is, to the cases $s = 0$ and $s = 1$ of (23). Approximation of order $h^{k-s}$ holds also for higher derivatives within each element, but when the elements are joined the interpolate $u_I$ is no more than continuous, and belongs only to $\mathcal{H}^1$ over the whole domain $\Omega$.
The uniformity required by the theorem is a rather severe condition, and amounts essentially to this:

1. The Jacobians should be bounded away from zero, forcing all angles to be bounded away from 0 and $\pi$, and
2. The edges of the elements in the $x$-$y$ plane should be polynomials with uniformly bounded derivatives, i.e., the curvatures and the higher derivatives along the edges must remain bounded. In particular, the edges should be displaced only by $O(h^2)$ from straight lines. Then a well-chosen map $F$ will have $\|F\|_k <$ constant.
Computations by Fried [F14] have demonstrated the necessity of condition 2. He increased the curvature of the edges until their displacement from the original square was of the same order as the size of the square itself. On a unit square this still implies a bounded curvature, but scaled down to size $h$ the curvature (being a second derivative) is of order $1/h$. The numerical results were correspondingly poor. Practical problems will fall somewhere between these overly distorted elements and the nearly straight ones required by the uniformity hypothesis in the theorem.
On the happy side, suppose that a curved element occurs at the boundary of $\Omega$; its one curved side follows the true boundary $\Gamma$. (It will normally be a polynomial approximation to $\Gamma$, interpolating the true boundary at specified nodes.) Then if $\Gamma$ is smooth, the uniformity condition on this edge will automatically be satisfied. The interpolating polynomial edge will vary only by $O(h^2)$ from a straight line, and the curvatures will be bounded in terms of the curvature of $\Gamma$. Therefore, the isoparametric technique, for second-order equations, allows essential boundary conditions to be handled with no loss in accuracy or in simplicity over natural boundary conditions. The same remarks apply to an internal boundary.
The improvement which this technique allows, in comparison with approximating the boundary by a polygon, is enormous. In his experiments with cubic triangular elements, for example, Zlamal [Z10] found a difference of an order of magnitude in displacements, and even more in strains. There can be no doubt of the technique's success in second-order problems.†
For fourth-order equations (plate and shell problems, for example) the position is much less happy. The theory insists that the coordinate change be $C^1$: its first derivatives should be continuous between elements, or the trial functions are nonconforming. (Convergence is still possible for nonconforming elements, as we prove in the next chapter. However, it appears that for plate elements even this hope is dashed, because they cannot pass the required patch test.) A $C^1$ coordinate change is theoretically possible, but the extra continuity requirement is extremely restrictive (Fig. 3.9). Once two directions are known at a point (at $P$, the tangents to $PA$ and $PB$ are established when $S$ is mapped to $Q$) all other directions are completely determined. [For any $C^1$ function $f(x, y)$, all directional derivatives can be computed from the gradient, that is, from the derivatives $f_x$ and $f_y$ in two directions.] In the figure, this means that the tangent to $PC$ is determined. In fact, the curve $CPA$ must have a continuous tangent at $P$, since $cpa$ is straight. It follows that a general quadrilateral or triangular mesh, even one with straight edges, cannot be carried into a regular mesh by a $C^1$ change of coordinates. The mapping in the figure could be carried out by bicubic Hermite elements, for example, allowing the curved edges to be cubics; but their tangents, and in the Hermite case also the cross-derivatives $x_{\xi\eta}$ and $y_{\xi\eta}$, must be continuous at the vertices. This is a severe, and almost an unacceptable, theoretical limitation on the isoparametric technique for fourth-order problems.

Fig. 3.9 The constraints on a $C^1$ mapping.

†Gordon and Hall recently proposed to map $\Omega$ all at once, rather than element by element, into a square. For this they use blending functions, a variant of the standard finite elements. If $\Omega$ is itself not too unlike a square (the mapping of a circular region would create artificial singularities at the corners) this should save time as compared to the isoparametric method on each element.
The message seems to be this, that it is a mistake to work with fourth-order equations rather than second-order systems. Analytically the elimination of unknowns may be of overwhelming importance, but not numerically. The construction of $C^1$ polynomials on straight-sided elements is already difficult, and curved elements (especially when we come to consider numerical integration) are infinitely worse.
3.4. ERROR ESTIMATES

In this section we put the previous approximation theorems to use, in order to achieve the main goal of our whole analysis: to estimate the finite element error $u - u^h$. The function $u$ is the solution to an $n$-dimensional elliptic boundary-value problem of order $2m$, and $u^h$ is its Ritz approximation computed in a finite element space $S^h$. On a regular mesh, the finite element equations $KQ = F$ become a system of difference equations, and we shall find at the same time the order of accuracy of these difference equations.

The basic question is how well the subspaces $S^h$ approximate the full admissible space $\mathcal{H}^m_E$. In the energy norm, nothing else matters: $u^h$ is as close as possible to $u$, and the error in energy must be of the optimal order $h^{2(k-m)}$:

$$a(u - u^h,\ u - u^h) \le C^2 h^{2(k-m)} \|u\|_k^2. \tag{24}$$
This is the simplest and still the most fundamental error estimate. Since the expression $a(v, v)$ for strain energy is positive definite (in other words, since the problem is elliptic: $a(v, v) \ge \sigma \|v\|_m^2$), (24) is equivalent to

$$\|u - u^h\|_m \le C' h^{k-m} \|u\|_k. \tag{24'}$$
In displacement, or more generally in the $s$th derivative, the first question is whether $u - u^h$ is again of the optimal order $h^{k-s}$. This would imply that everything is determined by the degree $k - 1$ of the finite elements, but it is not quite true. If we fix $S^h$ and increase the order $2m$ of the problem, then eventually the trial functions will no longer be admissible and the Ritz method will collapse. Therefore, the order of accuracy must somehow depend on $m$ as well as on $k$ and $s$.

The correct order can be determined by an elegant variational argument due to Aubin and Nitsche. The argument has come to be known as the Nitsche trick; it was extended by Schultz, and is now the standard approach to errors in displacement. In Section 1.6, we proved an $h^2$ error with linear elements; the general case is going to be more technical, but the results (25-26) are very simple. The upper limit $2(k - m)$ on the rate of convergence was found by the first author [S10].
THEOREM 3.7

Suppose that the finite element space $S^h$ is of degree $k - 1$ and that the strain energy has smooth coefficients and satisfies the ellipticity condition $\sigma \|v\|_m^2 \le a(v, v) \le K \|v\|_m^2$. Then the finite element approximation $u^h$ differs from the true solution $u$ by

$$\|u - u^h\|_s \le C h^{k-s} \|u\|_k \quad \text{if } s \ge 2m - k, \tag{25}$$

$$\|u - u^h\|_s \le C h^{2(k-m)} \|u\|_k \quad \text{if } s \le 2m - k. \tag{26}$$

These exponents are optimal, so that the order of accuracy never exceeds $2(k - m)$ in any norm; in almost all realistic cases the order is $k - s$.
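The two cases (25) and (26) amount to taking the smaller of the exponents $k - s$ and $2(k - m)$. A tiny Python helper makes it easy to tabulate the predicted rate; the function name is hypothetical and the helper simply encodes the statement of the theorem.

```python
def convergence_exponent(k, m, s):
    """Exponent of h in the bound on ||u - u^h||_s from (25)-(26).

    k - 1 is the degree of the elements, 2m the order of the problem,
    and s the order of the norm (s may be negative).
    """
    if s >= 2 * m - k:
        return k - s          # case (25)
    return 2 * (k - m)        # case (26)
```

For linear elements on a second-order problem ($k = 2$, $m = 1$) this gives exponent 2 for $s = 0$ and 1 for $s = 1$, and still only 2 for $s = -1$; for cubics ($k = 4$, $m = 1$) the exponent for $s = -1$ rises to 5.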
Proof. Nitsche's trick is to introduce an auxiliary problem $Lw = g$, whose variational form (equation of virtual work) is

$$a(w, v) = (g, v) \quad \text{for all } v \text{ in } \mathcal{H}^m_E. \tag{27}$$

From the theory of partial differential equations, there exists a unique solution $w$ which is $2m$ derivatives smoother than the data $g$: $\|w\|_{2m-s} \le c \|g\|_{-s}$. We choose $v = u - u^h$ in (27):

$$|(g, u - u^h)| = |a(w, u - u^h)| = |a(w - v^h, u - u^h)| \le K \|w - v^h\|_m \|u - u^h\|_m. \tag{28}$$

This holds for any $v^h$ in $S^h$, since $a(v^h, u - u^h) = 0$ by the fundamental Ritz theorem 1.1. Now suppose that $v^h$ is the closest approximation to $w$ in the $\mathcal{H}^m$ norm. [Or, what is effectively the same, let $v^h$ be the finite element solution to the auxiliary problem (27) and therefore the best approximation to $w$ in the energy. Note that only approximation in energy is used in the proof, never a direct approximation in the $\mathcal{H}^s$ norm.] According to the approximation theorem 3.3,

$$\|w - v^h\|_m \le \begin{cases} C h^{m-s} \|w\|_{2m-s} \le C' h^{m-s} \|g\|_{-s} & \text{if } k \ge 2m - s, \\ C h^{k-m} \|w\|_k \le C' h^{k-m} \|g\|_{-s} & \text{if } k < 2m - s. \end{cases} \tag{29}$$

In the first case $k$ was reduced to $2m - s$ before the approximation theorem was applied; if the subspace is complete through degree $k - 1$, then it is certainly complete through any lower degree.
Substituting (24') and (29) into (28),

$$|(g, u - u^h)| \le \begin{cases} C'' h^{k-s} \|g\|_{-s} \|u\|_k & \text{if } k \ge 2m - s, \\ C'' h^{2(k-m)} \|g\|_{-s} \|u\|_k & \text{if } k < 2m - s. \end{cases}$$

By duality in the definition (equation 1.58) of negative norms,

$$\|u - u^h\|_s = \max_g \frac{|(g, u - u^h)|}{\|g\|_{-s}}.$$

The proof is complete. It is most straightforward for the displacement, $s = 0$, because in this case the auxiliary data $g$ is exactly $u - u^h$; this choice was made in Section 1.6.
A similar result holds with inhomogeneous essential boundary conditions [S6]. The error estimates have also been extended to problems of forced vibration, in which the basic ellipticity condition $a(v, v) \ge \sigma \|v\|_m^2$ is violated by the addition of a new zero-order term. Effectively, the original differential equation $Lu = f$ is altered to $Lu - \alpha u = f$: if $\alpha$ falls in between two eigenvalues of $L$, then $L - \alpha$ is no longer positive definite, but the equation still has a solution. Schultz has proved [S4] that the rate of convergence is unchanged.
Remarks. The case $s = m$, corresponding to error in energy, always yields the correct power $h^{k-m}$. The error in higher derivatives, $s > m$, is not included in the theorem as it stands, since $w$ lies outside $\mathcal{H}^m$ and (29) is nonsense. Nevertheless the exponent $k - s$ will be correct within each element, provided the subspaces $S^h$ satisfy an inverse hypothesis: each differentiation of a trial function $v^h$ increases its maximum by at most a factor $c/h$. This hypothesis is almost the same as the uniformity condition (2), except that there it was a case of $c/h_i$; therefore the inverse hypothesis is fulfilled if all elements are of comparable size. Otherwise a factor $(h_{\max}/h_{\min})^{s-m}$ appears in the error estimate for derivatives of order $s > m$.
The theorem and proof are valid without change in case $s < 0$. Surprisingly, the rate of convergence in negative norms is not only of academic interest. The reason is this: the error in the $-1$ norm is a bound for the average error over the domain:

$$\|u - u^h\|_{-1} = \max_v \frac{\left| \int (u - u^h)\, v \, dx \right|}{\|v\|_1} \ge \frac{\left| \int (u - u^h) \, dx \right|}{(\operatorname{vol} \Omega)^{1/2}},$$

if we choose $v \equiv 1$. Therefore, in the usual case $k > 2m$, Theorem 3.7 has the following consequence for $s = -1$: the average error is much smaller than the typical displacement error at a point. More precisely,

$$|u(x) - u^h(x)|^2 \sim h^{2k} \quad \text{but} \quad \left| \int (u - u^h) \, dx \right|^2 \sim h^{2(k+1)}.$$

This must mean that the error alternates rapidly in sign. In fact, it does so within each element, and the practical problem is to discover even approximately the "special points" where these sign changes occur. Near such points the displacements $u^h$ will be of exceptional accuracy.
(We had imagined that $-u'' = 1$, with linear elements, would provide the perfect example: $u^h$ coincides with the interpolate $u_I$, so there is exceptional accuracy at the nodes.† Looking more closely, however, the average error is only of the same order $h^2$ as the error at a typical point. In fact, the linear interpolate never goes above the true (quadratic) solution, and the error is completely one-sided. The explanation is that the condition $k > 2m$ for the $h^{k+1}$ phenomenon does not hold: $k = 2m = 2$. The nodes are exceptional, but it is because the trial functions solve the homogeneous differential equation $-u'' = 0$ [H6, T4]; this is not an example of rapid alternations in sign.)
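This one-dimensional example is easy to reproduce. The Python sketch below is an illustration under stated assumptions: the interval $(0, 1)$, the boundary conditions $u(0) = u(1) = 0$, a uniform mesh, and the function names are all choices made here, not taken from the text. With these choices the exact solution is $u = x(1 - x)/2$.

```python
import numpy as np


def linear_fem_solution(n):
    """Linear elements for -u'' = 1 on (0, 1), u(0) = u(1) = 0, n equal elements."""
    h = 1.0 / n
    # Stiffness matrix (1/h) * tridiag(-1, 2, -1); load vector h * (1, ..., 1).
    K = (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h
    F = h * np.ones(n - 1)
    interior = np.linalg.solve(K, F)
    nodes = np.linspace(0.0, 1.0, n + 1)
    return nodes, np.concatenate(([0.0], interior, [0.0]))


def exact(x):
    return 0.5 * x * (1.0 - x)


def error_summary(n, samples_per_element=200):
    """Nodal error, sign of the error, and average error of the FE solution."""
    nodes, uh = linear_fem_solution(n)
    x = np.linspace(0.0, 1.0, n * samples_per_element + 1)
    err = exact(x) - np.interp(x, nodes, uh)
    dx = x[1] - x[0]
    return {
        "max_nodal_error": float(np.max(np.abs(exact(nodes) - uh))),
        "min_error": float(err.min()),
        # Trapezoid rule; the end terms vanish because err is zero at 0 and 1.
        "average_error": float(np.sum(err) * dx),
        "max_error": float(err.max()),
    }
```

On each element the error is $(x - x_i)(x_{i+1} - x)/2$, which is nonnegative, has maximum $h^2/8$, and integrates to $h^3/12$; the average over the unit interval is therefore $h^2/12$, the same order as the pointwise error.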
The stresses are in a similar position. In second-order problems they are in error by $h^{k-1}$ at typical points, and $h^k$ on average; therefore, these errors also must alternate in sign, and there must exist exceptional stress points. Their presence was noticed in actual computations by Barlow, and it appears for quadratics on triangles that the midpoints of the edges are exceptional. The accuracy at these midpoints is better than at the vertices, where even after averaging the results from adjacent elements, the stress approximations are not good. Since the midpoints are also nodes for quadratic elements, the situation is extremely favorable; it is only spoiled by errors due to change in domain, which do not necessarily alternate in sign.
We believe that the stress points can be located in the following way. The leading term in the error is governed by the problem of approximating polynomials $P_k$ of degree $k$, in energy, by the trial functions in $S^h$. On a regular mesh this problem can be solved exactly. Then stress points are identified by the property that the true stresses (derivatives of $P_k$) coincide with their approximations (derivatives of a lower-degree polynomial). With first-degree elements in one dimension, equality occurs at the midpoints of the intervals; that is where the slope of a quadratic matches the slope of its linear interpolate. (Equivalently, that is where the error function in Fig. 3.3 has a horizontal tangent.) Symmetry suggests that the midpoints are also exceptional for elements of higher degree. (The exceptional points for displacement lie elsewhere, and in the simplest case they were the zeros of the second Legendre polynomial. They seem more sensitive to the boundary conditions than the stress points.) In two dimensions, the results may depend on the choice of $P_k$, and exceptional points for one stress component need not be exceptional for the others. The midpoints of an edge seem likely to be exceptional for derivatives along the edge but not for stresses in the direction of the normal. This is still an area for research; eventually these special points will be more completely understood.

†The phenomenon of "super-convergence" at the nodes was recently made clearer by an elegant observation of Dupont and Douglas. Let $G_0(x)$ be the fundamental solution corresponding to the point $x_0$; $G_0$ is the response to a point load $f_0 = \delta(x - x_0)$. Then

$$|u(x_0) - u^h(x_0)| \le C\, \|u - u^h\|_m \, \|G_0 - v^h\|_m \quad \text{for any } v^h \text{ in } S^h.$$

Proof: $u(x_0) - u^h(x_0) = (u - u^h, f_0) = a(u - u^h, G_0) = a(u - u^h, G_0 - v^h)$. The most interesting case occurs if $x_0$ is a node, since the approximation of $G_0$ is likely to be especially good; in the one-dimensional case $-u'' = f$, $G_0$ is linear with a change in slope at $x_0$, and can be reproduced identically by $v^h$. This confirms the infinite accuracy in this special case. Normally, the term $\|G_0 - v^h\|$ will add some finite power of $h$ to the $h^{k-m}$ coming from $\|u - u^h\|_m$. The whole question of pointwise convergence, and of these increases in the power of $h$ (super-convergence) at special points, is now being studied very intensively.
Three additional problems arise on a regular mesh:

1. To interpret $KQ = F$ as a system of finite difference equations, with corresponding local truncation errors.
2. To prove that the exponents in Theorem 3.7 are optimal.
3. To show that for smooth solutions, the same rates of convergence apply not only in the mean, but at every individual point.
We shall not attempt a technical discussion, particularly of problem 3. Roughly speaking, once the behavior of the truncation errors is established, the central question is the stability of the difference operator $K$. This property is difficult to establish in the maximum norm, but with the inverse hypothesis mentioned above it is almost certain to hold. We have verified it for constant-coefficient model problems, and the error at every point of the domain is then of the correct order. More precise results have been achieved by Nitsche, Bramble, and Ciarlet and Raviart, but the general problem is unsolved.

For problem 1 suppose first that there is only one node (and one associated unknown) for every mesh square; this will be the case for linear elements on right triangles, bilinear elements on squares, and splines. Then $KQ = F$ will look exactly like a conventional difference equation. This fact has led to innumerable discussions about the relation between finite elements and finite differences. It is clear that not all difference equations can be produced by an appropriate choice of element; the matrix $K$ must be symmetric and positive definite, and even under these restrictions a corresponding element may not exist. On the other hand, a sufficiently tolerant reader may be willing to regard all finite element equations (even on an irregular mesh, with many nodal unknowns) as finite difference equations. We agree with this view.
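As a small illustration of $KQ = F$ looking like a conventional difference equation, the sketch below assembles the stiffness matrix for linear elements in one dimension. The setting is our own assumption (the energy $\int u'v'\,dx$ on $(0, 1)$, a uniform mesh, and the hypothetical helper `assemble_linear_stiffness`); it is not taken from the text.

```python
import numpy as np

def assemble_linear_stiffness(n):
    """Assemble K for a(u, v) = integral of u'v' with n equal linear elements on (0, 1)."""
    h = 1.0 / n
    K = np.zeros((n + 1, n + 1))
    k_elem = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h  # element stiffness matrix
    for e in range(n):
        K[e:e + 2, e:e + 2] += k_elem   # add the element contribution
    return K, h

K, h = assemble_linear_stiffness(6)
interior_row = K[3, 2:5]          # coefficients of u_{j-1}, u_j, u_{j+1} in row j = 3
stencil = interior_row * h        # scaled so the familiar pattern is visible
print(stencil)
```

Dividing an interior row by $h$ gives $h^{-2}(-u_{j-1} + 2u_j - u_{j+1})$, the usual second difference, and the assembled matrix is symmetric, as the text requires of any $K$ produced by elements.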
The system $KQ = F$ is in general a novel kind of coupled difference equation, which in principle could have been devised without the variational principle as an intermediary. Historically, of course, that almost never happened; the finite element method leads systematically to a special class of equations (the intersection between all possible difference equations and all possible Ritz-Galerkin equations) which are astonishingly successful in computations.
If the original differential equation $Lu = f$ has constant coefficients the order of the local truncation error may be tested as follows: Let $f(x)$ be a pure exponential $e^{i\xi x}$, so that $u$ can be found explicitly and substituted into the difference equation. The truncation error will have the form $e^{i\xi x} E(h, \xi)$. With $-u'' = f$, for example, the solution is $u = e^{i\xi x}/\xi^2$, and the truncation error of $h^{-2}(-u_{j+1} + 2u_j - u_{j-1})$ is

$$e^{i\xi x}\left( \frac{2 - 2\cos \xi h}{\xi^2 h^2} - 1 \right) = e^{i\xi x}\left( -\frac{\xi^2 h^2}{12} + \cdots \right).$$

In general the coefficient $E$ can be computed in terms of the Fourier transforms of the trial functions. (Here we are right at the heart of the abstract finite element method; this technique would be impossible on an irregular mesh.) Expanding $E$ as a power series in $h$, the leading exponent is exactly the order of accuracy of the difference equation. We have computed this exponent and verified that it is the smaller of $k$ and $2(k - m)$; the rate of convergence given in Theorem 3.7 is correct.
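The exponential test can be tried numerically for the three-point second difference. This is a sketch under our own conventions: we take the truncation error to be the difference quotient applied to $u = e^{i\xi x}/\xi^2$ minus $f = e^{i\xi x}$, so that the symbol is $E(h, \xi) = (2 - 2\cos \xi h)/(\xi h)^2 - 1$; the function name `truncation_symbol` and the sample values of $\xi$ and $h$ are illustrative.

```python
import numpy as np

def truncation_symbol(h, xi):
    """E(h, xi) for h^-2 (-u_{j+1} + 2 u_j - u_{j-1}) applied to exp(i xi x) / xi^2, minus f."""
    return (2.0 - 2.0 * np.cos(xi * h)) / (xi * h) ** 2 - 1.0

xi = 3.0
hs = np.array([0.1, 0.05, 0.025, 0.0125])
E = truncation_symbol(hs, xi)

# Halving h should divide E by about 2^p, where p is the leading exponent.
observed_order = np.log2(np.abs(E[:-1]) / np.abs(E[1:]))
leading_term = -(xi * hs) ** 2 / 12.0   # first term of the power series in h
print(observed_order, E / leading_term)
```

For linear elements ($k = 2$, $m = 1$) the observed exponent is close to 2, which matches the smaller of $k$ and $2(k - m)$.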
Suppose, finally, that there are $M$ unknowns associated with each mesh square, so that the finite element equation $KQ = F$ becomes a coupled system of $M$ difference equations. The unknowns may be the values of $u^h$ at distinct nodes, or function values and derivatives at a multiple node. This introduced no complications when the error was estimated by a variational argument, in Theorem 3.7; that result depends only on the order of approximation achieved by $S^h$, and any additional facts about the subspace were irrelevant. The difference equation aspect, however, becomes much more subtle when $M > 1$.
Briefly, the problem is that the truncation errors in the difference equations are not all of the expected order. Therefore, it will not do simply to estimate these errors and then, applying stability to invert the matrix $K$, to convert them into estimates of the error $u - u^h$. The point is that $u^h$ is given by a special combination of the trial functions, and if other combinations make little or no contribution to the problem of approximation, then also their contribution to $u^h$ turns out to be small. We recall that in the abstract method, $\Phi_1, \ldots, \Phi_M$ generate approximation of order $k$ if and only if it is possible to construct from them a single function $\psi$ with the property (5) required in Theorem 3.2, that is, a function which by itself is adequate for approximation. We may regard $S^h$ as generated by this superfunction $\psi$ and $M - 1$ more or less useless functions. Forming the corresponding combinations of the difference equations $KQ = F$, we can rewrite our finite element system as a set of difference equations having a very special form: one equation of the system is an accurate analogue of the original differential equation, and the other $M - 1$ (coming from the functions in $S^h$ which are useless for approximation) are completely inconsistent. (This construction becomes very technical for large $M$, and Fourier analysis is the only possible tool.) The final conclusion is that the Ritz method attaches almost all weight to the one difference equation which corresponds to $\psi$, and that the order of accuracy of this equation agrees with the exponent $\min(k, 2(k - m))$ in Theorem 3.7. This is the order of accuracy (in displacement) of the finite element method.
4 VARIATIONAL CRIMES

4.1. VIOLATIONS OF THE RAYLEIGH-RITZ CODE
The one cardinal rule in the Ritz theory is that the trial functions should be admissible in the variational principle. In our notation, each $v^h$ should belong to the space $\mathcal{H}_E^m$, and $u^h$ should minimize $I(v^h)$. This rule is simple to state, but it is broken every day, and for good reason. In fact, there are three conditions to the rule, and all three present computational difficulties (perhaps not insuperable difficulties, but serious ones):

1. The trial functions should possess $m$ derivatives in the mean-square sense, and therefore be of class $C^{m-1}$ across element boundaries.
2. Essential boundary conditions should be respected.
3. The functional $I(v^h) = q^T K q - 2 q^T F$ should be computed exactly.
Our goal is to analyze the consequences of violating these conditions. The first is violated by nonconforming elements, and we show in the next section that in this case convergence may or may not occur. It is by no means automatic (or even probable) that the discrete problem is consistent with the continuous one. Instead, there is a patch test to be applied to the trial functions, which determines whether or not they consistently reproduce states of constant strain. If so, convergence takes place within each element.

Essential boundary conditions are fulfilled, at least along an approximate boundary, by the isoparametric or subparametric elements. However, there are many circumstances in which these elements are not available, either because they are too complicated for the program at hand or because the problem itself is too complicated (a full fourth-order system of shell equations, for example). In such cases the rule may be partly satisfied, in the
following way: The essential boundary conditions may be imposed at the boundary nodes. Between the nodes, a polynomial trial function is incapable of following a general boundary and thus the essential conditions are only interpolated. We show in Section 4.4 that convergence still occurs.
Condition 3 is the most violated of all, since it is inconvenient to compute $I(v)$ exactly and very easy to compute it approximately. The approximations enter in two ways: The integrals over each element are computed by numerical integration, and the domain of integration $\Omega$ is itself altered to become a union of simple element shapes. In both cases $I(v)$ is replaced by a new functional $I_*(v)$. Therefore, the mathematical problem is to determine the dependence of the minimizing function $u^h$ on the functional itself: If the convex paraboloid $I(v)$ is deformed, how far does the minimum move?
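The question can be made concrete in the discrete setting, where $I(q) = q^T K q - 2 q^T F$ is minimized by the solution of $Kq = F$. The sketch below uses a hypothetical $2 \times 2$ stiffness matrix, load vector, and perturbation chosen only for illustration; it treats a small deformation, whereas the analysis in this chapter must also cope with changes in $K$ that are not small.

```python
import numpy as np

def functional(K, F, q):
    """Discrete functional I(q) = q^T K q - 2 q^T F."""
    return q @ K @ q - 2.0 * q @ F

# Hypothetical symmetric positive definite stiffness matrix and load vector.
K = np.array([[4.0, -1.0], [-1.0, 3.0]])
F = np.array([1.0, 2.0])
q = np.linalg.solve(K, F)            # minimizer of the exact paraboloid

# Deform the paraboloid slightly (illustrative perturbations of size eps).
eps = 1e-3
K_star = K + eps * np.array([[1.0, 0.5], [0.5, -1.0]])
F_star = F + eps * np.array([-1.0, 1.0])
q_star = np.linalg.solve(K_star, F_star)   # minimizer of the deformed paraboloid

shift = np.linalg.norm(q_star - q)   # how far the minimum moved
trial = q + np.array([0.1, -0.2])    # any other point gives a larger value of I
print(shift, functional(K, F, q), functional(K, F, trial))
```

In this small example the minimum moves by an amount comparable to the size of the deformation.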
Each of these three possible violations requires a careful analysis, if we are to justify the finite element method as it is actually used. Before we start, it may be useful to call attention specifically to two ideas which appear over and over throughout this chapter; they are common to the analysis of all three problems. The first idea is to consider always the vanishing of the first variation (in physical terms, the equation of virtual work). Even when the rules are violated, the minimizing function $u_*^h$ still satisfies

$$a_*(u_*^h, v^h) = (f, v^h)_* \quad \text{for all } v^h \text{ in } S^h.$$

In the Ritz case this is compared with

$$a(u, v) = (f, v) \quad \text{for all } v \text{ in } \mathcal{H}_E^m.$$

If $S^h \subset \mathcal{H}_E^m$, and if $I = I_*$ so that the asterisk can be removed, then one equation can be subtracted from the other; as in Theorem 1.1,

$$a(u - u^h, v^h) = 0 \quad \text{for all } v^h \text{ in } S^h.$$

In our case this expression is not zero, and the whole problem is to show that the quantity which does appear is small.
:zy or may not occur. It is by no means The second idea applies specifically to finite elements; it consists in using
di serete pro blem is consistent with the the special properties of polynomials. We have noted already how the patclf
eh test to be applied to the trial func- test applies to polynomial solutions. The situation is similar in numerical
;t they consistently reproduce states of integration, where accuracy depends on the degree of the polynomials which
es place within each element. are integrated exactly. And we note one property which will be useful in
: fulfilled, at Ieast along an approxi- analyzing changes of domain: Polynomials cannot vary tremendously in the
or subparametric elements. However, region between the given domain Q and its approximation Qh.
eh these elements are not available, We emphasize that the analysis does not depend on estimating the pertur-
d for.the program at hand or because bation in the stitfness matrix, from K to K or K*. This change may be very
-a full fourth-order system of shell Iarge-the equations may seem completely altered-while still the piecewise
.he rule may be partly satisfied, in the polynorttials of degree m are properly dealt with. In abstract terms, these
2
\
J
polynomials become dense in the admissible space as $h \to 0$, and their behavior is therefore decisive.

4.2. NONCONFORMING ELEMENTS AND THE PATCH TEST
A number of frequently used elements are nonconforming (their derivatives of order $m - 1$ have discontinuities at the element boundaries) and nevertheless they work quite well. Or rather, they sometimes work well and sometimes not. It is a risk that is taken most often in fourth-order problems, where the elements should lie in $C^1$; matching the slopes between elements can be difficult for the normal displacement $w$ of a plate under bending, and extremely difficult for shells. Therefore, the technique has been to compute the energies within each separate element in the usual way and then simply to add together the results. This has the effect of replacing the true functional $I(v)$ by a sum of element integrals

$$I_*(v) = \sum_e \left[ a_e(v, v) - 2(f, v)_e \right] = a_*(v, v) - 2(f, v)_*.$$

The difference between $I$ and $I_*$ is that the singularities on the element boundaries are ignored in $I_*$, whereas $I(v) = \infty$ for nonconforming elements.
The Ritz approximation $u_*^h$ is the (probably nonconforming) function in the trial space which minimizes $I_*(v^h)$. This property of $u_*^h$ is expressed as usual by the vanishing of the first variation of $I_*$:

$$a_*(u_*^h, v^h) = (f, v^h)_* \quad \text{for all } v^h \text{ in } S^h. \tag{1}$$

This is the approximate equation of virtual work. It is identical to the usual equation, except that again the integrals are computed an element at a time and then summed, ignoring discontinuities between elements.
Our goal is to find conditions under which this finite element approximation $u_*^h$, in spite of its illegitimate construction, converges to $u$. This question was almost completely obscure until Irons [B9] had a simple but brilliant idea, now known as the patch test. Suppose that an arbitrary patch of elements is in a state of constant strain: $u(x, y) = P_m(x, y)$, a polynomial of degree $m$. Then since this polynomial is present in $S^h$ (even on nonconforming elements the constant strain condition is imposed, that the degree $k - 1$ of the subspace must be at least $m$), a true Ritz solution $u^h$ would coincide identically with $P_m$. (At the boundary of the patch, the conditions imposed are chosen to be consistent with constant strain; e.g. $u^h = P_m$ is imposed on the displacements at the patch boundary.) Then the test is to see whether, in spite of shifting from $I$ to $I_*$ by ignoring the inter-element boundaries, the finite element solution $u_*^h$ is still identical with $P_m$. We may and shall assume that the problem has constant coefficients, since variations over an element have only an $O(h)$ effect on $u_*^h$.
There is a celebrated example in [B9] in which this test is passed for elements of shape (B), but is violated by about 1.5% for those of shape (A). (This is in the large-scale strain energy; the pointwise stress errors reached 25%.) The element in question is a cubic, with $v$, $v_x$, and $v_y$ as parameters at the vertices. The tenth unknown is eliminated by a constraint on the coefficients; the authors disapproved of equating the coefficients of $x^2 y$ and $x y^2$, because this constraint can break down on a right triangle: all nine nodal parameters of $xy(1 - x - y)$ are zero on the standard triangle, and the constraint is satisfied, but the polynomial is not identically zero. Therefore the authors chose a condition which would be invariant under rotation, and never singular.
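The breakdown on the standard triangle is easy to confirm by direct evaluation. The sketch below assumes the standard triangle has vertices $(0,0)$, $(1,0)$, $(0,1)$; the function names `p`, `p_x`, `p_y` are ours.

```python
def p(x, y):
    """The cubic x*y*(1 - x - y) = x*y - x**2*y - x*y**2."""
    return x * y * (1.0 - x - y)

def p_x(x, y):
    """Partial derivative of p with respect to x."""
    return y - 2.0 * x * y - y ** 2

def p_y(x, y):
    """Partial derivative of p with respect to y."""
    return x - x ** 2 - 2.0 * x * y

# Assumed vertices of the standard right triangle.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# The nine nodal parameters: v, v_x, v_y at each vertex.
nodal_parameters = [g(x, y) for (x, y) in vertices for g in (p, p_x, p_y)]

# The polynomial itself is not zero, e.g. at the centroid.
centroid_value = p(1.0 / 3.0, 1.0 / 3.0)
print(nodal_parameters, centroid_value)
```

The coefficients of $x^2 y$ and $x y^2$ in this cubic are both $-1$, so the equal-coefficient constraint holds while all nine parameters vanish, and the nine parameters together with that constraint fail to determine the cubic.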
[Figure: element shapes, panels (a) and (b)]
Fig. 4.1 Success and failure in the patch test.
With irregular triangles, the odds are strongly against success in the patch test. Nevertheless, an error of a few per cent may be acceptable, particularly when it is committed in the direction opposite to the true Ritz error. As explained in Section 1.10, the latter always comes from too stiff a structure, $a(u^h, u^h) \le a(u, u)$; the nonconforming solution $u_*^h$ is more relaxed, and the displacements as well as the strains are frequently overestimated.
One approach to the theory is to look directly at the discrete system $KQ = F$ as a finite difference equation, forgetting that it is improperly derived. If this equation is consistent and stable, then $u_*^h$ must converge to $u$. For a regular mesh, this is exactly the content of the patch test: It is a test for consistency.† Normally the nonconforming elements will be short only one derivative, so that $v^h$ lies in $C^{m-2}$, and consistency is ensured for the lower-order terms in the differential equation. Then the patch test in case (B) confirmed consistency for the highest-order difference quotient, while in case (A) the equation $KQ = F$ proved to be the analogue of a wrong differential equation, with a new leading term. This link between consistency and the patch test can be established by Fourier methods; we omit the details, because it is effectively limited to a regular mesh and requires a good deal of preparation.

†Remember that for difference equations, consistency is checked by looking at the first few terms of a Taylor series, in other words, by considering polynomials up to a certain degree. The patch test does exactly the same for finite elements.
The second approach to the nonconforming theory is variational and therefore more general. We assert that the variational meaning of success in the patch test is this: For each polynomial $P_m$ and each (nonconforming) basis function $\varphi_j$,

$$a_*(P_m, \varphi_j) = a(P_m, \varphi_j). \tag{2}$$

This equality holds if the patch test is passed, and vice versa. Notice that the right side is well defined; the main terms are $m$th derivatives of $P_m$, which are constants, multiplied by $m$th derivatives of the nonconforming functions $\varphi_j$, which are $\delta$-functions when the derivative is normal to the element boundaries. The boundary contributions are therefore finite; they are computable from the jumps in the normal derivative of order $m - 1$, as we shall show by example. These contributions need not be zero; the patch test may fail, and in this case convergence will not occur.
To prove the equivalence of (2) with the patch test, consider a patch outside of which $\varphi_j$ vanishes identically. We assume that the true solution is $u = P_m$, and note that the corresponding load $f$ (if any) will certainly have no $\delta$-functions on the element boundaries:

$$(f, \varphi_j)_* = (f, \varphi_j).$$

Introducing the two equations of virtual work, this is the same as

$$a_*(u_*^h, \varphi_j) = a(P_m, \varphi_j).$$

If the patch test is passed, so that the approximate solution is exactly $u_*^h = P_m$, then obviously (2) holds. Conversely, suppose that (2) holds for every $\varphi_j$. Then $a_*(u_*^h, \varphi_j) = a_*(P_m, \varphi_j)$, and necessarily $u_*^h = P_m$; the patch test is passed.
We must mention one technical difficulty in this argument: Ordinarily the first variation might not vanish in the direction of $\varphi_j$, which lies outside $\mathcal{H}^m$. At the smooth solution $u = P_m$, however, we are able to show (based on an integration by parts [S7]) that it does vanish: $a(u, \varphi_j) = (f, \varphi_j)$, and the argument given above is valid. In short, (2) is what the patch test tests.
Perhaps the simplest instance in which the patch test can be verified occurs with Wilson's rectangular elements [W5]. Beginning with the standard bilinear functions (on the square $-1 \le x, y \le 1$ there are the four basis functions $(1 \pm x)(1 \pm y)/4$), he adds two new ones. Within the square these are $\varphi = 1 - x^2$ and $\psi = 1 - y^2$, and outside they are defined to be zero; therefore, they are not continuous at the boundary. Since they vanish outside the element, the solution of the final linear system admits static condensation, and the effect of $\varphi$ and $\psi$ should be to permit an improved representation within each element. Because of the discontinuity, however, the patch test is required.
Suppose for simplicity that the differential equation is $-\Delta u = f$. The energy inner product is $a(u, v) = \iint u_x v_x + u_y v_y$, and if $P = a + bx + cy$ is an arbitrary polynomial of degree $m = 1$, then

$$a_*(P, \varphi) = \int_{-1}^{1} \int_{-1}^{1} (b\,\varphi_x + c\,\varphi_y)\, dx\, dy = \int_{-1}^{1} \int_{-1}^{1} (-2bx)\, dx\, dy = 0.$$

On the other hand, Green's formula yields

$$a(P, \varphi) = \iint (-\Delta P)\,\varphi + \oint \varphi\, \frac{\partial P}{\partial n}\, ds.$$

Choosing a domain large enough to contain the given square in its interior, $\varphi$ will vanish on the boundary and the line integral is zero. Since $-\Delta P = -P_{xx} - P_{yy} = 0$ for any linear polynomial $P$, we conclude that $a(P, \varphi) = 0 = a_*(P, \varphi)$ and the patch test is passed.
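The element-by-element integral for Wilson's two extra functions can also be evaluated numerically. The following sketch assumes the model problem $-\Delta u = f$ on the single square $[-1, 1]^2$ and uses a 3-point Gauss rule in each direction; the helper `a_star` and the sample coefficients `b`, `c` are illustrative choices of ours.

```python
import numpy as np

# 3-point Gauss-Legendre rule on [-1, 1]; exact for the low-degree integrands used here.
nodes_1d, weights_1d = np.polynomial.legendre.leggauss(3)

def a_star(b, c, grad):
    """Integral over the square of grad(P) . grad(phi), with P = a + b*x + c*y."""
    total = 0.0
    for x, wx in zip(nodes_1d, weights_1d):
        for y, wy in zip(nodes_1d, weights_1d):
            phi_x, phi_y = grad(x, y)
            total += wx * wy * (b * phi_x + c * phi_y)
    return total

def grad_phi(x, y):
    """Gradient of phi = 1 - x^2 inside the square."""
    return (-2.0 * x, 0.0)

def grad_psi(x, y):
    """Gradient of psi = 1 - y^2 inside the square."""
    return (0.0, -2.0 * y)

b, c = 0.7, -1.3                 # arbitrary coefficients of the linear polynomial P
value_phi = a_star(b, c, grad_phi)
value_psi = a_star(b, c, grad_psi)
print(value_phi, value_psi)
```

Both values vanish up to rounding for any choice of `b` and `c`, in agreement with the true value $a(P, \varphi) = 0$.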
The patch test is evidently a very simple rule: The true value of $a(P_m, \varphi_j)$ is zero, as we have just seen, and therefore the integral of every strain $D^m \varphi_j$ (computed in the nonconforming way, by ignoring element boundaries) is also required to be zero. This can be checked analytically; the patch test can be run without a computer. The size of these integrals, in case they don't vanish, decides the degree of inconsistency of the nonconforming equations.
There is another way to achieve the same result, using Green's theorem over a single square:

$$a_*(P, \varphi) = \iint_{\text{square}} (-\Delta P)\,\varphi\,dx\,dy + \oint \varphi\,\frac{\partial P}{\partial n}\,ds = \oint \varphi\,\frac{\partial P}{\partial n}\,ds.$$

Since $\varphi$ vanishes on the vertical sides $x = \pm 1$, there remains only an integral along the bottom and one back along the top. $\partial P/\partial n$ equals $-c$ in one case, and $+c$ in the other, so that

$$\oint \varphi\,\frac{\partial P}{\partial n}\,ds = \int_{-1}^{1} (1 - x^2)(-c)\,dx + \int_{1}^{-1} (1 - x^2)\,c\,(-dx) = 0.$$
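The two computations of the patch test for $\varphi = 1 - x^2$ on the square $|x|, |y| \le 1$ can be repeated in exact rational arithmetic. The following is only a minimal sketch: the function names `integrate_poly`, `area_form`, and `edge_form` are hypothetical helpers introduced for illustration, and the linear polynomial is taken as $P = a + bx + cy$ as in the text.

```python
from fractions import Fraction


def integrate_poly(coeffs, lo, hi):
    """Exact integral over [lo, hi] of sum(coeffs[k] * t**k)."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(
        Fraction(c) * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
        for k, c in enumerate(coeffs)
    )


def area_form(b, c):
    """Element integral of P_x*phi_x + P_y*phi_y over the square.

    With phi = 1 - x**2 the integrand is b*(-2x); phi_y is zero, so c
    drops out, and the y-integration contributes a factor 2.
    """
    return 2 * integrate_poly([0, -2 * Fraction(b)], -1, 1)


def edge_form(b, c):
    """Boundary integral of phi * dP/dn taken once around the square."""
    phi_horizontal = [1, 0, -1]  # phi = 1 - x**2 along y = -1 and y = +1
    phi_vertical = [0]           # phi = 0 along x = -1 and x = +1
    bottom = -Fraction(c) * integrate_poly(phi_horizontal, -1, 1)  # dP/dn = -c
    top = Fraction(c) * integrate_poly(phi_horizontal, -1, 1)      # dP/dn = +c
    left = -Fraction(b) * integrate_poly(phi_vertical, -1, 1)      # dP/dn = -b
    right = Fraction(b) * integrate_poly(phi_vertical, -1, 1)      # dP/dn = +b
    return bottom + top + left + right
```

Both functions return zero for every choice of `b` and `c`, which is the statement that the element passes the test on a square; the bottom and top contributions are each of size `4c/3` and cancel.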
Thus the element again passes the test. It does not pass the test for a quadrilateral of arbitrary shape, where it is constructed isoparametrically: $\varphi = 1 - \xi^2$, $\psi = 1 - \eta^2$, and $x$, $y$ are bilinear functions of $\xi$, $\eta$ mapping the standard square into the given quadrilateral. In fact the patch test can be so strongly violated, even for quadrilaterals of reasonable shape, that the element is useless. To pass the test with isoparametrics, Taylor altered the nonconforming $\varphi$ and $\psi$ and then also modified the numerical quadrature; two wrongs do make a right, in California.
There is a second nonconforming element which is equally simple and useful. It is composed of piecewise linear functions on triangles. Rather than placing nodes at the vertices, however, which produces continuity across each interelement edge, the nodes are placed instead at the midpoints of the edges. Interelement continuity is therefore lost (except at these midpoints) and we have a larger trial space, roughly three times the dimension of the standard Courant space, because this is the ratio of edges to vertices. This larger space has made it possible to impose the side condition $\operatorname{div} v^h = 0$ and still to retain enough degrees of freedom for approximation.
For this space to succeed, it must pass the patch test. We impose the test exactly as above, by computing $\oint \varphi\,\partial P/\partial n\,ds$ along every edge. In fact, it is the jump in $\varphi$ which we integrate, to compute the effect of going along an edge in one direction and then back along the other side. The jump in $\varphi$ is a linear function, since $\varphi$ is linear in each of the two triangles which meet at the edge. The jump is zero at the midpoint, where $\varphi$ is continuous. Finally, $\partial P/\partial n$ is a constant because $P$ is linear. Since the integral of a linear function is zero, given that the function vanishes at the midpoint of the range of integration, we conclude that each trial function passes the patch test. Notice that the mesh was not even required to be regular! Témam has established a Poincaré inequality $\|v^h\|_0^2 \le C\,a_*(v^h, v^h)$ for this element.
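The argument for the midpoint-node linear element uses only the fact that the jump across an edge is linear and vanishes at the midpoint. A small sketch in exact arithmetic, assuming two triangles ABC and ABD that share the edge AB; the helper names `vertex_values`, `jump_at_ends`, and `jump_integral` are hypothetical and introduced only for this illustration:

```python
from fractions import Fraction as F


def vertex_values(m_ab, m_bc, m_ca):
    """Vertex values (vA, vB, vC) of the linear function on triangle ABC
    that takes the given values at the midpoints of AB, BC and CA."""
    v_a = m_ab + m_ca - m_bc
    v_b = m_ab + m_bc - m_ca
    v_c = m_bc + m_ca - m_ab
    return v_a, v_b, v_c


def jump_at_ends(m_ab, left, right):
    """Jump of the trial function at A and at B across the shared edge AB.

    left  = (m_bc, m_ca): other two midpoint values in triangle ABC
    right = (m_bd, m_da): other two midpoint values in triangle ABD
    The value m_ab at the midpoint of AB is common to both triangles.
    """
    a_left, b_left, _ = vertex_values(m_ab, left[0], left[1])
    a_right, b_right, _ = vertex_values(m_ab, right[0], right[1])
    return a_left - a_right, b_left - b_right


def jump_integral(m_ab, left, right, length=1):
    """Integral of the (linear) jump along AB; the trapezoidal rule is exact."""
    jump_a, jump_b = jump_at_ends(m_ab, left, right)
    return F(length) * (jump_a + jump_b) / 2
```

The two endpoint jumps are equal and opposite whatever the midpoint values and whatever the edge length, so the integral vanishes even though the function itself is discontinuous; no regularity of the mesh enters the computation.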
We intend to prove that the finite element solutions based on Wilson's nonconforming element converge to $u$. The rate of convergence will be only minimal, $O(h^2)$ in energy, although this may not give a fair description of its accuracy for large $h$. (An essential feature of finite elements is their success on a coarse mesh; even some elements which fail the patch test and are nonconvergent give very satisfactory results for realistic $h$.) If the element also passes the test for polynomials of higher degree $P_n$, this rate of convergence in energy would be increased to $h^{2(n-m+1)}$; but it doesn't.
Our plan is to begin with a general error estimate, applicable to any nonconforming element, and thereby to isolate the quantity which is decisive in determining the error. This is the quantity $\Delta$, defined by

$$\Delta = \max_{v^h \in S^h} \frac{|a_*(u, v^h) - (f, v^h)|}{|v^h|_*}.$$

Then, for the particular element studied above, we shall estimate $\Delta$ and deduce the rate of convergence.

The error bound on which everything is based is the following:

$$|u - u_*^h|_* \le \Delta + \min_{w^h \in S^h} |u - w^h|_*. \tag{3}$$
This inequality reflects, as it must, both the order of approximation (the last term) and the effect of the nonconformity; $a = a_*$ and $\Delta = 0$ for a conforming element. To prove (3) we compare the equations of virtual work: assuming that $f$ is smooth,

$$a_*(u_*^h, v^h) = (f, v^h) \quad \text{for all } v^h \text{ in } S^h.$$

It follows that for all $v^h$ in $S^h$,

$$a_*(u - u_*^h, v^h) = a_*(u, v^h) - (f, v^h). \tag{4}$$

Here we get a valuable result right away: The left side, by the Schwarz inequality, is bounded by $|u - u_*^h|_*\,|v^h|_*$. Dividing by $|v^h|_*$ and then maximizing over $S^h$, we have (with thanks to R. Scott for his help) a lower bound for the error:

$$|u - u_*^h|_* \ge \Delta. \tag{5}$$

This shows that the estimate (3), once proved, will be extremely realistic, and it implies that on a regular mesh any convergent element must pass the patch test. Only if the test is passed will $\Delta \to 0$.
To complete the proof of the upper bound (3), we let $w$ be the closest element in $S^h$ to $u$. By the triangle inequality

$$|u - u_*^h|_* \le |u - w|_* + |w - u_*^h|_*.$$

Therefore (3) is proved once the last term is identified as equal to $\Delta$. This term is the maximum over all $v^h$ of the ratio $R = |a_*(w - u_*^h, v^h)| / |v^h|_*$. But because $w$ is the closest element in $S^h$ to $u$ (i.e., the projection of $u$ onto $S^h$), it satisfies $a_*(w, v^h) = a_*(u, v^h)$ for all $v^h$ in $S^h$. With this substitution in the numerator of $R$, followed by the identity (4), the maximum of $R$ is identical with $\Delta$. This establishes the upper bound (3).
to isolate the quantity which is decisive For Wilson's rectangular element, which passed the patch test, we shall
¡uantity 1\, defined by show that L\ O(h); this gives the correct rate of convergence of l u ui 1*.
To achieve this estímate, we write the trial functions as
tdied above, we shall esdmate L\ and The functions rp¡ and t¡/¡ are the new nonconforming basisfunctions, shifted
to líe on the ith element. Thus fPt = 1 - ((x x¡)/2h1) 2 , where Xc is the
1ing is based is the following: coordinate of the center of the ith square. The functions Wr make up the
standard basis for the conforming bilinear element; we ha ve called them
- min
w' in s•
!u- wh¡*.
"pagoda functions." Since they are conforming, they make no contribution
\
180 VARIAN..._ONAL CRIMES CHAP.4
SEC. 4.3.
to ~- In fact, over a typical square, the numerator of ~ amounts to for the error in eigenvalues, which cor
static problems. This agrees with our pr1
(a* - a)(u, ?A)= (a* - a)(u, a 1rp1 + b1tp¡) =(a* - a)(u -- Pp a1rp1 + b1tp 1).
~ . formity which is responsible; the elem
theory would have permitted O(h 2<k-m>)
Here we use the patch test, or rather the equivalent identity (2); the inclusion of the linear polynomial $P_1$ has no effect. With a little patience, we can estimate this contribution to $\Delta$ from the $i$th square by

$$C'h\,\|u\|_{2,e_i}\,\|a_i\varphi_i + b_i\psi_i\|_{1,e_i}.$$

Here we have chosen $P_1$, as approximation theory allows, to be within $O(h)$ of $u$ in the first derivative (recall that $m = 1$ in our example). Now we sum the contributions from all the squares, and use the Schwarz inequality:

$$|a_*(u, v^h) - a(u, v^h)| \le C'h \sum_i \|u\|_{2,e_i}\,\|a_i\varphi_i + b_i\psi_i\|_{1,e_i} \le C''h \Big(\sum_i \|u\|_{2,e_i}^2\Big)^{1/2} \Big(\sum_i \|v^h\|_{1,e_i}^2\Big)^{1/2} = C''h\,\|u\|_2\,|v^h|_*.$$

Dividing by $|v^h|_*$, we reach the right inequality:

$$\Delta \le C''h\,\|u\|_2.$$
It now follows from the basic estimate (3) that the Wilson nonconforming elements give an error in the strain energy norm of $|u - u_*^h|_* = O(h)$. We are confident that this is the correct rate of convergence, although we must emphasize two points: (1) It is obvious experimentally that the constant which multiplies $h$ is very much smaller than it would be without the extra nonconforming trial functions, and (2) the strain energy in $u_*^h$ is no longer necessarily smaller than that in $u$. In fact, convergence from above is the rule rather than the exception, both for strain energy and for displacements.
The convergence proof after a successful patch test is discussed further in the Index of Notations. Irons and Razzaque have described in the Baltimore Symposium volume [6] a number of other elements which pass the test, including

1. the 12-degree-of-freedom rectangular element due to Ari Adini, Clough, and Melosh. It has $v$, $v_x$, $v_y$ at the corners as nodal parameters, and the complete cubic plus $x^3y$ and $xy^3$ as shape function.
2. a constant curvature bending element proposed by Morley, which can be viewed as conforming for the complementary energy principle.
3. a new element of Allman-Pian type.
Extensive computations with element (1) on eigenvalue problems for plates (the element is continuous, but nonconforming for fourth order equations) were carried out in [L3]. They report a definite convergence rate of $h^2$ for the error in eigenvalues, which corresponds to the error in energy for static problems. This agrees with our prediction. Note that it is the nonconformity which is responsible; the element has $k = 4$, and approximation theory would have permitted $O(h^{2(k-m)}) = O(h^4)$.
It appears that element 3 passes the test even on an irregular mesh, which is rather remarkable. In fact, the vanishing of boundary integrals which the test requires would seem in general to be a rather lucky chance even on a regular mesh, and one must expect that not only nonconforming but nonconvergent elements will continue to achieve engineering accuracy in the future.
The list also contained a number of numerically integrated elements, but we prefer not to regard numerical integration as producing nonconforming elements. The effect of this integration is indeed to change the true functional $I(v)$ to a new one, but the difference between the two does not consist of boundary integrals. Therefore this effect, and the error it introduces into the finite element approximation $u^h$, will be analyzed separately.
4.3. NUMERICAL INTEGRATION
Numerical integration has become an increasingly important part of the finite element technique. In the early stages, one of the crucial advantages of finite elements appeared to lie in precisely the opposite direction, that the integration of polynomials over triangles and rectangles could be based on exact formulas. At the present stage, it might seem that the unique simplicity of polynomials is no longer essential, and that rational functions, or still more general shape functions, are equally convenient. In fact, nothing could be further from the truth: The key to the success of numerical integration in the finite element method is the presence of polynomials.
tzzaque have described in the Baltimore The ftmdamental question is this : What degree of accuracy in the integra-
other elements which pass the test, in- tion formula is required for convergence? It is not required that every poly-
nomial which appears be integrated exactly. The integrand in the energy
a(v, v) involves the squares of polynomials, and a formula which is exact to
tangular element due to Ari, Adini, this degree may simply cost too much. It is very important to control pro-
lt the corners as nodal parameters, and perly the fraction of computer time which is spent on numerical integration.
Mathematically, we are again faced with a change in the functional $I(v)$; this is the effect of numerical quadrature. Suppose the true functional is

$$I(v) = a(v, v) - 2b(f, v) = \iint \big[p(x, y)(v_x^2 + v_y^2) - 2fv\big]\,dx\,dy.$$

[Notice the new notation $b(f, v)$ for the linear term.] Then typically

$$I_*(v) = a_*(v, v) - 2b_*(f, v) = \sum_i w_i \big[p(\xi_i)\,(v_x^2 + v_y^2)(\xi_i) - 2f(\xi_i)\,v(\xi_i)\big].$$
Each element region contributes a certain number of evaluation points $\xi_i = (x_i, y_i)$, with weights $w_i$ which depend on the size and shape of the element and on the rule adopted for numerical integration. The rule is called exact to degree $q$, if the integral of every polynomial $P_q$ is given correctly by $\sum w_i P_q(\xi_i)$.
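The notion of a rule that is exact to degree $q$ can be made concrete in one dimension. The sketch below is an illustration only, not one of the rules used in the text: it takes the familiar two-point Gauss rule and the midpoint rule on an interval and finds the largest degree for which each integrates every monomial correctly. The helper names `gauss2`, `midpoint`, and `exactness_degree` are hypothetical.

```python
import math


def gauss2(f, a, b):
    """Two-point Gauss rule on [a, b]."""
    mid, half = (a + b) / 2, (b - a) / 2
    t = 1 / math.sqrt(3)
    return half * (f(mid - half * t) + f(mid + half * t))


def midpoint(f, a, b):
    """One-point (midpoint) rule on [a, b]."""
    return (b - a) * f((a + b) / 2)


def exactness_degree(rule, a=0.0, b=1.0, max_degree=10, tol=1e-12):
    """Largest q such that the rule integrates 1, x, ..., x**q correctly
    (within tol) on [a, b]; returns -1 if even constants fail."""
    q = -1
    for d in range(max_degree + 1):
        exact = (b ** (d + 1) - a ** (d + 1)) / (d + 1)
        if abs(rule(lambda x: x ** d, a, b) - exact) > tol:
            break
        q = d
    return q
```

With these definitions the two-point rule comes out exact to degree 3 and the midpoint rule to degree 1; since every polynomial of degree $q$ is a combination of the monomials tested, checking monomials is enough.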
Suppose that we minimize $I_*$ over all trial functions $v^h$. Then the minimizing function $\tilde u^h = \sum \tilde Q_j \varphi_j$ is determined by an approximate finite element system $\tilde K \tilde Q = \tilde F$, in which the stiffness matrix and load vector are computed by numerical rather than exact integrations. This is the system which (apart from roundoff error) the computer actually solves. Our goal is to estimate the difference $u^h - \tilde u^h$, and we repeat the main point: It is not necessary that the energies $I$ and $I_*$ be close in order for $u^h - \tilde u^h$ to be small.
We shall summarize the main theorem before giving examples and proof. The essential condition on the quadrature formula is identical with the patch test for nonconforming elements: $\tilde u^h$ converges to $u^h$ in strain energy ($\|u^h - \tilde u^h\|_m \to 0$) if and only if for all polynomials of degree $m$ and all trial functions,

$$|a_*(P_m, v^h) - a(P_m, v^h)| = O(h)\,\|v^h\|_m. \tag{6}$$
There is also a supplementary condition of positive definiteness, requiring that the approximate strain energy $a_*$ should be elliptic on the subspaces $S^h$: $a_*(v^h, v^h) \ge \theta\,\|v^h\|_m^2$. The term $O(h)$ enters the condition only because, if the material properties vary within an element, this will not be treated exactly by the quadrature. This is a second-order effect.
The test (6) actually applies only to the leading terms in the strain energy, involving $m$th derivatives. Since such derivatives of $P_m$ are constants, the inner product $a(P_m, v^h)$ involves only integrals of $m$th derivatives of $v^h$, and the convergence condition reduces to this: the $m$th derivatives of every trial function should be integrated exactly. If the trial functions are polynomials of degree $k - 1$, this means that the quadrature formula should be correct at least through degree $k - m - 1$. Rectangular elements are much more demanding; there are terms in the trial functions which add nothing to the degree of approximation, but whose derivatives must nevertheless be integrated correctly. With bilinear elements in second-order problems, for example, the twist term $xy$ has linear derivatives; therefore the quadrature formula must be correct for these terms, and not only for the constant strains which come from the linear terms $a + bx + cy$.
In practice the test (6) will often hold for all polynomials $P_n$ of some higher degree $n > m$. In this case the accuracy is more than just minimal: the error in the strains is of order $h^{n-m+1}$. Every additional degree of accuracy in the quadrature scheme contributes an additional power of $h$ to the error estimate. In other words, if the trial functions are of degree $k - 1$, and the quadrature is exact to degree $q$, the error is of order $h^{q-k+m+2}$. If there are any terms of degree higher than $k - 1$ in the trial functions, as there always are for rectangular elements, this will injure the order of the error. In all cases the correct test is expressed by $a_*(P_n, v^h) = a(P_n, v^h)$; it is complete polynomials of degree $n - m$, multiplied by trial strains $D^m v^h$, which have to be integrated correctly.†
No attempt will be made to devise new quadrature formulas, but it is astonishing that even on triangles and rectangles this classical problem has never been completely solved. Irons [I5] has shown what can still be done in this direction, by achieving a given accuracy $q$ with far fewer points than the standard product-Gauss rules. The Russian school of Sobolev, Liusternik, and others has made a profound study of "cubature formulas" over regular regions, and has discovered some remarkable formulas: the reader may be interested in a 14-point formula on a cube of edge $\sqrt{2}$ which is accurate to degree $q = 5$:

$$2\sqrt{2}\left(\frac{121}{361}\cdot\frac{1}{8}\sum_{i=1}^{8} f(\xi_i) + \frac{40}{361}\sum_{j=1}^{6} f(\eta_j)\right),$$

where the $\xi_i$ are the vertices of a similarly placed cube of edge $\sqrt{38/33}$ and the $\eta_j$ are the vertices of a regular octahedron touching a circumscribed sphere of radius $\sqrt{19/60}$! We repeat, however, that numerical integration for finite elements must take account of the higher-degree terms which often occur in trial functions on rectangles, even though they may contribute nothing to approximation theory. On triangles the element is normally a complete polynomial of degree $k - 1$, or close to it, and then it is purely a question of correctly integrating complete polynomials of as high a degree as possible. We owe Table 4.1 on the following page to Cowper [C13], who has added several new quadrature rules over triangles to those reproduced in Zienkiewicz [22]. The formulas are symmetric in the area coordinates, so that if a sampling point $\xi = (\zeta_1, \zeta_2, \zeta_3)$ occurs, so do all its permutations. If the $\zeta_i$ are distinct, there are six sampling points; if two $\zeta_i$ coincide, there are three; the point $(\tfrac13, \tfrac13, \tfrac13)$ at the centroid, if used in the formula, is taken only once.
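The 14-point cubature formula can be checked numerically. The sketch below assumes the reading of the formula used here: a cube of edge $\sqrt{2}$ centered at the origin, weight $121/(361 \cdot 8)$ of the volume at each vertex of an inner cube of edge $\sqrt{38/33}$, and weight $40/361$ of the volume at each of the six octahedron vertices at distance $\sqrt{19/60}$ along the axes. The function names are hypothetical helpers.

```python
import itertools
import math

EDGE = math.sqrt(2.0)  # edge of the integration cube, centered at the origin


def cubature14(f):
    """14-point rule: 8 inner-cube vertices plus 6 octahedron vertices."""
    c = 0.5 * math.sqrt(38.0 / 33.0)  # half edge of the inner cube
    r = math.sqrt(19.0 / 60.0)        # distance of the octahedron vertices
    cube_pts = list(itertools.product((-c, c), repeat=3))
    octa_pts = []
    for axis in range(3):
        for s in (-r, r):
            p = [0.0, 0.0, 0.0]
            p[axis] = s
            octa_pts.append(tuple(p))
    volume = EDGE ** 3
    return volume * (
        121.0 / (361.0 * 8.0) * sum(f(*p) for p in cube_pts)
        + 40.0 / 361.0 * sum(f(*p) for p in octa_pts)
    )


def monomial(i, j, k):
    """The function x**i * y**j * z**k."""
    return lambda x, y, z: x ** i * y ** j * z ** k


def exact_monomial(i, j, k):
    """Exact integral of x**i * y**j * z**k over the cube of edge EDGE."""
    a = EDGE / 2

    def one_dim(n):
        return 0.0 if n % 2 else 2.0 * a ** (n + 1) / (n + 1)

    return one_dim(i) * one_dim(j) * one_dim(k)


def max_error(degree):
    """Largest error of the rule over all monomials of exactly this degree."""
    return max(
        abs(cubature14(monomial(i, j, degree - i - j))
            - exact_monomial(i, j, degree - i - j))
        for i in range(degree + 1)
        for j in range(degree + 1 - i)
    )
```

Under these assumptions `max_error(d)` is at roundoff level for every `d` up to 5 and clearly nonzero for `d = 6`, in agreement with the stated accuracy $q = 5$.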
The isoparametric method could not exist without numerical integration, since the integrand is a rational function of the new coordinates $\xi$ and $\eta$. At first it looks impossible that even numerical integration should succeed,
bx cy. flt is interesting to compute the quadrature accuracy required ín order that u" - iih
:i for all polynomials Pn of sorne higher 'will be of the same order hk-m in the strains as the basic approximation error u u11 • lf this
'f is more than just minimal: the error exponent is to agree wi th n m + 1, then n = k 1 ; the mth derivatives of all polynomi-
additional degree of accuracy in the als of degree k - 1, multiplied by the mth derivatives of all trial polynomials vh, must be
tional power of h to the error estima te. integrated exactly. If the vh themselves comprise al 1polynomials of degree k - 1, as is often
true on triangles, and the material coefficients in the strain energy are constant, the leading
~ of degree k 1, and the quadrature terms in this energy-the squares o! mth derivatives-must be computed exactly to maintain
der q - k + + m 2. If there are any the full accuracy!
Table 4.1

        w_i                  ζ_1                  ζ_2                  ζ_3            Multiplicity

3-point formula, degree of precision 2
 0.33333 33333 33333  0.66666 66666 66667  0.16666 66666 66667  0.16666 66666 66667   3

3-point formula, degree of precision 2
 0.33333 33333 33333  0.50000 00000 00000  0.50000 00000 00000  0.00000 00000 00000   3

4-point formula, degree of precision 3
-0.56250 00000 00000  0.33333 33333 33333  0.33333 33333 33333  0.33333 33333 33333   1
 0.52083 33333 33333  0.60000 00000 00000  0.20000 00000 00000  0.20000 00000 00000   3

6-point formula, degree of precision 3
 0.16666 66666 66667  0.65902 76223 74092  0.23193 33685 53031  0.10903 90090 72877   6

6-point formula, degree of precision 4
 0.10995 17436 55322  0.81684 75729 80459  0.09157 62135 09771  0.09157 62135 09771   3
 0.22338 15896 78011  0.10810 30181 68070  0.44594 84909 15965  0.44594 84909 15965   3

7-point formula, degree of precision 4
 0.37500 00000 00000  0.33333 33333 33333  0.33333 33333 33333  0.33333 33333 33333   1
 0.10416 66666 66667  0.73671 24989 68435  0.23793 23664 72434  0.02535 51345 51932   6

7-point formula, degree of precision 5
 0.22500 00000 00000  0.33333 33333 33333  0.33333 33333 33333  0.33333 33333 33333   1
 0.12593 91805 44827  0.79742 69853 53087  0.10128 65073 23456  0.10128 65073 23456   3
 0.13239 41527 88506  0.47014 20641 05115  0.47014 20641 05115  0.05971 58717 89770   3

9-point formula, degree of precision 5
 0.20595 05047 60887  0.12494 95032 33232  0.43752 52483 83384  0.43752 52483 83384   3
 0.06369 14142 86223  0.79711 26518 60071  0.16540 99273 89841  0.03747 74207 50088   6

12-point formula, degree of precision 6
 0.05084 49063 70207  0.87382 19710 16996  0.06308 90144 91502  0.06308 90144 91502   3
 0.11678 62757 26379  0.50142 65096 58179  0.24928 67451 70910  0.24928 67451 70911   3
 0.08285 10756 18374  0.63650 24991 21399  0.31035 24510 33785  0.05314 50498 44816   6

13-point formula, degree of precision 7
-0.14957 00444 67670  0.33333 33333 33333  0.33333 33333 33333  0.33333 33333 33333   1
 0.17561 52574 33204  0.47930 80678 41923  0.26034 59660 79038  0.26034 59660 79038   3
 0.05334 72356 08839  0.86973 97941 95568  0.06513 01029 02216  0.06513 01029 02216   3
 0.07711 37608 90257  0.63844 41885 69809  0.31286 54960 04875  0.04869 03154 25316   6
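The degree of precision of a symmetric triangle rule can be confirmed from the exact integrals of monomials in the area coordinates. The sketch below assumes that the weights in Table 4.1 are normalized to sum to one (so each rule returns the integral divided by the area of the triangle), in which case the exact value for $\zeta_1^a \zeta_2^b \zeta_3^c$ is $2\,a!\,b!\,c!/(a+b+c+2)!$. Three rules are transcribed from the table; the names `expand`, `apply_rule`, and `precision` are hypothetical.

```python
import itertools
import math

# Each entry is (weight, (zeta1, zeta2, zeta3)); a point stands for all of
# its distinct permutations, as described for Table 4.1.
MIDPOINT_3 = [(0.333333333333333, (0.5, 0.5, 0.0))]
FOUR_POINT = [
    (-0.5625, (1 / 3, 1 / 3, 1 / 3)),
    (0.520833333333333, (0.6, 0.2, 0.2)),
]
SEVEN_POINT = [
    (0.225, (1 / 3, 1 / 3, 1 / 3)),
    (0.125939180544827, (0.797426985353087, 0.101286507323456, 0.101286507323456)),
    (0.132394152788506, (0.470142064105115, 0.470142064105115, 0.059715871789770)),
]


def expand(rule):
    """List every (weight, point) pair, one per distinct permutation."""
    return [(w, p) for w, z in rule for p in set(itertools.permutations(z))]


def apply_rule(rule, a, b, c):
    """Quadrature value for the monomial zeta1**a * zeta2**b * zeta3**c."""
    return sum(w * p[0] ** a * p[1] ** b * p[2] ** c for w, p in expand(rule))


def exact(a, b, c):
    """Exact mean value of the monomial over the triangle (area normalized to 1)."""
    f = math.factorial
    return 2.0 * f(a) * f(b) * f(c) / f(a + b + c + 2)


def precision(rule, max_degree=8, tol=1e-12):
    """Largest degree d such that every monomial of degree <= d is exact."""
    for d in range(max_degree + 1):
        for a in range(d + 1):
            for b in range(d + 1 - a):
                c = d - a - b
                if abs(apply_rule(rule, a, b, c) - exact(a, b, c)) > tol:
                    return d - 1
    return max_degree
```

Run on the three transcribed rules, `precision` reproduces the degrees listed in the table (2, 3, and 5); the tolerance allows for the fifteen-digit rounding of the tabulated entries.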
since it is never exact for rational functions. The entries $a_*(\varphi_j, \varphi_k)$ of $\tilde K$ will be completely different from the entries $K_{jk} = a(\varphi_j, \varphi_k)$, and no perturbation argument is possible. Nevertheless, we compute $a(P_m, v^h) - a_*(P_m, v^h)$ and apply the test; the crucial point is that the test involves only one trial function (not both $\varphi_j$ and $\varphi_k$ at the same time) and this will save us.
A typical transformation was given in Section 3.3, with $m = 1$; $\partial P_m/\partial x$ is a constant $c$, and

$$\iint p(x, y)\,\frac{\partial P_m}{\partial x}\,\frac{\partial v^h}{\partial x}\,dx\,dy = c \iint p\,(v_\xi\,\xi_x + v_\eta\,\eta_x)\,J\,d\xi\,d\eta.$$

It is crucial to observe the form of the transformation matrix:

$$\begin{pmatrix} \xi_x & \xi_y \\ \eta_x & \eta_y \end{pmatrix} = \begin{pmatrix} x_\xi & x_\eta \\ y_\xi & y_\eta \end{pmatrix}^{-1} = \frac{1}{J}\begin{pmatrix} y_\eta & -x_\eta \\ -y_\xi & x_\xi \end{pmatrix}.$$

Evidently $\xi_x = y_\eta/J$ and $\eta_x = -y_\xi/J$. With this substitution, the integral becomes

$$c \iint p(\xi, \eta)\,(v_\xi\,y_\eta - v_\eta\,y_\xi)\,d\xi\,d\eta. \tag{7}$$
The rational functions have disappeared, and convergence will occur if this integral is computed correctly. (For fourth order equations the rational functions do not disappear, and we cannot justify numerical integration unless the coordinate changes satisfy the smoothness conditions $\|F\|_k \le C$ described on p. 163. In that case, the elements are only slightly distorted, and $J$ is apparently no worse than the variable coefficient $p$; the integration error is of the same order as for constant coefficients without isoparametrics.)
In the case of bilinear functions on quadrilaterals, all the derivatives $v^h_\xi$, $y_\eta$, ... are linear, and it would appear that the quadrature must be correct for quadratic polynomials. It just happens, however, that the second-degree terms cancel in the particular combination $K = v^h_\xi y_\eta - v^h_\eta y_\xi$, so that exactness to first degree is actually sufficient for convergence. In practice the quadrature will be more accurate than that (it will probably have to be, just to achieve positive definiteness of $a^*$) and the benefits come in establishing more than a minimal rate of convergence. Suppose the conditions of Section 3.3 are satisfied: the Jacobian stays away from zero, and the coordinate transformations $x(\xi, \eta)$ and $y(\xi, \eta)$ have bounded coefficients. Then just as in the x-y plane, each additional degree of polynomials in $\xi$, $\eta$ contributes one more order of approximation. Therefore we expect the integration error to improve at the same rate; since first-order quadrature is sufficient for convergence in the bilinear case, the error in strains should be $O(h^q)$ if the quadrature is exact to degree $q$.
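The cancellation of the second-degree terms in $K$ can be checked by hand, or with a few lines of Python. The sketch below is only an illustration: the coefficient tuples `V` and `Y` are arbitrary values chosen for this example, not data from the text. It uses the fact that for a polynomial in the span of $1, \xi, \eta, \xi\eta$ the mixed difference over the unit square equals the coefficient of $\xi\eta$.

```python
# Bilinear functions f = a + b*xi + c*eta + d*xi*eta, stored as (a, b, c, d).
# The numbers are illustrative assumptions, not taken from the text.
V = (1.0, 2.0, 3.0, 4.0)    # plays the role of the shape function v^h
Y = (0.5, -1.0, 2.0, 3.0)   # plays the role of the coordinate map y


def d_xi(f, xi, eta):
    """Derivative of the bilinear function f with respect to xi."""
    _, b, _, d = f
    return b + d * eta


def d_eta(f, xi, eta):
    """Derivative of the bilinear function f with respect to eta."""
    _, _, c, d = f
    return c + d * xi


def single_product(xi, eta):
    """One product v_xi * y_eta; it contains a xi*eta term."""
    return d_xi(V, xi, eta) * d_eta(Y, xi, eta)


def K(xi, eta):
    """The combination K = v_xi * y_eta - v_eta * y_xi."""
    return (d_xi(V, xi, eta) * d_eta(Y, xi, eta)
            - d_eta(V, xi, eta) * d_xi(Y, xi, eta))


def mixed_difference(g):
    """Coefficient of xi*eta for g in span{1, xi, eta, xi*eta}."""
    return g(1.0, 1.0) - g(1.0, 0.0) - g(0.0, 1.0) + g(0.0, 0.0)


print("xi*eta coefficient of v_xi*y_eta:", mixed_difference(single_product))
print("xi*eta coefficient of K:         ", mixed_difference(K))
```

Each product of two derivatives carries a $\xi\eta$ term with coefficient $dd'$, and the two contributions cancel in $K$, which is why $K$ is only of first degree.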
One final isoparametric remark: since the coordinate transformations $x(\xi, \eta)$ and $y(\xi, \eta)$ have the same form as the shape function $v^h(\xi, \eta)$, it follows that the combination $K$ has the same form as the Jacobian $J$. Therefore, our rule that $K$ must be correctly integrated coincides with the rule [22], based on Irons' intuition, that the volume of each element (the integral of $J$) must be correctly computed by the quadrature. In three dimensions this requires a higher order of exactness: $K$ and $J$ will involve products of three rather than two derivatives. For subparametric elements the two rules diverge; the stiffness matrix depends on $K$, the mass matrix and load vector on $J$. To repeat, these rules apply to highly distorted isoparametrics; for small distortions $J$ is smooth and can be discounted in the convergence test.

We begin now on the theory, which is entirely based on the following simple identity.
LEMMA 4.1

Suppose that $u^h$ and $\tilde u^h$ minimize the functionals $I(v^h)$ and $I^*(v^h)$, respectively, so that the equations of virtual work (the Euler equations $\delta I = \delta I^* = 0$) become

$a(u^h, v^h) = b(f, v^h)$ and $a^*(\tilde u^h, v^h) = b^*(f, v^h)$ for all $v^h$ in $S^h$.

Then

(8) $a^*(u^h - \tilde u^h, u^h - \tilde u^h) = (a^* - a)(u^h, u^h - \tilde u^h) - (b^* - b)(f, u^h - \tilde u^h).$

Proof. The left side of the identity is

$a^*(u^h, u^h - \tilde u^h) - a^*(\tilde u^h, u^h - \tilde u^h) = (a^* - a)(u^h, u^h - \tilde u^h) + a(u^h, u^h - \tilde u^h) - a^*(\tilde u^h, u^h - \tilde u^h).$

With $v^h = u^h - \tilde u^h$ in the equations of virtual work, the last two terms yield $(b - b^*)(f, u^h - \tilde u^h)$, and the proof is complete. There is a similar identity, with $a$, $b$, and $u^h$ replaced by $a^*$, $b^*$, and $\tilde u^h$, but it is not so useful. Notice also that the terms in $a$ and $b$ actually cancel in (8), but it is important to keep them and to work with the differences $a^* - a$ and $b^* - b$.

Our principal theorem follows immediately from the identity.

THEOREM 4.1

Suppose that the approximate strain energy is positive definite, $a^*(v^h, v^h) \ge \theta \|v^h\|_m^2$, and that

(9) $|(a^* - a)(u^h, v^h)| + |(b^* - b)(f, v^h)| \le C h^p \|v^h\|_m.$
Then the error in strains due to approximate integration is of order $h^p$:

(10) $\|u^h - \tilde u^h\|_m \le \dfrac{C}{\theta}\, h^p.$

Proof. The left side of the identity (8) is bounded below by $\theta \|u^h - \tilde u^h\|_m^2$, and the right side is bounded above by (9), with $v^h = u^h - \tilde u^h$. Cancelling the common factor, the result is (10).
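The identity (8) is purely algebraic, so it can be verified on the smallest possible discrete problem. The following sketch assumes a two-parameter trial space in which $a(u, v) = u^T A v$ and $b(f, v) = F^T v$; the matrices `A`, `A_STAR` and the vectors `F`, `F_STAR` are made-up numbers standing for exact and numerically integrated quantities, and are not taken from the text.

```python
def solve2(A, F):
    """Solve the 2 x 2 system A q = F by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(F[0] * A[1][1] - A[0][1] * F[1]) / det,
            (A[0][0] * F[1] - A[1][0] * F[0]) / det]


def form(A, u, v):
    """Bilinear form u^T A v."""
    return sum(u[i] * A[i][j] * v[j] for i in range(2) for j in range(2))


def dot(F, v):
    """Linear form F^T v."""
    return F[0] * v[0] + F[1] * v[1]


# Hypothetical data: exact and perturbed stiffness matrices and load vectors.
A = [[2.0, -1.0], [-1.0, 2.0]]
A_STAR = [[2.1, -0.9], [-0.9, 1.8]]
F = [1.0, 0.5]
F_STAR = [1.05, 0.45]

u = solve2(A, F)                  # u^h:       a(u^h, v)   = b(f, v)
u_t = solve2(A_STAR, F_STAR)      # tilde u^h: a*(u~^h, v) = b*(f, v)
e = [u[0] - u_t[0], u[1] - u_t[1]]

A_DIFF = [[A_STAR[i][j] - A[i][j] for j in range(2)] for i in range(2)]
F_DIFF = [F_STAR[i] - F[i] for i in range(2)]

lhs = form(A_STAR, e, e)                     # a*(u^h - u~^h, u^h - u~^h)
rhs = form(A_DIFF, u, e) - dot(F_DIFF, e)    # (a* - a)(u^h, e) - (b* - b)(f, e)
print(lhs, rhs)
```

The two printed numbers agree to rounding error, whatever perturbation is chosen, as long as both systems are solvable.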
The lemma and theorem are not limited to numerical quadrature. They apply also to a change in the coefficients of the original differential equation; in other words, they describe the manner in which the solution, to the continuous as well as the discrete problem, depends on the coefficients and on the inhomogeneous term. Consider a one-dimensional problem $-(pu')' + qu = f$, and suppose that $p$, $q$, and $f$ are changed to $\tilde p$, $\tilde q$, and $\tilde f$. Then the identity, applied on the whole space $\mathcal{H}^1$ instead of the subspace $S^h$, becomes

(11) $\int \tilde p\,(u' - \tilde u')^2 + \tilde q\,(u - \tilde u)^2 = \int (\tilde p - p)\,u'(u' - \tilde u') + (\tilde q - q)\,u(u - \tilde u) - (\tilde f - f)(u - \tilde u).$

The left side, which is $a^*$, will be positive definite if $\tilde p > 0$ and $\tilde q > 0$. On the right side, each term is less than $\|u - \tilde u\|_1$ times a constant multiple of the perturbation. This yields a simple bound, not the most precise one possible, on the resulting perturbation in the solution.

COROLLARY

Suppose that the coefficients and inhomogeneous term are perturbed by less than $\epsilon$:

$\max_x\,(|p - \tilde p|,\ |q - \tilde q|,\ |f - \tilde f|) < \epsilon.$

Then the solution is also perturbed by $O(\epsilon)$:

(12) $\|u - \tilde u\|_1 \le C\epsilon.$
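A problem that can be solved in closed form shows the $O(\epsilon)$ dependence concretely. The sketch below is an assumed special case, not an example from the text: $q = 0$, $f = 1$, a constant coefficient $p$, and $u(0) = u(1) = 0$, so that $u = x(1 - x)/2p$. The function name and the use of Simpson's rule are choices made for this illustration.

```python
import math


def h1_norm_of_perturbation(p, p_tilde, n=2000):
    """H^1 norm of u - u~ for -(p u')' = 1 on (0, 1), u(0) = u(1) = 0.

    With constant p the solution is u = x(1 - x) / (2p), so
    u - u~ = c * x(1 - x) with c = 1/(2p) - 1/(2 p~).
    The norm is evaluated by composite Simpson quadrature (n even).
    """
    c = 1.0 / (2.0 * p) - 1.0 / (2.0 * p_tilde)

    def integrand(x):
        value = c * x * (1.0 - x)
        slope = c * (1.0 - 2.0 * x)
        return value * value + slope * slope

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return math.sqrt(total * h / 3.0)


# Perturb p = 1 by epsilon and compare the change in the solution with epsilon.
ratios = []
for eps in (1e-1, 1e-2, 1e-3):
    ratio = h1_norm_of_perturbation(1.0, 1.0 + eps) / eps
    ratios.append(ratio)
    print(f"eps = {eps:7.0e}   ||u - u~||_1 / eps = {ratio:.5f}")
```

The ratio stays bounded as $\epsilon \to 0$ (it tends to $\sqrt{11/30}/2$ in this special case), which is the content of (12).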
In terms of finite elements this has the following interpretation. Suppose that $p$, $q$, and $f$ are replaced by their interpolates in the finite element space. This is a perturbation of order $h^k$. If the resulting finite element problem is solved exactly (there will be products of three polynomials in the element integrals), then by the corollary $\|u^h - \tilde u^h\|_1 = O(h^k)$. Thus interpolation is a possible alternative to numerical integration, and it has had the lion's share of attention in the numerical analysis literature; Douglas and Dupont [D8] have successfully explored even nonlinear parabolic problems. In engineering calculations, however, direct numerical integration has consistently been preferred; with isoparametric or shell elements, there is effectively no choice. A rough operation count suggests that also in other problems direct quadrature is the more efficient, and we therefore intend to concentrate on this technique.
The simplest example is, as usual, the most illuminating. Therefore, we begin with the equation $-u'' = f$, and apply numerical integration to $I(v) = \int (v')^2 - 2fv$. Consider first the requirement of positive definiteness:

(13) $\sum_i w_i\,(v^h_x(\xi_i))^2 \ge \theta \int (v^h_x)^2\,dx.$

The interval is divided into elements, in other words into subintervals, and a standard quadrature rule is applied over each element. If $v^h$ is a polynomial of degree $k - 1$, and the quadrature weights $w_i$ are positive, then the definiteness requirement amounts to this: There must be at least $k - 1$ integration points $\xi_i$ in each subinterval. Otherwise, there will exist in each element a nonzero polynomial of degree $k - 2$ which vanishes at every $\xi_i$. Joining these polynomials and integrating once, we have constructed a trial function $v^h$ which contradicts (13):

$\int (v^h_x)^2\,dx > 0$ but $v^h_x(\xi_i) = 0.$

If there are negative weights, then even more integration points will be needed for definiteness. We do not expect such formulas to be popular; they are also vulnerable to roundoff errors.
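The construction in this argument is easy to reproduce for quadratic elements ($k = 3$), where at least two points per element are required. The sketch below is an illustration under assumptions of our own: a uniform mesh on $(0, 1)$, the rules `MIDPOINT` and `GAUSS2` given on the reference interval $[-1, 1]$, and the "bubble" $v^h = ((x - c)^2 - h^2/4)/2$ in each element, whose derivative $x - c$ vanishes at the midpoint $c$.

```python
import math

# Quadrature rules on the reference interval [-1, 1], as (point, weight) pairs.
MIDPOINT = [(0.0, 2.0)]
GAUSS2 = [(-1.0 / math.sqrt(3.0), 1.0), (1.0 / math.sqrt(3.0), 1.0)]


def numerical_energy(dv, a, b, rule):
    """Apply the rule to (v')^2 on the element [a, b]; dv is the derivative v'."""
    centre, half = 0.5 * (a + b), 0.5 * (b - a)
    return sum(w * half * dv(centre + half * t) ** 2 for t, w in rule)


def bubble_energy(n_elements, rule):
    """Numerical energy of the quadratic bubble placed in every element of (0, 1)."""
    h = 1.0 / n_elements
    total = 0.0
    for e in range(n_elements):
        a, b = e * h, (e + 1) * h
        c = 0.5 * (a + b)
        # v = ((x - c)^2 - h^2/4) / 2 vanishes at both ends of the element,
        # so the pieces join continuously; its derivative is x - c.
        total += numerical_energy(lambda x, c=c: x - c, a, b, rule)
    return total


one_point = bubble_energy(4, MIDPOINT)   # k - 2 = 1 point: too few
two_point = bubble_energy(4, GAUSS2)     # k - 1 = 2 points: enough
print("midpoint rule:", one_point)
print("2-point Gauss:", two_point, " exact:", 4 * (0.25 ** 3) / 12.0)
```

With one point per element the numerical energy of a nonzero trial function is exactly zero, so (13) fails; with two Gauss points the energy $h^3/12$ per element is recovered.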
In two dimensions, with bilinear trial functions on rectangles, the definiteness condition would fail if we were to choose the midpoint rule for quadrature:

$\int_{-h/2}^{h/2}\int_{-h/2}^{h/2} g(x, y)\,dx\,dy \approx h^2 g(0, 0).$

This is the one-point Gauss rule on a square, and it is exact for the polynomials $g = 1, x, y, xy$. However, it is indefinite; for the trial function $v^h = xy$ numerical integration of $(v^h_x)^2 + (v^h_y)^2$ gives zero. If we set $v^h = +1$ or $-1$ in a checkerboard pattern over the whole set of nodes in $\Omega$, the result is a high frequency oscillation (pure twist, with the smallest wavelength $2h$ which the mesh can accept) whose numerical energy is exactly zero. This is reflected in the discrete approximation to the Laplacian which would arise from this midpoint rule; a typical row of the stiffness matrix $K$ gives

$(KQ)_{i,j} = 2Q_{i,j} - \tfrac{1}{2}\,(Q_{i+1,j+1} + Q_{i+1,j-1} + Q_{i-1,j+1} + Q_{i-1,j-1}).$

This is the 5-point scheme rotated through 45°, and seems dangerously unstable; in fact, our oscillating twist gives a solution to the homogeneous equation $KQ = 0$.†
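The rotated stencil and the zero-energy twist can both be read off from the element matrix. The sketch below assumes a square element of side $h$ centred at the origin with nodes ordered counterclockwise from the lower left corner; the function names and the $2 \times 2$ patch used for assembly are choices made for this illustration only.

```python
def element_stiffness_midpoint(h=1.0):
    """Bilinear element matrix for the integral of v_x^2 + v_y^2, midpoint rule.

    Node order: (-,-), (+,-), (+,+), (-,+) on the square of side h centred
    at the origin. The shape functions are (1 +- 2x/h)(1 +- 2y/h)/4, so their
    gradients at the centre are (+-1/(2h), +-1/(2h)).
    """
    gx = [-0.5 / h, 0.5 / h, 0.5 / h, -0.5 / h]
    gy = [-0.5 / h, -0.5 / h, 0.5 / h, 0.5 / h]
    weight = h * h  # the single quadrature weight
    return [[weight * (gx[i] * gx[j] + gy[i] * gy[j]) for j in range(4)]
            for i in range(4)]


def centre_stencil(Ke):
    """Assemble a 2 x 2 patch of elements (3 x 3 nodes); return the centre row."""
    n = 3
    K = [[0.0] * (n * n) for _ in range(n * n)]
    for ey in range(2):
        for ex in range(2):
            nodes = [ey * n + ex, ey * n + ex + 1,
                     (ey + 1) * n + ex + 1, (ey + 1) * n + ex]
            for a in range(4):
                for b in range(4):
                    K[nodes[a]][nodes[b]] += Ke[a][b]
    centre = 4
    return [[K[centre][row * n + col] for col in range(n)] for row in range(n)]


Ke = element_stiffness_midpoint()
twist = [1.0, -1.0, 1.0, -1.0]   # checkerboard values at the four nodes
Ke_twist = [sum(Ke[i][j] * twist[j] for j in range(4)) for i in range(4)]
stencil = centre_stencil(Ke)

print("K_e applied to the twist:", Ke_twist)
for row in stencil:
    print(row)
```

The assembled row couples the centre node only to its four diagonal neighbours, and the element matrix annihilates the twist as well as the constant, so its rank is two rather than three.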
A similar one-point rule at the centroids of triangles, with linear trial functions, will not be indefinite. The strains are constant, and cannot vanish at the centroid without being identically zero. In fact, the Laplacian gives rise to the normal 5-point scheme.

The basic test for definiteness is to determine whether or not there are trial functions which, under numerical quadrature, give up all their strain energy. In practice this is decided from the rank of the element stiffness matrices; if the only zero eigenvalues come from rigid body motions, the quadrature is all right. If there are additional zero eigenvalues, the quadrature might still be acceptable: it has to be checked whether or not the offending polynomials in separate elements can be fitted together (as in the case of the twist described above) into a trial function $v^h$ which has too little energy over the whole domain. For example, four-point Gauss integration (2 × 2) does not satisfy our stability condition for the nine-parameter biquadratic. With Gauss points $(\pm\xi, \pm\xi)$ on the square centered at the origin, $(x^2 - \xi^2)(y^2 - \xi^2)$ has zero strain energy, and this pattern can be translated to give trouble over the whole domain. (The matrix $K$ may not be actually singular, if this pattern does not fit the boundary conditions (say $v^h = 0$) of the problem. In this case one could live dangerously and try such a four-point integration, even though $K$ is much nearer to singularity than the theory demands.)
The question of definiteness becomes rather delicate for the important eight-parameter element, obtained from the biquadratic by eliminating the $x^2y^2$ term and the node at the center of each mesh square. Since the trial functions no longer include $(x^2 - \xi^2)(y^2 - \xi^2)$, the four Gauss points are apparently enough for a stable finite element approximation to Laplace's equation. Taylor has pointed out, however, that for plane elasticity with two dependent variables, the situation is different: the combination $u = x(y^2 - \xi^2)$, $v = -y(x^2 - \xi^2)$ lies in the trial space, and its strain energy vanishes for the 2 × 2 rule. But Taylor has also demonstrated that this pattern cannot be continued into a neighboring element; the rank of a single element matrix is too low, but after assembly the global stiffness matrix is perfectly nonsingular and the numerical integration is all right.
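Both zero-energy patterns can be confirmed by evaluating the strains at the four Gauss points. The sketch below assumes the reference square $[-1, 1]^2$, for which $\xi = 1/\sqrt{3}$; the function names and the extra sample point are introduced only for this illustration.

```python
import math

XI = 1.0 / math.sqrt(3.0)   # 2 x 2 Gauss points (+-XI, +-XI) on [-1, 1]^2
GAUSS_2X2 = [(sx * XI, sy * XI) for sx in (-1.0, 1.0) for sy in (-1.0, 1.0)]


def grad_biquadratic_mode(x, y):
    """Gradient of w = (x^2 - XI^2)(y^2 - XI^2), a nine-parameter biquadratic."""
    return (2.0 * x * (y * y - XI * XI), 2.0 * y * (x * x - XI * XI))


def strains_taylor_mode(x, y):
    """Plane strains of u = x(y^2 - XI^2), v = -y(x^2 - XI^2)."""
    eps_x = y * y - XI * XI             # du/dx
    eps_y = -(x * x - XI * XI)          # dv/dy
    gamma_xy = 2.0 * x * y - 2.0 * x * y  # du/dy + dv/dx, zero everywhere
    return (eps_x, eps_y, gamma_xy)


def sampled_energy(field, points):
    """Sum of squared components at the points (unit weights for 2 x 2 Gauss)."""
    return sum(c * c for (x, y) in points for c in field(x, y))


laplace_energy = sampled_energy(grad_biquadratic_mode, GAUSS_2X2)
elastic_energy = sampled_energy(strains_taylor_mode, GAUSS_2X2)
elsewhere = sampled_energy(strains_taylor_mode, [(0.0, 0.0), (1.0, 1.0)])

print("biquadratic mode, 2 x 2 Gauss:", laplace_energy)
print("Taylor mode, 2 x 2 Gauss:     ", elastic_energy)
print("Taylor mode at other points:  ", elsewhere)
```

The strains vanish at every Gauss point but not elsewhere in the element, so each pattern is a genuine nonzero displacement to which the 2 × 2 rule assigns no energy.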
So much for definiteness, which is a question of having enough integration points. We turn now to accuracy, which is decided by the polynomials for which the quadrature is exact. There are two ways to develop the theory. One is to compute directly the exponent $p$ which appears in equation (9) of the theorem; we take this approach in the next two paragraphs, writing down explicit bounds (14-15) for the errors in numerical integration. Then in the final paragraphs of this section we describe a simpler and neater argument, which leads immediately to the connection between the exactness of the quadrature rule and the order of the resulting error $u^h - \tilde u^h$.

†A recent paper by V. Girault analyzes the discrete system which results from this quadrature rule. It turns out to be surprisingly useful in some contexts, even though it violates our hypothesis of positive definiteness.
First approach. For a quadrature exact to degree $q$, the error in computing $\int g(x)\,dx$ numerically will be bounded by $C h^{q+1} \int |g^{(q+1)}(x)|\,dx$. This is the exact counterpart of the Approximation Theorem 3.3, and is proved in the same way. Applied to the expressions (9) which appear in Theorem 4.1, this becomes

(14) $|(a^* - a)(u^h, u^h - \tilde u^h)| \le C h^{q+1} \sum_j \int_{e_j} \left| \left(\frac{d}{dx}\right)^{q+1} \left[ p(x)\,u^h_x(u^h_x - \tilde u^h_x) + q(x)\,u^h(u^h - \tilde u^h) \right] \right| dx,$

(15) $|(b^* - b)(f, u^h - \tilde u^h)| \le C h^{q+1} \sum_j \int_{e_j} \left| \left(\frac{d}{dx}\right)^{q+1} \left[ f\,(u^h - \tilde u^h) \right] \right| dx.$

Here $q(x)$ in (14) is the coefficient of the differential equation, not the degree of exactness $q$. Suppose the data $f$ is smooth, as well as the variable coefficient $p(x)$ we have now allowed. Then $u$ is also smooth, and so is its finite element approximation $u^h$. Therefore the only uncontrolled terms on the right sides are $u^h - \tilde u^h$ and its derivative. Every differentiation of these trial functions can introduce a factor $h^{-1}$.† It would appear that $q + 1$ differentiations could completely cancel the factor $h^{q+1}$, and destroy the proof of convergence. Here is the point at which it is essential for the trial functions to be polynomials. Since $u^h_x - \tilde u^h_x$ is of degree $k - 2$, this is the maximum number of factors $h^{-1}$ which can appear; further differentiations would annihilate the polynomial. Therefore the expression (14) is of order $h^{(q+1)-(k-2)} \|u^h - \tilde u^h\|_1$. The same is true of the other expression (15); the first differentiation gives $u^h_x - \tilde u^h_x$, and we are back to the same argument. (We shall return to this point, and show in more detail why $b^* - b$ is of the same order as $a^* - a$ if the same quadrature rule is applied.) We conclude that $p = q - k + 3$ in Theorem 4.1, and therefore that the effect of numerical quadrature is bounded by

$\|u^h - \tilde u^h\|_1 \le C h^{q-k+3}.$

This coincides with the exponent $q - k + m + 2$ given earlier. For N-point Gauss quadrature, the degree of accuracy is $q = 2N - 1$, and the resulting error is $O(h^{2N-k+2})$. Therefore the use of $k - 1$ Gauss points should be completely successful in one dimension: it is enough to satisfy the requirement of definiteness, and it leads to an error of order $h^k$. This is even smaller than the approximation error in the strains, which is of order $h^{k-1}$.

†More precisely, $|v^h|_{s+1} \le C h^{-1} |v^h|_s$ for all $s$.
In two or three dimensions the principle is the same. For certain polynomials of degree $q + 1$, say $x^\alpha y^\beta$, the quadrature will be inexact. This gives rise to corresponding terms in the error $a^* - a$, of the form

$h^{q+1} \iint \left| \left(\frac{\partial}{\partial x}\right)^{\alpha} \left(\frac{\partial}{\partial y}\right)^{\beta} \left( D^m u^h\, D^m v^h \right) \right| dx\,dy.$

The strain $D^m u^h$ is smooth, but every differentiation of $D^m v^h$ can introduce a factor $h^{-1}$. The convergence condition is that at most $q$ of these factors appear. Therefore the full $q + 1$ differentiations $(\partial/\partial x)^\alpha (\partial/\partial y)^\beta$ must annihilate $D^m v^h$ for any trial function $v^h$. In other words, $D^m v^h$ must not include any of the terms $x^\alpha y^\beta$ for which the quadrature is inexact. This is precisely the convergence criterion (6) given earlier, that the mth derivatives of every trial function should be integrated exactly.
Second approach. Suppose that the integral of any polynomial of degree $n - m$, multiplied by any mth derivative $D^m v^h$ of any trial function, is computed exactly by the quadrature. We want to show, following [S7], that the exponent $p$ in Theorem 4.1 is $p = n - m + 1$; then the error in the strains due to numerical integration will be of this order $h^p$. To do so we consider the two quantities $a - a^*$ and $b - b^*$ which appear in the inequality (9).

A typical term in $(a - a^*)(u^h, v^h)$, with material properties expressed by a coefficient $c(x, y)$, looks like

(16a) $\iint c\,D^m u^h\,D^m v^h\,dx\,dy - \sum w_i\,(c\,D^m u^h\,D^m v^h)(\xi_i).$

The essential point, coming from our condition $a(P_n, v^h) = a^*(P_n, v^h)$ on the exactness of the quadrature, is that this term can be rewritten as

$\iint (c\,D^m u^h - P_{n-m})\,D^m v^h\,dx\,dy - \sum w_i\,\big((c\,D^m u^h - P_{n-m})\,D^m v^h\big)(\xi_i).$

Since numerical integration is carried out an element at a time, we can concentrate on a specific element and choose $P_{n-m}$ close to $c\,D^m u^h$. The difference between the two, according to our mean-square approximation theory, will be of order $h^{n-m+1}$, provided $c\,D^m u^h$ is sufficiently smooth, which will be the case if the original problem is smooth. Summing the results over the separate elements, the error (16a) is of the right order $h^{n-m+1}$.
A variation of this argument yields the same bound for any derivatives below order $m$ in the energy, and also for the term $b - b^*$. We shall describe the steps involved in estimating this error in the latter term, assuming for simplicity that the mth derivatives of the trial functions $v^h$ consist of all polynomials of some degree $t$. Then our exactness condition on the quadrature amounts to correct integration of all polynomials of degree $n - m + t$. Therefore

$(b - b^*)(f, v^h) = \iint f v^h\,dx\,dy - \sum w_i\,(f v^h)(\xi_i) = \iint (f v^h - P_{n-m+t})\,dx\,dy - \sum w_i\,(f v^h - P_{n-m+t})(\xi_i).$

With the right choice of $P$ in each element, these quantities are of order $h^{n-m+t+1}$, multiplied by an integral of the absolute value of derivatives up to this order of $f v^h$. But $v^h$ can be differentiated only $t + m$ times, after which (being a polynomial) it disappears. Therefore, assuming $f$ is smooth enough to allow $n - m + t + 1$ differentiations, the error $b - b^*$ is of order

$h^{n-m+t+1} \|v^h\|_{t+m} \le C h^{n-m+1} \|v^h\|_m.$

At the last step, the removal of $t$ differentiations from $v^h$ was paid for, as in the footnote on p. 190, by a factor $h^{-t}$.
Thus both $a - a^*$ and $b - b^*$ are of the correct order $h^{n-m+1}$, and by Theorem 4.1 the strains are in error to this order. This is the main result of the section: If $a(P_n, v^h) = a^*(P_n, v^h)$, then $\|u^h - \tilde u^h\|_m = O(h^{n-m+1})$. Ciarlet and Raviart have been able to show, even allowing for the use of isoparametrics [6], that the error in displacement shows the usual improvement over the error in strains. (Their proof is a subtle variation of Nitsche's trick.) Therefore the displacement error is of order $h^{n+1}$, and the theory of numerical quadrature is in a satisfactory state: $n = m$ is necessary for convergence, and $n = k - 1$ is sufficient to reduce the errors due to quadrature to the same level as the errors due to approximation by polynomial trial functions.
4.4. APPROXIMATION OF DOMAIN AND BOUNDARY CONDITIONS

At the same time that the admissible functions in $\mathcal{H}^m_E$ are being approximated by piecewise polynomials, some other and quite different approximations are being made in the finite element method. In the first place, the domain itself may be changed: $\Omega$ is replaced by a nearby polygon $\Omega^h$, or in the isoparametric method by a domain whose boundary is a piecewise polynomial. It would be extremely difficult to handle an arbitrary domain in any other way. Second, the boundary conditions themselves are subject to approximation; if the problem specifies that $u = g(x, y)$ on $\Gamma$, or that $u_n + \alpha u = b(x, y)$, then these functions $g$ and $b$ will almost inevitably be interpolated at the nodes on the boundary $\Gamma$ (or on its approximation). We want to estimate the errors involved.
This section discusses four problems in detail:

1. Change of domain with a homogeneous Dirichlet condition, $u = 0$ on $\Gamma$.
2. Change of domain with a homogeneous Neumann condition, $u_n = 0$ on $\Gamma$.
3. Approximation of an inhomogeneous Dirichlet condition, $u = g(x, y)$ on $\Gamma$.
4. Approximation of an arbitrary inhomogeneous Neumann condition, $u_n + \alpha(x, y)u = b(x, y)$ on $\Gamma$.
In each case we work with a second-order equation in two dimensions, say Poisson's equation $-\Delta u = f$. In large part this is for convenience and simplicity of exposition: very little is altered if there are several unknowns and three space dimensions. For a pure Dirichlet or a pure Neumann problem of higher order, say a plate with its edges either clamped or free, the order of the error in strain energy is the same as the one demonstrated below.
A new and quite different situation arises when an essential and a natural boundary condition are combined. This occurs in a simply supported plate: the biharmonic equation $\Delta^2 u = f$ accounts for the loading in the interior, and Poisson's ratio $\nu$ enters the natural boundary condition:

(17) $u = 0$ and $\nu\,\Delta u + (1 - \nu)\,u_{nn} = 0$ on $\Gamma$.

It is no longer possible to guarantee, in this physically important and mathematically well-posed problem, that if the polygon $\Omega^h$ is close to $\Omega$, then the solution on the polygon is close to the solution $u$ on $\Omega$. This applies both to the exact solution $U^h$ on $\Omega^h$ and to the finite element approximation $u^h$. More is required than simple convergence of the boundaries.
The difficulty is easy to see when $\Omega^h$ is a polygon. On each edge, the condition $U^h = 0$ immediately forces the tangential derivative $U^h_{tt}$ to vanish. Therefore, the second boundary condition in (17) is equivalent on a straight edge to $\Delta U^h = 0$, and the dependence on $\nu$ has disappeared. Babuska [B1] proposes introducing $V^h = \Delta U^h$ as a new unknown, yielding two second-order equations:

$\Delta U^h = V^h$ and $\Delta V^h = f$ in $\Omega^h$, $U^h = V^h = 0$ on $\Gamma^h$.

For such a second-order system convergence is guaranteed, and $U^h$, $V^h$ approach the solutions of

$\Delta U = V$ and $\Delta V = f$ in $\Omega$, $U = V = 0$ on $\Gamma$.

This is precisely the simply supported plate problem (17) with $\nu = 1$, and therefore the limiting function is independent of the Poisson's ratio actually specified in the boundary conditions. Convergence occurs, but nearly always to the wrong answer. Corresponding difficulties with finite element calculations are reported in [R1] and discussed in [B11]. We would anticipate success in the isoparametric method, on the other hand, if the approximation of $\Gamma$ is at least piecewise quadratic; the curvature of the boundary converges in this case. Alternatively, suppose the essential condition $u = 0$ were replaced at boundary nodes by $u^h = \partial u^h/\partial t = 0$, using the cubic trial space $Z_3$ (see Section 1.9) and taking the tangent to the true boundary $\Gamma$. Then convergence is expected even on a polygon. At this writing, however, the required theory does not exist.
We turn to the four problems listed above. In each case the analysis is rather technical, but the conclusions are straightforward.

1. Suppose that $\Omega$ is replaced by an inscribed polygon $\Omega^h$, and the trial functions are made to vanish on the straight edges of $\Gamma^h$. Then imagining that they are defined to vanish everywhere outside $\Gamma^h$, they are admissible in the variational problem; they vanish on the true boundary $\Gamma$, and the trial space $S^h$ is a bona fide subspace of $\mathcal{H}^1_0(\Omega)$. Therefore, the fundamental Ritz Theorem 1.1 assures that $u^h$ minimizes the error in strain energy:

(18) $\iint_\Omega (u - u^h)_x^2 + (u - u^h)_y^2 = \min_{S^h} \iint_\Omega (u - v^h)_x^2 + (u - v^h)_y^2.$

Since every $v^h$ vanishes outside $\Omega^h$, the integral over the skin $\Omega - \Omega^h$ is fixed; it is just the integral of $u_x^2 + u_y^2$. Therefore, $u^h$ is minimizing over $\Omega^h$ as well as $\Omega$:

(19) $\iint_{\Omega^h} (u - u^h)_x^2 + (u - u^h)_y^2 = \min_{S^h} \iint_{\Omega^h} (u - v^h)_x^2 + (u - v^h)_y^2.$

The question is simply to estimate (18) and (19). Up to now we have chosen $v^h$ to be the interpolate $u_I$. If $S^h$ is Courant's piecewise linear space, with nodes at the vertices of the triangles, this is still a good choice. Since it interpolates $u = 0$ at boundary nodes, it vanishes along the boundary of the polygon and lies in $S^h$. The standard approximation Theorem 3.3 yields an $h^2$ error in energy. No doubt the approximation is very poor near the boundary.
Suppose a more refined element is used, for example a piecewise quadratic
(Fig. 4.2). As usual, the nodes are placed at the vertices and midpoints of the Here rp is the quadratic with value 1 at
edges. Each trial function will vanish along rh provided it is zero at aii bound- five nodes ofthe triangle. Its first derh
ary nodes, but this excludes u1 from the tria! space Sh. The true solution u the triangle is of order h2 , so that fina
vanishes at the vertices P and Q but not at the midpoint M, so the interpola te
is not zero along the boundary. Since Mis a distance O(h 2 ) from the true !u1 ujj
CHAP.4 SEC. 4.4. APPROXIMATION OF DOMAIN AND BOUNDARY CONDITIONS 195
occurs, but near1y always to the wrong

th finite element calculations a~~ report-
would anticípate success in tfie ?i~opara
if the approximation of r is )¡t least
f the boundary converges in this case.
Jndition u o were replaced at bound-
he cubic trial space Z~ (see Section 1.9)
mdary r. Then convergence is expected
however, the required theory does not
ted above. In each case the analysis is Fig. 4.2 Polygonal approximation of the boundary.
are straightforward.
boundary r, the mean-value theorem yields
an inscribed polygon !l\ and the trial
straight edges of rh. Then imagining (20) [u(M)[ Ch 2 max 1 grad u[.
here outside rh, they are admissible in o
on the true boundary r, and·the trial
It is no longer permissible to choose $v^h = u_I$ in (19). Instead, a convenient
$v^h$ will be the piecewise quadratic $u_I^*$ which interpolates $u$ at interior nodes
only and vanishes on $\Gamma^h$. The error in strain energy over $\Omega^h$ for this choice,
using the triangle inequality, is

$\iint_{\Omega^h} (u - u_I^*)_x^2 + (u - u_I^*)_y^2 = |u - u_I + u_I - u_I^*|_1^2$
$\qquad \le (|u - u_I|_1 + |u_I - u_I^*|_1)^2$
$\qquad \le 2|u - u_I|_1^2 + 2|u_I - u_I^*|_1^2.$
The term $|u - u_I|_1^2$ is completely familiar; it is $O(h^4)$ by Theorem 3.3. The new
term $u_I - u_I^*$ is a piecewise quadratic which vanishes over all interior triangles:
the only nodes at which $u_I^* \ne u_I$ are the boundary midpoints $M_i$; at
these points $u_I^* = 0$ and $u_I = u = O(h^2)$. The number of such boundary
triangles is $O(1/h)$, and
$|u_I - u_I^*|_1^2 = \sum_i |u(M_i)|^2 \iint_{e_i} (\varphi_x^2 + \varphi_y^2)\,dx\,dy$
$\qquad \le O\!\left(\frac{1}{h}\right) C^2 h^4 \max |\operatorname{grad} u|^2 \iint (\varphi_x^2 + \varphi_y^2)\,dx\,dy.$
Here $\varphi$ is the quadratic with value 1 at the midpoint $M_i$ and zero at the other
five nodes of the triangle. Its first derivatives are of order $1/h$, and the area of
the triangle is of order $h^2$, so that finally

$|u_I - u_I^*|_1^2 = O(h^3).$
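The counting argument can be made concrete. The sketch below is illustrative only: it assumes a unit circle, equilateral boundary triangles of side $h$, the constant $C = 1/8$ in (20), and $\max |\operatorname{grad} u| = 1$. The quadratic $\varphi = 4\lambda_1\lambda_2$ (in barycentric coordinates) is the midpoint function on the edge joining the first two vertices, and its Dirichlet integral is evaluated exactly; the function names are ours.

```python
import math
import numpy as np

def midpoint_function_energy(p1, p2, p3):
    """Exact integral of |grad phi|^2 over the triangle, phi = 4*l1*l2."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    # Signed double area of the triangle.
    d = (p2[0] - p1[0]) * (p3[1] - p1[1]) - (p3[0] - p1[0]) * (p2[1] - p1[1])
    area = abs(d) / 2.0
    # Constant gradients of the barycentric coordinates l1 and l2.
    g1 = np.array([p2[1] - p3[1], p3[0] - p2[0]]) / d
    g2 = np.array([p3[1] - p1[1], p1[0] - p3[0]]) / d
    # Uses the integrals of l1^2 (= area/6) and l1*l2 (= area/12).
    return (8.0 * area / 3.0) * (g1 @ g1 + g2 @ g2 + g1 @ g2)

def boundary_energy_bound(h, grad_max=1.0):
    """(number of boundary triangles) * |u(M)|^2 * energy of phi."""
    triangle = [(0.0, 0.0), (h, 0.0), (h / 2.0, h * math.sqrt(3.0) / 2.0)]
    n_triangles = 2.0 * math.pi / h          # O(1/h) triangles on the circle
    u_mid = grad_max * h**2 / 8.0            # bound (20) with C = 1/8
    return n_triangles * u_mid**2 * midpoint_function_energy(*triangle)

for h in (0.2, 0.1, 0.05):
    print(h, boundary_energy_bound(h) / h**3)
```

The energy of $\varphi$ does not depend on $h$ (derivatives of order $1/h$ squared, times an area of order $h^2$), so the printed ratio is the same for every $h$: the bound is exactly of order $h^3$.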
THEOREM 4.2

The error in energy over $\Omega^h$, produced by polygonal approximation of the
domain, satisfies

(21)   $|u - u^h|_1^2 \le |u - u_I^*|_1^2 = O(h^3).$

The error in function values is $u - u^h = O(h^2)$.
The latter estimate (which we shall not prove) is suggested by the continuous
problem on the approximate domain: $-\Delta U^h = f$ on $\Omega^h$, $U^h = 0$ on
$\Gamma^h$. Since $\Delta(u - U^h) = -f + f = 0$, the maximum principle applies, and
$u - U^h$ attains its maximum on $\Gamma^h$. But $U^h = 0$ and $u = O(h^2)$ on this
boundary, so that $u - U^h$ is of order $h^2$ everywhere. The theorem asserts the
same result for $u - u^h$ [B20].
It follows that if we look only at the exponent of $h$, and approximate $\Omega$ by
a polygon, then there is no point in going beyond quadratic shape functions
in computing stresses. Furthermore, even linear polynomials will be correct
to the best possible order $h^2$ in the displacements. Probably this is an instance
in which the exponents of $h$ are not sufficient to indicate the true accuracy;
the actual computed error with linear elements may be excessive.
A similar estimate applies to three-dimensional and to higher-order
Dirichlet problems. It is most unusual that the exponent should be odd;
the average error in the derivatives of $u - u^h$ is of the fractional order $h^{3/2}$. The
estimate itself is nevertheless correct, and in fact the exact solution $U^h$ to the
equation on the polygon [which is closer to $u$ than $u^h$ is, because it is minimizing
over the space $\mathcal{H}_0^1(\Omega^h)$, which contains $S^h$] is also in error to order $h^{3/2}$
[B10]. For a rectangular approximation of the boundary, which is much more
crude, the order becomes $h^{1/2}$; the computational results would be completely
unsatisfactory. That is the reason for triangular elements.
The explanation for the fractional exponent $\tfrac{3}{2}$ appears to be this: There
is a boundary layer, a couple of elements thick, within which the derivatives
are in error by $O(h)$. It is easy to verify that the angle $\theta$ in the figure is of this
order, so that the true derivative of $u$ along the chord is $O(h)$ rather than zero.
This layer alone accounts for the $O(h^3)$ error in energy. Beyond the boundary
layer, the error is quite different: it is of the optimal order $h^2$ in the displacement,
that is, $u - u^h = O(h^2)$, and it is so smooth that the first derivatives
are also of order $h^2$. The error is much better behaved in the interior than near
the boundary, illustrating the smoothing property common to all elliptic
problems. In the finite element approximation, $u - u^h$ oscillates from zero to
$O(h^2)$ and back along each chord $PMQ$; this singular behavior is damped out
rapidly in the direction normal to the boundary. The damping is very visible
in computations and has been explicitly verified. It represents a kind of
St. Venant principle, but one which applies to the geometry rather than, as is
more customary, to the boundary data. The rule is the same: Away from the
boundary, it is the average which matters and not the local short-wavelength
oscillations.
The situation is not unlike that of mapping the unit circle onto a circle
with a wavy boundary. Suppose the latter boundary is $R(\theta) = 1 + h^2 \cos(\theta/h)$.
Then in the conformal map between these "circles," the point $(r, \theta)$ goes
into $(r + h^2 r^{1/h} \cos(\theta/h), \theta)$. All the activity lies inside a boundary layer
$r > 1 - Ch$; for smaller values of $r$, the amplitude of the "wavy" perturbation
term is exponentially small,

$r^{1/h} < (1 - Ch)^{1/h} \approx e^{-C}.$
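The thickness of this layer is easy to examine numerically. The sketch below takes the perturbation amplitude to be $h^2 r^{1/h}$, as in the mapping above, and compares its size at the inner edge $r = 1 - Ch$ of the layer with its size on the boundary $r = 1$; the choice $C = 5$ and the function name are ours.

```python
import math

def wave_amplitude(r, h):
    """Amplitude h^2 * r^(1/h) of the wavy perturbation at radius r."""
    return h**2 * r ** (1.0 / h)

C = 5.0
for h in (0.1, 0.05, 0.01, 0.001):
    r_inner = 1.0 - C * h
    # Ratio of the amplitude at r = 1 - C*h to the amplitude at r = 1.
    ratio = wave_amplitude(r_inner, h) / wave_amplitude(1.0, h)
    print(f"h = {h:6.3f}   ratio = {ratio:.5f}   exp(-C) = {math.exp(-C):.5f}")
```

The ratio stays below $e^{-C}$ and approaches it as $h \to 0$, so a layer a fixed number of mesh widths thick already contains all but an exponentially small part of the oscillation.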
The derivatives of the mapping are $O(h)$ near the boundary. In the interior
they are virtually zero, because of $r^{1/h}$. Note that the two circles have the
same area; if one were inscribed in the other, there would be an additional
term $rh^2$, whose derivatives do not decay but are of lower order $h^2$ throughout
the domain. In St. Venant terms, the average over the local oscillations is
nonzero and is propagated. Furthermore, if the circle were scalloped instead
of wavy, with $|\cos(\theta/h)|$ instead of $\cos(\theta/h)$, then the conformal map would
possess weak singularities at the cusps. In the finite element approximation,
however, these are smeared over and the mean error in derivatives is of order
$h$ at the boundary and $h^2$ inside.
There are a number of ways to recover the optimal convergence rate for
quadratics. Coordinate change by isoparametric elements, and Mitchell's
$x$-$y$ element which has piecewise hyperbolas (!) as boundary, have already
been mentioned. Another possibility is to compute a correction term to the
polygonal approximation $u^h$ [B26], or to modify the original functional
$I(v)$ [N3]. In all these techniques it is not essential that $\Omega^h$ lie inside $\Omega$; in fact,
the error due to change of domain is partly self-cancelling, if $\Gamma^h$ goes
systematically in and out of $\Gamma$. This is St. Venant again, that the leading error
term depends on the area of $\Omega - \Omega^h$ (with algebraic sign), and suggests a
further possibility: to average the approximate solutions for inscribed and
circumscribed polygons. All these proposals, except for the isoparametric one,
are largely untested in practical problems.

2. Consider next the equation $-\Delta u + qu = f$, with natural boundary
condition $u_n = 0$. In this case there is no question of imposing conditions
on the wrong boundary; the trial functions are unrestricted at the boundary.
However, a change of domain does enter, if it is inconvenient to carry out
integrations over the true domain $\Omega$. If the potential energy is computed
instead over an approximate domain $\Omega^h$, the effect is to introduce a new
definition of the potential energy:

$I^h(v) = a^h(v, v) - 2(f, v)_h = \iint_{\Omega^h} (v_x^2 + v_y^2 + qv^2 - 2fv)\,dx\,dy.$
Assuming that $u^h$ minimizes the true functional $I$ over $S^h$ and that $\tilde{u}^h$
minimizes $I^h$, the problem is to estimate $e^h = u^h - \tilde{u}^h$.
Mathematically, this is essentially the same question which arises for
Gaussian quadrature: the integrals are computed inexactly. Therefore, we
apply a variant of the identity of Lemma 4.1; it follows directly from the
vanishing of the first variations at $u$, $u^h$, and $\tilde{u}^h$ that

$a^h(e^h, e^h) = a^h(u^h - u, e^h) + [a^h(u, e^h) - a(u, e^h)] + [(f, e^h) - (f, e^h)_h].$
The first term on the right, by the Schwarz inequality, is not larger than

$[a^h(u - u^h, u - u^h)]^{1/2}\,[a^h(e^h, e^h)]^{1/2}.$

The other terms on the right amount to an integral over the skin $\Omega - \Omega^h$:

$B = \iint_{\Omega - \Omega^h} (-u_x e_x^h - u_y e_y^h - q u e^h + f e^h)\,dx\,dy.$

Assuming that $u_x$, $u_y$, $q$, and $f$ are bounded, the Schwarz inequality this time
involves the area $A$ of the skin:

$|B| \le C A^{1/2} \left[ \iint_{\Omega - \Omega^h} (e_x^h)^2 + (e_y^h)^2 + (e^h)^2 \right]^{1/2}.$
The remaining problem is to estimate this last integral in terms of $E$, in
other words, to relate the size of $e^h$ in the skin to its size in the interior. For
arbitrary functions this would be impossible. The function $u^h - \tilde{u}^h$, however,
is by no means arbitrary: in each triangle it agrees with some polynomial,
and polynomials cannot suddenly explode; a bound in $\Omega^h$ implies a bound
in $\Omega$.
We denote by $T$ a typical curved triangle at the boundary, and by $T_h$
the inscribed straight triangle; their difference $T - T_h$ is one of the pieces
that go to make up $\Omega - \Omega^h$. The trial functions are polynomials on each
triangle $T$, and we are dealing with the error $e^h = u^h - \tilde{u}^h$ which is caused by
integrating only over $T_h$. This error is itself a polynomial $P(x, y)$ on each $T$,
and obeys the following lemma given by Berger.
LEMMA 2.2

Suppose that $\rho = \operatorname{area}(T - T_h)/\operatorname{area}(T_h)$. Then there is a constant $c$,
depending only on the degree of the polynomial $P(x, y)$, such that

$\iint_{T - T_h} P_x^2 + P_y^2 + P^2 \le c\rho \iint_{T_h} P_x^2 + P_y^2 + P^2.$
Summing over all boundary triangles and increasing the right side to
include also the interior triangles, the lemma gives

$\iint_{\Omega - \Omega^h} (e_x^h)^2 + (e_y^h)^2 + (e^h)^2 \le c\rho E^2.$

Substituting back into the estimate for $|B|$, the identity now reads

In other words, the error in strain energy due to integration over $\Omega^h$ instead of
$\Omega$ satisfies
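The two bounds described in words here can be written out as follows. This is a reconstruction under stated assumptions, not the authors' display: we take $E^2$ to be the integral of $(e_x^h)^2 + (e_y^h)^2 + (e^h)^2$ over $\Omega^h$, we assume $a^h(e^h, e^h) \ge \gamma E^2$ for some $\gamma > 0$ (true if $q$ is bounded below by a positive constant), and we start from $a^h(e^h, e^h) = a^h(u^h - u, e^h) + B$ with $|B| \le C A^{1/2} [\,\cdot\,]^{1/2}$ over the skin. The constants $\gamma$ and $C'$ are our notation.

```latex
\begin{aligned}
a^h(e^h, e^h)
  &\le \bigl[a^h(u - u^h, u - u^h)\bigr]^{1/2}\bigl[a^h(e^h, e^h)\bigr]^{1/2}
     + C\,(c\,A\,\rho)^{1/2}\,E, \\
\bigl[a^h(e^h, e^h)\bigr]^{1/2}
  &\le \bigl[a^h(u - u^h, u - u^h)\bigr]^{1/2} + C'\,(A\rho)^{1/2},
     \qquad C' = C\,(c/\gamma)^{1/2}, \\
a^h(e^h, e^h)
  &\le 2\,a^h(u - u^h, u - u^h) + 2\,C'^2 A\rho .
\end{aligned}
```

The second line follows from the first by using $E \le \gamma^{-1/2}[a^h(e^h, e^h)]^{1/2}$ and dividing by $[a^h(e^h, e^h)]^{1/2}$; the third squares it with $(x + y)^2 \le 2x^2 + 2y^2$. The bound therefore has one term governed by the strain energy in $u - u^h$ and one purely geometrical term proportional to $A\rho$.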
The first term reflects the strain energy in $u - u^h$ and contributes nothing
new. The second term is the interesting one, and it is purely geometrical.
Suppose, for example, that $\Omega^h$ is a polygon. Then the area $A$ is $O(h^2)$, and the
ratio $\rho$ to the area of the neighboring triangles is $O(h)$. Thus the strain energy
error in changing to a polygon is of the same order $h^3$ for natural as for essential
boundary conditions. We conjecture that there is again a boundary layer.
For isoparametrics instead of subparametrics, $A$ is $O(h^k)$ and $\rho$ is $O(h^{k-1})$,
and their product $h^{2k-1}$ is dominated by the ordinary approximation error in
strain energy of order $h^{2k-2}$. Therefore isoparametrics should be successful
with curved regions, whether the boundary conditions are essential or natural.
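The comparison of exponents in this paragraph is simple bookkeeping, and can be tabulated. The sketch below only restates the orders quoted above ($A = O(h^2)$, $\rho = O(h)$ for a polygon; $A = O(h^k)$, $\rho = O(h^{k-1})$ for isoparametrics; approximation error $h^{2(k-1)}$); the function names are ours.

```python
def skin_exponent(k, isoparametric):
    """Exponent of h in A * rho, the geometric part of the strain energy error."""
    if isoparametric:
        area_exp, ratio_exp = k, k - 1      # A = O(h^k), rho = O(h^(k-1))
    else:
        area_exp, ratio_exp = 2, 1          # polygon: A = O(h^2), rho = O(h)
    return area_exp + ratio_exp

def approximation_exponent(k):
    """Exponent of h in the ordinary strain energy error, h^(2(k-1))."""
    return 2 * (k - 1)

for k in (2, 3, 4):
    print(k, approximation_exponent(k),
          skin_exponent(k, isoparametric=False),
          skin_exponent(k, isoparametric=True))
```

For quadratics ($k = 3$) the polygon contributes $h^3$, which is worse than the approximation error $h^4$, while the isoparametric skin term $h^5$ is harmless; for linear elements ($k = 2$) the two geometries coincide.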

3. The next problem is that of an inhomogeneous essential boundary
condition, $u = g(x, y)$ on $\Gamma$. This condition is satisfied by every member $v$ of
the true admissible space $\mathcal{H}_E^1$. Any two admissible functions therefore differ
by a function $v_0$ in $\mathcal{H}_0^1$; the boundary values of $v_0$ are zero.
Suppose that the situation is the same for the trial functions $v^h$ in $S^h$: All
trial functions assume the same values on the boundary $\Gamma$ (not necessarily
$v^h = g$, that is too much to expect from polynomials, but say $v^h = g^h$ instead).
Then any two trial functions differ by a function $v_0^h$, which vanishes on $\Gamma$;
these functions $v_0^h$ form a space $S_0^h$, which is a subspace of the true
homogeneous space $\mathcal{H}_0^1$.
THEOREM 4.3

Suppose that $u$ minimizes $I(v)$ over $\mathcal{H}_E^1$ and $u^h$ minimizes over $S^h$. Then the
vanishing of the first variation is expressed by

(22)   $a(u, v_0) = (f, v_0)$ for all $v_0$ in $\mathcal{H}_0^1$,
(23)   $a(u^h, v_0^h) = (f, v_0^h)$ for all $v_0^h$ in $S_0^h$.

As in Theorem 1.1, $u^h$ has the additional minimizing property

(24)   $a(u - u^h, u - u^h) = \min_{v^h \text{ in } S^h} a(u - v^h, u - v^h).$
The proof is copied from Theorem 1.1. Since $u$ and $u^h$ are minimizing, any
perturbations $\epsilon v_0$ and $\epsilon v_0^h$ must increase the functional $I$:

$I(u) \le I(u + \epsilon v_0)$ and $I(u^h) \le I(u^h + \epsilon v_0^h).$

Expanding, the coefficients of $\epsilon$ must vanish, and these are the virtual work
equations (22) and (23).
To prove that $u^h$ minimizes the error in strain energy, write

$a(u - v^h, u - v^h) = a(u - u^h + u^h - v^h, u - u^h + u^h - v^h)$
$\qquad = a(u - u^h, u - u^h) + 2a(u - u^h, u^h - v^h) + a(u^h - v^h, u^h - v^h).$

The middle term must vanish automatically, subtracting (23) from (22) and
choosing $v_0^h = u^h - v^h$ (which is also admissible as $v_0$ in (22), since $S_0^h$ lies in
$\mathcal{H}_0^1$). The last term will be positive unless $v^h$ happens to
equal $u^h$; therefore, the minimum occurs at this point, and the theorem is
proved.
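The structure of this proof is purely algebraic, and a finite-dimensional stand-in shows it at work. In the sketch below (all names and sizes are ours, chosen for illustration) a symmetric positive definite matrix `K` plays the role of $a(\cdot,\cdot)$, the vector `w` plays the role of a fixed function carrying the boundary values $g^h$, and the columns of `V` span the analogue of $S_0^h$. The Galerkin condition corresponding to (22) minus (23) is imposed, and the minimizing property (24) is then checked against random competitors.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3

G = rng.standard_normal((n, n))
K = G @ G.T + n * np.eye(n)        # SPD stand-in for the energy form a(., .)
u = rng.standard_normal(n)         # stand-in for the true solution
w = rng.standard_normal(n)         # fixed "lift" carrying the boundary values
V = rng.standard_normal((n, m))    # basis of the homogeneous trial space

def energy(e):
    """Energy a(e, e) of an error vector e."""
    return float(e @ K @ e)

# Orthogonality a(u - u_h, v0) = 0 for every v0 in span(V) determines u_h.
c = np.linalg.solve(V.T @ K @ V, V.T @ K @ (u - w))
u_h = w + V @ c

best = energy(u - u_h)
# Any other member w + V d of the affine trial set has at least this error.
trials = [energy(u - (w + V @ (c + rng.standard_normal(m)))) for _ in range(100)]
print(best, min(trials))
```

Each competitor differs from `u_h` by an element of the homogeneous space, so the cross term drops out and its error energy exceeds `best` by the (positive) energy of that difference, exactly as in the expansion above.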
COROLLARY

Suppose that $\Omega$ is a polygon, and the essential condition $u = g(x, y)$ on $\Gamma$
is interpolated in the finite element method: at all boundary nodes, the trial
functions satisfy $v^h(z_j) = g(z_j)$ [or more generally $D_j v^h(z_j) = D_j g(z_j)$].
Then

(25)   $a(u - u^h, u - u^h) \le a(u - u_I, u - u_I).$

Therefore, the error estimate in the finite element method is reduced to the
standard approximation estimate of $u - u_I$.
Proof. The conditions that must be checked are
a. $u_I$ must lie in $S^h$, so that the choice $v^h = u_I$ is possible in (24). Since $u_I$
satisfies the boundary conditions imposed in the corollary, that is, $u_I(z_j) =
u(z_j) = g(z_j)$ at boundary nodes, this requirement is met.
b. All trial functions $v^h$ must assume the same boundary values on $\Gamma$, so
that their differences belong to $\mathcal{H}_0^1$. In other words, the trial functions $v^h$
should be determined on $\Gamma$ by their values at the boundary nodes. Since $\Gamma$ is
composed of flat edges, and we assume that a conforming element is used,
this will be the case.
We hesitated to give the usual estimate $h^{2(k-1)}$ for (25), since such an estimate
requires that the solution be smooth ($u$ in $\mathcal{H}^k$). At the corners of polygonal
domains, a singularity in the derivatives of $u$ is almost automatic. For
second-order equations, $u$ normally just fails to lie in $\mathcal{H}^{1+\pi/\alpha}$, where $\alpha$ is the largest
interior angle. The first derivatives will actually be approximated to order
$h^{\pi/\alpha}$; the strain energy error will be $h^{2\pi/\alpha}$. In Chapter 8 we consider how the
usual order $h^{2(k-1)}$ can be preserved, by refining the mesh at the corners or
(better) by introducing special trial functions with the right singularities.
Suppose that a nonpolygonal $\Omega$ is replaced by a polygonal $\Omega^h$, either
directly in the $x$-$y$ plane, or else in the $\xi$-$\eta$ plane after an isoparametric
transformation. If the computations are done on $\Omega^h$, then again the boundary
nodes completely determine all the boundary values, and the corollary can
be applied: The final error is dominated by the error in $u - u_I$ over the
approximate domain, plus the error studied earlier due to change in domain.
There remains a further possibility: to work on a curved domain $\Omega$ in the
given $x$-$y$ coordinates and to interpolate $v^h = g$ at boundary nodes. Presumably
the integrations would be done numerically, especially over the curved
elements at the boundary, although the experiments in [C14] were carried
out by exact algebraic operations. In this method the boundary values vary
from one trial function to another, except of course at the nodes themselves;
the differences between the trial functions are small but not zero on $\Gamma$. The
Ritz rules are therefore violated, and the theoretical questions which arise
are:
a. How large can these differences on the boundary become?
b. Is the error $u - u^h$ affected by this worst possible behavior on $\Gamma$, or
does $u^h$ still yield an optimal approximation to $u$ from $S^h$?
These questions are relevant even with homogeneous boundary conditions,
$u = g = 0$ on $\Gamma$. In this case the trial functions are zero at the boundary
nodes; question a asks how large they can be on the rest of $\Gamma$. Surprisingly,
the answer is independent of their degree. The maximum size of $v^h$ on $\Gamma$ is the
same whether $v^h$ is linear, vanishing only at the vertices of the boundary
triangles, or quadratic or cubic, vanishing at one or two additional nodes on
each piece of $\Gamma$. It is even the same for the cubic space $Z_3$, with values and
true tangential derivatives set to zero at boundary vertices. The answer is:
There exists a particular trial function $V^h$ of unit energy whose mean value on
$\Gamma$ is of order $h^{3/2}$:

$\iint_\Omega (V_x^h)^2 + (V_y^h)^2 = 1, \qquad \int_\Gamma |V^h|\,ds \sim h^{3/2}.$
To answer question b, that is, to find a bound on $u - u^h$, it will be necessary
to extend the classical Ritz theory. It is true that the first variations
still vanish at the minimizing functions $u$ and $u^h$:

$a(u, v) = (f, v)$ for $v$ in $\mathcal{H}_0^1$,   $a(u^h, v^h) = (f, v^h)$ for $v^h$ in $S_0^h$.

However, even with $g = 0$, $S^h$ is not a subspace of $\mathcal{H}_0^1$; in the inhomogeneous
case $g \ne 0$, $S_0^h$ is not contained in $\mathcal{H}_0^1$. Nevertheless, we may still appeal to
Green's formula: Since $-\Delta u = f$,

$a(u, v^h) = \iint_\Omega u_x v_x^h + u_y v_y^h = \iint_\Omega f v^h + \int_\Gamma u_n v^h\,ds.$

Subtracting $a(u^h, v^h) = (f, v^h)$, the boundary term remains:

(26)   $a(u - u^h, v^h) = \int_\Gamma u_n v^h\,ds$ for all $v^h$ in $S_0^h$.
It is this term which controls the error due to violating the essential boundary
conditions.
We propose to show that question b usually has the worst possible answer:
The order of error will be decided by the trial function $V^h$ which is largest
on $\Gamma$. In fact, with $v^h = V^h$ in (26), that is easy to see. Since $V^h$ has unit
energy, the left side is bounded by the square root of $a(u - u^h, u - u^h)$;
this is the Schwarz inequality. On the right side, $V^h$ is of average value $h^{3/2}$,
and unless there is some special cancellation (see below!) we must expect the
integral of $u_n V^h$ to be of the same order. Therefore,

$a(u - u^h, u - u^h) \ge ch^3.$
This would mean that interpolating the boundary conditions and integrating
over $\Omega$ is no better, even for an element of high degree, than replacing
$\Omega$ by a polygon $\Omega^h$. (At least, the exponent of $h$ is no better: we have no
estimate of the constants involved.) One explanation is this: Each polynomial
which interpolates $u = 0$ at boundary nodes will vanish along a curve which
runs close to the true boundary, but the two curves may differ by as much as
$O(h^2)$, the same distance as for a polygon $\Omega^h$.
It is shown in [B10] that $h^{3/2}$ is also an upper bound for the error due to
interpolating the boundary conditions. From the manipulations in that
paper, a second rather surprising fact emerges: Even though the accuracy is
limited by the existence of an undesirable $V^h$ of order $h^{3/2}$ on $\Gamma$, nevertheless
the true displacement error $u - u^h$ is actually of order $h^3$ on $\Gamma$. In other
words, the boundary values of the Ritz solution will look deceptively good.
It must be the normal derivatives at the boundary which are most in error.
We consider now the possibility of cancellation in the integral $\int u_n v^h\,ds$ on
the right side of (26). If the boundary values $v^h$ happen to oscillate around
zero, then even though their average modulus $|v^h|$ may be as large as $h^{3/2}$,
the integral itself will tend to be an order of magnitude smaller. This oscillation
does occur for some elements, but not for all. It occurs for quadratics,
for example, when the condition $u = 0$ (or $u = g$) is interpolated at boundary
vertices, and at the points on $\Gamma$ which lie halfway between. The leading term
in $v^h$ is a cubic $s(s - h/2)(s - h)$ in the arc length variable $s$ on $\Gamma$, vanishing
at boundary nodes, and this cubic has average value zero. In other words,
$v^h$ oscillates around zero. Correspondingly, Scott has shown that the error
in energy due to interpolating $u = g$ is improved to $O(h^{5/2})$. (This does not
occur for cubics [B10], unless the nodal points are specially placed on $\Gamma$ to
give $v^h$ an average value of zero.) Since the ordinary approximation error for
quadratics is $O(h^2)$ in energy, the boundary error now appears perfectly
acceptable. We don't know whether this alternative to isoparametrics for
boundary triangles, namely integration over curved elements directly in the
$x$-$y$ plane, has a useful future.
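The cancellation claimed for the cubic $s(s - h/2)(s - h)$ can be verified directly. The sketch below (function names ours) integrates the cubic exactly over one boundary edge $0 \le s \le h$ and, for contrast, estimates the mean of its modulus by sampling.

```python
import numpy as np

def boundary_cubic(h):
    """The cubic s (s - h/2) (s - h) as a numpy Polynomial in s."""
    coeffs = np.polynomial.polynomial.polyfromroots([0.0, h / 2.0, h])
    return np.polynomial.Polynomial(coeffs)

def mean_over_edge(h):
    """Exact average of the cubic over 0 <= s <= h, via its antiderivative."""
    antiderivative = boundary_cubic(h).integ()
    return (antiderivative(h) - antiderivative(0.0)) / h

def mean_modulus_over_edge(h, samples=20001):
    """Sampled average of |cubic| over 0 <= s <= h."""
    s = np.linspace(0.0, h, samples)
    return float(np.mean(np.abs(boundary_cubic(h)(s))))

h = 0.1
print(mean_over_edge(h), mean_modulus_over_edge(h), h**3 / 32.0)
```

The signed average vanishes (the cubic is odd about the midpoint $s = h/2$), while the average modulus is $h^3/32$: the boundary values are not small, but they oscillate about zero, which is what allows the integral in (26) to gain an order.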

4. The final boundary problem on our list arises from an inhomogeneous
natural condition

$u_n + \alpha(x, y)u = b(x, y)$ on $\Gamma$, $\alpha \ge 0$.

For Poisson's equation, this leads to the appearance of boundary integrals in
the potential energy:

$I(v) = \iint_\Omega v_x^2 + v_y^2 + \int_\Gamma \alpha v^2 - 2\iint_\Omega fv - 2\int_\Gamma bv.$
Strictly speaking, there are three questions to consider in relation to
these boundary integrals: polynomial approximation on $\Gamma$, numerical
integration, and change of domain. The last two arise because, when $\Gamma$ is curved,
it will be nearly impossible to compute the boundary integrals exactly. Some
numerical procedure will be adopted, and in case the trial functions are defined
on $\Omega^h$ rather than $\Omega$, the integrals will be shifted to $\Gamma^h$. We propose to omit
a detailed analysis of these errors on $\Gamma^h$, because we are confident that they
are no larger than those already studied on the interior $\Omega^h$. A proof would be
very desirable.
The novel question is that of polynomial approximation on $\Gamma$. It arises in
the usual way: The method minimizes the error in energy,

$a(u - u^h, u - u^h) = \min \iint_\Omega (u - v^h)_x^2 + (u - v^h)_y^2 + \int_\Gamma \alpha(u - v^h)^2\,ds.$
For a finite element space of degree $k - 1$, and the choice $v^h = u_I$, the
integral over $\Omega$ is of order $h^{2(k-1)}$. Fortunately, the integral over $\Gamma$ is even of
higher order; the rate of convergence is not reduced by the presence of boundary
integrals. This is obvious if $\Gamma$ is composed of straight lines; the restriction of
the trial functions to $\Gamma$ yields a complete polynomial of degree $k - 1$ in the
boundary variable $s$, and the integral over $\Gamma$ is of order $h^{2k}$. Nitsche [6] has
obtained the same result for a curved boundary.
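For a straight boundary the order $h^{2k}$ is just the one-dimensional interpolation estimate, and it can be observed numerically. The sketch below treats the simplest case $k = 2$ (piecewise linear interpolation) on a straight boundary piece of unit length, with $\alpha = 1$ and $\sin s$ standing in for the boundary values of $u$; these choices and the function name are ours.

```python
import numpy as np

def boundary_l2_error_sq(n_edges, g=np.sin, length=1.0):
    """Integral of (g - g_I)^2 over [0, length], g_I piecewise linear."""
    nodes = np.linspace(0.0, length, n_edges + 1)
    x, w = np.polynomial.legendre.leggauss(4)    # Gauss rule on [-1, 1]
    total = 0.0
    for a, b in zip(nodes[:-1], nodes[1:]):
        s = 0.5 * (b - a) * x + 0.5 * (a + b)    # Gauss points on the edge
        interp = g(a) + (g(b) - g(a)) * (s - a) / (b - a)
        total += 0.5 * (b - a) * np.sum(w * (g(s) - interp) ** 2)
    return total

e_coarse = boundary_l2_error_sq(16)
e_fine = boundary_l2_error_sq(32)
observed_order = np.log2(e_coarse / e_fine)
print(observed_order)
```

Halving $h$ reduces the boundary integral by a factor close to 16, an observed order near $4 = 2k$, which is two orders better than the $h^{2(k-1)} = h^2$ contributed by the interior integral.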
With isoparametric elements the boundary in the $\xi$-$\eta$ plane is straight,
and all the boundary integrals can be computed directly by numerical
integration. In fact, this seems to be the main conclusion of the boundary theory
for second-order problems: The isoparametric technique establishes a local
change of coordinates into normal and tangential directions, which is more
accurate and convenient than was ever achieved with finite differences.
5 STABILITY

5.1. INDEPENDENCE OF THE BASIS
In one sense, there should be no problem of stability. The solution of an
elliptic variational problem depends continuously on the data; if the load
$f$ and all prescribed boundary displacements and forces are small, then the
strain energy in $u$ is small. In other words, the problem is well posed.
Furthermore, regardless of the choice of the subspace $S^h$, the strain energy in the
Ritz approximation $u^h$ is automatically bounded by that in $u$; the Ritz method
projects $u$ onto $S^h$, and this can only reduce the energy (corollary to Theorem
1.1). Therefore, the approximate problems are uniformly well posed, and it
should always be possible to construct a "stable" procedure for the numerical
calculation of $u^h$.
The difficulty is that to achieve total numerical stability-conceding noth-
ing as the mesh size is reduced to zero-the required algorithm may simply
be too fancy. For the standard five-point difference scheme, one could
systematically use all the constraints on the coefficient matrix which come
from consistency with the Laplace operator: The sums along each row of
the matrix, as well as the first moments, vanish at all stages of the Gauss
elimination process. Por finite elements of irregular shape, however, the cor-
'
responding constraints will be extremely difficult to use. Therefore, we shall
analyze the standard elimination algorithm, accepting an increase in roundoff ·
error as h - t O but intending that the numerical stability should be as fool-
proof as possible; unnecessary numerical instabilities will not be accepted.
It is known that the key to stability lies in the uniform linear independence
of the basis elements rp r E ven though uh is complete] y independent of the
choice of basis, the roundoff which enters its computed value uh does depend
205
206 STABILITY CHAP. 5 SEC. 5.1.
on this choice. (Mikhlin refers to the "strong minimality'' of the basis, and roundoff error would not mean a lat
Soviet authors are generally careful to consider Ílumerical methods from is small, but at the particular node z1
this point of view, as well as for stability with respect to coefficient changes tively poor .
in the differential equation.) To quantify the linear independence of the basis, . The standard rule in scaling aspa
the standard procedure is to consider the Gram matrix (in physical terms, the all the diagonal en tries equa/: in the f
mass ma~rix) whose entries are the inner products of the basis elements: governed by the stiffness matrix, thc
1\ in the basis elements are equal. This
D which is nearly optimal [V3]. The qt
lar mesh for finite elements of Hermi
and deriva ti ves appear among the unk1
In many cases we shall prefer, because the Ritz method operates always with is to keep the unknowns dimensional
the energy a(v, v) which is intrinsic to the problem, to work with the stiffness hv~ rather than v~ itself.
matrix: K1k = a(rp1 , rpJ. Both matrices are Hermitian and positive definite. Fried [F17] has observed that sorne
As a first measure of the independence of the basis, we propase the while others cannot. Take, for example
condition number · say -u" f with piecewise linear ei
K(M) = Amax(M). length h, except that the first is of len1
Amin(M) a natural condition u'(O) = O or an es,
tiónal respectively to
If the basis were orthonormal, M would be the identity matrix and K l.
This is not the case for finite elements, but the important fact on a regular -e
mesh is that thefinite element basisfunctions are uniformly linear/y independent: 1 +e
K(M) constant. In other words, the eigenvalues of M are all of the same
order. As Schultz observed, this makes piecewise polynomials infinitely more
stable for least-squares approximation (which is just the Ritz method applied
to the differential equation u= f of order zero) than the sequence 1, x, y,
-1 2
J
x 2 , ••• of ordinary polynomials. The condition number for this sequence, The largest eigenvalue grows with e,
whose mass matrix is the Hilbert matrix (1.5), would increase exponentially. order N- 2 , N order of the matrix.
There are applications in which a more realistic measure of independence is provided by the optimal condition number

$$c(M) = \min_D \kappa(DMD).$$

Here $D$ may be any positive diagonal matrix, corresponding to a rescaling of the basis elements; $c = 1$ if the original basis is only orthogonal rather than orthonormal. With irregular elements, some trial functions may be much smaller than others, and this rescaling could make a significant difference to the condition number. We view rescaling in the following way: When the condition number of $M$ or of $K$ is improved by rescaling, that suggests that the numerical difficulties which are thereby cured were never likely to propagate throughout the whole problem. Rescaling cures local difficulties; if a particular $\varphi_j$ is badly out of scale, say the diagonal entry $K_{jj}$ is too small, then roundoff will destroy confidence in the computed value of the weight $Q_j$. The effect which is measured depends on the choice of norm; such a roundoff error would not mean a large mistake in the energy, because $\varphi_j$ is small, but at the particular node $z_j$ the approximation may be comparatively poor.

The standard rule in scaling a sparse positive definite matrix is to keep all the diagonal entries equal: in the finite element case, which is primarily governed by the stiffness matrix, that means that the strain energies $K_{jj}$ in the basis elements are equal. This rule yields a diagonal scaling matrix $D$ which is nearly optimal [V3]. The question of scaling arises even on a regular mesh for finite elements of Hermite type, in which both function values and derivatives appear among the unknowns $Q_j$. Perhaps a natural procedure is to keep the unknowns dimensionally correct, by using the rotation $\theta = h v_j'$ rather than $v_j'$ itself.

Fried [F17] has observed that some problems can be successfully rescaled, while others cannot. Take, for example, a two-point boundary-value problem, say $-u'' = f$ with piecewise linear elements. If the subintervals are all of length $h$, except that the first is of length $h/c$, then the stiffness matrices with a natural condition $u'(0) = 0$ or an essential condition $u(0) = 0$ are proportional respectively to

$$K_{\text{nat}} = \begin{pmatrix} c & -c & & \\ -c & 1+c & -1 & \\ & -1 & 2 & \cdot \\ & & \cdot & \cdot \end{pmatrix} \quad\text{or}\quad K_{\text{ess}} = \begin{pmatrix} 1+c & -1 & \\ -1 & 2 & \cdot \\ & \cdot & \cdot \end{pmatrix}.$$

The largest eigenvalue grows with $c$, whereas the smallest is of the usual order $N^{-2}$, where $N$ is the order of the matrix. Therefore, the condition deteriorates as $c \to \infty$.

Scaling produces diagonal entries all equal to 2, and leaves the tail of the matrices unchanged:

$$DK_{\text{nat}}D = \begin{pmatrix} 2 & -2\sqrt{c/(1+c)} & & \\ -2\sqrt{c/(1+c)} & 2 & -\sqrt{2/(1+c)} & \\ & -\sqrt{2/(1+c)} & 2 & -1 \\ & & -1 & \cdot \end{pmatrix}, \qquad DK_{\text{ess}}D = \begin{pmatrix} 2 & -\sqrt{2/(1+c)} & \\ -\sqrt{2/(1+c)} & 2 & -1 \\ & -1 & \cdot \end{pmatrix}.$$

The largest eigenvalues are now bounded; the difficult question is always the size of $\lambda_{\min}$. In this case there is no problem with the second matrix; $c$ has very little effect on $\lambda_{\min}$, which is again of order $N^{-2}$. The first matrix, however, begins with a block of order 2 which is nearly singular for large $c$, and the smallest eigenvalue for the whole matrix is necessarily below the eigenvalues for this block. Thus $\lambda_{\min} \to 0$ as $c \to \infty$. We conclude that rescaling tends to be successful with an essential condition but not with a natural condition.
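The contrast between the two rescaled matrices can be observed numerically. The following is a minimal sketch in plain Python, not taken from the text: the matrix order `n`, the essential condition assumed at the far end of the interval (so that the last diagonal entry is also 2), the dropped factor $1/h$, and the helper names `count_below`, `smallest_eigenvalue` and `scaled_matrices` are all illustrative assumptions. The smallest eigenvalue of each symmetric tridiagonal matrix is located by a Sturm-sequence count combined with bisection.

```python
import math

def count_below(diag, off, x):
    """Number of eigenvalues below x for a symmetric tridiagonal matrix (Sturm count)."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        coupling = off[i - 1] ** 2 / q if i > 0 else 0.0
        q = d - x - coupling
        if q == 0.0:
            q = 1e-300  # avoid division by zero on an exact hit
        if q < 0.0:
            count += 1
    return count

def smallest_eigenvalue(diag, off, lo, hi, steps=200):
    """Bisection for the smallest eigenvalue, assumed to lie in [lo, hi]."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def scaled_matrices(c, n):
    """D K D for the natural and essential cases, as (diag, off) pairs of order n."""
    s1 = 2.0 * math.sqrt(c / (1.0 + c))  # first coupling, natural case
    s2 = math.sqrt(2.0 / (1.0 + c))      # coupling next to the short element
    natural = ([2.0] * n, [-s1, -s2] + [-1.0] * (n - 3))
    essential = ([2.0] * n, [-s2] + [-1.0] * (n - 2))
    return natural, essential

n = 20  # illustrative matrix order (assumption)
results = {}
for c in (1.0, 1e2, 1e4, 1e6):
    natural, essential = scaled_matrices(c, n)
    # Both matrices are positive definite with diagonal 2, so lambda_min lies in (0, 4).
    results[c] = (smallest_eigenvalue(*natural, 0.0, 4.0),
                  smallest_eigenvalue(*essential, 0.0, 4.0))
    print(c, results[c])
```

Under these assumptions the first column of the printed pairs (natural condition) should shrink toward zero as `c` grows, while the second column (essential condition) should stay at the size fixed by `n`.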
This is the numerical analogue of a physical situation which is well understood: A stiff system connected to earth by a soft spring is extremely unstable, whereas a firm connection (essential condition) is stable. It should be emphasized that while ill-conditioning from numerical sources is to be watched and degenerate elements are to be avoided, there will arise cases of physical ill-conditioning which cannot be altered, except perhaps by a change from stiffness method to force method, with stresses as unknowns. This situation occurs when there is a sharp change in the stiffness of the medium, or when Poisson's ratio approaches the limit $\nu = \tfrac{1}{2}$ of incompressibility [F18]. For shells there will be difficulties with large stiffness in the thickness direction, or with extremely thin shells. Roughly speaking, extensional modes can involve the ratio $(r/th)^2$ [20], whereas bending modes bring out the fourth-order (plate-like) aspect of the problem and the roundoff is proportional to $h^{-4}$.
To conclude this introduction, we must clarify the connection between the condition number of a matrix and its sensitivity to perturbations. The extreme case occurs when the given linear system $KQ = F$ has a load vector $F$ which coincides with a unit eigenvector of $K$, in particular with the eigenvector $V_{\max}$ corresponding to $\lambda_{\max}$. The solution is then $Q = V_{\max}/\lambda_{\max}$. Suppose this load vector is slightly perturbed by the eigenvector at the other extreme, so that $F = V_{\max} + \epsilon V_{\min}$. Then the solution becomes $Q = V_{\max}/\lambda_{\max} + \epsilon V_{\min}/\lambda_{\min}$. The relative change in the solution is therefore

$$\frac{|\delta Q|}{|Q|} = \epsilon\,\frac{\lambda_{\max}}{\lambda_{\min}} = \kappa\epsilon.$$

Thus a perturbation in $F$ of order $\epsilon$ is amplified to a perturbation in $Q$ of order $\kappa\epsilon$.

It is easy to see that this case is extreme, and that always

$$\frac{|\delta Q|}{|Q|} \le \kappa\,\frac{|\delta F|}{|F|}.$$

Proof. $|F| = |KQ| \le \lambda_{\max}|Q|$, and $|\delta F| = |K\,\delta Q| \ge \lambda_{\min}|\delta Q|$.
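The extreme case described above can be reproduced on a 2 x 2 system. The following is a minimal sketch under stated assumptions: the eigenvalues `lam_max` and `lam_min`, the eigenvector angle, the size `eps` of the perturbation, and the helper `solve_2x2` are illustrative choices, not values from the text. The load is the eigenvector for $\lambda_{\max}$, the perturbation is $\epsilon$ times the eigenvector for $\lambda_{\min}$, and the observed amplification is compared with $\kappa$.

```python
import math

def solve_2x2(K, F):
    """Solve K Q = F for a 2 x 2 matrix by Cramer's rule."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return [(F[0] * K[1][1] - K[0][1] * F[1]) / det,
            (K[0][0] * F[1] - F[0] * K[1][0]) / det]

def norm(v):
    return math.hypot(v[0], v[1])

# Illustrative eigenvalues and eigenvector angle (assumptions, not from the text).
lam_max, lam_min, angle = 100.0, 0.01, 0.3
kappa = lam_max / lam_min
v_max = [math.cos(angle), math.sin(angle)]
v_min = [-math.sin(angle), math.cos(angle)]

# K = lam_max * v_max v_max^T + lam_min * v_min v_min^T (symmetric positive definite).
K = [[lam_max * v_max[i] * v_max[j] + lam_min * v_min[i] * v_min[j]
      for j in range(2)] for i in range(2)]

eps = 1e-3
F = v_max
F_perturbed = [v_max[i] + eps * v_min[i] for i in range(2)]

Q = solve_2x2(K, F)
Q_perturbed = solve_2x2(K, F_perturbed)
dQ = [Q_perturbed[i] - Q[i] for i in range(2)]
dF = [F_perturbed[i] - F[i] for i in range(2)]

rel_change_Q = norm(dQ) / norm(Q)
rel_change_F = norm(dF) / norm(F)
amplification = rel_change_Q / rel_change_F  # should be close to kappa

print(kappa, rel_change_F, rel_change_Q, amplification)
```

With any other load or perturbation direction the ratio `amplification` would be expected to fall below `kappa`, in line with the inequality.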
The consequences of this simple inequality, and a parallel one with $|\delta K|/|K|$ on the right side [20], are very far-reaching. If the condition number is $\kappa \sim 10^s$, as many as $s$ significant digits may be lost during the solution of $KQ = F$. If this is close to the number of digits carried by the computer, it may require double precision to protect the accuracy of the computed result.

It has been objected that the condition number might have nothing to do with the load vector in a particular problem. The constant $\kappa$ may be, in fact it must be, at least partly pessimistic. Irons [12] has proposed several alternative numbers, which the computer can form as elimination proceeds, and which automatically take into account the scaling and the data $f$ of the given problem. We accept that for on-the-spot decisions (terminating a specific calculation or changing to double precision) computable quantities of this kind are the best. However, our goal here is to find some a priori measure of sensitivity, and for this purpose the condition number is very satisfactory. The rule to which it leads, that the roundoff error will increase proportionately with $h^{-2m}$, is definitely obeyed in normal computations. We emphasize that the dependence is not on the total number of elements in the domain; it is on the number of elements per side. In other words, there is no significant dependence on the number of spatial dimensions.

5.2. THE CONDITION NUMBER
Our goal is to estimate the ratio $\kappa = \lambda_N(K)/\lambda_1(K)$ between the maximum and minimum eigenvalues of the stiffness matrix.

THEOREM 5.1.
For each variational problem and each choice of finite element there is a constant $c$ such that

$$(1)\qquad \kappa \le c\,h^{-2m}.$$

The constant depends inversely on the smallest eigenvalue $\lambda_1$ of the given continuous problem, and it increases if the geometry of the elements becomes degenerate.

We give two proofs, both correct but rather informal. The first applies only to the special case of a regular mesh, and illustrates how Fourier (or Toeplitz) analysis permits a quite precise computation of the eigenvalues and condition number. The second applies to finite elements of arbitrary shape.

To begin, suppose that the mesh is regular and that the coefficients in the given problem $Lu = f$ are constants. The element stiffness matrices $k_i$, and also the mass matrices $m_i$, will all be identical. This means that after assembly, $K$ and $M$ are essentially Toeplitz matrices. The entries of a Toeplitz matrix are constant along each diagonal: $K_{ij}$ depends only on the difference $j - i$ between the column and row indices. (This corresponds in the continuous case to an integral operator $\int K(s - t) f(t)\,dt$, in other words to a convolution, in which the kernel depends only on $s - t$.) Although this property may be lost at boundaries, particularly with natural boundary conditions, we want to show how calculations with Toeplitz matrices are still useful and easy.

Suppose we compare two stiffness matrices, both arising from $-u'' = f$ with piecewise linear elements. The first is of order $N$, and it is constrained by $u(0) = u'(\pi) = 0$; it has been the basic example throughout the book. The second is formed with no boundaries whatsoever, extending the interval $[0, \pi]$ all the way to $(-\infty, \infty)$ by adding on more and more element matrices:

$$K_\infty = \frac{1}{h}\begin{pmatrix} \cdot & \cdot & & & \\ -1 & 2 & -1 & & \\ & -1 & 2 & -1 & \\ & & -1 & 2 & -1 \\ & & & \cdot & \cdot \end{pmatrix}.$$

Since $K$ is formed by constraining $K_\infty$ (eliminating all elements outside $[0, \pi]$ and imposing the essential condition $Q_0 = 0$), the extreme eigenvalues of $K$ are enclosed by those of $K_\infty$:

$$\lambda_{\min}(K_\infty) \le \lambda_{\min}(K), \qquad \lambda_{\max}(K) \le \lambda_{\max}(K_\infty).$$

The same will be true for the mass matrices, and therefore linear independence of the basis can be tested first on an infinite interval.

It is simple to work with the Toeplitz matrix $K_\infty$, which can also be described as a discrete convolution matrix. Its eigenvectors are pure exponentials, just like the eigenfunctions of any constant-coefficient differential equation. We try the vector whose components are $v_j = e^{ij\theta}$, and apply the matrix $K_\infty$ to it:

$$(K_\infty v)_j = \frac{1}{h}\left(-e^{-i\theta} + 2 - e^{i\theta}\right) v_j.$$

Therefore, the eigenvalue is

$$\lambda(\theta) = \frac{1}{h}\left(2 - e^{-i\theta} - e^{i\theta}\right) = \frac{2(1 - \cos\theta)}{h}.$$

Since this number ranges between 0 and $4/h$, we conclude that

$$0 \le \lambda_{\min}(K), \qquad \lambda_{\max}(K) \le \frac{4}{h}.$$

In this case the result coincides with the conclusion of Gerschgorin's theorem (1.4): Every eigenvalue lies in the circle with center given by the main diagonal $K_{ii} = 2/h$ and radius given by the absolute row sums $\sum_{j \ne i} |K_{ij}| = 2/h$. In general, Gerschgorin will not be nearly so good: Even for the bilinear elements below, with off-diagonal entries all of the same sign, it is not precise.
The same technique applies to the mass matrices, which have $h/6$, $4h/6$, $h/6$ along each row. The eigenvalues in the infinite case are

$$\mu(\theta) = \frac{h}{6}e^{-i\theta} + \frac{4h}{6} + \frac{h}{6}e^{i\theta} = \frac{h}{3}(2 + \cos\theta).$$

Since $\cos\theta$ ranges from $-1$ to $1$, the eigenvalues of $M_\infty$ lie between $h/3$ and $h$. Notice the way in which the mass matrices are better: We obtain not only the correct upper bound on the largest eigenvalue, but also a good lower bound on $\lambda_{\min}(M)$. The condition number of $M$ is given very accurately by

$$\kappa(M) \le \kappa(M_\infty) = \frac{h}{h/3} = 3.$$

Since this is independent of $h$, the piecewise linear roof functions are linearly independent uniformly as $h \to 0$. The zero lower bound for $\lambda_{\min}(K)$ arose from the rigid-body motions still allowed in $K_\infty$: a constant function satisfies $-u'' = 0$ on the whole line, corresponding to a discrete eigenvector $(\ldots 1\ 1\ 1 \ldots)$ which is eliminated from the finite matrix $K$ only by the essential boundary condition. Therefore, we shall eventually have to argue by way of the mass matrices to achieve a rigorous estimate for $\kappa(K)$.
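The Fourier computation for $K_\infty$ and $M_\infty$ can be checked directly. The following is a minimal sketch in plain Python; the helper names `apply_stencil`, `k_symbol` and `m_symbol`, and the sample values of `h`, `theta` and `j`, are illustrative assumptions. It applies one row of each infinite matrix to the exponential vector $v_j = e^{ij\theta}$, compares the result with $2(1 - \cos\theta)/h$ and $h(2 + \cos\theta)/3$, and then samples both expressions over $-\pi \le \theta \le \pi$.

```python
import cmath
import math

def apply_stencil(stencil, theta, j):
    """Apply one row (a, b, c) of an infinite Toeplitz matrix to v_j = exp(i*j*theta)."""
    a, b, c = stencil
    v = lambda n: cmath.exp(1j * n * theta)
    return a * v(j - 1) + b * v(j) + c * v(j + 1)

def k_symbol(theta, h):
    """Eigenvalue of K_infinity for frequency theta: 2(1 - cos theta)/h."""
    return 2.0 * (1.0 - math.cos(theta)) / h

def m_symbol(theta, h):
    """Eigenvalue of M_infinity for frequency theta: h(2 + cos theta)/3."""
    return h * (2.0 + math.cos(theta)) / 3.0

h = 0.1            # illustrative mesh width (assumption)
theta, j = 0.7, 5  # illustrative frequency and row index (assumptions)

k_row = (-1.0 / h, 2.0 / h, -1.0 / h)
m_row = (h / 6.0, 4.0 * h / 6.0, h / 6.0)
v_j = cmath.exp(1j * j * theta)

# The exponential vector is reproduced up to the scalar factor given by each formula.
k_residual = abs(apply_stencil(k_row, theta, j) - k_symbol(theta, h) * v_j)
m_residual = abs(apply_stencil(m_row, theta, j) - m_symbol(theta, h) * v_j)

# Sample both formulas over -pi <= theta <= pi to find their ranges.
thetas = [-math.pi + 2.0 * math.pi * n / 400 for n in range(401)]
k_values = [k_symbol(t, h) for t in thetas]
m_values = [m_symbol(t, h) for t in thetas]
kappa_m_inf = max(m_values) / min(m_values)

print(k_residual, m_residual)
print(min(k_values), max(k_values), kappa_m_inf)
```

The sampled stiffness values should fill the interval from 0 to $4/h$, and the ratio of the extreme mass values should come out as 3 whatever `h` is chosen.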
:rices, and therefore linear independence First, to give a better understanding of Koo, we study two more examples:
infinite interval.
plitz matrix Koo, which can also be de- l. Bilinear elements on a square mesh, for -ll.u = f There is one un-
x. Its eigenvectors are pure exponentials, known at every node, anda typical row of K (multiplied by 3h) shows an 8
mstant-coe:fficient differential equation. on the main diagonal and eight -1 'son t~e adjacent diagonals. The equation
cts are vj = éj8 , and apply the matrix KQ =:=Pis
where (j', k') represent the eight nearest neighbors of the point (j, k) on a
square grid. Again the eigenvectors v of Koo are pure exponentials, but now
there are two frequencies: the components are vjk = eiUB+krpl. Calculating
the eigenvalues,
_ eio) = 2(1 - cos {}).
h
3hKoovjk = (8- e¡o- e-io- é"'- e-irp- ei<B+rpl - e-i(B+rpl
nd 4/h, we conclude that - ei<o-"'> - e-i(B-"'>)vjk"
The extreme eigenvalues come from maximizing and minimizing the expres-
sion in parentheses. Since it is linear in cos (} and cos rp, the extrema must
occur where these equal ± 1, and we find
e conclusion of Gerschgorin's theorem
with center given by the main diagonal
1solute row sums I:j*i 1 Kij 1 = 2/h. In (at (} = re, rp = 0).
so good: E ven for the bilinear elements
he same sign, it is not precise. Gerschgorin's argument would give Amax < I6j3h.
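The search over the four $\pm 1$ combinations in the bilinear example can be carried out explicitly. The following is a minimal sketch in plain Python, assuming the normalization by $3h$ quoted in the text; the function name `bilinear_symbol` and the value of `h` are illustrative assumptions. It evaluates the bracket at the four corner frequencies, compares the largest value with the Gerschgorin bound, and confirms by a coarse scan that no interior frequency does better.

```python
import math
from itertools import product

def bilinear_symbol(theta, phi, h):
    """Eigenvalue of K_infinity for v_jk = exp(i(j*theta + k*phi)).

    The row has 8 on the diagonal and -1 for each of the eight neighbors,
    all divided by 3h (the normalization used in the text).
    """
    bracket = (8.0 - 2.0 * math.cos(theta) - 2.0 * math.cos(phi)
               - 2.0 * math.cos(theta + phi) - 2.0 * math.cos(theta - phi))
    return bracket / (3.0 * h)

h = 0.1  # illustrative mesh width (assumption)

# The bracket is linear in cos(theta) and in cos(phi), so the extremes occur
# where both cosines equal +1 or -1, that is, for theta and phi in {0, pi}.
corners = {(t, p): bilinear_symbol(t, p, h)
           for t, p in product((0.0, math.pi), repeat=2)}
lam_min = min(corners.values())
lam_max = max(corners.values())

# Gerschgorin: center 8/(3h) and radius 8/(3h).
gerschgorin_bound = 16.0 / (3.0 * h)

# A coarse scan over the whole frequency square should not beat the corner values.
grid = [-math.pi + 2.0 * math.pi * n / 60 for n in range(61)]
scan_max = max(bilinear_symbol(t, p, h) for t in grid for p in grid)

print(lam_min, lam_max, gerschgorin_bound, scan_max)
```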
2. Cubic elements in one dimension for $u^{(iv)} = f$. There are now two unknowns at every node, and $KQ = F$ is a coupled system of two difference equations. Correspondingly, $K$ and $K_\infty$ are block Toeplitz matrices: there is a 2 x 2 block appearing over and over on the main diagonal, flanked on one side by another such block and on the other side by its transpose. From Section 1.7,

$$K_\infty = \begin{pmatrix} \cdot & \cdot & & \\ B^T & A & B & \\ & B^T & A & B \\ & & \cdot & \cdot \end{pmatrix}, \qquad A = \frac{1}{h^3}\begin{pmatrix} 24 & 0 \\ 0 & 8h^2 \end{pmatrix}, \qquad B = \frac{1}{h^3}\begin{pmatrix} -12 & 6h \\ -6h & 2h^2 \end{pmatrix}.$$

In analogy with the previous examples, the eigenvalues now come from the block $\lambda(\theta) = B^T e^{-i\theta} + A + B e^{i\theta}$. This is itself a 2 x 2 matrix; its eigenvalues, maximized over the whole range $-\pi \le \theta \le \pi$, yield $\lambda_{\max}(K_\infty)$. Again the extreme case must occur at $\theta = 0$ or $\theta = \pi$, since the Rayleigh quotient $x^T\lambda(\theta)x / x^T x$ is linear in $\cos\theta$. We compute

$$\lambda(0) = \frac{1}{h^3}\begin{pmatrix} 0 & 0 \\ 0 & 12h^2 \end{pmatrix}, \qquad \lambda(\pi) = \frac{1}{h^3}\begin{pmatrix} 48 & 0 \\ 0 & 4h^2 \end{pmatrix}.$$

Therefore, $\lambda_{\max}(K_\infty)$ is the larger of $12/h$ and $48/h^3$. The result can be influenced by altering the relative scaling of the two types of basis functions, corresponding to displacement and slope.

The extreme eigenvalues of any block Toeplitz matrix $K_\infty$ can be found in the same way. If the blocks are of order $M$ (the number of unknowns per mesh square), then $\lambda(\theta)$ will be a matrix of this order; $\theta$ represents $\theta_1, \ldots, \theta_n$ in an $n$-dimensional problem. Since each square is connected only to its nearest neighbors in the nodal finite element method, the matrix $\lambda$ will be linear in $\cos\theta_1, \ldots, \cos\theta_n$. Therefore, $\lambda_{\max}(K_\infty)$ can be computed by trying all possible $\pm 1$ combinations for these cosines and evaluating the largest eigenvalues of the $2^n$ resulting matrices $\lambda$. The condition number for a pure Toeplitz matrix can thus be computed exactly by working only with matrices of order $M$ (which is less than the order of the element matrices and is given for common elements in the tables of Section 1.8). For the mass matrices (also denoted by $M$!) we obtain $\kappa(M) \le \kappa(M_\infty)$, and this is a constant depending only on the element. Finite elements are uniformly independent.
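For the cubic example, the two extreme blocks can be formed by a few lines of arithmetic. The following is a minimal sketch in plain Python; the helper names `mat_add`, `mat_scale`, `transpose` and `sym_eigs_2x2`, and the value of `h`, are illustrative assumptions. It builds $A$ and $B$ as quoted from Section 1.7, forms $\lambda(0) = B^T + A + B$ and $\lambda(\pi) = A - B - B^T$, and takes the largest eigenvalue of the two.

```python
def mat_add(*ms):
    """Entrywise sum of 2 x 2 matrices stored as nested lists."""
    return [[sum(m[i][j] for m in ms) for j in range(2)] for i in range(2)]

def mat_scale(s, m):
    """Multiply a 2 x 2 matrix by the scalar s."""
    return [[s * m[i][j] for j in range(2)] for i in range(2)]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def sym_eigs_2x2(m):
    """Eigenvalues of a real symmetric 2 x 2 matrix, smallest first."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = 0.5 * (a + d)
    radius = (0.25 * (a - d) ** 2 + b * b) ** 0.5
    return mean - radius, mean + radius

h = 0.1  # illustrative mesh width (assumption)

# Blocks of the assembled Hermite cubic stiffness matrix for the fourth-order problem.
A = mat_scale(1.0 / h**3, [[24.0, 0.0], [0.0, 8.0 * h**2]])
B = mat_scale(1.0 / h**3, [[-12.0, 6.0 * h], [-6.0 * h, 2.0 * h**2]])

# lambda(theta) = B^T e^{-i theta} + A + B e^{i theta} at the two extreme frequencies.
lam_0 = mat_add(transpose(B), A, B)                                      # theta = 0
lam_pi = mat_add(mat_scale(-1.0, transpose(B)), A, mat_scale(-1.0, B))   # theta = pi

lam_max_inf = max(sym_eigs_2x2(lam_0)[1], sym_eigs_2x2(lam_pi)[1])
print(lam_0, lam_pi, lam_max_inf)
```

Rescaling the slope unknowns would change the entries of `A` and `B` and hence which of the two diagonal entries dominates, which is the effect of relative scaling mentioned in the text.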
For the stiffness matrix there is a good upper bound on $\lambda_{\max}$, but the lower bound is useless. At this point we have to abandon exact computations and return to inequalities. According to Rayleigh's principle, the smallest eigenvalue of $K$ is

$$(2)\qquad \lambda_1(K) = \min \frac{x^T K x}{x^T x} \ge \min \frac{x^T K x}{x^T M x}\,\min \frac{x^T M x}{x^T x} = \lambda_1(M^{-1}K)\,\lambda_1(M).$$

With the appearance of $M^{-1}K$, it becomes easy to use variational arguments. The eigenproblem $KQ = \lambda MQ$ is exactly the finite element analogue of the continuous problem $Lu = \lambda u$, and we show in Chapter 6 that the fundamental eigenvalue $\lambda_1(M^{-1}K)$ in the discrete problem always equals or exceeds the fundamental eigenvalue $\lambda_1(L)$ in the continuous problem. This is a lower bound which is independent of $h$.

In the example $Lu = -u''$, with boundary conditions $u(0) = u'(\pi) = 0$, the lowest eigenvalue is $\lambda_1(L) = \tfrac{1}{4}$. Since we have already proved that $\lambda_1(M) \ge \lambda_1(M_\infty) \ge h/3$, it follows that $\lambda_1(K) \ge h/12$. The exact value for the smallest eigenvalue of $K$ is $\sin^2(h/4)/(h/4)$, and therefore about $h/4$. The estimate (2) misses by a factor of 3, essentially the condition number of $M$, because the true minimum of $x^T K x / x^T x$ occurs at a fundamental mode $x$ whose components are all of one sign. This corresponds more closely to $\lambda_{\max}(M)$ than to the $\lambda_{\min}(M)$ which appears in (2).

The essential point, however, is that the exponent of $h$ in the condition number is correctly determined:

$$\kappa(K) = \frac{\lambda_{\max}(K)}{\lambda_{\min}(K)} \le \frac{4/h}{h/12} = \frac{48}{h^2}.$$

For an equation of order $2m$, in a regular domain, the same Toeplitz matrix approach [combined with (2)] yields a condition number of order $h^{-2m}$. The constant involves $\lambda_1(L)$, as it should; if the physical problem is poorly conditioned, that must be reflected in its finite element analogue.
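The bounds $h/12 \le \lambda_1(K)$ and $\kappa(K) \le 48/h^2$ can be compared with computed eigenvalues for the model problem. The following is a minimal sketch in plain Python under stated assumptions: the number of elements `N`, the layout of $K$ (unknowns at nodes $1, \ldots, N$, with the last diagonal entry $1/h$ because the node at $\pi$ belongs to only one element), and the helper names `count_below` and `kth_eigenvalue` are illustrative choices. The eigenvalues are located by a Sturm-sequence count with bisection.

```python
import math

def count_below(diag, off, x):
    """Number of eigenvalues below x for a symmetric tridiagonal matrix (Sturm count)."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        coupling = off[i - 1] ** 2 / q if i > 0 else 0.0
        q = d - x - coupling
        if q == 0.0:
            q = 1e-300  # avoid division by zero on an exact hit
        if q < 0.0:
            count += 1
    return count

def kth_eigenvalue(diag, off, k, lo, hi, steps=200):
    """Bisection for the k-th smallest eigenvalue (k = 1, ..., n), assumed in [lo, hi]."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

N = 32  # illustrative number of elements (assumption)
h = math.pi / N

# Stiffness matrix for -u'' = f with u(0) = u'(pi) = 0 and linear elements.
diag = [2.0 / h] * (N - 1) + [1.0 / h]
off = [-1.0 / h] * (N - 1)

lam_min = kth_eigenvalue(diag, off, 1, 0.0, 4.0 / h)
lam_max = kth_eigenvalue(diag, off, N, 0.0, 4.0 / h)
kappa = lam_max / lam_min

print(lam_min, h / 12.0, h / 4.0)
print(kappa, 48.0 / h**2)
```

Under these assumptions the computed `lam_min` should sit close to $h/4$ and well above the guaranteed $h/12$, so the bound $48/h^2$ overestimates `kappa` by roughly the factor of 3 discussed above.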
Second proof of Theorem 5.1.
The remaining problem is to extend this $h^{-2m}$ estimate of the condition number to the case of irregular elements. The Toeplitz structure falls apart, and the whole argument must be based on the individual element matrices. For this second proof we follow Fried [F14]. To find an upper bound for $\lambda_{\max}(K)$, we recall that the global stiffness matrix is assembled from the element matrices $k_i$ by

$$(3)\qquad x^T K x = \sum_i x_i^T k_i x_i.$$

The vector $x$ and matrix $K$ are of order $N$; the vector $x_i$ and matrix $k_i$ are of order $d_i$, the number of degrees of freedom within the $i$th element. In fact, $x_i$ is formed from $x$ just by striking out all except the appropriate $d_i$ components; we may think of multiplying $x$ by an incidence matrix, made up of 0's and 1's, to obtain $x_i$.
If $\Lambda$ is the maximum eigenvalue of all the element stiffness matrices, so that $x_i^T k_i x_i \le \Lambda\, x_i^T x_i$ by Rayleigh's principle, we conclude from (3) that

$$x^T K x \le \Lambda \sum_i x_i^T x_i.$$

Now suppose that $q$ is the maximum number of elements which meet at any node. Then no component of $x$ will appear in more than $q$ of the abbreviated vectors $x_i$, and $\sum x_i^T x_i \le q\, x^T x$. Therefore,

$$x^T K x \le \Lambda q\, x^T x \quad\text{and}\quad \lambda_{\max}(K) \le \Lambda q.$$

To find a lower bound on $\lambda_{\min}(M)$, we proceed in exactly the same way:

$$x^T M x = \sum_i x_i^T m_i x_i \ge \theta \sum_i x_i^T x_i \ge \theta r\, x^T x,$$

where $\theta$ is the smallest eigenvalue of any of the element mass matrices and $r$ is the minimum number of elements which meet at a node. (Often $r = 1$ or $r = 2$ because of elements at the boundary; certainly $r = 1$ if there are nodes internal to the elements.) Rayleigh's principle again gives

$$\lambda_{\min}(M) \ge \theta r.$$

Therefore, according to (2), the condition number of the stiffness matrix is less than

$$(4)\qquad \kappa(K) \le \frac{\lambda_{\max}(K)}{\lambda_1(M^{-1}K)\,\lambda_1(M)} \le \frac{\Lambda q}{\lambda_1(L)\,\theta r}.$$

If the geometry of the elements does not degenerate, so that the basis is uniform in the sense of Chapter 3, a direct calculation yields the expected $\kappa(K) = O(h^{-2m})$. If there is a degeneracy, so that triangles become very thin or rectangles approach triangles, it is reflected in the parameters $\Lambda$ and $\theta$. Since these are eigenvalues of small matrices, the effects of this degeneracy can be rigorously estimated. Fried [F14] has computed this dependence on the geometry for several examples (the estimate in Theorem 5.1 is sometimes pessimistic) and gives also a lower bound on the condition number.
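The two element-level bounds, $\lambda_{\max}(K) \le \Lambda q$ and $\lambda_{\min}(M) \ge \theta r$, can be tested on a small irregular mesh. The following is a minimal sketch in plain Python under stated assumptions: the mesh `lengths`, the use of linear elements in one dimension with element matrices $(1/h_i)\begin{pmatrix}1 & -1\\ -1 & 1\end{pmatrix}$ and $(h_i/6)\begin{pmatrix}2 & 1\\ 1 & 2\end{pmatrix}$, the decision to keep both end nodes as unknowns, and the function names are illustrative choices. Instead of computing eigenvalues, it evaluates the Rayleigh quotients for many random vectors, which can only approach the bounds from the safe side.

```python
import random

lengths = [0.05, 0.2, 0.1, 0.3, 0.15, 0.2]  # illustrative irregular mesh (assumption)

def stiffness_form(x):
    """x^T K x assembled element by element, as in equation (3)."""
    return sum((x[i + 1] - x[i]) ** 2 / h for i, h in enumerate(lengths))

def mass_form(x):
    """x^T M x assembled from element mass matrices (h/6) [[2, 1], [1, 2]]."""
    return sum(h * (x[i] ** 2 + x[i] * x[i + 1] + x[i + 1] ** 2) / 3.0
               for i, h in enumerate(lengths))

big_lambda = max(2.0 / h for h in lengths)  # largest element stiffness eigenvalue
theta = min(h / 6.0 for h in lengths)       # smallest element mass eigenvalue
q, r = 2, 1  # most and fewest elements meeting at a node of this mesh

random.seed(0)  # reproducible illustration
worst_k, worst_m = 0.0, float("inf")
for _ in range(2000):
    x = [random.uniform(-1.0, 1.0) for _ in range(len(lengths) + 1)]
    xx = sum(v * v for v in x)
    worst_k = max(worst_k, stiffness_form(x) / xx)  # stays below big_lambda * q
    worst_m = min(worst_m, mass_form(x) / xx)       # stays above theta * r

print(worst_k, big_lambda * q)
print(worst_m, theta * r)
```

Both bounds are governed here by the shortest element, which is how a degenerate element enters the parameters $\Lambda$ and $\theta$.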
He has also given a níce estímate for irreg~lar meshes in one dimension,
using the fact that a(uh, uh) a(u, u); the discrete structure is always stiffer
than the continuous one. Suppose we apply a point load at the jth node,
f ó(x- zj). Then it was observed in Section 1.1Ó that the load vector F
has only one nonzero component and that
: x by an incidence mairix, made up of Therefore, (K- 1)n is less than the true energy a(u, u), and is bounded inde-
pendently of the distribution of nodes. [We assume function values to be the unknowns, and a second-order problem $-(pu')' + qu = f$.] Since $K^{-1}$ is positive definite and thus $(K^{-1})_{ij}^2 \le (K^{-1})_{ii}(K^{-1})_{jj}$, the largest entry must occur on the main diagonal. Therefore, along each row of $K^{-1}$, the sum of the absolute values is bounded by $cN$; $N$ is the order of the matrix. We may take $N^{-1}$ to be the average mesh width $\bar{h}$. Along each row of $K$, the corresponding sum is bounded by $c/h_{\min}$, since $h_i$ appears in the denominator of the $i$th element matrix. Therefore, the condition number, even in the stronger sense of absolute row sums (corresponding to pointwise rather than mean-square bounds), is less than $C/(\bar{h}\,h_{\min})$. With a regular mesh we recover $C/h^2$.

The important conclusion is this: The roundoff error does not depend strongly on the degree of the polynomial element. It depends principally on $h$ and on the order and fundamental eigenvalue of the continuous problem. Therefore, the way to achieve numerical accuracy in the face of roundoff is to increase the degree of the trial functions. The condition number for cubics is only slightly worse than for linear elements, so that the roundoff errors for a given $h$ are comparable. The discretization error, however, is an order of magnitude smaller for cubics. Therefore, at the crossover point where roundoff prohibits any further improvement coming from a decrease in $h$, the cubic element is much more accurate. This applies especially to the computation of stresses, where differentiation (or differencing) of displacements introduces an extra factor $h^{-1}$ into the numerical error. Even in second-order problems the roundoff becomes significant, and an increase in the degree of the trial functions is the best way out.
6 EIGENVALUE PROBLEMS

6.1. VARIATIONAL FORMULATION AND THE MINMAX PRINCIPLE

Eigenvalue problems, which we shall write as $Lu = \lambda u$ or more generally as $Lu = \lambda Bu$, arise in a tremendous variety of applications. We mention the buckling of columns and shells, the vibration of elastic bodies, and multigroup diffusion in nuclear reactors. Fortunately the Rayleigh-Ritz idea is as useful for these problems as for steady-state equations $Lu = f$. In fact, the idea had its beginning in Rayleigh's description of the fundamental frequency as the minimum value of the Rayleigh quotient. Therefore, the step which has been taken in the last 15 years was completely natural and inevitable: to apply the new finite element ideas to this long-established variational form of the eigenvalue problem.

From a practical point of view, this means that piecewise polynomial functions can be substituted directly as trial functions into the Rayleigh quotient. The evaluation of this quotient becomes exactly the problem which has already been discussed and which large-scale computer systems have been developed to carry out: the assembly of the stiffness and mass matrices $K$ and $M$. The next step, however, leads to a different and more difficult computational problem in linear algebra: instead of a linear system $KQ = F$, there arises a discrete eigenvalue problem, $KQ = \lambda MQ$. Fortunately, it is now known how the properties of the two matrices (symmetry, sparseness, positive definiteness of $M$) can be used to speed up the numerical algorithm. We shall discuss several effective numerical processes in Section 6.4.

From a theoretical point of view, the basic steps in establishing error bounds depend once again on the approximation properties of the finite
element subspace $S^h$. Therefore, the approximation theorems of Chapter 3 can be put directly to use. Mathematically, the new step which is required is to deduce, starting from these approximation theorems, satisfactory bounds for the error in the eigenvalues and the eigenfunctions. This step is therefore our main goal.

We shall precede the general theory by a study of some specific examples, in order to illustrate the behavior of finite element approximations. Then we link the approximation theory to the eigenvalue problem, aiming to find bounds on $\lambda_l - \lambda_l^h$ and $u_l - u_l^h$, the errors in the $l$th eigenvalue and eigenfunction, which are sharp enough to give the right dependence on $l$ as well as on $h$.

We begin with the eigenvalue problem
(1)  $Lu = -\dfrac{d}{dx}\Bigl(p(x)\dfrac{du}{dx}\Bigr) + q(x)u = \lambda u, \qquad 0 < x < \pi.$

This is the same operator $L$ which was studied in the first chapter; its simplicity lies in the fact that it is only one-dimensional. We shall again illustrate the distinction between natural and essential boundary conditions by introducing one of each:

(2)  $u(0) = 0, \qquad u'(\pi) = 0.$

It is known that such a Sturm-Liouville problem has an infinite sequence of real eigenvalues

$\lambda_1 < \lambda_2 \le \cdots \le \lambda_j \le \cdots \to \infty,$

and an associated complete set of orthonormal eigenfunctions:

(3)  $(u_j, u_k) = \int_0^\pi u_j(x)\,u_k(x)\,dx = \delta_{jk}.$

It follows immediately that these eigenfunctions are orthogonal also in the energy inner product

(4)  $(Lu_j, u_k) = \lambda_j (u_j, u_k) = \lambda_j \delta_{jk},$

or, with the usual integration by parts,

(5)  $a(u_j, u_k) = \int_0^\pi \bigl(p\,u_j' u_k' + q\,u_j u_k\bigr)\,dx = \begin{cases} \lambda_j, & j = k, \\ 0, & j \ne k. \end{cases}$

In the case of constant $p$ and $q$, which will serve as a model problem, the
eigenfunctions are sinusoidal:

$u_j(x) = \sqrt{2/\pi}\,\sin\bigl((j - \tfrac12)x\bigr), \qquad \lambda_j = p(j - \tfrac12)^2 + q.$

The first step in the Rayleigh-Ritz method is to rewrite $Lu = \lambda u$ as a variational problem. There are two possibilities, corresponding to the Ritz minimization and the Galerkin weak form of steady-state equations $Lu = f$. Both lead to the same result. The first is to introduce the Rayleigh quotient, which is defined by

(6)  $R(v) = \dfrac{a(v, v)}{(v, v)} = \dfrac{\int_0^\pi \bigl(p\,(v')^2 + q\,v^2\bigr)\,dx}{\int_0^\pi v^2\,dx}.$
We claim that the stationary (or critical) points of this functional $R(v)$, which are the points where the gradient of $R$ vanishes, are exactly the eigenfunctions of the problem. To understand this, let the trial function $v$ be replaced by its eigenfunction expansion $\sum \alpha_j u_j$, $\alpha_j = (v, u_j)$:

(7)  $R(v) = \dfrac{a\bigl(\sum \alpha_j u_j, \sum \alpha_k u_k\bigr)}{\bigl(\sum \alpha_j u_j, \sum \alpha_k u_k\bigr)} = \dfrac{\sum \lambda_j \alpha_j^2}{\sum \alpha_j^2},$

using the orthogonality conditions (3) and (5). At a stationary point of $R(v)$, the derivative with respect to each $\alpha_k$ must be zero:

$\dfrac{\partial R}{\partial \alpha_k} = \dfrac{2\alpha_k \sum_j \alpha_j^2 (\lambda_k - \lambda_j)}{\bigl(\sum_j \alpha_j^2\bigr)^2} = 0.$

If $v$ is one of the eigenfunctions $u_j$, this is certainly satisfied: Every $\alpha_k = 0$ with the exception of the term with $k = j$, and this remaining survivor is cancelled by the vanishing of $\lambda_j - \lambda_k$. There are no stationary points other than the eigenfunctions. (For a repeated eigenvalue $\lambda_j = \lambda_{j+1}$, all combinations $v = \alpha_j u_j + \alpha_{j+1} u_{j+1}$ are eigenfunctions, and there is a whole plane of stationary points.) The value of $R(v)$ at a stationary point $v = u_j$ is easy to determine: it is exactly the eigenvalue $\lambda_j$:

(8)  $R(u_j) = \dfrac{a(u_j, u_j)}{(u_j, u_j)} = \lambda_j.$
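The stationarity argument can be checked numerically in a finite-dimensional model. The following is a minimal sketch, assuming only the first four eigenvalues of the model problem with $p = 1$, $q = 0$ (an illustrative choice); the function names `rayleigh` and `gradient` are hypothetical. It evaluates $R$ in the eigenfunction coordinates $\alpha_j$ and its gradient, written in the equivalent form $2\alpha_k(\lambda_k - R)/\sum \alpha_j^2$.

```python
# Finite-dimensional model of (7): R(alpha) = sum(lam_j alpha_j^2) / sum(alpha_j^2).
# Illustrative assumption: lam_j = (j - 1/2)^2, i.e. p = 1 and q = 0.

def rayleigh(lam, alpha):
    num = sum(l * a * a for l, a in zip(lam, alpha))
    den = sum(a * a for a in alpha)
    return num / den

def gradient(lam, alpha):
    # dR/dalpha_k = 2 alpha_k (lam_k - R(alpha)) / sum(alpha_j^2)
    r = rayleigh(lam, alpha)
    den = sum(a * a for a in alpha)
    return [2.0 * a * (l - r) / den for l, a in zip(lam, alpha)]

lam = [(j - 0.5) ** 2 for j in range(1, 5)]   # 0.25, 2.25, 6.25, 12.25

e2 = [0.0, 1.0, 0.0, 0.0]                     # the second eigenfunction
grad_at_e2 = gradient(lam, e2)                # vanishes: a stationary point
r_at_e2 = rayleigh(lam, e2)                   # equals lam[1]

mixed = [1.0, 1.0, 0.0, 0.0]                  # not an eigenfunction
grad_at_mixed = gradient(lam, mixed)          # does not vanish
```

At the mixture of $u_1$ and $u_2$ the quotient lies between $\lambda_1$ and $\lambda_2$ and the gradient is nonzero, so the point is not stationary.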
It is useful to try to visualize the graph of $R(v)$. The numerator $a(v, v)$ corresponds to a convex surface, an "egg cup," exactly as for $I(v)$ in the first chapter; the only difference is that the linear term $-2(f, v)$ is now absent, so that the bottom of the egg cup lies at the origin. There are two ways to consider the effect of the denominator $(v, v)$. Since it makes the quotient homogeneous, $R(\alpha v) = R(v)$, one possibility is to consider only unit vectors $v$. In other words, we can fix the denominator to equal 1, and look at the
cross section of the egg cup which is cut out by the right circular cylinder $(v, v) = 1$. The lowest point on this cross section will correspond to the fundamental frequency $\lambda_1$.

Alternatively, we can fix the numerator at the value 1. This means that the egg cup is cut by a horizontal plane, lying one unit above the base. The cross section is the ellipse $a(v, v) = 1$, or more precisely the ellipsoid $\sum \lambda_j \alpha_j^2 = 1$ in infinitely many dimensions. Its major axis is in the direction of the first eigenfunction $u_1$, since it is in this direction that the ellipsoid is farthest from the axis. In other words, with the numerator of $R(v)$ fixed at 1, the Rayleigh quotient is minimized when the denominator is largest. If we then consider the cross section of the ellipse normal to this major axis, that is, we fix $\alpha_1 = 0$ and look in a space of one lower dimension, then the major axis of this ellipse has length $\lambda_2^{-1/2}$. [And any other cross section of the original ellipsoid will have a major axis whose length falls in between, corresponding to the minmax principle described in (13) below: The smallest eigenvalue $\lambda_1'$ under any one given constraint satisfies $\lambda_1 \le \lambda_1' \le \lambda_2$.]

It appears that in four dimensions, the cross section of an egg cup is the shell of a whole egg.
We have not yet specified which functions $v$ are to be admitted in the variational, or Rayleigh quotient, characterization of eigenvalues and eigenfunctions. This is exactly the same question which arose for the steady-state problem in Section 1.3. In that case $I(v)$ was finally minimized over all functions in $\mathcal{H}^1$ satisfying the essential boundary conditions, that is, over $\mathcal{H}^1_E$. This space arose naturally through a completion process, starting with smooth functions that satisfied all boundary conditions and then including any $v$ which was a limit of such $v_N$ in the sense that $a(v - v_N, v - v_N) \to 0$. Here we shall do exactly the same thing, since it still leaves the surface $R(v)$ unchanged; all stationary points remain so in the completion process. [And if $p(x)$ were discontinuous, this process of filling in the holes on the surface of $R(v)$ actually fills in the eigenfunctions; because they are not smooth, they were initially excluded.]

The resulting admissible space is then $\mathcal{H}^1_E$, precisely as before. This means that all the same finite element subspaces may be used in the discrete approximation, the difference being that we now look for stationary points of $R(v)$ rather than the minimum of $I(v)$.
It was mentioned earlier that there is a second way to reformulate the eigenvalue problem. This is to put the equation $Lu = \lambda u$ into its weak form, or Galerkin form, which is produced in the following way: We multiply $Lu = \lambda u$ by a function $v$ and integrate by parts to obtain

(9)  $\int_0^\pi (pu'v' + quv)\,dx = \lambda \int_0^\pi uv\,dx.$

The eigenvalue problem is then to find a scalar $\lambda$ and a function $u$ in $\mathcal{H}^1_E$ such that (9) holds for all $v$ in $\mathcal{H}^1_E$. The boundary conditions on $u$ come out
correctly, since equation (9) is actually equivalent to

$\int_0^\pi Lu(x)\,v(x)\,dx + pu'v\,\Big|_0^\pi = \lambda \int_0^\pi uv\,dx.$

At $x = 0$, the integrated term $pu'v$ automatically vanishes. Equality of the remaining terms, for all $v$ in $\mathcal{H}^1_E$, forces both the natural condition $u' = 0$ to hold at $x = \pi$ and the differential equation $Lu = \lambda u$ to hold over the interval.
The equation (9) is a specific case of the standard weak form for the eigenvalue problem: Find a scalar $\lambda$, and a function $u$ in the admissible space $V$, such that

(10)  $a(u, v) = \lambda (u, v)$  for all $v$ in $V$.

This resembles the condition $a(u, v) = (f, v)$ for the vanishing of the first variation in the steady-state problem of minimizing $I(v)$. Indeed, the equation represents exactly the vanishing of the first variation of $R$ at the stationary point $u$:

$R(u + \epsilon v) = \dfrac{a(u + \epsilon v, u + \epsilon v)}{(u + \epsilon v, u + \epsilon v)} = \dfrac{a(u, u) + 2\epsilon\,a(u, v) + \cdots}{(u, u) + 2\epsilon\,(u, v) + \cdots}$

$\qquad = R(u) + 2\epsilon\,\dfrac{a(u, v)(u, u) - a(u, u)(u, v)}{(u, u)^2} + \cdots$

$\qquad = R(u) + 2\epsilon\,\dfrac{a(u, v) - \lambda (u, v)}{(u, u)} + O(\epsilon^2), \qquad \lambda = R(u).$

Thus the weak formulation and the stationary point formulation are equivalent.
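The expansion of $R(u + \epsilon v)$ can be checked in a small discrete model. The sketch below assumes a hypothetical 3 by 3 pair of matrices (the piecewise linear stiffness and mass patterns with $h = 1$), with $a(u, v) = u^T K v$ and $(u, v) = u^T M v$; it compares the first-variation formula with a central difference in $\epsilon$.

```python
# Check d/d(eps) R(u + eps v) at eps = 0 against 2 [a(u,v) - R(u)(u,v)] / (u,u).
# K and M are illustrative matrices, not taken from a particular mesh in the text.

def bilinear(A, x, y):
    return sum(x[i] * A[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

K = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]]
M = [[4 / 6, 1 / 6, 0.0], [1 / 6, 4 / 6, 1 / 6], [0.0, 1 / 6, 2 / 6]]

def R(x):
    return bilinear(K, x, x) / bilinear(M, x, x)

u = [1.0, 2.0, 3.0]
v = [1.0, 0.0, -1.0]

# First variation predicted by the expansion in the text.
first_variation = 2.0 * (bilinear(K, u, v) - R(u) * bilinear(M, u, v)) / bilinear(M, u, u)

# Central difference approximation of the same derivative.
eps = 1e-5
plus = [ui + eps * vi for ui, vi in zip(u, v)]
minus = [ui - eps * vi for ui, vi in zip(u, v)]
central = (R(plus) - R(minus)) / (2.0 * eps)
```

Since this $u$ is not an eigenvector, the first variation is nonzero; it vanishes for every $v$ exactly when $Ku = \lambda Mu$.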
In the case of $\lambda_1$ and $u_1$, that is, for the fundamental frequency and its associated normal mode, the stationary point is actually a minimum:

(11)  $\lambda_1 = \min_{v \in \mathcal{H}^1_E} R(v).$

This is obvious if $v$ is expanded as $\sum \alpha_j u_j$, since then $R(v) = \sum \lambda_j \alpha_j^2 / \sum \alpha_j^2 \ge \lambda_1$.

It is valuable to describe also the higher eigenfunctions in terms of a minimization, since convergence to a minimum is so much simpler to analyze than convergence to a stationary point. One possibility in this direction is to constrain $v$ to be orthogonal to the first $l - 1$ eigenfunctions: $\alpha_1 = (v, u_1) = 0, \ldots, \alpha_{l-1} = (v, u_{l-1}) = 0$. Under these constraints the minimum of the Rayleigh quotient becomes $\lambda_l$:

(12)  $\lambda_l = \min_{v \perp E_{l-1}} R(v).$

Here $E_{l-1}$ is the space spanned by the eigenfunctions $u_1, \ldots, u_{l-1}$.
There is an alternative formula for $\lambda_l$ which does not depend on knowing $u_1, \ldots, u_{l-1}$. It will be of fundamental importance in what follows, and was discovered by Poincaré, Courant, and Fischer; we refer to it as the

Minmax Principle: If $R(v)$ is maximized over an $l$-dimensional subspace $S_l$, then the minimum possible value for this maximum is $\lambda_l$:

(13)  $\lambda_l = \min_{S_l}\ \max_{v \in S_l} R(v).$

The minimum is taken over all $l$-dimensional subspaces of $\mathcal{H}^1_E$. For the special subspace $S_l = E_l$, the maximum of $R(v)$ is exactly $\lambda_l$.

To prove the minmax formula (13), we have to show that for any other choice of $S_l$,

(14)  $\max_{v \in S_l} R(v) \ge \lambda_l.$

The argument depends on selecting $v^*$ in $S_l$ so as to be orthogonal to $E_{l-1}$, that is, so as to satisfy the $l - 1$ equations $(v, u_i) = 0$, $1 \le i < l$. There exists such a $v^*$, since we are imposing only $l - 1$ homogeneous constraints on an $l$-parameter space. Now it follows from (12), since $v^* \perp E_{l-1}$, that $\lambda_l \le R(v^*)$. In other words, (14) holds, and the minmax formula (13) is proved.
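The inequality (14) can be illustrated for $l = 2$ in eigenfunction coordinates, where $(v, v) = \sum \alpha_j^2$ and $a(v, v) = \sum \lambda_j \alpha_j^2$. The following sketch assumes four illustrative eigenvalues ($p = 1$, $q = 0$) and a hypothetical helper `max_rayleigh_2d`, which returns the maximum of $R$ over the span of two coefficient vectors as the larger root of a 2 by 2 generalized eigenvalue problem.

```python
import math

def dot(x, y, w=None):
    # Weighted dot product; w = None gives the ordinary one.
    if w is None:
        w = [1.0] * len(x)
    return sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

def max_rayleigh_2d(lam, a, b):
    """Maximum of R over span{a, b}: larger root of det(A - mu G) = 0."""
    a11, a12, a22 = dot(a, a, lam), dot(a, b, lam), dot(b, b, lam)
    g11, g12, g22 = dot(a, a), dot(a, b), dot(b, b)
    c2 = g11 * g22 - g12 * g12                    # coefficient of mu^2
    c1 = a11 * g22 + a22 * g11 - 2.0 * a12 * g12  # minus the coefficient of mu
    c0 = a11 * a22 - a12 * a12                    # constant term
    disc = max(c1 * c1 - 4.0 * c2 * c0, 0.0)
    return (c1 + math.sqrt(disc)) / (2.0 * c2)

lam = [0.25, 2.25, 6.25, 12.25]                   # (j - 1/2)^2, j = 1..4

best = max_rayleigh_2d(lam, [1, 0, 0, 0], [0, 1, 0, 0])   # S_2 = E_2
other_1 = max_rayleigh_2d(lam, [1, 0, 1, 0], [0, 1, 0, 1])
other_2 = max_rayleigh_2d(lam, [1, 1, 0, 0], [0, 1, 1, 0])
```

The subspace $E_2$ attains the minimum $\lambda_2$; the other two subspaces give strictly larger maxima.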
One useful consequence of this formula is a rough estimate for the eigenvalue $\lambda_l$, based on a comparison with the constant-coefficient case. It is clear that for any $v$,

(15)  $\int_0^\pi \bigl(p_{\min} (v')^2 + q_{\min} v^2\bigr)\,dx \le \int_0^\pi \bigl(p(x)(v')^2 + q(x)v^2\bigr)\,dx \le \int_0^\pi \bigl(p_{\max} (v')^2 + q_{\max} v^2\bigr)\,dx,$

where $p_{\min}$, $p_{\max}$, $q_{\min}$, $q_{\max}$ are the extreme values of the coefficients. Dividing by $\int v^2$, this becomes a comparison of Rayleigh quotients, with the variable-coefficient case sandwiched between two problems whose eigenvalues are known explicitly. By the minmax principle, each eigenvalue $\lambda_l$ of the middle problem must lie between the two known eigenvalues,

$p_{\min}(l - \tfrac12)^2 + q_{\min} \le \lambda_l \le p_{\max}(l - \tfrac12)^2 + q_{\max}.$

In particular, $\lambda_l$ is of order $l^2$ as $l \to \infty$.
Finally, we come to the main object of this section, which is to establish the Rayleigh-Ritz principle for approximating the eigenvalues. It begins either with the weak form $a(u, v) = \lambda (u, v)$ or with the description of the eigenvalues as critical (stationary) points of $R(v) = a(v, v)/(v, v)$. In either case, the idea is to work only within a finite-dimensional subspace $S^h$ of the full admissible space $\mathcal{H}^1_E$. In this subspace, we look for a pair $\lambda^h$ and $u^h$ such that

(16)  $a(u^h, v^h) = \lambda^h (u^h, v^h)$  for all $v^h$ in $S^h$.
In other words,

$\int_0^\pi \bigl(p\,(u^h)'(v^h)' + q\,u^h v^h\bigr)\,dx = \lambda^h \int_0^\pi u^h v^h\,dx$  for all $v^h$ in $S^h$.

Alternatively, the approximate eigenvectors are the critical points of $R(v^h)$ over the space $S^h$.
To see that these methods lead to the same approximations, choose a basis $\varphi_1, \ldots, \varphi_N$ for $S^h$. Then any $v^h$ in $S^h$ can be expanded as

(17)  $v^h = \sum_{j=1}^N q_j \varphi_j,$

where $q_j$ are the generalized coordinates (the nodal parameters of $v^h$, if $S^h$ is a finite element space). Substituting into the Rayleigh quotient,

(18)  $R(v^h) = \dfrac{\sum\sum q_j q_k \int \bigl(p(x)\varphi_j'\varphi_k' + q(x)\varphi_j\varphi_k\bigr)\,dx}{\sum\sum q_j q_k \int \varphi_j\varphi_k\,dx}.$

The integrals in the numerator and denominator are by now familiar: They are the entries in the stiffness matrix $K^h$ and the mass matrix $M^h$. Thus the Rayleigh quotient, in terms of the vector $q = (q_1, \ldots, q_N)$, is exactly

(19)  $R(v^h) = \dfrac{q^T K^h q}{q^T M^h q}.$

The critical points of this discrete quotient are the solutions to the matrix eigenvalue problem

(20)  $K^h Q^h = \lambda^h M^h Q^h.$

Therefore, this is the eigenproblem which has to be assembled and solved. The eigenvalues $\lambda_l^h$ are expected to approximate the continuous eigenvalues $\lambda_l$, at least for small values of $l$, and the eigenvectors $Q_l^h$ lead to a corresponding approximate eigenfunction

(21)  $u_l^h = \sum_{j=1}^N (Q_l^h)_j\,\varphi_j.$

Thus the components of the discrete eigenvectors, in the matrix problem $KQ = \lambda MQ$, yield the nodal values of the finite element eigenfunctions.
The weak form of the eigenproblem leads directly to the same result. Suppose that in equation (16) we take $v^h = \varphi_k$:

$\sum_j Q_j\,a(\varphi_j, \varphi_k) = \lambda^h \sum_j Q_j\,(\varphi_j, \varphi_k).$

This is simply the $k$th row of the matrix equation $K^h Q^h = \lambda^h M^h Q^h$.
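The discrete quotient $q^T K^h q / q^T M^h q$ can be assembled and evaluated directly. The following is a minimal sketch, assuming piecewise linear elements on a uniform mesh of $(0, \pi)$ with constant $p$ and $q$ and the essential condition $u(0) = 0$; the names `assemble_linear`, `quad_form` and `rayleigh` are hypothetical. The trial vector holds the nodal values of $v(x) = x$, which violates the natural condition at $\pi$ but is still admissible.

```python
import math

def assemble_linear(N, p=1.0, q=0.0):
    """Assemble K and M for N equal linear elements on (0, pi), with u(0) = 0."""
    h = math.pi / N
    K = [[0.0] * N for _ in range(N)]
    M = [[0.0] * N for _ in range(N)]
    m_e = [[2.0 * h / 6.0, h / 6.0], [h / 6.0, 2.0 * h / 6.0]]   # element mass
    k_e = [[p / h + q * m_e[0][0], -p / h + q * m_e[0][1]],      # element stiffness
           [-p / h + q * m_e[1][0], p / h + q * m_e[1][1]]]
    for e in range(N):                       # element e joins nodes e and e + 1
        for a in range(2):
            for b in range(2):
                i, j = e + a - 1, e + b - 1  # unknown index; node 0 is dropped
                if i >= 0 and j >= 0:
                    K[i][j] += k_e[a][b]
                    M[i][j] += m_e[a][b]
    return K, M, h

def quad_form(A, x):
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def rayleigh(K, M, x):
    return quad_form(K, x) / quad_form(M, x)

K, M, h = assemble_linear(8)
trial = [(j + 1) * h for j in range(8)]      # nodal values of v(x) = x
R = rayleigh(K, M, trial)
```

Because $v(x) = x$ is itself piecewise linear, the discrete quotient equals the continuous one, $3/\pi^2 \approx 0.304$, which lies above $\lambda_1 = 1/4$ as the minimum principle requires.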
If the basis functions $\varphi_j$ are orthonormal, then the mass matrix $M^h$ is the identity and the discrete problem is to find the eigenvalues of $K^h$. However, this orthogonality condition on the $\varphi_j$ is incompatible with a more important property of finite elements, that $\varphi_j$ should vanish over all elements not containing the corresponding node $z_j$. Therefore, we must either accept $M^h \ne I$ or do violence to Rayleigh's idea by "lumping" the masses. We prefer the former, since numerical algorithms for the general problem $KQ = \lambda MQ$ are now appearing which are comparable in efficiency to those for $KQ = \lambda Q$. We emphasize that in all cases the mass matrix is symmetric and positive definite; it is the Gram matrix for the linearly independent vectors $\varphi_1, \ldots, \varphi_N$.
The fundamental frequency $\lambda_1$ is often the quantity of greatest significance, and we especially hope that $\lambda_1^h$ provides a good approximation. Notice that $\lambda_1^h$ always lies above $\lambda_1$, $\lambda_1^h \ge \lambda_1$, since $\lambda_1^h$ is the minimum value of $R(v)$ over the subspace $S^h$ and $\lambda_1$ is the minimum over the whole admissible space $\mathcal{H}^1_E$. It is natural to expect that if the true eigenfunction $u_1$ can be well approximated by the subspace $S^h$, then $\lambda_1^h$ is automatically close to $\lambda_1$; this will be a key result of the theory.
The minmax principle applies equally well to the discrete problem, with the same proof, so that the approximate eigenvalues can be characterized by

(22)  $\lambda_l^h = \min_{S_l}\ \max_{v^h \in S_l} R(v^h).$

Here $S_l$ ranges over all $l$-dimensional subspaces of the subspace $S^h$. Of course, the definition makes sense only for $l \le N$, since if the dimension of $S^h$ is $N$, there exist only $N$ approximate eigenvalues. Comparing the minmax principles (22) and (13), it follows immediately that every eigenvalue is approximated from above:

(23)  $\lambda_l^h \ge \lambda_l$  for all $l$.

Every space $S_l$ which is allowed in the minimization (22) is also allowed in (13), and therefore the minimum $\lambda_l$ in (13) is at least as small as $\lambda_l^h$.
The minmax principle extends verbatim to the case $Lu = \lambda Bu$ as long as $B$ is positive definite; the Rayleigh quotient is then $R(v) = (Lv, v)/(Bv, v)$. The mass matrix in the discrete problem becomes $M_{jk} = (B\varphi_j, \varphi_k)$.

6.2. SOME ELEMENTARY EXAMPLES
In this section we shall treat some specific examples in order to isolate the patterns which occur in the general theory of eigenvalue approximation. We shall concentrate on the constant-coefficient problem

$-pu'' + qu = \lambda u, \qquad 0 < x < \pi,$

with the boundary conditions $u(0) = 0$, $u'(\pi) = 0$.
The first trial space to consider is made up of piecewise linear functions, with equally spaced nodes $x_j = jh$. Using the standard roof functions $\varphi_j$ as a basis, the key matrices are

$M^h = \dfrac{h}{6}\begin{pmatrix} 4 & 1 & & & \\ 1 & 4 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & 4 & 1 \\ & & & 1 & 2 \end{pmatrix}, \qquad K_0^h = \dfrac{1}{h}\begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{pmatrix}.$
The stiffness matrix can be written as $K^h = pK_0^h + qM^h$. The optimal weights for the Rayleigh-Ritz eigenfunction

$u^h = \sum_{j=1}^N Q_j^h \varphi_j(x)$

are the values of $u^h$ at the nodes, and, in fact, the system $K^h Q^h = \lambda^h M^h Q^h$ is precisely the difference equation

(24)  $\dfrac{p}{h^2}\bigl(-Q_{j+1}^h + 2Q_j^h - Q_{j-1}^h\bigr) + \dfrac{q}{6}\bigl(Q_{j+1}^h + 4Q_j^h + Q_{j-1}^h\bigr) = \dfrac{\lambda^h}{6}\bigl(Q_{j+1}^h + 4Q_j^h + Q_{j-1}^h\bigr).$

For this to be valid at the boundary point $j = N$ we put $Q_{N+1}^h = Q_{N-1}^h$. Recall that the Dirichlet condition $u(0) = 0$ gives $Q_0 = 0$.
The eigensystem (24) has a Toeplitz structure away from the boundaries, in the sense that the $(i, j)$ entries of the stiffness and mass matrices depend only on the difference $i - j$. It is therefore reasonable to expect trigonometric eigenvectors, and indeed the components of the $l$th eigenvector are

$(Q_l^h)_j = \sqrt{2/\pi}\,\sin\bigl((l - \tfrac12)jh\bigr).$

To write out an expression for the associated eigenvalue $\lambda_l^h$, put

$k_h(l) = 2h^{-2}\bigl(1 - \cos(l - \tfrac12)h\bigr), \qquad m_h(l) = \dfrac{2 + \cos(l - \tfrac12)h}{3}.$
Then

(25)  $\lambda_l^h = p\,\dfrac{k_h(l)}{m_h(l)} + q.$

In this special problem the eigenfunctions are infinitely accurate at the nodes,

(26)  $u_l^h(jh) = u_l(jh) = \sqrt{2/\pi}\,\sin\bigl((l - \tfrac12)jh\bigr).$
It follows that $u_l^h$ is equal to the interpolate of $u_l$, and the approximation theorems give

(27)  $\|u_l - u_l^h\|_s \le C h^{2-s} \|u_l\|_2 \le C' h^{2-s} l^2, \qquad s = 0, 1.$

Although the exactness of the Ritz eigenfunction $u_l^h$ at the nodes is special, the estimate (27) is typical of the general case. The errors in the Ritz eigenfunctions $u_l - u_l^h$ are of the same order as the errors in the approximate solution to steady-state problems $Lu = f$.
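Turning now to the approximate eigenvalue $\lambda_l^h$ given by (25), we expand

$k_h(l) = (l - \tfrac12)^2 - \dfrac{h^2}{12}(l - \tfrac12)^4 + O(l^6 h^4),$

$m_h(l) = 1 - \dfrac{h^2}{6}(l - \tfrac12)^2 + O(l^4 h^4).$

Recalling that $\lambda_l = p(l - \tfrac12)^2 + q$, the eigenvalue error is

(28)  $\lambda_l^h - \lambda_l = \dfrac{p h^2}{12}(l - \tfrac12)^4 + O(h^4 l^6).$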
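The closed forms for the piecewise linear example lend themselves to a direct numerical check. The sketch below uses hypothetical function names and assumes the uniform mesh $h = \pi/N$. It verifies that the trigonometric vector satisfies the difference equation (24) at every node, including $j = N$ with the reflection $Q_{N+1} = Q_{N-1}$, and compares the eigenvalue error with its leading term $p h^2 (l - \tfrac12)^4/12$.

```python
import math

def k_h(l, h):
    return 2.0 / h ** 2 * (1.0 - math.cos((l - 0.5) * h))

def m_h(l, h):
    return (2.0 + math.cos((l - 0.5) * h)) / 3.0

def ritz_eigenvalue(l, h, p=1.0, q=0.0):
    # Formula (25) for the piecewise linear approximation.
    return p * k_h(l, h) / m_h(l, h) + q

def residual(l, N, p=1.0, q=0.0):
    """Largest residual of the difference equation (24) for the l-th sine vector."""
    h = math.pi / N
    nu = l - 0.5
    lam_h = ritz_eigenvalue(l, h, p, q)
    Q = [math.sin(nu * j * h) for j in range(N + 2)]
    Q[N + 1] = Q[N - 1]                      # reflection at the natural boundary
    worst = 0.0
    for j in range(1, N + 1):
        stiff = p / h ** 2 * (-Q[j - 1] + 2.0 * Q[j] - Q[j + 1])
        mass = (Q[j - 1] + 4.0 * Q[j] + Q[j + 1]) / 6.0
        worst = max(worst, abs(stiff + q * mass - lam_h * mass))
    return worst

N, l = 16, 1
h = math.pi / N
error = ritz_eigenvalue(l, h) - (l - 0.5) ** 2      # lambda_l^h - lambda_l, p = 1, q = 0
leading_term = h ** 2 * (l - 0.5) ** 4 / 12.0
```

The error is positive, in agreement with the upper bound $\lambda_l^h \ge \lambda_l$, and for small $(l - \tfrac12)h$ it is very close to the leading term.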
Observe that the eigenvalues are as accurate as the energies in the eigenfunctions. This is true in general:

$\lambda_l^h - \lambda_l \le C\,\|u_l^h - u_l\|_m^2$

for $2m$th-order problems. The reason is that the Rayleigh quotient is flat near a critical point. Moderately accurate trial functions yield very accurate eigenvalues.
The eigenvalue errors are actually independent of the constant $q$, since the addition of the term $qu$ to the operator simply translates the spectrum by a constant amount: $\lambda_l = p(l - \tfrac12)^2$ when $q = 0$ and $\lambda_l = p(l - \tfrac12)^2 + q$ in general. The crucial point is that it has the same effect on the Ritz-Galerkin approximation $\lambda_l^h$. It will be convenient to exploit this invariance of the error $\lambda_l^h - \lambda_l$, by adding a sufficiently large constant term to ensure throughout this chapter that $\lambda_1 > 0$.
In a lumping process the mass matrix $M^h$ is replaced by a diagonal matrix, in this simple case by the identity $I$. In our example it happens that the eigenfunctions remain unchanged. The eigenvalues, on the other hand, are altered to

(29)  $\tilde\lambda_l^h = p\,k_h(l) + q = \lambda_l - \dfrac{p h^2}{12}(l - \tfrac12)^4 + O(h^4 l^6).$

Thus $\tilde\lambda_l^h$ is a lower bound to $\lambda_l$ and has the same $O(h^2)$ accuracy as $\lambda_l^h$.
Lumping has a very seductive physical interpretation in terms of the stiffness of the system $K^h Q^h = \lambda^h M^h Q^h$. From this point of view, the replacement of the mass matrix $M^h$ with the identity matrix $I$ has the effect of making the structure "softer" and hence reducing the magnitude of the approximate eigenvalues. Since the Rayleigh-Ritz approximations are necessarily upper bounds, $\lambda_l^h \ge \lambda_l$, the hope is that this reduction in magnitude will increase the accuracy of the approximation. Our concern, on the other hand, is that this violation of the Ritz rules may make the structure too soft and thereby adversely affect the accuracy. In the present case the damage has not been great, but a more typical example at the end of this section shows that a far more serious loss of accuracy is possible.
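The two-sided behavior can be seen numerically in the piecewise linear example. The following sketch rests on two assumptions that are stated here rather than taken from the text: $q = 0$, and lumping replaces the averaging $(Q_{j-1} + 4Q_j + Q_{j+1})/6$ by $Q_j$, so that the lumped eigenvalue is $p\,k_h(l)$. The function names are hypothetical.

```python
import math

def k_h(l, h):
    return 2.0 / h ** 2 * (1.0 - math.cos((l - 0.5) * h))

def m_h(l, h):
    return (2.0 + math.cos((l - 0.5) * h)) / 3.0

def compare(l, N, p=1.0):
    """Return (lumped, exact, consistent) eigenvalues for q = 0 (assumption)."""
    h = math.pi / N
    exact = p * (l - 0.5) ** 2
    consistent = p * k_h(l, h) / m_h(l, h)   # Ritz value: upper bound
    lumped = p * k_h(l, h)                   # assumed lumped value: lower bound
    return lumped, exact, consistent

rows = [compare(l, 32) for l in range(1, 6)]
lumped1, exact1, consistent1 = rows[0]
ratio = (exact1 - lumped1) / (consistent1 - exact1)
```

For small $(l - \tfrac12)h$ the two errors have nearly the same magnitude and opposite signs, so in this example lumping costs little accuracy while giving up the upper-bound property.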
We turn to an example of a higher-degree piecewise polynomial approximation, namely the cubic Hermite space $S^h$ on a uniform mesh. There are basis functions $\psi_j$ corresponding to function values, and $\omega_j$ to slopes, so that†

$v^h(x) = \sum_{j=1}^{N} u_j \psi_j(x) + \sum_{j=0}^{N-1} u_j' \omega_j(x).$

†For simplicity we shall require that all functions $v^h$ in $S^h$ satisfy the natural boundary condition at $\pi$ as well as the essential boundary condition at 0.

The matrix eigenvalue problem for the optimal weights, with $p = 1$, $q = 0$, becomes

(30)  $\dfrac{6}{5h}\bigl(-u_{j+1} + 2u_j - u_{j-1}\bigr) + \dfrac{1}{10}\bigl(u_{j+1}' - u_{j-1}'\bigr) = \lambda^h \Bigl[\dfrac{h}{70}\bigl(9u_{j+1} + 52u_j + 9u_{j-1}\bigr) - \dfrac{13h^2}{420}\bigl(u_{j+1}' - u_{j-1}'\bigr)\Bigr],$

(31)  $-\dfrac{1}{10}\bigl(u_{j+1} - u_{j-1}\bigr) + \dfrac{h}{30}\bigl(-u_{j+1}' + 8u_j' - u_{j-1}'\bigr) = \lambda^h \Bigl[\dfrac{13h^2}{420}\bigl(u_{j+1} - u_{j-1}\bigr) + \dfrac{h^3}{420}\bigl(-3u_{j+1}' + 8u_j' - 3u_{j-1}'\bigr)\Bigr].$

Strictly speaking (30) and (31) hold only for $1 \le j \le N - 1$, but they remain
valid at the boundaries if we put

$u_0 = u_{-1} + u_1 = u_1' - u_{-1}' = 0, \qquad u_N' = u_{N+1} - u_{N-1} = u_{N+1}' + u_{N-1}' = 0.$

[Observe that (30) reduces to $0 = 0$ at $j = 0$, and (31) reduces to $0 = 0$ at $j = N$.]
As in the piecewise linear case, the approximate eigenfunctions for this simple problem happen to agree at the nodes with trigonometric polynomials:

(32)  $u_j = \sqrt{2/\pi}\,\sin\bigl((l - \tfrac12)jh\bigr), \qquad u_j' = \sqrt{2/\pi}\,(l - \tfrac12)\,\alpha_l \cos\bigl((l - \tfrac12)jh\bigr).$

Indeed, with $\nu_l = l - \tfrac12$, the substitution of (32) into (30)-(31) produces a $2 \times 2$ eigenvalue problem

(33)  $\begin{pmatrix} \dfrac{12}{5h}(1 - \cos\nu_l h) & -\dfrac{\sin\nu_l h}{5} \\ -\dfrac{\sin\nu_l h}{5} & \dfrac{h}{15}(4 - \cos\nu_l h) \end{pmatrix} \begin{pmatrix} 1 \\ \nu_l\alpha_l \end{pmatrix} = \lambda^h \begin{pmatrix} \dfrac{h}{70}(52 + 18\cos\nu_l h) & \dfrac{13h^2}{210}\sin\nu_l h \\ \dfrac{13h^2}{210}\sin\nu_l h & \dfrac{h^3}{210}(4 - 3\cos\nu_l h) \end{pmatrix} \begin{pmatrix} 1 \\ \nu_l\alpha_l \end{pmatrix}.$
Observe that (33) gives two eigenvalues, $\lambda^h_{l,0}$ and $\lambda^h_{l,1}$, for each integer $l$,
$l < N$. Since the system (30)-(31) is of order $2N - 2$, these are necessarily
all the finite element eigenvalues. By a direct calculation,

(34)

The corresponding eigenfunctions are in error by $l^4h^{4-s}$ in the $s$th derivative.

The second eigenvalues $\lambda^h_{l,1}$ are of order $O(h^{-2})$, and they are not close
to any eigenvalue of the differential equation. At first notice this seems to be
a serious disadvantage of using cubic Hermite elements in eigenvalue calculations,
that at least half of the eigenvalues are spurious and completely useless
as approximations. However, this phenomenon is quite typical of finite element
approximations. A closer look at even the piecewise linear approximations
shows that for each $h$ there is an integer $l_h$ such that only the first $l_h$
eigenvalues are reasonable; moreover, $hl_h \to 0$ as $h \to 0$. This has important
implications for the choice of a method to use in computing the eigenvalues
of $KQ = \lambda MQ$; only the leading eigenvalues are wanted.
We close this section by noting that "lumping" can lead to a serious loss
of accuracy [T8]. For example, a typical approach would be to replace

in (30) with $u_j$ and 0, respectively, and to replace

in (31) with $u_j'/210$ and 0. This leads to a problem $KQ = \lambda Q$, which appears
simpler because $M$ is replaced by the identity. However, a direct calculation
shows that the $O(h^6)$ eigenvalue error is increased to $O(h^2)$.

6.3. EIGENVALUE AND EIGENFUNCTION ERRORS

In this section we give a general theory of Rayleigh-Ritz approximations
to elliptic eigenvalue problems $Lu = \lambda u$. The setting is by now familiar:
Integration by parts converts $(Lv, v)$ into a symmetric form $a(v, v)$, which is
defined for all $v$ in the admissible space $\mathcal{H}^m_E$. The eigenfunctions are the points
$u_l$ at which the Rayleigh quotient $R(v) = a(v, v)/(v, v)$ is stationary, and the
corresponding eigenvalues are $\lambda_l = R(u_l)$. These eigenfunctions are orthogonal
and, since $L$ is symmetric, the eigenvalues are real.
On the subspace $S^h$, the Rayleigh quotient becomes

$$R(v^h) = \frac{a(v^h, v^h)}{(v^h, v^h)} = \frac{q^TKq}{q^TMq},$$

and the stationary points $Q$ yield the approximations $u_l^h$ and $\lambda_l^h$. These points
are determined by the matrix eigenvalue problem $KQ = \lambda^hMQ$, and any
two eigenvectors $Q_j$ and $Q_l$ satisfy the standard orthogonality relations

$$Q_j^TMQ_l = \delta_{jl}, \qquad Q_j^TKQ_l = \lambda_l^h\delta_{jl}. \tag{35}$$

Translated in terms of the eigenfunctions $u_j^h = \sum (Q_j)_i\varphi_i$ and $u_l^h = \sum (Q_l)_i\varphi_i$,
this means that

$$(u_j^h, u_l^h) = \delta_{jl}, \qquad a(u_j^h, u_l^h) = \lambda_l^h\delta_{jl}. \tag{36}$$

Thus the approximations mirror the basic properties of the true solutions.
They share also the minmax principle:

$$\lambda_l^h = \min_{S_l \subset S^h}\ \max_{v^h \text{ in } S_l} R(v^h), \tag{37}$$

where $S_l$ denotes any subspace of dimension $l$.
Now we begin on the eigenvalue estimates. Let $P$ be the Rayleigh-Ritz
projection, defined as follows: If $u$ is in $\mathcal{H}^m_E$, then $Pu$ is its component in the
subspace $S^h$ (with respect to the energy inner product):

$$a(u - Pu, v^h) = 0 \quad \text{for all } v^h \text{ in } S^h. \tag{38}$$

This means that in the energy norm $a(v, v)$, $Pu$ is the closest function in $S^h$
to the given $u$. In other words, if $u$ were the solution to a steady-state problem
$Lu = f$, $Pu$ would be exactly its Ritz approximation $u^h$. This link to our previous
work on approximation (Theorem 3.7) guarantees that

$$\|u - Pu\|_s \le C\,[h^{k-s} + h^{2(k-m)}]\,\|u\|_k. \tag{39}$$
Our strategy for estimating $\lambda_l^h - \lambda_l$ is this: We let $E_l$ denote the subspace
spanned by the true eigenfunctions $u_1, \ldots, u_l$, and choose $S_l = PE_l$ as the
subspace of $S^h$ in which to use the minmax principle. Thus $S_l$ is spanned by
the trial functions $Pu_1, \ldots, Pu_l$. These are not identical with the approximate
eigenfunctions $u_1^h, \ldots, u_l^h$, but the essential point of the proof is that they are
close.

LEMMA 6.1
Let $e_l$ be the set of unit vectors in $E_l$, and let

$$\sigma_l^h = \max_{u \text{ in } e_l} |2(u, u - Pu) - (u - Pu, u - Pu)|. \tag{40}$$

Then provided that $\sigma_l^h < 1$, the approximate eigenvalues are bounded above by

$$\lambda_l^h \le \frac{\lambda_l}{1 - \sigma_l^h}. \tag{41}$$

Proof. To apply the minmax principle we must make sure that $S_l = PE_l$
is $l$-dimensional. Certainly $E_l$ itself is $l$-dimensional, and therefore the question
is whether possibly $Pu^* = 0$ for some nonzero $u^*$ in $E_l$. Normalizing $u^*$
to be a unit vector, and therefore in $e_l$, $Pu^* = 0$ would imply that

$$\sigma_l^h \ge |2(u^*, u^* - Pu^*) - (u^* - Pu^*, u^* - Pu^*)| = |(u^*, u^*)| = 1.$$

Since this contradicts $\sigma_l^h < 1$, $S_l$ must be $l$-dimensional.

Now from the minmax principle (37),

$$\lambda_l^h \le \max_{v^h \text{ in } S_l} R(v^h) = \max_{u \text{ in } e_l} \frac{a(Pu, Pu)}{(Pu, Pu)}.$$

The numerator is bounded above by $a(Pu, Pu) \le a(u, u)$, since $P$ is a projection
in the energy norm.† The denominator is bounded below by

$$(Pu, Pu) = (u, u) - 2(u, u - Pu) + (u - Pu, u - Pu) \ge 1 - \sigma_l^h.$$

Therefore, the lemma is proved:

$$\lambda_l^h \le \max_{u \text{ in } e_l} \frac{a(u, u)}{1 - \sigma_l^h} = \frac{\lambda_l}{1 - \sigma_l^h}.$$

The problem is now to estimate $\sigma_l^h$, and for this we need the following
identity.

LEMMA 6.2
If $u = \sum c_iu_i$ is in $E_l$, then

$$(u, u - Pu) = \sum c_i\lambda_i^{-1}a(u_i - Pu_i, u - Pu). \tag{42}$$

Proof. Since $u_i$ is a true eigenfunction,

$$(u_i, u - Pu) = \lambda_i^{-1}a(u_i, u - Pu),$$

by (10). Furthermore, $a(Pu_i, u - Pu) = 0$, if we take $v^h = Pu_i$ in the definition
(38) of the projection $P$. Subtracting,

$$(u_i, u - Pu) = \lambda_i^{-1}a(u_i - Pu_i, u - Pu).$$

Multiplying by $c_i$ and summing on $i$, the result is (42).
Now we have the basic error estimate for eigenvalues.

THEOREM 6.1
If $S^h$ is a finite element space of degree $k - 1$, then there is a constant $\delta$
such that the approximate eigenvalues are bounded for small $h$ by

$$\lambda_l \le \lambda_l^h \le \lambda_l + 2\delta h^{2(k-m)}\lambda_l^{k/m}. \tag{43}$$

This is in agreement with the explicit bounds (28) and (34) for linear and cubic
elements in one dimension.

†This is in the corollary to Theorem 1.1 and is easy to see directly:

$$a(v, v) = a(Pv, Pv) + 2a(v - Pv, Pv) + a(v - Pv, v - Pv).$$

The last term is nonnegative, and the next-to-last term is zero by the definition (38) of
the projection $P$.
Proof. We want to estimate $\sigma_l^h$. Its first term is

$$2|(u, u - Pu)| = 2\Big|\sum c_i\lambda_i^{-1}a(u_i - Pu_i, u - Pu)\Big| \le 2K\Big\|(I - P)\sum c_i\lambda_i^{-1}u_i\Big\|_m \|(I - P)u\|_m,$$

since $|a(v, w)| \le K\|v\|_m\|w\|_m$. Applying the approximation theorem,

$$\begin{aligned} 2|(u, u - Pu)| &\le 2KC^2h^{2(k-m)}\Big\|\sum c_i\lambda_i^{-1}u_i\Big\|_k \|u\|_k \\ &\le C'h^{2(k-m)}\Big\|\sum c_i\lambda_i^{(k/2m)-1}u_i\Big\|_0 \Big\|\sum c_i\lambda_i^{k/2m}u_i\Big\|_0 \\ &\le C'h^{2(k-m)}\lambda_l^{(k/m)-1}. \end{aligned} \tag{44}$$

This holds for all functions $u = \sum c_iu_i$ in $e_l$, since $\sum c_i^2 = 1$; the extreme
case occurs at $c_l = 1$, because it is multiplied by a power of the largest
eigenvalue $\lambda_l$. The next-to-last step was more delicate and required the inequality
$\|v\|_k \le c\|L^{k/2m}v\|_0$. For $k = m$ this is exactly the ellipticity condition
$\|v\|_m^2 \le c^2a(v, v)$, and the inequality holds for all $k$ if the coefficients of the
differential operator $L$ are smooth.
The other term in $\sigma_l^h$ is of higher order in $h$, since the approximation
theorem with $s = 0$ yields

$$(u - Pu, u - Pu) \le C^2[h^k + h^{2(k-m)}]^2\|u\|_k^2. \tag{45}$$

Therefore, if we increase the constant $C'$ in (44) to a larger constant $\delta$, that
expression will exceed $\sigma_l^h$ for all small $h$.

The theorem is now proved; if $h$ is small enough to ensure that $\sigma_l^h \le \tfrac12$,
then

$$\lambda_l^h \le \frac{\lambda_l}{1 - \sigma_l^h} \le \lambda_l(1 + 2\sigma_l^h) \le \lambda_l + 2\delta h^{2(k-m)}\lambda_l^{k/m}.$$
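
The bound (43) can be checked numerically in the simplest case. The sketch below is an
illustration only: it assumes the model problem $-u'' = \lambda u$ on $(0, \pi)$ with
$u(0) = u(\pi) = 0$ and linear elements on a uniform mesh ($k = 2$, $m = 1$), for which
$K = h^{-1}\,\mathrm{tridiag}(-1, 2, -1)$ and $M = (h/6)\,\mathrm{tridiag}(1, 4, 1)$. The function
names are hypothetical. With nodal values $\sin(ljh)$ the matrix problem has the closed-form
eigenvalue used in the code, and (43) predicts that the scaled error
$(\lambda_l^h - \lambda_l)/(h^2\lambda_l^2)$ stays bounded.

```python
import math

def linear_fem_eigenvalue(l, n):
    """l-th eigenvalue of K Q = lambda M Q for linear elements on n equal
    intervals of (0, pi) with u(0) = u(pi) = 0 (assumed model problem)."""
    h = math.pi / n
    c = math.cos(l * h)
    return (6.0 / h**2) * (1.0 - c) / (2.0 + c)

def residual(l, n):
    """Largest entry of K v - lambda M v for v_j = sin(l j h); this checks
    that the closed form really solves the matrix eigenvalue problem."""
    h = math.pi / n
    lam = linear_fem_eigenvalue(l, n)
    v = [math.sin(l * j * h) for j in range(n + 1)]
    worst = 0.0
    for j in range(1, n):
        kv = (-v[j - 1] + 2.0 * v[j] - v[j + 1]) / h        # row j of K v
        mv = h * (v[j - 1] + 4.0 * v[j] + v[j + 1]) / 6.0   # row j of M v
        worst = max(worst, abs(kv - lam * mv))
    return worst

for n in (16, 32, 64):
    h = math.pi / n
    for l in (1, 2, 4):
        lam = float(l * l)                       # true eigenvalue l^2
        lam_h = linear_fem_eigenvalue(l, n)      # always an upper bound
        scaled = (lam_h - lam) / (h**2 * lam**2)
        print(n, l, lam_h, scaled)
```

Expanding the closed form in powers of $lh$ gives a scaled error of $1/12 + l^2h^2/360 + \cdots$,
so in this example the constant in (43) is close to $1/12$ and the error grows like $\lambda_l^2$.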

Such error bounds for the Rayleigh-Ritz method have a long history,
particularly in the Russian literature. Vainikko [7] has given a very complete
theory, including eigenvalue and eigenfunction estimates even in the
non-self-adjoint case. Combined with the approximation theorems for finite
elements, his analysis establishes the same error bounds which we have
proved for self-adjoint problems by means of the minmax principle. This
principle must be replaced by a contour integration in the complex plane,
followed by an application of the Galerkin estimates for static problems
given in Section 2.3. The integral of $(L - zI)^{-1}$ around a true eigenvalue $\lambda_l$
produces exactly $2\pi i\,u_lu_l^T$; Bramble and Osborn [B29], and Babuska and Fix,
have computed the error in using finite elements instead.
The minmax argument, which we have preferred for self-adjoint problems
because it is more elementary, has been applied to finite elements by a number
of authors; the basic reference is [B16]. We have made the argument more
precise, in order to determine not only the power $h^{2(k-m)}$ but also the correct
dependence on $l$:

$$\lambda_l^h - \lambda_l \le c\,h^{2(k-m)}\lambda_l^{k/m}.$$

The last factor means that higher eigenvalues are progressively more difficult
to compute. This prediction has been tested experimentally, and the appearance
of $\lambda_l^{k/m}$ is confirmed by the calculations described in [C14] with a quintic
element, those in Section 8.4 for the eigenvalues of an L-shaped membrane,
and the following numerical results which are reproduced from [L3]. The
bicubic Hermite element was applied to a square plate which is simply supported
either on all sides (SSSS) or on only two sides, with the other two left
free (SFSF); the number of elements is fixed at 10 per side (Fig. 6.1). Since
$k = 4$ and $m = 2$, the prediction is a relative error of

$$\frac{\lambda_l^h - \lambda_l}{\lambda_l} \sim c\,h^4\lambda_l.$$

[Figure: logarithm of the relative eigenvalue error (vertical scale from $-2$ to $-5$) plotted
against the eigenvalue number $l$ (1 to 13), for the bicubic Hermite element with nodal
unknowns $w$, $w_x$, $w_y$, $w_{xy}$; circles mark SFSF, triangles mark SSSS.]

Fig. 6.1 Errors in the higher eigenvalues of a square plate.

The generalized eigenvalue problem $Lu = \lambda Bu$ yields to exactly the same
analysis. Suppose that $B$ is a symmetric operator of order $2m' < 2m$, corresponding
to the quadratic form $(Bv, v) = b(v, v)$. Then the true Rayleigh
quotient is $R(v) = a(v, v)/b(v, v)$, and the discrete one becomes $q^TKq/q^TM_bq$,
where the mass matrix $M_b$ is formed with respect to the new inner product:
$(M_b)_{jk} = b(\varphi_j, \varphi_k)$. One change occurs in the last term of $\sigma_l^h$, where the bound
(45) in the 0-norm is replaced by

$$b(u - Pu, u - Pu) \le C \cdots$$

However, the first term in $\sigma_l^h$ remains larger, $2b(u, u - Pu) \sim h^{2(k-m)}$, and the
final estimate in the theorem is unchanged.
The effect on the computed eigenvalues of other approximations in the
finite element technique (change of domain or coefficients, numerical quadrature,
nonconformity of elements) is exactly comparable to the effect on
energy in static problems. We caution only that, for a change from $\Omega$ to a
polygonal domain $\Omega^h$, the error in energy must be measured over $\Omega$. Therefore,
this error is not the $O(h^3)$ established in Section 4.4 over the polygon;
if all trial functions vanish over $\Omega - \Omega^h$, then the energy in this region is
completely lost, with an effect which is proportional to its area, of order $h^2$.
We turn now to the eigenfunctions. We should like to prove that their
errors are of the same magnitude as those in steady-state problems $Lu = f$.
It must be expected, of course, that there will be some dependence on $l$; the
accuracy will deteriorate for higher eigenfunctions. Again the arguments
involve two simple identities, stated in the following lemmas, and a rather
technical computation.

LEMMA 6.3
With the normalizations $(u_l, u_l) = (u_l^h, u_l^h) = 1$, we have

$$a(u_l - u_l^h, u_l - u_l^h) = \lambda_l\|u_l - u_l^h\|_0^2 + \lambda_l^h - \lambda_l. \tag{46}$$

Proof. Observe that

$$\begin{aligned} a(u_l - u_l^h, u_l - u_l^h) &= a(u_l, u_l) - 2a(u_l, u_l^h) + a(u_l^h, u_l^h) \\ &= \lambda_l - 2\lambda_l(u_l, u_l^h) + \lambda_l^h \\ &= \lambda_l[2 - 2(u_l, u_l^h)] + \lambda_l^h - \lambda_l. \end{aligned}$$

But the quantity in brackets is exactly

$$2 - 2(u_l, u_l^h) = (u_l, u_l) - 2(u_l, u_l^h) + (u_l^h, u_l^h) = \|u_l - u_l^h\|_0^2.$$

With this identity, and the estimate already established for $\lambda_l^h - \lambda_l$, it
is necessary only to bound the error in the norm $\|u_l - u_l^h\|_0$; then the error
in energy will be an immediate consequence.

LEMMA 6.4
For all $j$ and $l$,

$$(\lambda_j^h - \lambda_l)(Pu_l, u_j^h) = \lambda_l(u_l - Pu_l, u_j^h). \tag{47}$$

Proof. Since the term $-\lambda_l(Pu_l, u_j^h)$ appears on both sides, we have only
to show that

$$\lambda_j^h(Pu_l, u_j^h) = \lambda_l(u_l, u_j^h).$$

Because $u_j^h$ and $u_l$ are eigenfunctions, these two sides can be rewritten as
$a(Pu_l, u_j^h)$ and $a(u_l, u_j^h)$, respectively. Now equality comes from the definition
(38) of the projection $P$.

The expression (47) resembles a truncation error in the eigenvalue equation;
it coincides with $a(Pu_l, u_j^h) - \lambda_l(Pu_l, u_j^h)$.

The set $u_1^h, \ldots, u_N^h$ forms an orthonormal basis for $S^h$, and in particular

$$Pu_l = \sum_{j=1}^N (Pu_l, u_j^h)\,u_j^h. \tag{48}$$

The identities (47) and (48) can be interpreted in the following way. First,
from (47) it is seen that the coefficient $(Pu_l, u_j^h)$ is small if $\lambda_j^h$ is not close to
$\lambda_l$. Then from (48) it follows that $Pu_l$ is close to $u_l^h$. This will be our strategy
for estimating $u_l^h - Pu_l$ (and hence $u_l^h - u_l$), but to make the process rigorous
it is convenient to consider separately the cases of distinct and repeated
eigenvalues.

If $\lambda_l$ is distinct from the other eigenvalues, then according to our eigenvalue
bounds (43) there is a separation constant $\rho$ such that for small $h$,

$$\frac{\lambda_l}{|\lambda_j^h - \lambda_l|} \le \rho \quad \text{for all } j \ne l. \tag{49}$$

Now come the computations. Writing $\beta$ for the key coefficient $(Pu_l, u_l^h)$ in
(48), the size of the other terms is given by

$$\begin{aligned} \|Pu_l - \beta u_l^h\|_0^2 &= \sum_{j \ne l} (Pu_l, u_j^h)^2 = \sum_{j \ne l} \Big(\frac{\lambda_l}{\lambda_j^h - \lambda_l}\Big)^2 (u_l - Pu_l, u_j^h)^2 \\ &\le \rho^2 \sum_{j \ne l} (u_l - Pu_l, u_j^h)^2 \le \rho^2\|u_l - Pu_l\|_0^2, \end{aligned} \tag{50}$$

since the square of the norm is the sum of the squares of the components.
Thus we have the crucial estimate

$$\|u_l - \beta u_l^h\|_0 \le \|u_l - Pu_l\|_0 + \|Pu_l - \beta u_l^h\|_0 \le (1 + \rho)\|u_l - Pu_l\|_0. \tag{51}$$

The eigenfunction error is now bounded in terms of the Ritz approximation
error $u_l - Pu_l$; it follows from (39) that

$$\|u_l - \beta u_l^h\|_0 \le C'(1 + \rho)[h^k + h^{2(k-m)}]\|u_l\|_k \le C''[h^k + h^{2(k-m)}]\lambda_l^{k/2m}. \tag{52}$$

This is the essential part of our theorem.

THEOREM 6.2
If $S^h$ is a finite element space of degree $k - 1$ and $\lambda_l$ is a distinct eigenvalue,
then for small $h$,

$$\|u_l - u_l^h\|_0 \le c[h^k + h^{2(k-m)}]\lambda_l^{k/2m}, \tag{53}$$

$$a(u_l - u_l^h, u_l - u_l^h) \le c'h^{2(k-m)}\lambda_l^{k/m}. \tag{54}$$

If $\lambda_l$ is a repeated eigenvalue, then the orthonormal eigenfunctions $u_j$ can be
chosen so that these estimates still hold. The estimates are the best possible
and are consistent with the special case (27) of linear elements.

Proof. The bound (53) is virtually the same as (52), which is already
proved. It remains only to show that the factor $\beta$ is close to 1, and this
comes from fiddling about with the triangle inequality:

$$\|u_l\|_0 - \|u_l - \beta u_l^h\|_0 \le \|\beta u_l^h\|_0 \le \|u_l\|_0 + \|u_l - \beta u_l^h\|_0.$$

Recalling that $u_l$ and $u_l^h$ are unit vectors, and choosing their sign so that
$\beta \ge 0$, this is the same as $|\beta - 1| \le \|u_l - \beta u_l^h\|_0$. Therefore,

$$\|u_l - u_l^h\|_0 \le \|u_l - \beta u_l^h\|_0 + \|(\beta - 1)u_l^h\|_0 \le 2\|u_l - \beta u_l^h\|_0.$$

Now the right side is given by (52), and the first statement of the theorem
is proved: $c = 2C''$. The error in energy (54) follows immediately from Lemma
6.3. This is the simplest argument we have seen for eigenfunctions.
The case of a repeated eigenvalue $\lambda_l = \lambda_{l+1} = \cdots = \lambda_{l+R}$ is more
awkward, but not essentially different. There is still a separation constant
$\rho$ between these eigenvalues and the approximations $\lambda_j^h$ to the others, as in
(49). The constant $\beta$ becomes a matrix of order $R + 1$:

$$\beta_{ir} = (Pu_{l+i}, u_{l+r}^h), \qquad 0 \le i, r \le R.$$

The long computation (50) is now carried out with all of the terms $j = l, \ldots,$
$l + R$ on the left side rather than the right and leads to

$$\Big\|u_{l+i} - \sum_r \beta_{ir}u_{l+r}^h\Big\|_0 \le C[h^k + h^{2(k-m)}]\lambda_l^{k/2m}.$$

Inverting $\beta$, this means that new eigenfunctions $U_{l+r}$ (linear combinations
of the old ones) can be chosen so that

$$\|U_{l+r} - u_{l+r}^h\|_0 \le c[h^k + h^{2(k-m)}]\lambda_l^{k/2m}.$$

Since the $u_{l+r}^h$ are known to be orthonormal, it is even possible to keep the
$U_{l+r}$ also orthonormal, without any harm to the estimate.

The same theorem and proof, with the inner product $(u, v)$ replaced
throughout by $b(u, v)$, apply to the generalized eigenvalue problem $Lu = \lambda Bu$.

6.4. COMPUTATIONAL TECHNIQUES

The Rayleigh-Ritz principle has led to the matrix eigenvalue problem
$KQ = \lambda MQ$, which now remains to be solved. This is not a trivial problem,
and it is scarcely discussed in a typical text on linear algebra. An efficient
algorithm should take advantage of the fact that $K$ and $M$ are symmetric
positive definite and also of the fact that they are sparse. The last property
would be lost, for example, if we were to factor $M$ into $LL^T$ by a Cholesky
elimination, and compute the eigenvalues of $L^{-1}K(L^{-1})^T$ by a standard algorithm
(we would choose QR, or a Givens method which begins by reduction
to a tridiagonal matrix, rather than the older Jacobi method). This loss of
sparseness will not be so serious for a small problem, which can be handled
within the core of the computer; but for a large system it is inefficient.

We propose to find the eigenvalues (or more precisely the first few eigenvalues,
since it would be useless to compute higher eigenvalues which have
no physical significance) directly from the equation $KQ = \lambda MQ$. We shall
reject lumping, since a diagonal $M$ is not enormously better than a banded
$M$.

There is also a technique of economization, which reduces the order of
the system by working only with a small number of master variables. The
dependence of the other slave variables is assumed a priori, and these degrees
of freedom are thereby eliminated [22]. Fried has described this idea in the
following terms: Placing a point load at a node $z_j$ corresponding to a master
variable, let $\Phi_j$ be the finite element solution in the static problem. Then
these functions $\Phi_j$ are a basis (not local, as the original $\varphi_j$ were) for the trial
space in the economized problem. These functions can be expected to represent
low-frequency modes fairly well, and those are the ones which are calculated.
Nevertheless, our impression is that as efficient algorithms are
constructed for the original problem, this economization will become less
necessary and less popular.

An authoritative reference for eigenvalue algorithms prior to about
1970 is [21]. Before describing a newer method, we want to review a very
widely used algorithm in this class: the inverse iteration or inverse power
method. In its simplest form, for the eigenvalue problem $Ax = \lambda x$, inverse
iteration proceeds by solving a linear system at each step: $Ay_{n+1} = x_n$.
Then the approximation to $\lambda$ is $\lambda_{n+1} = 1/\|y_{n+1}\|$, and the new approximation
to $x$ is the normalized vector $x_{n+1} = \lambda_{n+1}y_{n+1}$. These would be exact if $x_n$
were an eigenvector. If we imagine that the starting vector $x_0$ is expanded
in terms of the true eigenvectors $v_j$, $x_0 = \sum c_jv_j$, then the effect of $n$ inverse
iterations is to amplify each component by $(\lambda_j)^{-n}$; $x_n$ is proportional to
$\sum c_j(\lambda_j)^{-n}v_j$. If $\lambda_1$ is distinctly smaller than the other eigenvalues, then the
first component will become dominant, and $x_n$ will approach the unit eigenvector
$v_1$. The convergence is like that of a geometric series, depending on the
fraction $\lambda_1/\lambda_2$; the error $x_n - x$ is of order $(\lambda_1/\lambda_2)^n$. Obviously the method
becomes more effective when this ratio is small.
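
The steps of inverse iteration can be sketched in a few lines. This is a minimal illustration,
not a production routine: it assumes a small dense matrix, solves $Ay_{n+1} = x_n$ by a
hypothetical helper `solve` (Gauss elimination repeated at every step instead of storing the
triangular factors), and uses the matrix $\mathrm{tridiag}(-1, 2, -1)$ of order 5 as test data,
whose smallest eigenvalue is $2 - 2\cos(\pi/6)$.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def inverse_iteration(a, x0, steps):
    x = [t / norm(x0) for t in x0]
    lam = None
    for _ in range(steps):
        y = solve(a, x)              # A y_{n+1} = x_n
        lam = 1.0 / norm(y)          # lambda_{n+1} = 1 / ||y_{n+1}||
        x = [lam * t for t in y]     # x_{n+1} = lambda_{n+1} y_{n+1}
    return lam, x

n = 5
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
lam, x = inverse_iteration(A, [1.0] * n, 40)
exact = 2.0 - 2.0 * math.cos(math.pi / (n + 1))
print(lam, exact)
```

A practical code would factor $A$ once and reuse the factors in every step, which is what makes
the unshifted iteration cheap.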
One technique for reducing this ratio is to shift the origin, replacing $A$
at the $n$th step by $A - \lambda_nI$. This shifts all the eigenvalues of $A$ by the same
amount $\lambda_n$. If $\lambda_n$ is close to a true eigenvalue $\lambda$, so that $\lambda - \lambda_n$ is small, then
the corresponding component of the vector $y_{n+1}$ is amplified by the large
quantity $(\lambda - \lambda_n)^{-1}$. There is no numerical difficulty with this process, even
though $A - \lambda_nI$ is nearly singular. In fact, it is useful to have a better
approximation to the eigenvalue than $\lambda_n = 1/\|y_n\|$. For example, the Rayleigh
quotient $\lambda_n = (Ay_n, y_n)/(y_n, y_n)$ will be much more accurate. This quotient
is very flat in the neighborhood of a true eigenvalue (where it has a
stationary point), and the algorithm with these improved shifts possesses
cubic convergence: $\lambda_{n+1} - \lambda \sim (\lambda_n - \lambda)^3$. On the other hand, shifting $A$ to
$A - \lambda_nI$ means that Gauss elimination has to be redone at each iteration;
the triangular factors of $A$ cannot be stored and used over and over, as is
done with the simple iteration $Ay_{n+1} = x_n$.
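
The effect of Rayleigh quotient shifts is easy to observe on a small example. The following
sketch is illustrative only and rests on several assumptions: a small dense symmetric matrix,
a hypothetical dense solver in which a vanishing pivot is replaced by $10^{-14}$ so that the
nearly singular system can still be solved, and a stopping test on the residual
$\|Ax - \lambda x\|$ that is not part of the description above.

```python
import math

def solve(a, b):
    """Gauss elimination with partial pivoting; a pivot smaller than 1e-14
    is replaced by 1e-14 so a nearly singular matrix can still be used."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        if abs(m[k][k]) < 1e-14:
            m[k][k] = 1e-14
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def rayleigh_quotient_iteration(a, x0, steps=20, tol=1e-12):
    n = len(x0)
    x = [t / norm(x0) for t in x0]
    history = []
    for _ in range(steps):
        ax = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(ax[i] * x[i] for i in range(n))   # Rayleigh quotient, ||x|| = 1
        history.append(lam)
        if norm([ax[i] - lam * x[i] for i in range(n)]) < tol:
            break
        shifted = [[a[i][j] - (lam if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        y = solve(shifted, x)                       # (A - lam I) y = x
        x = [t / norm(y) for t in y]
    return lam, x, history

n = 5
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
lam, x, history = rayleigh_quotient_iteration(A, [1.0] * n)
print(history)   # successive shifts; the error is roughly cubed at each step
```

Each pass through the loop repeats the elimination on a new matrix, which is the price paid
for the faster convergence.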
In the generalized eigenvalue problem, the simplest idea would be to solve
$Ky_{n+1} = Mx_n$ at each stage and then obtain $x_{n+1}$ by normalizing $y_{n+1}$. In
this form, however, the iteration matrix $K^{-1}M$ will be nonsymmetric. Therefore,
if the Cholesky factorization of $K$ is $BB^T$, it is preferable to use
$B^{-1}M(B^T)^{-1}$ as the (symmetric) iterating matrix. This appears to involve the
nonsparse matrices we wished to avoid, but obviously the iteration does
not begin by multiplying these matrices. Instead the approximation $u_{n+1}$
is determined from $B^Tv_{n+1} = u_n$, $Bw_{n+1} = Mv_{n+1}$, $u_{n+1}$ = normalized $w_{n+1}$.
Convergence of the normalizing factors to $\lambda_1$, and of the $v_n$ (not $u_n$) to the
eigenvector $x_1$, again depends on the ratio $\lambda_2/\lambda_1$.
A very useful variant of inverse iteration is the block power method,
introduced by Bauer and improved by Rutishauser and by Peters and Wilkinson.†
The idea is to find several eigenvalues at once, by carrying $l$ approximate
eigenvectors in the iterations. (Obviously they must be coupled as the
algorithm proceeds, or the result would only be $l$ different approximations
to the same fundamental mode.) Convergence to $\lambda_i$ occurs at the rate $\lambda_i/\lambda_{l+1}$,
and repeated eigenvalues can be handled without difficulty.

†It is known in the engineering literature as subspace iteration.

Block iteration for $Ax = \lambda x$ proceeds as follows. Suppose the $l$ starting
vectors, assumed to be orthonormal, are the columns of an $N \times l$ matrix
$P_0$. The first step is to solve the $l$ equations $AZ_1 = P_0$. Then, rather than
normalizing separately each column of $Z_1$, we orthonormalize all $l$ columns.
For this we form the $l \times l$ matrix $Z_1^TZ_1$ and find its eigenvalues $\mu_i^{-2}$ (which
are the first approximation to $\lambda_1^{-2}, \ldots, \lambda_l^{-2}$) and corresponding eigenvectors
$w_i$. The new approximation $P_1$ is the product of $Z_1$ and the matrix $W$ with
columns $w_1\mu_1, \ldots, w_l\mu_l$. The columns of $P_1$ are orthonormal (they are the
approximate eigenvectors of $A$) since $P_1^TP_1 = W^TZ_1^TZ_1W = I$. At the next
step $AZ_2 = P_1$, and so on.
This algorithm would reproduce exactly the eigenvectors $v_1, \ldots, v_l$ of
$A$. More precisely, let each column of $P_0$ be a linear combination of $v_1, \ldots,$
$v_l$, say $P_0 = VQ$, where $Q$ is an $l \times l$ orthogonal matrix and $V$ is the $N \times l$
matrix whose columns are $v_1, \ldots, v_l$. We note that $V^TV = I$ and $AV = V\Lambda$,
if $\Lambda$ is the diagonal matrix whose entries are $\lambda_1, \ldots, \lambda_l$.
Thus the first block power step gives $Z_1 = A^{-1}P_0 = A^{-1}VQ = V\Lambda^{-1}Q$,
and $Z_1^TZ_1$ becomes $Q^T\Lambda^{-2}Q$. Since $Q$ is orthogonal, it follows that the
eigenvalues $\mu_i^{-2}$ of $Z_1^TZ_1$ are equal to the entries $\lambda_i^{-2}$ of the diagonal matrix
$\Lambda^{-2}$. In other words, if $P_0 = VQ$ then the first step of the method would
produce $P_1 = V$, and the correct eigenvalues.

The changes involved in working with $KQ = \lambda MQ$ rather than $Ax = \lambda x$
are described in the final paragraph of this chapter. The block power algorithm
as restated there is completely straightforward to program.
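
As an illustration of how little programming is involved, here is a sketch of the block
iteration for $Ax = \lambda x$ with $l = 2$. Everything beyond the steps described above is an
assumption made for the example: dense storage, a hypothetical helper `jacobi_eigh` that
diagonalizes the small matrix $Z^TZ$ by Jacobi rotations, the test matrix
$\mathrm{tridiag}(-1, 2, -1)$ of order 6, and the first two unit vectors as $P_0$.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def jacobi_eigh(s, sweeps=30):
    """Eigenvalues and eigenvectors (columns of w) of a small symmetric matrix."""
    n = len(s)
    a = [row[:] for row in s]
    w = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, sn = math.cos(theta), math.sin(theta)
                for k in range(n):   # columns p, q of a
                    a[k][p], a[k][q] = c * a[k][p] - sn * a[k][q], sn * a[k][p] + c * a[k][q]
                for k in range(n):   # rows p, q of a
                    a[p][k], a[q][k] = c * a[p][k] - sn * a[q][k], sn * a[p][k] + c * a[q][k]
                for k in range(n):   # accumulate the rotations in w
                    w[k][p], w[k][q] = c * w[k][p] - sn * w[k][q], sn * w[k][p] + c * w[k][q]
    return [a[i][i] for i in range(n)], w

def block_power(a, p, steps):
    n, l = len(a), len(p[0])
    for _ in range(steps):
        z = [solve(a, [p[i][j] for i in range(n)]) for j in range(l)]  # z[j] = column j of Z
        s = [[sum(z[i][k] * z[j][k] for k in range(n)) for j in range(l)]
             for i in range(l)]                                        # Z^T Z
        evals, w = jacobi_eigh(s)                                      # evals = mu_i^(-2)
        mu = [1.0 / math.sqrt(e) for e in evals]
        p = [[sum(z[r][i] * w[r][j] for r in range(l)) * mu[j] for j in range(l)]
             for i in range(n)]                                        # P = Z W
    return mu, p

n = 6
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
P0 = [[1.0 if i == j else 0.0 for j in range(2)] for i in range(n)]
mu, P = block_power(A, P0, 60)
exact = [2.0 - 2.0 * math.cos(k * math.pi / (n + 1)) for k in (1, 2)]
print(sorted(mu), exact)
```

The columns of the returned $P$ remain orthonormal at every step, so no separate
orthogonalization is needed.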
These inverse power methods are very simple and practical, especially
when only modest accuracy is required. However, there is a new technique,
based on a more subtle matrix theorem, which has also become a most
highly recommended algorithm for the band eigenvalue problem $KQ = \lambda MQ$.
It is due to Peters and Wilkinson [P1, P2], and depends on the following
lovely matrix theorem: The number of eigenvalues less than a given $\lambda_0$ can
be determined just by counting the number of negative pivots when Gauss
elimination is applied to $K - \lambda_0M$.
This suggests an algorithm based on bisection. Suppose it is determined of both rows and columns is again
that there are n 0 eigenvalues below the first guess l 0 • Then the Gaussian about by a permutation matrix B. TI
pivots for K (l 0 /2)M reveal the number n 1 which is below l 0 /2; and the and after such an exchange the Gaus
remaining n0 - n1 eigenvalues must líe between l 0 /2 and l 0 • Repeated bi- of the eigenvalues correctly.)
section will isolate any eigenvalue with tremendous accuracy, but the process The application to the generalize
requires a Gauss elimination at each step and is rather expensive. It must be lows. Let A= K 1 0 M, and count ·
speeded up by using the values of the. pivots (or their product, the determi- elimination. We claim that this equa
nant d(l) = det (K lM)) rather than just their signs. Clough, Bathe, and 1 0 in the given problem K Q = 1MQ
Parlett recommend an accelerated secant iteration

    $\lambda_{k+1} = \lambda_k - 2\,\dfrac{\lambda_k - \lambda_{k-1}}{d(\lambda_k) - d(\lambda_{k-1})}\,d(\lambda_k).$

This is a variant of the usual Newton's method, in which the derivative is
replaced by a difference quotient; the factor 2 is for acceleration. Once $\lambda_k$
is close to a true $\lambda$, the computer may cease further matrix factorizations, and
turn to ordinary inverse iteration. We can confirm that the Peters-Wilkinson
algorithm was very successful in the numerical experiments reported in Chapter 8.
We want to give the theoretical background for this algorithm, which
can be based on a classical theorem known as Sylvester's law of inertia: If
two real symmetric matrices $A$ and $D$ are related by a congruence transformation
$A = BDB^T$, where $B$ is any nonsingular matrix, then they have the same
number of negative eigenvalues, of positive eigenvalues, and of zero eigenvalues.
The proof is especially neat when $D$ is nonsingular; that is, it has no zero
eigenvalues. Let $B_\theta$ be a family of nonsingular matrices which changes gradually
from the identity matrix (at $\theta = 0$) to the matrix $B$ (at $\theta = 1$). [We
cannot be sure that the particular family $B_\theta = \theta B + (1 - \theta)I$ will be
nonsingular, but the construction of a suitable $B_\theta$ is never difficult.] The matrices
$B_\theta DB_\theta^T$ are always symmetric and always nonsingular. Therefore, their
eigenvalues $\lambda(\theta)$ are real, change gradually with $\theta$, and never cross zero.
Consequently, the number of eigenvalues on each side of zero must be the same at
$\theta = 1$ as at $\theta = 0$. In other words, $A$ and $D$ have the same number of negative
eigenvalues and of positive eigenvalues. If $D$ happens to be singular, we use
the same argument for the matrices $D \pm \epsilon I$ and prove the theorem in the
limit as $\epsilon \to 0$.
Now we apply this law of inertia to Gauss elimination. If the matrix $A$
is factored into $LDL^T$ (as in Section 1.5), where $L$ is lower triangular with
1's on the diagonal and $D$ is the diagonal matrix of pivots, then the signs
of the pivots determine the signs of the eigenvalues. (In case two rows are
exchanged during the elimination process, as will be required if one of the
early pivots happens to be zero, the corresponding two columns must be
exchanged to preserve the symmetry of the matrix. But such an exchange
of both rows and columns is again a congruence transformation, brought
about by a permutation matrix $B$. Therefore, the law of inertia still applies,
and after such an exchange the Gauss elimination pivots still give the signs
of the eigenvalues correctly.)
The application to the generalized eigenproblem $KQ = \lambda MQ$ is as follows.
Let $A = K - \lambda_0 M$, and count the number of negative pivots in Gauss
elimination. We claim that this equals the number $n_0$ of eigenvalues below
$\lambda_0$ in the given problem $KQ = \lambda MQ$.
Proof. Rewrite this given problem as $M^{-1/2}KM^{-1/2}(M^{1/2}Q) = \lambda(M^{1/2}Q)$,
so that $n_0$ is the number of eigenvalues of $M^{-1/2}KM^{-1/2}$ which lie below $\lambda_0$.
This is the number of negative eigenvalues of $M^{-1/2}KM^{-1/2} - \lambda_0 I$. But by
the law of inertia (again! Choose this last matrix as $D$, and $B = M^{1/2}$) this
coincides with the number of negative eigenvalues of $A = K - \lambda_0 M$. Therefore,
$n_0$ is easily determined by applying Gauss elimination to $A$. The eigenvectors
follow in one or at most two steps of inverse iteration with the matrix
$K - \lambda_{\rm computed}M$. A good initial guess for the eigenvector is described in
[P2].
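The counting argument can be tried on a small example. The following Python sketch is an illustration only, not the Peters-Wilkinson code: it eliminates without row exchanges, stores full matrices rather than bands, and the helper names (`count_negative_pivots`, `eigenvalues_below`, `bisect_eigenvalue`, `tridiagonal`) and the 5 by 5 test matrices are assumptions made for the example.

```python
def count_negative_pivots(A):
    """Count negative pivots in Gauss elimination without row exchanges."""
    n = len(A)
    U = [row[:] for row in A]              # work on a copy of A
    negatives = 0
    for k in range(n):
        pivot = U[k][k]
        if pivot == 0.0:
            # Assumption of this sketch: a zero pivot is not handled.
            raise ZeroDivisionError("zero pivot: perturb the shift slightly")
        if pivot < 0.0:
            negatives += 1
        for i in range(k + 1, n):
            factor = U[i][k] / pivot
            for j in range(k, n):
                U[i][j] -= factor * U[k][j]
    return negatives


def eigenvalues_below(K, M, shift):
    """Number of eigenvalues of K Q = lambda M Q lying below `shift`."""
    n = len(K)
    A = [[K[i][j] - shift * M[i][j] for j in range(n)] for i in range(n)]
    return count_negative_pivots(A)


def bisect_eigenvalue(K, M, index, lo, hi, tol=1e-10):
    """Bisection for the index-th smallest eigenvalue (1-based), assumed in (lo, hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if eigenvalues_below(K, M, mid) >= index:
            hi = mid                       # at least `index` eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def tridiagonal(n, diag, off):
    """Dense n-by-n tridiagonal Toeplitz matrix as a list of rows."""
    return [[diag if i == j else off if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


K = tridiagonal(5, 2.0, -1.0)              # hypothetical stiffness pattern
M = tridiagonal(5, 4.0 / 6.0, 1.0 / 6.0)   # hypothetical consistent-mass pattern
print(eigenvalues_below(K, M, 2.0))
print(bisect_eigenvalue(K, M, 1, 0.0, 10.0))
```

Each bisection step costs one full elimination, which is the expense the text refers to; a band solver and the determinant-based acceleration would replace the dense loops in practice.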
Experiments are now going on (see the supplementary bibliography)
to compare the performance of eigenvalue algorithms. Clough and Bathe
have found the determinant search techniques, of which the Peters-Wilkinson
method is one variant, to be most competitive when the bandwidth is not
large; about five triangular factorizations were required for each eigenvalue,
and the cost of every such iteration is proportional to the square of the
bandwidth. The most generally successful technique was the inverse block power
method, alias subspace iteration. They express the algorithm in the following
equivalent form: starting from an eigenvector guess $X_{n-1}$, which is a matrix
with $l$ orthonormal columns, solve $KY_n = MX_{n-1}$. Then solve the $l$-dimensional
eigenproblem

(55)    $(Y_n^TKY_n)\,Q = \nu\,(Y_n^TMY_n)\,Q.$

The $\nu_i$ are the approximate eigenvalues, and the new matrix $X_n$ of approximate
eigenvectors is the product of $Y_n$ with a square matrix of order $l$, formed from
the eigenvectors $Q$ of (55). Two computational problems have been considered
in detail by Bathe and Parlett: the solution of the small eigenproblem
(55), for which they use a Jacobi-type sweep since the matrices become near
to diagonal, and the selection of the starting $X_0$. There is also a compromise
to be made on the choice of $l$; with a large $l$ there are few iterations, but each
one is expensive. They chose $l = \min(2p, p + 8)$ in computing the first
$p$ eigenvalues, and found that eight iteration steps gave excellent results.
This technique is effective even on problems too large to be handled in core,
and it may well become the accepted algorithm for finite element eigenvalue
problems.
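As a rough illustration of the iteration just described, here is a NumPy sketch. It is not the Bathe-Parlett program: the small eigenproblem is taken to be the projected (Rayleigh-Ritz) problem for $Y_n^TKY_n$ and $Y_n^TMY_n$ and is solved by a Cholesky reduction plus `numpy.linalg.eigh` instead of Jacobi sweeps, the starting matrix is random, and the function name `subspace_iteration` and the 20 by 20 test matrices are assumptions of the example. The columns of $X_n$ come out orthonormal in the $M$ inner product.

```python
import numpy as np


def subspace_iteration(K, M, l, steps=8, seed=0):
    """Inverse block power (subspace) iteration for K Q = lambda M Q."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((K.shape[0], l))     # random starting block X_0
    for _ in range(steps):
        Y = np.linalg.solve(K, M @ X)            # K Y_n = M X_{n-1}
        Kr = Y.T @ K @ Y                         # projected stiffness, l by l
        Mr = Y.T @ M @ Y                         # projected mass, l by l
        # Reduce Kr q = nu Mr q to a standard symmetric problem: Mr = C C^T.
        C = np.linalg.cholesky(Mr)
        Cinv = np.linalg.inv(C)
        nu, Z = np.linalg.eigh(Cinv @ Kr @ Cinv.T)
        X = Y @ (Cinv.T @ Z)                     # new block, X^T M X = I
    return nu, X


# Hypothetical test problem: tridiagonal stiffness and consistent mass patterns.
N = 20
K = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
M = (4.0 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)) / 6.0

p = 2                        # number of eigenvalues wanted
l = min(2 * p, p + 8)        # block size quoted in the text
nu, X = subspace_iteration(K, M, l)
print(nu[:p])
```

Only the first $p$ of the $l$ computed values are meant to be trusted; the extra columns are there to speed up convergence of the ones that are wanted.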

7 INITIAL-VALUE PROBLEMS

7.1. THE GALERKIN-CRANK-NICOLSON METHOD
FOR THE HEAT EQUATION

So far we have discussed only steady-state problems: elliptic equations
and eigenvalue problems. Galerkin's principle is flexible enough to apply
also to initial-value problems, and in this chapter we want to consider finite
element approximations. They still have important advantages over finite
differences, for general geometries and for problems which evolve comparatively
slowly in time (low Reynolds numbers, in the case of fluid flow). For
high-speed wave problems, we shall point out some disadvantages.
The main theorem for finite differences is well known: A formally consistent
method is convergent if and only if it is stable. This applies verbatim
to Galerkin approximations. In fact, the only difference is that stability and
consistency are often easier to verify for finite elements than for finite
differences. We show below how the Galerkin principle imitates the stability of
the differential equation, and we repeat now the disguise which is assumed
by consistency: The Galerkin method is consistent if the subspaces $S^h$ become
dense in the admissible space. This means that every admissible $v$ is approximated
more and more closely, as $h \to 0$, by trial functions $v^h$. The approximation
theorems of Chapter 3 establish exactly this property, and they give
even the degree of approximation, which translates into the degree of consistency,
or the order of accuracy, of the Galerkin equations.
A natural setting in which to illustrate these ideas is provided by the heat
equation

(1)    $\dfrac{\partial u}{\partial t} - \dfrac{\partial^2 u}{\partial x^2} = f(x, t), \qquad 0 < x < \pi, \quad t > 0.$

This parabolic differential equation represents heat conduction in a rod;
$u = u(x, t)$ is the temperature at the point $x$ and time $t > 0$, and $f(x, t)$
is a heat-source term. As in our previous examples, we impose a Dirichlet
boundary condition at $x = 0$ and a natural condition at $x = \pi$:

(2)    $u(0, t) = \dfrac{\partial u}{\partial x}(\pi, t) = 0.$

The first condition means physically that the temperature at the left end of
the rod is held at 0. The Neumann condition, on the other hand, means that
the right end of the rod is insulated; there is no temperature gradient across
$x = \pi$. To complete the statement of the problem we specify the initial
temperature

(3)    $u(x, 0) = u_0(x), \qquad 0 < x < \pi.$

This classical formulation of heat conduction is fraught with difficulties.
For example, (2) and (3) are contradictory if $u_0$ fails to vanish at $x = 0$, or
if $\partial u_0/\partial x$ does not vanish at $x = \pi$. In addition, $f$ could be a point source
which is singular at some $x_0$, and so (1) is not literally true at that point. In
all these cases the underlying physical problem still makes sense: to find
the temperature distribution associated with a given initial temperature $u_0$
and heat source $f$. We therefore seek an alternative integral formulation as
in the steady-state case.
Since there is no natural minimum energy principle (the equation is
not self-adjoint) we turn to the weak formulation: At each time $t > 0$,

(4)    $\int_0^\pi (u_t - u_{xx} - f)\,v\,dx = 0.$

In the steady state, with $u_t = 0$ and $f = f(x)$, this coincides with the earlier
Galerkin formulation.
To achieve greater symmetry between trial function and test function,
we integrate $-u_{xx}v$ by parts. The integrated term $u_xv$ vanishes naturally at
$x = \pi$, and we are led back to the essential condition $v(0) = 0$, in other
words to the space $\mathcal{H}_E^1$:

(5)    $\int_0^\pi (u_tv + u_xv_x - fv)\,dx = 0$    for all $v$ in $\mathcal{H}_E^1$.

This is the starting point for the finite element approximation. Given
an $N$-dimensional subspace $S^h$ of $\mathcal{H}_E^1$, the Galerkin principle is to find a function
$u^h(x, t)$ with the following property: At each $t > 0$, $u^h$ lies in $S^h$ and satisfies

(6)    $\int_0^\pi (u_t^hv^h + u_x^hv_x^h - fv^h)\,dx = 0$    for all $v^h$ in $S^h$.

Notice that the time variable is still continuous: the Galerkin (or, more
correctly, Faedo-Galerkin) formulation is discrete in the space variables and
yields a system of ordinary differential equations in time. It is these equations
which have to be solved numerically. To make this formulation operational,
choose a basis $\varphi_1, \ldots, \varphi_N$ for the trial space $S^h$ and expand the unknown
solution as

    $u^h(x, t) = \sum_{j=1}^N Q_j(t)\,\varphi_j(x).$

The optimal weights $Q_j$ are determined by the Galerkin principle (6):

(7)    $\int_0^\pi \Big( \sum_j \dfrac{dQ_j}{dt}\,\varphi_j\varphi_k + \sum_j Q_j\,\dfrac{d\varphi_j}{dx}\dfrac{d\varphi_k}{dx} - f\varphi_k \Big)\,dx = 0, \qquad k = 1, \ldots, N.$

Since every $v^h$ is a combination of these basis functions $\varphi_k$, it is enough to
apply the principle only to the basis functions. The result is a system of $N$
ordinary differential equations for the $N$ unknowns $Q_1(t), \ldots, Q_N(t)$, and
the boundary conditions are already incorporated in these equations. The
initial condition $u = u_0$ is still to be accounted for, and here there are several
possibilities. Mathematically, a natural choice of approximate initial condition
$u_0^h$ is the best least-squares approximation to $u_0$; $u_0^h$ is in $S^h$ and satisfies
$(u_0^h, v^h) = (u_0, v^h)$ for all $v^h$, or in other words,

    $\sum_j Q_j(0) \int_0^\pi \varphi_j\varphi_k\,dx = \int_0^\pi u_0\varphi_k\,dx, \qquad k = 1, \ldots, N.$

In practice, this means that the integrals on the right have to be computed,
and the mass matrix on the left has to be inverted. Therefore, it is often more
efficient to use the interpolate of $u_0$ as initial condition in the finite element
system: $u_0^h = (u_0)_I$. The order of accuracy is not affected.
Now we put the Galerkin equations (7) into vector notation, with $Q =
(Q_1, \ldots, Q_N)$. The striking point is that they involve exactly the same matrices
and load vector which appear in the steady-state problem; the coefficient of
$Q'$ is the mass matrix $M$, and the coefficient of $Q$ is the stiffness matrix:

(8)    $MQ' + KQ = F(t).$

The components of the right-hand side are $F_k = \int f(x, t)\,\varphi_k(x)\,dx$. If there
were inhomogeneous boundary conditions, time-dependent or not, their
effect would also appear in $F$.
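For concreteness, the matrices in (8) can be written down for the simplest trial space. The sketch below assumes piecewise linear "hat" functions on a uniform mesh of $N$ intervals over $(0, \pi)$, with the essential condition $u(0) = 0$ removing the first node and the natural condition at $x = \pi$ requiring nothing; the function name `assemble_heat_matrices` and the dense list-of-rows storage are choices made for this illustration, not part of the text.

```python
import math


def assemble_heat_matrices(N):
    """Mass and stiffness matrices for linear elements on (0, pi), u(0) = 0."""
    h = math.pi / N
    M = [[0.0] * N for _ in range(N)]
    K = [[0.0] * N for _ in range(N)]
    m_loc = [[h / 3, h / 6], [h / 6, h / 3]]            # integrals of phi_a * phi_b
    k_loc = [[1 / h, -1 / h], [-1 / h, 1 / h]]          # integrals of phi_a' * phi_b'
    for e in range(N):                                  # element [x_e, x_{e+1}]
        nodes = (e, e + 1)
        for a in range(2):
            for b in range(2):
                i, j = nodes[a] - 1, nodes[b] - 1       # unknown index = node - 1
                if i >= 0 and j >= 0:                   # node 0 is held at zero
                    M[i][j] += m_loc[a][b]
                    K[i][j] += k_loc[a][b]
    return M, K, h


M, K, h = assemble_heat_matrices(8)
print(M[0][:3], K[0][:3])      # first rows: (4h/6, h/6, 0) and (2/h, -1/h, 0)
print(M[-1][-1], K[-1][-1])    # last diagonal entries: h/3 and 1/h (half a hat function)
```

The last row differs from the interior rows because the basis function at $x = \pi$ lives on a single interval; that is how the natural boundary condition enters without being imposed.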
It is natural to ask why finite elements are not used also in the time direction.
This has certainly been attempted, but not with great success. Mathematically,
it is perfectly reasonable to study the discretization in two steps:
first to analyze the finite element error $u(x, t) - u^h(x, t)$, and then the error
incurred in solving the ordinary differential equations for $u^h$. In the time variable
there are no geometrical difficulties to overcome by finite elements, and
in fact a straightforward application of the Galerkin principle may couple all
the time levels, and destroy the crucial property of propagation forward in
time. We see no reason to forego the extra flexibility of finite differences.
First we must mention the technique of mode superposition, a competitor
of conventional finite differences. The idea is simply to analyze the initial $u_0$
and the forcing term $f$ into the natural modes of the problem, the eigenfunctions
$u_j^h$ of Chapter 6. All the computational work is transferred to the
eigenproblem. Only the lower eigenvalues are followed forward in time;
Nickell suggests that fewer than 30 modes out of 1000 can produce very good
results, unless $f$ is unusually rich in the higher harmonics.
A conventional difference scheme in time, which couples all the modes,
has to contend with the extreme stiffness of the equation (8); the condition
number of $M^{-1}K$ can easily exceed 1000, so that the modes are decaying at
radically different rates. A "trapezoidal rule" scheme (Crank-Nicolson,
Newmark $\beta$) automatically filters out the useless high modes, if $\Delta t$ is properly
chosen. Of course these schemes are implicit, but so is Galerkin's differential
equation; $M$ cannot be inverted in (8) without destroying the band structure.
(The alternative of lumping is discussed at the end of the chapter.) In one
respect an implicit equation is not such a serious drawback for parabolic equations,
such as the heat equation, since the stability requirements on explicit
difference schemes are in any case severe: the time step must be limited to
$\Delta t < Ch^2$ or there will be exponential instabilities in the difference equation.
By contrast, an implicit scheme can be unconditionally stable; the size of $\Delta t$
is restricted only by the demands of accuracy, and not by stability. This distinction
between implicit and explicit methods is natural for the heat equation,
where there is an infinite speed of propagation: the temperature $Q^{n+1}$ at
any point $x_0$ depends on the previous temperature $Q^n$ at all points of the
medium, however small the time step may be. This dependence is reflected in
the fact that $M^{-1}$ is not sparse in the Galerkin equation, and that $\Delta t \sim h^2$
is required for stability in an explicit difference equation.
We shall analyze the Crank-Nicolson scheme, which is centered at
$(n + \tfrac{1}{2})\Delta t$ and therefore achieves second-order accuracy in time:

(9)    $M\,\dfrac{Q^{n+1} - Q^n}{\Delta t} + K\,\dfrac{Q^{n+1} + Q^n}{2} = \dfrac{F^{n+1} + F^n}{2}.$

Rewritten, the approximation $Q^{n+1}$ is determined by

    $\Big(M + \dfrac{K\Delta t}{2}\Big)Q^{n+1} = \Big(M - \dfrac{K\Delta t}{2}\Big)Q^n + \dfrac{\Delta t}{2}\,(F^{n+1} + F^n).$

In an actual computation, the matrix on the left can be factored by Gauss
elimination into $LL^T$, where $L$ is Cholesky's lower triangular matrix, and then
$Q^{n+1}$ would be computed at each step by two back substitutions:

    $LQ^{n+1/2} = \Big(M - \dfrac{K\Delta t}{2}\Big)Q^n + \dfrac{\Delta t}{2}\,(F^{n+1} + F^n), \qquad L^TQ^{n+1} = Q^{n+1/2}.$

If the coefficients in the problem are time-dependent (or nonlinear), then in
the strict Galerkin theory the matrices $M$ and $K$ must be recomputed at each
time step. It seems very likely that another variational crime will be found
necessary, leading to some hybrid of finite elements and finite differences, in
order to produce a stiffness matrix which is approximately correct without
recomputing every integral. In large problems it may also be too expensive
to solve exactly for $Q^{n+1}$; an iteration (possibly starting with $Q^n$ as initial
guess) may be more efficient. Douglas and Dupont [D6, D9] have introduced
several iteration techniques for nonlinear problems in order to circumvent
the solution of a large nonlinear system at every time step. Their analysis
completely justifies these modifications of the pure Galerkin method.
The two remaining sections are occupied with verifying the stability and
the expected rates of convergence for parabolic and hyperbolic equations.
For the simple equation $u_t = u_x$ in the last section, something unexpected
finally occurs.
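The Crank-Nicolson step (9) can be sketched for the heat equation with $f = 0$. The code below is an illustration under stated assumptions: linear elements on a uniform mesh over $(0, \pi)$ with $u(0) = 0$ and a natural condition at $x = \pi$, the interpolate of $u_0$ as initial condition, and a tridiagonal $LU$ factorization (computed once and reused) standing in for the Cholesky factors of the text. The names `tridiag_factor`, `tridiag_solve`, and `crank_nicolson_heat` are invented for the example.

```python
import math


def tridiag_factor(sub, diag, sup):
    """LU factors of a tridiagonal matrix, no pivoting (adequate for M + K dt/2)."""
    n = len(diag)
    piv = diag[:]
    mult = [0.0] * n
    for i in range(1, n):
        mult[i] = sub[i] / piv[i - 1]
        piv[i] -= mult[i] * sup[i - 1]
    return mult, piv


def tridiag_solve(mult, piv, sup, rhs):
    """Forward and back substitution with the factors from tridiag_factor."""
    n = len(piv)
    y = rhs[:]
    for i in range(1, n):
        y[i] -= mult[i] * y[i - 1]
    x = [0.0] * n
    x[-1] = y[-1] / piv[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - sup[i] * x[i + 1]) / piv[i]
    return x


def crank_nicolson_heat(N, dt, steps, u0):
    """Advance (M + K dt/2) Q^{n+1} = (M - K dt/2) Q^n with f = 0."""
    h = math.pi / N
    m_diag = [2 * h / 3] * (N - 1) + [h / 3]      # mass matrix diagonal
    k_diag = [2 / h] * (N - 1) + [1 / h]          # stiffness matrix diagonal
    m_off, k_off = h / 6, -1 / h                  # off-diagonal entries
    a_diag = [m + 0.5 * dt * k for m, k in zip(m_diag, k_diag)]
    b_diag = [m - 0.5 * dt * k for m, k in zip(m_diag, k_diag)]
    a_off = m_off + 0.5 * dt * k_off
    b_off = m_off - 0.5 * dt * k_off
    sub = [0.0] + [a_off] * (N - 1)
    sup = [a_off] * (N - 1) + [0.0]
    mult, piv = tridiag_factor(sub, a_diag, sup)  # factor the left matrix once
    Q = [u0((j + 1) * h) for j in range(N)]       # interpolate of u0 at nodes 1..N
    for _ in range(steps):
        rhs = []
        for i in range(N):
            r = b_diag[i] * Q[i]
            if i > 0:
                r += b_off * Q[i - 1]
            if i < N - 1:
                r += b_off * Q[i + 1]
            rhs.append(r)
        Q = tridiag_solve(mult, piv, sup, rhs)
    return Q


# u0 = sin(x/2) is an eigenfunction, so the true solution is exp(-t/4) sin(x/2).
Q = crank_nicolson_heat(N=32, dt=0.05, steps=20, u0=lambda x: math.sin(x / 2))
print(Q[-1], math.exp(-0.25))
```

Because the left-hand matrix does not change from step to step in the stationary case, the factorization is done once outside the time loop; only the two substitutions are repeated.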

7.2. STABILITY AND CONVERGENCE IN
PARABOLIC PROBLEMS

There are two approaches to the theory. One is more explicit and revealing;
the other may be more general. In the first approach each eigenfunction is
followed forward in time in all three equations: the partial differential
equation for $u$, the ordinary differential equation for $u^h$, and the finite difference
equation for $Q^n$. If the coefficients and boundary conditions in the
equations are independent of time (the stationary case), then this simple
approach is extremely successful; the previous chapter gave precise bounds
on the eigenfunction errors, and the arguments become completely elementary.†
In the nonstationary case the analysis is much more technical,
but parabolic equations are so strongly dissipative that time-dependent (and
even nonlinear) effects can ultimately be accounted for. In the second approach,
which is based on energy inequalities at each time, this accounting
becomes comparatively simple.

†This is also the approach to use in analyzing mode superposition.

We consider a parabolic equation $u_t + Lu = f$, where $L$ is exactly the
kind of elliptic operator studied in the previous chapters. It is of order $2m$
($m = 1$ is by far the most common) and its coefficients may depend on the
position vector $x$. Suppose first that $f = 0$, and that the initial function $u_0$
is expanded into the orthonormal eigenfunctions:

    $u_0(x) = \sum_{j=1}^\infty c_j\,u_j(x).$

Each eigenfunction decays at its own rate in time, and the solution evolves
according to

(10)    $u(x, t) = \sum_{j=1}^\infty c_j\,e^{-\lambda_jt}\,u_j(x).$

For $t > 0$, this solution lies in the admissible space $\mathcal{H}_E^1$. Even if the initial
function $u_0$ is discontinuous, it is easy to see that as time goes on $u$ becomes
increasingly smooth; the derivatives at any positive time satisfy

    $\Big\| \dfrac{\partial^k u}{\partial t^k} \Big\|^2 = \sum_{j=1}^\infty c_j^2\,\lambda_j^{2k}\,e^{-2\lambda_jt} < \infty.$

The exponential term makes this sum finite, and it does the same for the spatial
derivatives. These norms are monotonically decreasing in $t$, in particular

    $\|u(t)\| \le \|u_0\| \exp(-\lambda_1t).$

The fundamental frequency $\lambda_1 > 0$ gives the rate of decay of the solution.
The Galerkin equation $MQ' + KQ = 0$ is actually a little more stable
than the equation it approximates. Suppose the initial $Q_0$ is expanded in
terms of the discrete eigenvectors $Q_j$ of $M^{-1}K$, or, equivalently, the initial
$u_0^h$ is expanded in terms of the approximate eigenfunctions $u_j^h$:

    $Q_0 = \sum_{j=1}^N d_j\,Q_j, \qquad u_0^h = \sum_{j=1}^N d_j\,u_j^h.$

Then the solution at a later time $t$ is

    $u^h(x, t) = \sum_{j=1}^N d_j\,e^{-\lambda_j^ht}\,u_j^h(x).$

Therefore, the rate of decay is $\lambda_1^h$, which is slightly larger than $\lambda_1$:

    $\|u^h(t)\| \le \|u_0^h\| \exp(-\lambda_1^ht).$

Finally, the Crank-Nicolson scheme is also stable, with no restriction
on the size of $\Delta t$. The difference operator at each step has the same eigenvectors
as $M^{-1}K$, since it is given by

    $\Big(M + \dfrac{K\Delta t}{2}\Big)Q^{n+1} = \Big(M - \dfrac{K\Delta t}{2}\Big)Q^n$
or

    $Q^{n+1} = \Big(M + \dfrac{K\Delta t}{2}\Big)^{-1}\Big(M - \dfrac{K\Delta t}{2}\Big)Q^n.$

The solution after $n$ steps is $Q^n = \sum d_j(\mu_j^h)^nQ_j$, where the amplification
factor $\mu_j^h$ is

(12)    $\mu_j^h = \dfrac{1 - \lambda_j^h\Delta t/2}{1 + \lambda_j^h\Delta t/2}.$

Since every $\lambda_j^h$ is nonnegative, it follows that every $|\mu_j^h| \le 1$, and therefore
the Crank-Nicolson scheme is automatically stable. The decay of the principal
eigenfunction $u_1^h$ is governed by its amplification factor $\mu_1^h$, which is to
be compared with the true amplification (or contraction) factor $\exp(-\lambda_1^h\Delta t)$
of the Galerkin equation over a single time step:

    $\mu_1^h \sim 1 - \lambda_1^h\Delta t + \tfrac{1}{2}(\lambda_1^h\Delta t)^2 - \tfrac{1}{4}(\lambda_1^h\Delta t)^3 + \cdots,$

    $e^{-\lambda_1^h\Delta t} \sim 1 - \lambda_1^h\Delta t + \tfrac{1}{2}(\lambda_1^h\Delta t)^2 - \tfrac{1}{6}(\lambda_1^h\Delta t)^3 + \cdots.$

Since $\mu_1^h$ is the smaller, this component of the solution decays slightly faster
in the finite difference equation. The discrepancy is of order $\Delta t^3$, reflecting
the second-order accuracy of the Crank-Nicolson scheme.
In the finite difference case, however, the high-frequency components do
not decay at faster and faster rates. As $\lambda_j^h \to \infty$, the amplification factor $\mu_j^h$
converges to $-1$, and the weights attached to the high frequencies change
sign at each time step. This does not occur in the Galerkin equation or in
a fully implicit difference scheme. The latter would have $\mu_j^h = (1 + \lambda_j^h\Delta t)^{-1}$,
and obviously $\mu_j^h \to 0$ as $\lambda_j^h \to \infty$. In the Crank-Nicolson case, however,
$\mu_j^h$ will become negative and start to increase in magnitude at the frequency
$\lambda_j^h = 2/\Delta t$. The highest frequency the mesh can hold is $\lambda_N^h \sim ch^{-2m}$, which
normally exceeds $2/\Delta t$. Therefore the very high frequencies (presumably
present only in small amounts) are actually damped less strongly than the
moderate frequencies. If this should present any difficulty, then as in hyperbolic
problems it is possible to add a simple dissipation term.
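These statements about the amplification factors are easy to check numerically. The short Python sketch below evaluates formula (12) and the fully implicit factor $(1 + \lambda\Delta t)^{-1}$ for a few frequencies; the function names and the sample values of $\lambda$ and $\Delta t$ are arbitrary choices for illustration.

```python
import math


def mu_crank_nicolson(lam, dt):
    """Amplification factor (12) of the Crank-Nicolson scheme."""
    return (1 - 0.5 * lam * dt) / (1 + 0.5 * lam * dt)


def mu_fully_implicit(lam, dt):
    """Amplification factor of the fully implicit (backward difference) scheme."""
    return 1 / (1 + lam * dt)


dt = 0.1
for lam in (0.25, 1.0, 2 / dt, 200.0, 1e4):
    print(f"lambda={lam:9.2f}  exact={math.exp(-lam * dt):.6f}  "
          f"CN={mu_crank_nicolson(lam, dt):+.6f}  "
          f"implicit={mu_fully_implicit(lam, dt):.6f}")
```

The Crank-Nicolson column passes through zero at $\lambda = 2/\Delta t$ and then approaches $-1$, while the fully implicit column keeps decreasing toward zero.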
The rate of convergence is easy to determine from the eigenfunction
expansions

(13)    $u(x, t) = \sum_{j=1}^\infty c_j\,e^{-\lambda_jt}\,u_j(x), \qquad u^h(x, t) = \sum_{j=1}^N d_j\,e^{-\lambda_j^ht}\,u_j^h(x).$

There are two sources of error: the initial error and the evolution error.
Whether $u_0^h$ is computed as a best approximation to the true $u_0$ or simply
chosen to be its interpolate, the initial error $u_0 - u_0^h$ will be of order $h^k$, and
like any other initial condition it will decay with time like $e^{-\lambda_1t}$. The remaining
error arises when $u_0^h$ is used as initial condition in both equations. In the true
equation it is expanded into the eigenfunctions $u_j$, and in Galerkin's equation
it is expanded into the $u_j^h$, with a different evolution in time. We know from
the previous chapter that

[In case $k < 2m$, which is not usual in practice, $h^k$ should be replaced by
$h^{2(k-m)}$.] These estimates show that the difference in the weights is only
$c_j - d_j = \int u_0^h(u_j - u_j^h)\,dx \sim h^k$. Therefore, comparing $u$ and $u^h$ in (13),
the evolution error is also of order $h^k$. The error in the derivatives will be of
the usual order $h^{k-s}$, again decaying at the rate $e^{-\lambda_1t}$.
We emphasize the simplicity of this technique for deriving error bounds.
It would lead equally easily to the $O(\Delta t^2)$ errors in the Crank-Nicolson
process. The simplicity would seem to require that the spatial part of the
problem be self-adjoint, in order to apply the eigenvalue and eigenfunction
estimates of the previous chapter, but this hypothesis is actually inessential.
In fact, there is a simple formula which bypasses the eigenvalue theory entirely
and relates the evolution error directly to the basic estimates for steady-state
problems. With the same initial function $u_0^h$ in both equations, the solutions
differ at time $t$ by

(14)  $u(t) - u^h(t) = \frac{1}{2\pi i} \int_C e^{zt} (u_z - u_z^h)\, dz.$

Here $z$ is a complex number, $u_z$ solves the (non-self-adjoint) steady-state
problem $(L + z)u_z = u_0^h$, and $u_z^h$ is its Galerkin approximation. [In effect,
$u_z$ and $u_z^h$ are the Laplace transforms of $u(t)$ and $u^h(t)$, respectively, and the
integral (14) inverts the Laplace transform; the contour $C$ runs along two
rays $\arg z = \pm(\pi/2 + \epsilon)$ in the left half-plane, so that the exponential $e^{zt}$
produces a convergent integral.] With this formula, which reduces to the
eigenfunction expansions in the self-adjoint, discrete spectrum case, the
evolution error at time $t$ follows directly from the steady-state errors of
Theorem 2.1. The result is the expected order $h^k$, even for non-self-adjoint
equations.
To complete the discussion of this first approach, we note that Duhamel's
principle places the inhomogeneous case $f \neq 0$ into the same framework.
According to this principle, the source term $f$ entering at any time $\tau$ acts
like an initial condition over the remaining time $t - \tau$. If the solution $u$ is
originally connected to $u_0$ by some rule $u(t) = E(t)u_0$, then in the nonhomogeneous
case

$u(t) = E(t)u_0 + \int_0^t E(t - \tau) f(\tau)\, d\tau.$

In the eigenfunction expansions, this becomes

$u(t) = \sum_j \left[ c_j e^{-\lambda_j t} + \int_0^t e^{-\lambda_j (t - \tau)} f_j(\tau)\, d\tau \right] u_j, \qquad f_j(\tau) = \int_\Omega f(x, \tau)\, u_j\, dx.$
The error $u - u^h$ will still be of order $h^k$, but if the source $f$ acts at all times,
then the decay factors $e^{-\lambda t}$ will no longer apply; there will be errors committed
at times $\tau$ so near to $t$ that their decay has not begun.
We try now a different idea, the second approach to parabolic problems
mentioned at the beginning of the section: to estimate at each time $t$ the
rate of change of the error $u - u^h$. Rather than connecting $u$ and $u^h$ by eigenfunction
expansions back to their data at $t = 0$, the error at time $t + dt$
(or $t + \Delta t$, in case of the difference equation) is determined from the error
at time $t$. This technique has been pioneered by Douglas and Dupont [D6, D9],
following early papers by Swartz and Wendroff [S11] and Price and Varga
[P11]; Wheeler, Dendy, and others have made more recent contributions.
Suppose that the differential equation and its Galerkin approximation are
written in variational form at each time $t$,

(15)  $(u_t, v) + a(u, v) = (f, v)$ for all $v$ in $\mathcal{H}^1$,
(16)  $(u_t^h, v^h) + a(u^h, v^h) = (f, v^h)$ for all $v^h$ in $S^h$.
In the stationary case, the energy inner product $a(v, w)$ is independent of
time; it arises from $(Lv, w)$. We retain the notation $P$ for the Ritz projection
of the admissible space $\mathcal{H}^1$ onto its subspace $S^h$, defined as in (6.38) by

$a(u - Pu, v^h) = 0$ for all $v^h$.

Splitting the Galerkin error into $u - u^h = (u - Pu) + (Pu - u^h)$, the
size of $u - Pu$ is known from approximation theory (6.39), and the energy
inequalities required for the $Pu - u^h$ term are consequences of the following
identity.
LEMMA 7.1
If $e(t) = Pu(t) - u^h(t)$, then

(17)  $(e_t, e) + a(e, e) = (Pu_t - u_t, e).$
Proof. Since $e$ lies in $S^h$, we may put $v = e$ in (15), $v^h = e$ in (16), and
subtract:

$(u_t - u_t^h, e) + a(u - u^h, e) = 0.$

The second term is just $a(e, e)$, after applying the identity $a(u - Pu, e) = 0$.
Rearranging, we have

$(Pu_t - u_t^h, e) + a(e, e) = (Pu_t - u_t, e).$

It remains to identify the first term as the inner product $(e_t, e)$, in other words
to know that $Pu_t$ is the same as $(Pu)_t$: The Ritz projection $P$ commutes with
the differentiation $\partial/\partial t$. This is true only in the stationary case. If the energy
inner product were to depend on $t$, the term $(Pu_t - (Pu)_t, e)$ would also appear
in the identity and would have to be estimated. This difficulty is only technical
(we omit the details) if the operator $L$ depends smoothly on time. In the stationary
case it is obvious that $P$ is independent of time: Differentiating the
identity $a(u - Pu, v^h) = 0$, we have $a(u_t - (Pu)_t, v^h) = 0$ for all $v^h$, so that
$(Pu)_t$ must coincide with $Pu_t$. This completes the proof of the lemma.
From this identity the rate of change of the error is easy to find. The first
term in (17) can be rewritten as

$(e_t, e) = \frac{1}{2} \frac{d}{dt}(e, e) = \| e \|_0 \frac{d}{dt} \| e \|_0.$

The term $a(e, e)$ is at least as large as $\lambda_1 \| e \|_0^2$, since $\lambda_1$ is the minimum of the
Rayleigh quotient. Finally, the right side of the identity is bounded by
$\| u_t - Pu_t \|_0 \| e \|_0$. Cancelling the common factor $\| e \|_0$, the identity leads to

(18)  $\frac{d}{dt} \| e \|_0 + \lambda_1 \| e \|_0 \le \| u_t - Pu_t \|_0.$

Multiplying by $e^{\lambda_1 \tau}$, and integrating with respect to $\tau$ from $0$ to $t$,

(19)  $\| e(t) \|_0 \le e^{-\lambda_1 t} \| e(0) \|_0 + \int_0^t e^{\lambda_1 (\tau - t)} \| u_t(\tau) - Pu_t(\tau) \|_0\, d\tau.$

The initial error $e(0) = Pu_0 - u_0^h$ is of order $C h^k \| u_0 \|_k$, whether $u_0^h$ is computed
by interpolation or least-squares approximation of $u_0$. The main
theorem, giving the correct order of convergence although not necessarily
the most precise estimates, follows immediately from (19).
THEOREM 7.1
Suppose that $S^h$ is a finite element space of degree $k - 1$. Then the error
in the Galerkin approximation satisfies

(20)  $\| u(t) - u^h(t) \|_0 \le \| u(t) - Pu(t) \|_0 + \| e(t) \|_0 \le C h^k \left[ \| u(t) \|_k + e^{-\lambda_1 t} \| u_0 \|_k + \int_0^t e^{\lambda_1 (\tau - t)} \| u_t(\tau) \|_k\, d\tau \right].$

Thus the error is of the same order $h^k$ as in steady-state problems, and it decays
as fast as the fundamental mode if there is no source term.
This theorem agrees with the $h^k$ estimate derived earlier by means of
eigenfunction expansions. An averaged error bound in energy follows easily
by direct integration of the identity (17):

$2 \int_0^t a(e(\tau), e(\tau))\, d\tau \le \| e(0) \|_0^2 - \| e(t) \|_0^2 + 2 \int_0^t |(Pu_t - u_t, e)|\, d\tau.$
According to the theorem just proved, the right side is of order $h^{2k}$. This
suggests that in the energy norm $e$ is negligible in comparison with $u - Pu$,
and that the Galerkin error $u - u^h = u - Pu + e$ satisfies

(21)  $a(u - u^h, u - u^h) \sim a(u - Pu, u - Pu) \le C^2 h^{2(k-m)} \| u \|_k^2.$
This completes the technical error estimates for parabolic problems;
there are no surprises in the results. Our impression is that just as in static
problems, finite elements are particularly effective in coarse mesh calculations,
with a large value of $h$. In this situation the physics is often more adequately
represented by Galerkin's principle, on which the finite element method is
based, than by supposing difference quotients to be close to the derivatives.
Because of the integrals to be evaluated, however, there is a price to be paid
in computing time. Perhaps there will ultimately be a satisfactory combination
of finite elements and finite differences.
7.3. HYPERBOLIC EQUATIONS

It is natural to experiment with finite elements also for hyperbolic problems.
In fact, the Galerkin principle can be formulated in such generality
(the equation $M(u) = 0$ is replaced by $(M(u^h), v^h) = 0$ for all $v^h$) that we may
expect finite elements to be tested in an increasing variety of applications.
Some preliminary tests are already going on, but the conclusions are not
yet in (and may never be, given the ambiguous nature of most numerical
experiments). Maybe all that can be expected is agreement on some general
guidelines.
Mathematically, one property that can be guaranteed (provided the test space
$V^h$ coincides with the trial space $S^h$) is that if energy
is conserved in the true problem, then it is conserved in Galerkin's method,
and that if it is decreasing with time in the true problem, then it is decreasing
in the Galerkin approximation. For a first-order system $u_t + Lu = 0$ this is
easy to see. The rate of change of the energy $(u, u) = \int u^2\, dx$ can be computed
by multiplying the equation by $u$ and integrating:

$(u_t, u) + (Lu, u) = \frac{1}{2} \frac{d}{dt}(u, u) + (Lu, u) = 0.$

The equation is conservative ($(u, u)$ is constant in time) in case $(Lu, u)$
is identically zero, and it is dissipative (the energy $(u, u)$ is decreasing) if
$(Lu, u) > 0$ for all possible states $u$. Parabolic equations are strongly dissipative,
since $(Lu, u)$ is in that case a positive-definite expression in the $m$th
derivatives of $u$. Hyperbolic equations are either conservative or at best
weakly dissipative; energy may leak out at the boundaries, but only very
slowly. This makes their analysis much more delicate. In the Galerkin method,
let $Q$ denote the projection of $\mathcal{H}^0$ onto the subspace $S^h$: $Qu$ is the best least-squares
approximation in $S^h$ to $u$, where earlier $Pu$ was the best approximation
in the strain energy norm $a(v, v)$. Then the Galerkin approximation
$u^h$ is determined by projecting the differential equations onto the subspace:

$Q(u_t^h + L u^h) = 0.$

Since $u^h$ is required to lie in $S^h$, it is automatic that $u^h = Qu^h$, and the Galerkin
equation can be written more symmetrically as

$u_t^h + QLQ u^h = 0.$

In other words, the true generator $L$ is replaced by $QLQ$. But then a conservative
equation remains conservative, $(QLQu, u) = (LQu, Qu) = 0$, and a
dissipative equation remains dissipative: $(QLQu, u) = (LQu, Qu) > 0$.
The corresponding nonlinear operators are called monotone (Section 2.4),
and the same result holds.
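The identity $(QLQu, u) = (LQu, Qu)$ can be tried out on a small matrix model. The following is only a finite-dimensional sketch: the skew-symmetric matrix, the dissipative variant, the two-dimensional subspace, and every function name are hypothetical stand-ins for $L$, $S^h$, and $Q$, chosen for illustration.

```python
def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(x, y):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))


def project(basis, x):
    """Orthogonal projection Qx onto the span of an orthonormal basis."""
    out = [0.0] * len(x)
    for q in basis:
        c = dot(q, x)
        out = [o + c * qi for o, qi in zip(out, q)]
    return out


def qlq_form(L, basis, u):
    """Return (QLQu, u), which equals (LQu, Qu) because Q is symmetric."""
    qu = project(basis, u)
    return dot(project(basis, matvec(L, qu)), u)


# Hypothetical model operators: a skew-symmetric (conservative) matrix, and the
# same matrix plus a positive diagonal (dissipative).
L_SKEW = [[0.0, 1.0, -2.0], [-1.0, 0.0, 3.0], [2.0, -3.0, 0.0]]
L_DISS = [[2.0, 1.0, -2.0], [-1.0, 1.0, 3.0], [2.0, -3.0, 4.0]]
S = 2.0 ** -0.5
BASIS = [[S, S, 0.0], [0.0, 0.0, 1.0]]  # orthonormal basis of the subspace

if __name__ == "__main__":
    u = [1.0, -2.0, 0.5]
    print(qlq_form(L_SKEW, BASIS, u), qlq_form(L_DISS, BASIS, u))
```

In this model the conservative form vanishes up to rounding and the dissipative one stays positive, which is the matrix analogue of the statement above; it says nothing about the nonlinear (monotone) case.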
It is interesting that the conservative property is not always desirable,
particularly in nonlinear hyperbolic equations. The simplest example is the
conservation law $u_t = (u^2)_x$. The solutions of these problems may develop
spontaneous discontinuities (shocks) and the conservation of energy is lost,
even though some other conservation laws of mass and momentum are retained.
In the Galerkin equation these shocks apparently never quite appear,
and the approximate equation remains conservative, from which it follows
that convergence to the correct solution is impossible. The standard remedy
for finite differences is to dissipate energy by means of artificial viscosity,
and apparently that will also be necessary for finite elements.
There are two forms in which hyperbolic equations may appear: either
as a first-order system in time, say $w_t + Lw = f$ with a vector unknown $w$,
or as a second-order equation $u_{tt} + Lu = f$. We begin with the latter case,
in which $L$ is elliptic; a typical example is the wave equation $u_{tt} - c^2 u_{xx} = 0$.
The weak form of such an equation is just

(22)  $(u_{tt}, v) + a(u, v) = (f, v)$ for $v$ in $\mathcal{H}^1$, $t > 0$.

In the Galerkin approximation $u$ and $v$ are replaced by $u^h$ and $v^h$; this means
that $u^h = \sum Q_j(t) \varphi_j(x)$ is determined by

(23)  $(u_{tt}^h, \varphi_k) + a(u^h, \varphi_k) = (f, \varphi_k)$ for $k = 1, \ldots, N$, $t > 0$.

This is again an ordinary differential equation in the time variable. In this
case the same mass and stiffness matrices appear, but the equation is of second
order:

(24)  $M Q'' + K Q = F(t).$

The starting values are approximations from within $S^h$ to the true initial
displacement $u_0(x)$ and initial velocity $u_0'(x)$. The behavior of the solutions
is completely at variance with the parabolic case, where $Q'$ appears instead
of $Q''$. In that case, with $F = 0$, the solution decays very rapidly; each eigenvector
is associated with an exponential $e^{-\lambda_j t}$, and discontinuities immediately
disappear. In the hyperbolic case the exponent changes to $\pm i \sqrt{\lambda_j}\, t$, and the solution
oscillates rather than decays. The solution is no smoother than its initial
data, and discontinuities are propagated indefinitely in time.
For the one-dimensional wave equation and linear elements, Galerkin's
approximation becomes

$\frac{Q_{j+1}'' + 4 Q_j'' + Q_{j-1}''}{6} = c^2\, \frac{Q_{j+1} - 2 Q_j + Q_{j-1}}{h^2}.$

We remark again on the coupled form of these equations, which automatically
leads to an implicit difference equation. It appears from the experiments of
Clough and others, and the theoretical discussion by Fujii at the Second
U.S.-Japan Seminar, that there will be no loss in accuracy if $M$ is replaced
(through a suitable lumping process) by a diagonal matrix. This would not
be the case for elements of all degrees; lumping implicitly uses elements of
low degree (generally piecewise constants) in dealing with terms which are
not differentiated with respect to $x$, and this loses overall accuracy once the
other terms in the equation are treated with high accuracy.
Fujii has also given a valuable stability analysis for difference approximations
(in the time direction) of the finite element equation (24). Suppose for
example that the terms $Q''$ are replaced by centered second differences
$\Delta t^{-2}(Q^{n+1} - 2Q^n + Q^{n-1})$. Then, as is well known for finite differences, the
step size $\Delta t$ must be restricted or the computed approximations $Q^n$ will
explode exponentially with $n$. For the one-dimensional wave equation, his
stability conditions are $c \Delta t < h/\sqrt{3}$ for the consistent mass matrix $M$, and
$c \Delta t < h$ for the lumped mass case. (Tong [T8] also observed the added
stability with lumping.) Fujii has studied other finite difference schemes
and more general hyperbolic initial-boundary-value problems, including the
equations of elasticity.
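The two step-size limits quoted above can be reproduced with a short Fourier calculation. This is a sketch under stated assumptions, not Fujii's analysis: it assumes linear elements on a uniform periodic mesh, uses the usual criterion that centered second differences in time are stable when $\Delta t^2 \lambda_{\max} \le 4$, and all function names and sample values are hypothetical.

```python
import math


def max_frequency_squared(c, h, lumped, samples=2000):
    """Largest eigenvalue of M^{-1}K over Fourier modes exp(i*j*theta).

    Assumes linear elements on a uniform periodic mesh: K has the symbol
    c^2 (2 - 2 cos theta) / h, the consistent M has h (4 + 2 cos theta) / 6,
    and the lumped M has h.
    """
    best = 0.0
    for n in range(samples + 1):
        theta = math.pi * n / samples
        stiffness = c * c * (2.0 - 2.0 * math.cos(theta)) / h
        mass = h if lumped else h * (4.0 + 2.0 * math.cos(theta)) / 6.0
        best = max(best, stiffness / mass)
    return best


def critical_time_step(c, h, lumped):
    """Largest dt satisfying dt^2 * lambda_max <= 4 (borderline value)."""
    return 2.0 / math.sqrt(max_frequency_squared(c, h, lumped))


if __name__ == "__main__":
    c, h = 1.0, 0.1  # hypothetical wave speed and mesh width
    print(critical_time_step(c, h, lumped=False), h / math.sqrt(3.0))
    print(critical_time_step(c, h, lumped=True), h)
```

The borderline values returned here are $h/(c\sqrt{3})$ and $h/c$, matching the conditions in the text; the strict inequalities there exclude the borderline case itself.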
The Galerkin approximation has two important properties: conservation
of energy (if $f = 0$) and convergence. To measure the energy in a second-order
hyperbolic problem we add the kinetic and potential energies:

$E(t) = \frac{1}{2}\left[ (u_t, u_t) + a(u, u) \right].$

In the wave equation this energy is $\frac{1}{2} \int (u_t^2 + c^2 u_x^2)\, dx$. The quantity $E$ is
independent of time, since with $v = u_t$ in the weak form (22),

(25)  $\frac{dE}{dt} = (u_{tt}, u_t) + a(u, u_t) = 0.$

For the wave equation, this becomes simply

$\int (u_t u_{tt} + c^2 u_x u_{xt})\, dx = \int u_t (u_{tt} - c^2 u_{xx})\, dx = 0.$

Conservation of energy in the Galerkin equation can be verified in the same
way:

$\frac{dE^h}{dt} = (u_{tt}^h, u_t^h) + a(u^h, u_t^h) = 0.$

Thus the approximation, like the true equation, is only neutrally stable.
We shall sketch the proof of convergence, which follows from an identity
analogous to Lemma 7.1: With $e = Pu - u^h$,

(26)  $(e_{tt}, e_t) + a(e, e_t) = (Pu_{tt} - u_{tt}, e_t).$

The left side is the derivative of the energy $E(t, e)$ in the quantity $e$. This
expression is not quite conserved, but by the Schwarz inequality the right side is less than

$\| Pu_{tt} - u_{tt} \|_0\, \| e_t \|_0 \le C h^k \sqrt{E}.$

Thus $E' \le C h^k \sqrt{E}$. Integrating from $0$ to $t$,

$\sqrt{E(t)} \le \sqrt{E_0} + \tfrac{1}{2} C h^k t.$

The initial error $E_0$ will be of order $h^{2(k-1)}$ and so will the energy in $u - Pu$.
Therefore, the energy in the Galerkin error $u - u^h = u - Pu + e$ is of the
optimal order $h^{2(k-1)}$. Provided the initial data are smooth, this continues to
hold even for large times, $t \sim 1/h$.
We turn now to the trivial but interesting example $u_t = u_x$. The equation
itself is certainly not very exciting; it describes a wave traveling to the left
with unit velocity, $u(x, t) = u_0(x + t)$. There is no distortion in the wave, and
$\int u^2\, dx$ is obviously conserved; this is the energy in a first-order problem.
The Galerkin approximation at each time is $(u_t^h, v^h) = (u_x^h, v^h)$, and it, too,
conserves energy. With linear elements $u^h(t, x) = \sum u_j(t) \varphi_j(x)$, where $\varphi_j$ is
the roof function centered at the node $jh$, this approximation becomes

(27)  $\frac{u_{j+1}' + 4 u_j' + u_{j-1}'}{6} = \frac{u_{j+1} - u_{j-1}}{2h}.$
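Energy conservation for (27) can be seen directly from the antisymmetry of the centered difference. The sketch below assumes a periodic uniform mesh (an assumption made here so that no boundary terms appear); the function names and the sample nodal values are hypothetical.

```python
import math


def centered_difference(u, h):
    """Right side of (27): (u[j+1] - u[j-1]) / (2h), with periodic wraparound."""
    n = len(u)
    return [(u[(j + 1) % n] - u[j - 1]) / (2.0 * h) for j in range(n)]


def energy_rate(u, h):
    """Time derivative of the energy integral of (u^h)^2.

    The energy is h * sum_j u_j * (u_{j+1} + 4 u_j + u_{j-1}) / 6, so its rate
    of change is 2h * sum_j u_j * (left side of (27)), and the left side of (27)
    equals the centered difference on the right.
    """
    rhs = centered_difference(u, h)
    return 2.0 * h * sum(uj * rj for uj, rj in zip(u, rhs))


if __name__ == "__main__":
    n = 16
    h = 1.0 / n
    u = [math.sin(2.0 * math.pi * j / n) + 0.3 * math.cos(6.0 * math.pi * j / n)
         for j in range(n)]
    print(energy_rate(u, h))  # zero up to rounding error
```

The sum $\sum_j u_j (u_{j+1} - u_{j-1})$ telescopes to zero on a periodic mesh, so the rate vanishes for any nodal values, not only the smooth sample used here.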
Evidently the equation is again implicit, a serious drawback for hyperbolic
problems. The truncation error is found to be $O(h^2)$, the usual rate of convergence
for linear elements.
Dupont [D10] happened to be computing the corresponding rate of
convergence for cubic trial functions, and the expected power $h^4$ simply would
not appear. Instead, the error $u - u^h$ turned out to be $O(h^3)$, which is an order
of magnitude larger than the best approximation to $u$ by a cubic. His calculations,
which were greeted with surprise and perhaps even some disbelief,
were carried out for the Hermite cubics with $u$ and $u_x$ as unknowns at each
node. For cubic splines his computations did give $O(h^4)$. Therefore, the rate
of convergence does not depend only on the degree of the finite element
polynomials, and in fact, the larger space of Hermite cubics gives a worse
approximation than its subspace of cubic splines. Some explanation is in
order.
We propose to compute the truncation error in general, by substituting
the true solution of $u_t + Lu = 0$ into the Galerkin equation $u_t^h + QLQu^h = 0$.
In our case $L = -\partial/\partial x$, and $Q$ is the projection onto the subspace $S^h$.
The truncation error is

$Lu - QLQu = (I - Q)Lu + QL(I - Q)u.$

This first term on the right is the error in least-squares approximation of
$Lu = -u_x$. If $S^h$ is of degree $k - 1$ and $u$ is smooth, then this error is of the
usual order $h^k$. It is the other term $QL(I - Q)u$ which is decisive. Since
$L(I - Q)u$ is the derivative of the least-squares error, it is an error in the
$\mathcal{H}^1$ norm and cannot be better than $h^{k-1}$. The question is whether or not this
term $L(I - Q)u$ is annihilated by the final projection $Q$; we believe that this
does happen for linear elements on a regular mesh but not for the usual
cubics. Since such a cancellation must be regarded as exceptional, the normal
rate of convergence in a first-order hyperbolic system will be $h^{k-1}$ rather than
$h^k$. This rate of convergence has been established by Lesaint for a wide class
of hyperbolic systems.
To understand how this cancellation could occur, we apply $QL(I - Q)$
to the polynomial of lowest degree which is not identically present in the subspace:
$x^2$ for linear elements and $x^4$ for cubics. In the linear case $(I - Q)x^2$
is the error function illustrated in Section 3.2 (Fig. 3.3). Its derivative is a
multiple of the sawtooth function; $L(I - Q)x^2$ goes linearly from $+1$ to $-1$
over each subinterval. The best continuous piecewise linear approximation
of such a function is identically zero: $QL(I - Q)x^2 = 0$, and cancellation
has occurred. In the cubic case, $L(I - Q)x^4$ happens to be a Hermite cubic,
and the final projection $Q$ leaves it unchanged; there is no cancellation, and
the Galerkin error is of order $\| u - u^h \|_0 \sim h^3$.
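The cancellation in the linear case amounts to the sawtooth being orthogonal to every roof function on a uniform mesh, so that the right side of the least-squares equations is zero. A minimal sketch of that check follows; the local-coordinate parametrization and the function names are illustrative assumptions, and Simpson's rule is used only because it is exact for the quadratics involved.

```python
def simpson01(f):
    """Simpson's rule on [0, 1]; exact for polynomials up to degree three."""
    return (f(0.0) + 4.0 * f(0.5) + f(1.0)) / 6.0


def sawtooth_moment(h):
    """Integral of the sawtooth against one roof function on a uniform mesh.

    On each subinterval, t in [0, 1] is the local coordinate and the sawtooth
    is 1 - 2t. The roof function rises like t on the left subinterval and falls
    like 1 - t on the right one. Returns the two contributions separately.
    """
    left = h * simpson01(lambda t: (1.0 - 2.0 * t) * t)
    right = h * simpson01(lambda t: (1.0 - 2.0 * t) * (1.0 - t))
    return left, right


if __name__ == "__main__":
    left, right = sawtooth_moment(0.25)  # hypothetical mesh width
    print(left, right, left + right)
```

The two contributions are $-h/6$ and $+h/6$, so every moment vanishes and the projection of the sawtooth onto the piecewise linear space is zero. The same symmetry argument is not available for the Hermite cubics, in line with the text.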
In one sense the exponent $k - 1$ might have been anticipated. If the wave
equation $u_{tt} = c^2 u_{xx}$ is reduced to a first-order system, the vector unknown
is made up of first derivatives $u_t$ and $cu_x$:

$\frac{\partial}{\partial t} \begin{pmatrix} u_t \\ c u_x \end{pmatrix} = \begin{pmatrix} 0 & c \\ c & 0 \end{pmatrix} \frac{\partial}{\partial x} \begin{pmatrix} u_t \\ c u_x \end{pmatrix}.$

Therefore, the ordinary energy $\| u_t \|_0^2 + \| c u_x \|_0^2$ in the vector unknown is
precisely twice the energy $E(t)$; it comes from the same sum of the kinetic
and potential energies. Since the error in this energy was of order $h^{2(k-1)}$
for the single equation, the exponent $k - 1$ is exactly what we should expect
for a system.
From a practical point of view, these error bounds as $h \to 0$ are subordinate
to the problem of obtaining reasonable accuracy at reasonable expense.
With hyperbolic equations, we are not sure that this is achieved most effectively
by finite elements. The finite speed of propagation in the true solution
means that explicit finite difference equations are possible, with time steps
$\Delta t$ of the same order as $h$, and it is known how artificial viscosity can be
introduced to promote stability. For finite elements the difference equations
will be implicit and almost too conservative. (They can be explicit only if
we lump the mass matrix, or if, as Raviart has proposed, we choose the nodes
as the evaluation points $\xi_i$ for numerical integration. By conserving mass in
the lumping process, or by using lower degree trial polynomials as in [T8]
for the element mass matrices, we achieve a consistent difference equation,
with a typical Courant condition for numerical stability. Stability is not unconditional,
as it was for the implicit pure Galerkin processes described earlier
in the chapter.)
The one important advantage of finite elements in hyperbolic problems,
which must somehow be imitated by finite difference schemes in the future,
is the systematic achievement of high accuracy, even at curved boundaries.
8 SINGULARITIES
8.1. CORNERS AND INTERFACES

Perhaps the most characteristic property of elliptic boundary-value
problems like

(1)  $-\nabla \cdot (p \nabla u) + qu = f$ in $\Omega$, $\quad u = 0$ on $\Gamma$,

is that the solution $u$ is smooth as long as the boundary $\Gamma$ and data $p$, $q$, and
$f$ are smooth. In fact, Weyl's lemma states that $u$ is analytic in any subregion
$\Omega_1$ of $\Omega$ provided $p$, $q$, and $f$ are analytic in $\Omega_1$. Similar conclusions are valid
"up to the boundary" provided the boundary of $\Omega$ itself is analytic.
Singularities can therefore occur only when the boundary or some part
of the data is not smooth. Unfortunately, these cases often arise, for example
in fracture mechanics problems, and in the presence of singularities it is
completely unsatisfactory to proceed with finite elements on a regular mesh.
As with difference approximations, local mesh refinement in the sense that
we have discussed in earlier chapters has been a popular and effective method
for dealing with singularities. However, a great deal is known about the nature
of singularities which arise in elliptic problems, and the special form of
the variational method invites one to use this information in the Ritz-Galerkin
approximation. This chapter is devoted to this task, and we begin by
establishing the analytical form of the singularities which can arise.
Starting first with the case of nonsmooth boundaries, consider the Laplace
equation

(2)  $-\Delta u = f$ in $\Omega$, $\quad u = 0$ on $\Gamma$
defined in a region $\Omega$ which has a corner (Fig. 8.1). To fix ideas let us assume
that $f$ is analytic in the closed region $\bar{\Omega}$ and that $\Gamma$ is analytic except at $P$.
Then Weyl's lemma states that $u$ is analytic in $\bar{\Omega}$ except at $P$, and we seek a
description of $u$ near that point.

Fig. 8.1 A domain with corner angle $\alpha\pi$.

In particular, we shall study the behavior of $u$ in the sector

(3)  $\Omega_0 = \{ (r, \theta) : 0 < r < r_0,\ 0 < \theta < \alpha\pi \},$

where $(r, \theta)$ denotes polar coordinates at $P$. The weak form of (2) is

(4)  $\iint_\Omega (\nabla u \cdot \nabla v - f v) = 0$ for all $v$ in $\mathcal{H}_0^1(\Omega)$.
If $v$ is chosen to vanish near the corner and also outside the sector $\Omega_0$, this
reduces to

(5)  $0 = \iint_{\Omega_0} \left[ \nabla u \cdot \nabla v - f v \right] = \int_0^{r_0} r\, dr \int_0^{\alpha\pi} \left[ \frac{\partial u}{\partial r} \frac{\partial v}{\partial r} + r^{-2} \frac{\partial u}{\partial \theta} \frac{\partial v}{\partial \theta} - f v \right] d\theta.$
Now, since u is analytic away from the corner and u(r, O) = u(r, rtn) = O,
u may be expanded for each fixed r > O into
00
Observe tbat this singularity becc
(6) u(r, O) 2: uir)rpiO),
j=t
increases, and if tbe corner at P is
derivatives of u are unbounded. Th
where ning into it; tbis is one of the prc
the last section. Around tbe point.
(7)
is a full interior angle of 2n, and ti
We can easily determine tbe d
The Fourier coefficients uir) are determined by (5): Let v = t¡~(r)rp/0), and singularity. By direct calculation,
use the orthogonality properties. Then (5) becomes, after the first term is
integrated by parts in $r$,

(8)  $\int_0^{r_0} \left[\frac{d}{dr}\left(r\frac{du_j}{dr}\right) - \frac{\nu_j^2}{r}u_j + r f_j(r)\right]\psi(r)\,dr = 0$,  $f_j(r) = \int_0^{\alpha\pi} f\,\varphi_j(\theta)\,d\theta$.

Since this holds for all $\psi$, the expression in brackets must vanish. This
constitutes the basic differential equation for $u_j(r)$. Expanding $f_j(r)$ into
$\sum_l f_{jl} r^l$, its general solution is

(9)  $u_j(r) = \alpha_j r^{\nu_j} + \beta_j r^{-\nu_j} - \sum_{l=0}^{\infty} f_{jl}\,[(l+2)^2 - \nu_j^2]^{-1} r^{l+2}$,

where we agree to replace

(10)

whenever $\nu_j^2$ is of the form $(l + 2)^2$. In addition, we reject the term $r^{-\nu_j}$ and
set $\beta_j = 0$, since otherwise that term has infinite energy. The other constant
$\alpha_j$ is chosen so that at $r = r_0$, (9) is the correct Fourier coefficient of $u(r_0, \theta)$.
Altogether, the solution near the corner takes the form (see Lehman [L2])

(11)  $u(r, \theta) = \sum_{j=1}^{\infty} \alpha_j r^{\nu_j}\varphi_j(\theta) - \sum_{j=1}^{\infty}\sum_{l=0}^{\infty} f_{jl}\,[(l+2)^2 - \nu_j^2]^{-1}\varphi_j(\theta)\, r^{l+2}$.

Suppose for the moment that $1/\alpha$ is not an integer. Then it follows from
(11) that the leading term in the singularity of $u$ is

  $r^{1/\alpha}\sin(\theta/\alpha)$.

Observe that this singularity becomes more pronounced as the angle $\alpha\pi$
increases, and if the corner at $P$ is not convex, that is, $\alpha > 1$, even the first
derivatives of $u$ are unbounded. The worst case is a region with a crack
running into it; this is one of the problems which we examine numerically in
the last section. Around the point $P$ at the head of the crack (Fig. 8.3) there
is a full interior angle of $2\pi$, and the solution behaves like $r^{1/2}\sin(\theta/2)$.
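
The behavior of the leading term can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the finite-difference step are illustrative assumptions. It evaluates the exponent $1/\alpha$, confirms by central differences that $r^{1/\alpha}\sin(\theta/\alpha)$ is harmonic away from the corner, and reports whether the gradient (which scales like $r^{1/\alpha - 1}$) stays bounded.

```python
import math

def leading_exponent(alpha):
    """Exponent nu_1 = 1/alpha of the leading corner term for interior angle alpha*pi."""
    return 1.0 / alpha

def singular_term(r, theta, alpha):
    """Leading corner term r**nu * sin(nu*theta) with nu = 1/alpha."""
    nu = leading_exponent(alpha)
    return r**nu * math.sin(nu * theta)

def polar_laplacian(u, r, theta, h=1e-4):
    """Central-difference approximation of u_rr + u_r/r + u_thetatheta/r**2."""
    u_rr = (u(r + h, theta) - 2.0 * u(r, theta) + u(r - h, theta)) / h**2
    u_r = (u(r + h, theta) - u(r - h, theta)) / (2.0 * h)
    u_tt = (u(r, theta + h) - 2.0 * u(r, theta) + u(r, theta - h)) / h**2
    return u_rr + u_r / r + u_tt / r**2

# Convex corner, reentrant corner, and crack (alpha = 2).
for alpha in (0.75, 1.5, 2.0):
    nu = leading_exponent(alpha)
    u = lambda r, t, a=alpha: singular_term(r, t, a)
    lap = polar_laplacian(u, 0.5, 0.3 * alpha * math.pi)
    # The gradient behaves like r**(nu - 1), so it is bounded only if nu >= 1.
    print(f"alpha={alpha}: nu={nu:.4f}, |Laplacian|={abs(lap):.1e}, "
          f"gradient bounded: {nu >= 1.0}")
```

For the crack, $\alpha = 2$ gives the exponent $1/2$ quoted above, and the gradient is unbounded at $P$.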

We can easily determine the degree of smoothness of $u$ near any such
singularity. By direct calculation, the function $r^\nu \sin\nu\theta$ falls just short of
possessing $1 + \nu$ derivatives in the mean-square sense (and only $\nu$ in the
pointwise sense; here we absolutely need the mean-square approximation
theory to predict any reasonable convergence). Thus for any $\beta < \nu_1 = 1/\alpha$,
the solution has $1 + \beta$ fractional derivatives. At a reentrant corner, where
$\Omega$ is not convex and $\alpha > 1$, it follows that $u$ lies in $\mathcal{H}^1$ but not in $\mathcal{H}^2$. Around
a crack $u$ is not quite in $\mathcal{H}^{3/2}$.
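
The "direct calculation" can be sketched as follows. This is a heuristic count for nonintegral $\nu$ and integer order $s$, extended to fractional $s$ by interpolation; it is an illustrative outline rather than the book's own derivation. Each differentiation lowers the power of $r$ by one, and the mean-square integral over a sector carries the area weight $r\,dr$:

```latex
\[
  \bigl|D^{s}\,(r^{\nu}\sin\nu\theta)\bigr| \sim r^{\nu-s} \quad (r \to 0),
  \qquad
  \int_0^{r_0} \bigl(r^{\nu-s}\bigr)^{2}\, r\,dr < \infty
  \iff 2(\nu-s)+1 > -1
  \iff s < 1+\nu .
\]
```

Pointwise boundedness of the $s$th derivatives requires instead $s \le \nu$, which is the gap of one derivative noted above. With $\nu = 1/2$ at a crack the mean-square threshold is $s < 3/2$.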

When $1/\alpha$ is an integer the first sum in (11) is analytic. However, except
for the case $\alpha = 1$ in which $\Gamma$ is a straight line near $P$, we cannot conclude
that $u$ is analytic in $\Omega_0$; logarithms generally appear in the second sum. For
example, if $\alpha = 1/2$, so that $\Gamma$ makes a right angle at $P$, then $\nu_j^2 = (l + 2)^2 = 4$
in case $j = 1$ and $l = 0$. This is an instance in which the replacement (10)
is required, and a term $(f_{10}\, r^2 \ln r \sin 2\theta)/6$ appears in the solution $u$. Observe
that this term is in $\mathcal{H}^2$ but it is not in $\mathcal{H}^l$ for any $l > 2$.

Similar arguments give the behavior of the solution to other problems
near corners. In the variable coefficient problem (1) a more complicated
calculation shows that the singularity is still of the form (11). More generally,
the singularity in a $2m$th-order problem is determined by the principal part
of the operator. The leading term in $u$ is of the form $r^{\lambda}\varphi_\lambda(\theta)$, where $\varphi_\lambda$ is a
smooth function of $\theta$ and $\lambda$ is an eigenvalue of an auxiliary problem, both
of which depend on the boundary conditions. We refer the reader to
Kellogg's work [K2], which also includes the three-dimensional case, and to the
Russian work [K3], as well as to standard engineering references [I8, P6,
W4, H5].

We now permit the second type of singularity, when the boundary is
smooth yet one or more of the data is not smooth. Such a singularity typically
arises in interface problems, and a simple example is provided by

(12)  $-\nabla\cdot(p\,\nabla u) = f$ in $\Omega$,  $u = 0$ on $\Gamma$,

where $\Omega$ is the region shown in Fig. 8.2. We take the coefficient $p$ to be

Fig. 8.2 An angular interface.
piecewise constant:

(13)  $p = \begin{cases} p_1 & \text{in } \Omega_1, \\ p_2 & \text{in } \Omega_2. \end{cases}$

The classical formulation of the problem is to require that the differential
equation (12) hold separately in $\Omega_1$ and $\Omega_2$, with $u$ and $p\,\partial u/\partial\nu$ being
continuous across the interface $\Gamma$ ($\nu$ is the normal to $\Gamma$). Thus, referring to Fig. 8.2,

(14)  $p_1 \frac{\partial u}{\partial\nu}\Big|_{\Gamma^-} = p_2 \frac{\partial u}{\partial\nu}\Big|_{\Gamma^+}$.

As we noted in Section 1.3, the weak form of this equation is still

  $\iint_\Omega (p\,\nabla u\cdot\nabla v - fv) = 0$  for all $v$ in $\mathcal{H}_0^1$,

and (14) is a natural boundary condition. It will not have to be satisfied by the
trial functions in the Ritz method. Indeed, it would be very difficult to satisfy
at a corner of $\Gamma$.

Except when the interface is a straight line ($\alpha = 1$) or when $p_1 = p_2$,
the solution $u$ will be singular at the point $P$ in Fig. 8.2, and we seek a
description analogous to (11) for the behavior of $u$ near this point. Following
Birkhoff [B13] and Kellogg [K1], we introduce the periodic Sturm-Liouville
system

(15)  $-\frac{d}{d\theta}\left(p\,\frac{d\varphi}{d\theta}\right) = \lambda p\,\varphi$,  $p = p(\theta) = \begin{cases} p_1 & \text{if } 0 < \theta < \alpha\pi, \\ p_2 & \text{if } \alpha\pi < \theta < 2\pi. \end{cases}$

The eigenfunctions $\varphi(\theta)$ are required to be periodic, $\varphi(\theta) = \varphi(\theta + 2\pi)$, and
to satisfy the interface conditions

(16)  $\lim_{\theta\to 0+} \left[p_1 \frac{d\varphi}{d\theta}(\theta)\right] = \lim_{\theta\to 0+} \left[p_2 \frac{d\varphi}{d\theta}(-\theta)\right]$,

(17)  $\lim_{\theta\to 0+} \left[p_1 \frac{d\varphi}{d\theta}(\alpha\pi - \theta)\right] = \lim_{\theta\to 0+} \left[p_2 \frac{d\varphi}{d\theta}(\alpha\pi + \theta)\right]$.

There is an infinite sequence of positive eigenvalues $\lambda_j = \nu_j^2$, and the
associated eigenfunctions $\varphi_j(\theta)$ are orthogonal:

  $\int_0^{2\pi} p(\theta)\varphi_j'(\theta)\varphi_k'(\theta)\,d\theta = \nu_j^2 \int_0^{2\pi} p(\theta)\varphi_j(\theta)\varphi_k(\theta)\,d\theta = \nu_j^2\,\delta_{jk}$.

For each fixed $r > 0$, the solution $u(r, \theta)$ satisfies the jump conditions
(16)-(17), and hence

(18)  $u(r, \theta) = \sum_{j=1}^{\infty} u_j(r)\varphi_j(\theta)$,  $u_j(r) = \int_0^{2\pi} p(\theta)\,u(r, \theta)\,\varphi_j(\theta)\,d\theta$.

We now proceed in exactly the same way as in the case of boundary
singularities. Substituting (18) into the differential equation (12) and using
orthogonality, we obtain a differential equation for the Fourier coefficients $u_j(r)$.
These equations can be solved exactly and we obtain an expression like (11)
for $u$, except that in this case the exponents $\{\nu_j\}$ are the square roots of the
eigenvalues of (15) and the $\{\varphi_j\}$ are the associated eigenfunctions.

Fig. 8.3 A domain with a crack: an internal angle of $2\pi$.

For this simple problem one can obtain exact formulas for the
eigenfunctions and eigenvalues (in a more complicated problem numerical
techniques will be required), and for illustration we consider the specific
case $\alpha = 1/2$. The eigenfunctions of (15) fall into two symmetry groups: those
which are symmetric about $\theta = 0$ and those which are antisymmetric. In
the former case the eigenfunctions have the form

(19)  $\varphi_\nu(\theta) = \begin{cases} \cos\nu\theta & \text{for } |\theta| < \pi/4, \\ a_\nu \cos\nu(\pi - \theta) & \text{for } |\theta| > \pi/4. \end{cases}$

The constant $a_\nu$ is chosen so that the interface conditions (16)-(17) are satisfied:

  $a_\nu = -\left[p_1 \sin\frac{\nu\pi}{4}\right] \Big/ \left[p_2 \sin\frac{3\nu\pi}{4}\right]$.

The eigenvalue $\nu$ is determined by substituting (19) into (15), and it can be
shown that either $\tan(\nu\pi/4) = 0$, that is, $\nu = 4n$, or $\nu = \pm\nu_1 + 4n$, where
$\nu_1$ is the smallest positive root of

(20)  $\left[3 - \tan^2\frac{\nu\pi}{4}\right] \Big/ \left[1 - 3\tan^2\frac{\nu\pi}{4}\right] = -\frac{p_1}{p_2}$.

Similar formulas are valid for the odd symmetry class.

The dominant singularity in the solution is therefore $r^{\nu}\varphi_\nu(\theta)$, where
$\nu = \nu_1$ lies between 0 and 2. Note that $\nu_1 = 1$ in (20) if and only if $p_1 = p_2$.
Otherwise $u$ lies only in the fractional spaces $\mathcal{H}^{1+\beta}$, $\beta < \nu_1$, and in particular
it is not in $\mathcal{H}^2$ when $\nu_1 < 1$.
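
As an illustration, the leading exponent for the right-angle interface can be computed in closed form. The sketch below is an assumption-laden aid, not the book's procedure: it imposes continuity of $\varphi$ and of $p\,\varphi'$ on the even eigenfunctions (19) at $\theta = \pi/4$, which gives $p_2 \tan 3x = -p_1 \tan x$ with $x = \nu\pi/4$; apart from $\tan x = 0$ this reduces to $\tan^2 x = (p_1 + 3p_2)/(3p_1 + p_2)$. The function names are hypothetical.

```python
import math

def interface_exponent(p1, p2):
    """Smallest positive exponent nu_1 of the even class for alpha = 1/2.

    Derived (as an assumption of this sketch) from matching (19) across
    theta = pi/4: tan(x)**2 = (p1 + 3*p2) / (3*p1 + p2), with x = nu*pi/4.
    """
    t = math.sqrt((p1 + 3.0 * p2) / (3.0 * p1 + p2))
    return 4.0 * math.atan(t) / math.pi

def matching_residual(nu, p1, p2):
    """Residual of the matching condition p1*tan(x) + p2*tan(3x) = 0."""
    x = nu * math.pi / 4.0
    return p1 * math.tan(x) + p2 * math.tan(3.0 * x)

for p1, p2 in ((1.0, 1.0), (10.0, 1.0), (1.0, 10.0)):
    nu1 = interface_exponent(p1, p2)
    print(f"p1={p1}, p2={p2}: nu_1={nu1:.4f}, "
          f"residual={matching_residual(nu1, p1, p2):.1e}")
```

With $p_1 = p_2$ the exponent is exactly 1, in agreement with the remark above; unequal coefficients move $\nu_1$ away from 1.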

There is no difficulty in extending this analysis to the case in which any
number of interfaces meet at the point $P$. The coefficient $p(\theta)$ in (15) will have
several discontinuities, with a jump condition of the form (16) at each one.
Kellogg has observed that at a crossing of two straight interfaces, with suitably
chosen coefficients $p_i$ in each of the four quadrants, the leading exponent $\nu_1$
can be arbitrarily small. Therefore, the singularity can be very severe; without
some extra attention to the trial functions at all such exceptional points,
the finite element method will give disappointing results.

8.2. SINGULAR FUNCTIONS

The expansions given in Section 8.1 suggest a modification of finite
element spaces which will improve the approximation of singular solutions.
Suppose that we can construct independent functions $\psi_1, \ldots, \psi_s$ such that
$u - \sum c_i\psi_i$ is smooth, say in $\mathcal{H}^k$, for suitable (but unknown) constants
$c_1, \ldots, c_s$. Then why not add $\psi_1, \ldots, \psi_s$ to the finite element space $S^h$?
The idea is obviously to let the singular functions $\psi_1, \ldots, \psi_s$ approximate $u$
near the singularity, with the conventional finite elements carrying the burden
elsewhere. As a result, it is necessary to define the singular functions only
locally near each singularity. We therefore choose, both for corners and for
interfaces,

(21)  $\psi_j(r, \theta) = \begin{cases} r^{\nu_j}\varphi_j(\theta) & \text{for } 0 \le r \le r_0, \\ p_j(r)\,\varphi_j(\theta) & \text{for } r_0 < r < r_1, \\ 0 & \text{for } r_1 \le r. \end{cases}$

The transition points $r_0$ and $r_1$ are fixed (independent of $h$), and the
polynomials $p_j$ are chosen so as to merge the coefficient $r^{\nu_j}$ smoothly into zero.
If we want the trial functions $\psi_j$ to lie in $C^{k-1}$ (and therefore in $\mathcal{H}^k$), then the
polynomials are of degree $2k - 1$ and are determined by the Hermite
conditions

(22)  $\frac{d^l}{dr^l}\left[p_j(r) - r^{\nu_j}\right]\Big|_{r=r_0} = \frac{d^l}{dr^l}\,p_j(r)\Big|_{r=r_1} = 0$,  $l = 0, 1, \ldots, k-1$.†
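
The Hermite conditions (22) amount to a $2k \times 2k$ linear system for the coefficients of $p_j$. The following is a minimal sketch under stated assumptions: the polynomial is expanded in powers of $(r - r_0)$, the system is solved densely with NumPy, and the function names and sample values of $r_0$, $r_1$ are illustrative only.

```python
import math
import numpy as np

def falling(a, l):
    """a*(a-1)*...*(a-l+1): the factor from differentiating r**a exactly l times."""
    out = 1.0
    for i in range(l):
        out *= (a - i)
    return out

def blending_polynomial(nu, k, r0, r1):
    """Coefficients c[0..2k-1] of p(r) = sum_n c[n]*(r - r0)**n satisfying (22):
    p matches r**nu to order k-1 at r0 and vanishes to order k-1 at r1."""
    n_coef = 2 * k
    A = np.zeros((n_coef, n_coef))
    b = np.zeros(n_coef)
    d = r1 - r0
    for l in range(k):
        # l-th derivative at r0: only the (r - r0)**l term survives.
        A[l, l] = math.factorial(l)
        b[l] = falling(nu, l) * r0 ** (nu - l)
        # l-th derivative at r1 must vanish (right-hand side stays zero).
        for n in range(l, n_coef):
            A[k + l, n] = falling(n, l) * d ** (n - l)
    return np.linalg.solve(A, b)

def eval_derivative(c, r, r0, l=0):
    """Evaluate the l-th derivative of the polynomial with coefficients c at r."""
    return sum(falling(n, l) * c[n] * (r - r0) ** (n - l)
               for n in range(l, len(c)))

# Crack exponent nu = 1/2, cubic elements (k = 4), hypothetical r0 and r1.
c = blending_polynomial(0.5, 4, 0.25, 0.5)
print("degree:", len(c) - 1)
print("p(r0) - r0**nu:", eval_derivative(c, 0.25, 0.25) - 0.25 ** 0.5)
print("p(r1):", eval_derivative(c, 0.5, 0.25))
```

For $k = 4$ this yields a polynomial of degree 7, as the count $2k - 1$ predicts.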

For example, suppose that we want to use cubic elements ($k = 4$) to
solve Laplace's equation in a region with a crack. The exponents are
$\nu_j = j - \frac{1}{2}$, and according to (11) there exist constants $\alpha_1, \ldots, \alpha_4$ such that

  $u - \sum_{j=1}^{3} \alpha_j\psi_j = \alpha_4\, r^{7/2}\sin\frac{7\theta}{2} + \cdots + \text{analytic terms}$.

The left side belongs to $\mathcal{H}^4$, and therefore it can be approximated to the
optimal order $h^4$ by cubic elements. The best possible degree of approximation
has been recovered by the inclusion of three singular functions $\psi_j$.

More generally, suppose that we have constructed $\psi_1, \ldots, \psi_s$ such that
$u - \sum c_i\psi_i$ is in $\mathcal{H}^k$ for appropriate constants $c_1, \ldots, c_s$. In addition, let
$S^h$ be a standard finite element space of degree $k - 1$. Then if $S_s^h$ is spanned
by $S^h$ together with $\psi_1, \ldots, \psi_s$, there is an approximation $U^h$ in $S_s^h$ to the
singular solution $u$ with error

(23)  $\|u - U^h\|_l \le C h^{k-l}\, \|u - \sum c_i\psi_i\|_k$.

This is clear: Since $u - \sum c_i\psi_i$ is in $\mathcal{H}^k$, we can approximate it by its
interpolate $v^h$ in the finite element space $S^h$. Then $U^h = v^h + \sum c_i\psi_i$.

This extra accuracy in approximation, obtained by adding singular
functions to the trial space, will be reflected in extra accuracy of the Ritz-Galerkin
approximation $u^h$. We shall postpone the analysis until the next section and
concentrate here on the computational problems that accompany singular
functions.

Two difficulties arise in applying these ideas to an actual computation,
even after the forms of the singularities are known. The first is the evaluation
of inner products involving singular functions, and the second is the inversion
of the stiffness matrix. For the former there are a variety of tricks [F7, F9]
which exploit the special form of the singular functions. Generally speaking,
the most singular part (the radial dependence in the energy integrals $a(\psi_i, \psi_j)$)
must be done analytically. The integration in $\theta$, and if absolutely necessary
also the integrals $a(\varphi_i, \psi_j)$ involving only one singular function, can be done

†A possible alternative is to multiply the singular function $r^\nu \sin\nu\theta$ directly by a
polynomial $q(r)$ which merges smoothly into zero. This eliminates the condition at $r_0$
in favor of $q^{(l)}(0) = \delta_{0l}$.
by high-order numerical integration. A fixed quadrature rule of low accuracy
is totally inappropriate.

The inversion of the stiffness matrix is a more serious problem. The
addition of singular functions destroys the band structure of the matrix and
may lead to extra operations in elimination ("fill in") as well as extra storage
requirements. In addition, the conditioning of the stiffness matrix is very
much in question. The singular functions $\psi_1, \ldots, \psi_s$ can be approximated
by the other finite elements, and hence the basis for $S_s^h$ will be "nearly linearly
dependent."

In practice, both difficulties can be avoided by correctly ordering the
unknowns. Let $\varphi_1, \ldots, \varphi_N$ be a basis for $S^h$ and $\psi_1, \ldots, \psi_s$ be the singular
functions. Then we order the unknowns so that the components of $\psi_1, \ldots, \psi_s$
appear last; that is, the vector of unknowns is $(Q_1 \cdots Q_N\ P_1 \cdots P_s) = (Q\ P)$.
The stiffness matrix for $S_s^h$ is therefore

  $K_s = \begin{pmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{pmatrix}$,

where $K_{11}$ is the standard stiffness matrix for $S^h$ and the other blocks involve
the singular functions. The entries of $K_{22}$ are the energy inner products
$a(\psi_i, \psi_j)$.

Faddeev and Faddeeva [6] then factor $K_s$ into a product of triangular
matrices, which we write in block form as

  $K_s = LU = \begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix} \begin{pmatrix} U_{11} & U_{12} \\ 0 & U_{22} \end{pmatrix}$.

Therefore,

  $K_{11} = L_{11}U_{11}$,  $K_{12} = L_{11}U_{12}$,  $K_{21} = L_{21}U_{11}$,  $K_{22} = L_{21}U_{12} + L_{22}U_{22}$.

Obviously $L_{11}$ and $U_{11}$ are the factors of the usual stiffness matrix $K_{11}$
associated with $S^h$. As a consequence, it is necessary only to store the bands of
$K_{11}$ and the much smaller matrices $K_{12}$, $K_{21}$, and $K_{22}$. (Note that $K_{12} = K_{21}^T$
in the symmetric case.) The additional storage required is only $sN + s^2$,
which is orders of magnitude less than the storage requirement for $K_{11}$. In
addition, the factorization and the calculation of the unknowns $Q$ and $P$
by back substitution represent only $O(w^2 N)$ operations, where $w$ is the
bandwidth of $K_{11}$. This is the same as is required to solve $K_{11}Q = F_1$. In fact,
the bulk of the work is in the factorization. The effects of numerical
instabilities are isolated in the smaller matrices, and it is relatively easy to control
the rounding errors. Typically the only real source of trouble is in the
formation of $K_{22} - L_{21}U_{12}$, and it is often desirable to do the latter multiplication
in high precision.
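
The bordering idea can be sketched in a few lines. This is an illustrative block elimination under stated assumptions, not the Faddeev-Faddeeva routine itself: a dense `numpy.linalg.solve` stands in for the banded factorization $K_{11} = L_{11}U_{11}$, the matrices are small hypothetical test data, and the function name is invented. Note that $K_{22} - L_{21}U_{12} = K_{22} - K_{21}K_{11}^{-1}K_{12}$, the small $s \times s$ matrix that remains to be factored.

```python
import numpy as np

def solve_bordered(K11, K12, K21, K22, F1, F2):
    """Solve [[K11, K12], [K21, K22]] [Q; P] = [F1; F2] by block elimination.

    Only K11 is handled in bulk; the singular-function unknowns P come from
    the small matrix S = K22 - K21 K11^{-1} K12 (equal to K22 - L21 U12).
    """
    # One multi-right-hand-side solve stands in for reusing the factors of K11.
    Y = np.linalg.solve(K11, np.column_stack([K12, F1]))
    X12, y1 = Y[:, :-1], Y[:, -1]           # K11^{-1} K12 and K11^{-1} F1
    S = K22 - K21 @ X12                     # s-by-s bordered block
    P = np.linalg.solve(S, F2 - K21 @ y1)   # coefficients of singular functions
    Q = y1 - X12 @ P                        # back substitution for standard unknowns
    return Q, P

# Hypothetical data: tridiagonal K11 (N = 50) bordered by s = 2 singular functions.
rng = np.random.default_rng(0)
N, s = 50, 2
K11 = 4.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
K12 = 0.1 * rng.standard_normal((N, s))
K21 = K12.T                                 # symmetric case
K22 = 5.0 * np.eye(s)
F1 = rng.standard_normal(N)
F2 = rng.standard_normal(s)

Q, P = solve_bordered(K11, K12, K21, K22, F1, F2)
K_full = np.block([[K11, K12], [K21, K22]])
residual = K_full @ np.concatenate([Q, P]) - np.concatenate([F1, F2])
print("max residual:", np.abs(residual).max())
```

In a production code the dense solve would be replaced by a banded factorization so that the band structure of $K_{11}$ is actually exploited; the extra storage is then the $sN + s^2$ entries of the border.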

These ideas can also be used for eigenvalue problems. The bisection and
(block) inverse power method require factorizations of the mass and
stiffness matrix into $LU$, and if the Faddeev-Faddeeva bordering is used, no
extra problems of storage and numerical stability are created by the addition
of singular functions.

8.3. ERRORS IN THE PRESENCE OF SINGULARITIES

Let $L$ be a $2m$th-order self-adjoint elliptic operator with homogeneous
boundary data, and let $a(v, w)$ be the associated inner product on the energy
space $\mathcal{H}_E^m$. If the problem $Lu = f$ has interface or boundary singularities such
as those described in Section 8.1, the error estimates derived in Section 3.4 are no
longer valid. In this section we shall modify the earlier analysis to obtain the
correct rates of convergence in the presence of singularities.

Using expansions analogous to those derived in Section 8.1, we can write
the exact solution as a sum of singular functions plus a smooth function:

(24)  $u = \sum_{i=1}^{s} c_i\psi_i + w$.

Each singular function $\psi_i$ is in $\mathcal{H}^\sigma$ for some $\sigma > m$, and is independent of
the data $f$; it depends only on the geometry of $\Omega$ in the case of corners in
the boundary, and on the coefficients of $L$ for interface problems. We may
use either $\psi_i = r^{\nu_i}\varphi_i(\theta)$, as in (11), or the function constructed in (21); all
that matters is to keep the correct behavior near $P$. The smooth function $w$
and the coefficients $c_1, \ldots, c_s$, on the other hand, do depend on $f$. According
to the fundamental work of Kondrat'ev [K3], it is not only possible to ensure
that $w$ is in $\mathcal{H}^k$ (by including sufficiently many $\psi_i$) but even to estimate its
size:

(25)  $\|w\|_k + \sum_{i=1}^{s} |c_i| \le C\, \|f\|_{k-2m}$.

Observe that if $\Omega$ and the coefficients of $L$ are smooth, then the associated
singular functions $\psi_1, \ldots, \psi_s$ are zero and hence (25) reduces to the usual
bound for the solution in terms of the data.

As a first case let us derive the rate of convergence for the finite element
method when no special tricks (mesh refinement or singular functions)
are used. The error in strain energy presents no difficulties. As always, $u^h$
is the closest trial function to $u$, and if $u$ lies in $\mathcal{H}^k$, then this error is of order
$h^{2(k-m)}$. In general, however, $u$ will lie only in some less smooth space $\mathcal{H}^\sigma$,
and the rate of convergence in energy is reduced to $h^{2(\sigma-m)}$. This error is
likely to be unacceptably large.

Suppose that we attempt to carry out Nitsche's trick, as in Sections 1.6
and 3.4, to estimate the error in displacements without singular functions.
As before, we take $u - u^h$ to be the data $g$ in an auxiliary problem, whose
solution is denoted by $z$: $Lz = u - u^h$. Then the argument can proceed
unchanged, except at one crucial point: The estimate $\|z\|_{2m} \le C\|u - u^h\|_0$
may no longer be true. Indeed, in light of (24), it may happen that $z$ contains
nonzero components of the singular terms $\psi_1, \ldots, \psi_s$, and hence we must
be content to work with the weaker inequality

  $\|z\|_\sigma \le C\, \|u - u^h\|_0$,

which follows from (25) (replacing $u$ with $z$). The effect of this weaker
inequality is that a factor $h^{2m-\sigma}$ has to be sacrificed, and the optimal bound is

  $\|u - u^h\|_0 = O(h^{\,r+\sigma-2m})$,  $r = \min(k, \sigma)$.

For example, in the torsion experiment for a crack described in the next
section, $\sigma = 3/2$. For any choice of element this forces the error in the slopes
to be $O(h^{1/2})$ and the error in displacement to be $O(h)$.
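
The bookkeeping of exponents can be made explicit. The helper below is a hypothetical convenience, not from the text; it encodes only the two rates stated above (energy-norm error $h^{r-m}$ and displacement error $h^{r+\sigma-2m}$ with $r = \min(k, \sigma)$) and is meant for the singular case $\sigma \le 2m$ in which the weaker inequality is actually needed.

```python
def convergence_exponents(k, sigma, m=1):
    """Return (slope_rate, displacement_rate) for elements of degree k - 1
    applied, without special tricks, to a solution that lies only in H^sigma.

    slope_rate:        energy-norm error behaves like h**(r - m)
    displacement_rate: H^0 error behaves like h**(r + sigma - 2*m)
    where r = min(k, sigma). Intended for sigma <= 2*m (singular case).
    """
    r = min(k, sigma)
    return r - m, r + sigma - 2 * m

# Crack problem: second-order equation (m = 1), sigma = 3/2, cubic elements (k = 4).
slope_rate, displacement_rate = convergence_exponents(k=4, sigma=1.5, m=1)
print("slopes: h^%g, displacement: h^%g" % (slope_rate, displacement_rate))
```

The rates do not depend on $k$ once $k \ge \sigma$, which is the point of the remark "for any choice of element".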

This is the error over the whole domain $\Omega$. Away from the singularity
it is reasonable to hope for something better, since elliptic equations always
have a strong smoothing effect in the interior of $\Omega$. In fact, if it were a question
of ordinary least-squares approximation by piecewise polynomials, there is
apparently no pollution from the singularity; if $u$ has $k$ derivatives in a
subdomain $\Omega'$, then even without special tricks the best least-squares
approximation over $\Omega$ is correct to order $h^k$ in $\Omega'$ [N6]. For second-order equations this
is no longer true, and some pollution does occur. However, the exponent still
is better within $\Omega'$ than near the singularity. Suppose, for example, that the
solution behaves like $r^\alpha$ near a corner in the domain. Then, according to
Nitsche and Schatz, the error in the energy norm over $\Omega'$ which is attributable
to the singularity is of order $h^{2\alpha}$. For the region with a crack this means an
error in $\mathcal{H}^1$ of order $h$ away from the singularity, as compared with $h^{1/2}$
over the whole domain $\Omega$.

Now we turn to the important question: the rate of convergence when
singular functions are introduced into the trial space. For the energy norm
we note that by construction the singular space $S_s^h$ contains at least one
function $U^h$ satisfying (according to (23))

(26)  $a(u - U^h, u - U^h) \le C h^{2(k-m)}\, \Big\|u - \sum_{i=1}^{s} c_i\psi_i\Big\|_k^2$.

Therefore, the same bound holds for $u - u^h$.

Proceeding to the $\mathcal{H}^0$ estimate we again consider the auxiliary problem
$Lz = u - u^h$. The crucial point of Kondrat'ev's theory is that $z$ can be written
as $\sum_{i=1}^{s} d_i\psi_i + v$, where $v$ lies in $\mathcal{H}^k$ and

(27)  $\|v\|_k + \sum_{i=1}^{s} |d_i| \le C\, \|u - u^h\|_{k-2m}$.

Since $v$ is smooth, we can approximate it with a trial function $v^h$ in $S^h$ to
order $h^{k-m}$ in the energy norm. (We assume $k \ge 2m$, as is normal for finite
elements.) On the other hand, the function $\sum d_i\psi_i$ is in our singular space
$S_s^h$, and so is $z^h = \sum d_i\psi_i + v^h$. Therefore,

(28)  $a(z - z^h, z - z^h) = a(v - v^h, v - v^h) \le C h^{2(k-m)}\, \|v\|_k^2$.

Repeating the (rather complicated) argument of Section 4.4,

(29)

This means that by including enough singular functions, it is possible to
obtain the same rate of convergence as for smooth problems. A similar
conclusion applies to mesh refinement, if the mesh size is taken to be an average
$\bar h = N^{-1/2}$, based on the dimension $N$ of the trial space. In this case the
singular functions are not in the space, but according to the final comments
in Section 3.2, a suitable mesh refinement permits their approximation to
order $\bar h^k$. With this estimate the Nitsche argument goes as before.

It is obvious that all these theoretical predictions must be thoroughly
tested. In complicated physical problems, it may be extremely difficult to
identify the singularities and to incorporate them into special trial functions.
Therefore, such a construction will be carried out only if the benefits are
correspondingly great. Even mesh refinement introduces some complications,
although it is usually much simpler than the construction of singular
functions. All we can do here is to try each of these possibilities on a number of
simplified physical problems and report the results. By glancing at the graphs
in the following section, the reader can anticipate our conclusion: High
accuracy has to be paid for, either by computer time on a simple method or
by programming time with a more subtle technique. The prices vary with the
problem. But almost certainly, from his own experience, the reader already
knew that.

8.4. EXPERIMENTAL RESULTS

We conclude this chapter with three examples drawn from physics: (1)
the computation of the rigidity and deformation of a cracked square elastic
beam under torsion; (2) the criticality computation in an idealized square
nuclear reactor consisting of a homogeneous square core surrounded by a
square reflector, in the one-group diffusion approximation; and (3) the
computation of the fundamental frequency of a vibrating L-shaped membrane.
CHAP. 8 SEC. 8.4. EXPERIMENTAL RESULTS 269
In the torsion problem we give a numerical comparison of local mesh

refinement as against the use of singular functions. For a given finite element
their rates of convergence are the same, and efficiency hinges largely on the
number of unknowns required. In· the reactor problem, on the other hand,
it with a trial function 1i in Sh to
we shall be Iess concerned with singu1arities and concentrate attention on
Jme k > 2m, as is normal for finite
efficient methods for treating interfaces. Finally, the L-shaped membrane
ion ~ d1t¡11 is in our;~:~ingular space
is included beca use of íts long history as a model of an ellfptic problem with
~e,
a singularity. In fact, specia1 inethods developed for this problem have pro-
duced extremely accurate approximations to the vibrational frequencies.
We shall compare these results with those obtained by the finite element
nent of Section 4.4, method.
The differential equation governing the torsion problem (1) can be written
in normalized form as
singular functions, it. is possible to

(30) -ll.u = 1 in n, u= o on r.
$\Omega$ is the region shown in Fig. 8.3 and $\Gamma$ includes the crack $P_1P$. Our expansions about $P$ in Section 8.1 reduce to

$u = \sum_{j \ge 1} c_j\, r^{\nu_j} \sin \nu_j\theta, \qquad \nu_j = \frac{2j-1}{2},$

plus analytic terms. Thus the dominant term in the singularity at the point $P$ is $r^{1/2} \sin(\theta/2)$. Its coefficient

(31)  $c_1 = \lim_{r \to 0} r^{-1/2}\,[u(r, \pi) - u(0, \pi)]$

has great engineering significance; it is the commonly accepted measure of the torsion which the beam can endure before fracture occurs and is called the stress intensity factor [I8]. We note that because of the factor $r^{1/2}$, $u$ falls just short of 3/2 derivatives in the mean-square sense.
In the problem as stated there are also singularities of the form $\rho^2 \ln \rho$ at each of the corners $P_2$, $P_3$, $P_4$, and $P_5$. Since these singularities are comparatively insignificant, we shall remove them by changing the boundary conditions to

(32)

where $\nu$ denotes the normal to $\Gamma$. That the solution $u$ of the new boundary-value problem is analytic away from $P$ can be verified using the techniques of Section 8.1.†

†By the Saint-Venant principle this change of boundary conditions will not affect the singularity at $P$, a fact which is also evident from our analysis in Section 8.1.

We shall compute approximations from four different spaces; the first three deal with a uniform mesh of length $h$ and use singular functions constructed from $r^{\nu_j} \sin \nu_j\theta$, $\nu_j = (2j-1)/2$, as in Section 8.2. These are the following:
1. $S^h_L$ denotes the space of continuous functions on a uniform mesh which reduce to a bilinear polynomial $a + bx + cy + dxy$ on each subdivision. This space also includes the singular function $\psi_1$ constructed from $r^{1/2} \sin(\theta/2)$, so that we have the approximation property (26) with $k = 2$.

2. $S^h_H$ denotes the bicubic Hermite space (Section 1.8) together with the singular functions $\psi_1$, $\psi_2$, and $\psi_3$, so that (26) holds with $k = 4$.
These examples fall into the category of nodal finite element spaces in the sense that we have discussed earlier, and so it is perhaps appropriate to include also an example of an abstract finite element space. The space of bicubic splines which are of class $C^2$ in $\Omega$ is the most likely candidate; however, it is a difficult space to work with for this problem. The troubles are related to the essential boundary condition $u = 0$ on the crack $PP_1$. A bicubic spline which vanishes along this line will be so constrained at the point $P$ that it cannot possibly match the true solution $u$, whose derivatives are all singular. To avoid this, we impose only simple continuity ($C^0$) across the lines $PQ_2$, $PQ_3$, and $PQ_4$ and obtain what is usually called a spline-Lagrange space:
3. $S^h_{SL}$ consists of piecewise bicubic polynomials which are $C^2$ everywhere except at the lines $PQ_2$, $PQ_3$, and $PQ_4$. Across these lines the normal derivatives are permitted to be discontinuous. As with $S^h_H$, we include the singular functions $\psi_1$, $\psi_2$, and $\psi_3$ so that (26) holds with $k = 4$.

The advantage of the spline-Lagrange space is that it has only one unknown per node, except along the lines $PQ_1$, $PQ_2$, $PQ_3$ where there are three. Thus the dimension of $S^h_{SL}$ is nearly four times smaller than that of the Hermite space $S^h_H$, and virtually the same as the dimension of the piecewise bilinear space $S^h_L$. Incidentally, a basis for $S^h_{SL}$ is readily obtained from the standard spline formulas, by considering the lines $PQ_1$, $PQ_2$, and $PQ_3$ as a triple confluence of nodal lines.
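
The dimension comparison can be made concrete with a small count on a uniform mesh of $n \times n$ squares. This is only a sketch under simplifying assumptions: boundary conditions, the singular functions, and the extra unknowns along the lines through $P$ are all ignored, and the function name `unknown_counts` is hypothetical.

```python
def unknown_counts(n):
    """Approximate number of unknowns on a uniform n-by-n mesh of squares.

    Boundary conditions, singular functions, and the extra unknowns along
    the lines through P are ignored in this rough count.
    """
    nodes = (n + 1) ** 2
    return {
        "bilinear": nodes,                  # one value per node
        "bicubic Hermite": 4 * nodes,       # u, u_x, u_y, u_xy at each node
        "bicubic C2 spline": (n + 3) ** 2,  # tensor-product B-spline coefficients
    }

for n in (8, 16, 32, 64):
    counts = unknown_counts(n)
    ratio = counts["bicubic Hermite"] / counts["bicubic C2 spline"]
    print(n, counts, f"Hermite/spline ratio = {ratio:.2f}")
```

The ratio of Hermite to spline unknowns approaches four as the mesh is refined, while the spline and bilinear counts stay close to each other.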
Our final space consists of triangular elements and uses a graded mesh of maximum length $h$ and minimum length $\delta$.

4. $S^{h,\delta}$ denotes the space of continuous functions which reduce to linear polynomials on each triangle. We assume that $\delta = O(h^2)$, so that (26) holds with $k = 2$.†

†In the calculations, the transition from mesh size $h$ to $\delta = h^2$ was achieved by progressively halving the triangles.

The estimates in Section 8.3 imply that the square root of $\iint |u - u^h|^2$ is $O(h^4)$ for the cubic spaces, and $O(h^2)$ for $S^h_L$ and $S^{h,\delta}$. We cannot measure this error directly, however, since the solution $u$ is not known in closed form. Therefore, we also used a quintic element with several singular functions. More precisely, we computed with the quintic splines, of class $C^4$ in $\Omega$ except across the lines $PQ_1$, $PQ_2$, and $PQ_3$ (a quintic spline-Lagrange space), augmented by six singular functions. A very rapidly convergent approximation is obtained with this space; the rate of convergence is $h^6$, and the approximate solutions for three values of $h$ differed only in the seventh place. The first six figures were taken to be a correct value for $u$.
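
The grading rule $\delta = O(h^2)$ for the triangular space in item 4, together with the progressive halving mentioned in its footnote, can be illustrated by listing the element sizes between $h$ and $\delta$. The sketch only counts the size levels; how the halved triangles are arranged around $P$ is not specified in the text, and the function name `halving_levels` is hypothetical.

```python
def halving_levels(h):
    """Element sizes h, h/2, h/4, ... down to the first one not exceeding h**2."""
    sizes = [h]
    while sizes[-1] > h * h:
        sizes.append(sizes[-1] / 2.0)
    return sizes

for h in (1 / 4, 1 / 8, 1 / 16):
    sizes = halving_levels(h)
    print(f"h = {h}: {len(sizes) - 1} halvings, smallest size {sizes[-1]}")
```

When $h$ is a power of 1/2, the number of halvings is $\log_2(1/h)$, so the count of extra size levels near the singularity grows as the mesh is refined.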

Figures 8.4 and 8.5 show the errors at the points $R_1$ and $R_2$ plotted against an average mesh length $\bar{h} = N^{-1/2}$.† This permits the selection of the most "efficient" space, meeting a given error tolerance with the least number of unknown parameters. In this regard $S^h_L$ appears to be more efficient than the space $S^{h,\delta}$ of triangular elements using local mesh refinement. For the latter we used two pairs of values of $h$ and $\delta$, chosen according to the rule in 4. Thus many extra elements are needed to obtain the $h^{3/2}$ convergence, yet with $S^h_L$ only one extra unknown, the coefficient of the singular function $\psi_1$, is required. The spaces $S^h_H$ and $S^h_{SL}$ of cubic elements were far superior to $S^h_L$, and the space of minimum dimension, namely the spline-Lagrange space $S^h_{SL}$, appears to be the most efficient of all.

The graphs also indicate the (larger) errors for finite element spaces without singular functions. There are three interesting things to note about these figures. The first, naturally, is the improvement obtained with singular functions: at $R_1$ and a fixed $h$, for example, the relative error without singular functions is about 40 per cent, and this drops to 0.1 per cent when singular functions are added. The second is that in the absence of singular functions, the pointwise errors are largest near $P$. In particular, it appears that these errors are of order $O(h^{1/2})$ near $P$ and $O(h)$ everywhere else.‡ The third is that the standard Hermite cubics come out worse than the simplest linear elements. The cubics are too smooth to cope with the singularity.

Finally, in Fig. 8.6, we give approximations to the stress intensity factor (31). For the spaces $S^h_L$, $S^h_H$, and $S^h_{SL}$ we use the coefficient $c_1^h$ of the singular function $\psi_1$, since

$c_1^h = \lim_{r \to 0} r^{-1/2}\,[u^h(r, \pi) - u^h(0, \pi)].$

For the space $S^{h,\delta}$ this quantity is zero, and it is necessary to choose some $\epsilon_h > 0$ and replace the limit by a difference quotient at $r = \epsilon_h$.

†While the error estimates cited above give no information concerning pointwise errors, the rates of convergence appear to be the same for this problem, as can be verified from the graphs.

‡An analysis similar to that in Section 8.3 shows that the mean-square error in this case is $O(h)$, hence the area of the region about $P$ where the pointwise error is $O(h^{1/2})$ must be quite small and converge to zero as $h \to 0$. This explains the kink in the error with linear functions; $R_2$ is within the boundary layer, when $h$ is small.

[Fig. 8.4. Absolute errors at the point $R_1$ plotted against $1/h$ on logarithmic scales; $u(R_1) = 0.071023$. Curves are shown for the bicubic Hermite and bilinear spaces without singular functions and for the spaces $S^{h,\delta}$, $S^h_L$, $S^h_{SL}$, and $S^h_H$, each marked with its slope.]

[Fig. 8.5. The same plot for the point $R_2$; $u(R_2) = 0.027425$.]

[Fig. 8.6. Absolute errors in the approximate stress intensity factor plotted against $1/h$ on logarithmic scales; stress intensity factor = 0.1917. Curves are shown for the bicubic Hermite and bilinear spaces without singular functions and for $S^{h,\delta}$ (all three using the difference quotient approximation), and for $S^h_L$, $S^h_{SL}$, and $S^h_H$, each marked with its slope.]

Since the errors in the finite element approximation without singular functions grow quite rapidly as the point $P$ is approached, it turns out that the choice $\epsilon_h = O(h)$ is as good as any. These are the values given in Fig. 8.6. The conclusions are the same as in the previous experiments, except that mesh refinement is even less competitive because of the extra error made in the difference quotient.
Our second example is the one-group, two-region reactor governed by

(33)  $-\nabla \cdot (p \nabla u) + qu = \lambda \rho u.$

This differential equation holds in the core $\Omega_1$ and the reflector $\Omega_2$ (see Fig. 8.7), and we require the energy flux $u$ to vanish on the outer boundary:

(34)  $u = 0$

[Fig. 8.7. Square core surrounded by a square reflector.]

The coefficients $p$, $q$, and $\rho$ are regionwise constants. Thus the curve $\Gamma$ separating $\Omega_1$ and $\Omega_2$ is an interface, and we require that $u$ and $p\,\partial u/\partial \nu$ be continuous across $\Gamma$:

(35)  $p \frac{\partial u}{\partial \nu}\Big|_{\Gamma_+} = p \frac{\partial u}{\partial \nu}\Big|_{\Gamma_-}, \qquad u|_{\Gamma_+} = u|_{\Gamma_-}.$

The most important quantity to calculate is the lowest eigenvalue $\lambda$, which measures the criticality of the reactor.

Independent of any corner, the existence of interfaces fundamentally alters the best choice of an appropriate finite element space $S^h$. To use piecewise polynomials which are $C^1$ across $\Gamma$ would clearly lead to poor approximations, since $u$ has discontinuous derivatives. Moreover, the use of trial functions satisfying the jump condition (35) leads to extra difficulties at the corners $P_j$, $j = 1, 2, 3, 4$. If we force the trial functions to satisfy the jump conditions along $P_1P_2$, then they will still be influential on a portion of $Q_1P_1$ where the solution is smooth.

Because of the convenient geometry, a spline-Lagrange space like that described in the torsion problem appears to be the most appropriate. The idea is to subdivide the region $\Omega$ into squares, with the interfaces lying on mesh lines. The trial space $S^h$ consists of piecewise bicubic polynomials which are $C^2$ everywhere except on the lines $Q_1Q_4$, $Q_2Q_7$, $Q_3Q_6$, and $Q_5Q_8$, across which the elements are only continuous. We ignore the jump condition (35), which is a natural boundary condition, and allow the trial functions to have arbitrary jump discontinuities in their normal derivatives across the interfaces. Since the Galerkin method gives us in some sense a best approximation, it will presumably work out the jump discontinuities across the interfaces in a satisfactory way!
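
A one-dimensional analogue shows the mechanism in miniature. It is an assumption made here for brevity, not the two-dimensional reactor computation: continuous piecewise linear elements are applied to $-(p u')' = 1$ on $(0, 1)$ with $u(0) = u(1) = 0$, where $p = p_1$ on the left half and $p = p_2$ on the right half, and the interface $x = 1/2$ is a mesh node. No jump condition is imposed on the trial functions; the function names and the choice of load are hypothetical.

```python
def solve_interface_1d(n, p1=5.0, p2=1.0):
    """Linear finite elements for -(p u')' = 1 on (0, 1), u(0) = u(1) = 0.

    p = p1 on (0, 1/2) and p = p2 on (1/2, 1); n must be even so that the
    interface x = 1/2 is a mesh node. Returns nodal values and mesh size.
    """
    if n % 2:
        raise ValueError("n must be even so the interface is a node")
    h = 1.0 / n
    p = [p1 if e < n // 2 else p2 for e in range(n)]  # coefficient per element
    m = n - 1                                         # number of interior nodes
    diag = [(p[i] + p[i + 1]) / h for i in range(m)]
    off = [-p[i + 1] / h for i in range(m - 1)]       # coupling through element i+1
    rhs = [h] * m                                     # load vector for f = 1
    # Thomas algorithm for the symmetric tridiagonal system.
    for i in range(1, m):
        w = off[i - 1] / diag[i - 1]
        diag[i] -= w * off[i - 1]
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * m
    u[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        u[i] = (rhs[i] - off[i] * u[i + 1]) / diag[i]
    return [0.0] + u + [0.0], h

def exact(x, p1=5.0, p2=1.0):
    """Exact solution; the flux p u' = A - x is continuous at x = 1/2."""
    a, b = 1.0 / p1, 1.0 / p2
    A = (a + 3.0 * b) / (4.0 * (a + b))
    if x <= 0.5:
        return a * (A * x - 0.5 * x * x)
    return b * (A * x - 0.5 * x * x) + b * (0.5 - A)

def flux_jump(u, h, p1=5.0, p2=1.0):
    """Difference of p*u' between the two elements adjacent to the interface."""
    mid = (len(u) - 1) // 2
    return p1 * (u[mid] - u[mid - 1]) / h - p2 * (u[mid + 1] - u[mid]) / h

for n in (4, 8, 16, 32):
    u, h = solve_interface_1d(n)
    err = max(abs(u[i] - exact(i * h)) for i in range(n + 1))
    print(f"n = {n:2d}  max nodal error = {err:.1e}  flux mismatch = {flux_jump(u, h):.4f}")
```

In this model the discrete slopes on the two sides of the interface differ by roughly the ratio of the coefficients, while the mismatch in $p u'$ between the adjacent elements shrinks with $h$. The natural condition is thus recovered in the limit without being built into the trial space.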
"S 7o
Approximations to the first eigenvalue using this space are given in Fig. o
1/)
6o '
.o 5o
8.8 for the case
~
e::(
4ó
Pt = 5, Pz = 1, q =O, p=l. 3o
t-
1 \
The approximations A_h are reasonably accurate, yet the rate of their conver- 20
gence to l is very slow. In particular,
~
(36) V= 0.78.
10
8

The reason is that the eigenfunction $u$ has unbounded first derivatives at the points $P_1$, $P_2$, $P_3$, and $P_4$. In fact, the analysis in Section 8.1 shows that the dominant term in the singularity is of the form $r^\nu \varphi_\nu(\theta)$, where $\varphi_\nu$ is a periodic function of $\theta$ and $\nu = 0.78$. Therefore, the eigenvalue error cannot be better than $h^{2\nu}$. By including $r^\nu \varphi_\nu$ in the trial space, the convergence rate can be increased to approximately $h^{6-2\nu}$. This is also confirmed by the numerical data.
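
To get a feeling for the two rates, the sketch below evaluates the exponents $2\nu$ and $6 - 2\nu$ for $\nu = 0.78$ and counts how many halvings of $h$ an error behaving like $h^{\text{rate}}$ would need to drop by a given factor. It is plain arithmetic on the quoted exponents, with constants ignored; the function name `halvings_needed` and the factor 1000 are hypothetical.

```python
import math

def halvings_needed(rate, factor):
    """Halvings of h needed to reduce an error ~ h**rate by the given factor."""
    return math.ceil(math.log(factor) / (rate * math.log(2.0)))

nu = 0.78                    # exponent quoted in the text for p1 = 5, p2 = 1
rate_plain = 2 * nu          # cubic space without singular functions
rate_singular = 6 - 2 * nu   # cubic space with the singular function included

for name, rate in (("without", rate_plain), ("with", rate_singular)):
    print(f"{name} singular function: error ~ h**{rate:.2f}, "
          f"each halving of h divides it by about {2 ** rate:.1f}, "
          f"{halvings_needed(rate, 1000.0)} halvings for a factor of 1000")
```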

Because of the singularity, it may seem surprising that $\lambda^h$, on a uniform mesh without singular functions, is reasonably accurate. This is in striking contrast to the situation for the torsion problem. The reason is that the coefficient of $r^\nu \varphi_\nu$ is quite small; computer plots of the eigenfunction $u^h$ show that it is virtually a constant in the core $\Omega_1$. We have checked this property (which is a consequence of the physics) by computing also the critical eigenvalue for
the case†

$p_1 = 500, \quad p_2 = 1, \quad q = 0, \quad \rho = 1.$

This eigenvalue is shifted only to 5.582 and is therefore virtually independent of $p_1$; the contribution to the Rayleigh quotient from the inner square $\Omega_1$ is almost zero.

[Fig. 8.8. Absolute error in the approximate eigenvalue plotted against $1/h$ on logarithmic scales. Curves: cubic space without singular functions (slope $\approx 2\nu$) and cubic space with one singular function (slope $\approx 6 - 2\nu$).]

From a physical point of view this "weak" coupling to a "strong" singularity is quite satisfactory. In fact, unlike the torsion problem, the singularity at $P$ has no physical meaning; equation (33) should be replaced with a transport equation in this region. It would appear, therefore, that singular functions are not absolutely necessary for such problems, and a great deal of mesh refinement may not be worth the effort. This conclusion does not apply to all interface problems. It was mentioned in the first section that a crossing of interfaces could produce a dominant term in the singularity of any order $r^\epsilon \varphi_\epsilon(\theta)$.‡ It seems reasonable that a small $\epsilon$ will be reflected in poor approximations, as in the torsion problem, and that singular functions or local mesh refinement may be necessary to obtain acceptable results.

†$\nu$ in this case is approximately 2/3.

‡For the one-group, two-region reactor, the eigenvalues are the roots of (20), and $\epsilon \ge 2/3$.

Our final example is the perennial L-shaped membrane, an overworked but nevertheless effective model (Fig. 8.9). We seek the eigenvalues of $-\Delta u = \lambda u$ in $\Omega$, $u = 0$ on $\Gamma$, and note from Section 8.1 that the eigenfunctions behave like

(37)  $u(r, \theta) = \sum_{j=1}^{\infty} \sum_{l=0}^{\infty} c_{jl}\, r^{\nu_j + 2l} \sin \nu_j\theta, \qquad \nu_j = \frac{2j}{3},$

(plus analytic terms) near the reentrant corner $P$.

[Fig. 8.9. An L-shaped membrane with vertices $(-1, 1)$, $(1, 1)$, $(1, 0)$, $P = (0, 0)$, $(0, -1)$, and $(-1, -1)$; $(r, \theta)$ are polar coordinates about $P$.]

The most accurate approximations to date have been computed by Fox-Henrici-Moler [F10]. The idea is to take a linear combination of solutions to $\Delta u + \lambda u = 0$, in this case combinations of $K_\nu = J_\nu(\sqrt{\lambda}\, r) \sin \nu\theta$, $\nu = \nu_j$, and to determine the coefficients by minimizing the linear combination on the boundary $\Gamma$. This method is in a sense dual to the Galerkin method: it works with exact solutions and approximates the boundary conditions. The essential point, however, is the class of functions which are used. It is known that the eigenfunctions $u$ can be very accurately approximated by linear combinations of the $K_\nu$, and therefore a Galerkin method would produce approximations with similar accuracy if the same class of functions were used. Strictly speaking, the functions $K_\nu$ are not admissible, since they violate the essential boundary conditions. However, calculations with the closely related functions $(1 - x^2)(1 - y^2)\, r^{\nu + 2l} \sin \nu\theta$ are reported in [F9] which are comparable with those obtained by Fox-Henrici-Moler.
n 8.1 that the eigenfunctions behave These calculations reflect the remarkable accuracy which can be achieved
by knowing exactly the right trial functions-which are not always piecewise
polynomials! (Fourteen singular functions would be needed, together with
( 1' 1)
----------, bicubics, to improve on the results described above.) Our chief interest,
however, is to obtain good accuracy with simple modifications of a standard
finite element program. Therefore, we have computed the first and fourth
eigenvalues, using a cubic spline space with and without singular functions.
Figure 8.1 O illustrates how the slow rate of convergence (h 413 ) in the latter
case is greatly· in~reased by the introduction of three singular trial functions.
There is no question that the construction of these special functions makes
' the whole program more efficient.
::>:(0,0} (1,0}

The conclusion which we draw from all the numerical evidence is this: Even for coarse meshes and for singular problems, the rates of convergence which the theory predicts are clearly reproduced by the computations. The engineering literature contains a large number of numerical experiments, also leading to the same conclusion. This means that our goal, to analyze the steps of the finite element method and to explain its success, is largely achieved. We hope that this analysis will provide a theoretical basis for the future development of the method. The simplicity and convenience of polynomial elements was already clear, and now the accuracy which they achieve is mathematically confirmed.

[Fig. 8.10. Absolute errors $\delta\lambda_1$ and $\delta\lambda_4$ in the first and fourth eigenvalues plotted against $1/h$ on logarithmic scales; $\lambda_1 = 9.639723844$, $\lambda_4 = 41.474516$. Curves: cubics without singular functions (slope $\approx 4/3$) and cubics with 3 singular functions (slope $\approx 6$).]

BIBLIOGRAPHY

The number of publications on the finite element method is growing, within both the engineering and the numerical analysis literatures, at a rate which renders impossible any attempt at completeness. A bibliography which includes 170 titles prior to 1971 is contained in the excellent survey

ZIENKIEWICZ, O. C. (1970), The finite element method: from intuition to generality, Appl. Mech. Rev. 23, 249-256.

The following books have been referred to in preparing the present text, except for the three which appeared too recently (1972) for us to consult:

1. AGMON, S. (1965), Lectures on Elliptic Boundary Value Problems, Van Nostrand Reinhold, New York.
2. ARGYRIS, J. H. (1954-1955), Energy Theorems and Structural Analysis, Butterworth, London, 1960.
3. ARGYRIS, J. H. (1964), Recent Advances in Matrix Methods of Structural Analysis, Pergamon Press, Elmsford, N.Y.
4. AUBIN, J. P. (1972), Approximation of Elliptic Boundary-Value Problems, Wiley, New York.
5. DESAI, C., and J. ABEL (1972), Introduction to the Finite Element Method, Van Nostrand Reinhold, New York.
6. FADDEEV, D. K., and V. N. FADDEEVA (1963), Computational Methods of Linear Algebra, Dover, New York.
7. KRASNOSEL'SKII, M. A., G. M. VAINIKKO, et al. (1969), Approximate Solution of Operator Equations, Nauka, Moscow (in Russian).
8. HOLAND, I., and K. BELL, eds. (1969), Finite Element Methods in Stress Analysis, Tapir, Trondheim, Norway.
9. LIONS, J. L. (1969), Quelques méthodes de résolution des problèmes aux limites non linéaires, Dunod, Paris.
10. LIONS, J. L., and E. MAGENES (1968), Problèmes aux limites non homogènes et applications, Dunod, Paris.
11. MIKHLIN, S. G. (1964), Variational Methods in Mathematical Physics, Pergamon Press, Elmsford, N.Y.
12. MIKHLIN, S. G. (1965), The Problem of the Minimum of a Quadratic Functional, Holden-Day, San Francisco.
13. MIKHLIN, S. G. (1971), The Numerical Performance of Variational Methods, translated from the 1966 Russian edition, Wolters-Noordhoff, Groningen.
14. NECAS, J. (1967), Les méthodes directes en théorie des équations elliptiques, Academia, Prague.
15. ODEN, J. T. (1972), Finite Elements of Nonlinear Continua, McGraw-Hill, New York.
16. PRZEMIENIECKI, J. S. (1968), Theory of Matrix Structural Analysis, McGraw-Hill, New York.
17. SYNGE, J. L. (1957), The Hypercircle in Mathematical Physics, Cambridge University Press, New York.
18. VAINBERG, M. M. (1964), Variational Methods for the Study of Nonlinear Operators, Holden-Day, San Francisco.
19. VISSER, M. (1968), The Finite Element Method in Deformation and Heat Conduction Problems, Delft, Holland.
20. WILKINSON, J. H. (1963), Rounding Errors in Algebraic Processes, Prentice-Hall, Englewood Cliffs, N.J.
21. WILKINSON, J. H., and C. REINSCH (1971), Linear Algebra, Springer-Verlag, Berlin.
22. ZIENKIEWICZ, O. C. (1971), The Finite Element Method in Engineering Science, 2nd ed., McGraw-Hill, New York.
23. ZIENKIEWICZ, O. C., and G. S. HOLISTER, eds. (1965), Stress Analysis, Wiley, New York.

It will also be useful to list some of the recent conferences and symposia which have concentrated on the theory and applications of finite elements. Their published proceedings contain a large number of valuable papers. Those to which we have referred specifically in the text will reappear in the list of individual papers below, with an indication such as "Wright-Patterson Conference II."

1. Proceedings of the First Conference on Matrix Methods in Structural Mechanics, Wright-Patterson AFB, Ohio, 1965.
2. Proceedings of the Second Conference on Matrix Methods in Structural Mechanics, Wright-Patterson AFB, Ohio, 1968.
3. Proceedings of the Third Conference on Matrix Methods in Structural Mechanics, Wright-Patterson AFB, Ohio, 1971 (to appear).
4. Proceedings IUTAM Symposium, High Speed Computing of Elastic Structures, Liège, Belgium, 1970.
5. Numerical Solution of Partial Differential Equations (SYNSPADE, University of Maryland), ed. B. Hubbard, Academic Press, New York, 1971.
6. The Mathematical Foundations of the Finite Element Method (University of Maryland at Baltimore), Academic Press, New York, 1973.
7. Numerical Solution of Field Problems in Continuum Physics, ed. G. Birkhoff, Duke University, SIAM-AMS Proceedings, Vol. 2, 1970.
8. SMD Symposium on Computer-Aided Engineering, ed. G. L. M. Gladwell, University of Waterloo, May 1971.
9. Finite Element Techniques in Structural Mechanics, eds. H. Tottenham and C. Brebbia, Southampton University Press, 1970.
10. Conference on Variational Methods in Engineering, Southampton University, England, 1972.
11. Conference on the Mathematics of Finite Elements and Applications, Brunel University, England (to be published by Academic Press), 1972.
12. Proceedings of the International Symposium on Numerical and Computer Methods in Engineering, University of Illinois, 1971.
13. Proceedings of the American Nuclear Society Meeting, Boston, 1971.
14. Proceedings of the First International Conference on Nuclear Reactor Structures, Berlin, 1971.
15. Proceedings of the First Symposium on Naval Structural Mechanics, eds. J. H. Goodier and H. J. Hoff, Pergamon Press, 1960.
16. Proceedings of the Symposium on Finite Element Techniques, ISD, Stuttgart, 1969.
17. Recent Advances in Matrix Methods of Structural Analysis and Design, eds. J. T. Oden, R. H. Gallagher, and Y. Yamada, University of Alabama Press, 1971. (First Japan-U.S. seminar; the second will be held at Berkeley in 1972.)
18. Application of Finite Element Methods to Stress Analysis Problems in Nuclear Engineering, ISPRA, Italy, 1971.
19. Conference on Computer Oriented Analysis of Shell Structures, Lockheed Palo Alto Research Laboratories, Palo Alto, Calif., 1970 (papers to appear in the Journal of Computers and Structures).
20. Symposium on the Application of Finite Element Methods in Civil Engineering, Vanderbilt University, 1969.
21. Symposium on Application of the Finite Element Method in Stress Analysis, Swiss Society of Architects and Engineers, Zurich, 1970.
22. National Symposium on Computerized Structural Analysis and Design, George Washington University, 1972.
23. On General Purpose Finite Element Computer Programs, ed. P. V. Marcal, Amer. Society of Mechanical Engineers.
24. Proceedings of the NATO Advanced Study Institute, Lisbon, 1971.
25. Computational Approaches in Applied Mechanics, ASME Joint Computer Conference, Chicago, 1969.
The following bibliography contains those papers which are commented on in the text, together with many others which we have found to be valuable references. At this point especially, it must be repeated that a search of the literature would yield a much larger number of important titles. This is particularly true of engineering papers. The list below does indicate those journals in which the finite element method is strongly represented, and together with the conference proceedings it should be a reasonable guide to the significant analytical and theoretical work on the method.
A1. AHMAD, S., B. M. IRONS, and O. C. ZIENKIEWICZ (1968), Curved thick shell and membrane elements with particular reference to axisymmetric problems, Wright-Patterson II.
A2. ALLMAN, D. J. (1971), Finite element analysis of plate buckling using a mixed variational principle, Wright-Patterson III.
A3. ANDERHEGGEN, E. (1970), A conforming triangular finite element plate bending solution, Int. J. for Num. Meth. in Eng. 2, 259-264.
A4. ARGYRIS, J. H., and S. KELSEY (1963), Modern Fuselage Analysis and the Elastic Aircraft, Butterworth, London.
A5. ARGYRIS, J. H., I. FRIED, and D. W. SCHARPF (1968), The Hermes eight element for the matrix displacement method, J. Royal Aero. Soc., 613-617.
A6. ARGYRIS, J. H., and I. FRIED (1968), The LUMINA element for the matrix displacement method, J. Royal Aero. Soc., 514-517.
A7. ARGYRIS, J. H., O. E. BRONLUND, I. GRIEGER, and M. SORENSEN (1971), A survey of the application of finite element methods to stress analysis problems with particular emphasis on their application to nuclear engineering problems, ISPRA Conference.
A8. AUBIN, J. P. (1967), Approximation des espaces de distributions et des opérateurs différentiels, Bull. Soc. Math. France, Mémoire 12.
A9. AUBIN, J. P. (1968), Evaluation des erreurs de troncature des approximations des espaces de Sobolev, J. Math. Anal. Appl. 21, 356-368.
A10. AUBIN, J. P., and H. BURCHARD (1971), Some aspects of the method of the hypercircle applied to elliptic variational problems, SYNSPADE, 1-67.
B1. BABUSKA, I. (1961), Stability of the domain of definition ... (in Russian), Czech. Math. J. 11(86), 76-105, 165-203.
B2. BABUSKA, I. (1970), Approximation by hill functions, Tech. Note 648, Univ. Maryland.
B3. BABUSKA, I. (1970), Finite element method for domains with corners, Computing 6, 264-273.
B4. BABUSKA, I. (1971), Error bounds for the finite element method, Numer. Math. 16, 322-333.
B5. BABUSKA, I. (1971), Finite element method with penalty, Rept. BN-710, Univ. Maryland.
B6. BABUSKA, I. (1973), The finite element method with Lagrangian multipliers, Numer. Math. 20, 179-192.
B7. BACKLUND, J. (1971), Mixed Finite Element Analysis of Elastic and Elastoplastic Plates in Bending, Chalmers Institute of Technology, Göteborg, Sweden.
B8. BAUER, F. L. (1963), Optimally scaled matrices, Numer. Math. 5, 73-87.
B9. BAZELEY, G. P., Y. K. CHEUNG, B. M. IRONS, and O. C. ZIENKIEWICZ (1965), Triangular elements in plate bending-conforming and nonconforming solutions, Wright-Patterson I.
B10. BERGER, A., R. SCOTT, and G. STRANG (1972), Approximate boundary conditions in the finite element method, Symposium on Numerical Analysis, Istituto Nazionale di Alta Matematica, Rome; to appear in Symposia Mathematica, Academic Press, New York.
B11. BIRKHOFF, G. (1969), Piecewise bicubic interpolation and approximation in polygons, Approximation with Special Emphasis on Spline Functions, ed. I. J. Schoenberg, Academic Press, New York, 185-221.
B12. BIRKHOFF, G. (1971), Numerical solutions of elliptic equations, SIAM Regional Conference Series, Vol. 1.
B13. BIRKHOFF, G. (1972), Angular singularities of elliptic problems, J. Approx. Th. 6, 215-230.
B14. BIRKHOFF, G., and C. DE BOOR (1964), Error bounds for spline interpolation, J. of Math. and Mech. 13, 827-836.
B15. BIRKHOFF, G., and C. DE BOOR (1965), Piecewise polynomial interpolation and approximation, Approximation of Functions, ed. H. L. Garabedian, Elsevier, Amsterdam.
B16. BIRKHOFF, G., C. DE BOOR, B. SWARTZ, and B. WENDROFF (1966), Rayleigh-Ritz approximation by piecewise cubic polynomials, SIAM J. Num. Anal. 13, 188-203.
B17. BIRKHOFF, G., and G. FIX (1967), Rayleigh-Ritz approximation by trigonometric polynomials, Indian J. of Math. 9, 269-277.
B18. BIRKHOFF, G., and G. FIX (1970), Accurate eigenvalue computations for elliptic problems, Duke University SIAM-AMS Symposium.
B19. BIRKHOFF, G., M. H. SCHULTZ, and R. VARGA (1968), Piecewise Hermite interpolation in one and two variables with applications to partial differential equations, Numer. Math. 11, 232-256.
B20. BLAIR, J. J. (1971), Bounds for the change in the solutions of second order elliptic PDEs when the boundary is perturbed, to appear.
B21. DE BOOR, C. (1968), The method of projections as applied to the numerical solution of two-point boundary value problems using cubic splines, Thesis, University of Michigan.
B22. DE BOOR, C. (1968), On local spline approximation by moments, J. of Math. and Mech. 11, 729-736.
B23. DE BOOR, C., and G. FIX (1972), Spline approximation by quasi-interpolants, to appear in J. of Approx. Theory.
B24. DE BOOR, C., and B. SWARTZ (1972), Collocation at Gaussian points, Los Alamos Rept. 72-65.
B25. BRAMBLE, J. H. (1971), Variational methods for the numerical solution of elliptic problems, Lecture notes, Chalmers Institute of Technology, Göteborg, Sweden.
B26. BRAMBLE, J. H., T. DUPONT, and V. THOMÉE (1972), Projection methods for Dirichlet's problem in approximating polygonal domains with boundary value corrections, MRC Tech. Rept. 1213, Univ. Wisconsin.
B27. BRAMBLE, J. H., and S. R. HILBERT (1970), Estimation of linear functionals on Sobolev spaces with application to Fourier transforms and spline interpolation, SIAM J. Num. Anal. 1, 113-124.
B28. BRAMBLE, J. H., and S. R. HILBERT (1971), Bounds for a class of linear functionals with applications to Hermite interpolation, Numer. Math. 16, 362-369.
B29. BRAMBLE, J. H., and J. OSBORN (1972), Rate of convergence estimates for nonselfadjoint eigenvalue approximations, MRC Tech. Rept. 1232, Univ. Wisconsin.
B30. BRAMBLE, J. H., and A. H. SCHATZ (1970), Rayleigh-Ritz-Galerkin methods for Dirichlet's problem using subspaces without boundary conditions, Comm. Pure Appl. Math. 23, 653-675.
B31. BRAMBLE, J. H., and A. H. SCHATZ (1971), On the numerical solution of elliptic boundary value problems by least square approximation of the data, SYNSPADE, 107-133.
B32. BRAMBLE, J. H., and M. ZLAMAL (1970), Triangular elements in the finite element method, Math. of Comp. 24, 809-821.
C1. CARLSON, R. E., and C. A. HALL (1971), Ritz approximations to two-dimensional boundary value problems, Numer. Math. 18, 171-181.
C2. CÉA, J. (1964), Approximation variationelle des problèmes aux limites, Ann. Inst. Fourier 14, 345-444.
C3. CHERNUKA, M. W., G. R. COWPER, G. M. LINDBERG, and M. D. OLSON (1972), Finite element analysis of plates with curved edges, Int. J. for Num. Methods in Eng. 4, 49-65.
C4. CIARLET, P. G., M. H. SCHULTZ, and R. S. VARGA (1967), Numerical methods of higher order accuracy for nonlinear boundary value problems, Numer. Math. 9, 394-430; Numer. Math. 13, 51-77.
C5. CIARLET, P. G., and C. WAGSCHAL (1971), Multipoint Taylor formulas and applications to the finite element method, Numer. Math. 17, 84-100.
C6. CIARLET, P. G., and P. A. RAVIART (1972), General Lagrange and Hermite interpolation in R^n with applications to the finite element method, Arch. Rat. Mech. Anal. 46, 177-199.
C7. CIARLET, P. G., and P. A. RAVIART (1972), Interpolation theory over curved elements, with applications to finite element methods, Computer Methods in Appl. Mech. and Eng. 1, 217-249.
C8. CLOUGH, R. W., and J. L. TOCHER (1965), Finite element stiffness matrices for analysis of plates in bending, Wright-Patterson I.
C9. CLOUGH, R. W., and C. A. FELIPPA (1968), A refined quadrilateral element for analysis of plate bending, Wright-Patterson II.
C10. CLOUGH, R. W. (1969), Comparison of three-dimensional elements, Vanderbilt Symposium.
C11. COURANT, R. (1943), Variational methods for the solution of problems of equilibrium and vibrations, Bull. Amer. Math. Soc. 49, 1-23.
C12. COWPER, G. R. (1972), CURSHL: A High-Precision Finite Element for Shells of Arbitrary Shape, National Research Council of Canada Report.
C13. COWPER, G. R. (1972), Gaussian quadrature formulas for triangles, manuscript.
C14. COWPER, G. R., E. KOSKO, G. M. LINDBERG, and M. D. OLSON (1969), Static and dynamic applications of a high-precision triangular plate bending element, AIAA J. 1, 1957-1965.
D1. DEMJANOVIC, J. K. (1964), The net method for some problems in mathematical physics, Dokl. Akad. Nauk SSSR 159, Soviet Math. Dokl. 5.
D2. DEMJANOVIC, J. K. (1966), Approximation and convergence of the net method in elliptic problems, Dokl. Akad. Nauk SSSR 170, Soviet Math. Dokl. 7.
D3. DENDY, J. (1971), Thesis, Rice University.
D4. DESCLOUX, J. (1972), On finite element matrices, SIAM J. Num. Anal. 9, 260-265.
D5. DESCLOUX, J. (1970), On the numerical integration of the heat equation, Numer. Math. 15, 371-381.
D6. DOUGLAS, J., and T. DUPONT (1970), Galerkin methods for parabolic problems, SIAM J. Numer. Anal. 4, 575-626.
D7. DOUGLAS, J., and T. DUPONT (1972), A finite element collocation method for quasilinear parabolic equations, manuscript.
D8. DOUGLAS, J., and T. DUPONT (1972), manuscript on interpolation of coefficients, unpublished.
D9. DOUGLAS, J., and T. DUPONT (1973), Galerkin methods for parabolic equations with nonlinear boundary conditions, Numer. Math. 20, 213-237.
D10. DUPONT, T. (1973), Galerkin methods for first-order hyperbolics: an example, SIAM J. Numer. Anal.; to appear.
D11. DUPUIS, G., and J. J. GOEL (1969), A curved element for thin elastic shells, Tech. Rept., Brown Univ.
D12. DUPUIS, G., and J. J. GOEL (1969), Eléments finis raffinés en élasticité bidimensionelle, ZAMP 20, 858-881.
D13. DUPUIS, G., and J. J. GOEL (1970), Finite element with high degree of regularity, Int. J. for Num. Meth. in Eng. 2, 563-577.
E1. ERGATOUDIS, I., B. IRONS, and O. C. ZIENKIEWICZ (1968), Curved isoparametric quadrilateral elements for finite element analysis, Int. J. Solids Struct. 4, 31-42.
F1. FELIPPA, C. A. (1966), Refined finite element analysis of linear and nonlinear two-dimensional structures, Rept., Univ. California at Berkeley.
F2. FELIPPA, C. A. (1969), Analysis of plate bending problems by the finite element method, SESM Rept. (Dept. Civil Eng.), Univ. California at Berkeley.
F3. FELIPPA, C. A., and R. W. CLOUGH (1970), The finite element method in solid mechanics, Duke University SIAM-AMS Symposium, 210-252.
F4. FIX, G. (1968), Orders of convergence of the Rayleigh-Ritz and Weinstein-Bazley methods, Proc. Nat. Acad. Sci. 61, 1219-1223.
F5. FIX, G. (1969), Higher-order Rayleigh-Ritz approximations, J. Math. Mech. 18, 645-658.
F6. FIX, G., and G. STRANG (1969), Fourier analysis of the finite element method in Ritz-Galerkin theory, Studies in Appl. Math. 48, 265-273.
F7. FIX, G., and S. GULATI (1971), Computational problems arising from the use of singular functions, Rept., Harvard Univ.
F8. FIX, G., and N. NASSIF (1972), On finite element approximations to time-dependent problems, Numer. Math. 19, 127-135.
F9. FIX, G., S. GULATI, and G. I. WAKOFF (1972), On the use of singular functions with the finite element method, J. Comp. Physics, to appear.
F10. FOX, L., P. HENRICI, and C. MOLER (1967), Approximation and bounds for eigenvalues of elliptic operators, SIAM J. Numer. Anal. 4, 89-102.
F11. FRAEIJS DE VEUBEKE, B. (1965), Displacement and equilibrium models in the finite element method, Chap. 9 of Stress Analysis, eds. O. C. Zienkiewicz and G. S. Holister, Wiley, New York.
F12. FRAEIJS DE VEUBEKE, B. (1968), A conforming finite element for plate bending, Int. J. Solids Structures 4, 96-108.
F13. FREDERICKSON, P. O. (1971), Generalized triangular splines, Math. Rept. 7, Lakehead Univ., Canada.
F14. FRIED, I. (1971), Condition of finite element matrices generated from nonuniform meshes, AIAA J. 10, 219-221.
F15. FRIED, I. (1971), Accuracy of finite element eigenproblems, J. of Sound and Vibration 18, 289-295.
F16. FRIED, I. (1971), Basic computational problems in the finite element analysis of shells, Int. J. Solids Struct. 7, 1705-1715.
F17. FRIED, I. (1971), Discretization and computational errors in high-order finite elements, AIAA J. 9, 2071-2073.
F18. FRIED, I. (1972), The l_2 and l_∞ condition numbers ... , Conference at Brunel University.
F19. FRIEDRICHS, K. (1928), Die Randwert- und Eigenwertprobleme aus der Theorie der elastischen Platten, Math. Ann. 98, 205-247.
F20. FRIEDRICHS, K. O., and H. B. KELLER (1966), A finite difference scheme for generalized Neumann problems, in Numerical Solutions of Partial Differential Equations, ed. J. Bramble, Academic Press, New York.
G1. GALLAGHER, R. H., and A. K. DHALLA (1971), Direct flexibility finite element elastoplastic analysis, Berlin Symposium (see also papers by Gallagher in Vanderbilt and Japan-U.S. Symposia).
G2. GEORGE, A. (1971), Computer implementation of the finite element method, Thesis, Stanford University.
G3. GEORGE, A. (1971), Block elimination of finite element systems of equations, manuscript.
G4. GOEL, J. J. (1968), Construction of basic functions for numerical utilization of Ritz's method, Numer. Math. 12, 435-447.
G5. DI GUGLIELMO, F. (1969), Construction d'approximations des espaces de Sobolev sur des réseaux en simplexes, Calcolo 6, 279-331.
H1. HANNA, M. S., and K. T. SMITH (1967), Some remarks on the Dirichlet problem in piecewise smooth domains, Comm. Pure Appl. Math. 20, 575-593.
H2. HARRICK, I. I. (1955), On the approximation of functions vanishing on the boundary of a region by functions of a special form, Mat. Sbornik, N.S. 37(79), 353-384.
H3. HERBOLD, R. J., M. H. SCHULTZ, and R. S. VARGA (1969), Quadrature schemes for the numerical solution of boundary value problems by variational techniques, Aequ. Math. 3, 96-119.
H4. HERRMANN, L. R. (1967), Finite-element bending analysis for plates, J. Eng. Mech. Div. ASCE 94, 13-25.
H5. HILTON, P. D., and J. HUTCHINSON (1971), Plastic intensity factors for cracked plates, Eng. Fract. Mech. 3, 435-451.
H6. HULME, B. L. (1968), Interpolation by Ritz approximation, J. Math. Mech. 18, 337-342.
I1. IRONS, B. M. (1966), Engineering applications of numerical integration in stiffness methods, AIAA J. 4, 2035-2037.
I2. IRONS, B. M. (1968), Roundoff criteria in direct stiffness solutions, AIAA J. 6, 1308-12.
I3. IRONS, B. M. (1969), Economical computer techniques for numerically integrated finite elements, Int. J. for Num. Meth. in Eng. 1, 201-203.
I4. IRONS, B. M. (1970), A frontal solution program for finite element analyses, Int. J. for Num. Meth. in Eng. 2, 5-32.
I5. IRONS, B. M. (1971), Quadrature rules for brick based finite elements, AIAA J. 9, 293-294.
I6. IRONS, B. M., and A. RAZZAQUE (1971), A new formulation for plate bending elements, manuscript.
I7. IRONS, B. M., E. A. DE OLIVEIRA, and O. C. ZIENKIEWICZ (1970), Comments on the paper: Theoretical foundations of the finite element method, Int. J. Solids Struct. 6, 695-697.
I8. IRWIN, G. R. (1960), Fracture mechanics, Symposium on Naval Structural Mechanics.
K1. KELLOGG, B. (1970), On the Poisson equation with intersecting interfaces, Tech. Note BN-643, Univ. Maryland.
K2. KELLOGG, B. (1971), Singularities in interface problems, SYNSPADE, 351-400.
K3. KONDRAT'EV, V. A. (1968), Boundary problems for elliptic equations with conical or angular points, Trans. Moscow Math. Soc. 17.
K4. KOUKAL, S. (1970), Piecewise polynomial interpolations and their applications to partial differential equations, Czech. Sbornik VAAZ, Brno, 29-38.
K5. KRASNOSELSKII, M. A. (1950), The convergence of the Galerkin method for nonlinear equations, Dokl. Akad. Nauk SSSR 73, 1121-1124.
K6. KRATOCHVIL, J., A. ZENISEK, and M. ZLAMAL (1971), A simple algorithm for the stiffness matrix of triangular plate bending finite elements, Int. J. for Num. Meth. in Eng. 3, 553-563.
K7. KREISS, H. O. (1971), Difference Approximations for Ordinary Differential Equations, Computer Science Department, Uppsala University.
L1. LAASONEN, P. (1967), On the discretization error of the Dirichlet problems in a plane region with corners, Ann. Acad. Scient. Fennicae 408, 3-15.
L2. LEHMAN, R. S. (1959), Developments near an analytic corner of solutions of elliptic partial differential equations, J. Math. Mech. 8, 727-760.
L3. LINDBERG, G. M., and M. D. OLSON (1970), Convergence studies of eigenvalue solutions using two finite plate bending elements, Int. J. for Num. Meth. in Eng. 2, 99-116.
M1. MARCAL, P. V. (1971), Finite element analysis with nonlinearities-theory and practice, First Japan-U.S. Seminar.
M2. MARTIN, H. C. (1971), Finite elements and the analysis of geometrically nonlinear problems, First Japan-U.S. Seminar.
M3. McCARTHY, C., and G. STRANG (1973), Optimal conditioning of matrices, SIAM J. Num. Anal., to appear.
M4. McLAY, R. W. (1968), Completeness and convergence properties of finite element displacement functions-a general treatment, AIAA Paper 67-143, 5th Aerospace Science Meeting.
M5. McLAY, R. W. (1971), On certain approximations in the finite-element method, Trans. of the ASME 58-61.
M6. MELOSH, R. J. (1966), Basis for derivation of matrices for the direct stiffness method, AIAA J. 34, 153-170.
M7. MIKHLIN, S. G. (1960), The stability of the Ritz method, Soviet Math. Dokl. 1, 1230-1233.
M8. MILLER, C. (1971), Thesis, M.I.T.
M9. MITCHELL, A. R., G. PHILLIPS, and R. WACHPRESS (1971), Forbidden shapes in the finite element method, J. Inst. Maths. Appl. 8.
M10. MORLEY, L. S. D. (1969), A modification of the Rayleigh-Ritz method for stress concentration problems in elastostatics, J. of Mech. and Phys. of Solids 17, 73-82.
N1. NITSCHE, J. (1968), Ein Kriterium für die Quasi-Optimalität des Ritzschen Verfahrens, Numer. Math. 11, 346-348.
N2. NITSCHE, J. (1968), Bemerkungen zur Approximationsgüte bei projektiven Verfahren, Math. Zeit. 106, 327-331.
N3. NITSCHE, J. (1970), Über ein Variationsprinzip zur Lösung von Dirichlet Problemen bei Verwendung von Teilräumen, die keinen Randbedingungen unterworfen sind, Abh. Math. Sem. Univ. Hamburg 36.
N4. NITSCHE, J. (1970), Lineare Spline-Funktionen und die Methoden von Ritz für elliptische Randwertprobleme, Arch. Rat. Mech. Anal. 36, 348-355.
N5. NITSCHE, J. (1971), A projection method for Dirichlet problems using subspaces with almost zero boundary conditions, manuscript.
N6. NITSCHE, J., and A. SCHATZ (1972), On local approximation of L2-projections on spline-subspaces, Applicable Analysis 2, 161-168.
O1. OGANESJAN, L. A. (1966), Convergence of variational difference schemes under improved approximation to the boundary, Soviet Math. Dokl. 7, 1146-1150.
O2. OGANESJAN, L. A., and L. A. RUKHOVETS (1969), Study of the rate of convergence of variational difference schemes for second-order elliptic equations in two-dimensional regions with smooth boundaries, Zh. Vychisl. Matem. 9, 1102-1120.
O3. OLIVEIRA, E. A. DE (1968), Theoretical foundations of the finite element method, Int. J. Solids Struct. 4, 929-952.
O4. OLSON, M. D., and G. M. LINDBERG (1971), Dynamic analysis of shallow shells with a doubly-curved triangular finite element, J. Sound Vibration 19, 299-318.
P1. PETERS, G., and J. H. WILKINSON (1970), Ax = λBx and the generalized eigenproblem, SIAM J. Num. Anal. 7, 479-492.
P2. PETERS, G., and J. H. WILKINSON (1971), Eigenvalues of Ax = λBx with
band symmetric A and B, Comput. J. 14.
P3. PHILLIPS, Z., and D. V. PHILLIPS (1971), An automatic generation scheme
for plane and curved surfaces by isoparametric coordinates, Int. J. for Num.
Meth. in Eng. 3, 519-528.
P4. PIAN, T. H. H., and P. TONG (1969), Basis of finite element methods for
solid continua, Int. J. for Num. Meth. in Eng. 1, 3-28.
P5. PIAN, T. H. H. (1970), Finite element stiffness methods by different
variational principles in elasticity, Duke University SIAM-AMS Symposium,
253-271.
P6. PIAN, T. H. H., P. TONG, and C. H. LUK (1971), Elastic crack analysis by
a finite element hybrid method, Wright-Patterson III.
P7. PIERCE, J. G., and R. S. VARGA (1972), Higher order convergence results
for the Rayleigh-Ritz method applied to eigenvalue problems I, SIAM J.
Num. Anal. 9, 137-151.
P8. PÓLYA, G. (1952), Sur une interprétation de la méthode des différences
finies, qui peut fournir des bornes supérieures ou inférieures, Comptes
Rendus 235, 995-997.
P9. PRAGER, W., and J. L. SYNGE (1947), Approximations in elasticity based
on the concept of function space, Quart. Appl. Math. 5, 241-269.
P10. PRAGER, W. (1968), Variational principles for elastic plates with relaxed
continuity requirements, Int. J. Solids Struct. 4, 837-844.
P11. PRICE, H. S., and R. S. VARGA (1970), Error bounds for semidiscrete
Galerkin approximations of parabolic problems, Duke University SIAM-AMS
Symposium, 14-94.
R1. RAI, A. K., and K. RAJAIAH (1968), Polygon-circle paradox of simply
supported thin plates under uniform pressure, AIAA J. 6, 155-156.
R2. REID, J. K. (1972), On the construction and convergence of a finite-element
solution of Laplace's equation, J. Inst. Maths. Appl. 9, 1-13.
S1. SANDER, G. (1970), Application of the dual analysis principle, IUTAM
Symposium.
S2. SCHOENBERG, I. J. (1946), Contributions to the problem of approximation
of equidistant data by analytic functions, Quart. Appl. Math. 4, 45-99,
112-141.
S3. SCHULTZ, M. H. (1969), Rayleigh-Ritz methods for multidimensional
problems, SIAM J. Num. Anal. 6, 523-538.
S4. SCHULTZ, M. H. (1971), L2 error bounds for the Rayleigh-Ritz-Galerkin
method, SIAM J. Num. Anal. 8, 737-748.
S5. STRANG, G. (1971), The finite element method and approximation theory,
SYNSPADE, 547-584.
S6. STRANG, G. (1972), Approximation in the finite element method, Numer.
Math. 19, 81-98.
S7. STRANG, G. (1972), Variational crimes in the finite element method,
Maryland Symposium.
S8. STRANG, G. (1973), Piecewise polynomials and the finite element method,
Bull. Amer. Math. Soc. 79, 1128-1137.
S9. STRANG, G., and A. E. BERGER (1971), The change in solution due to
change in domain, Proc. AMS Symposium on Partial Differential Equations,
Berkeley.
S10. STRANG, G., and G. FIX (1971), A Fourier analysis of the finite element
method, Proc. CIME Summer School, Italy, to appear.
S11. SWARTZ, B., and B. WENDROFF (1969), Generalized finite difference schemes,
Math. of Comp. 23, 37-50.
T1. TAYLOR, R. L. (1972), On completeness of shape functions for finite element
analysis, Int. J. for Num. Meth. in Eng. 4, 17-22.
T2. THOMÉE, V. (1964), Elliptic difference operators and Dirichlet's problem,
Diff. Eqns. 3, 301-324.
T3. THOMÉE, V. (1971), Polygonal domain approximation in Dirichlet's problem,
MRC Tech. Rept. 1188, Univ. Wisconsin.
T4. TONG, P. (1969), Exact solution of certain problems by finite-element
method, AIAA J. 7, 178-180.
T5. TONG, P. (1971), On the numerical problems of the finite element methods,
Waterloo Conference.
T6. TONG, P., and T. H. H. PIAN, The convergence of finite element method
in solving linear elastic problems, Int. J. Solids Struct. 3, 865-879.
T7. TONG, P., and T. H. H. PIAN (1970), Bounds to the influence coefficients
by the assumed stress method, Int. J. Solids Struct. 6, 1429-1432.
T8. TONG, P., T. H. H. PIAN, and L. L. BUCCIARELLI (1971), Mode shapes and
frequencies by the finite element method using consistent and lumped
masses, J. Comp. Struct. 1, 623-638.
T9. TREFFTZ, E. (1926), Ein Gegenstück zum Ritzschen Verfahren, Second
Congress Applied Mechanics, Zurich.
T10. TURNER, M. J., R. W. CLOUGH, H. C. MARTIN, and L. J. TOPP (1956),
Stiffness and deflection analysis of complex structures, J. Aero. Sciences 23.
V1. VAINIKKO, G. M. (1964), Asymptotic error estimates for projective methods
in the eigenvalue problem, Zh. Vychisl. Mat. 4, 404-425.
V2. VAINIKKO, G. M. (1967), On the speed of convergence of approximate
methods in the eigenvalue problem, USSR Comp. Math. and Math. Phys.
7, 18-32.
V3. VAN DER SLUIS, A. (1970), Condition, equilibration, and pivoting in linear
algebraic systems, Numer. Math. 15, 74-86.
V4. VARGA, R. S. (1965), Hermite interpolation and Ritz-type methods for
two-point boundary value problems, in Numerical Solutions of Partial
Differential Equations, ed. J. H. Bramble, Academic Press, New York.
V5. VARGA, R. S. (1970), Functional analysis and approximation theory in
numerical analysis, SIAM Regional Conference Series, Vol. 3.
V6. VISSER, W. (1969), A refined mixed-type plate bending element, AIAA J.
7, 1801-1803.
W1. WAIT, R., and A. R. MITCHELL (1971), Corner singularities in elliptic
problems by finite element methods, J. Comp. Physics 8, 45-52.
W2. WEINBERGER, H. F. (1961), Variational Methods in Boundary Value
Problems, University of Minnesota.
W3. WIDLUND, O. B. (1971), Some recent applications of asymptotic error
expansions to finite-difference schemes, Proc. Roy. Soc. Lond. A323,
167-177.
W4. WILLIAMS, M. L. (1952), Stress singularities resulting from various boundary
conditions in angular corners of plates in extension, J. Appl. Mech. 526-527.
W5. WILSON, E. L., R. L. TAYLOR, W. P. DOHERTY, and J. GHABOUSSI (1971),
Incompatible displacement models, University of Illinois Symposium.
Y1. YAMAMOTO, Y., and N. TOKUDA (1971), A note on convergence of finite
element solutions, Int. J. for Num. Meth. in Eng. 3, 485-493.
Z1. ZENISEK, A. (1970), Polynomial approximations on tetrahedrons in the
finite element method, manuscript.
Z2. ZENISEK, A. (1970), Higher degree tetrahedral finite elements, manuscript.
Z3. ZENISEK, A. (1970), Interpolation polynomials on the triangle, Numer.
Math. 15, 283-296.
Z4. ZIENKIEWICZ, O. C. (1971), Isoparametric and allied numerically integrated
elements: a review, University of Illinois Symposium.
Z5. ZIENKIEWICZ, O. C., B. M. IRONS, et al. (1969), Isoparametric and associated
element families for two and three-dimensional analysis, Holand and
Bell [8].
Z6. ZIENKIEWICZ, O. C., R. L. TAYLOR, and J. M. TOO (1971), Reduced
integration technique in general analysis of plates and shells, Int. J. for Num.
Meth. in Eng. 3, 275-290.
Z7. ZLAMAL, M. (1968), On the finite element method, Numer. Math. 12,
394-409.
Z8. ZLAMAL, M. (1970), A finite element procedure of the second order of
accuracy, Numer. Math. 14, 394-402.
Z9. ZLAMAL, M. (1972), Curved elements in the finite element method I, SIAM
J. Num. Anal., to appear.
Z10. ZLAMAL, M. (1972), The finite element method in domains with curved
boundaries, Int. J. for Num. Meth. in Eng., to appear.

SUPPLEMENTARY BIBLIOGRAPHY

S1. FINLAYSON, T. (1972), The Method of Weighted Residuals and Variational
Principles, Academic Press, New York.
S2. Conference on Numerical Analysis, Royal Irish Academy, Dublin, 1972.
S3. Applications of the finite element method in geotechnical engineering, U. S.
Army Engineers Symposium at Vicksburg, Mississippi, 1972.
S4. BATHE, K.-J. (1971), Solution methods for large generalized eigenvalue
problems in structural engineering, Thesis, Univ. of Calif. at Berkeley.
S5. BATHE, K.-J., and E. L. WILSON (1972), Stability and accuracy analysis of
direct integration methods, to appear.
S6. BIRMAN, M. S., and M. Z. SOLOMZAK (1967), Piecewise-polynomial
approximation of functions of the classes $W_p^\alpha$, Math. USSR-Sbornik 2, 295-317.
S7. CLOUGH, R. W., and K.-J. BATHE (1972), Finite element analysis of dynamic
response, Second U.S.-Japan Seminar.
S8. FRIED, I., and S. K. YANG (1972), Best finite elements distribution around
a singularity, AIAA J. 10, 1244-1246.
S9. FRIED, I., and S. K. YANG (1972), Triangular, 9 degrees of freedom, $C^0$ plate
bending element of quadratic accuracy, Q. Appl. Math., to appear.
S10. FUJII, H. (1972), Finite element schemes: stability and convergence, Second
U.S.-Japan Seminar.
S11. GIRAULT, V. (1972), A finite difference method on irregular networks, SIAM
J. Numer. Anal., to appear.
S12. GOLUB, G. H., R. UNDERWOOD, and J. H. WILKINSON (1972), The Lanczos
algorithm for the symmetric Ax = λBx problem, to appear.
S13. JEROME, J. W. (1973), Topics in multivariate approximation theory,
Symposium on Approximation at Austin, Texas.
S14. JOHNSON, C. (1972), On the convergence of a mixed finite-element method for
plate bending problems, Numer. Math., to appear.
S15. JOHNSON, C. (1972), Convergence of another mixed finite-element method for
plate bending problems, unpublished.
S16. KORNEEV, V. G. (1970), The construction of variational difference schemes of
a high order of accuracy, Vestnik Leningrad Univ. 25, 28-40.
S17. McLEOD, R., and A. R. MITCHELL (1972), The construction of basis
functions for curved elements in the finite element method, J. Inst. Maths. Applics.
10, 382-393.
S18. MOTE, C. D. (1971), Global-local finite element, Int. J. for Numer. Meth. in
Eng. 3, 565-574.
S19. PIAN, T. H. H., and P. TONG (1972), Finite element methods in continuum
mechanics, Adv. in Appl. Mech. 12, 1-58.
S20. VANDERGRAFT, J. S. (1971), Generalized Rayleigh methods with applications
to finding eigenvalues of large matrices, Lin. Alg. and Applics. 4, 353-368.
S21. WEAVER, W., JR. (1971), The eigenvalue problem for banded matrices,
Computers and Structures 1, 651-664.
INDEX OF NOTATIONS

This will be a little more than an index. We shall try to summarize in a convenient
way the ideas which are essential in understanding three of the main themes of this
book:

I. Norms, function spaces, and boundary conditions
II. Energy inner products, ellipticity, and the Ritz projection
III. Finite element spaces and the patch test.

The definitions in I and II are more or less standard; those in III are special to
the subject of finite elements.

I. NORMS, FUNCTION SPACES, AND BOUNDARY CONDITIONS

A norm is a measure of the size of a function ($\|u\|$) or of the distance between
two functions ($\|u - v\|$). It satisfies the condition $\|cu\| = |c|\,\|u\|$ and the triangle
inequality $\|u + v\| \le \|u\| + \|v\|$. Furthermore, unlike a seminorm, the norm is
zero only if u is the zero function. It follows that if all solutions u to a linear problem
are bounded in terms of their inhomogeneous terms ($\|u\| \le C\|f\|$) then solutions
are unique: $u = 0$ if $f = 0$, or (by superposition) $u_1 - u_2 = 0$ if $u_1$ and $u_2$ share
the same f.

Some familiar norms are

i) the maximum norm = "sup norm" = $L_\infty$ norm = $\sup_{x \in \Omega} |u(x)|$;

ii) the $L_2$ norm = $\mathcal{H}^0$ norm = $\left( \int_\Omega |u(x)|^2 \, dx_1 \cdots dx_n \right)^{1/2}$.

In the discrete case, for vectors $u = (u_1, u_2, \ldots)$ instead of functions u(x), the
integrals are replaced by corresponding sums. One generalization is to $L_p$ norms;
the exponents 2 and 1/2 are replaced by p and 1/p. The triangle inequality is satisfied
(and we have a norm) for $p \ge 1$. These spaces become interesting and valuable in
nonlinear problems; we have not found them essential to the linear theory.
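The discrete norms just described can be tried out directly. The following is a minimal sketch, assuming NumPy is available; the helper names `lp_norm` and `triangle_holds` and the sample vectors are hypothetical illustrations, not notation from the book. It evaluates the sum formula for several p and shows one pair of vectors for which the triangle inequality fails when p < 1.

```python
import numpy as np

def lp_norm(u, p):
    """Discrete L_p norm of a vector u; p = np.inf gives the maximum norm."""
    u = np.asarray(u, dtype=float)
    if np.isinf(p):
        return float(np.max(np.abs(u)))
    # For p < 1 the formula can still be evaluated, but it is no longer a norm.
    return float(np.sum(np.abs(u) ** p) ** (1.0 / p))

def triangle_holds(u, v, p, tol=1e-12):
    """Check ||u + v|| <= ||u|| + ||v|| for one particular pair of vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return lp_norm(u + v, p) <= lp_norm(u, p) + lp_norm(v, p) + tol

# Hypothetical sample vectors: the two unit coordinate vectors in the plane.
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])

for p in (0.5, 1, 2, np.inf):
    print(p, lp_norm(u + v, p), triangle_holds(u, v, p))
```

A single pair of vectors can only disprove the triangle inequality, not prove it; the check for p = 1, 2, and infinity is a spot check, while the failure at p = 0.5 is a genuine counterexample.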
A second generalization, absolutely basic, is to include derivatives as well as
function values of u in computing a norm (pages 5, 143). The $\mathcal{H}^s$ norm combines
the $\mathcal{H}^0$ (or $L_2$) norms of all partial derivatives $D^\alpha u$ of order
$|\alpha| = \alpha_1 + \cdots + \alpha_n \le s$:

(1) $\|u\|_s^2 = \sum_{|\alpha| \le s} \int_\Omega |D^\alpha u|^2 \, dx_1 \cdots dx_n.$

The seminorm $|u|_s$ includes only those terms of order exactly $|\alpha| = s$; it is zero if
u is a polynomial of degree s - 1. The squares are introduced in equation (1) in
order to have an inner product structure, in other words, to make $\mathcal{H}^s$ a Hilbert
space (see II below). Fractional derivatives (non-integer s) also have important
applications (pages 144-5, 260), but their definition is rather technical [1, 10, 14],
except when $\Omega$ is the whole of n-space, when we use Fourier transforms.
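The display that followed this sentence has not survived in this copy. A standard way to write the Fourier-transform definition of the norm on the whole of n-space, offered here as an assumption about the intended form rather than as the authors' exact formula, is:

```latex
\|u\|_s^2 = \int_{\mathbb{R}^n} \left( 1 + |\xi|^2 \right)^{s} \, |\hat{u}(\xi)|^2 \, d\xi ,
\qquad
\hat{u}(\xi) = \int_{\mathbb{R}^n} e^{-i x \cdot \xi} \, u(x) \, dx .
```

For integer s this agrees with (1) up to constant factors, by Parseval's identity, and the same expression makes sense for non-integer s.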
Negative norms (the index s is negative, not the norm itself!) are defined by duality
(pages 16, 73, 167):

$\|u\|_{-s} = \max_{v \in \mathcal{H}^s} \dfrac{\left| \int uv \right|}{\|v\|_s}.$

The functions in these "Sobolev spaces" (as distinct from their norms) are
usually defined by starting with a set of comparatively simple functions and then
completing the space (page 11). The result is a Banach space; it contains the limit
point of any sequence for which $\|u_N - u_M\| \to 0$ as $N, M \to \infty$. Intuitively, the
holes are filled in.

The completed space will depend on the original set of simple functions. If we
start with the set $C^0$ of all continuous functions, then in the maximum norm, this
set contains all its limit functions and there are no holes to fill in. If the original
set includes also the piecewise continuous functions, the final space $L_\infty$ is much
larger (and more difficult to describe). This point reappears in connection with
boundary conditions: if none are applied to the original set of all infinitely
differentiable functions, then its completion is the whole space $\mathcal{H}^s(\Omega)$; this is the admissible
space for the Neumann problem, with free edges. If each member of the original set
is required to vanish in a strip near the boundary $\Gamma$ (the strip may be smaller for
one function than another) then the completed space in the same norm is $\mathcal{H}_0^s$, with
derivatives of order less than s vanishing on $\Gamma$ (page 67). This is the admissible
space for the Dirichlet problem.

We mention two groups of important but technical theorems about these spaces.
One group is typified by the Sobolev inequality (pages 73, 142), and answers the
question: if the derivatives of order $s_1$ (integer or not) are in $L_{p_1}$, are those of order
$s_2$ in $L_{p_2}$? In other words, which function spaces contain which others? (Sobolev:
$\mathcal{H}^s$ contains $C^q$ if and only if $s - q > n/2$.) The other technical results are the group
of trace theorems: suppose u is in $\mathcal{H}^s$, how smooth are its boundary values,
considered as a function on $\Gamma$? Rough answer: they are in $\mathcal{H}^{s-(1/2)}(\Gamma)$. Therefore
$\mathcal{H}^{m-(1/2)}$ is a suitable "data space" for the inhomogeneous boundary condition
u = g; it matches the solution space $\mathcal{H}^m$ for u.

The central problem of partial differential equations (pages 4-6) is to match a
space of data to a space of solutions. Such a match is not automatic. For the
example $-\Delta u = f$ in the $L_\infty$ norm, it is not true that $\|u_{xx}\| + \|u_{yy}\| \le C\|f\|$,
and the pointwise theory suffers from it. The $\mathcal{H}^s$ norms are well matched for elliptic
problems of any order 2m: $\|u\|_s \le C\|f\|_{s-2m}$. For the Euler equation s = 2m:
we take f in $\mathcal{H}^0$ and look for u in $\mathcal{H}_B^{2m}$, satisfying all m boundary conditions. For
the variational problem s = m: u is found in the admissible space $\mathcal{H}_E^m$, restricted
only by the essential boundary conditions.

II. ENERGY INNER PRODUCTS, ELLIPTICITY, AND THE RITZ PROJECTION

Linear variational problems are posed in terms of quadratic functionals I(v).
The classical case is to minimize a potential energy $I(v) = a(v, v) - 2(f, v)$ (here
we separate the terms of second and first degree) over a space of admissible
solutions v. The second-degree term is the strain energy, and it is associated with an
energy inner product

(2) $a(v, w) = \tfrac{1}{4}\left( a(v + w, v + w) - a(v - w, v - w) \right).$

In terms of this inner product, the condition that u minimize I(v) is the vanishing
of the first variation, alias the equation of virtual work:

(3) $a(u, v) = (f, v)$ for all admissible v.

Integration by parts alters this weak form (or Galerkin form) of the problem into
the Euler differential equation for u, of order 2m, without the presence of v, and
with jump conditions appearing as integrated terms at any discontinuities.

NOTE: An inner product is bilinear,

$a(u + v, w + z) = a(u, w) + a(v, w) + a(u, z) + a(v, z),$

and it is only when the strain energy has a favorable form that this property holds.
If this energy depended on the point of maximum strain, $a(v, v) = \max |\operatorname{grad} v|^2$,
it would fail. Among $L_p$ norms only the case p = 2 yields, through equation (2),
such an inner product; it is therefore the only Hilbert space. The inner product
property extends to the spaces $\mathcal{H}^s$, and also (thank God) to the strain energies in
linear elasticity and other applications.
Problems which are not self-adjoint begin directly with the equation (3) of virtual
work, not with a minimization. The bilinear form a(u, v) is no longer symmetric,
$a(u, v) \ne a(v, u)$, and complex-valued functions must be admitted. Nevertheless,
if the real part of a(v, v) is elliptic (see below), then the results of the Galerkin
theory (page 119) completely parallel those of the Ritz theory. They coincide when
a(u, v) is symmetric.

The solvability of the fundamental variational equation (3) is guaranteed if the
form is elliptic: $\operatorname{Re} a(v, v) \ge \sigma \|v\|_m^2$. (So is the solvability of the corresponding
parabolic equation.) In the case of systems of equations, with a vector of unknowns
$u = (u_1(x), \ldots, u_r(x))$, as is typical in applications, there appear several varieties
of ellipticity. One possibility [1, 10, 14] asks that the eigenvalues of certain matrices
of order r have positive real parts; this is too weak to guarantee success when the
Galerkin method is applied on a subspace. Strong ellipticity is a condition, not on
the eigenvalues, but on the matrices themselves, and a still stronger condition is
the familiar one on Re a(v, v), which applies as successfully to systems as to single
equations; v becomes an admissible vector of functions. Also for boundary
conditions there is a catalogue of possibilities, clearly described by Kellogg in the
Baltimore Symposium volume [6]; for applications the central step is still to
distinguish the essential conditions, and thereby the admissible space, and to require
the definiteness of Re a(v, v) over that space.

The Ritz method is to minimize the functional I(v) over a sequence of subspaces
$S^h$. The fundamental theorem (page 39) establishes that the minimizing $u^h$ is the
projection of u onto $S^h$; in other words, $u^h$ is the closest function to u in the strain
energy norm a(v, v). Therefore if each subspace $S^h$ is contained in the next (as was
supposed in the classical Ritz method, and usually occurs for finite elements when
new elements are formed by subdividing the old ones) the convergence in strain
energy is monotonic as $h \to 0$. So is the convergence of eigenvalues. This may be
useful, but it is not critical to the Ritz theory; monotonicity of the $S^h$ is an extra
hypothesis, and monotonicity of convergence is an extra conclusion.
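The projection property and the monotone convergence can be observed on a small model. The following sketch, assuming NumPy, uses the one-dimensional problem -u'' = 1 with zero boundary values and piecewise linear elements; the mesh sizes, the function names, and the use of the finest-mesh solution as the reference u are all assumptions made for the illustration. Each coarse space is obtained by halving the element size, so the subspaces are nested.

```python
import numpy as np

def stiffness(n):
    """Linear-element stiffness matrix for -u'' = f on (0, 1), with n interior nodes."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h

def nested_basis(n_coarse, n_fine):
    """Columns hold the coarse hat functions sampled at the fine interior nodes."""
    xc = np.linspace(0.0, 1.0, n_coarse + 2)
    xf = np.linspace(0.0, 1.0, n_fine + 2)[1:-1]
    V = np.zeros((n_fine, n_coarse))
    for j in range(n_coarse):
        e = np.zeros(n_coarse + 2)
        e[j + 1] = 1.0
        V[:, j] = np.interp(xf, xc, e)
    return V

n_fine = 15
K = stiffness(n_fine)
f = np.ones(n_fine) / (n_fine + 1)   # load vector for f(x) = 1
u = np.linalg.solve(K, f)            # reference solution in the finest space

def ritz(V):
    """Minimize I(v) = a(v, v) - 2(f, v) over the span of the columns of V."""
    q = np.linalg.solve(V.T @ K @ V, V.T @ f)
    return V @ q

def energy_error(V):
    """Strain energy a(u - u_h, u - u_h) of the error."""
    e = u - ritz(V)
    return float(e @ K @ e)

# Interior node counts 1, 3, 7, 15 correspond to h = 1/2, 1/4, 1/8, 1/16.
errors = [energy_error(nested_basis(nc, n_fine)) for nc in (1, 3, 7, 15)]
print(errors)
```

Two things are worth checking in the output: the energy errors decrease from one nested space to the next, and the error of each Ritz solution is orthogonal in the energy inner product to its own subspace, which is the discrete form of the projection statement.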
III. FINITE ELEMENT SPACES AND THE PATCH TEST

The usual description of a finite element specifies the form of the shape function
(trial polynomial), and the location as well as the parameter (function value v,
or some derivative $D_j v$) assigned to each node. Section 1.9 contains a number of
examples. This is enough information to compute the element matrices, and to
assemble them into the global stiffness and mass matrices K and M.

In Section 2.1, for mathematical reasons, we took one more step in describing
the trial space $S^h$: we defined a set of basis functions $\varphi_1, \ldots, \varphi_N$ for the space.
The function $\varphi_j$ was directly associated with the node $z_j$, and with the particular
nodal parameter $D_j v$. If the shape functions are uniquely defined by the values of
the nodal parameters (which they must be!), then there is a unique trial function $\varphi_j$
for which $D_j \varphi_j(z_j) = 1$, and all its other nodal parameters $D_i \varphi_j(z_i)$ are zero. These
functions form a basis, because any trial function can be expanded in terms of
its nodal parameters as

$v^h = \sum q_j \varphi_j$, where the weight $q_j = D_j v^h(z_j)$.

This leads immediately to the definition of the interpolate $u_I$ of any given function
u. The interpolate assumes the same nodal parameters as u but inside each element
it is one of the trial polynomials: $u_I = \sum q_j \varphi_j$, where the weight $q_j = D_j u(z_j)$. The
approximation theorems of Chapter 3 establish that $u_I$ is close to u in the norms
described earlier. The error $u - u_I$ depends on the element sizes $h_i$ and on the
degree k - 1 to which the shape functions are complete (page 136).
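The expansion in nodal parameters is easy to make concrete for the simplest element. The sketch below, assuming NumPy, builds piecewise linear "hat" basis functions on a uniform mesh of the unit interval, where the only nodal parameter is the function value; the function names and the test function sin(pi x) are hypothetical choices for the illustration. The interpolate is assembled literally as the sum of q_j times phi_j.

```python
import numpy as np

def hat(j, nodes, x):
    """Piecewise linear basis function: 1 at nodes[j], 0 at every other node."""
    e = np.zeros(len(nodes))
    e[j] = 1.0
    return np.interp(x, nodes, e)

def interpolate(u, nodes, x):
    """u_I(x) = sum_j q_j phi_j(x) with weights q_j = u(z_j)."""
    q = u(nodes)
    return sum(q[j] * hat(j, nodes, x) for j in range(len(nodes)))

def max_error(n_elements):
    """Maximum of |u - u_I| on a fine sampling grid, for u(x) = sin(pi x)."""
    u = lambda t: np.sin(np.pi * t)
    nodes = np.linspace(0.0, 1.0, n_elements + 1)
    x = np.linspace(0.0, 1.0, 2001)
    return float(np.max(np.abs(u(x) - interpolate(u, nodes, x))))

for n in (4, 8, 16):
    print(n, max_error(n))
```

Linear elements are complete through degree k - 1 = 1, and halving the element size reduces the sampled maximum error by a factor close to four, in line with an error of order h^2 for this smooth function.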
The dimension of the trial space $S^h$ is N, the number of basis functions $\varphi_j$ and
free parameters $q_j$. Obviously N depends on the number of elements. A critical
quantity is the number M of parameters per vertex; this permits the size of the
assembled matrix K to be compared for two competing elements. Let d be the
number of degrees of freedom (shape function coefficients) within each element;
M is smaller, depending on the continuity constraints imposed between elements.
(Our conjecture for the trial space described on page 84, complete through degree
k - 1 and of continuity class $C^q$ on triangles, is $M = (k - 1 - q)(k - 1 - 2q)$.)

We come now to the patch test. This test has had very little publicity and until
recently it was hardly known (at least by that name) even to the experts. It was
created in the appendix to [B9], in order to explain why the Zienkiewicz triangle
was convergent in one configuration but not in another (page 175). The test
resurfaced, under its official name, in a brief comment [19] on an earlier paper. The full
story was told more recently by Irons at the Baltimore Symposium but there he is
still assailed by doubts that the test is sufficient for convergence.

We are convinced that, under reasonable hypotheses, it is. The test is described
on page 174, and is very simple to conduct. We recall the equivalent form on page
177: if the strain energy involves derivatives $D^m v$ of order m, then all the integrals
$\iint D^m \varphi_j$ should be calculated correctly, even though interelement boundary terms
are ignored in the nonconforming case, or numerical quadrature is applied. The
higher-order patch test asks that $\iint P_{n-m} D^m \varphi_j$ be correct for all polynomials of
degree n - m; this generalization was made by the first author, and produces
convergence in strain energy of order $h^{2(n-m+1)}$.
A word about the proof. On pages 179 and 186, the problem is reduced to
estimating the inner product error $a_*(u, v^h) - a(u, v^h)$. (For numerical integration
the linear terms involving f also appear.) Suppose we stay with the model problem
$-\Delta u = f$, for which the energy inner product is $a(u, v) = \iint u_x v_x + u_y v_y$. Success
in the patch test means that for any linear polynomial P,

$$a_*(u, \varphi_j) - a(u, \varphi_j) = a_*(u - P, \varphi_j) - a(u - P, \varphi_j). \tag{4}$$

Choosing P close to u over the set $E_j$ on which $\varphi_j$ is non-zero, and renormalizing
the basis by $a(\varphi_j, \varphi_j) = 1$, (4) is less than $ch\,\|u\|_{2,E_j}$. Therefore if we expand any
$v^h$ as $\sum q_j \varphi_j$,

$$\begin{aligned}
|a_*(u, v^h) - a(u, v^h)| &\le ch \sum \|u\|_{2,E_j}\,|q_j| \\
&\le ch \Bigl(\sum \|u\|_{2,E_j}^2\Bigr)^{1/2} \Bigl(\sum q_j^2\Bigr)^{1/2} \\
&\le c'h\,\|u\|_2 \Bigl(\sum q_j^2\Bigr)^{1/2}.
\end{aligned} \tag{5}$$

If it were true that

$$\Bigl(\sum q_j^2\Bigr)^{1/2} \le C\,\bigl[a_*(v^h, v^h)\bigr]^{1/2}, \tag{6}$$

then the convergence proof would be complete. The expression $\Delta$ is less than
$c'Ch\,\|u\|_2$, and according to the estimate at the top of page 179, the strains are in
error by $O(h)$.
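The two intermediate steps in (5) can be spelled out as follows. This is only a sketch: the overlap constant $\kappa$ (the largest number of sets $E_j$ that contain any one point) is a symbol introduced here for illustration and does not appear in the text.

```latex
\begin{align*}
\sum_j \|u\|_{2,E_j}\,|q_j|
  &\le \Bigl(\sum_j \|u\|_{2,E_j}^2\Bigr)^{1/2}
       \Bigl(\sum_j q_j^2\Bigr)^{1/2}
  && \text{(Cauchy--Schwarz)} \\
\sum_j \|u\|_{2,E_j}^2
  &\le \kappa\,\|u\|_2^2
  && \text{(each point lies in at most $\kappa$ of the sets $E_j$)}
\end{align*}
```

Under this assumption the constant in the last line of (5) can be taken as $c' = c\sqrt{\kappa}$, which stays bounded as long as the supports of the basis functions overlap only a fixed number of times.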
The result is correct, but regrettably the inequality (6) is false. It amounts to
asking that the condition number of the stiffness matrix K be bounded, or that the
$\varphi_j$ be uniformly independent in the energy norm. This is true of the mass matrix
(page 212), but not of K. There is, however, a way out. Any conforming $\varphi_j$ can be
ignored in our calculations; the difference in (4) is identically zero. Therefore if the
trial space can be regarded as a conforming space to which some uniformly independent
nonconforming elements are added, the proof succeeds. This was obvious
in Wilson's case, since he began with the standard bilinear elements, and superimposed
two nonconforming quadratics within each square. (Such internal degrees
of freedom are called nodeless variables.) Because the squares never overlap, uniform
independence was automatic.
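To make the patch-test condition concrete for this case, here is a minimal sketch that integrates the first derivatives of two nodeless quadratic modes over one square element. It assumes the reference square [-1, 1] x [-1, 1] with the modes 1 - xi^2 and 1 - eta^2 (the form usually quoted for Wilson's element, not stated in the text) and a 2 x 2 Gauss rule; all names are hypothetical.

```python
import math

# Two-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 3.
GAUSS_POINTS = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
GAUSS_WEIGHTS = (1.0, 1.0)


def integrate_over_square(f):
    """Tensor-product Gauss quadrature of f(xi, eta) over the reference square."""
    total = 0.0
    for xi, w_xi in zip(GAUSS_POINTS, GAUSS_WEIGHTS):
        for eta, w_eta in zip(GAUSS_POINTS, GAUSS_WEIGHTS):
            total += w_xi * w_eta * f(xi, eta)
    return total


def grad_mode_xi(xi, eta):
    """Gradient of the assumed nodeless mode 1 - xi^2."""
    return (-2.0 * xi, 0.0)


def grad_mode_eta(xi, eta):
    """Gradient of the assumed nodeless mode 1 - eta^2."""
    return (0.0, -2.0 * eta)


def patch_integrals(grad):
    """Element integrals of d(phi)/d(xi) and d(phi)/d(eta), boundary terms ignored."""
    ix = integrate_over_square(lambda xi, eta: grad(xi, eta)[0])
    iy = integrate_over_square(lambda xi, eta: grad(xi, eta)[1])
    return ix, iy


print(patch_integrals(grad_mode_xi))
print(patch_integrals(grad_mode_eta))
```

Both pairs of integrals vanish, which is the value required of a function supported on a single square; the sketch says nothing about distorted quadrilaterals.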
Crouzeix and Raviart have recently given a beautiful treatment of nonconforming
elements for divergence-free fluids. Their technique applies to all elements which
pass the test in the following way: over each interelement edge or face, the integral
of the nonconforming jump is zero. Their technique for deducing convergence has
been formalized, and applied to plate elements, by both Ciarlet and Lascaux.
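The zero-mean jump condition can be illustrated with the simplest nonconforming element, a piecewise linear function whose unknowns are its values at edge midpoints. The sketch below assumes two triangles ABC and ABD sharing the edge AB; the helper names and sample values are hypothetical. Each trace along AB is linear and the two traces agree at the midpoint, so the jump is a linear function that vanishes there and integrates to zero.

```python
def vertex_values_from_midpoints(m_ab, m_bc, m_ca):
    """Vertex values of the linear function on triangle ABC with given edge-midpoint values.

    A linear function takes the average of its vertex values at each edge midpoint,
    so the three midpoint equations can be solved directly for the vertex values.
    """
    v_a = m_ab + m_ca - m_bc
    v_b = m_ab + m_bc - m_ca
    v_c = m_bc + m_ca - m_ab
    return v_a, v_b, v_c


def edge_jump_integral(shared, m_bc, m_ca, m_bd, m_da, edge_length=1.0):
    """Jump across the shared edge AB at A and at B, and its integral along AB.

    `shared` is the common value at the midpoint of AB; the other arguments are
    the midpoint values on the remaining edges of triangles ABC and ABD.
    """
    va1, vb1, _ = vertex_values_from_midpoints(shared, m_bc, m_ca)
    va2, vb2, _ = vertex_values_from_midpoints(shared, m_bd, m_da)
    jump_a = va1 - va2
    jump_b = vb1 - vb2
    # The jump is linear along AB, so the trapezoidal rule integrates it exactly.
    integral = 0.5 * (jump_a + jump_b) * edge_length
    return jump_a, jump_b, integral


print(edge_jump_integral(1.0, 0.25, -0.75, 2.0, 0.5))
```

The function is discontinuous at both ends of the edge, yet the integral of the jump is zero, which is the property the paragraph singles out.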
The increasing application of finite elements - to the Navier-Stokes equations,
control problems, earthquake prediction, nonlinear elasticity and plasticity in soils
as well as metals, and the design of tankers and reactors - promises a happy future
for both the analyst and the engineer.

INDEX

A
Abstract method, 104, 270
Admissible space, 12, 66
Anderheggen, 82, 132
Arch, 127
Area coordinates, 94, 96
Argyris, 2
Assembly, 29
Aubin, 155, 166

B
Babuska, 124, 133, 151, 193
Bandwidth, 105, 118, 134
Barlow, 168
Bathe, 238, 240
Bauer, 237
Beam, 63, 121
Bell, 96
Berger, 198
Birkhoff, 261
Block power method, 237, 240
Boundary layer, 108, 196
Bramble, 133, 134, 146, 147, 169, 231

C
Cholesky, 36, 236
Ciarlet, 111, 161, 169, 192
Clough, 2, 84, 238, 240
Collocation, 117, 118
Complementary energy, 129
Completion, 13
Condition number, 35, 121, 206, 208, 244
Conforming, 74, 137
Conservative, 251, 253
Consistency, 18, 23, 175, 241
Courant, 74, 76, 132, 221, 256
Cowper, 127, 183
Crack, 115, 144, 260, 269

D
Degree, 136
de Veubeke, 131
Dirichlet, 4, 64, 67
Douglas, 107, 168, 187, 245, 249
Dupont, 107, 168, 187, 245, 249, 255

E
Elastic-plastic, 111, 146
Elements:
    bicubic, 88, 102, 165, 270
    bilinear, 86, 157, 211
    brick, 87
    cubic, 55, 79, 226, 255
    linear, 27, 224
    linear on triangles, 75, 102, 152
    quadratic, 53, 78, 103, 194
    quintic, 82
    serendipity, 87, 160
    spline, 60, 104, 270
    trilinear, 87
Element stiffness matrix, 28, 58, 81, 213
Elliptic, 6~.
Energy inner product, 39
Error:
    in displacement, 107, 166
    in strains, 106, 165
    linear elements, 45
    one-dimensional, 62
Essential condition, 4, 8, 12, 117, 193

F
Force method, 129
Forced vibration, 167
Fried, 126, 163, 207, 213
Fujii, 253
Fundamental theorem, 39, 106

G
Galerkin, 9, 115, 117, 119, 123, 125, 219, 231
Gauss elimination, 34, 238, 265
Gauss quadrature, 99, 183, 188, 190
George, 38, 15
Gerschgorin, 21, 210, 211
Graded mesh, 115, 155

H
Hermite, 56, 88
Hilbert, 146, 147
Hybrid, 131, 132
Hyperbolic, 251

I
Inhomogeneous condition, 51, 70, 193, 199
Interface, 260
Interpolate, 43, 97
Interpolating basis, 102, 136
Inverse hypothesis, 167
Inverse iteration, 237
Irons, 99, 174, 183, 186, 209
Isoparametric, 109, 157, 199

J
Johnson, 125
Jump condition, 14, 116, 261, 275

K
Kellogg, 261, 263
Kondrat'ev, 266, 267
Korn, 73

L
Lagrange multiplier, 132, 133
Least squares, 133
Lesaint, 255
Lindberg, 127
Load vector, 30
L-shaped membrane, 278
Lumped matrix, 118, 223, 226, 228, 244, 256

M
Marcal, 111
Martin, 2
Mass matrix, 29, 51, 96
McCarthy, 125
Minmax principle, 219, 221, 229
Mitchell, 108, 159, 197
Mixed method, 120, 125
Mode superposition, 244, 245
Monotone, 110
Mote, 131, 135

N
Natural condition, 117
Negative norms, 16, 73, 167
Neumann, 4, 68
Nickell, 244
Nitsche, 169, 204, 267
Nitsche trick, 49, 107, 166, 268
Nodal method, 101, 270
Nonconforming, 178
Non-linear, 110, 111, 252
Norms, 5, 65
Numerical integration, 32, 98, 109, 181

O
Oden, 111
One-sided approximations, 146
Osborn, 231

P
Parlett, 240
Patch test, 164, 172, 174, 182
Penalty, 132
Peters, 116, 238
Plate, 71
Poincaré, 42, 221
Point load, 14, 73

Q
Quasi-interpolate, 142

R
Raviart, 161, 169, 192, 256
Rayleigh quotient, 216
Reactor, 275
Ritz method, 24
Roundoff, 110, 121, 215

S
Saint-Venant, 197, 269
Schatz, 133, 134, 267
Schultz, 111, 166, 167, 206
Scott, 179, 203
Shell, 127, 128
Singular functions, 263, 270
Singularity, 13
Sobolev, 73, 144, 145, 183
Spline, 60, 89, 104, 141, 155
Stability, 23, 20~, 244, 247, 256
Static condensation, 81
Stationary point, 116
Stiff, 98, 175, 244
Stiffness matrix, 29
Strain energy, 39
Stress intensity, 269
Stress points, 151, 168
Subspace iteration, 237, 240
Superconvergence, 168
Superfunction, 141, 155, 170
Swartz, 249

T
Taylor, 189
Temam, 178
Thomée, 107
Toeplitz, 209, 212, 224
Tong, 253
Torsion, 269
Triangular coordinates, 94, 96
Truncation error, 18, 170
Turner, 76

U
Uniform basis, 106, 137

V
Vainikko, 231
Varga, 111, 249
Virtual work, 41, 173

W
Weak form, 9, 117, 219
Wendroff, 249
Wilkinson, 116, 238
Wilson, 176, 179

Z
Zenisek, 84
Zienkiewicz, 183
Zlamal, 164
