Ferrante Neri

Linear Algebra for Computational Sciences and Engineering

Foreword by Alberto Grasso


Ferrante Neri
Centre for Computational Intelligence
De Montfort University
Leicester
UK

and

University of Jyväskylä
Jyväskylä
Finland

ISBN 978-3-319-40339-7 ISBN 978-3-319-40341-0 (eBook)
DOI 10.1007/978-3-319-40341-0

Library of Congress Control Number: 2016941610

© Springer International Publishing Switzerland 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

We can only see a short distance ahead, but
we can see plenty there that needs to be done

Alan Turing

Foreword

Linear Algebra in Physics

The history of linear algebra can be viewed within the context of two important
traditions.
The first tradition (within the history of mathematics) consists of the progressive
broadening of the concept of number so as to include not only positive integers, but
also negative numbers, fractions, and algebraic and transcendental irrationals.
Moreover, the symbols in the equations became matrices, polynomials, sets, and
permutations. Complex numbers and vector analysis belong to this tradition. Within
the development of mathematics, the concern was not so much about solving
specific equations, but mostly about addressing general and fundamental questions.
The latter were approached by extending the operations and the properties of sum
and multiplication from integers to other linear algebraic structures. Different
algebraic structures (Lattices and Boolean algebra) generalized other kinds of
operations, thus making it possible to optimize some non-linear mathematical problems. As a
first example, Lattices were generalizations of order relations on algebraic spaces,
such as set inclusion in set theory and inequality in the familiar number systems (N,
Z, Q, and R). As a second example, Boolean algebra generalized the operations of
intersection and union and the Principle of Duality (De Morgan’s Relations),
already valid in set theory, to formalize the logic and the propositions’ calculus.
This approach to logic as an algebraic structure was much like Descartes' algebraic
approach to geometry. Set theory and logic have been further advanced
in the past centuries. In particular, Hilbert attempted to build up mathematics by
using symbolic logic in a way that could prove its consistency. On the other hand,
Gödel proved that in any mathematical system there will always be statements that
can never be proven either true or false.
The second tradition (within the history of physical science) consists of the
search for mathematical entities and operations that represent aspects of the
physical reality. This tradition played a role in the birth of Greek geometry and in its
subsequent application to physical problems. When observing the space around us,



we always suppose the existence of a reference frame, identified with an ideal “rigid
body”, in the part of the universe in which the system we want to study evolves
(e.g. a system of three axes having the Sun as its origin and directed towards three fixed
stars). This is modelled by the so-called "Euclidean affine space". The choice of a reference
frame is purely descriptive, at a kinematic level. Two reference
frames have to be considered intuitively distinct if the corresponding "rigid bodies"
are in relative motion. Therefore, it is important to fix the links (Linear
Transformations) between the kinematic entities associated with the same motion but
related to two different reference frames (Galileo’s Relativity).
In the XVII and XVIII centuries, some physical entities needed a new repre-
sentation. This necessity made the above-mentioned two traditions converge by
adding quantities such as velocity, force, momentum and acceleration (vectors) to the
traditional quantities such as mass and time (scalars). Important ideas led to the major
vector systems: Galileo's parallelogram of forces, the geometry of situation and the
calculus concepts by Leibniz and by Newton, and the geometrical representation of
complex numbers. Kinematics studies the motion of bodies in space and in
time independently of the causes which provoke it. In classical physics, the role of
time is reduced to that of a parametric independent variable. One also needs to choose
a model for the body (or bodies) whose motion one wants to study. The funda-
mental and simplest model is that of a point (useful only if the body's extension is
smaller than the extension of its motion and of the other important physical
quantities considered in a particular problem). The motion of a point is represented
by a curve in the tridimensional Euclidean affine space. A second fundamental
model is the “rigid body” one, adopted for those extended bodies whose component
particles do not change mutual distances during the motion.
Later developments in Electricity, Magnetism, and Optics further promoted the
use of vectors in mathematical physics. The XIX century marked the development
of vector space methods, whose prototypes were the three-dimensional geometric
extensive algebra by Grassmann and the algebra of quaternions by Hamilton, used to
represent, respectively, the orientation and the rotation of a body in three dimensions. Thus,
it was already clear how a simple algebra should meet the needs of the physicists in
order to efficiently describe objects in space and in time (in particular, their
Dynamical Symmetries and the corresponding Conservation Laws) and the prop-
erties of space-time itself. Furthermore, the principal characteristic of a simple
algebra had to be its Linearity (or at most its multi-Linearity). During the latter part
of the XIX century, Gibbs based his three-dimensional vector algebra on some ideas
by Grassmann and by Hamilton, while Clifford united these systems into a single
geometric algebra (direct product of quaternions' algebras). Later, Einstein's
description of the four-dimensional space-time continuum (Special and General
Relativity Theories) required a Tensor Algebra. In the 1930s, Pauli and Dirac intro-
duced matrix representations of Clifford algebras for physical reasons: Pauli to
describe the electron spin, and Dirac to accommodate both the electron spin
and special relativity.
Each of these algebraic systems is widely used in Contemporary Physics and is a fun-
damental part of representing, interpreting, and understanding nature. Linearity


in physics is principally supported by three ideas: Superposition Principle,
Decoupling Principle, and Symmetry Principle.
Superposition Principle. Let us suppose we have a linear problem where each
$O_k$ is the fundamental output (linear response) to each basic input $I_k$. Then, both an
arbitrary input and its own response can be written as a linear combination of the
basic ones, i.e. $I = c_1 I_1 + \dots + c_k I_k$ and $O = c_1 O_1 + \dots + c_k O_k$.
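As an illustration of the principle, the following minimal Python/NumPy sketch (not from the book; the operator M and the coefficients are arbitrary choices) treats "solving $Mx = b$" as the input-output map and verifies that the response to a combined input equals the same combination of the basic responses.

```python
import numpy as np

# A fixed linear operator M; "input -> output" is solving M x = b.
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))

# Basic inputs (the canonical basis) and their fundamental responses.
I1, I2, I3 = np.eye(3)
O1, O2, O3 = (np.linalg.solve(M, b) for b in (I1, I2, I3))

# An arbitrary input written as a linear combination of the basic ones...
c = np.array([2.0, -1.0, 0.5])
I = c[0] * I1 + c[1] * I2 + c[2] * I3

# ...has a response that is the same linear combination of the basic responses.
O = np.linalg.solve(M, I)
print(np.allclose(O, c[0] * O1 + c[1] * O2 + c[2] * O3))  # True
```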
Decoupling Principle. If a system of coupled differential equations (or differ-
ence equations) involves a diagonalizable square matrix A, then it is useful to
consider new variables $x'_k = U x_k$ with $k \in \mathbb{N}$, $1 \le k \le n$, where U is a unitary
matrix and the $x'_k$ form an orthogonal set (basis) of eigenvectors of A. Rewriting the equa-
tions in terms of the $x'_k$, one discovers that the evolution of each eigenvector is
independent of the others and that the form of each equation depends only on the
corresponding eigenvalue of A. By solving the equations so as to obtain each $x'_k$ as a
function of time, it is also possible to obtain $x_k$ as a function of time ($x_k = U^{-1} x'_k$).
When A is not diagonalizable (not normal), the resulting equations for x are not
completely decoupled (Jordan canonical form), but are still relatively easy (of
course, if one does not take into account some deep problems related to the possible
presence of resonances).
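A minimal numerical sketch of the decoupling idea (not from the book; the symmetric matrix A is chosen only for illustration): the orthogonal eigendecomposition of A turns the coupled system $\dot{x} = Ax$ into independent scalar equations.

```python
import numpy as np

# A small symmetric (hence diagonalizable) coefficient matrix for dx/dt = A x.
A = np.array([[-3.0, 1.0],
              [ 1.0, -3.0]])

# Orthogonal diagonalization: columns of U are eigenvectors, lam the eigenvalues.
lam, U = np.linalg.eigh(A)

# Initial condition in the original coordinates and in the eigenvector basis.
x0 = np.array([1.0, 0.0])
y0 = U.T @ x0              # y = U^T x decouples the system

# Each decoupled component evolves independently: y_k(t) = y_k(0) * exp(lam_k * t).
t = 0.5
y_t = y0 * np.exp(lam * t)

# Map back to the original variables: x(t) = U y(t).
x_t = U @ y_t
print(x_t)

# Cross-check against x(t) = exp(At) x0 computed via the same eigendecomposition.
x_check = U @ np.diag(np.exp(lam * t)) @ U.T @ x0
print(np.allclose(x_t, x_check))  # True
```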
Symmetry Principle. If A is a diagonal matrix representing a linear transfor-
mation of a physical system's state and $x'_k$ its set of eigenvectors, each unitary
transformation satisfying the matrix equation $UAU^{-1} = A$ (or $UA = AU$) is called
a "Symmetry Transformation" for the considered physical system. Its deep meaning
is that it may change each individual eigenvector without changing the whole set of
eigenvectors and their corresponding eigenvalues.
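The commutation condition can be checked mechanically; in this illustrative sketch (not from the book) a permutation of two coordinates that share the same eigenvalue commutes with a diagonal A and therefore acts as a symmetry transformation.

```python
import numpy as np

# A diagonal matrix A and a unitary U that exchanges the two coordinates sharing
# the eigenvalue 2; U commutes with A, so it is a symmetry transformation.
A = np.diag([2.0, 2.0, 5.0])
U = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])

print(np.allclose(U @ U.T, np.eye(3)))   # U is unitary (here, orthogonal)
print(np.allclose(U @ A, A @ U))         # U A = A U: the symmetry condition

# U maps eigenvectors into eigenvectors with the same eigenvalue: the degenerate
# eigenspace for eigenvalue 2 is preserved as a whole, even though the individual
# basis vectors are exchanged.
```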
Thus, special importance in computational physics is assumed by the standard
methods for solving systems of linear equations: the procedures suited for sym-
metric real matrices and the iterative methods that converge fast when applied to
matrices having their non-zero elements concentrated near the main diagonal
(diagonally dominant matrices).
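As a sketch of one such iterative scheme (the Jacobi method, treated later in Chap. 3), the following Python fragment solves a small, strictly diagonally dominant system; the matrix and right-hand side are invented for illustration.

```python
import numpy as np

def jacobi(A, b, iterations=50):
    """Plain Jacobi iteration for A x = b; converges when A is strictly
    diagonally dominant."""
    D = np.diag(A)                 # diagonal entries
    R = A - np.diag(D)             # off-diagonal part
    x = np.zeros_like(b, dtype=float)
    for _ in range(iterations):
        x = (b - R @ x) / D        # x_{k+1} = D^{-1} (b - R x_k)
    return x

# A strictly diagonally dominant system.
A = np.array([[10.0, 2.0, 1.0],
              [ 1.0, 8.0, 2.0],
              [ 2.0, 1.0, 9.0]])
b = np.array([13.0, 11.0, 12.0])

print(jacobi(A, b))
print(np.linalg.solve(A, b))       # direct solution for comparison
```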
Physics has a very strong tradition of focusing on some essential
aspects while neglecting other important issues. For example, Galileo founded
Mechanics neglecting friction, despite its important effect. The
statement of Galileo's Inertia Law (Newton's First Law, i.e. "An object not affected
by forces moves with constant velocity") is a pure abstraction and is only approxi-
mately valid. While modelling, a popular simplification has been for centuries the
search for a linear equation approximating nature. Both Ordinary and Partial
Linear Differential Equations appear throughout classical and quantum physics and,
even where the equations are non-linear, Linear Approximations are extremely
powerful. For example, thanks to Newton's Second Law, much of classical physics
is expressed in terms of systems of second-order ordinary differential equations. If the
force is a linear function of position, the resulting equations are linear ($m \frac{d^2 x}{dt^2} = Ax$,
where A is a matrix not depending on x). Every solution may be written as a linear
combination of the special solutions (the normal modes of oscillation) coming from the
eigenvectors of the matrix A. For nonlinear problems near equilibrium, the force


can always be expanded as a Taylor series and the leading (linear) term is
dominant for small oscillations. A detailed treatment of coupled small oscillations is
possible by obtaining a diagonal matrix of the coefficients in the N coupled differential
equations, i.e. by finding the eigenvalues and the eigenvectors of the Lagrange
equations for coupled oscillators. In classical mechanics, another example of lin-
earisation consists of looking for the principal moments and principal axes of a
solid body by solving the eigenvalue problem of a real symmetric matrix
(the Inertia Tensor). In the theory of continua (e.g. hydrodynamics, diffusion and
thermal conduction, acoustics, electromagnetism), it is (sometimes) possible to
convert a partial differential equation into a system of linear equations by
employing the finite difference formalism. This ends up with a diagonally
dominant coefficient matrix. In particular, Maxwell's equations of electromag-
netism have an infinite number of degrees of freedom (i.e. the value of the field at
each point) but the Superposition Principle and the Decoupling Principle still apply.
The response to an arbitrary input is obtained as the convolution of a continuous
basis of Dirac δ functions and the relevant Green’s function.
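A small worked example of the normal-mode computation sketched above, assuming two equal masses coupled by three identical springs (a standard textbook system, not taken from this book): the eigenpairs of the coefficient matrix give the mode shapes and frequencies.

```python
import numpy as np

# Two equal masses coupled by three springs (wall-m-m-wall), with m = k = 1:
#   x'' = A x,   A = [[-2, 1], [1, -2]].
A = np.array([[-2.0, 1.0],
              [ 1.0, -2.0]])

# Normal modes are the eigenvectors of A; eigenvalues give the squared
# angular frequencies via lambda = -omega^2.
lam, V = np.linalg.eigh(A)
omega = np.sqrt(-lam)

print(omega)   # approximately [1.732, 1.0]: out-of-phase and in-phase modes
print(V)       # columns proportional to [1, -1] and [1, 1]

# Any motion is a linear combination of the modes:
#   x(t) = sum_k c_k * V[:, k] * cos(omega_k * t + phi_k).
```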
Even without the differential geometry’s more advanced applications, the basic
concepts of multilinear mapping and tensor are used not only in classical physics
(e.g. inertia and electromagnetic field tensors), but also in engineering (e.g. dyadic).
In particle physics, it was important to analyse the problem of Neutrino
Oscillations, formally related both to the Decoupling and the Superposition
Principles. In this case, the Three Neutrino Mass Matrix is neither diagonal nor
normal in the so-called Gauge States' basis. However, through a bi-unitary trans-
formation (one unitary transformation for each "parity" of the gauge states), it is
possible to obtain the eigenvalues and their eigenvectors (Mass States), which
allow it to be rendered diagonal. After this transformation, it is possible to obtain the
Gauge States as a superposition (linear combination) of Mass States.
Schrödinger's Linear Equation governs non-relativistic quantum mechanics,
and many problems are reduced to obtaining a diagonal Hamiltonian operator. Besides,
when studying the addition of quantum angular momenta one considers
Clebsch-Gordan coefficients, related to a unitary matrix that changes a basis in a
finite-dimensional space.
In experimental physics and statistical mechanics (Stochastic methods’ frame-
work) researchers encounter symmetric, real positive definite and thus diagonaliz-
able matrices (so-called covariance or dispersion matrices). The element of a
covariance matrix in position i, j is the covariance between the ith and jth
elements of a random vector (i.e. a vector of random variables, each with finite
variance). Intuitively, the notion of variance is thus generalized to multiple dimensions.
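A brief NumPy illustration of these properties, with an arbitrarily chosen covariance used only to generate synthetic samples: the empirical covariance matrix is symmetric and positive semi-definite, hence diagonalizable with real non-negative eigenvalues.

```python
import numpy as np

# Samples of a 3-dimensional random vector (rows = observations).
rng = np.random.default_rng(42)
samples = rng.multivariate_normal(
    mean=[0.0, 0.0, 0.0],
    cov=[[2.0, 0.5, 0.0],
         [0.5, 1.0, 0.3],
         [0.0, 0.3, 1.5]],
    size=10_000)

# Empirical covariance matrix: symmetric, positive semi-definite, diagonalizable.
C = np.cov(samples, rowvar=False)
eigenvalues = np.linalg.eigvalsh(C)

print(np.allclose(C, C.T))       # symmetric
print(np.all(eigenvalues >= 0))  # non-negative spectrum
print(C.round(2))                # close to the prescribed covariance
```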
The notion of geometrical symmetry played an essential part in constructing
simplified theories governing the motion of galaxies and the microstructure of
matter (the motion of quarks confined inside the hadrons and the motion of leptons). It was not
until Einstein's era that the discovery of the space-time symmetries of the
fundamental laws and the meaning of their relations to the conservation laws were
fully appreciated, for example Lorentz Transformations, Noether’s Theorem and
Weyl's Covariance.

An object with a definite shape, size, location and orientation constitutes a state whose symmetry properties are to be studied. The higher its "degree of symmetry" (and the more the number of conditions defining the state is reduced), the greater is the number of transformations that leave the state of the object unchanged.

In physics, it is important to understand why a symmetry of a system is observed to be broken. Frequently, the underlying laws of a system maintain their form (Lagrange's Equations are invariant) under a symmetry transformation, but the system as a whole changes under such transformation by distinguishing between two or more fundamental states. In this sense, two different kinds of symmetry breakdown are considered. If two states of an object are different (e.g. by an angle or a simple phase rotation) but they have the same energy, one refers to "Spontaneous Symmetry Breaking". This kind of symmetry breaking (for example) characterizes the ferromagnetic and the superconductive phases, where the Lagrange function (or the Hamiltonian function, representing the energy of the system) is invariant under rotations (in the ferromagnetic phase) and under a complex scalar transformation (in the superconductive phase). On the contrary, if the Lagrange function is not invariant under particular transformations, the so-called "Explicit Symmetry Breaking" occurs. For example, this happens when an external magnetic field is applied to a paramagnet (Zeeman's Effect).

In Classical Dynamics, the invariance of the equations of motion of a particle under the Galilean transformation is fundamental in Galileo's relativity. The search for a linear transformation leaving "formally invariant" the Maxwell's equations of Electromagnetism led to the discovery of a group of rotations in space-time (Lorentz transformation).

While developing some ideas by Lagrange, by Ruffini and by Abel (among others), Galois introduced important concepts in group theory. He did this by showing that the functional relations among the roots of an equation have symmetries under the permutations of roots. This study showed that an equation of order n ≥ 5 cannot, in general, be solved by algebraic methods. Fifty years after Galois, Cayley showed that every finite group is isomorphic to a certain permutation group (e.g. the crystals' geometrical symmetries are described in finite groups' terms). In the 1850s, Lie unified many disconnected methods of solving differential equations (evolved over about two centuries) by introducing the concept of continuous transformation of a group in the theory of differential equations. In particular, by developing the determinants through the permutations' theory and the related Levi-Civita symbolism, one gains an important and easy calculation tool for modern differential geometry. This is the case in general relativity, quantum gravity, and string theory.

In the 1920s, Weyl and Wigner recognized that certain group theory's methods could be used as a powerful analytical tool in Quantum Physics, with applications in engineering as well as in modern physics. Their ideas have been used in many contemporary physics' branches which range from the Theory of Solids to Nuclear Physics and Particle Physics. Finally, the essential role played by Lie's groups, e.g. the rotation isomorphic groups SO(3) and SU(2), was first emphasized by Wigner.

We can therefore observe that the concepts of linearity and symmetry have aided the solution of many physical problems. In this sense, linear algebra and its understanding is one of the basic foundations for the study of Physics. A physicist needs algebra either to model a phenomenon (e.g. classical mechanics), or to model a portion of a phenomenon (e.g. ferromagnetic phenomena), or to use it as a basic tool to develop complex modern theories (e.g. quantum field theory).

Unfortunately, not the entire physics can be straightforwardly modelled by linear algebra. For example, Statistical Mechanics, which was introduced between the end of the XIX century and the beginning of the XX century (the work by Boltzmann and Gibbs), deals with the problem of studying the behaviour of systems composed of many particles without determining each particle's trajectory, but by probabilistic methods. Moreover, the knowledge of the laws among the elementary constituents of a system does not implicate an understanding of the global behaviour. For example, the fact that ice is lighter than water is not easy at all to deduce from the forces acting between the water's molecules. Nonetheless, clear conclusions can be easily reached if a large number of atoms is observed (more precisely when the number of atoms tends to infinity). Perhaps the most interesting result of statistical mechanics consists of the emergence of collective behaviours: while one cannot say whether the water is in the solid or liquid state and which is the transition temperature by observing a small number of atoms, phase transitions are created as a result of the collective behaviour of many components. The latter is an example of a physical phenomenon which requires, as mentioned above, a mathematical instrument different from linear algebra.

This book provides the readers the basics of modern linear algebra. The book is organized with the aim of communicating to a wide and diverse range of backgrounds and aims. Since a prior rigorous knowledge about the subject is not assumed, the reader may easily understand how linear algebra aids in numerical calculations and problems in different and diverse topics of mathematics, physics, engineering, and computer science. The plethora of examples makes the topics, even the most complex, easily accessible to the most practical minds. The book can be of great use to students of mathematics, physics, computer science, and engineering as well as to researchers in applied sciences who want to enhance their theoretical knowledge in algebra. I found this book a pleasant guide throughout linear algebra and an essential vademecum for the modern researcher who needs to understand the theory but also has to translate theoretical concepts into computational implementations. My suggestion is to read this book, consult it when needed, and enjoy it.

Catania, Italy
April 2016

Alberto Grasso

Preface

Theory and practice are often seen as entities in opposition characterizing two different aspects of the world knowledge. In reality, applied science is based on theoretical progress. On the other hand, theoretical research often looks at the world and at practical necessities to be developed. This book is based on the idea that theory and practice are not two disjointed worlds and that the knowledge is interconnected matter.

The "narration" of this book is thought to flow as a story of (a part of) the mathematical thinking. The origin of this knowledge evolution is imagined to be originated in the stone age, when some caveman/cavewoman had the necessity to assess the number of objects he/she was observing. This conceptualization, happened at some point in our ancient history, has been the beginning of mathematics, but also of logics, rational thinking, and somehow technology. This story affected, century after century, our brain and brought us to the modern technological discoveries.

Books of algebra are either extremely formal, without an adequate mathematical rigour in proofs and definitions, thus not intuitive enough for a computer scientist/engineer, or trivial undergraduate textbooks. "Linear Algebra for Computational Sciences and Engineering" aims at maintaining a balance between rigour and intuition. In particular, this book presents, without compromising on mathematical rigour, the main concepts of linear algebra from the viewpoint of the computer scientist, the engineer, and anybody who will need an in-depth understanding of the subject to let applied sciences progress, attempting to provide the reader with examples, explanations, and practical implications of every concept introduced and every statement proved. On the other hand, the book does not contain logical skips or intuitive explanations to replace proofs. When appropriate, topics are also presented as algorithms with associated pseudocode.

This book is oriented to researchers and graduate students in applied sciences but is also organized as a textbook suitable to courses of mathematics. The story narrated in this book is organized into two parts composed of six chapters each, thus twelve in total. Part I illustrates basic topics in algebra which could be suitable for an introductory university module in Algebra, while Part II presents more advanced topics that could be suitable for a more advanced module. On the other hand, this book can be read as a handbook for researchers in applied sciences, as the division into topics allows an easy selection of a specific topic of interest.

Part I opens with Chap. 1, which introduces the basic concepts and definitions in algebra and set theory. Definitions and notation in Chap. 1 are used in all the subsequent chapters. Chapter 2 deals with matrix algebra, introducing definitions and theorems. Chapter 3 continues the discussion about matrix algebra by explaining the theoretical principles of systems of linear equations as well as illustrating some exact and approximate methods to solve them. Chapter 4, after having introduced vectors in an intuitive way as geometrical entities, progressively abstracts and generalizes this concept, leading to algebraic vectors, which essentially require the solution of systems of linear equations and are founded on matrix theory. These notions are then used within Chap. 5, where complex numbers and polynomials are discussed. Chapter 5 gently introduces algebraic structures by providing statement and interpretation for the fundamental theorem of algebra. The narration takes a break in Chap. 6, where conics are introduced and explained. It is shown how a conic has, besides its geometrical meaning, an algebraic interpretation and is thus equivalent to a matrix.

Part II opens with an advanced introduction to algebra by illustrating basic algebraic structures in Chap. 7. Group and ring theories are introduced as well as the concept of field, which constitutes the basics for Chap. 8, where vector spaces are presented. Theory of vector spaces is described from a theoretical viewpoint as well as with reference to their physical/geometrical meaning. The narration about vectors leads to Chap. 9, which deals with linear mappings, endomorphism, and eigenvalues. In Chaps. 8 and 9 the connections with matrix and vector algebra are self-evident. In a symmetric way, the narration takes a break again in Chap. 10, where some logical instruments for understanding the final chapters are introduced. Some simple algebraic operations are revisited as instructions to be executed within a machine. Memory and operator representations are also discussed. While introducing these concepts it is shown that algebra is not only an abstract subject. On the contrary, the implementation of algebraic techniques has major practical implications which must be taken into account. These concepts are the basics of complexity theory. Chapter 11 discusses graph theory and emphasizes the equivalence between a graph and a matrix/vector space. Finally, Chap. 12 introduces electrical networks as algebraic entities and shows how an engineering problem is the combination of multiple mathematical (in this case algebraic) problems. It is emphasized that the solution of an electric network incorporates graph theory, matrix theory, complex numbers, vector space theory, and systems of linear equations. Most of the knowledge achieved during the first five chapters is proposed again in Chap. 12, thus covering all the topics presented within all the other chapters.

Furthermore, I would like to express my gratitude to my long-standing friend Alberto Grasso, who inspired me with precious comments and useful discussions. As a theoretical physicist, he offered me a different perspective of Algebra, which is more thoroughly explained in the Foreword written directly by him. I would like to thank my colleagues of the Mathematics Team at De Montfort University, especially Joanne Bacon, Michéle Wrightham, and Fabio Caraffini, for support and feedback. Last but not least, I wish to thank my parents, Vincenzo and Anna Maria, for the continued patience and encouragement during the writing of this book.

To the youngest readers who are approaching Mathematics for the first time with the present book I would like to devote a thought. The study of Mathematics is similar to the running of a marathon: it requires intelligence, hard work, patience, and determination, where the latter three are as important as the first one. Like in a marathon, the study of mathematics contains easier and harder stretches, comfortable downhill, and nasty uphill bends. Understanding mathematics is a lifetime journey which does not contain short-cuts but can be completed only mile by mile, if not step by step. Unlike the marathon, the study of mathematics does not have a clear and natural finish line. However, it has the artificial finish lines that the society imposes to us, such as an exam, the publication of an article, a funding bid, a national research exercise, etc. In a marathon, like in the study of mathematics, the most important point is the focus on the personal path, the passion towards the goal, and to persevere despite the difficulties.

As a final note, this book is meant to be a training guide towards an initial understanding of linear and abstract algebra and possibly a first or complementary step towards better research in Computational Sciences and Engineering. I hope this book will be a source of inspiration for young minds. I wish the readers a fruitful and enjoyable time reading this book.

Leicester, UK
April 2016

Ferrante Neri

About the Author

Ferrante Neri received a Master's degree and a PhD in Electrical Engineering from the Technical University of Bari, Italy, in 2002 and 2007 respectively. In 2007, he also received a PhD in Scientific Computing and Optimization from the University of Jyväskylä, Finland. He was appointed Assistant Professor at the Department of Mathematical Information Technology at the University of Jyväskylä, Finland in 2007, and in 2009 as a Research Fellow with the Academy of Finland. Neri moved to De Montfort University, United Kingdom in 2012, where he was appointed Reader in Computational Intelligence. Since 2013, he is Full Professor of Computational Intelligence Optimisation at De Montfort University, United Kingdom. He is also Visiting Professor at the University of Jyväskylä, Finland. Currently, he teaches linear algebra and discrete mathematics at De Montfort University. His research interests include algorithmics, metaheuristic optimisation, scalability in optimisation and large scale problems.

Contents

Part I  Foundations of Linear Algebra

1  Basic Mathematical Thinking
   1.1  Introduction
   1.2  Axiomatic System
   1.3  Basic Definitions in Set Theory
        1.3.1  Order and Equivalence
   1.4  Number Sets
   1.5  A Preliminary Introduction to Algebraic Structures
   1.6  Exercises

2  Matrices
   2.1  Numeric Vectors
   2.2  Basic Definitions About Matrices
   2.3  Matrix Operations
   2.4  Determinant of a Matrix
   2.5  Invertible Matrices
   2.6  Orthogonal Matrices
   2.7  Rank of a Matrix
   2.8  Exercises

3  Systems of Linear Equations
   3.1  Solution of a System of Linear Equations
   3.2  Homogeneous Systems of Linear Equations
   3.3  Direct Methods
        3.3.1  Gaussian Elimination
        3.3.2  Pivoting Strategies and Computational Cost
        3.3.3  LU Factorization
        3.3.4  Equivalence of Gaussian Elimination and LU Factorization
   3.4  Iterative Methods
        3.4.1  Jacobi's Method
        3.4.2  Gauss-Seidel's Method
        3.4.3  The Method of Successive over Relaxation
        3.4.4  Numerical Comparison Among the Methods and Convergence Conditions
   3.5  Exercises

4  Geometric Vectors
   4.1  Basic Concepts
   4.2  Linear Dependence and Linear Independence
   4.3  Bases and Matrices of Vectors
   4.4  Products of Vectors
   4.5  Exercises

5  Complex Numbers and Polynomials
   5.1  Complex Numbers
   5.2  Complex Polynomials
        5.2.1  Operations of Polynomials
        5.2.2  Roots of Polynomials
   5.3  Partial Fractions
   5.4  Exercises

6  An Introduction to Geometric Algebra and Conics
   6.1  Basic Concepts: Lines in the Plane
        6.1.1  Equations of the Line
        6.1.2  Intersecting Lines
        6.1.3  Families of Straight Lines
   6.2  An Intuitive Introduction to the Conics
   6.3  Analytical Representation of a Conic
   6.4  Simplified Representation of Conics
        6.4.1  Simplified Representation of Degenerate Conics
        6.4.2  Simplified Representation of Non-degenerate Conics
   6.5  Matrix Representation of a Conic
        6.5.1  Intersection with a Line
        6.5.2  Line Tangent to a Conic
        6.5.3  Degenerate and Non-degenerate Conics: A Conic as a Matrix
        6.5.4  Classification of a Conic: Asymptotic Directions of a Conic
        6.5.5  Diameters, Centres, Asymptotes, and Axes of Conics
        6.5.6  Canonic Form of a Conic
   6.6  Exercises

Part II  Elements of Linear Algebra

7  An Overview on Algebraic Structures
   7.1  Basic Concepts
   7.2  Semigroups and Monoids
   7.3  Groups and Subgroups
        7.3.1  Cosets
        7.3.2  Equivalence and Congruence Relation
        7.3.3  Lagrange's Theorem
   7.4  Rings
        7.4.1  Cancellation Law for Rings
   7.5  Homomorphisms and Isomorphisms
   7.6  Exercises

8  Vector Spaces
   8.1  Basic Concepts
   8.2  Vector Subspaces
   8.3  Linear Span
   8.4  Basis and Dimension of a Vector Space
   8.5  Row and Column Spaces
   8.6  Euclidean Spaces
   8.7  Exercises

9  Linear Mappings
   9.1  Introductory Concepts
   9.2  Endomorphisms and Kernel
   9.3  Linear Mappings and Matrices
        9.3.1  Matrix Representation of a Linear Mapping
        9.3.2  A Linear Mapping as a Matrix: A Summarizing Scheme
        9.3.3  Change of a Basis by Matrix Transformation
        9.3.4  Geometric Mappings
   9.4  Eigenvalues, Eigenvectors, and Eigenspaces
        9.4.1  Method for Determining Eigenvalues and Eigenvectors
   9.5  Diagonal Matrices
   9.6  Power Method
   9.7  Exercises

10  An Introduction to Computational Complexity
    10.1  Complexity of Algorithms and Big-O Notation
    10.2  P, NP, NP-Hard, NP-Complete Problems
    10.3  Representing Information
          10.3.1  Huffman Coding
          10.3.2  Polish and Reverse Polish Notation
    10.4  Exercises

11  Graph Theory
    11.1  Motivation and Basic Concepts
    11.2  Eulerian and Hamiltonian Graphs
    11.3  Bipartite Graphs
    11.4  Planar Graphs
         11.4.1  Trees and Cotrees
         11.4.2  Euler's Relation
    11.5  Graph Matrices
         11.5.1  Adjacency Matrices
         11.5.2  Incidence Matrices
         11.5.3  Cycle Matrices
         11.5.4  Cut-Set Matrices
         11.5.5  Relation Among Fundamental Matrices
    11.6  Graph Isomorphisms and Automorphisms
    11.7  Some Applications of Graph Theory
         11.7.1  The Social Network Problem
         11.7.2  The Four Colour Problem
         11.7.3  Travelling Salesman Problem
         11.7.4  The Chinese Postman Problem
         11.7.5  Applications to Sociology or to the Spread of Epidemics
    11.8  Exercises

12  Applied Linear Algebra: Electrical Networks
    12.1  Basic Concepts
    12.2  Bi-poles
         12.2.1  Passive Bi-poles
         12.2.2  Active Bi-poles
    12.3  Electrical Networks and Circuits
         12.3.1  Bi-poles in Series and Parallel
         12.3.2  Kirchoff's Laws
         12.3.3  Phasorial Representation of Electrical Quantities
         12.3.4  Impedance
    12.4  Solving Electrical Networks
    12.5  Remark
    12.6  Exercises

Appendix A: A Non-linear Algebra: An Introduction to Boolean Algebra

Appendix B: Proofs of Theorems that Require Further Knowledge of Mathematics

References

Part I
Foundations of Linear Algebra

Chapter 1
Basic Mathematical Thinking

1.1 Introduction

Mathematics, from the Greek word "mathema", is simply translated as science or expression of the knowledge. This is probably a common opinion and, at an abstract level, mathematics has been with us with our capability of thinking and is the engine of human progress. Although it is impossible to determine the beginning of mathematics in the history, at some point in the stone-age some caveman/cavewoman probably asked to himself how many objects (stones, trees, fruits) he was looking at. In order to mark the amount, he used a gesture with the hands where each gesture corresponded to a certain amount. The most natural gesture I can think is to lift a finger for each object taken into account. Obviously, since we totally have 10 fingers in our hands, this is the reason why our numeral system is base 10. That caveman who made this for first is the one who invented/discovered the concept of enumeration, that is closely related to the concepts of set and its cardinality.

In reality, mathematics is much more than enumeration as it is a logical thinking system in which the entire universe can potentially be represented and lives aside, even when there is no physically meaningful representation. Regardless of the fact that mathematics is something that exists in our brain as well as in the surrounding nature and we discover it little by little, or an invention/abstraction of a human brain, algebra, as well as mathematics, is based on a set of initial rules that are considered a basic system of truth that is the basis of all the further discoveries. This system is named Axiomatic System.

This book offers a gentle introduction to mathematics and especially to linear algebra (from Arabic al-gabr = connection), that traditionally is the discipline that connects quantities (numbers) to symbols (letters) in order to extract general rules.

1.2 Axiomatic System

A concept is said to be primitive when it cannot be rigorously defined since its meaning is intrinsically clear. An axiom or postulate is a premise or a starting point for reasoning. More specifically, an axiom is a statement which appears unequivocally true and that does not require any proof to be verified but cannot be, in any way, falsified. Primitive concepts and axioms compose the axiomatic system. The axiomatic system is the ground onto which the entire mathematics is built.

On the basis of this ground, a definition is a statement that introduces a new concept/object by using previously known concepts (and thus primitive concepts are necessary for defining new ones). When the knowledge can be extended on the basis of previously established statements, this knowledge extension is named theorem. A theorem can be expressed in the form: "if the hypotheses are verified then the thesis occurs". The previously known statements are the hypotheses while the extension is the thesis. The set of logical steps to deduce the thesis on the basis of the hypotheses is here referred to as mathematical proof or simply proof. A large number of proof strategies exist. In this book, we will use only the direct proof, i.e. from the hypotheses we will logically arrive at the thesis, or by contradiction (or reductio ad absurdum), i.e. the negated thesis will be a new hypothesis that will lead to a paradox. A successful proof is indicated with the symbol □.

A theorem that enhances the knowledge by achieving a minor result that is then usable to prove a major result is called lemma, while a minor result that uses a major theorem to be proved is called corollary. A proved result that is not as important as a theorem is called proposition.

In some cases, the theorem is symmetric, i.e. besides being true that "if the hypotheses are verified then the thesis occurs" it is also true that "if the thesis is verified then the hypotheses occur". More exactly, if A and B are two statements, a theorem of this kind can be expressed as "if A is verified then B occurs and if B is verified then A occurs". In other words, the two statements are equivalent since the truth of one of them automatically causes the truth of the other. Hence, theorems of this kind will be expressed in the form "A is verified if and only if B is verified". It must be remarked that a theorem that states the equivalence of two facts requires two proofs. Thus, a theorem of the kind "A is verified if and only if B is verified" is essentially two theorems in one: the statements "if A is verified then B occurs" and "if B is verified then A occurs" require two separate proofs.

1.3 Basic Definitions in Set Theory

The first important primitive concept of this book is the set, that without mathematical rigour is here defined as a collection of objects that share a common feature. These objects are the elements of the set. Let us indicate with A a generic set and with x its element. In order to indicate that x is an element of A, we will write x ∈ A (otherwise x ∉ A).

Definition 1.1 Two sets A and B are said to be coincident if every element of A is also an element of B and every element of B is also an element of A.

Definition 1.2 The cardinality of a set A is the number of elements contained in A.

Definition 1.3 A set A is said empty and indicated with ∅ when it does not contain any element.

Definition 1.4 (Universal Quantifier) In order to indicate all the elements x of a set A, we will write ∀x ∈ A. If a proposition is applied to all the elements of the set, the statement "for all the elements of A it follows that" is synthetically written as ∀x ∈ A :.

Definition 1.5 (Existential Quantifier) In order to indicate that it exists at least one element x of a set A, we will write ∃x ∈ A. If we want to state that "it exists at least one element of A such that" we will write ∃x ∈ A |. If we want to specify that only one element exists we will use the symbol ∃!.

The statement ∀x ∈ ∅ is perfectly meaningful (and it is equivalent to "for no elements") while the statement ∃x ∈ ∅ is always wrong.

Definition 1.6 Let m be the cardinality of a set B and n be the cardinality of A. If m ≤ n and all the elements of B are also elements of A, then B is contained in A (or is a subset of A) and is indicated B ⊆ A.

Definition 1.7 Let A be a set. The set composed of all the possible subsets of A (including the empty set and A itself) is said power set.

Definition 1.8 For two given sets, A and B, the union set C = A ∪ B is that set containing all the elements that are in either or both the sets A and B.

Definition 1.9 For two given sets, A and B, the intersection set C = A ∩ B is that set containing all the elements that are in both the sets A and B.

Although proofs of set theory do not fall within the objectives of this book, in order to have a general idea of the mathematical reasoning the proof of the following property is provided.

Proposition 1.1 (Associativity of the intersection) (A ∩ B) ∩ C = A ∩ (B ∩ C)

Proof Let us consider a generic element x such that x ∈ (A ∩ B) ∩ C. This means that x ∈ (A ∩ B) and x ∈ C, which means that x ∈ A and x ∈ B and x ∈ C. Hence the element x belongs to the three sets. This fact can be re-written by stating that x ∈ A and x ∈ (B ∩ C), that is x ∈ A ∩ (B ∩ C). We can repeat the same operation ∀x ∈ (A ∩ B) ∩ C and thus find out that all the elements of (A ∩ B) ∩ C are also elements of A ∩ (B ∩ C). □

Definition 1.10 For two given sets, A and B, the difference set C = A \ B is that set containing all the elements that are in A but not in B.

Definition 1.11 For two given sets, A and B, the symmetric difference C = A Δ B = (A \ B) ∪ (B \ A) = (A ∪ B) \ (A ∩ B). The symmetric difference set is, thus, the set of those elements that belong either to A or B (elements that do not belong to their intersection).

Definition 1.12 For a given set A, the complement of the set A is the set of all the elements not belonging to A. The complement of A is indicated as Ac: Ac = {x | x ∉ A}.

Proposition 1.2 (Complement of a complement) (Ac)c = A

Proof Let us consider a generic element x ∈ (Ac)c. By definition, x ∉ Ac. This means that x ∈ A. We can repeat the reasoning ∀x ∈ (Ac)c and find out that x ∈ A. Hence, (Ac)c = A. □

Definition 1.13 (Cartesian Product) Let A and B be two sets with n and m their respective cardinalities. Let us indicate each set by its elements as A = {a1, a2, ..., an} and B = {b1, b2, ..., bm}. The Cartesian product C is a new set generated by all the possible pairs
C = A × B = {(a1, b1), (a1, b2), ..., (a1, bm), (a2, b1), (a2, b2), ..., (a2, bm), ..., (an, b1), (an, b2), ..., (an, bm)}.
The Cartesian product A × A is indicated with A², and in general A × A × A × ... × A = Aⁿ if A is multiplied n times.

Definition 1.14 Let C = A × B be a Cartesian product. A relation on C is an arbitrary subset R ⊆ C. This subset means that some elements of A relate to B according to a certain criterion R. The set A is said domain while B is said codomain. If x is the generic element of A and y the generic element of B, the relation can be written as (x, y) ∈ R or xRy.

Definition 1.15 (Order Relation) Let us consider a set A and a relation R on A. This relation is said order relation and is indicated with ≼ if the following properties are verified.
• reflexivity: ∀x ∈ A : x ≼ x
• transitivity: ∀x, y, z ∈ A : if x ≼ y and y ≼ z then x ≼ z
• antisymmetry: ∀x, y ∈ A : if x ≼ y then y ⋠ x

Definition 1.16 (Partially Ordered Set) A set A where the order relation is verified for some of its elements is a partially ordered set (also named as poset) and is indicated with (A, ≼).

Intuitively, a partially ordered set is a set whose elements (at least some of them) can be sorted according to a certain criterion. For example, if we consider a group of people we can identify the relation "to be successor of". It can be easily verified that this is an order relation as the properties are verified. Furthermore, some individuals can also be not successors of some others in the same group of people. Hence, we can have groups of individuals that are in relation with respect to the "to be successor of" order relation and some others that are unrelated to each other. The partially ordered set can be seen as a relation that allows the sorting of groups within a set.

The set A, on which the order relation is valid for every pair of its elements, is said totally ordered set. For example, if we consider a group of people we can always sort them according to their age. Hence the relation "to not be older than" (i.e. to be younger or to have the same age) with a set of people is a totally ordered set since every group of people can be fully sorted on the basis of their age.

Definition 1.17 Let (Y, ≼) be a subset of a poset (X, ≼). An element u in X is an upper bound of Y if y ≼ u for every element y ∈ Y. If u is an upper bound of Y such that u ≼ v for every other upper bound v of Y, then u is called a least upper bound or supremum of Y (sup Y).

Definition 1.18 Let (Y, ≼) be a subset of a poset (X, ≼). An element l in X is said to be a lower bound of Y if l ≼ y for all y ∈ Y. If l is a lower bound of Y such that k ≼ l for every other lower bound k of Y, then l is called a greatest lower bound or infimum of Y (inf Y).

Theorem 1.1 Let Y be a nonempty subset of a poset X. If Y has a supremum, then this supremum is unique.

Proof Let us assume by contradiction that a set Y has two suprema, u1 and u2, such that u1 ≠ u2. By definition of supremum, ∀u upper bound it follows that u1 ≼ u. Analogously, ∀u upper bound it follows that u2 ≼ u. Hence, u1 ≼ u2 and u2 ≼ u1. Since a partially ordered relation is antisymmetric, it follows that u1 = u2. □

Theorem 1.2 If Y has an infimum, this infimum is unique.

The same proof can be done for the uniqueness of the infima.
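The elementary set operations and identities introduced in this section can be checked mechanically on small finite sets; the following brief Python sketch (not part of the book, with arbitrary example sets) verifies the associativity of the intersection and a few of the preceding definitions.

```python
# A small numerical check of the set identities above.
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6}

# Associativity of the intersection (Proposition 1.1).
print((A & B) & C == A & (B & C))    # True

# Union, intersection and the subset relation of Definitions 1.6-1.9.
print(B & C <= B)                    # the intersection is a subset of B
print(A <= A | B)                    # A is a subset of the union

# Cardinality (Definition 1.2) and the empty set (Definition 1.3).
print(len(A), len(set()))            # 4 0
```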

Definition 1.19 (Function) A relation is said to be a mapping or function when it relates to any element of a set a unique element of another. Let A and B be two sets. More precisely, a mapping f : A → B is a relation R ⊆ A × B | ∀x ∈ A, ∀y1 and y2 ∈ B it follows that (x, y1) ∈ f and (x, y2) ∈ f ⇒ y1 = y2 and ∀x ∈ A : ∃y ∈ B | (x, y) ∈ f, where the symbol : A → B indicates that the mapping puts into relationship the set A and the set B and should be read "from A to B", while ⇒ indicates the material implication and should be read "it follows that". In addition, the concept (x, y) ∈ f can be also expressed as y = f(x).

Definition 1.20 A relation R on a set A is an equivalence relation and is indicated with ≡ if the following properties are verified.
• reflexivity: ∀x ∈ A it happens that x ≡ x
• symmetry: ∀x, y ∈ A if x ≡ y then it also happens that y ≡ x
• transitivity: ∀x, y, z ∈ A if x ≡ y and y ≡ z then x ≡ z

Definition 1.21 Let R be an equivalence relation defined on A. The equivalence class of an element a is a set defined as [a] = {x ∈ A | x ≡ a}.

Proposition 1.3 Let [a] and [b] be two equivalence classes and x ∈ [a] and y ∈ [b] two elements of the respective classes. It follows that [a] = [b] if and only if x ≡ y. This proposition means that two equivalent elements are always belonging to the same class.

We may think of two sets such that each element of one set is equivalent to one element of the other set. These sets are said to be equivalent sets.

Definition 1.22 Let f : A → B be a mapping. This mapping is said to be an injection (or the function is injective) if the function values of two different elements are always different: ∀x1 and x2 ∈ A, if x1 ≠ x2 then f(x1) ≠ f(x2).

Definition 1.23 Let f : A → B be a mapping. This mapping is said to be a surjection (or the function is surjective) if all the elements of B are mapped by an element of A: ∀y ∈ B it follows that ∃x ∈ A such that y = f(x).

1.4 Number Sets

If we consider again the caveman example, he generates a mapping between a physical situation and the position of the fingers. More precisely, this mapping allows the relation of only one amount of objects to only one position of the fingers. This means that enumeration is a special mapping f : A → B that simultaneously satisfies the two properties described in the previous two definitions. When both injection and surjection are satisfied on a function (as in the case of enumeration), this function is said to be a bijection (or the function is bijective). Injection says that each representation is unique (and hence unambiguous), while surjection says that each physical situation can be potentially represented (by means of a position of the fingers if up to ten objects are involved). In our example, the set of quantities and the set of finger positions (symbols) are equivalent. It must be remarked that two sets are equivalent if a bijection between them exists. Within our example, the position of the fingers is thus a symbol that univocally identifies a quantity. These symbols represent another important primitive concept in mathematics and are called numbers.

Obviously, a set can be composed of numbers. In this case it will be a number set. The set of natural numbers N can be defined as a discrete set {0, 1, 2, ...}. Other relevant sets are:
• relative numbers Z = {..., −2, −1, 0, 1, 2, ...}
• rational numbers Q: the set containing all the possible fractions x/y where x and y ∈ Z and y ≠ 0
• real numbers R: the set containing Q and all the other numbers that cannot be expressed as fractions of relative numbers
• complex numbers C: the set of numbers that can be expressed as a + jb where a, b ∈ R and the imaginary unit j = √−1, see Chap. 5.

The last primitive concept in this book is the infinity, indicated as ∞. We will consider ∞ as a special number that is larger than any possible number we can think. Any other number but ∞ is said finite. Thanks to this introduction, we can further characterize the sets by these definitions.

Definition 1.24 A set is said finite if its cardinality is a finite number. Obviously, a set is said infinite if its cardinality is ∞.

Definition 1.25 Let A be a number set. A is said dense if ∀x0 ∈ A : ∃x | |x − x0| < ε, regardless of how small ε is taken. This means that a number in a dense set is contiguously surrounded by neighbours, i.e. in a dense set we cannot identify a minimum neighbourhood radius ε > 0 that separates two neighbour numbers. Conversely, when an ε > 0 can be found to separate two neighbour numbers the set is said discrete. These definitions implicitly state that dense sets are always infinite.

Definition 1.26 (Countability) A set A is said to be countably infinite if A can be put into a bijective relation with the set of natural numbers N. A discrete set composed of infinite elements is still an infinite set, but in a different way, as it is countably infinite. If the set A is infinite but cannot be put into a bijective relation with the set of natural numbers, the set A is uncountably infinite. The sets R and C are of this kind, i.e. they are uncountably infinite.
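The injection, surjection and bijection properties defined above (and the bijections used to compare a set with N) can be tested exhaustively on small finite sets; the following is a minimal Python sketch with invented example mappings, not taken from the book.

```python
from itertools import product

def is_injective(f, A):
    """No two distinct elements of A share the same image."""
    return all(f(x1) != f(x2) for x1, x2 in product(A, A) if x1 != x2)

def is_surjective(f, A, B):
    """Every element of B is the image of some element of A."""
    return all(any(f(x) == y for x in A) for y in B)

def is_bijective(f, A, B):
    return is_injective(f, A) and is_surjective(f, A, B)

A = {0, 1, 2, 3}
B = {0, 1, 2, 3}

print(is_bijective(lambda x: (x + 1) % 4, A, B))   # True: a bijection on A
print(is_injective(lambda x: x // 2, A))           # False: 0 and 1 collide
print(is_surjective(lambda x: x // 2, A, B))       # False: 2 and 3 are never reached
```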

Let us define the relation R “to be a divisor”. y we can always assess whether x ≤ y or y ≤ x. Hence the relation is reflexive. This means that the relation above is an equivalence relation. Hence. Example 1. b[ denotes that infimum and supremum do not belong to the interval. Hence x z > 0. The infimum is 2 while a lower bound is 1.4 Let us now consider the set N \ {0}.5 Let us consider x. respectively. b] to denote that a and b belong to the interval. 4. respectively.10 1 Basic Mathematical Thinking It can be easily seen that all the number sets above except from C are totally ordered sets with respect to the relation ≤. Example 1.e. the relation is also transitive. Example 1. The product of two natural numbers is a natural number. ∀x. . Hence this relation is antisymmetric.3 Let us consider x. It follows that x has the same sign of z. the transitivity is verified. The supremum and infimum are 1 and 0. {2. Definition 1. • Let us check the transitivity: if xy = k ∈ N and yz = h ∈ N then xz = kyh y = kh. If the set is defined as ∀x ∈ R such that 0 < x < 1. • Let us check the symmetry: if xy = k ∈ N then xy = p = k1 which is surely not ∈ N. It follows that 14 is an upper bound while 12 is the supremum.27 An interval is a dense subset of R delimited by infimum and supre- mum. Let a be the infimum and b the supremum. Hence the relation is reflexive. • Let us check the reflexivity of this relation: a number is always a divisor of itself. Example 1. • Let us check the reflexivity of this relation: x−x 4 = 0 ∈ Z. This means that the relation above is of partial order. the notation ]a. i. respectively. • Let us check the reflexivity of this relation: xRx means x x = x 2 > 0 which is true. if we consider the set X ⊂ R defined as ∀x ∈ R such that 0 ≤ x.e. Example 1. Conversely.2 Let us consider now the same relation and the set X ⊂ R defined as ∀x ∈ R such that 0 ≤ x ≤ 1.1 Let us consider the set X ⊂ N. still the supremum and infimum are 1 and 0. y ∈ Z \ {0} and the following relation xR y: x y > 0. Finally. this set has no supremum. y ∈ Z and the following relation xR y: x − y is dividable by 4. i. • Let us check the symmetry: if x y > 0 then yx > 0 which is true. 12} and the relation “to be less or equal” ≤. 7. The interval is indicated as [a. • Let us check the transitivity: if x y > 0 and yz > 0 then x has the same sign of y and y has the same sign of z.

. the relation is also transitive. the result of the operation is still a member of the set. 0 ≤ d − b ≤ c − a = 0. . Definition 1. Hence if it is symmetric a − c = c − a.6 Let us consider the following set E = {(x. Hence the relation is reflexive. If the relation is symmetric (c − a. • Let us check the symmetry: if (a − c. Moreover. It is then antisymmetric.e. d − b) ∈ E with c − a ≥ 0. Hence. Although an in depth analysis of algebraic structures is out of the scopes of this chapter. This means that the relation above is an equivalence relation. d) : (a − c. y) ∈ R2 | (x ≥ 0) AN D (0 ≤ y ≤ x)} and the relation (a. and 0 ≤ d − f ≤ c − e. × X k . on the basis of a set. b − d) ∈ E. Example 1. This means that b = d. Hence. • Let us check the transitivity. . In other words.5 A Preliminary Introduction to Algebraic Structures If a set is a primitive concept. c − e ≥ 0.4 Number Sets 11 • Let us check the symmetry: if x−y 4 = k ∈ Z then y−x 4 = p = −k which is ∈ Z. This is possible only when a = c.1. In all the other cases this relation is never symmetric. Hence this relation is symmetric. b − b) = (0. For the hypothesis a − c ≥ 0. More advanced concepts related to algebraic structures will be given in Chap. . this section gives basic definitions and concepts. The k value is said arity of the operation. If A is X × X × . . We know that (a − c. the symmetry occurs only if the two solutions are coincident. The relation above is of partial order.29 Let us consider a set A and an operation f : A → B. algebraic structures are sets that allow some operations on their elements and satisfy some properties. If we sum positive numbers we obtain positive numbers. 7. • Let us check the reflexivity of this relation: (a − a.28 An operation is a function f : A → B where A ⊂ X 1 × X 2 × . (a − c + c − e) = (a − e) ≥ 0 and 0 ≤ b − d + d − f ≤ a − c + c − e that is 0 ≤ b − f ≤ a − e. 0 ≤ b − d ≤ a − c. b − d) ∈ E and if (c − e. b) R (c. the transitivity is valid. the set is said to be closed with respect to the operation f . Definition 1. 1. Hence. i. d − e) ∈ E. k ∈ N. b − d) ∈ E then a − c ≥ 0. The sum of these two numbers is ∈ Z. × X and B is X . • Let us check the transitivity: if x−y 4 = k ∈ Z and y−z4 = h ∈ Z then x−z 4 = x−y+y−z 4 = k + h. 0) ∈ E.

Definition 1.30 (Ring) A ring R is a set equipped with two operations called sum and product. The sum is indicated with a + sign while the product operator is simply omitted (the product of x1 by x2 is indicated as x1 x2). In addition, for a ring, the following properties must be valid.
• commutativity (sum): x1 + x2 = x2 + x1
• associativity (sum): (x1 + x2) + x3 = x1 + (x2 + x3)
• neutral element (sum): x + 0 = x
• inverse element (sum): ∀x ∈ R : ∃ (−x) | x + (−x) = 0
• associativity (product): (x1 x2) x3 = x1 (x2 x3)
• distributivity 1: x1 (x2 + x3) = x1 x2 + x1 x3
• distributivity 2: (x2 + x3) x1 = x2 x1 + x3 x1
• neutral element (product): x1 = x = 1x

Throughout this book, the inverse element with respect to the sum is also named opposite element.

Definition 1.31 Let R be a ring. If in addition to the ring properties also the
• commutativity (product): x1 x2 = x2 x1
is valid, then the ring is said to be commutative.

Definition 1.32 (Field) A field is a commutative ring which contains an inverse element with respect to the product for every element of the field except 0. In other words, for a field, indicated here with F, besides the commutative ring properties, also the
• inverse element (product): ∀x ∈ F \ {0} : ∃ x^{−1} | x x^{−1} = 1
is valid.

For example, if we consider the set of real numbers R and associate to it the sum and product operations, we obtain the real field. Both these operations process two elements of R and return an element of R (R is closed with respect to these two operations).

1.6 Exercises

1.1 Prove the following statement: A ∪ (A ∩ B) = A.

1.2 Prove the associativity of the union operation: (A ∪ B) ∪ C = A ∪ (B ∪ C).

1.3 Prove the distributivity of the union operation with respect to the intersection: (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C).

1.4 Prove the following statement: (A ∪ B)^c = A^c ∩ B^c.

1.5 Calculate A × A, B × A, and A × B if A = {a, b, c} and B = {x ∈ Z | x^2 − 2x − 8 = 0}.

1.6 Detect the codomain of the following relation on N: {(a, a^2 − 8) | a ∈ N}.

1.7 Detect the codomain of the following relation on R: {(x, x^5) | x ∈ R}.

1.8 Let V be the set of objects to sell in a shop. Determine whether or not the relation xRy defined as “the cost of x differs from the cost of y for less than a pound” is an equivalence relation.

1.9 Let us consider the relation R on R defined in the following way: xRy : x^2 + y^2 ≥ 4. Determine whether or not this relation is reflexive, symmetric, and transitive. Justify your answer with examples.

1.10 Let us consider the relation R on R defined in the following way: xRy : xy + 3 ≥ 3. Determine whether or not this relation is reflexive, symmetric, and transitive. Justify your answer with examples.

1.11 Let us consider the set N × N and the relation defined on it (a, b) R (c, d) : a + b = c + d
• determine whether or not R is reflexive, symmetric, anti-symmetric, and transitive on N × N
• identify the values of (a, b) that make the relation (a, b) R (2, 3) valid.

Chapter 2 Matrices

2.1 Numeric Vectors

Let R be the set of real numbers. The set of real numbers with the corresponding algebraic sum and product operations, the real field, will still be indicated with R.

Definition 2.1 (Numeric Vector) Let n ∈ N and n > 0. The set generated by the Cartesian product of R by itself n times (R × R × R × . . .) is indicated with R^n and is a set of ordered n-tuples of real numbers. The generic element a = (a1, a2, . . . , an) of this set is named numeric vector of order n on the real field, and the generic ai, ∀i from 1 to n, is said the ith component of the vector a.

Definition 2.2 (Scalar) A numeric vector λ ∈ R^1 is said scalar.

Definition 2.3 Let a = (a1, a2, . . . , an) and b = (b1, b2, . . . , bn) be two numeric vectors ∈ R^n. The sum of these two vectors is the vector c = (a1 + b1, a2 + b2, . . . , an + bn) generated by the sum of the corresponding components.

Definition 2.4 Let a = (a1, a2, . . . , an) be a numeric vector ∈ R^n and λ a number ∈ R. The product of a vector by a scalar is the vector c = (λa1, λa2, . . . , λan) generated by the product of λ by each corresponding component.

Definition 2.5 Let a = (a1, a2, . . . , an) and b = (b1, b2, . . . , bn) be two numeric vectors ∈ R^n. The scalar product of a by b is a real number a · b = c = a1 b1 + a2 b2 + . . . + an bn generated by the sum of the products of each pair of corresponding components.

It can be proved that the following properties of the scalar product are valid (a numerical illustration of these operations follows the list). Let a, b, c ∈ R^n and λ ∈ R.
• symmetry: a · b = b · a
• associativity: λ (b · a) = (λa) · b = a · (λb)
• distributivity: a · (b + c) = a · b + a · c
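These vector operations translate directly into array arithmetic. The following minimal sketch (assuming Python with the NumPy library; the vectors and the scalar are arbitrary illustrative values, not taken from the text) computes the sum of two vectors, the product of a vector by a scalar and the scalar product, and checks the symmetry property numerically.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])    # a numeric vector of order 3
b = np.array([4.0, 0.0, -1.0])   # another numeric vector of order 3
lam = 2.5                        # a scalar

vector_sum = a + b               # component-wise sum
scaled = lam * a                 # product of a vector by a scalar
dot = np.dot(a, b)               # scalar product: 1*4 + 2*0 + 3*(-1) = 1

# symmetry property a . b = b . a, verified numerically
assert np.isclose(dot, np.dot(b, a))
print(vector_sum, scaled, dot)
```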

.4 2 A= 50 4 1 ⎛ ⎞ 2 5 ⎜ 7 0⎟ A =⎜ T ⎝ 3. .1 √ 2 7 3. j: a j. ai.n is said n order square matrix. n > 0.10 Let A ∈ Rn. Definition 2. . .. .2 . a2. Definition 2.n ⎟ A=⎜ ⎝ .n . a2. . . The diagonal of a matrix is the ordered n-tuple that displays the same index twice: ∀i from 1 to n ai. The set containing all the matrices of real numbers having m rows and n columns is indicated with Rm.8 Let A ∈ Rm.n ⎜ a2.7 A matrix is said null O if all its elements are zeros.. .1 a2.2 Basic Definitions About Matrices Definition 2.11 Let. am. j is said generic j th column vector.n .2 . Definition 2. . . . If m = n the matrix is said square while it is said rectangular otherwise.n . .1 . j . .. The transposed matrix of A is a matrix AT whose elements are the same of A but ∀i..n is said generic i th row vector while aj = a1. n ∈ N and both m. .1 a1.. .4 ⎟ √ 4⎠ 2 1 It can be easily proved that the transposed of the transposed of a matrix is the  T matrix itself: AT . a1.2 .  The  numeric vector ai = ai. Example 2. a1.n where each matrix element ai..16 2 Matrices 2.9 A matrix A ∈ Rn.i = ai.. Definition 2.2 . . A matrix (m × n) A is a generic table of the kind: ⎛ ⎞ a1. j ∈ R..1 am. .6 (Matrix) Let m. am. Definition 2. ⎠ ⎟ am.T j .i . j .

n . The matrix A is said lower triangular if all its elements above the diagonal (the elements ai. The matrix A is said upper triangular if all its elements below the diagonal (the elements ai.A ∈ Rn. j where j > i) are zeros. The trace of a matrix tr (A) is the sum of the diagonal n elements: tr (A) = i=1 ai. Definition 2.12 Let A ∈ Rn.i .n . j where i > j) are zeros. .

j : ci. Since the sum between two matrices is the sum over multiple numbers the properties are still valid for matrices.n . j : ai. • commutativity: A + B = B + A • associativity: (A + B) + C = A + (B + C) • neutral element: A + O = A • opposite element: ∀A ∈ Rm. B ∈ Rm.17 Let A ∈ Rm.2. j = λai. Definition 2. The product between matrices A and B is a matrix C . μ ∈ R : (λμ) A = (Aμ) λ = (Aλ) μ • distributivity of the product of a scalar by the sum of two matrices:∀A. AT = A. j + bi. The matrix sum C is defined as: ∀i.n is said symmetric when ∀i. B ∈ Rm.n and ∀λ ∈ Rλ (A + B) = λA + λB • distributivity of the product of a matrix by the sum of two scalars: ∀A ∈ Rm. • associativity: ∀A ∈ Rm. Definition 2. It can be easily proved that if A is symmetric.n and ∀λ. j . j .n |A + B = O The proof can be carried out simply considering that commutativity and associativity are valid for the sum between numbers.n : ∃!B ∈ Rm. The product of a scalar by a matrix is a matrix C defined as: ∀i. j : ci. 2.n and λ ∈ R.14 A matrix A ∈ Rn. j = ai.n and ∀λ.15 Let A. μ ∈ R : (λ + μ) A = λA + μA Definition 2. The following properties can be easily proved for the product of a scalar by a matrix.n .2 Basic Definitions About Matrices 17 Definition 2.13 An identity matrix I is a square matrix whose diagonal elements are all ones while all the other extra-diagonal elements are zeros. j = a j.3 Matrix Operations Definition 2.i .16 Let A ∈ Rm. The following properties can be easily proved for the sum operation amongst matrices.r and B ∈ Rr.

j = ai.2 Let us multiply the matrix A by the matrix B. j Example 2. 2731 A= 5041 . . j + ai. . j = ai · bj = nk=1 ai. j is defined in the following way: ci.1 b1. + ai. j + .2 b2.n bm.= AB whose generic element ci.k bk.

More generally.18 2 Matrices ⎛ ⎞ 1 2 ⎜2 5⎟ B=⎜ ⎝8 ⎟ 0⎠ 2 2 a1 · b1 a1 · b2 42 41 C = AB = = a2 · b1 a2 · b2 39 12 The following properties can be easily proved for the product between two matri- ces. • left distributivity: A (B + C) = AB + AC • right distributivity: (B + C) A = BA + CA • associativity: A (BC) = (AB) C • transposed of the product: (AB)T = BT AT • neutral element: ∀A : AI = A • absorbing element: ∀A : AO = O It must be observed that the commutativity with respect to the matrix product is generally not valid. or c − b − a or b − a − c or c − a − b or b − c − a. Since the commutativity is not valid for the product between matrices. . Proposition 2. it can be checked that for n objects there are n! (n factorial) permutations where n! = (n) (n − 1) (n − 2) . see also [1]. and c.1 Let A.n with sum and product is not a commutative ring. if we consider three objects a. . . the set Rm. We will call permutation every grouping of these objects. 2. a − b − c) and name it fundamental permutation. It may happen in some cases that AB = BA. In these cases the matrices are said commutable (one with respect to the other).n . b. In this case. there are totally six possible permutations. Let us define even class permutation a permutation undergone to an even number of inversions and odd class permutation a permutation undergone to an odd number of inversions.g.18 Let us consider n objects. (2) (1) with n ∈ N and (0)! = 1. Every time two objects in a permutation follow each other in a reverse order with respect to the fundamental we will call it inversion.4 Determinant of a Matrix Definition 2. We could fix a reference sequence (e. For example. B ∈ Rn. Every matrix A is commutable with O (and the result is always O) and with I (and the result is always A). we could group them as a − b − c or a − c − b. The tr (AB) = tr (BA).

.. .2. . . Totally n! associated products can be extracted from a matrix A ∈ Rn.2 .c1 a2. .19 (Determinant of a Matrix) Let us consider a matrix A ∈ Rn.. . . ⎠ an. . . Definition 2..n . This product is here referred to as associated product and can be expressed in the form ε(c) = a1.1 a2.4 Determinant of a Matrix 19 In other words. a sequence is an even class permutation if an even number of swaps is necessary to obtain the fundamental permutation.1 a1.1 an. .. .n ⎜ a2. . the scalar ηk is defined as: 1 if c1 − c2 − .c3 . an. a2. . − cn is an odd class permutation The sum of the n! associated products where each term is weighted by the corre- sponding ηk. ⎛ ⎞ a1. Let us consider 1 − 2 − . a1. −1 if c1 − c2 − . .cn (we order the factors of the product according to the column index).2 . − n as the fundamental permutation. − cn is an even class permutation ηk = .n From this matrix we can build the product of n elements that do not belong neither to the same row nor to the same column. Analogously. ⎟.. . a sequence is an odd class permutation if an odd number of swaps is necessary to obtain the fundamental permutation.2 . ..n .n ⎟ A=⎜ ⎝ . . an. .c2 a3..

If n = 2. 3).2 a3. det A is equal to the only element of the matrix. 2) are even class permutation because two swaps are required to obtain (1. Example 2. If n = 3. .1 a1. 3. On the contrary. is named determinant of the matrix A and is indicated as det A: det A = n! k=1 ηk εk (c). 3).3 a2. and (1.1 a2. the matrix A appears as a1. Looking at the column indices it can be noticed that in (1.2 a3.3 a3. 1).1 − a1.3 A = ⎝ a2.2 − a1.3 a3.2 a2. 1.3 − a1.3 ⎠ a3. 1.2 a1. 2) are odd class permutations because only one swap allows to be back to (1. 3.2 a2.1 a1.2 a2.2 a3.2 A= a2.1 .3 + a1.1 a2. 3) is the fundamental permutation. 1) and (3.3 and its det A = a1. (3. 2.1 a3.1 a2. it follows that (2. 2.2 − a1.1 a2. 2.1 a3. 3). 2. the matrix A appears as ⎛ ⎞ a1.2 .3 If n = 1.2 and its det A = a1.1 a2. (2.1 a3.2 a2.1 + a1.3 a2.

j = λ1 a j. j can be expressed as weighted sum of the other elements of the column: ∀ j : ∃λ1 .20 Let A be a matrix. • Let A be a matrix and det A its determinant.i+1 + . The new matrix As is such that ⎛ ⎞ 111 det As = ⎝ 1 1 3 ⎠ = 2 + 6 + 1 − 2 − 3 − 2 = 2. λi−1 . Example 2.1 + λ2 a j. In this case the matrix A remains the same and hence the determinant must be the same. If two rows (columns) are swapped the determinant of the modified matrix As is − det A. For a given matrix A ∈ Rn. . The ith row is said linear combination of the others if each of its element ai.i−1 + λi+1 a j. 113 Le us swap the second row with the third one. λn | ai. Example 2. 212 • if two rows (columns) of a matrix A are equal then the det A = 0. λi−1 a j. On the other hand. . λi+1 . • The determinant of a triangular matrix is equal to the product of the diagonal elements.20 2 Matrices Definition 2. the determinant remains the same. . • If to the elements of a row (column) the elements of another row (column) all multiplied by the same scalar λ are added. . . a j.5 Let us consider the following matrix: ⎛ ⎞ 111 det A = ⎝ 2 1 2 ⎠ = 3 + 2 + 2 − 1 − 2 − 6 = −2. .n . . λ2 . .2 + . . the third word equates the weighted sum obtained by multiplying the first row by 1 and summing to it the second row multiplied by 2. 2.n λn . . It follows that det As = det As = − det As which is possible only when det As = 0. after the row (column) swap the determinant must have an inverted sign. • The determinant of a matrix A is equal to the determinant of its transposed matrix: det A = det AT . λ2 = 1. .4 Let us consider the following matrix: ⎛ ⎞ 011 A = ⎝3 2 1⎠. . 653 The third row is a linear combination of the first two by means of scalars λ1 . the following properties about determinants are valid. This property can be easily proved by swapping the two identical rows.

• Let A be a matrix and det A its determinant. .2. + λn−1 an−1. 223 The determinant of this new matrix is det (An ) = 6 + 2 + 2 − 4 − 2 − 3 = 1. It is easy to verify that ⎛ ⎞ 523 det A = det ⎝ 8 6 2 ⎠ = 60 + 8 + 0 − 36 − 0 − 32 = 0. λn−1 |an. λ2 . .6 Let us consider the following matrix: ⎛ ⎞ 111 A = ⎝1 2 1⎠ 001 whose determinant det (A) = 2 − 1 = 1. • If a row (column) is a linear combination of the other rows (columns) then the determinant is zero: if ∀ index j : ∃λ1 . . Let us now add to the third row the first row multiplied by λ = 2: ⎛ ⎞ 111 An = ⎝ 1 2 1 ⎠ . Example 2. then det A = 0. it remained the same. i. . j + λ2 a2. . In other words. 22 det A = det =2 12 . j = λ1 a1. j . Example 2. the matrix A is such that a1 = λ1 a2 + λ2 a3 where λ1 = 1 and λ2 = 1. K ) (with/we mean in this case the division element by element). 202 • if a row (column) is proportional to another row (column) then the determinant is zero: if ∃i. j + .8 For the following matrix.e. . K .7 The following matrix ⎛ ⎞ 523 A = ⎝8 6 2⎠ 202 has the first column equal to the sum of the other two. . j|ai /aj = (K . Two equal rows (columns) are a special case of this property (K = 1) as well as a row(column) composed only zeros (K = 0). . If a row (column) is multiplied by a scalar λ its determinant is λ det A.4 Determinant of a Matrix 21 Example 2. . .

44 det λA = det = 16 − 8 = 8 = λ2 det A. s submatrix is a matrix obtained from A after by cancelling r rows and s columns. The determinant of the product between two matrices is equal to the products of the determinants: det AB = det A det B = det B det A = det BA.9 For the following matrix. s be two positive integer numbers such that 1 ≤ r ≤ m and 1 ≤ s ≤ n. Let r.n . If λ is a scalar.10 Let us consider the following matrices. 24 • Let A and B be two matrices and det A. If we calculate AB we obtain the product matrix: 10 AB = 16 whose determinant is 6 that is equal to det A det B.22 2 Matrices while. 1 −1 A= 1 2 and 12 B= 02 whose determinants are det A = 3 and det B = 2. A r. 22 det A = det = 4. 24 • Let A be a matrix and det A its determinant. if we multiply all the elements of the matrix by λ = 2. 22 det A = det =2 12 while. if we multiply the second row by 2. Definition 2. det λA = λn det A. If we calculate BA we obtain the product matrix: 33 BA = 24 whose determinant is 6 that is equal to det A det B. respectively. Example 2.21 (Submatrices) Let us consider a matrix A ∈ Rm. . Example 2. det B their respective determinants.

a3.3 ⎠ . The submatrix is obtained by cancelling only the i th row and the j th column from A is said complement submatrix to the element ai.13 Let us consider the following matrix A ∈ R3. The resulting matrix is said adjugate matrix (or adjunct or adjoint) of the matrix A and is indicated with adj(A).n . Let us substitute each element of the transposed matrix with its corresponding cofactor. Definition 2.2 .2 = (−1)M1. In order to achieve this purpose. j is defined as Ai.2. Now let us compute the transposed AT . Example 2.3 − a1.3 a3.2 = a2.1 a1.1 a3. j = (−1)i+ j Mi. Definition 2.24 Let us consider a matrix A ∈ Rn.2 a1. Definition 2. j .4 Determinant of a Matrix 23 Definition 2.22 Let us consider a matrix A ∈ Rm. The determinant of this submatrix is said minor.3 The complement submatrix to the element a1.3 A = ⎝ a2.25 (Adjugate Matrix) Let us consider a matrix A ∈ Rn. its determinant is said major determinant or simply major. j of the element ai.2 is a2.3 : ⎛ ⎞ 130 A = ⎝5 3 2⎠ 012 and compute the corresponding Adjugate Matrix. Example 2.23 Let us consider a matrix A ∈ Rn. j .n . 022 . If the submatrix is the largest square submatrix of the matrix A. The cofactor Ai. the cofactor A1.3 : ⎛ ⎞ a1.11 Let us consider a matrix A ∈ R3. j and corresponding complement minor Mi. its generic element ai.12 From the matrix of the previous example.2 a2. j and its determinant is here named complement minor and indicated with Mi. Example 2.n . let us compute AT : ⎛ ⎞ 150 AT = ⎝ 3 3 1 ⎠ .1 a2.n and one of its square submatrices.1 a3.2 a3.3 while the complement minor M1.3 A= a3.1 a2.1 a3.1 . j .

M3. The Adjugate Matrix adj (A) is: ⎛ ⎞ 4 −6 6 adj (A) = ⎝ −10 2 −2 ⎠ . and M3. The determinant of A can be computed. 5 −1 −12 Theorem 2.2 = 6.1 (I Laplace Theorem) Let A ∈ Rn. M1. M2.2 = 2.3 = −12.3 = 2.1 = 4. M2.24 2 Matrices Let us compute the nine complements minors: M1.2 = 1.1 = 10.1 = 5. M3.n .3 = 6. M1. M2.

j for any arbitrary i and . j Ai.as the sum of each row (element) multiplied by the corresponding cofactor: det A = nj=1 ai.

3 ⎠ .1 A1.2 (−1)A1.1 a2.15 Let us consider the following A ∈ R3.n det A = i=1 ai. j for any arbitrary j. det A = 0(−1)(−2) + 1(−4) + 1(−1)(−6) = 2. Hence. If we consider the first row.14 Let us consider the following A ∈ R3. the matrix is singular.1 + a2. Hence. Example 2.2 + a2.3 A1. Example 2. it follows that det A = a2.1 (−1)A2. j Ai. If we consider the second row. a3. the matrix is non- singular. Let us now calculate the determinant by apply- ing the I Laplace Theorem.3 A = ⎝ a2. Let us prove the I Laplace Theorem in the special case of an order 3 matrix: ⎛ ⎞ a1.3 .2 a2. Obviously.2 a1.2 a3.2 A2. Let us now calculate the determinant by applying the I Laplace Theorem.3 .3 .1 + a1.3 : ⎛⎞ 121 A = ⎝0 1 1⎠ 420 The determinant of this matrix is det A = 8−4−2 = 2.1 a3. det A = 2(0) + 1(0) + 3(0) = 0. the I Laplace Theorem can be expressed in the equivalent form: the determinant of a matrix is equal to scalar product of a row (column) vector by the corresponding vector of cofactors. The result is the same.3 (−1)A2.1 a1.3 : ⎛ ⎞ 2 −1 3 A = ⎝ 1 2 −1 ⎠ −1 −2 1 The determinant of this matrix is det A = 4 − 1 − 6 + 6 + 1 − 4 = 0. We arrive to the same conclusion. it follows that det A = a1.2 + a1.

3 + a1.2 = a1.  Theorem 2.1 + a1.1 a2.2 − a1.2 − a1.3 − a2.1 det − a1.1 a2.2 a2.3 det =  a3.2.2 (II Laplace Theorem) Let A ∈ Rn.2 a3.2 − a1.2 − a1.2 a3.2 a2.3  a3.1 a3.3 a3.1 a3.2 a3.1 − a1.1 a3.1 a2.1 a2.2 a2.3 a3. The sum of the elements of a row (column) multiplied.4 Determinant of a Matrix 25 Proof By applying the definition det A = a1.1 a3.1 a3.3 a2.3  a3.1 a2.2 − a2.3 + a1.3 a2.3 a2. By applying the I Laplace Theorem we obtain a2.3 a2.3 a2.2 det A = a1.3 a2.3 a3.1 + a1.1 a3.3 a3.3 a3.1 a2.2 det + a1.2 a2.1 a3.1 + a1.n with n > 1.3 − a2.3 a3.3 a2.2 a2.2 a3.3 − a1.1 a2.1 a3.1 = a1.2 a3.2 a2.3 − a1.2 a3.2 a3.2 .1 that is the determinant calculated by means of the definition.

by the corresponding cofactor related to another row (column) is always zero: nj=1 ai. j Ak. j = 0 for any arbitrary k = i .

a3.1 A2.1 a1.3 = 1 (2) + 2 (−4) + 1 (6) = 0.n and i=1 ai. Let us apply now the II Laplace Theorem by applying the scalar product of the first row vector by the vector of the cofactors associated to the second row: a1. and A2.3 A2. Hence.1 a2.1 a3.2 = (−4).2 a1. A2.2 = −4.2 A2.1 + a1.2 A2.k = 0 for any arbitrary k = j.3 A = ⎝ a2. The II Laplace Theorem can be equivalently stated as: the scalar product of a row (column) vector by the vector of cofactors associated to another row (column) is always null. j Ai.3 .3 ⎠ .3 A2. It can been seen that if we multiply the third row vector by the same vector of cofactors the result still is 0: a3.1 A2.1 = 2.1 = (−1) (−2). and A2.3 = 6.3 = 4 (2) + 2 (−4) + 0 (6) = 0. Now.2 a3.2 + a3.16 Let us consider again the matrix ⎛ ⎞ 121 A = ⎝0 1 1⎠ 420 and the cofactors associated to second row: A2. A2. let us prove the II Laplace Theorem in the special case of an order 3 matrix: ⎛ ⎞ a1.2 a2.2 + a1.3 = (−1) (−6).1 + a3. A2. Example 2.

1 + a2.1 a3.n . If det A = 0 the matrix is said non-singular.2 a3...3 a3.n is an invertible matrix then the inverse matrix is unique: ∃!B ∈ Rn.n .3 det =  a3..3 − a2.n |AB = I = BA.3 a3. Thus.  The only inverse matrix of the matrix A is indicated with A−1 .n and it adjunctive matrix.. ⎛ ⎞ A1. The matrix B is said inverse matrix of the matrix A.2 − a2. ⎠ A1...26 Let A ∈ Rn.3 a2. . . ⎠ an.1 a3.1 = a2.. The matrix A is said invertible if ∃ a matrix B ∈ Rn. ⎟ .3 a2..n . An.n ⎜ a2...1 a2.1 .5 Invertible Matrices Definition 2.2 . For the contradiction hypothesis CA = I. Definition 2. then C = B.1 + a2.2 a3.  2.. a2. Proof If the inverse matrix would not be unique (by contradiction) there would exists another matrix C ∈ Rn.1 a3.2 a2.2 a2.1 A2. If B is an inverse matrix of A and another inverse matrix C exists. An.3 − a2.3 a2.. This would mean that AC = I = CA. If det A = 0 the matrix is said singular.2 − a2..3 a2. .2 = a2. Theorem 2.n ⎟ A=⎜ ⎝ . .n .2 A2.. The inverse matrix A−1 = 1 det A adj (A). Thus.3 a2. Proof Let us consider the matrix A. ..1 a2.1 an.. then C = C (AB) = (CA) B. .n |AB = I = BA.4 Let A ∈ Rn..1 ⎜ A1. An.26 2 Matrices Proof Let us calculate the elements of the scalar product of the element of the second row by the cofactors associated to the first row: a2.2 ⎟ adj (A) = ⎜ ⎝ . j its generic cofactor.2 a2. the inverse matrix is unique. . .2 .1 a3..2 .3 a3.2 a3. a1. Theorem 2.n and Ai.1 a3..3 − a2.2 a3. . . C = IB = B.2 a3.2 − a2.n inverse of A.27 Let A ∈ Rn. .1 det − a2.3 a3.2 − a2..2 a2. The matrix C can be written as C = CI.1 a2.2 det + a2.3 If A ∈ Rn.1 a2.2 a2..n A2.1 a2.1 a3.2 . ⎟.3 a3. an.3 a3..1 = 0.1 a2.1 a1. Since by hypothesis I = AB. ⎛ ⎞ a1.3 + a2.

+ a1. 11 The determinant of this matrix is det A = 1.2 + . 0 ⎟ Aadj (A) = ⎜⎝ . . .n .2 + .. det A −1 2 Example 2. .  Example 2..2 An.n . the diagonal elements are equal to det A while. The transposed of this matrix is 21 AT = . + an.1 An.. .n An. .n A1. .n A1. . .n An. 11 which. .n ⎜ a2... . . + a2. for the I Laplace Theorem. for the II Laplace Theorem.1 A1. the extra-diagonal elements are equal to zero: ⎛ ⎞ det A 0 . .2 + ..n ⎟ ⎜ Aadj (A) = ⎝ ⎟. . in this case.2 An. . .. −1 2 The inverse of the matrix A is then −1 1 1 −1 A = adj (A) = .n . .n and let consider that..17 Let us calculate the inverse of the matrix 21 A= .. a2. .n A2.5 Invertible Matrices 27 Let us compute the product matrix Aadj (A). .2 A1.2 + . ⎠.18 Let us calculate the inverse of the matrix 1 1 A= . . . .n An.1 + a2. ⎛ ⎞ a1. Aadj (A) = (det A) I and A−1 = 1 det A adj (A).2 An.2.1 A1. . .1 + an. is equal to A.1 + a1. + a1. . 0 ⎜ 0 det A .1 + an.1 + a2.. det A Thus.1 A1.2 A1. ..1 An. an..2 + . + a2.1 An. . . . ⎠ an. The adjunctive matrix is 1 −1 adj (A) = .2 + . −2 1 . ⎟ 0 0 ..2 A1. + an. .1 + a1.. . a1.

2 −a1.2 then 1 a2.1 Let A ∈ R2.2 A−1 = . 131 .28 2 Matrices The determinant of this matrix is det A = 3. 1 1 The adjunctive matrix is 1 −1 adj (A) = .1 a1. The transposed of this matrix is 1 −2 A = T .2 the adjunctive matrix is a2.1 a2.4.3 : ⎛ ⎞ 211 A = ⎝0 1 0⎠.1 a1. 2 1 The inverse of the matrix A is then 1 −1 1 1 1 −1 − 13 A = adj (A) = = 3 2 1 .1 a1.1 a2.2 −a1. a2. Corollary 2.2 A = .2 : a1.1 A = T .19 Let us now invert a matrix A ∈ R3.2 A= .1 a1. det A 3 2 1 3 3 The latter two examples suggest introduce the corollary of Theorem 2.1 and the inverse is −1 1 a2.2 adj (A) = −a2.2 a2.1 Proof Let us calculate the transposed of A: a1.1 Example 2.2 −a1. det A −a2. a1. det A −a2.

A is non-singular.2. Furthermore. only one inverse matrix can be found in accor- dant with Theorem 2.2 Let A ∈ Rn. Proof (1) If A is invertible then A is non-singular  If A is invertible then ∃!A−1 |AA−1 = I = A−1 A. then det A = 0. Intuitively. We know that B = A−1 and thus A is invertible. The following theorem introduces the theoretical foundation of this intuition. We could not have taken B in this way if det A = 0.n taken as B = det1 A adj (A).5 Invertible Matrices 29 The determinant of this matrix is det A = 2 − 1 = 1. the determinant of the product of two (square) matrices is equal to the product  of the determinants of these two matrices. The matrix A is invertible if and only if A is non- singular. 101 The adjunctive matrix is ⎛ ⎞ 1 2 −1 adj (A) = ⎝ 0 1 0 ⎠ −1 −5 2 and the corresponding inverse matrix is ⎛ ⎞ 1 2 −1 1 A−1 = adj (A) = ⎝ 0 1 0 ⎠ . we can consider a matrix B ∈ Rn. The transposed of this matrix is ⎛ ⎞ 201 AT = ⎝ 1 1 3 ⎠ . . i. det A −1 −5 2 In all the previous examples. Since for the properties of the determinant. Theorem 2. it can be observed that for a singular matrix the inverse cannot be calculated since the formula A−1 = det1 A adj (A) cannot be applied (as it would require a division by zero). i.n . Hence. det A = 0. Thus.  Corollary 2.  (2) If A is non-singular then A is invertible If A is non-singular.e. det A = 0.3.5 Let A ∈ Rn. Then det AA−1 = det I = 1.n be an invertible matrix. it follows that det AA−1 = (det A) det A−1 = 1. It follows that det A−1 = 1 det A . all the inverted matrices are non-singular.e.

the determinants  beTan T  are still equal: det AA = det I.6 An orthogonal matrix is always non-singular and its determinant is either 1 or −1.30 2 Matrices 2. (det A)2 = 1.) A matrix is orthogonal if and only if the sum of the squares of the element of a row (column) is equal to 1 and the scalar product of any two arbitrary rows (columns) is equal to 0.21 The rank of the matrix 1 −1 −2 −1 1 0 is 2 as the submatrix −1 −2 1 0 is non-singular. Proof Let A ∈ Rn.7 Rank of a Matrix Definition 2. det I = 1. Example 2. For the properties of the determinant det AAT = det A det AT .20 The following matrices are orthogonal: sin(α) cos(α) cos(α) −sin(α) and ⎛ ⎞ sin(α) 0 cos(α) ⎝ 0 1 0 ⎠.6 Orthogonal Matrices Definition 2. det A = det AT . Theorem 2. is the highest order of the non-singular submatrix ⊂ A. This can happen only when det A = ±1. Then. indicated as ρA .n is said orthogonal if the product between it and its transposed is the identity matrix: AAT = I = AT A.n orthogonal matrix. Example 2.n with A assumed to be different from the null matrix. AA = I. If A is the null matrix then its rank is taken equal to 0.28 A matrix A ∈ Rn.29 Let A ∈ Rm. Thus.7 (Property of Orthogonal Matrices. The rank of the matrix A.  Theorem 2. . Thus. −cos(α) 0 sin(α) 2.

4 ⎝ a2. A r +1 square submatrix of A that contains also Mr is here said edged submatrix.30 Let A ∈ Rm. If Mr is non-singular and all the edged submatrices of Mr are singular.1 a3. n}.23 Let us consider the following matrix ⎛ ⎞ 1213 A = ⎝1 0 1 1⎠. Thus ρA = 2.4 ⎠ . a3. then the rank of A is r .1 a2.2 a1.3 a1.4 ⎠ a3.3 ⎠ a3. a1.2.1 a3.22 Let us consider a matrix A ∈ R4.3 a3.2 a1.3 a3. 1213 . ⎛ ⎞ a1.4 From A let us extract the following submatrix Mr . 1 −1 1 It can be easily seen that det A = 0 while there is at least one non-singular square submatrix having order 2.2 a2.8 (Kronecker’s Theorem) Let A ∈ Rm.1 a1.1 a1. a3.3 a2.n and let us indicate with Mr a r order square submatrix of A.2 a3. Definition 2.2 a3.4 A = ⎝ a2.3 and ⎛ ⎞ a1.3 Edged submatrices are ⎛ ⎞ a1.1 a2.2 a2.1 a1.3 .3 a2.3 Mr = . Example 2.1 a3.1 a2.1 a1.1 a3.n and Mr a r order square sub- matrix of A with 1 ≤ r < min{m.3 a1. Example 2.3 ⎝ a2.7 Rank of a Matrix 31 Let us consider the following matrix: ⎛ ⎞ −1 2 −1 A = ⎝ 2 −3 2 ⎠ .4 Theorem 2.

that is ⎛ ⎞ 213 ⎝0 1 1⎠ 213 and ⎛ ⎞ 113 ⎝1 1 1⎠. We can verify this fact by considering the other order 3 submatrices. hence the rank of A is indeed 2. Lemma 2.24 Let us consider the following matrix 21 A= 12 and the matrix 011 B= .32 2 Matrices The submatrix 12 10 is non-singular while its edged submatrices ⎛ ⎞ 121 ⎝1 0 1⎠ 121 and ⎛ ⎞ 123 ⎝1 0 1⎠ 123 are both singular. Let A be non-singular and ρB be the rank of the matrix B.n and B ∈ Rn. 111 . Example 2. Kronecker’s Theorem states that this is enough to conclude that the rank of the matrix A is 2. also these two submatrices are singular.1 Let A ∈ Rn.q . It follows that the rank of the product matrix AB is ρB . 113 Obviously.

If we consider a matrix B having rank equal to 1 011 B= 011 and multiply A by B we obtain 033 AB = 033 that has rank ρAB = ρB = 1.7 Rank of a Matrix 33 The matrix A is non-singular and hence has rank ρA = 2. and ρAB be the rank of the product matrix AB.1. in accordance with Lemma 2. .n and B ∈ Rn.9 (Sylvester’s Law of Nullity) Let A ∈ Rm. let us consider the following two matrices: 01 A= 11 and 41 B= .q .25 In order to check the Sylvester’s law of nullity. The product matrix 133 AB = 233 has rank ρAB = ρB = 2. Example 2. 00 Matrix A is non-singular. Let ρA and ρB be the ranks of the matrices A and B. hence has rank ρA = 2 while B is singular and had has rank ρB = 1. respectively. as expected from Lemma 2.2.1. It follows that ρAB ≥ ρA + ρB − n. The matrix B has also rank ρB = 2 since the submatrix 01 11 is clearly non-singular. Theorem 2.

i.26 Let us consider again the matrix A which we know has rank ρA = 2 and the following matrix ρB having ρB = 2: 41 B= . 42 The product matrix is non-singular. 013 . Example 2. Example 2. the Sylvester’s law of nullity for square matrices.e.3 Let A ∈ Rn. respectively. We can easily verify that ρA + ρB − n = 2 + 1 − 2 = 1 that is equal to ρAB . It follows that ρAB = ρA + ρB − n. In other words. i. and ρAB be the rank of the product matrix AB.n and B ∈ Rn. the Sylvester’s law of nullity tells us before calculating it. Let ρA and ρB be the ranks of the matrices A and B.27 Before entering into next theorem and its proof.e.34 2 Matrices Let us calculate the product AB: 00 AB = 41 whose rank is ρAB = 1. that AB is non-singular. let us introduce the concept of partitioning of a matrix. its rank is ρAB = 2. Let us consider the following two matrices: ⎛ ⎞ 210 A = ⎝1 3 1⎠ 110 and ⎛ ⎞ 111 B = ⎝2 1 0⎠. These examples suggest the following corollary. 01 In accordance with the Sylvester’s law of nullity we expect that the product AB has a rank > 1 since ρA + ρB − n = 2 + 2 − 2 = 2.n . which verifies this theorem. Corollary 2. Let us verify this fact: 01 AB = .

2 = .1 A2.1 A2.1 A1.2 A= A2. 13 0 A1.1 where 111 B1. and  A2.1 = 1 1 .2 B2.1 B1.1 = 210 and  B2.1 = 0 1 3 . Analogously. A2.1 B= B2.1 A2.2 = 0 .1 .7 Rank of a Matrix 35 We can calculate the matrix product ⎛ ⎞ 432 AB = ⎝ 7 5 4 ⎠ . We can now treat the submatrices as matrix element.2 where 21 A1.1 B1.2 B2. This means that in order to calculate AB we can write A1.1 AB = = . 1  A2. the matrix B can be written as B1. 321 Let us now re-write the matrices A as A1.2 B1.1 A1.1 + A2. This variable replacement is said partitioning of a matrix.1 A1.1 = .2.1 + A1.2 B2.

1 + A2. division into sub-matrices): A1. Let ρA and ρB be the ranks of the matrices A and B.1 321 The Sylvester’s Law of Nullity can be also expressed in a weaker form that is often used in applied sciences and engineering.1 AB = = ⎝7 5 4⎠.1 + A1.2 B2.1 A2.10 (Weak Sylvester’s Law of Nullity) Let A ∈ Rm.1 B1.2 B2.1 + A2.2 B2. It follows that ρA + ρB ≤ n. Hence.1 B= .1 B1.1 = 11 + 0 013 = 210    = 321 000 = 321 .1 B1. A2.1 of size ρA × q.1 = + 013 = 13 210 1 432 000 432 = + = 741 013 754 and  111   A2.1 has rank ρA and size ρA × ρA .e.2 A= A2.1 = O A2.1 Since AB = O.1 + A1. Theorem 2.1 + A1.2 B2.1 B1.n and B ∈ Rn.36 2 Matrices We can easily verify this fact by calculating the two matrix expressions: 21 111 0  A1.2 where A1. B2.1 B1.1 B1. Proof Let us rearrange the rows and columns of A in order to have the following partitioning (i.1 = O. B1. it follows that A1. Analogously. let us perform the partitioning of the matrix B in order to have B1. we obtain the same AB matrix as before by operation with the partitions instead of working with the elements: ⎛ ⎞ 432 A1.2 B2.q such that AB = O.1 + A2. .2 B2.1 A1.

1 A1. 1 . Furthermore.2. for the Lemma 2.1 = −A−1 1. ρB ≤ (n − ρA ) ⇒ ρA +ρB ≤ n. ρA = 2 and ρB = 1.1 .2 B2.7 Rank of a Matrix 37 From the first equation B1. Obviously n = 3 which is 2 + 1. Let us consider the following non-singular matrix I A−1 1.1 Since the matrix P is non-singular.29 Let us consider now the following matrices: ⎛ ⎞ 52 1 A = ⎝ 0 1 −2 ⎠ 42 0 and ⎛ ⎞ −1 B = ⎝ 2 ⎠.1 is ρ B . ρ B can be at most (n − ρ A ). The size of B2. Example 2. B2.1 is (n − ρ A ) × q.28 Let us consider the following two matrices verifying the hypotheses of the theorem: ⎛ ⎞ 100 A = ⎝5 1 0⎠ 000 and ⎛ ⎞ 000 B = ⎝0 0 0⎠. 001 The product of the two matrices is AB = O.  Example 2.2 P= . It follows that the rank of B2. In other words. the matrix PB has rank ρ B . O I If we calculate PB we obtain O PB = .1 A1.1. Hence.

3. 1 −12 −2 . 0. 15 2 1 2. 5) 2. Considering that n = 3.3 Multiply the following two matrices: 2 1 −1 A= −2 3 4 ⎛ ⎞ 1 1 22 B = ⎝ 2 −3 1 3 ⎠ .1 Calculate the product of λ = 3 by the vector x = (2. 4.2 Calculate the scalar product of x = (2. 2. the weak Sylvester’s law of nullity is verified since n = ρA + ρB .4 Multiply the following two matrices: ⎛ ⎞ 7 −1 2 A = ⎝ 2 3 −2 ⎠ 5 4 1 ⎛ ⎞ −21 −3 2 B = ⎝ 4 1 −1 ⎠ . ρB = 1. −1 −1 5 2 2.38 2 Matrices We can easily verify that A is singular and its rank is ρA = 2. −3. 1) by y = (3. 1. It can be observed that if A is a non-singular matrix of order n and B is a vector having n rows.5 Multiply the following two matrices: ⎛ ⎞ 2 4 1 A = ⎝ 0 −1 0 ⎠ 1 8 0 ⎛ ⎞ 0 8 1 B = ⎝ 0 −1 0 ⎠ . Obviously. −1) 2.8 Exercises 2. −3. Moreover. we can verify that AB = O. the only way to have AB = O is if B null vector.

(2) Invert the matrix. (3) Prove the property. ⎛ ⎞ 2 −1 2 A = ⎝ 2 5 −3 ⎠ k 4 k+1 ⎛ ⎞ k −2 2 − k B = ⎝ k + 4 1 −1 ⎠ .8 Exercises 39 2. ⎛ ⎞ 5 −3 4 A = ⎝ 6 2 −1 ⎠ 12 4. 0 −2 3 (1) Verify that A is invertible. (2) Verify for the provided example the Property of Orthogonal Matrices.2 3 2.6 For each of the following matrices.7 Compute the adj (A) where A is the following matrix. (3) Verify that AA−1 = I.9 Let us consider the following matrix A: ⎛ ⎞ 3 −1 2 A =⎝0 1 2⎠. 3 2 1 2.10 (1) Give an example of orthogonal matrix. determine the values of k that make the matrix singular. 2. .2. 3 −3 A= 8 2 ⎛ ⎞ 6 2 2 A = ⎝ 7 −2 −5 ⎠ 13 6 4 ⎛ ⎞ 1 −2 3 −1 ⎜ 2 1 −1 2 ⎟ A =⎜ ⎝ 3 −2 ⎟ −1 −2 ⎠ −1 2 −1 −4 2.11 Prove the Lemma 2.8 Invert the following matrices. 2.1.

. These equations compose a system of linear equations indicated as: ⎧ ⎪ ⎪ a1. y2 . + an. + an. ⎪. . . © Springer International Publishing Switzerland 2016 41 F. . ..2 2 2.Chapter 3 Systems of Linear Equations 3. . . + a2. . .2 x2 + . . .1007/978-3-319-40341-0_3 . ai is said coefficient of the equation. .n xn = bn Every ordered n-tuple of real numbers y1 . x2 . . .2 x2 + .n xn = b1 ⎪ ⎨a x + a x + . .2 y2 + .1 y1 + a1. .1 x1 + a1. . .1 A linear equation in R in the variables x1 .n yn = b2 ⎪ ⎪. ⎪ ⎪ ⎩ an. xn is an equation of the kind: a1 x1 + a2 x2 + . and b is said known term. + a x = b 2.1 y1 + a2. . Definition 3. ⎪ ⎩ an. yn such that ⎧ ⎪ ⎪a1. + an xn = b where ∀ index i. .1 x1 + an. . .1 y1 + an. Linear Algebra for Computational Sciences and Engineering.1 1 2. . ai xi is said ith term of the equation. Coefficients and known term are constant and known numbers in R while the variables are an unknown set of numbers in R that satisfy the equality. .2 y2 + . DOI 10.n yn = b1 ⎪ ⎨a2..n yn = bn is said solution of the system of linear equations. . .1 Solution of a System of Linear Equations Definition 3. . x2 . + a1. + a1.2 y2 + . Neri.n n 2 . xn .2 Let us consider m (with m > 1) linear equations in the variables x1 .

. i. . Ax = b. then ∃!Y|AY = b.1 a2. a2.2 ..⎠ ⎟ xn ⎛ ⎞ b1 ⎜ b2 ⎟ b=⎜ ⎝. . If A is non-singular. .1 a2... .. The matrix Ac ∈ Rm.2 .42 3 Systems of Linear Equations A system can be written as a matrix equation Ax = b where ⎛ ⎞ a1. . am.2 . . Theorem 3.. . .1 a1. Proof Let us consider the system Ax = b. a1. there is only one solution simultaneously satisfying all the equations: if det A = 0.2 . the solution does not change.n ⎟ A=⎜ ⎝ . .1 am.n bm Two systems of linear equations are said to be equivalent if they have the same solution.. ⎠ ⎟ am.1 a1.. If A is non-singular for the Theorem 2.. Let us multiply A−1 by the equation representing the system: A−1 (Ax) = A−1 b ⇒ . a matrix A−1 exists.n ⎜ a2. . .2 . It can be observed that if two equations of a system (two rows of the matrix Ac ) are swapped. ⎟ am.n b1 ⎜ a2.. . ..n+1 whose first n columns are those of the matrix A and the (n + 1)th column is the vector b is said complete matrix: ⎛ ⎞ a1..e.... ... ..5 the matrix A is invertible.⎠. . Let us consider a system of n linear equations in n variables.2 .1 Cramer’s Theorem.⎠ ⎟ bm The coefficient matrix A is said incomplete matrix..1 am.. a1. . . a2.n b2 ⎟ Ac = (A|b) = ⎜ ⎝ ...n ⎛ ⎞ x1 ⎜ x2 ⎟ x=⎜ ⎝.. am.

3 the inverse matrix A−1 is unique and thus also the vector x is unique. i.  . the only one solution solving the system exists.e. −1 ⇒ A A x = A−1 b ⇒ ⇒ Ix = A−1 b ⇒ x = A−1 b For Theorem 2.

3. Obviously it follows that .1 Solution of a System of Linear Equations 43 We could verify that A−1 b is the solution of the system by substituting it within the system Ax = b.

.

the system is solved. The inverse matrix A−1 = det1 A adj (a). . the solution of a system of linear equations Ax = b is x = A−1 b. det A 1 −1 1 1 Thus x = A−1 b = 6 − 3 − 2. Thus.1 Solve the following system by inverting the coefficient matrix: ⎧ ⎪ ⎨2x − y + z = 3 x + 2z = 3 . A A−1 b = AA−1 b = Ib = b. on the basis of the Cramer’s Theorem. 1. the inverse matrix of the coefficient matrix should be computed and multiplied by the vector of known terms. the matrix is non-singular and is invertible. 6 − 3 − 3. The transposed is: ⎛ ⎞ 2 1 1 AT = ⎝ −1 0 −1 ⎠ 1 2 0 The inverse matrix is: ⎛ ⎞ 2 −1 −2 1 1⎝ adj (A) = 2 −1 −3 ⎠ . −3 + 3 + 1 = 1. A system of linear equations that satisfies the hypotheses of the Cramer’s Theorem is said a Cramer system. in order to solve a system of lin- ear equations. Thus. i. 0. Example 3. In other words. ⎪ ⎩ x−y=1 The system can be re-written as a matrix equation Ac = b where ⎛ ⎞ 2 −1 1 A = ⎝1 0 2⎠ 1 −1 0 and ⎛ ⎞ 3 b = ⎝3⎠ 1 In order to verify the non-singularity. It can be easily seen that det A = 1.e. let us compute det A.
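As a purely numerical counterpart of Example 3.1, the following minimal sketch (assuming Python with NumPy; numpy.linalg.solve is used only as an independent cross-check, not as part of the method described above) solves the same system by inverting the coefficient matrix.

```python
import numpy as np

A = np.array([[2., -1., 1.],
              [1.,  0., 2.],
              [1., -1., 0.]])
b = np.array([3., 3., 1.])

print(np.linalg.det(A))          # approximately 1.0: A is non-singular

x = np.linalg.inv(A) @ b         # x = A^{-1} b, as prescribed by Cramer's Theorem
print(x)                         # approximately [1. 0. 1.]

assert np.allclose(A @ x, b)                      # verification: A x = b
assert np.allclose(x, np.linalg.solve(A, b))      # cross-check with a direct solver
```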


Theorem 3.2 (Cramer’s Method) For a given system of linear equations Ax = b
with A non-singular, a generic solution xi element of x can be computed as, see [2]:

$$
x_i = \frac{\det A_i}{\det A}
$$

where Ai is constructed from matrix A by substituting the ith column with the vector
of known terms b:
$$
\begin{pmatrix}
a_{1,1} & a_{1,2} & \dots & b_1 & \dots & a_{1,n} \\
a_{2,1} & a_{2,2} & \dots & b_2 & \dots & a_{2,n} \\
\dots & \dots & \dots & \dots & \dots & \dots \\
a_{n,1} & a_{n,2} & \dots & b_n & \dots & a_{n,n}
\end{pmatrix}.
$$

Proof Let us consider a system of linear equations


$$
\begin{cases}
a_{1,1}x_1 + a_{1,2}x_2 + \dots + a_{1,n}x_n = b_1 \\
a_{2,1}x_1 + a_{2,2}x_2 + \dots + a_{2,n}x_n = b_2 \\
\dots \\
a_{n,1}x_1 + a_{n,2}x_2 + \dots + a_{n,n}x_n = b_n
\end{cases}
$$

We can compute x = A−1 b:
$$
\begin{pmatrix} x_1 \\ x_2 \\ \dots \\ x_n \end{pmatrix}
= \frac{1}{\det A}
\begin{pmatrix}
A_{1,1} & A_{2,1} & \dots & A_{n,1} \\
A_{1,2} & A_{2,2} & \dots & A_{n,2} \\
\dots & \dots & \dots & \dots \\
A_{1,n} & A_{2,n} & \dots & A_{n,n}
\end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ \dots \\ b_n \end{pmatrix}
$$

that is
$$
\begin{pmatrix} x_1 \\ x_2 \\ \dots \\ x_n \end{pmatrix}
= \frac{1}{\det A}
\begin{pmatrix}
A_{1,1}b_1 + A_{2,1}b_2 + \dots + A_{n,1}b_n \\
A_{1,2}b_1 + A_{2,2}b_2 + \dots + A_{n,2}b_n \\
\dots \\
A_{1,n}b_1 + A_{2,n}b_2 + \dots + A_{n,n}b_n
\end{pmatrix}.
$$

For the I Laplace Theorem the vector of solutions can be written as:
$$
\begin{pmatrix} x_1 \\ x_2 \\ \dots \\ x_n \end{pmatrix}
= \frac{1}{\det A}
\begin{pmatrix} \det A_1 \\ \det A_2 \\ \dots \\ \det A_n \end{pmatrix}.
$$




Example 3.2 Considering the system of the previous example, where det A = 1,
$$
x_1 = \det\begin{pmatrix} 3 & -1 & 1 \\ 3 & 0 & 2 \\ 1 & -1 & 0 \end{pmatrix} = 1,
$$
$$
x_2 = \det\begin{pmatrix} 2 & 3 & 1 \\ 1 & 3 & 2 \\ 1 & 1 & 0 \end{pmatrix} = 0
$$
and
$$
x_3 = \det\begin{pmatrix} 2 & -1 & 3 \\ 1 & 0 & 3 \\ 1 & -1 & 1 \end{pmatrix} = 1.
$$
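Cramer's Method translates into a few lines of code. The following minimal sketch (assuming Python with NumPy; the function name cramer is an arbitrary choice introduced here for illustration) builds each matrix Ai by substituting the ith column of A with b and reproduces the values of Example 3.2.

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's Method; A is assumed non-singular."""
    det_A = np.linalg.det(A)
    n = len(b)
    x = np.empty(n)
    for i in range(n):
        A_i = A.copy()
        A_i[:, i] = b                      # substitute the ith column with b
        x[i] = np.linalg.det(A_i) / det_A
    return x

A = np.array([[2., -1., 1.],
              [1.,  0., 2.],
              [1., -1., 0.]])
b = np.array([3., 3., 1.])
print(cramer(A, b))                        # approximately [1. 0. 1.]
```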

Definition 3.3 A system of m linear equations in n variables is said compatible if it
has at least one solution, determined if it has only one solution, undetermined if it
has infinite solutions, and incompatible if it has no solutions.

Theorem 3.3 (Rouchè-Capelli Theorem (Kronecker-Capelli Theorem)) A system
of m linear equations in n variables Ax = b is compatible if and only if both the
incomplete and complete matrices (A and Ac respectively) are characterised by the
same rank ρA = ρAc = ρ named rank of the system, see [3].

A proof of the Rouchè-Capelli theorem is given in Appendix B.
The non-singular submatrix having order ρ is said fundamental submatrix. The
first practical implication of the Rouchè-Capelli Theorem is that when a system of
m linear equations in n variables is considered, its compatibility can be verified by
computing ρA and ρAc .
• If ρA < ρAc the system is incompatible and thus it has no solutions.
• If ρA = ρAc the system is compatible. Under these conditions, three cases can be
identified.
– case 1: If ρA = ρAc = ρ = n = m, the system is a Cramer’s system and can be
solved by the Cramer’s method.
– case 2: If ρA = ρAc = ρ = n < m, ρ equations of the system compose a Cramer's system (and as such it has only one solution). The remaining m − ρ equations are linear combinations of the others; these equations are redundant and the system has only one solution.
– case 3: If ρA = ρAc = ρ < n (with ρ ≤ m), the system is undetermined and has ∞^{n−ρ} solutions.
A computational check of these three cases is sketched below.
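The following minimal sketch (assuming Python with NumPy; note that numpy.linalg.matrix_rank decides the rank only up to a numerical tolerance) classifies a system according to the three cases above.

```python
import numpy as np

def classify(A, b):
    """Classify the system Ax = b according to the Rouche-Capelli Theorem."""
    Ac = np.column_stack((A, b))             # complete matrix
    rho_A = np.linalg.matrix_rank(A)
    rho_Ac = np.linalg.matrix_rank(Ac)
    n = A.shape[1]                           # number of variables
    if rho_A != rho_Ac:
        return "incompatible"
    if rho_A == n:
        return "determined (only one solution)"
    return f"undetermined ({n - rho_A} free parameters)"

# the system of Example 3.3: determined
A = np.array([[3., 2., 1.], [1., -1., 0.], [2., 0., 1.]])
b = np.array([1., 2., 4.])
print(classify(A, b))
```

Applied to the matrices of Examples 3.5 and 3.6 below, the same function returns the undetermined and the incompatible case, respectively.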


Example 3.3 Let us consider the following system of linear equations:


$$
\begin{cases}
3x_1 + 2x_2 + x_3 = 1 \\
x_1 - x_2 = 2 \\
2x_1 + x_3 = 4
\end{cases}
$$

The incomplete and complete matrices associated to this system are:
$$
A = \begin{pmatrix} 3 & 2 & 1 \\ 1 & -1 & 0 \\ 2 & 0 & 1 \end{pmatrix}
$$
and
$$
A^c = \begin{pmatrix} 3 & 2 & 1 & 1 \\ 1 & -1 & 0 & 2 \\ 2 & 0 & 1 & 4 \end{pmatrix}.
$$

The det (A) = −3. Hence, the rank ρA = 3. It follows that ρAc = 3 since a
non-singular 3 × 3 submatrix can be extracted (A) and a 4 × 4 submatrix cannot be
extracted since the size of Ac is 3 × 4. Hence, ρA = ρAc = m = n = 3 (case 1). The
system can be solved by Cramer’s Method.
Only one solution exists and is:
$$
x_1 = \frac{\det\begin{pmatrix} 1 & 2 & 1 \\ 2 & -1 & 0 \\ 4 & 0 & 1 \end{pmatrix}}{-3} = \frac{1}{3}
$$
$$
x_2 = \frac{\det\begin{pmatrix} 3 & 1 & 1 \\ 1 & 2 & 0 \\ 2 & 4 & 1 \end{pmatrix}}{-3} = -\frac{5}{3}
$$
$$
x_3 = \frac{\det\begin{pmatrix} 3 & 2 & 1 \\ 1 & -1 & 2 \\ 2 & 0 & 4 \end{pmatrix}}{-3} = \frac{10}{3}.
$$

Example 3.4 Let us now consider the following system of linear equations:


$$
\begin{cases}
3x_1 + 2x_2 + x_3 = 1 \\
x_1 - x_2 = 2 \\
2x_1 + x_3 = 4 \\
6x_1 + x_2 + 2x_3 = 7
\end{cases}
$$


In this case we have m = 4 equations and n = 3 variables. From the previous
example we already know that ρA = 3. The complete matrix has also rank ρAc = 3
because the 4th row is a linear combination of the first three (the 4th row is the sum
of the first three rows). Hence ρA = ρAc = n = 3 ≤ m = 4. This is a case 2. The
system has only one solution, namely the one calculated above, and the 4th equation is redundant (the same solution satisfies this equation as well).

Example 3.5 Let us now consider the following system of linear equations:


$$
\begin{cases}
3x_1 + 2x_2 + 5x_3 = 5 \\
x_1 - x_2 = 0 \\
2x_1 + 2x_3 = 2
\end{cases}
$$

In this case we have m = 3 equations and n = 3 variables. The matrix associated
to the system is
$$
A = \begin{pmatrix} 3 & 2 & 5 \\ 1 & -1 & 0 \\ 2 & 0 & 2 \end{pmatrix}.
$$

It can be observed that the 3rd column is a linear combination of the other two (it is the sum of the first two columns). Hence, this matrix is singular. As such, the system cannot be solved by Cramer's Method (nor by Cramer's Theorem). The rank of this matrix is 2, as is the rank of the complete matrix. Hence, we have ρA = ρAc = 2 < n = 3. For the Rouchè-Capelli Theorem the system has ∞^{n−ρ} = ∞^1 solutions. This is a case 3.
It can be verified that the solutions of the system above are of the form α, α, 1 − α, e.g. 1, 1, 0 and 0.5, 0.5, 0.5 are solutions of the system. We can synthetically write that the solutions are α, α, 1 − α, ∀α ∈ R.
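This solution family can also be checked numerically. The following minimal sketch (assuming Python with NumPy; the chosen values of α are arbitrary) verifies that α, α, 1 − α satisfies all three equations of Example 3.5.

```python
import numpy as np

A = np.array([[3., 2., 5.],
              [1., -1., 0.],
              [2., 0., 2.]])
b = np.array([5., 0., 2.])

for alpha in (-2.0, 0.0, 0.5, 1.0, 100.0):
    x = np.array([alpha, alpha, 1.0 - alpha])
    assert np.allclose(A @ x, b)   # every member of the family solves the system
print("all checks passed")
```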

Example 3.6 Finally let us consider the system of linear equations:


$$
\begin{cases}
3x_1 + 2x_2 + x_3 = 1 \\
x_1 - x_2 = 2 \\
2x_1 + x_3 = 4 \\
6x_1 + x_2 + x_3 = 6
\end{cases}
$$

We already know from the examples above that ρA = 3. In this case ρAc = 4 because det (Ac) = −7 ≠ 0. Hence, ρA ≠ ρAc. The system is incompatible, i.e. there is no solution satisfying the system.


Example 3.7 Let us consider the following system of linear equations:


$$
\begin{cases}
x_1 + x_2 - x_3 = 2 \\
2x_1 + x_3 = 1 \\
x_2 + 3x_3 = -3 \\
2x_1 + x_2 + 4x_3 = -2 \\
x_1 + 2x_2 + 2x_3 = -1
\end{cases}
$$

The incomplete and complete matrices associated to this system are:
$$
A = \begin{pmatrix} 1 & 1 & -1 \\ 2 & 0 & 1 \\ 0 & 1 & 3 \\ 2 & 1 & 4 \\ 1 & 2 & 2 \end{pmatrix}
$$
and
$$
A^c = \begin{pmatrix} 1 & 1 & -1 & 2 \\ 2 & 0 & 1 & 1 \\ 0 & 1 & 3 & -3 \\ 2 & 1 & 4 & -2 \\ 1 & 2 & 2 & -1 \end{pmatrix}.
$$

It can be verified that ρA = ρAc = 3. Thus, for the Rouchè-Capelli theorem, the system is compatible. Since ρ = n < m we are in case 2. The 4th row of Ac is a
linear combination of the 2nd and 3rd rows (it is the sum of those two rows). The
5th row of Ac is a linear combination of 1st and 3rd rows (it is the sum of those two
rows). Thus the last two equations are redundant and the solution 1, 0, −1 solves the
system of 5 equations in 3 variables.
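The rank computations and the redundancy observed in Example 3.7 can be reproduced numerically with the following minimal sketch (assuming Python with NumPy).

```python
import numpy as np

A = np.array([[1., 1., -1.],
              [2., 0.,  1.],
              [0., 1.,  3.],
              [2., 1.,  4.],
              [1., 2.,  2.]])
b = np.array([2., 1., -3., -2., -1.])
Ac = np.column_stack((A, b))

print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(Ac))  # 3 3: compatible, case 2

x = np.array([1., 0., -1.])
assert np.allclose(A @ x, b)     # the two redundant equations are satisfied as well
```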

Example 3.8 Let us consider the following system of linear equations:


$$
\begin{cases}
5x + y + 6z = 6 \\
2x - y + z = 1 \\
3x - 2y + z = 1
\end{cases}
$$

The incomplete and complete matrices associated to this system are:
$$
A = \begin{pmatrix} 5 & 1 & 6 \\ 2 & -1 & 1 \\ 3 & -2 & 1 \end{pmatrix}
$$


and
$$
A^c = \begin{pmatrix} 5 & 1 & 6 & 6 \\ 2 & -1 & 1 & 1 \\ 3 & -2 & 1 & 1 \end{pmatrix}.
$$

The third column of the matrix A is the sum of the first two columns, hence the matrix is singular. It can be seen that ρA = 2 = ρAc. The system is therefore compatible. Moreover ρA = ρAc = 2 < n = 3. Hence the system is undetermined and has ∞^1 solutions.

Example 3.9 Let us consider the following system of linear equations:


$$
\begin{cases}
5x + y + 6z = 6 \\
2x - y + z = 1 \\
3x - 2y + z = 0
\end{cases}
$$

We know that ρA = 2. It can be easily seen that ρAc = 3. Hence ρA = 2 < ρAc = 3, i.e. the system is incompatible and has no solutions.

3.2 Homogeneous Systems of Linear Equations

Definition 3.4 A system of linear equations Ax = b is said homogeneous if the
vector of known terms b is composed of only zeros and is indicated with O:


$$
\begin{cases}
a_{1,1}x_1 + a_{1,2}x_2 + \dots + a_{1,n}x_n = 0 \\
a_{2,1}x_1 + a_{2,2}x_2 + \dots + a_{2,n}x_n = 0 \\
\dots \\
a_{m,1}x_1 + a_{m,2}x_2 + \dots + a_{m,n}x_n = 0
\end{cases}
$$

Theorem 3.4 A homogeneous system of linear equations is always compatible as it
always has at least the solution composed of only zeros.

Proof From the properties of the matrix product, AO = O, ∀ matrix A ∈ Rm,n and vector O ∈ Rn,1. □

Thus, for the Rouchè-Capelli Theorem, if the rank ρ of the system is equal to
n, then the system is determined and has only one solution, that is O. If ρ < n the
system has ∞^{n−ρ} solutions in addition to O.
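For a homogeneous system the solutions form the null space of A, which can be computed numerically, for instance through the singular value decomposition. The following minimal sketch (assuming Python with NumPy; the tolerance 1e-10 is an arbitrary choice) returns a basis of solutions for the singular matrix that also appears in Example 3.11 below.

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Return a basis of the solution set of the homogeneous system Ax = O."""
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))     # number of (numerically) non-zero singular values
    return vt[rank:].T              # right-singular vectors of the zero singular values

A = np.array([[3., 2., 1.],
              [4., 1., 3.],
              [3., 2., 1.]])
N = null_space_basis(A)
print(N.shape[1])                   # 1 free direction: the system has infinite solutions
assert np.allclose(A @ N, 0.0)      # every column of N solves Ax = O
```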


Example 3.10 The following homogeneous system of linear equations


$$
\begin{cases}
x + y = 0 \\
2x - 3y + 4z = 0 \\
2y + 5z = 0
\end{cases}
$$

is associated to its incomplete matrix
$$
A = \begin{pmatrix} 1 & 1 & 0 \\ 2 & -3 & 4 \\ 0 & 2 & 5 \end{pmatrix}.
$$

The determinant of the incomplete matrix is det A = −15 − 8 − 10 = −33. Thus,
the matrix is non-singular and consequently ρA = 3. The system is thus determined.
If we apply Cramer’s method we can easily find that the only solution of this system
is 0, 0, 0.

Theorem 3.5 If the n-tuple α1 , α2 , . . . , αn is a solution of the homogeneous system
Ax = O, then ∀λ ∈ R : λα1 , λα2 , . . . , λαn is also a solution of the system.

Proof Let us consider the generic ith row of the matrix A: ai,1, ai,2, . . . , ai,n. Let us multiply this row by λα1, λα2, . . . , λαn. The result of the multiplication is
$$
a_{i,1}\lambda\alpha_1 + a_{i,2}\lambda\alpha_2 + \dots + a_{i,n}\lambda\alpha_n
= \lambda\left(\alpha_1 a_{i,1} + \alpha_2 a_{i,2} + \dots + \alpha_n a_{i,n}\right) = \lambda 0 = 0.
$$
This operation can be repeated ∀ith row. Thus, λα1, λα2, . . . , λαn is a solution of the system ∀λ ∈ R. □

Example 3.11 The following homogeneous system of linear equations


$$
\begin{cases}
3x + 2y + z = 0 \\
4x + y + 3z = 0 \\
3x + 2y + z = 0
\end{cases}
$$

has the following incomplete matrix
$$
A = \begin{pmatrix} 3 & 2 & 1 \\ 4 & 1 & 3 \\ 3 & 2 & 1 \end{pmatrix}
$$

which is singular. Hence it follows that ρA = ρAc = 2 and that the system has ∞
solutions. For example 1, −1, −1 is a solution of the system. Also, 2, −2, −2 is
a solution of the system as well as 5, −5, −5 or 1000, −1000, −1000. In general
λ, −λ, −λ is a solution of this system ∀λ ∈ R.

Theorem 3.6 Let us consider a homogeneous system Ax = O. If α1 , α2 , . . . , αn and
β1 , β2 , . . . , βn are both solutions of the system, then every linear combination of these


two n-tuples is a solution of the system: ∀λ, μ ∈ R, λα1 + μβ1, λα2 + μβ2, . . . , λαn +
μβn is also a solution.

Proof Let us consider the generic ith row of the matrix A : ai,1 , ai,2 , . . . , ai,n . Let
us multiply this row by λα1 + μβ1 , λα2 + μβ2 , . . . , λαn + μβn . The result of the
multiplication

is
$$
a_{i,1}(\lambda\alpha_1 + \mu\beta_1) + a_{i,2}(\lambda\alpha_2 + \mu\beta_2) + \dots + a_{i,n}(\lambda\alpha_n + \mu\beta_n)
= \lambda\left(a_{i,1}\alpha_1 + a_{i,2}\alpha_2 + \dots + a_{i,n}\alpha_n\right)
+ \mu\left(a_{i,1}\beta_1 + a_{i,2}\beta_2 + \dots + a_{i,n}\beta_n\right) = 0 + 0 = 0.
$$
This operation can be repeated ∀ith row. Thus, λα1 + μβ1, λα2 + μβ2, . . . , λαn + μβn is a solution of the system. □

Example 3.12 Let us consider again the homogeneous system of linear equations
$$
\begin{cases}
3x + 2y + z = 0 \\
4x + y + 3z = 0 \\
3x + 2y + z = 0
\end{cases}
$$
We know that this system has ∞ solutions and that 1, −1, −1 and 2, −2, −2 are two solutions. Let us choose two arbitrary real numbers λ = 4 and μ = 5. Let us calculate the linear combination of these two solutions by means of the scalars λ and μ: λ (1, −1, −1) + μ (2, −2, −2) = 4 (1, −1, −1) + 5 (2, −2, −2) = (14, −14, −14), which is also a solution of the system.

Theorem 3.7 Let Ax = O be a homogeneous system of n equations in n + 1 variables. Let the rank ρ associated to the system be n. This system has ∞^1 solutions proportionate to the (n + 1)-tuple D1, −D2, . . . , (−1)^{i+1} Di, . . . , (−1)^{n+2} Dn+1 where, ∀ index i, Di is the determinant of the matrix A after the ith column has been cancelled.

Proof Let us consider the matrix A:
$$
A = \begin{pmatrix}
a_{1,1} & a_{1,2} & \dots & a_{1,n+1} \\
a_{2,1} & a_{2,2} & \dots & a_{2,n+1} \\
\dots & \dots & \dots & \dots \\
a_{n,1} & a_{n,2} & \dots & a_{n,n+1}
\end{pmatrix}.
$$
Let us indicate with Ã a matrix ∈ R^{n+1,n+1} constructed from the matrix A by adding one row ar,1, ar,2, . . . , ar,n+1:
$$
\tilde{A} = \begin{pmatrix}
a_{1,1} & a_{1,2} & \dots & a_{1,n+1} \\
a_{2,1} & a_{2,2} & \dots & a_{2,n+1} \\
\dots & \dots & \dots & \dots \\
a_{n,1} & a_{n,2} & \dots & a_{n,n+1} \\
a_{n+1,1} & a_{n+1,2} & \dots & a_{n+1,n+1}
\end{pmatrix},
$$
where the (n + 1)th row an+1,1, an+1,2, . . . , an+1,n+1 is the added row ar,1, ar,2, . . . , ar,n+1.

Thus. if we multiply the ith row of the matrix A by D1 . + (−1)n ai. . −D2 . (−1)n Dn+1 can be seen as the cofactors Ãn+1. and third column and computing the respective deter- minants we obtain that the ∞ solutions solving this system are all proportionate to 1. see also [4]. . As shown in Chap. respectively.1 An+1. .2 + . The application of the Cramer’s Method (see Theorem 3. −D2 . −1. . If n = 6. Hence. . 2. Ãn+1. . . Ãn+1.n . . It can be easily verified that the computational cost of the matrix inversion is the same as that of the Cramer’s Method. the solution of the system requires 5040 mathematical operations by matrix inversion or Cramer’s Method. (−1)n Dn+1 is a solution of the system. (−1)n Dn+1 we obtain: ai. −1.n+1 An+1.3 Direct Methods Let us consider a Cramer’s system Ax = b where A ∈ Rn.n+1 . if we neglect computational simple operations such as the transposition. 3. . . Let us identify each term with a mathematical operation and we can conclude that. 101 Cancelling first.n+1 related to n + 1th row of the matrix Ã.1).n+1 Dn+1 = = ai. .52 3 Systems of Linear Equations The elements of the n-tuple D1 . −D2 . . .13 Let us consider the following homogeneous system of linear equa- tions:  2x + y + z = 0 .2 An+1. Thus D1 . it . . . it would require the calculation of one determinant of a n order matrix and n2 determinants of n − 1 order matrices.1 D1 − ai. . the solution of a system of linear equations requires at least n! + n2 ((n − 1)!) and n! + n (n!) mathematical operations by matrix inversion and Cramer’s Method. . second.1 + ai. x + 0y + z = 0 The associated matrix is   211 A= . The solution of this system can be laborious indeed as.2). + ai. This expression is equal to 0 due to the II Laplace Theorem. . by applying the Cramer’s Theorem (matrix inver- sion in Theorem 3.2 . .2 D2 + . would require the calculation of one determinant of a n order matrix and n determinants of n order matrices. a deter- minant is the sum of n! terms where each term is the result of a multiplication. .1 .  Example 3.

If the system is composed of 50 equations in 50 variables (this would not even be a large problem in many engineering applications) the modern computer will need to perform more than 1. . ai+1. One class of these methods.1 = 0. The following operations on the matrix A are said elementary row operations: • E1: swap of two rows ai and aj • E2: multiplication of a row ai by a scalar λ • E3: substitution of a row ai by the sum of the row ai to another row aj By combining E2 and E3. this waiting time is obviously unacceptable.8 billions of mathematical operations per second and can quickly solve a system of 6 linear equations in 6 variables in a fraction of second. thus requiring over 1.7 Let A ∈ Rm. . if ai. The first non-null element in each row is said pivot element of the row. On the other hand. ⎝0 0 2 3⎠ 0 0 0 4 Definition 3. Definition 3. On the basis of this consideration during the last centuries mathematicians inves- tigated methods to solve systems of linear equations by drastically reduction the amount of required calculations.n . ai+1.2 = 0. namely direct methods perform a set of matrix transformations to re-write the system of linear equation in a new form that is easy to be solved.j−1 = 0 Definition 3.n be a staircase matrix. . row echelon form) if the following properties are verified: • rows entirely consisting of zeros are placed at the bottom of the matrix • in each non-null row. The matrix A is said staircase matrix (a. .3.a. we obtain a transformation consisting of the substitution of the row ai by the sum of the row ai to another row aj multiplied by a scalar λ (ai + λaj ).8 GHz clock frequency can perform 2.j = 0 then ai+1. Example 3. a modern computer having 2.k.55 1066 operations by Cramer’s Method.n . see [5].14 The following matrices are staircase matrices: ⎛ ⎞ 2617 ⎜0 0 1 3⎟ ⎜ ⎟ ⎝0 0 2 3⎠ 0000 ⎛ ⎞ 3 2 1 7 ⎜0 2 1 3⎟ ⎜ ⎟. the first non-null element cannot be in a column to the right of any non-null element below it: ∀ index i.6 Let A ∈ Rm.75 1046 millennia to be solved. If we consider that the estimated age of the universe approximately 13 106 millennia. .3 Direct Methods 53 can be very laborious to be solved by hand.5 Let A ∈ Rm.

0 0 1 −2 −2 finally. This matrix is said equivalent to A. ⎛ ⎞ 11 0 1 2 ⎜0 2 0 1 0⎟ ⎜ ⎟ ⎝ 0 0 −1 1 5 ⎠ .n .15 Let us consider the matrix ⎛ ⎞ 0 2 −1 2 5 ⎜0 2 0 1 0⎟ ⎜ ⎟.n . If we apply the elementary row operations on A we obtain a new matrix C ∈ Rm. Definition 3. let us add to third row the second row multiplied by -1 ⎛ ⎞ 11 0 1 2 ⎜0 2 0 1 0 ⎟ ⎜ ⎟ ⎝ 0 0 −1 1 5 ⎠ . ⎛ ⎞ 1 1 0 1 2 ⎜0 2 0 1 0⎟ ⎜ ⎟.8 For every matrix A a staircase matrix equivalent to it exists: ∀A ∈ Rm.54 3 Systems of Linear Equations It can be easily observed that the elementary row operations do not affect the singularity of the matrix of square matrices or. Theorem 3. . ⎝0 2 −1 2 5⎠ 1 1 1 −1 0 then. let us add to the fourth row the first row multiplied by −1 ⎛ ⎞ 11 0 1 2 ⎜0 2 0 1 0 ⎟ ⎜ ⎟ ⎝ 0 2 −1 2 5 ⎠ .8 (Equivalent Matrices) Let us consider a matrix A ∈ Rm. let us add to the fourth row the third row. more generally. Example 3.n equivalent to it. 0 0 1 −2 −2 then. ⎝1 1 0 1 2⎠ 11 1 −1 0 Let us swap first and third row.n : exists a staircase matrix C ∈ Rm. 0 0 0 −1 3 The obtained matrix is a staircase matrix. the rank of matrices.

a row is added to another row. • When E1 is applied. . Proof By following the definition of equivalent matrices. then Ãc can be generated from Ac by applying the elementary row operations. • When E2 is applied.3 Direct Methods 55 Definition 3. If the complete matrix Ac is a staircase matrix then the system is said staircase system. a scalar is multiplied to all the terms of the equation. the swap of two rows.10 (Equivalent Systems) Let us consider two systems of linear equa- tions in the same variables: Ax = b and Cx = D. i.1 x1 +ai. .9 Let us consider a system of m linear equations in n variables Ax = b. the equations of the system are swapped. In this case the equation ai.n xn = bi is substituted by the equation .e. Theorem 3. . + λai. • When E3 is applied. Let us analyse the effect of the elementary row operations on the complete matrix.1 x1 + λai. These two systems are equivalent if they have the same solutions. i.n xn = bi is substituted by λai. . the equation ai.n+1 equivalent to Ac .e.2 x2 + . Each operation of the complete matrix obviously has a meaning in the system of linear equations. Thus after E1 operation the modified system is equivalent to the original one.2 x2 + . .1 x1 +ai. .3.2 x2 + . This operation has no effect on the solution of the system. Let Ac ∈ Rm. If another system of linear equations is associated to a complete matrix Ãc ∈ Rm. a row is multiplied by a non-null scalar λ.9 Let us consider a system of linear equations Ax = b.e. + ai. The two equations have the same solutions and thus after E2 operation the modified systems is equivalent to the original one. then the two systems are also equivalent. i.n+1 be the complete matrix associated to this system. if Ãc is equivalent to Ac . Definition 3.n xn = λbi . + ai.

.

.

. If the n-tuple y1 .n + aj.2 + aj.1 x1 + ai.2 x2 + . ai.1 + aj. .2 x2 + . y2 . yn is solution of the original system is obviously solution of ai. + a. . + ai. .1 x1 + ai. .n xn = bi + bj . . . .

i.n xn = bi and aj.1 .

. . x1 + aj.2 x 2 + . + a.

 By combining the results of Theorems 3.n + aj.+ ai. yn also verifies ai. Thus. after E3 operation the modified system is equivalent to the original one. .2 + aj. the following Corollary can be easily proved. y1 .1 x1 + ai. Thus. Corollary 3.1 Every system of linear equations is equivalent to a staircase system of linear equations. .1 + aj. .n xn = bj . . y2 .n xn = bi + bj . .2 x2 +. .9.j.8 and 3. .

starting from a system Ax = b consists of the following steps.n xn an−1. The aim of this manipulation is to have a triangular incomplete matrix.n xn = bn . + a1. the second equation is in two variables but one of them is known from the first equation thus being in one variable and so on. i ai. ⎪ ⎪  ⎩x = bi − nj=i+1 ai. The transformed system can then be solved with a modest computational effort.n n 2 ⎪ ⎪ .. ..2 x2 + . . is a procedure that transforms any system of linear equations into an equivalent triangular system.i In an analogous way. the last but one equation is in two variables but one of them is known from the last equation thus being in one variable and so on.n−1 ⎪ ⎪.1 x1 + a1. An upper triangular system is of the kind ⎧ ⎪ ⎪ a1. ⎪ ⎩ an.n .1 Gaussian Elimination The Gaussian elimination. The complete matrix Ac can be then manipulated by means of the elementary row operations to generate an equivalent staircase system. + a x = b 2. . . The Gaussian elimination. • Construct the complete matrix Ac • Apply the elementary row operations to obtain a staircase complete matrix and triangular incomplete matrix • Write down the new system of linear equations • Solve the nth equation of the system and use the result to solve the (n − 1)th • Continue recursively until the first equation .j xj .2 2 2. the last equation is in only one variable and thus can be independently solved. was previously presented by Chinese mathematicians in the II century AC.nn ⎪ ⎪ ⎨xn−1 = bn−1 −an−1. the first equation is in only one variable and thus can be independently solved. The system can be solved row by row by sequentially applying ⎧ ⎪ ⎪xn = abn. This procedure. If the matrix A is triangular the variables are uncoupled: in the case of upper triangular A.n xn = b1 ⎪ ⎨a x + . see [6]. see [6].3. although named after Carl Friedrich Gauss. if A is lower triangular. . Let us consider a system of n linear equations in n variables Ax = b with A ∈ Rn. 3. .56 3 Systems of Linear Equations This is the theoretical foundation for the so called direct methods.

. . after substituting. .1 a1.j = ai.j and bi(1) = bi .1 a2.16 Let us solve by Gaussian elimination the following (determined) sys- tem of linear equations: ⎧ ⎪ ⎨x1 − x2 + x3 = 1 x1 + x2 = 4 . the matrix A at the first step is ⎛ (1) (1) (1) ⎞ a1. a2. . .n . x3 can be immediately derived: x3 = 21 . x1 = 49 . n.2 .. Then x3 is sub- stituted in the second equation and x2 is detected: x2 = 47 . .1 an. . . Let us determine the general transformation formulas of the Gaussian elimination. an. ⎪ ⎩ 2x1 + 2x2 + 2x3 = 9 The associated complete matrix is ⎛ ⎞ 1 −1 1 1 Ac = (A|b) = ⎝ 1 1 0 4 ⎠ . . 2 2 29 By applying the elementary row operations we obtain the staircase matrix ⎛ ⎞ 1 −1 1 1 Ã = (A|b) = ⎝ 0 2 −1 3 ⎠ . . .2 ..3 Direct Methods 57 Example 3. Finally. ⎪ ⎩ 2x3 = 1 From the last equation. Let us pose ai..n ⎟ ⎝ .3. ⎠ (1) (1) (1) an.j xj = bi j=1 (1) for i = 1. . 2.. Hence. . ... a1.2 .n ⎜ (1) (1) (1) ⎟ ⎜ ⎟ A(1) = ⎜ a2. A system of linear equation Ax = b can be re-written as  n ai. c 0 0 2 1 The matrix corresponds to the system ⎧ ⎪ ⎨x1 − x2 + x3 = 1 2x2 − x3 = 4 ..

⎠ (2) (2) 0 an. .n ⎟ A(2) = ⎜ 0 a2. (1) j a j=2 1.. .1 b1(1) . .1 (1) b i = 2.j b i = 2. n a1. if we add the first of these equations to the second of the original system. .1 x1 + j=2 a1.1 Let us now generate n − 1 equations by multiplying the latter equation by −a2. .. . 3. . . . 3..j xj = bi(1) − (1) ai. .n . (1) a1.1 1 ⎧ (1)  (1) (1) ⎪ ⎨ a1. .1 a1.j xj = b1   ⇒ ⎪  (1) (1) (1) a a (1) − i. . n where ⎧ (1) (1) ⎪ (2) ⎨ai. we obtain a new system of linear equations that is equivalent to the original one and is ⎧ (1) n (1) (1) ⎪ ⎨ a1.j − ai.1(1)1.j 1 ⎩ n (2) (2) a j=2 i.1 x1 + nj=2 a1.j xj = b1   (1) (1) −a a (1) ⇒ ⎪ (1) ⎩ ai. .1 Thus. the matrix A at the second step is ⎛ (1) (1) (1) ⎞ a1.1 x1 + nj=2 i. 3. −a3. the second of these equations to the third of the original system.1 : (1) n a1.2 . 3.1 (1) ⎩ nj=2 ai. .1 a1. .j x1 + x = b1(1) .1 ⎪ ⎩bi(2) = bi(1) − (1) ai.1 j=2 1.1(1) 1.1 1 ⎧  ⎨ a(1) x1 + n a(1) xj = b(1) 1. .j (1) a1..1 a1. an. a1.58 3 Systems of Linear Equations and the system can be written as  (1)  (1) a1. the last of these equations to the nth of the original system. n.1 1 .1 These n − 1 equations are equal to the first row of the system after a multiplication by a scalar.1 a1.1 x1 + j=2 (1) xj = (1) b1 i = 2. . . . . −an.. . .j (1) −ai.1 (1) (1) (1) x1 + nj=2 ai. Let us consider the first equation of the system. .2 ⎟ ⎝ . .1 a1.2 .1 (1) −ai.j xj = b1(1) (1)  (1) ai.1 1 .1 1 .j xj = bi(1) i = 2. . . a1. . .1 x1 + nj=2 a1. Let us divide this equation by (1) a1.. .j jx = b i i = 2.1 x1 + nj=2 ai.j xj = bi − (1) ai... .n ⎜ (2) (2) ⎟ ⎜ . . respectively: (1) n (1) (1) −ai. .1 a1. Thus. 3.j xj − ai. n a1.j (1) = ai. n. . a2. . .. .

.j − (2) a2. one by one.2 and sum them.2 a2. n ai.n . a3. . . .2 a2. . .j (2) = ai. n. .2 (2) −ai.1 a1.n ⎜ (2) (2) (2) ⎟ ⎜ 0 a2.n ⎟ (3) ⎜ ⎟ A =⎜ 0 (3) (3) ⎟ . n.2 a(2) 2 2..j xj = bi i = 3.2 x2 + j=3 ai. to the last n − 2 equations of the system at the second step ⎧  (2) (2) ⎪ ⎨ a2. (2) ai. 0 a3. ..2 Let us multiply the second equation by − (2) for i = 2. ..3 . .j xj = b2(2) n (3) (3) j=2 ai. 4. 4. (2) a2.j xj = bi(2) i = 3. . . .. ..j xj = b2(2) (2) n (2) ai. . n..3 . . n where ⎧ (2) (2) ⎪ (3) ⎨ai. We can re-write this system as  (2) n (2) a2.2 x2+ j=3 a2. ..j xj = b2(2)   n (2) (2) (2) (2) ⇒ ⎪ xj = bi(2) − (2) b2(2) i = 3. a1.2 x2 + j=3 a1. . . ⎠ (3) (3) 0 0 an.2 b(2) a2. .2 n − 2 new equations (2) n (2) (2) −ai. .2 ⎪ ⎩bi(3) = bi(2) − (2) ai.3. an. .j −ai.2 a2. . 3..2 x2 + (2) xj = i = 3.2 x2 + nj=3 a2. . j=3 a2.j (2) a2.2 a1. . .2 ⎩ j=3 ai.n ⎟ ⎜ ⎝ .2 a2.3 . . .2  (2) n (2) a2.3 . . . 4 . .j xj = bi(2) j=2 for i = 2.2 b2(2) .. n thus generating a2. . . .j − ai.3 Direct Methods 59 We can now repeat the same steps for the system of n − 1 equations in n − 1 unknowns  n (2) ai. 4. . a2.. 3.j ai.2 The matrix associated to the system at the third step becomes ⎛ (1) (1) (1) (1) ⎞ a1.

⎠ ⎟ rn and.k (k) ai.. . . this system is triangular and can be easily solved...2 a1. a2.. ...n ⎟ ⎜ x2 ⎟ ⎜ b2(2) ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟ ⎜ 0 (3) (3) ⎟ ⎜ x ⎟ = ⎜ (3) ⎟ . . ⎜ 0 a3. n ak.⎠ ⎝ . j = k + 1. .1 Row Vector Notation for Gaussian Elimination Let us know write an equivalent formulation of the Gaussian transformation by using the row vector notation.... .. The generic Gaussian transformation formulas at the step k can be thus written as: (k) (k+1) (k) ai.60 3 Systems of Linear Equations By repeating these steps for all the rows we finally reach a triangular system in the form ⎛ (1) (1) (1) (1) ⎞⎛ ⎞ ⎛ ⎞ a1.n x1 b1(1) ⎜ (2) (2) (2) ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ 0 a2. .3. .. ⎠ rn(1) . . to emphasize that we are working at the step one we can write the complete matrix as ⎛ (1) ⎞ r1 ⎜ (1) ⎟ ⎜ ⎟ Ac(1) = ⎜ r2 ⎟ . . ..3 .3 .3 . .k (k) bi(k+1) = bi(k) − (k) ai.n ⎟ ⎜ 3 ⎟ ⎜ b3 ⎟ ⎝ . a1. .j − (k) ak.1 a1. . ak.1. .. an.j = ai.. Let us consider a system of linear equations in a matrix formulation: Ax = b and let us write the complete matrix Ac in terms of its row vectors ⎛⎞ r1 ⎜ r2 ⎟ Ac = ⎜ ⎝.. .j i. . . ⎠ 0 0 0 .. . a3. ⎝ .n (n) xn bn(n) As shown above.k (k) bk i = k + 1. ⎠⎝.. n. .2 a2.k 3..

2 ..  (1)  −an.3.1  (1)  −a3.. .. .. .1 After the application of these steps the complete matrix can be written as ⎛ (2) (2) (2) (2) ⎞ a1. . ⎝ .n bn The Gaussian transformation to obtain the matrix at the step (3) are: r1(3) = r1(2) r2(3) = r2(2)  (1)  −a3. . .3 Direct Methods 61 The Gaussian transformation to obtain the matrix at the step (2) are: r1(2) = r1(1)  (1)  −a2. ⎠ (2) (2) (2) 0 an..... . . a2.n b2(3) ⎟ ⎜ ⎟ A c(3) =⎜ ⎜ 0 0 .1 rn(2) = rn(1) + (1) r1(1) .2 . . .n (3) ⎟ b3(3) ⎟ .2 r3(3) (2) = r3 + (2) r2(2) a2. a1. a1. . an. .n b1 ⎜ (2) (2) (2) ⎟ ⎜ ⎟ Ac(2) = ⎜ 0 a2. .1 r3(2) (1) = r3 + (1) r1(1) a1.. ...1 a1.2 . .1 .. . .2 rn(3) = rn(2) + (2) r2(2) a2.2(3) (3) . an.. ..2 . a3... ⎠ (3) 0 0 .  (2)  −an...1 a1.2 ..2 which leads to the following complete matrix ⎛ (3) (3) (3) ⎞ a1.1 r2(2) = r2(1) + (1) r1(1) a1. . ..n bn(3) . . ⎜ ⎟ ⎝ .n b1(3) ⎜ ⎟ ⎜ 0 a2.n b2 ⎟ . a1.. ... a2..

1 r3(2) = r3(1) + (1) r1(1) = r3(1) + 0r1(1) a1. −1 −2 0 −8 ⎠ 3 −3 −2 4 7 Let us apply the Gaussian transformations to move to the step (2) r(2) = r(1)  1 (1)  1 −a2. ⎪ ⎪ −x − 2x = −8 ⎪ ⎩ 2 3 3x1 − 3x2 − 2x3 + 4x4 = 7 The associated complete matrix is ⎛ ⎞ 1 −1 −1 1 0 ⎜2 0 2 0 8 ⎟ Ac(1) = (A|b) = ⎜ ⎝0 ⎟..1  (1)  −a4.  (k)  −an.1 r2(2) = r2(1) + (1) r1(1) = r2(1) − 2r1(1) a1.1  (1)  −a3.k  (k)  (k+1) (k) −ak+2...k rk+1 = rk+1 + (k) rk(k) ak.1 . Example 3.62 3 Systems of Linear Equations At the generic step (k + 1) the Gaussian transformation formulas are r1(k+1) = r1(k) r2(k+1) = r2(k) .17 Let us apply the Gaussian elimination to solve the following system of linear equations: ⎧ ⎪ ⎪ x1 − x2 − x3 + x4 = 0 ⎪ ⎨2x + 2x = 8 1 3 .k which completes the description of the method.k rn(k+1) = rn(k) + (k) rk(k) ak. (k+1) rk = rk(k)  (k)  (k+1) (k) −ak+1.k .k rk+2 = rk+2 + (k) rk(k) ak..1 r4(2) (1) = r4 + (1) r1(1) = r4(1) − 3r1(1) a1.

3. that is a .2 Pivoting Strategies and Computational Cost From the Gaussian transformation formulas in Sect. 0 −1 −4 ⎠ 0 0 1 1 7 We would need one more step to obtain a triangular matrix.2 thus obtaining the following complete matrix ⎛ ⎞ 1 −1 −1 1 0 ⎜ 0 2 4 −2 8 ⎟ Ac(2) = (A|b) = ⎜ ⎝0 0 ⎟. ⎟ 0 0 0 −1 −4 From the system of linear equations associated to this matrix we can easily find that x4 = 4. 3. after two steps the matrix is already triangular. . The condition that all the pivotal elements must be non-null is not an actually limiting condition. x2 = 2. . 3. in this case. n.1 r4 = r4 + (2) r2(2) = r4(1) + 0r2(2) a2. 0 −1 −2 0 −8 ⎠ 0 0 1 1 7 Let us apply the Gaussian transformations to move to the step (2) r1(3) = r1(2) r(3) = r(2)  2 (2)  2 −a3. .8. However. the matrix A can be transformed into an equivalent staircase matrix. . x3 = 3. it follows that the Gaussian (k) elimination can be applied only when ak.2 r3(3) = r3(2) + (2) r2(2) = r3(2) + 21 r2(2) a2. and x1 = 1. These elements are said pivotal elements of the triangular matrix. 2.3. For the Theorem 3. It is enough to swap the third and fourth rows to obtain ⎛ ⎞ 1 −1 −1 1 0 ⎜ 0 2 4 −2 8 ⎟ Ac(2) = (A|b) = ⎜ ⎝0 0 1 1 7 ⎠.3 Direct Methods 63 thus obtaining the following complete matrix ⎛ ⎞ 1 −1 −1 1 0 ⎜0 2 4 −2 8 ⎟ A c(2) ⎜ = (A|b) = ⎝ ⎟.3.1.k = 0 for k = 1.2  (2)  (2) (1) −a4.

k  = max ai. i. ⎪ ⎩ 50x1 − x2 − 2x3 = −8 The associated complete matrix at the step 1 is ⎛ ⎞ 0 1 −1 0 Ac(1) = (A|b) = ⎝ 2 0 6 10 ⎠ . For this reasons.18 Let us consider the following system of linear equation ⎧ ⎪ ⎨x2 − x3 = 4 2x1 + 6x3 = 10 . Example 3. that is that row having in the first column the coefficient having maximum absolute value.64 3 Systems of Linear Equations triangular matrix in the square case. If the matrix is non-singular.e. the product of the diagonal elements. we can swap the rows in order to be able to apply the Gaussian transformation. Nonetheless. at first the element ar. the system is determined. the rth and kth rows are swapped.1  (1)  (2) (1) −a3. must be non- null. a non-singular matrix can always be re-arranged as a triangular matrix displaying non-null diagonal elements.1 r3 = r3 + (1) r1(1) . 50 −1 −2 −8 (1) We cannot apply the Gaussian transformation because a1.1 r2(2) = r2(1) + (1) r1(1) a1. In other words.k is the maximum in absolute value. Then. 0 1 −1 0 Now we can apply the first step of Gaussian elimination: r1(2) = r1(1)  (1)  −a2. The partial pivoting at the step 1 consists of swapping the first and the third rows. Two simple strategies are here illustrated. i. This means that a system of linear equations might need a preprocessing strategy prior to the application of the Gaussian elimination. The first one. the determinant. namely partial pivoting consists of swapping the kth row of the matrix at the step k with that row such that the element in the kth column below (k) (k) ak.1 .1 = 0 and we cannot perform a division by zero.k      (k)   (k)  ar. a1.e.k  k≤i≤n is found. The resulting matrix is ⎛ ⎞ 50 −1 −2 −8 Ac(1) = (A|b) = ⎝ 2 0 6 10 ⎠ .

3. in order to pass from the matrix A(k) to the matrix A(k+1) . Considering that a triangular system requires n2 operations.3.4 104 arithmetic operations. see [5.  n−1 n−1 3 (n − k)2 + 2 (n − k) k=1 k=1 arithmetic operations are totally required. a perturbation of the sequence of variables. 3 (n − k)2 arithmetic operations are required while to pass from the vector b(k) to the vector b(k+1) .3 Direct Methods 65 Another option is the total pivoting. It can be proved that Gaussian elimination requires about n3 arithmetic operations to detect the solution. On the other hand. Thus. This means that a modern computer can solve this problem in about one thousandth of second. total pivoting performs a swap of the columns. Hence.s i. 6 2 3 2 6 Thus. As a conclusion of this overview on Gaussian elimination. in the example of a system of 50 linear equations in 50 variables the solution is found after about 8. Total pivoting guarantees that pivotal elements are not small numbers and thus that the multipliers are not big numbers. the elements of the solution vector must be rearranged to obtain the original sequence. It follows that if the total pivoting is applied. This strategy at first seeks for the indices r and s such that  (k)    a  = max a(k)  r. i. we obtain .3. the question that could be posed is: “What is the advantage of Gaussian elimination with respect to the Cramer’s Method ?”.3 LU Factorization Equivalently to the Gaussian elimination.j≤n and then swaps rth and kth rows as well as sth and kth columns. the LU factorization is a direct method that transforms a matrix A into a matrix product LU where L is a lower triangular matrix having the diagonal elements all equal to 1 and U is an upper triangular matrix. 2 (n − k) arithmetic operations are required.j k≤i. 7]. if we aim at solving a system of linear equations Ax = b. if we neglect the pivotal strategy. More specifically. in order to determine the matrix A(n) starting from A and to determine the vector b(n) starting from b. it can be easily proved that the total amount of arithmetic operations necessary to solve a system of linear equations by means of the Gaussian method is   n (n − 1) (2n − 1) n (n − 1) 2 3 7 2 +3 + n2 = n3 + n2 − n.e.

66 3 Systems of Linear Equations

Ax = b ⇒
⇒ LUx = b.

If we pose Ux = y, we solve at first the triangular system Ly = b and then extract
x from the triangular system Ux = y.
The main advantage of factorizing the matrix A into LU with respect to Gaussian
elimination is that the method does not alter the vector of known terms b. In appli-
cations, such as modelling, where the vector of known terms can vary (e.g. if comes
from measurements), a new system of linear equation must be solved. Whereas
Gaussian elimination would impose the solution of the entire computational task,
LU factorization would require only the last steps to be performed again since the
factorization itself would not change.
The theoretical foundation of the LU factorization is given in the following theo-
rem.

Theorem 3.10 Let A ∈ Rn,n be a non-singular matrix. Let us indicate with Ak
the submatrix having order k composed of the first k rows and k columns of A. If
det Ak = 0 for k = 1, 2, . . . , n then ∃! lower triangular matrix L having all the
diagonal elements equal to 1 and ∃! upper triangular matrix U such that A = LU.

Under the hypotheses of this theorem, every matrix can be decomposed into the
two triangular matrices. Before entering into implementation details let consider the
LU factorization at the intuitive level throughout the next example.

Example 3.19 If we consider the following system of linear equations


⎨x + 3y + 6z = 17
2x + 8y + 16z = 42


5x + 21y + 45z = 91

and the corresponding incomplete matrix A
⎛ ⎞
1 3 6
A = ⎝ 2 8 16 ⎠ ,
5 21 45

we can impose the factorization A = LU. This means
⎛ ⎞ ⎛ ⎞⎛ ⎞
1 3 6 l1,1 0 0 u1,1 u1,2 u1,3
A = ⎝ 2 8 16 ⎠ = ⎝ l2,1 l2,2 0 ⎠ ⎝ 0 u2,2 u2,3 ⎠ .
5 21 45 l3,1 l3,2 l3,3 0 0 u3,3

If we perform the multiplication of the two matrices we obtain the following
system of 9 equations in 12 variables.

3.3 Direct Methods 67


⎪ l1,1 u1,1 = 1



⎪ l1,1 u1,2 = 3





⎪ l 1,1 u1,3 = 6



⎨l2,1 u1,1 = 2
l2,1 u1,2 + l2,2 u2,2 = 8


⎪l2,1 u1,3 + l2,2 u2,3 = 16




⎪ l3,1 u1,1 = 5





⎪ l3,1 u1,2 + l3,2 u2,2 = 21


l3,1 u1,3 + l3,2 u2,3 + l3,3 u3,3 = 45.

Since this system has infinite solutions we can impose some extra equations. Let
us impose that l1,1 = l2,2 = l3,3 = 1. By substitution we find that


⎪ u1,1 = 1



⎪ u 1,2 = 3



⎪ u1,3 = 6





⎨ 2,1 = 2
l
u2,2 = 2



⎪ u2,3 = 4



⎪l3,1 = 5





⎪ l3,2 = 3


u3,3 = 3.

The A = LU factorization is then
⎛ ⎞ ⎛ ⎞⎛ ⎞
1 3 6 100 136
⎝ 2 8 16 ⎠ = ⎝ 2 1 0 ⎠ ⎝ 0 2 4 ⎠ .
5 21 45 531 003

In order to solve the original system of linear equations Ax = b we can write

Ax = b ⇒ LUx = b ⇒ Lw = b.

where Ux = w.
Let us solve first Lw = b, that is
⎛ ⎞⎛ ⎞ ⎛ ⎞
100 w1 17
⎝ 2 1 0 ⎠ ⎝ w2 ⎠ = ⎝ 42 ⎠ .
531 w3 91

68 3 Systems of Linear Equations

Since this system is triangular, it can be easily solved by substitution and its
solution is w1 = 17, w2 = 8, w3 = −18. With these results, the system Ux = w
must be solved:
⎛ ⎞⎛ ⎞ ⎛ ⎞
136 x 17
⎝0 2 4⎠⎝y⎠ = ⎝ 8 ⎠
003 z −18

which, by substitution, leads to z = −6, y = 16, and x = 5, that is the solution to
the initial system of linear equations by LU factorization.

Let us now derive the general transformation formulas. Let A be
⎛ ⎞
a1,1 a1,2 . . . a1,n
⎜ a2,1 a2,2 . . . a2,n ⎟
A=⎜ ⎝ ... ... ... ... ⎠

an,1 an,2 . . . an,n

while L and U are respectively
⎛ ⎞
0 1 ... 0
⎜ l2,1 1 ... 0 ⎟
L=⎜
⎝... ...

... ...⎠
ln,1 ln,2 ... 1
⎛ ⎞
u1,1 u1,2 ... u1,n
⎜ 0 u2,2 ... u2,n ⎟
U=⎜
⎝ ...
⎟.
... ... ... ⎠
0 0 ... un,n

If we impose A = LU we obtain


n 
min(i,j)
ai,j = li,k uk,j = li,k uk,j
k=1 k=1

for i, j = 1, 2, . . . , n.
In the case i ≤ j, i.e. in the case of the triangular upper part of the matrix we have


i 
i−1 
i−1
ai,j = li,k uk,j = li,k uk,j + li,i ui,j = li,k uk,j + ui,j
k=1 k=1 k=1

3.3 Direct Methods 69

This equation is equivalent to

i−1
ui,j = ai,j − li,k uk,j
k=1

that is the formula to determine the elements of U.
Let us consider the case j < i, i.e. the lower triangular part of the matrix


j

j−1
ai,j = li,k uk,j = li,k uk,j + li,j uj,j .
k=1 k=1

This equation is equivalent to
 
1 
j−1
li,j = ai,j − li,k uk,j
uj,j
k=1

that is the formula to determine the elements of L.
In order to construct the matrices L and U, the formulas to determine their elements
should be properly combined. Two procedures (algorithms) are here considered.
The first procedure, namely Crout’s Algorithm, consists of the steps illustrated in
Algorithm 1.
In other words, the Crout’s Algorithm computes alternately the rows of the two
triangular matrices until the matrices have been filled. Another popular way to full
the matrices L and U is by means the so called Doolittle’s Algorithm. This procedure
is illustrated in Algorithm 2 and consists of filling the rows of U alternately with the
columns of L.

Example 3.20 Let us apply the Doolittle’s Algorithm to perform LU factorization
of the following matrix

Algorithm 1 Crout’s Algorithm
compute the first row of U
compute the second row of L
compute the second row of U
compute the third row of L
compute the third row of U
compute the fourth row of L
compute the fourth row of U
...

70 3 Systems of Linear Equations

Algorithm 2 Doolittle’s Algorithm
compute the first row of U
compute the first column of L
compute the second row of U
compute the second column of L
compute the third row of U
compute the third column of L
compute the fourth row of U
...

⎛ ⎞
1 −1 3 −4
⎜2 −3 9 −9 ⎟
A=⎜
⎝3
⎟.
1 −1 −10 ⎠
1 2 −4 −1

At the first step, the first row of the matrix U is filled by the formula

0
u1,j = a1,j − l1,k uk,j
k=1

for j = 1, 2, 3, 4.
This means
u1,1 = a1,1 = 1
u1,2 = a1,2 = −1
u1,3 = a1,3 = 3
u1,4 = a1,4 = −4.

Then, the first column of L is filled by the formula
 
1 
0
li,1 = ai,1 − li,k uk,1
u1,1
k=1

for i = 2, 3, 4.
This means a2,1
l2,1 = u1,1
=2
a3,1
l3,1 = u1,1
=3
a4,1
l4,1 = u1,1
= 1.

Thus, the matrices L and U at the moment appear as
⎛ ⎞
1 0 0 0
⎜2 1 0 0⎟
L=⎜ ⎝ 3 l3,2 1 0 ⎠ .

1 l4,2 l4,3 1

3.3 Direct Methods 71

and
⎛ ⎞
1 −1 3 −4
⎜0 u2,2 u2,3 u2,4 ⎟
U=⎜
⎝0

0 u3,3 u3,4 ⎠
0 0 0 u4,4

Then, the second row of the matrix U is found by applying


1
u2,j = a2,j − l2,k uk,j = a2,j − l2,1 u1,j
k=1

for j = 2, 3, 4.
This means
u2,2 = a2,2 − l2,1 u1,2 = −1
u2,3 = a2,3 − l2,1 u1,3 = 3
u2,4 = a2,4 − l2,1 u1,4 = −1.

The second column of L is given by
 
1 
1
1

li,2 = ai,2 − li,k uk,2 = ai,2 − li,1 u1,2
u2,2 u2,2
k=1

for i = 3, 4.
This means
a3,2 −l3,1 u1,2
l3,2 = u2,2
= −4
a4,2 −l4,1 u1,2
l4,2 = u2,2
= −3.

The third row of the matrix U is given by


2
u3,j = a3,j − l3,k uk,j = a3,j − l3,1 u1,j − l3,2 u2,j
k=1

for j = 3, 4.
This means
u3,3 = 2
u3,4 = −2.

In order to complete the matrix L, we compute
 
1 
2
li,3 = ai,1 − li,k uk,3
u3,3
k=1

72 3 Systems of Linear Equations

for i = 4, i.e. l4,3 = 1.
Finally, the matrix U is computed by


3
u4,j = a4,j − l4,k uk,j
k=1

for j = 4, i.e. u4,4 = 2.
Thus, the L and U matrices are
⎛ ⎞
1 0 0 0
⎜2 1 0 0⎟
L=⎜
⎝3
⎟.
−4 1 0⎠
1 −3 1 1

and
⎛ ⎞
1 −1 3 −4
⎜0 −1 3 −1 ⎟
U=⎜
⎝0
⎟.
0 2 −2 ⎠
0 0 0 2

3.3.4 Equivalence of Gaussian Elimination and LU
Factorization

It can be easily shown that Gaussian elimination and LU factorization are essentially
two different implementations of the same method. In order to remark this fact it
can be shown how a LU factorization can be performed by applying the Gaussian
elimination.
Let Ax = b be a system of linear equations with
⎛ ⎞
a1,1 a1,2 ... a1,n
⎜ a2,1 a2,2 ... a2,n ⎟
A=⎜
⎝ ...
⎟.
... ... ... ⎠
an,1 an,2 ... an,n

Let us apply the Gaussian elimination to this system of linear equation. Let us
indicate with Gt the triangular incomplete matrix resulting from the application of
the Gaussian elimination:
⎛ ⎞
g1,1 g1,2 . . . g1,n
⎜ 0 g2,2 . . . g2,n ⎟
Gt = ⎜⎝ ... ... ... ... ⎠.

0 0 . . . gn,n

3.3 Direct Methods 73

In order to obtain the matrix Gt from the matrix A, as shown in Sect. 3.3.1.1,
linear combinations of row vectors must be calculated by means of weights
 (1)
  (1)
  (2)

a2,1 a3,1 a3,2
(1)
, (1)
,... (2)
....
a1,1 a1,1 a2,2

Let us arrange these weights in a matrix in the following way
⎛ ⎞
1 0 ... ... 0
 
⎜ a2,1 (1) ⎟
⎜ (1) 1 ... ... 0 ⎟
⎜ a1,1 ⎟
⎜    ⎟
⎜ a(1) (2) ⎟
Lt = ⎜
⎜ a(1)
3,1 a3,2
(2) 1 ... 0 ⎟⎟.
⎜ 1,1 a2,2 ⎟
⎜ ... ... ...⎟
⎜   . . . . . . ⎟
⎝ a(1) (2)
an,2

n,1
(1) (2) ... ... 1
a1,1 a2,2

It can be easily shown that
A = Lt Gt

and thus Gaussian elimination implicitly performs LU factorization where U is the
triangular Gaussian matrix and L is the matrix Lt of the Gaussian multipliers.
Let us clarify this fact by means of the following example.

Example 3.21 Let us consider again the system of linear equations


⎨x + 3y + 6z = 17
2x + 8y + 16z = 42


5x + 21y + 45z = 91

and its corresponding incomplete matrix A
⎛ ⎞
1 3 6
A = ⎝ 2 8 16 ⎠ ,
5 21 45

Let us apply Gaussian elimination to obtain a triangular matrix. At the first step

r1(1) = r1(0) = 1 3 6
r2(1) = r2(0) − 2r1(1) = 0 2 4
r3(1) = r3(0) − 5r1(1) = 0 6 15

to the exact solution of the system of linear equations in infinite steps. 531 It can be easily verified that ⎛ ⎞⎛ ⎞ ⎛ ⎞ 100 136 1 3 6 Lt Gt = ⎝ 2 1 0 ⎠ ⎝ 0 2 4 ⎠ = ⎝ 2 8 16 ⎠ = A.4 Iterative Methods These methods. Let us apply the second step: r1(2) = r1(1) = 1 3 6 r2(2) = r2(1) = 0 2 4 r3(2) = r3(2) − 3r2(2) = 0 0 3 which leads to the following matrices ⎛ ⎞ 136 Gt = ⎝ 0 2 4 ⎠ 003 and ⎛ ⎞ 100 Lt = ⎝ 2 1 0 ⎠ . 5#1 where # simply indicates that there is a nun-null element which has not been calcu- lated yet. 531 003 5 21 45 3. . Unlike direct methods that converge to the theoretical solution in a finite time. For this reason. iteratively apply some formulas to detect the solution of the system. under some conditions. these methods are often indicated as iterative methods. iterative methods are approximate since they converge. starting from an initial guess x(0) .74 3 Systems of Linear Equations which leads to ⎛ ⎞ 13 6 ⎝0 2 4 ⎠ 0 6 15 and to a preliminary Lt matrix ⎛ ⎞ 100 ⎝2 1 0⎠.

if the limk→∞ |x(k) − c| grows indefinitely then the method is said to diverge. Let us consider a non-singular matrix M and write the following equation: b − Ax = O ⇒ Mx + (b − Ax) = Mx ⇒ ⇒ M−1 (Mx + (b − Ax)) = M−1 Mx ⇒ −1 −1 ⇒ x = M Mx + M (b − Ax) = M−1 Mx + M−1 b − M−1 Ax ⇒ . and the approximated solution at the kth step x(k) . the solution of the system c.11 (Convergence of an iterative method. the starting solution x(0) . However. it can be expressed as b − Ax = O.) If we consider a system of linear equations Ax = b. then an approximated method is said to converge to the solution of the system when lim |x(k) − c| = O k→∞ On the contrary. it must be remarked that both direct and iterative methods perform sub- sequent steps to detect the solution of systems of linear equations. Definition 3. All Iterative methods are characterized by the same structure. If Ax = b is a system of linear equations. iter- ative methods progressively manipulate a candidate solution.4 Iterative Methods 75 Furthermore.3. while direct methods progressively manipulate the matrices (complete or incomplete).

this section gives some examples of simple iterative methods. An iterative method converges to the solution of a system of linear equations for every initial guess x(0) . the an iterative method is characterized by the update formula x = Hx + t and if we emphasize that this is an update formula x(k+1) = Hx(k) + t. While a thorough study of iterative methods in not a scope of this book. . ⇒ x = I − M−1 A x + M−1 b. If we replace the variables as H = I − M−1 A and t = M−1 b. In this formulation the convergence condition can be written easily. 9. The meaning and calculation procedure for eigenvalues are given in Chap. if and only if the absolute value of the maximum eigenvalue of matrix H is < 1.

76 3 Systems of Linear Equations 3. .j xj 1 an.1 x1(0) + a2. + a x (1) = b n. . Let us consider a system of linear equations Ax = b with A non-singular.n n n The system of linear equation can be rearranged as: ⎧ (k+1)    ⎪ ⎪ x1 = b1 − nj=1.n xn = b1 (0) ⎪ ⎪ ⎨ a2. .1 1 n. b = O.n n n At the generic step k.2 1 ⎪ ⎪.1 x1(k) + a2.2 x2 + .j=1 a1..j xj(k) a1. the system can be written as: ⎧ (1) (0) ⎪ ⎪a1. 2.1 x1(k+1) + a1.2 x2(k) + .j=i ai. .1 1 n. + a2. . The next example clarifies the implementation of the method in practice. .. .2 2 n. .2 x2(1) + . . ⎪ ⎪   ⎪ ⎩x (k+1) = b − n (k) n n j=1.. . .j=n an. Let us indicate with ⎛ ⎞ x1(0) ⎜ x (0) ⎟ x(0) = ⎜ ⎟ ⎝ .n .j xj(k) a2.i Jacobi’s method is conceptually very simple and also easy to implement.4. + a x (k+1) = b n. The method is named after Carl Gustav Jacob Jacobi (1804-1851). . .n xn(k) = b1 ⎪ ⎨ a2.2 2 n. ⎪ ⎪ ⎩a x (0) + a x (0) + . j=1. + a1. . ⎠ xn(0) the initial guess. ⎪ ⎪ .n xn(0) = b2 .. ⎪ ⎪ ⎩a x (k) + a x (k) + . Jacobi’s method simply makes use of this manipulation to iteratively detect the solution.n xn(k) = b2 . the system can be written as: ⎧ ⎪ ⎪ ⎪ a1.j xj(k) ⎠ .1 x1 + a1. + a1.1 Jacobi’s Method The Jacobi’s method is the first and simplest iterative method illustrated in this chapter.1 1 ⎪ ⎪   ⎪ ⎨ (k+1)  x2 = b2 − nj=1. The generic update formula for the ith variable is ⎛ ⎞  n 1 xi(k+1) = ⎝bi − ai.j=2 a2. and A does not display zeros on its main diagonal. ⎪ ⎪ . .2 x2(k+1) + . + a2. At the first step. .

Let us solve now the same system by means of the application of Jacobi’s method. The solution by the application of Cramer’s method is x = 334 49 y = − 334 13 z = − 167 65 .22 The following system of linear equations ⎧ ⎪ ⎨10x + 2y + z = 1 10y − z = 0 ⎪ ⎩ x + y − 10z = 4 is determined as the matrix associated to it is non-singular and has determinant equal to −1002. Let us write the update formulas at first: .4 Iterative Methods 77 Example 3.3.

x (k+1) = 10 1 1 − 2y(k) − z(k) .

(k) y(k+1) = 10 1 z (k+1) .

z = − 10 4 − x (k) − 10y(k) 1 and take our initial guess x (0) = 0 y(0) = 0 z(0) = 0. Let us now apply the method .

x (1) = 10 1 1 − 2y(0) − z(0) = 0.1 .

(0) y(1) = 10 1 z =0 .

(1) 1 These three values are use to calculate x (2) . y(2) . z(2) : .4. z = − 10 4 − x (0) − 10y(0) = −0.

x (2) = 1 1 − 2y(1) − z(1) = 0.14 .

04 10 . (1) 10 y(2) = 1 z = −0.

We can apply Jacobi’s method iteratively and obtain x (3) = 0. z(2) = − 10 1 4 − x (1) − 10y(1) = −0.147 y(3) = −0.39.39.039 z(3) = −0. .

. . ⎟ an.. ⎠ 0 0 . x (4) = 0. This solution already gives an approximation of the exact solution. . an... At the step (10) we have x (10) = 0.....146707 y(10) = −0.. ... ..... .n ⎟ F=⎜ ⎝. a1. ⎟ ....14672 y(4) = −0.3892.........⎠. 0 ⎜ a2. 0 and ⎛ ⎞ a1.. 0 ⎟ D=⎜ ⎝ .2 .n .1 0 . ⎠ 0 0 ..039 z(3) = −0. Jacobi’s method can be expressed also in a matrix form.1468 y(3) = −0.389220. The solution at the step (4) approximately solves the system.1 an.78 3 Systems of Linear Equations x (3) = 0. 0 ⎜ 0 a2. . . ⎟.. . ...389222 which is about 10−9 distant from the exact solution.2 .03892 z(4) = −0. .038922 z(10) = −0.1 0 .2 . If we indicate with ⎛ ⎞ 0 0 . 0 ⎛ ⎞ 0 a1. We can check it by substituting these numbers into the system: 10x (4) + 2y(4) + z(4) = 1... .n ⎜ 0 0 ... .. 0 ⎟ E=⎜ ⎝ ..0001 10y(4) − z(4) = 0. a2.00002 x (4) + y(4) − 10z(4) = 4..

4 Iterative Methods 79 the system of linear equation can be written as Ex(k) + Fx(k) + Dx(k+1) = b. the equation can be written as . that is x(k+1) = −D−1 (E + F) x(k) + D−1 b. Considering that E + F = A − D.3.

Example 3.23 The system of linear equations related to the previous example can be written in a matrix form Ax = b. in the case of Jacobi’s method H = I − D−1 A and t = D−1 b. x(k+1) = I − D−1 A x(k) + D−1 b. Considering that ⎛ ⎞ 000 E = ⎝0 0 0⎠. Hence. 0 0 −10 we can calculate the inverse D−1 : ⎛ 1 ⎞ 10 00 D−1 = ⎝ 0 10 1 0 ⎠. 110 ⎛ ⎞ 02 1 F = ⎝ 0 0 −1 ⎠ 00 0 and ⎛ ⎞ 10 0 0 D = ⎝ 0 10 0 ⎠ . 0 0 − 10 1 The matrix representation of the Jacobi’s method means that the vector of the solution is updated according to the formula ⎛ ⎞ ⎛ (k) ⎞ x (k+1) x ⎝ y(k+1) ⎠ = H ⎝ y(k) ⎠ + t z(k+1) z(k) .

80 3 Systems of Linear Equations where .

038922 z(10) = −0.14 ⎜ (2) ⎟ ⎜ ⎟ 10 10 ⎝ y ⎠ = ⎝ 0 0 10 1 ⎠ ⎝ 0 ⎠ + ⎝ 0 ⎠ = ⎝ −0. the pseudocode of the Jacobi’s method is shown in Algo- rithm 3. 0 0 − 10 1 4 − 104 It can be verified that the iterative application of matrix multiplication and sum leads to an approximated solution. we have ⎛ ⎞ ⎛ ⎞⎛ ⎞ ⎛ 1 ⎞ ⎛ 1 ⎞ x (1) 0 − 15 − 10 1 0 10 10 ⎜ (1) ⎟ ⎜ 1 ⎟⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎝ y ⎠ = ⎝ 0 0 10 ⎠⎝0⎠ + ⎝ 0 ⎠ = ⎝ 0 ⎠. For the sake of clarity. ⎛ ⎞ ⎛ ⎞⎛ x (2) 0 − 15 − 10 1 1 ⎞ ⎛ 1 ⎞ ⎛ ⎞ 0. H = I − D−1 A = ⎛ ⎞ ⎛ 1 ⎞⎛ ⎞ 100 10 0 0 10 2 1 = ⎝ 0 1 0 ⎠ − ⎝ 0 101 0 ⎠ ⎝ 0 10 −1 ⎠ = 001 0 0 − 10 1 1 1 −10 ⎛ ⎞ 0 − 5 − 10 1 1 = ⎝ 0 0 10 1 ⎠ 1 1 10 10 0 and t =⎛D−1 b = ⎞ ⎛ ⎞ ⎛ ⎞ 1 1 10 0 0 1 10 = ⎝ 0 101 0 ⎠⎝0⎠ = ⎝ 0 ⎠. Obviously the two ways of writing Jacobi’s method lead to the same results (as they are the same thing).04 ⎠ . z(1) 1 1 10 10 0 0 − 10 4 − 10 4 Then.39 10 10 If we keep on iterating the procedure we obtain the same result at step (10): x (10) = 0. Considering our initial guess x (0) = 0 y(0) = 0 z(0) = 0.389222. . z (2) 1 1 0 − 10 4 − 10 4 −0.146707 y(10) = −0.

⎪ n j=1 i an.j xj end if end for yi = a1i. The system is ⎧ ⎪ ⎨10x + 2y + z = 1 10y − z = 0 ⎪ ⎩ x + y − 10z = 4.1 1 a2..j x (k+1) 1 .i (bi − s) end for x=y end while 3.1 1 ⎪ ⎪ (k+1)  ⎪ ⎪ n (k) (k+1)  ⎪ ⎪ x = b − a x − a x 1 ⎪ ⎪ 2 2 j=3 2.   ⎪   ⎪ ⎪ xi(k+1) = bi − nj=i+1 ai.2 ⎨ . With Gauss-Siedel’s method.j xj(k) a1..   ⎪ ⎩x (k+1) = bn − n−1 ai..3. the xi values replace the old ones as soon as they have been calculated.2 Gauss-Seidel’s Method The Gauss-Siedel’s method. .i ⎪ ⎪ j=1 ⎪ ⎪ ⎪ ⎪ .j xj(k) − i−1 ai.j xj(k+1) a1i. albeit simplistic.24 Let us now solve the same system of linear equations considered above by means of Gauss-Seidel’s method. Seidel (1821–1896) is a greedy variant of the Jacobi’s method. With the Jacobi’s method the update of x(k+1) occurs only when all the values xi(k+1) are available. named after Carl Friedrich Gauss (1777–1855) and Philipp L. Thus.j j 2. often (but not always) allows a faster convergence to the solution of the system.n Example 3. This variant. from a system of linear equations the formulas for the Gauss-Siedel’s method are: ⎧    (k+1) ⎪ ⎪x1 ⎪ = b1 − nj=2 a1.4 Iterative Methods 81 Algorithm 3 Jacobi’s Method input A and b n is the size of A while precision conditions do for i = 1 : n do s=0 for j = 1 : n do if j = i then s = s + ai..4.

82 3 Systems of Linear Equations Let us write the update formulas of Gauss-Seidel’s method at first: .

x (k+1) = 1 − 2y(k) − z(k) 1 .

(k) 10 y(k+1) = z 1 (k+1) .

10 z = − 10 4 − x (k+1) − 10y(k+1) . we have . 1 Starting from the initial guess x (0) = 0 y(0) = 0 z(0) = 0.

1 1 . x (1) = 1 − 2y(0) − z(0) = 0.

10 y(1) = z(0) = 0 1 .

10 z = − 10 4 − x (1) − 10y(1) = −0.39. (1) 1 Iterating the procedure we have .

139 1 . x (2) = 1 − 2y(1) − z(1) = 0.

039 . (1) 10 y(2) = z 1 = −0.

(2) 1 . 10 z = − 10 4 − x (2) − 10y(2) = −0.39.

1468 1 . x (3) = 1 − 2y(2) − z(2) = 0.

(2) 10 y(3) = z 1 = −0.039 .

38922. 10 z = − 10 4 − x (3) − 10y(3) = −0. (3) 1 .

146722 1 . x (4) = 1 − 2y(3) − z(3) = 0.

(3) 10 y(4) = z 1 = −0.038922 .

. (4) 1 At the step (10) the solution is x (10) = 0.146707 y(10) = −0.389222.10 z = − 10 4 − x (4) − 10y(4) = −0.389220.038922 z(10) = −0. which is at most 10−12 distant from the exact solution.

00 0 We can calculate the inverse matrix ⎛ 1 ⎞ 10 0 0 ⎜ ⎟ G−1 = ⎝ 0 1 10 0 ⎠ 1 1 10 10 − 10 1 .4 Iterative Methods 83 Let us re-write the Gauss-Siedel’s method in terms of matrix equations.....n ⎜ 0 0 . 0 we can write Ax = b ⇒ Gx(k+1) + Sx(k) = b ⇒ ⇒ x(k+1) = −G−1 Sx(k) + G−1 b.3. .1 an.2 . ⎠ 0 0 . a2. The matrices characterizing the method are ⎛ ⎞ 10 0 0 G = ⎝ 0 10 0 ⎠ .... If we pose ⎛ ⎞ a1..... . ..1 0 .2 ... 0 ⎜ a2.. 1 1 −10 ⎛ ⎞ 02 1 S = ⎝ 0 0 −1 ⎠ . . a1. Example 3. .25 Let us reach the same result of the system above by means of matrix formulation. 0 ⎟ G=⎜ ⎝ .. for the Gauss-Seidel’s method the general scheme of the iterative methods can be applied by posing H = −G−1 S and t = G−1 b. .1 a2.. ...n ⎟ ⎜ S=⎝ ⎟ .. . .. an. .n and ⎛ ⎞ 0 a1.. Hence. . ⎠ ⎟ an.2 . .

1 ⎠ ⎝ y(k) ⎠ + ⎝ 0 ⎠ z(k+1) 0 0. For the sake of clarity. The SOR method corrects the Gauss-Siedel method by including in the update formula a dependency on the tentative solution at the step . at the step (10) we have x (10) = 0.1 x 0.2 0. Algorithm 4 Gauss-Seidel’s Method input A and b n is the size of A while precision conditions do for i = 1 : n do s=0 for j = 1 : n do if j = i then s = s + ai.4. is a variant of the Gauss-Siedel’s method which has been designed to obtain a faster convergence of the original method.g.1 ⎝ y(k+1) ⎠ = − ⎝ 0 0 −0.39 If we apply iteratively this formula we have the same results above.i (bi − s) end for end while 3. the pseudocode of the Gauss-Seidel’s method is shown in Algorithm 4. briefly indicated as SOR.j xj end if end for xi = a1i.84 3 Systems of Linear Equations and write the update formula as ⎛ ⎞ ⎛ ⎞⎛ ⎞ ⎛ (k ⎞ ⎛ 1 ⎞⎛ ⎞ x (k+1 1 10 0 0 02 1 x 10 0 0 1 ⎜ (k+1 ⎟ ⎜ ⎟⎜ ⎟ ⎜ (k ⎟ ⎜ ⎟⎜ ⎟ ⎝y ⎠ = −⎝ 0 1 10 0 ⎠ ⎝ 0 0 −1 ⎠ ⎝ y ⎠ + ⎝ 0 10 0 ⎠ ⎝ 0 ⎠ 1 z(k+1 1 10 1 10 − 10 1 00 0 z(k 1 1 10 10 − 10 1 4 that is ⎛ ⎞ ⎛ ⎞ ⎛ (k) ⎞ ⎛ ⎞ x (k+1) 0 0.038922 z(10) = −0.3 The Method of Successive over Relaxation The Method of Successive Over Relaxation. see [8]. e.146707 y(10) = −0.389222.02 0 z(k) −0.

26 Let us solve again the system of linear equation ⎧ ⎪ ⎨10x + 2y + z = 1 10y − z = 0 ⎪ ⎩ x + y − 10z = 4.   ⎪   ⎪ ⎪ xi(k+1) = bi − nj=i+1 ai. ⎪   ⎪ ⎩x (k+1) = bn − n−1 ai.1 x ω + (1 − ω) x2(k) ⎪ ⎨ 2 j=3 j 1 a2.j x (k+1) ω + (1 − ω) x (k) . Let us pose ω = 0. if x(k) is the solution at the step k and xGS (k+1) is the update to the step k + 1 according to the Gauss-Siedel’s method.1  ⎪ ⎪ (k+1) n (k) (k+1) ⎪ ⎪ ⎪ x = b2 − a2. . The explicit update formula of the SOR method can be simply obtained from that of Gauss-Seidel’s method by adding the contribution due to x(k) : ⎧  n  (k+1) (k) ⎪ ⎪ ⎪ x = b1 − a j=2 1.2 . This time let us apply SOR method. Obviously if ω = 1 the SOR method degenerates into the Gauss-Siedel’s method.4 Iterative Methods 85 k.n n Example 3. ⎪ n j=1 j an. the update formula of the SOR method is: xSOR (k+1) = ωxGS (k+1) + (1 − ω) x(k) where ω is a parameter to be set. More specifically.j xj(k+1) aωi.. .i + (1 − ω) xi(k) ⎪ ⎪ j=1 ⎪ ⎪ ⎪.j x − a 2.9 and write the update equations .3.j xj(k) − i−1 ai..j jx ω + (1 − ω) x1(k) ⎪ ⎪ 1  a1.

1x (k) 10 .9 1 − 2y(k) − z(k) + 0. x (k+1) = 0.

9 z + 0. (k) y(k+1) = 0.1y(k) 10 .

z(k+1) = − 0. Let us start again from the initial guess x (0) = 0 y(0) = 0 z(0) = 0.1z(k) . and let us calculate for a few iterations the solution by the SOR method: .9 10 4 − x (k+1) − 10y(k+1) + 0.

x (1) = 1 1 − 2y(0) − z(0) + 0.1x (0) = 0.09 .

1y(0) = 0 10 . (0) 10 y(1) = 1 z + 0.

3519 . z(1) = − 10 1 4 − x (1) − 10y(1) + +0.1z(0) = −0.

86 3 Systems of Linear Equations Iterating the procedure we have .

x (2) = 0.9 1 − 2y(1) − z(1) + 0.1x (1) = 0.130671 10 .

1y(1) = −0. y(2) = 0.031671 10 .9 z(1) + 0.

z(2) = − 0.1z(1) = −0.9 10 4 − x (2) − 10y(2) + 0. .386280.

1x (2) = 0. x (3) = 1 1 − 2y(2) − z(2) + 0.143533 .

037932 10 .1y(2) = −0. (2) 10 y(3) = 1 z + 0.

z(3) = − 10 1 4 − x (3) − 10y(3) + 0. .1z(2) = −0.389124.

1x (3) = 0.146202 10 . x (4) = 0.9 1 − 2y(3) − z(3) + 0.

038814 10 .1y(3) = −0.9 z(3) + 0. y(4) = 0.

.146707 y(10) = −0. It can be observed that the best results (or better the fastest convergence) were obtained by Gauss-Seidel. 0 ⎜ a2.2 .1 0 . as well as the selection of the parameter value is discussed in the following sections.. If... . ⎟ an.9 10 4 − x (4) − 10y(4) + 0.. The reason behind the use of the SOR method is the parameter ω which allows an easy control of the performance of the method. ..... 0 . whose error is at the most in the order of 10−8 . ....1 an.n ⎜ 0 0 .. .389247.⎠. 0 ⎟ E=⎜ ⎝ ....... ⎠ 0 0 . ⎟.389222. as in the case of Jacobi’s method.. . we indicate with ⎛ ⎞ 0 0 . 0 ⎛ ⎞ 0 a1. z(4) = − 0.. .. This topic.. At the step (10) we have x (10) = 0..2 . .1z(3) = −0. a2.038922 z(10) = −0. a1. .n ⎟ F=⎜ ⎝. ..

. .. 0 ⎟ ⎜ D=⎝ ⎟ . If we consider the update index according to Gauss-Seidel’s method. we can write Ex(k+1) + Fx(k) + Dx(k+1) = b ⇒ .4 Iterative Methods 87 and ⎛ ⎞ a1..1 0 ....3.n we can indicate the system Ax = b as Ex + Fx + Dx = b. .. ⎠ 0 0 . ..2 ...... 0 ⎜ 0 a2.. an.

The SOR method corrects the formula above as . ⇒ x(k+1) = D−1 −Ex(k+1) − Fx(k) + b .

27 Let us write the matrix update formula of the SOR method for the system of linear equation above. x(k+1) = ωD−1 −Ex(k+1) − Fx(k) + b + (1 − ω) x(k) . Considering that ⎛ ⎞ 000 E = ⎝0 0 0⎠. 110 ⎛ ⎞ 02 1 F = ⎝ 0 0 −1 ⎠ 00 0 . the SOR is expressed in the general form of an iterative method. Example 3. Hence. we obtain   Dx(k+1) = ω −Ex(k+1) − Fx(k) + b + (1 − ω) Dx(k) ⇒   ⇒ (D + ωE) x(k+1) = ω −Fx(k) + b + (1 − ω) Dx(k) = ((1 − ω) D − ωF) x(k) + ωb ⇒   ⇒ x(k+1) = (D + ωE)−1 ((1 − ω) D − ωF) x(k) + ωb . Extracting x(k+1) . We can then pose that H = (D + ωE)−1 ((1 − ω) D − ωF) and t = (D + ωE)−1 ωb.

8 −0.18 −0.09 =⎝ 0 0.1 0 ⎠.0072 0. 0.9 −10 The inverse of this triangular matrix is ⎛ ⎞ 0.1 −0.9 0.1 ⎝ 0 10 0 ⎠ − 0. if ω = 0.88 3 Systems of Linear Equations and ⎛ ⎞ 10 0 0 D = ⎝ 0 10 0 ⎠ . 0 0 −10 obviously.1 0.009 −0.1 . 0 0 −1 Finally.009 −0.9 0 and ⎛ ⎞ 10 0 0 D + ωE = ⎝ 0 10 0 ⎠ .009 0.1 Let us calculate now ((1 − ω) D − ωF) = ⎛ ⎛ ⎞ ⎛ ⎞⎞ 10 0 0 02 1 = ⎝0.9 = ⎝ 0 1 0. let us calculate H by multiplying the two matrices: H = (D + ωE)−1 ((1 − ω) D − ωF) = ⎛ ⎞ 0.9 ⎝ 0 0 −1 ⎠⎠ = 0 0 −10 00 0 ⎛ ⎞ 1 −1.9 0. 0.9 ⎠ .09 ⎠ 0.9 we have ⎛ ⎞ 0 0 0 ωE = ⎝ 0 0 0 ⎠ 0.1 0 0 (D + ωE)−1 =⎝ 0 0.

the following example is given. Example 3.009 0.4 Numerical Comparison Among the Methods and Convergence Conditions In order to understand differences and relative advantages amongst the three iterative methods described in the previous sections.3.9 =⎝ 0 0.1 3.6 ⎛ ⎞ 0.009 −0.4.3519 For the sake of clarity the pseudocode explaining the logical steps of the SOR method is given in Algorithm 5.i (bi − s) + (1 − ω) xi end for end while 3.1 0 ⎠⎝ 0 ⎠ = 0.4 Iterative Methods 89 Then to calculate the vector t we have t = (D + ωE)−1 ωb = ⎛ ⎞⎛ ⎞ 0.1 0 0 0. −0.j xj end if end for xi = aωi.09 =⎝ 0 ⎠.28 Let us consider the following system of linear equations ⎧ ⎪ ⎨5x − 2y + 3z = −1 3x + 9y + z = 2 ⎪ ⎩ 2x + y + 7z = 3. Algorithm 5 Method of SOR input A and b input ω n is the size of A while precision conditions do for i = 1 : n do s=0 for j = 1 : n do if j = i then s = s + ai. .

let us apply at first the Jacobi’s method.90 3 Systems of Linear Equations In order to solve it. The system can be re-written as ⎧ ⎪ ⎨x = − 5 + 5 y − 5 z 1 2 3 y = 29 − 39 x − 19 z ⎪ ⎩ z = 37 − 27 x − 17 y while Jacobi’s update formulas from step (k) yo step (k + 1) are 1 2 3 x (k+1) = − + y(k) − z(k) 5 5 5 2 3 1 y(k+1) = − x (k) − z(k) 9 9 9 3 2 1 z(k+1) = − x (k) − y(k) . 7 7 7 .

4993197 4 –0.2920333 0.2937098 0.0000739 |Ax(8) − b| = ⎝ 0. 7 7 7 .3682540 0. 0. we obtain x(1) = − 15 . If we choose x(0) = (0.20000 0.4539683 3 –0. 37 .3795014 0.4285714 2 –0.2936251 0.4938876 5 –0.4950285 After eight iterations the Jacobi’s method returns a solution x(8) such that ⎛ ⎞ 0. 0.4951156 8 –0.2945326 0.4949190 7 –0.2222222 0.3817788 0. If we substitute iteratively the guess solutions we obtain k x y z 0 0 0 0 1 –0. 0) as an initial guess. The Gauss-Seidel’s update formulas from step (k) yo step (k + 1) are 1 2 3 x (k+1) = − + y(k) − z(k) 5 5 5 2 3 1 y(k+1) = − x (k+1) − z(k) 9 9 9 3 2 1 z(k+1) = − x (k+1) − y(k+1) .0001868 Let us now apply the Gauss-Seidel’s method to the same system of linear equa- tions. 29 .3797171 0.3795193 0.0002267 ⎠ .4959320 6 –0.2938036 0.2412698 0.3795479 0.2946054 0.3758730 0.

0).293726 0.0000002 ⎠ . Gauss-Siedel is more accurate since ⎛ ⎞ 0.20000 0.3795312 0. for the same amount of iterations.2898765 0.3.3794887 0.2937294 0. 0.4950495 It can be observed that Jacobi’s and Gauss-Seidel’s methods converge to very similar solutions. The application of the Gauss-Seidel’s method leads to the following results: k x y z 0 0 0 0 1 –0.288 and z(1) = 37 − 27 x (1) − 17 y(1) = 0. If the initial guess is x(0) .4950359 6 –0.3765362 0.9.4444444 2 –0.3795378 0.4950493 8 –0.4950477 7 –0. at first x (1) = −0.444.3791007 0.4949322 5 –0.2936764 0. The SOR update formulas from step (k) yo step (k + 1) are   1 2 3 x (k+1) = ω − + y(k) − z(k) + (1 − ω) x (k) 5 5 5   2 3 (k+1) 1 (k) y(k+1) = ω − x − z + (1 − ω) y(k) 9 9 9   3 2 (k+1) 1 (k+1) z(k+1) = ω − x − y + (1 − ω) z(k) .3795372 0.2888889 0.2935701 0. and then y1 = 2 9 − 39 x (1) − 19 z(0) = 0. However.2937286 0. 0 Finally.3511111 0.2937293 0.4942146 4 –0.200.4 Iterative Methods 91 If the initial guess is x(0) = (0.4874780 3 –0. 7 7 7 Let us set ω = 0. let us solve the system above by applying the SOR method.0000005 |Ax(8) − b| = ⎝ 0.

18.1y(0) = 1 (1) . 0).1 (0) = −0. and then y = 0. 2 = 3(0. at first x (1) = 0. 0.200) + 0.9 9 − 9 x − 19 z(0) + 0.9 (−0.

254 and z(1) = 0.3793616 0.4895893 4 –0.4937639 5 –0.2906873 0.2936894 0. k x y z 0 0 0 0 1 –0.254000 0.3795283 0.2821273 0.3794967 0.3787826 0.4947487 6 –0.2935583 0.4722278 3 –0.399.18000 0. 0.3222051 0.2937272 0.4950331 8 –0.3656577 0.2937200 0.3993429 2 –0.4950457 .4949792 7 –0.9 37 − 27 x (1) − 17 y(1) + 0.3762966 0.5z(0) = 0.2929988 0.

Obviously. Theorem 3. although Jacobi’s method appears to be the least powerful method out of the three under examination. A wrong choice can make the method diverge away from the solution of the system.0000410 |Ax(8) − b| = ⎝ 0. Hence. to a higher performance of the method. the natural way Jacobi’s method can be distributed make the method appealing when large linear systems (when the order n of the system is a large number) must be solved and a cluster is available. given the previous system of linear equations. the SOR method has the advantage that the convergence of the method can be explic- itly controlled by tuning ω.0000098 that is slightly worse than that detected by Gauss-Siedel. the parameter ω is such that |ω − 1| < 1.7905479e10). For example. The selection of ω is not the only issue relevant to the convergence of the method. after eight steps x(8) = (1. On the other hand. 3. A parallelization would not be so easy for Gauss-Seidel’s method since each row requires a value calculated in the previous row. in some cases.7332907e10. −5. if ω is chosen equal to 8. . The following example better clarifies this point. the calculation of each row occurs independently on the other rows. The relation between ω and convergence of SOR is given by the following theorem. This tuning can be a difficult task but may lead. an iteration of Jacobi’s method can be easily parallelized by distributing the calculations related to each row to a different CPU.9691761e10.92 3 Systems of Linear Equations Again. it hides a precious advantage in the computational era. If the SOR method converges to the solution of the system for the given matrix and known terms. At each iteration. After eight steps the solution x(8) is such that ⎛ ⎞ 0. As a further remark.29 Let us consider the following system of linear equations: ⎧ ⎪ ⎨5x − 2y = 4 9x + 3y + z = 2 ⎪ ⎩ 8x + y + z = 2.0000054 ⎠ 0. Example 3. the SOR method detected a very similar solution with respect to that found by Jacobi’s and Gauss-Seidel’s methods. The error related to this solution would be in the order of 1011 and would grow over the subsequent itera- tions.11 Let us consider a system of n linear equations in n variables Ax = b.

If A is strictly diagonally dominant.2 | + |a1. .n−1 | Theorem 3. .989 for Jacobi’s method and ⎛ ⎞ 22.. a tuning of ω can lead to the detection of a good approximation of the solution.3 | + . we obtain ⎛ ⎞ 0 |Ax(100) − b| = ⎝ 0 ⎠. The matrix A is said to be strictly diagonally dominant if the absolute value of each entry on the main diagonal is greater than the sum of the absolute values of the other entries in the same row. + |a1. More specifically.12 Let Ax = b be a system of n linear equations in n variables.1 | > |a1. . |an. the system above cannot be tackled by means of Jacobi’s nor Gauss-Seidel’s methods. Definition 3. 8. |ai. after 100 iterations. then this system of linear equations has a unique solution to which the Jacobi’s and Gauss-Seidel’s methods will converge for any initial approximation x(0) .2 | > |a2. The following definition and theorem clarify the reason.n | > |an. we obtain ⎛ ⎞ 93044.4 Iterative Methods 93 The application Jacobi’s and Gauss-Seidel’s methods do not converge to the solu- tion of the system.2843257 ⎠ 0 for Gauss-Seidel’s method. + |ai. In other words.88 ⎠ 95058. For example.n be a square matrix..2 | + .1 | + |an..n | . .3 | + . |a1.. .12 Let A ∈ Rn.5.1 | + |ai. .n | . + |a2. .882e − 16 This example suggests that not all the systems of linear equations can be tackled by an iterative method. .2 | + .1 | + |a1.372 |Ax(100) − b| = ⎝ 116511.3. + |an.n | |a2.988105 |Ax(100) − b| = 1014 ⎝ 7. .i | > |ai. that is. On the other hand. respectively. if ω is set equal to 0.

1 Solve the following system of linear equations by applying at first the Cramer’s Theorem and then the Cramer’s Method: ⎧ ⎪ ⎨x − 2y + z = 2 x + 5y + 2z = 3 . Hence.e. Nonetheless.5 Exercises 3. 1 1 8 that is not strictly diagonally dominant. we obtain an equivalent system whose associated incomplete matrix is ⎛ ⎞ 7 1 2 C = ⎝ 1 −10 2 ⎠ .1 where the symbol O represents the computational complexity of the method. since a row swap (elementary row operation E1) leads to an equivalent system of linear equations. Jacobi’s and Gauss- Seidel’s methods will converge to the solution. ⎪ ⎩ x − 3y + z = 1 . some systems of linear equations can be solved by Jacobi’s and Gauss-Seidel methods even though the associated matrix A is not strictly diagonal dominant. is associated to the matrix ⎛ ⎞ 1 −10 2 A = ⎝7 1 2⎠. if we swap the first and second equations.94 3 Systems of Linear Equations It must be noted that this theorem states that the strict diagonal dominance guaran- tees the convergence of the Jacobi’s and Gauss-Seidel’s methods. see Chap.30 The system of linear equations ⎧ ⎪ ⎨x − 10y + 2z = −4 7x + y + 2z = 3 ⎪ ⎩ x + y + 8z = −6. i. 1 1 8 Obviously this matrix is strictly diagonally dominant. The reverse impli- cation is not true. Moreover. 10. then Ax = b can still be solved by Jacobi’s and Gauss-Seidel’s methods. if the rows of the complete matrix Ac can be rearranged so that A can be transformed into a strictly diagonally dominant matrix C. Example 3. A synoptic scheme of the methods for solving systems of linear equation is dis- played in Table 3. 3.

1 Synopsis of the methods for solving linear systems Cramer Direct Iterative (Rouchè-Capelli) Operational feature Determinant Manipulate matrix Manipulate guess solution Outcome Exact solution Exact solution Approximate solution .5 Exercises 95 Table 3.3.

10) Chap. Computational cost Unacceptably high High O n3 . see ∞ to the exact (O (n!). see Chap. 10 solution but it can be stopped after .

2 Determine for what values of the parameter k. k steps with k · O n2 . undetermined.3 Determine for what values of the parameter k. and incompatible. the homogeneous system is determined. ⎪ ⎩ 2x + z = 0 . ⎪ ⎩ x − y − 2z = 2 − k 2 3. ⎪ ⎩ 4x − y = 1 3.4 Determine for what values of the parameter k. unde- termined. the system is determined. ⎧ ⎪ ⎨(k + 1)x + ky − z = 1 − k kx − y + (k − 1)z = 1 . 10 × 10) (up to approx 1000 × 1000) (k) Hypothesis No hypothesis akk = 0 (solvable by Conditions on the pivoting) eigenvalues of the matrix 3. ⎧ ⎪ ⎨(k + 2)x + (k − 1)y − z = k − 2 kx − ky = 2 . 10 Practical usability Very small matrices Medium matrices large matrices (up to approx. see Chap. and incompatible. ⎧ ⎪ ⎨(k + 2)x − ky + z = 0 x + ky + kz = 0 . and incompatible. unde- termined. the system is determined.

Gauss-Seidel’s.6 Solve the following system of linear equations by applying LU factorization ⎧ ⎪ ⎨2x + 5y + 7z = 1 3x + 4y + 8z = 1 . and SOR methods. ⎧ ⎪ ⎪3x − 2y + z + w = 3 ⎪ ⎨2x + 4y + 5z − 2w = 8 .5 Solve the following system of linear equations by applying both Cramer’s Method and Gaussian elimination. ⎧ ⎪ ⎨8x − 4y = 2 2x + 6y + 3z = 8 . ⎪ ⎪x + 7y − 4z = 5 ⎪ ⎩ 4y + z + 2w = 4 3.96 3 Systems of Linear Equations 3. ⎪ ⎩ x + y + 4z = 5 3.8 Solve the following system of linear equations by Jacobi’s and Gauss-Seidel’s methods. ⎪ ⎩ 14x − y + 6z = 2 . ⎪ ⎪ x + y + 4z − 5w = 2 ⎪ ⎩ 6x + 5y + 5z − 2w = 1 3. ⎧ ⎪ ⎨10x + 6y + 3z = 4 4x − 12y = 1 .7 Solve the following system of linear equations by applying LU factorization ⎧ ⎪ ⎪ 2x + 5y + 7z − 4w = 1 ⎪ ⎨3x + 4y + 8z + w = 1 . ⎪ ⎩ x − y + 6z = 2 3.9 Solve the following system of linear equations by Jacobi’s.

As such. Linear Algebra for Computational Sciences and Engineering.1007/978-3-319-40341-0_4 . Neri. Each element of R2 can be thus seen as a point P = (x1 . it can be graphically represented by a plane [1]. −2 −1 0 1 2 The set R2 = R × R is also dense and infinite.1 Basic Concepts It can be proved that R is a dense set. the horizontal reference axis is referred to as abscissa’s axis while the vertical reference axis is referred to as ordinate’s axis. it can be graphically represented as an infinite continuous line. As such. see [9]. x2 ) belonging to this plane. © Springer International Publishing Switzerland 2016 97 F. see [1].Chapter 4 Geometric Vectors 4. Without a generality loss. DOI 10. Within a Cartesian reference system. let us fix a Cartesian reference system within this plane.

. This direction is oriented in two ways on the basis of the starting and final point. This line identifies univocally a direction. A geometric vector in the plane − → v with starting point P and final point Q is a mathematical entity character- ized by: (1) its oriented direction. Let us consider an arbitrary pair of points P = (x1 .1 Two lines belonging to the same plane are said parallel if they have no common points. the concept of direction is not formally defined at this stage. The first is following the line from P to Q and the second from Q to P.1 Let us consider the following two points of the plane P = (2. Along this line.98 4 Geometric Vectors In summary. only one line passes trough P and Q. identified by P and Q (2) its module ||−→ v ||. that is the distance between P and Q It must be remarked that while the distance is well defined above. since there is a bijection between lines and R and between R2 and planes. By means of simple considerations of Euclidean geometry. A more clear explanation of direction will be clear at the end of this chapter and will be formalized in Chap. The Euclidean distance  d (P Q) = (2 − 2)2 + (2 − 1)2 = 1. x2 ) and Q = (y1 . Example 4. 2). characterised by the Euclidean distance d (P Q) = (y1 − x1 )2 + (y2 − x2 )2 . see [9]. respectively. Definition 4.3 Let P and Q be two points in the plane. the points between  P and Q are a segment. 6. y2 ) belonging to the same plane. Definition 4. 1) and Q = (2.2 A direction of a line is another line parallel to it and passing through the origin of the Cartesian system. Definition 4. we can identify the concept of line (an plane) with the concepts of dense set in one and two dimensions.
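The Euclidean distance of Example 4.1 can be reproduced with a few lines of Python; this is a minimal sketch, not part of the original text, and assumes Python 3.8 or later for math.dist.

    import math

    P = (2, 1)
    Q = (2, 2)
    # d(PQ) = sqrt((y1 - x1)^2 + (y2 - x2)^2), as in Example 4.1.
    print(math.dist(P, Q))    # 1.0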

2 from a slightly different perspective. A geometric vector in the space −→v with starting point P and final point Q is a mathematical entity characterized by: (1) its oriented direction. 3). 5). The definition above can be extended to vectors in the space. The segment from P to Q is identified by the distance of the two points and the direction of the line passing though P and Q.1 Basic Concepts 99 The concept of geometric vector is the same as the number vector defined in Chap. R2 ) there is a bijective relation.5 When three vectors in the space all belong to the same plane are said coplanar. Definition 4. two points belonging to R3 are represented as P = (x1 . or (7.e. where − → w ∈ V3 − →w belongs to the same plane where −→ u − → and v lie. Example 4.3 A vector − →v ∈ V3 is. Let us define within this set the following operations. y2 . 8) where one of the points identifying the direction is the origin O. 0. they determine a unique plane which they both belong to. The sum − →w =− →u +−→v = (B − A) + −→ (C − B) = C − A = AC. Example 4. Definition 4. 2). Let us indicate with V3 the set containing all the possible geometric vectors in the space.2 A vector of the plane can be (1. identified by P and Q (2) its module ||−→ v ||. Definition 4. x2 . for example. It can be observed that if the starting point is the origin O. Let − → u = −→ −→ AB = (B − A) and − →v = BC = (C − B). If two vectors in the space do not have the same direction. More formally. the two concepts coincide. The null vector −→ o of the space coincides with O = (0. that is the distance between P and Q A point in the plane belongs to infinite lines.4 Let P and Q be two points in the space. (2. Definition 4. a vector in the (3D) space belongs to infinite planes. y3 ). . between the set of vectors in the plane and the set of points in the plane (i. 5. x3 ) and Q = (y1 . 0) Obviously there is a bijective relation also between the set of vectors in the space and points in the space. In the latter case.7 (Sum of two vectors) Let − → u and − →v be two vectors ∈ V3 . Analogously.4. the geometric vector is the restriction to R2 of the number vector. (6.6 A special vector in the space is the vector with starting and final point in the origin O of the reference system. Considering that the origin of the reference system is arbitrary. This vector. namely null vector is indicated with −→ o.
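In components, the sum of two geometric vectors just defined reduces to the component-wise sum of the corresponding tuples. The following minimal sketch is not part of the original text, assumes NumPy, and anticipates the numbers worked out in Examples 4.4 and 4.5 below.

    import numpy as np

    # Example 4.4: vectors applied in the origin.
    OP = np.array([1, 5, 4])
    OQ = np.array([2, 3, 6])
    print(OP + OQ)                        # [ 3  8 10]

    # Example 4.5: AB + BC = AC, i.e. (B - A) + (C - B) = C - A.
    A = np.array([1, 1])
    B = np.array([2, 3])
    C = np.array([4, 4])
    print((B - A) + (C - B), C - A)       # [3 3] [3 3]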

6) . 3) C = (4. 3. Q ∈ R3 . 3. 8.5 Let us consider the following points ∈ R2 : A = (1. 1) B = (2. 4) and the following vectors −→ AB = B − A = (1.−v ∈ V3 : − → → u + −→v = −v +− → →u → −  • associativity: ∀ u . • commutativity: ∀ −→u . 4) and Q = −→ −→ (2. v . 6). The sum of these two vectors is −→ −→ OP + OQ = (3. 10) .4 Let us consider two points P. 4) −→ OQ = (2. . respectively. 1) . w ∈ V3 : u + v + w = − − → − → − → − → − → − → →u + −v +→w • neutral element: ∀−→v : ∃!− → v +− o |− → →o =− → o +− →v = −→ v − →− → − − → • opposite element: ∀ v : ∃!−v|−v + v = v + −v = − − → → − → →o From the properties of the sum we can derive the difference between two vectors − → −→ −→ → − −→ u = AB and − w = AC: − → u −→ w = (B − A) − (C − A) = B − C = CB Example 4. 5. 2) −→ BC = C − B = (2. −→ OP = (1.100 4 Geometric Vectors C B A O The following properties are valid for the sum between vectors. 5. Example 4. The vectors OP and OQ are. where P = (1.

. then o =− →w + −− w = λ− → v + → − → − → −λ v = 0 v + −0 v . If − → →o .1) −→ It can be observed that the vector AC would be the coordinates of the point C if the origin of the reference system is A. 1). Without a loss of general- − → − → − → − →  → ity. .  4. From basic arithmetics we know that 0 = 0 + 0. μ ∈ R and ∀ v ∈ V3 : λ μ u = λ− − → v μ   • distributivity 1: ∀ λ ∈ R and ∀ − →u . − → − → − → − → w = o . 3) . The following properties are valid for the product of a scalar by a vector.6 If λ = 2 and − → v = (1. Proposition 4. . The product of a scalar λ by a vector v is a new vector ∈ V3 . The linear combination of the n vectors by means of the n scalars is the vector − → w = λ1 − → v1 + λ2 − →v2 + . w + − w = − − → − → − → →o . the vector λ− → v = (2. it follows that − → o = λ− →o. . . If λ = 0. . − → v2 . module || λv|| = |λ| || v ||. Definition 4.− →v ∈ V3 : λ − → u +− →v = λ− → u + λ− →v • − → − → − → distributivity 2: ∀ λ.1 Basic Concepts 101 The sum of these two vectors is −→ −→ −→ AC = AB + BC = B − A + C − B = C − A = (3. − → vn be n vectors ∈ V3 .4. 2). Hence.1 Let λ ∈ R and − → v =− v ∈ V3 .  Let us prove that if − →v =− →o then λ− →v =− → o . λ2 .9 Let λ1 . − → → −  Considering λ−→ v =− v . . If either λ = 0 or − → → o . (4. μ ∈ R and ∀ v ∈ V3 :(λ + μ) v = λ v + μ v − → • neutral element: ∀− →v : 1− →v =−→v   − → − →− → → − −→ • opposite element: ∀ u : ∃!−v|−v + − v =→ v + −v = − →o Example 4. • commutativity: ∀ λ ∈ R and ∀ − →v ∈ V3 : λ−→ = − v − →v λ  • → → associativity: ∀ λ. If − → − →  − →   − → − →  we sum to both the terms −λ o . 2. it follows that λ o + −λ o = λ o + λ o +  −  −λ→ o From the properties of the sum of vectors. w + o = − − → − → →w . λ− − → → v having − → − → − → the same direction of v . .  − w =  λ v with λ ∈ R and v ∈ V3 . λn be n scalars ∈ R and − → v1 . 1. then λ− → v is − → equal to the null vector o Proof Let us prove that if λ = 0 then λ− →v = − →o . − →  →  →  o = 0− → v + 0− →v + −0− v = 0− →v + 0− v − 0− →v = 0− →v = λ− →v with λ = 0. then λ− v = λ− → → o =λ − o +→ o = λ− →o + λ−→o . .8 (Product of a scalar by a vector) Let − → v be a vector ∈ V3 and λ a scalar ∈ R. λn − → vn . . From the properties of the sum of vectors we know that ∀ w ∈ V3 . . and orientation is the same of − →v if λ is positive and the opposite if λ is negative. Without a loss of generality. it follow that o = o + o . .2 Linear Dependence and Linear Independence Definition 4. From the properties of the sum − → of vectors we know that ∀ w ∈ V3 .

. . λ2 . .8 Let us consider the following vectors ∈ V3 : − → v1 = (1.1 for both. 0. 0. . Thus the system has ∞1 solutions. These vectors are said lin- early dependent if the null vector can be expressed as their linear combination by means of non-null coefficients: ∃ n-tuple λ1 . . 0. . For example. . λn ∈ R|− →o = λ1 − →v1 + λ2 − → v2 + − → . not only λ1 . Definition 4. Example 4.10 Let − →v1 . 2. 0. If the only way to obtain a null linear combination is by means of coefficients all null. For example if the tuple −4. if λ1 . 0 solves the system. . these vectors are linearly independent if there exists a triple λ1 . . 0. If at least one tuple − → λ1 . . λ2 . − → v2 . . − → v3 are linearly dependent. . This is a homogeneous system of linear equations.11 Let − →v1 . 0. . 2. λ3 = 0. −→v2 . . λ2 . λ3 = 0. − → vn be n vectors ∈ V3 . − → vn be n vectors ∈ V3 . . . . λn vn with λ1 . the vectors are linearly dependent. 1) − → v2 = (1. . . . . 2) . + λn vn with λ1 . 0 such that o = λ1 v1 + λ2 − − → → v2 + λ3 −→ v3 can be found. λ2 . −→ v2 . λ2 .7 Let us consider three vectors − v1 . .102 4 Geometric Vectors Definition 4. 0 the equation − → o = λ1 − →v1 + λ2 − → v2 + λ3 − → v3 is always verified for the Proposition 4. in our case. 2) . . 1. . 0. 0. Obviously. From its defini- tion. linearly dependent and linearly independent vectors. the vectors − → v1 . . . . − → → v2 . These vectors are said lin- early independent if the null vector can be expressed as their linear combination only by means of null coefficients:  n-tuple λ1 . 4. Hence. 4. . . 0. 0 is such that − →o = −4− v1 + 5− → → v2 + 0− → v3 then the tree vectors are linearly dependent. 0 such that − →o = λ1 − →v1 + λ2 − → v2 + λ3 −→ v3 . 1) − → v = (2. Thus. λ2 . 3 Let us check whether or not these vectors are linearly dependent. . 1) + λ3 (2. . is (0. λn = 0. . λ2 . It can be observed that the matrix associated to the system is singular and has rank ρ = 2. λn ∈ R|− →o = λ1 − →v1 + λ2 − → v2 + − → . then the vectors are linearly independent. λ2 . λn = 0. Example 4. The latter equation. . . . −1 is a solution of the system. λ2 . . 5. . λ3 ∈ R and = 0. and − → v3 ∈ V3 . λ3 = 0. . which can be written as ⎧ ⎪ ⎨λ1 + λ2 + 2λ3 = 0 2λ1 + λ2 + 4λ3 = 0 ⎪ ⎩ λ1 + λ2 + 2λ3 = 0. 1) + λ2 (1. 0) = λ1 (1. λ1 . λ3 = 2. 0. 1.

λ3 = 0. . + λn vn . 0) − → v2 = (3. 0) − → v = (0. − →v2 . λ3 (0. Proof If the vectors a linearly dependent. . λ2 . 0. 0) + λ3 (4. . 1. 2) . Thus the vectors − → v1 . λn = 0. 2) .10 Let us check the linear dependence for the following vectors − → v1 = (1. 1. 0) + λ2 (0. . Theorem 4. 0. 0) = λ1 (1. 3 This means that we have to find those scalars λ1 . 2. . 0. λ2 . . 1. 0. . 2. 0. λ3 = 0. . Thus.4. − → vn be n vectors ∈ V3 . 0. 0. 3 This means that we have to find those scalars λ1 . − → v2 . let us suppose that λ1 = 0. . 0) + λ2 (3. Example 4. 0) − → v2 = (0. ∃λ1 . 1) . 0. 0) = λ1 (1. Without a loss of generality. 1. 0. 0) + λ3 (0. The vectors are linearly independent. . λ2 . which leads to ⎧ ⎪ ⎨λ1 + 3λ2 + 4λ3 = 0 2λ1 + λ2 = 0 ⎪ ⎩ λ3 = 0. λ3 (0. 0|− → o =λ1 − → v1 + − → − → λ2 v2 + .1 Let − → v1 . 0. . 0. 1) . λ2 . λ2 . 0. − → v3 are linearly independent. This system is determined and its only solution is λ1 . . .2 Linear Dependence and Linear Independence 103 Example 4. These vectors are linearly dependent if and only if at least one of them can be expressed as a linear combination of the others. which leads to ⎧ ⎪ ⎨ λ1 = 0 λ2 = 0 ⎪ ⎩ 2λ3 = 0 whose only solution is λ1 . . 0. . 0) − → v = (4.9 Let us check the linear dependence for the following vectors − → v1 = (1.
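Whether a set of vectors is linearly dependent can be decided from the rank of the matrix whose columns are the vectors, since this is the coefficient matrix of the homogeneous systems written above. A minimal NumPy sketch, not part of the original text, for the systems of Examples 4.8 and 4.9:

    import numpy as np

    # Example 4.8: columns are v1, v2, v3 of the homogeneous system above.
    A = np.array([[1, 1, 2],
                  [2, 1, 4],
                  [1, 1, 2]])
    print(np.linalg.matrix_rank(A))   # 2 < 3: non-trivial solutions exist, the vectors are linearly dependent

    # Example 4.9: full rank, hence only the null solution and the vectors are linearly independent.
    B = np.array([[1, 3, 4],
                  [2, 1, 0],
                  [0, 0, 1]])
    print(np.linalg.matrix_rank(B))   # 3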

λ2 . λn−1 .  If one vector can be expressed as a linear combination of the others then we can write − →vn = λ1 − → v1 + λ2 − → v2 + . μ2 = 2. The latter example has been reported to remark that Theorem 4. + → vn −λ1 −λ1 One vector has been expressed as linear combination of the others. 1. 1. v2 as a linear combination of − Let us try to express − → → v1 and − → v3 : − → v2 = ν2 − → v1 + ν3 − → v3 . 2) which leads to ⎧ ⎪ ⎨ν2 + 2ν3 = 1 2ν2 + 4ν3 = 1 ⎪ ⎩ ν2 + 2ν3 = 1 which is an impossible system of linear equations.1 states that in a list of linearly dependent vectors at least one can be expressed as a linear combination of . 4.. . . − →vn = λ1 − → v1 + λ2 − → v2 + . . . 0. 1) − → v2 = (1. + λ − 2 2 v−→ − − → v n−1 n−1 n The null vector − →o has been expressed as a linear combination of the n vectors by means of the coefficients λ1 . .104 4 Geometric Vectors − → o = λ1 − → v1 + λ2 − → v2 + . . 1) = ν2 (1. . . . . . . . 0. . 0. 1) + ν3 (2.  Example 4. + λn−1 − v−→ n−1 ⇒ ⇒− → o =λ − →v +λ − 1 1 →v + . 2. + λ v ⇒ − → 1 1 2 2 n n λ2 − λn − ⇒− → v1 = → v2 + . Thus − → v2 cannot be expressed as − → − → a linear combination of v1 and v3 . + λn − → vn ⇒ − → − → ⇒ −λ v = λ v + . . . 2) 3 are linearly dependent. 1) − → v = (2. .. + λn−1 − v−→ n−1 . −1 = 0. We can express one of them as linear combination of the other two: v3 = μ1 − − → →v1 + μ2 − → v2 with μ1 . In order to find ν2 and ν3 let us write (1. The vectors are linearly dependent.11 We know that the vectors − → v1 = (1. 4. Thus. 2. .

λh . . λ1 . . . Let h ∈ N with 0 < h < n. . 0. . . Let us check that all four are linearly dependent. 2. Thus. . + λn vn where λ1 . 0) = λ1 (2. hence − → v1 . Proposition 4. 1) − → v2 = (1. λ3 . . + λh − →vh with λ1 . . 0. − → → v2 . 2) . .2 Let − → v1 . . 0. + λn − → vn . . which leads to ⎧ ⎪ ⎨2λ1 + λ2 + 3λ3 + 5λ4 = 0 2λ1 + λ2 + 3λ3 + λ4 = 0 ⎪ ⎩ λ1 + λ2 + 2λ3 + 2λ4 = 0. − tion of the system. 0. 1) + λ3 (3. If h vectors are linearly dependent.4. . λh+1 . λ4 = 1. Proof If h vectors a linearly dependent. 1. 3. 1. then we can write − → o = λ1 − → v1 + λ2 − → v2 + . 2) + λ4 (5. . . . . λn = 0. λ4 such that (0. λn = 0. −1.12 Let us consider the following vectors − → v1 = (2. Thus.2 Linear Dependence and Linear Independence 105 the other. λ2 . . For example. λh = 0. 2) − → v = (5. − → v2 . λ2 . Without a loss of generality v1 = − let us assume that − → → o . 3. . . . . Proposition 4. if we consider the linear combination of these vectors: λ1 − → o + λ2 −→ v2 + . Thus. 0.3 Let −→v1 . not necessarily all of them can be expressed as a linear combination of the others. . . Thus. − → vn be n vectors ∈ V3 . λ2 . . λ2 . 0. . If one of these vectors is null − → o . − → v2 . . . 1) − → v3 = (3. We have to find those scalars λ1 . . Proof We know from hypothesis that one vector is null. the n vectors are linearly dependent. − → v3 . 4 v3 = − We can easily notice that − → → v1 + − → v2 . − → v3 are linearly dependent. − → v2 .  Example 4. λ3 . − → vn be n vectors ∈ V3 . . + λh − → vh + λh+1 v−−→ − → h+1 + . . Even if we assume that all the other λ values λh+1 . . . 0. . 2. . then the n vectors are linearly dependent. 1) + λ2 (1. 1. then − → o = λ1 − → v1 + λ2 − → v2 + . then all the n vectors are linearly dependent. . . . . − → v4 are linearly dependent. . . . This system has ∞1 solutions. . 1. 2) . . 0 is a solu- v1 . 1.

13 Let us consider the following vectors − → v1 = (0.1 Let − →u and −→v ∈ V3 . 0. 3. − → o = λ− →u + μ− → v ⇒ − → − → ⇒ λ u = −μ v ⇒ μ→ ⇒− → u =− − v.106 4 Geometric Vectors If λ1 is chosen equal to a real scalar k = 0. . . 0) − → v2 = (1. λ3 = 50.12 Let − →u and − →v ∈ V3 . Thus. 0. μ ∈ R and λ. 0. 5. 1) + λ3 (8. Let us suppose that λ = 0. 0) = λ1 (0. it can be observed that every vector is parallel to the null vector − →o. λ . It can be observed that for the parallelism relation the following properties are valid: • reflexivity: a vector (line) is parallel to itself • symmetry: if − → u is parallel to − →v then − →v is parallel to − → u − → • transitivity: if u is parallel to v and v is parallel to − −→ − → w . e. λ2 . and orientation are said equipollent. .  Example 4. 3. Definition 4. then they could be expressed as u = λ v with λ ∈ R: ∃λ ∈ R|− − → − → →u = λ− →v. Thus. μ = 0. direction. In addition. 2) . 3 Let us check the linear dependence (0. λ2 . . . B. Theorem 4. 5. . 0. 0 = 0. . . The two vectors are linearly dependent if and only if they are parallel. 0. then the linear combination will be equal to the null vector − → o for λ1 . It can be proved that equipollence is an equivalence relation. The two vectors are parallel if they have the same direction. Proof If the two vectors are linearly dependent then the null vector can be expressed as linear combination by means of non-null coefficients: ∃λ. 0. 1) − → v = (8. Thus. . the vectors are linearly dependent. the vectors are linearly dependent. This means that a vector having a first point A is equivalent to one with same module. For example λ1 . 0. direction. . λn = k.2 Let − → u and −→ v ∈ V3 .g. 2) . 0|− →o = λ−→u + μ−→v . If the two vectors are parallel. . 0) + λ2 (1. 0 satisfies the equation. . then − → → u is parallel to − →w Since these three properties are simultaneously verified the parallelism is an equiv- alence relation. Lemma 4. and orientation having a different starting point. Two vectors having same module.

1. 1) − → v = (2. there exists only one scalar λ such that − →v = λ− → u. The null vector has been expressed as the linear combination of the two vector by means of the coefficients 1. − → o = (λ − μ) − → v ⇒ v = − since for the hypothesis − → → o. i. i. We can easily check that these vectors are linearly dependent since − → o = λ1 − → v2 + λ2 − → v1 with λ1 .e. 2. . μ| − → v and − u = λ− → → u = μ− → v with λ = μ. the only λ value such that − → → u is λ = 3. Theorem 4.14 The vectors − → v1 = (1.3 Let − → v = − v ∈ V1 with − → →o . λ2 = 1. Thus. Each − → vector u ∈ V1 can be expressed in only one way as the product of a scalar λ and the vectors − → v ∈ V1 ∃!λ|− v : ∀− → → u = λ−→v. by contradiction hypothesis. 0. −2. the unidimen- sional vectors.2 Linear Dependence and Linear Independence 107 The two vectors are parallel.  If the two vectors are parallel then ∃λ ∈ R|− → u = λ−→v . Thus.15 Let us consider the following vector ∈ V1 : − → v = (6) The previous theorem simply says that for any other vector − → u ∈ V1 . Hence. − → u = λ−→v ⇒− →o = − → − → u − λ v .4. then we reached a contradiction. let − → v be linearly independent. −λ = 0. ⇒λ−μ=0⇒λ=μ Since λ = μ.  Example 4. If − → v = λ− u = (2). 2) .  Example 4. Let us indicate with V1 the set of vectors belonging to one line. Proof By contradiction let us assume that ∃λ. 2 v2 = 2− These two vectors are parallel since − → → v1 .e. the vectors are linearly dependent.

 Example 4. 6. 2. 1) − → v = (2. − → − → − → Thus u . and w are linearly dependent.16 Let us consider the following three vectors ∈ V3 : − → u = (1. − →u = λ− → v + μ− →w . The three vectors would be coplanar.− →v . v . Proof If −→ u. B b O A C c On the basis of the geometrical construction. where − → u = O A. they all belong to the same plane. and −→w are coplanar.108 4 Geometric Vectors Theorem 4.− → w ∈ V3 and = − v . there is at least one plane that contains the three vectors. For the Theorem 4. −→ −→ −→ OA = OB + OC −→ → −→ −→ where OA = − u . The three vectors are coplanar if − → − → − → and only if u . Let us indicated with B. and OC = μ− →w .  − → − → − → If the vectors are linearly dependent they can be expressed as u = λ v + μ w . v . A. and C −→ the projection of the point A over the directions determined by the segments Ob and −→ Oc (or by the vectors −→v and −→ w ) respectively. OB = λ− →v . A parallelogram is determined by the vertices ABCD. • If − →v and − → w are not parallel. respectively. 4) − → w = (4. [10]). Thus. the three vectors determine a unique direction. . and w belong to the same plane. Within this plane the sum − →u = λ− → v + μ− →w is verified. and − → →o . they determine one unique plane containing both of them. 6) .4 Let − →u . • If − → v and − w are parallel then − → → u is parallel to both of them. Let us consider three coplanar vectors having all an arbitrary starting point O and final −→ points. see e. The vector −→u −→ − → belongs to a plane where λ v and μ w belong to. b. Since infinite planes include a line (direction). Obviously λ−→v is parallel to − →v and μ−→w is parallel to − → w. Thus. 2.g. This is true because the sum of two vectors in a plane is a vector belonging to the same plane (the sum is a closed operator.1. and c. the vectors are linearly dependent.

0. 1) − → v = (2. 2) − → w = (4. 1. The vector − →w identifies another direction. 6.2 Linear Dependence and Linear Independence 109 We can easily observe that (4.− → w are also coplanar. μ): ∃!λ. μ ∈ R such that − → u = λv + μ− → w .5 Let − →u . 2. 6) .4. and − → − → − → also ∃λ . If these three vectors are coplanar and two of them ( v and w ) are linearly independent (− − → − → → v and − → w are not parallel to − → each other). − 21 .18 If we consider again − → u = (1. 1) − → v = (2. The vector w is the weighted sum of the other two and thus belongs to the same plane where − →u and −→v lie. μ = λ . μ.e. − → Proof Let us assume by contradiction that ∃λ. 6. Thus.− → w ∈ V3 and = − v .g. The fact that one vector is linear combination of the other two has a second meaning. The vectors − → u − → − → and v identify a plane. by means of only one tuple of scalars λ. Theorem 4. Thus − → u .− →w are linearly dependent. 4. The equation − → o = λ− →u + μ−→v + ν−→w is verified for infinite values of λ. 1) + (2. 2.− →v . Thus.17 The three vectors inV3 − → u = (1. Example 4. 2. 2. 2. This means that the vectors −→u . 4) that is w = λ− − → → u + μ− → v. the third ( u ) can be expressed as the linear combination of the other two in only one way (i. 4) − → w = (4. μ ∈ R such that u = λ v + μ w with λ. − →  →  → o = λ − λ − v + μ − μ − w. 6. 6) = 2 (1. μ .− →v . Thus λ = λ and μ = μ . e. The two directions identify a plane containing the three vectors. μ|− →u = − → λ v +μw. Let us check the linear dependence. and − → →o . Since the vectors are linearly independent λ − λ = 0 and μ − μ = 0. the vectors are coplanar. 6) . In this case − → u and − →v are parallel and thus identify one direction.  Example 4. ν.

2. .2 and 4. The system is determined and the only solution is λ.3 and Theorems 4. and t ∈ V3 . 4) . v . 1 is unique. which leads to ⎧ ⎪ ⎨λ + 2μ = 4 2λ + 2μ = 6 ⎪ ⎩ λ + 4μ = 6.110 4 Geometric Vectors we know that w = λ− − → → u + μ− → v. 6. the vectors are linearly dependent for Proposition 4. 1. 2. These vectors are always linearly depen- − → − → dent: ∀ u . μ. Let us consider the case of four vectors such that each triple is not coplanar. with λ. 1.− →v . ν = 0. 0.− → w . This equation can be written as − → t = λ− → u + μ− → v + ν− → w ⇒ − → − → − → − → − → ⇒ o =λu +μ v +νw − t . Let us verify it: (4. The theorem above says that the couple λ. Proof If one vector is the null vector − → o or two vectors are parallel or three vectors are coplanar. − → Theorem 4. It can be observed that −→ −→ −→ −→ OC = OA + AB + BC. μ = 2.4. μ = 2. w . 1) + μ (2.6 Let − →u . μ = 2. 0| t = λ− − → − → − → →u + μ−→v + ν− → w. 6) = λ (1. Without a loss of generality let us consider all the vectors have the same arbitrary starting point O. and t ∈ V3 : ∃λ.
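The small overdetermined system of Example 4.18 is compatible, so a least-squares solver returns the exact coefficients of the linear combination. A minimal sketch, not part of the original text, assuming NumPy:

    import numpy as np

    # w = lambda*u + mu*v, Example 4.18: three equations in two unknowns.
    M = np.array([[1.0, 2.0],
                  [2.0, 2.0],
                  [1.0, 4.0]])
    w = np.array([4.0, 6.0, 6.0])
    coeffs, residuals, rank, _ = np.linalg.lstsq(M, w, rcond=None)
    print(coeffs)                     # [2. 1.], i.e. lambda = 2 and mu = 1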

μ. ν = 5.19 The theorem above tells us that three arbitrary vectors ∈ V3 − → u = (0. In summary. 0. 6. Also in this case. In a similar way. 1) + μ (0. 0. When this situation occurs the vectors are linearly dependent. Any fourth vector would give redundant pieces of information. 1. the concept of linear dependence can be interpreted as the presence of redundant pieces of infor- mation within a mathematical description. two parallel vectors. 5) are necessarily linearly dependent. 0. μ. 5) = λ (0. or four vectors in the space are linearly dependent. 1. −1 = 0. If two vectors are not parallel they identify a plane and describe a point within it. ν.2 Linear Dependence and Linear Independence 111 The null vector has been expressed as linear combination of the four vectors by means of non-null coefficients as λ. 0) − → w = (1. 1. 1. There is no need of two numbers to represent a unidimensional object. In a similar way. The vectors are linearly dependent. The system is determined and λ. If we write − → t = λ− → u + μ− → v + ν− → w that is (4. 0) − → t = (4. This object requires two numbers to be described and a third coordinate would be unnecessary. 1. Looking at its geometric implications. This fact is mathematically trans- lated into the statement that four (or more) vectors in the space are surely linearly dependent.  Example 4. Intuitively. 1) − → v = (0. two parallel vectors in the space identify only one direction and are de facto describing a unidimensional problem. 0. redundant pieces information are present in the mathematical description and the vectors are linearly dependent.4. 4 is its solution. 0) + ν (1. . 0. three vectors in a plane. The three dimensional problem degenerates in a unidimensional problem. three vectors are needed to describe an object in the space. 6. 0) which leads to the following system of linear equations: ⎧ ⎪ν = 4 ⎨ λ+μ=6 ⎪ ⎩ λ = 5. Finally. we can easily see that in a unidimensional space one number fully describes an object. three (or more) coplanar vectors in the space are de facto a two-dimensional problem.

it follows that − → − → μ− t =  λ u + − → w = λ − v + ν − → → u + − → ν −  μ v + → w ⇒ → ⇒ λ−λ u + μ−μ v + ν−ν w =− − → − → →o. every vector t in V3 is univocally identified by those three coefficients λ. Since. ν): − → ∃!λ. Let us formally prove this statement Theorem 4.e. It − → follows that the vector t can be expressed as the linear combination of the other three in only one way (i. 0. if t . 1) − → v = (0. 0. by contradiction. ν = 5. 0. μ.− →v . and − →w be three linearly independent vectors ∈ V3 and − → λ. when three linearly independent vectors − → − → u . and − → w are fixed. 0) − → t = (4. −u . μ. and − → w are linearly independent. then t is implicitly identified. 6. by means of only one tuple of scalars λ. 5) . μ. μ.− →v . Let a vector t be another vector ∈ V3 . 0 be a triple of scalars.−→ v . each vector of V3 can be univocally expressed as the linear combination of three linearly independent vectors.20 Let us consider again − → u = (0. ν = 0. . 1.→v .  Example 4.− → v . − → u . ν = 0. there exists another triple λ . μ . −→ We can easily verify that if λ. 0 such that t = λ− → u + μ− → v + ν− → w Proof If. − → → − Also. ν and − → u . ν = 0. Equivalently. 4. 1. it follows that   λ − λ = 0 ⇒ λ = λ μ − μ  = 0 ⇒ μ = μ ν − ν = 0 ⇒ ν = ν . μ.7 Let − →u. Thus. ν such that − → t = λ− → u + μ− → v + ν− → w.−→ w are fixed. μ. 0. the scalars λ. 0 such that − → t = λ − → u + μ − →v + ν − → w . ν are univocally determined (the resulting system is determined). We know that − → t = λ− → u + μ− → v + ν− → w with λ. 0) − → w = (1. μ. 1.− → w are fixed.112 4 Geometric Vectors Each point of the space can then be detected by three linearly independent vectors. for hypothesis.

4. 1) we mean that the vector has been written in − → − → − → the basis B = { e1 . − → e2 . ν = 3. v3 ∈ R and can be indicated as (v1 .7 every vector belonging to V3 can be univocally expressed as the linear combination of the vectors composing the basis: ∀− v ∈ V3 : − → →v = v1 − → e1 + − → − → v2 e2 + v3 e3 . 0. Each vi . y. 1. 0. v2 . 6) in the basis B. A basis of V3 can be seen as a set of vectors able to generate. μ. 0) − → e = (4. i. − → → e2 . e2 .3 Bases and Matrices of Vectors Definition 4. 1. − →v = (x. For Theorem 4. thus. − → Thus. 0) − → e = (0. 5. e2 . z). − → → e2 . The following corollary gives a formalization of this statement. − → Let us express an arbitrary vector t = (4. − → e3 }. − − → − → − → − → →e2 .e.7 any vector ∈ V3 can be univocally expressed as a linear combination of the vectors of this basis. 1) . − → e3 }. 0. − → e3 is a basis of vector having module equal to 1 and direction of the axes of the reference system. when we write t = (3.g. For Theorem 4. 1. v2 .13 A vector basis in V3 is a triple of linearly independent vectors B = {−e1 . . 3 Example 4. 1. 2) − → e2 = (0. Every time in the previous sections of this chapter a vector has been indicated e. these vectors are a basis B = {− → e1 . 0) 3 are linearly independent. ∀i is named component of the vector − →v . e3 }. for a fixed basis B = { e1 . any arbitrary vector ∈ V3 . As such. we implicitly meant that v = x− − → → e1 + y − → e2 + z − → e3 . by linear combination. It results that − → e1 + μ− t = λ− → → e2 + ν − → e3 with λ.21 We can easily verify that the vectors − → e1 = (0. two vectors are equal if they have the same components in the same basis. where −e1 . − → e1 = (1. In general. each vector belonging to V3 is identified by its component and. In this case the vector − → v is represented in the basis B = { e1 . 8.3 Bases and Matrices of Vectors 113 4. 5. v3 ). 0) − → e2 = (0. e3 }. − → e3 }. where v1 .

− → → − − → If we considered another vector t ∈ V3 with t = t it would follow that − → another triple λ . μ.7. μ.  The following corollary formally states a concept previously introduced in an intuitive way: there is an equivalence between the vectors of the space and the points of the space. the mapping is bijective.114 4 Geometric Vectors − → Corollary 4. For Theorem 4. Proof For a fixed basis B = {− → e1 .6.  Proposition 4. − e1 . ν) . Corollary 4. For Theorem 4. for Theorem 4. − → e3 } let us consider the mapping φ : V3 → R3 defined as − → t = (λ. − as a linear combination of − → → e2 . a vector is always representable e1 . μ.6 t . ν). it follows that the triple λ . μ . μ . Thus. where . μ.1 Every vector t ∈ V3 can be represented as a linear combination of − → − → − → the vectors e1 . − → e3 } is fixed. ν) − →   t = λ . there exists a bijection between the set of vectors in the space V3 and the set of points R3 . μ. e2 . ν for Theorem 4. − → e3 and thus. → e2 . ν. in the basis the vector t is univocally determined by the triple (λ. − → e2 . These two vectors are parallel if and only if the rank ρA of the matrix A associated to the corresponding components is < 2.7 it follows that − → e1 + μ− t = λ− → → e2 + ν − → e3 − → by means of a unique triple λ. This mapping is surjective since. where − → e1 + μ− t = λ− → → e2 + ν − → e3 → − − → This mapping is injective since for t = t with − → t = (λ. We can indefinitely reiterate this operation and discover that any arbitrary vector ∈ V3 can be represented as linear combination of − e1 . Thus. μ. ν would be univocally associated to t . ν. − → e2 . ν = λ. − → → e2 . − → e3 by means of a unique triple of scalars. e3 composing a basis of V3 . μ . μ. ν. always associated to a triple λ. ν = λ.2 If an arbitrary basis B = {−→e1 .4 Let −→u.−→v ∈ V3 . − → − → → − Proof Let us consider a generic vector t ∈ V3 . − → e3 are linearly dependent.

A = ( u1 u2 u3 )
    ( v1 v2 v3 ).

Proof If u and v are parallel, they could be expressed as u = λv with λ ∈ R. Thus,

u = λv ⇒ u1 e1 + u2 e2 + u3 e3 = λ (v1 e1 + v2 e2 + v3 e3).

Since two vectors are equal if and only if they have the same components,

u1 = λv1
u2 = λv2
u3 = λv3.

Since the two rows of A are proportional, every order 2 submatrix has null determinant. Thus ρA < 2.

If ρA < 2, there is no non-singular order 2 submatrix. This can happen in the following cases.

• A row is composed of zeros. This means that one vector is the null vector o. Since every vector is parallel to o, the vectors are parallel.
• Two rows are proportional. These vectors can always be expressed as u = λv. Thus, the vectors are parallel.
• Two columns are composed of zeros. The vectors are of the kind (u1, 0, 0) and (v1, 0, 0). The vectors are parallel.
• Each pair of columns is proportional. The vectors are of the kind (u1, λu1, μu1) and (v1, λv1, μv1). If v1/u1 = k, then v = (ku1, λku1, μku1) = (ku1, ku2, ku3) = k u. Thus, the vectors are parallel.

Example 4.22 The following vectors

u = (1, 3, 6)
v = (2, 6, 12)

are parallel since v = 2u. Obviously the matrix

( 1 3  6 )
( 2 6 12 )

has rank equal to 1, that is < 2.

Example 4.23 The following vectors

u = (2, 4, 6)
v = (3, 6, 9)

are associated to the matrix

( 2 4 6 )
( 3 6 9 ).

Every pair of columns is proportional,

( 2 4 6 )   ( 2 2λ 2μ )
( 3 6 9 ) = ( 3 3λ 3μ )

with λ, μ = 2, 3. Thus, this matrix has rank ρ < 2. If we pose k = v1/u1 = 1.5, we can write the two vectors as u = (2, 4, 6) and v = k (2, 4, 6). Thus, these two vectors are parallel.

Example 4.24 Let us determine the values of h that make u parallel to v:

u = (h − 1) e1 + 2h e2 + e3
v = e1 + 4 e2 + e3.

These two vectors are parallel if and only if the matrix A has a rank ρA < 2:

A = ( h − 1 2h 1 )
    (   1    4 1 ).

Let us compute det ( h − 1 2h ; 1 4 ) = 4h − 4 − 2h = 2h − 4. The vectors are parallel if 2h − 4 = 0 ⇒ h = 2. In addition, we have to impose that det ( 2h 1 ; 4 1 ) = 2h − 4 = 0 ⇒ h = 2 and det ( h − 1 1 ; 1 1 ) = h − 1 − 1 = 0 ⇒ h = 2. Thus, the vectors are parallel if h = 2.

Proposition 4.5 Let u, v, and w ∈ V3. The three vectors are coplanar if and only if the determinant of the matrix A is equal to 0, where

A = ( u1 u2 u3 )
    ( v1 v2 v3 )
    ( w1 w2 w3 ).

Proof If the vectors are coplanar, then they are linearly dependent for Theorem 4.4. For Theorem 4.1 one of them can be expressed as a linear combination of the others:

u = λv + μw ⇒ (u1, u2, u3) = λ (v1, v2, v3) + μ (w1, w2, w3) ⇒
⇒ (u1, u2, u3) = (λv1 + μw1, λv2 + μw2, λv3 + μw3).

The first row of the matrix A has been expressed as a linear combination of the other two rows. Thus det (A) = 0.

If det (A) = 0, the following conditions may occur.

• A row is null. This means that one vector is null and two vectors determine a plane. Thus, the three vectors are in the (same) plane.
• A column is null. One component is null for all the vectors and thus the vectors are in V2, i.e. the three vectors are coplanar.
• One row is a linear combination of the other two rows. This means that the three vectors can be expressed as (u1, u2, u3) = λ (v1, v2, v3) + μ (w1, w2, w3). The vectors are linearly dependent and thus coplanar.
• One column is a linear combination of the other two columns. The vectors are of the kind (u1, u2, λu1 + μu2), (v1, v2, λv1 + μv2), and (w1, w2, λw1 + μw2). Since one component is not independent, the three vectors are in the (same) plane.

Example 4.25 Let us verify whether or not the following three vectors are coplanar:

u = (5, 3, 12)
v = (2, 8, 4)
w = (1, −13, 4).

The det ( 5 3 12 ; 2 8 4 ; 1 −13 4 ) = 160 + 12 − 312 − 96 − 24 + 260 = 0. Thus, the three vectors are coplanar. It can be observed that w = u − 2v.

Propositions 4.4 and 4.5 clarify the meaning of determinant and rank of a matrix by offering an immediate geometric interpretation. The determinant of a 3 × 3 matrix can be interpreted as a volume generated by three vectors. If the vectors are coplanar, the volume is zero as well as the associated determinant. This situation occurs when a redundancy appears in the mathematical description. In a similar way, if we consider only two vectors in the space, we can geometrically interpret the concept of rank of a matrix. If the two vectors are parallel, the problem is practically unidimensional and the rank is 1. If the vectors are not parallel, they identify a plane and the rank of the associated matrix is 2. In other words, the rank of a matrix can be geometrically interpreted as the actual dimensionality of a mathematical description.

Definition 4.14 Two vectors (or two lines) in V3 are said to be perpendicular when their directions compose four angles of 90°.
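Returning to Example 4.25, the coplanarity test of Proposition 4.5 is easy to verify numerically. A minimal sketch, not part of the original text, assuming NumPy:

    import numpy as np

    # Rows are the vectors u, v, w of Example 4.25.
    A = np.array([[5, 3, 12],
                  [2, 8, 4],
                  [1, -13, 4]])
    print(np.linalg.det(A))            # 0 (up to rounding): the vectors are coplanar
    print(np.linalg.matrix_rank(A))    # 2: only two independent directions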

1) − → w = (1. 0) − → j = (0. Intuitively we may think that two multi-dimensional objects are orthogonal when all the angles generated by the intersection of these two objects are 90◦ . k } the basis is said orthonormal. Now. k }.26 Let us consider the following vectors in an orthonormal basis of ver- − → − → − → sors { i . Vectors are in general represented in an orthonormal basis of versors but can be rewritten in another basis by simply imposing the equivalence of each component. 0.−→ w }. 0.118 4 Geometric Vectors Definition 4. In V3 . 8. orthogonality is a more general term and refers to multi- dimensional objects (such as planes in the space). 0) − → k = (0.− →v . This concept is better explained in Chap. 10 3 The vectors − → u. 1) .−→v . 0. and − → w are linearly independent. While perpendicularity refers to lines (or vectors) in the plane and means that these two objects generate a 90◦ angle.− → v . − →u = (2. −1) − →v = (1. −1) Let us verify that − → u . j . Example 4. 0. the two concepts are closely related. j . 3) − → t = (2. The following example clarifies this fact. Definition 4. However. and − → w are linearly independent: ⎛ ⎞ 2 0 −1 det ⎝ 1 2 1 ⎠ = 12 + 2 = 14 = 0. an orthonormal basis composed of vectors is − → i = (1. The defin- ition above states that this solid is orthogonal when all the angles involved are 90◦ . .16 If a basis of vectors in V3 is composed of three orthogonal vectors − → − → − → { i . 2. It must be remarked that the notion of perpendicularity does not coincide with the notion of orthogonality.15 Three vectors in V3 are said to be orthogonal if each of them is perpendicular to the other two.17 When the vectors composing an orthonormal basis have modulus equal to 1. the vectors composing this basis are said versors. Definition 4. let us determine the − → components of t in the new basis {− → u . For example. three vectors in V3 can be interpreted as an object in the space. 1. We may conclude that perpendicularity is orthogonality among lines of the plane. −1.

2. ν the following linear system must be solved: ⎧ ⎪ ⎨2λ + μ + ν = 2 2μ = −1 . −1) + μ (1. t = 87 − → u − 1−→ 3 − → 2 v + 14 w . 2. 2. 0. and ν = 14 3 . −1) It can be easily verified that − → u . 1) + ν (1. 1) − → w = (3. 2μ + 2ν. μ. −1) = λ (2. 0) − → t = (2.3 Bases and Matrices of Vectors 119 − → t = λ− → u + μ− → v + ν− → w ⇒ ⇒ (2.− → v . In order to detect the values of λ. In order to detect the values of λ. − → The solution of the linear system is λ = 87 . The system is determined. ν the following linear system must be solved: ⎧ ⎪ ⎨2λ + μ + 3ν = 2 2μ + 2ν = −1 .− → v . ⎪ ⎩ −λ + μ + 3ν = −1 It can be easily observed that the matrix associated to this linear system is the transposed of the matrix A associated to the three vectors. k }. −λ + μ) . −1) = (2λ + μ + 3ν. −1) + μ (1. −1) = (2λ + μ + ν. 0. Thus. − →u = (2. −1. −λ + μ + 3ν) . 0. −1. Example 4.4. 2μ. j . ⎪ ⎩ −λ + μ = −1 . −1. −1. 3) ⇒ ⇒ (2. μ = − 21 . μ. and − → w are linearly dependent since the matrix ⎛ ⎞ 2 0 −1 det ⎝ 1 2 1 ⎠ = 0. 1) + ν (3. 32 0 − → If we try to express anyway t as a linear combination of − → u. −1) = λ (2. −1) − →v = (1. 0) ⇒ ⇒ (2. 0. 2. −1. 2.27 Let us consider the following vectors in an orthonormal basis of ver- − → − → − → sors { i . and − → w we obtain − → t = λ− → u + μ− → v + ν− → w ⇒ ⇒ (2.
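Finding the components of a vector in a new basis, as in the examples of this section, amounts to solving a linear system whose coefficient matrix has the basis vectors as columns. A minimal sketch follows; it is not part of the original text, the vectors below are chosen only for illustration (they are not those of the examples), and NumPy is assumed.

    import numpy as np

    # Hypothetical basis vectors u, v, w and a target vector t.
    u = np.array([2.0, 0.0, -1.0])
    v = np.array([1.0, 2.0, 1.0])
    w = np.array([3.0, 2.0, 3.0])
    t = np.array([2.0, 2.0, -1.0])

    M = np.column_stack((u, v, w))                # solve M @ [lam, mu, nu] = t
    lam, mu, nu = np.linalg.solve(M, t)
    print(lam, mu, nu)
    print(np.allclose(lam*u + mu*v + nu*w, t))    # True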

k }. has ∞ solutions. v . 2μ. 3) − → t = (0. u . This fact can be geometrically seen with the following sentence: it is impossible to generate a vector in the space starting from three coplanar vectors.120 4 Geometric Vectors This system is impossible because the rank of the incomplete matrix is 2 whilst the rank of the complete matrix is 3.−→v . 0. 0) − → t = (2. 0. 0. − → The system in undetermined and. and − → w are linearly independent and thus are a basis. 0) = λ (2. In the same plane one vector can be expressed as the linear combination of three other vectors in an infinite number of ways. − → Let us express t in the basis of − → u .− → v . we obtain the following system of linear equations: ⎧ ⎪ ⎨2λ + μ + 3ν = 2 2μ + 2ν = 4 . It can be observed that t − → − → − → − → − → and v are parallel. If we try to express t as linear combination of them. −1) + μ (1. ⎪ ⎩ −λ + μ = 2 The rank of the incomplete matrix is 2 as well as the rank of the complete matrix. The three coplanar vectors are effectively in two dimensions and their combination cannot generate a three- dimensional object.28 Let us consider the following vectors in an orthonormal basis of ver- − → − → − → sors { i . hence. j . 0) = (2λ + μ + ν. 1) − → w = (1. 2. The system has no solutions. − → u = (2. 0. 2. 0) We know that −→u . 1) + ν (1.− → v . 1) − → w = (3. −1) − → v = (1. 2. Example 4. and − → w: − → t = λ− →u + μ− →v + ν− →w ⇒ ⇒ (0. 4. 0. j . Example 4. −λ + μ + 3ν) . 0. −1) − → v = (1. Hence t . and − →w are coplanar. 3) ⇒ ⇒ (0. .29 Let us consider the following vectors in an orthonormal basis of ver- − → − → − → sors { i . 0. 0. k }. − → u = (2. 2) − → We already know that − → u. and w are coplanar. 2.

4. 0. 0. 0. μ.30 If we consider three linearly dependent vectors. ν we have the following homogeneous system of linear equations: ⎧ ⎪ ⎨2λ + μ + 3ν = 0 2μ + 2ν = 0 . 0) as a linear combination of − → u . μ. ⎪ ⎩ −λ + μ = 0 The rank of the incomplete matrix is 2 as well as the rank of the complete matrix. 0. This means that at least one solution = 0. 0. we found out − → that the only linear combination of u . such as − → u = (2. The convex angle determined by the vectors is said angle of the vectors. Hence. and − − → w that return the null vector − → → o is by means of the tuple λ. The only solution of the system is − → λ. μ. ν = 0. It can be observed that t is the null vector. 1) − → w = (3. This is another way to express the linear dependence of the vectors. has ∞ solutions besides 0. 0. and − →w by means of three scalars λ. 0. The system in undetermined and.−→ v . This angle can be between 0 and π .4 Products of Vectors Definition 4. ν = 0. Example 4.− → v . 2. We have verified the linear independence of the vectors − →u . 0. −1) − → v = (1. v . These examples are extremely important as they link systems of linear equations and vectors highlighting how they correspond to different formulations of the same concepts. ν the following linear system must be solved: ⎧ ⎪ ⎨2λ + μ + ν = 0 2μ = 0 . 2. 0) − → and we try to express the vector t = (0.− → v ∈ V3 having both an arbitrary starting point O. hence. μ.18 Let −→ u . and − →w. ⎪ ⎩ −λ + μ + 3ν = 0 This is a homogeneous system of linear equations. The system is determined as the associated incomplete matrix is non-singular. 0 such that − → o = λ− →u + μ− →v + ν− → w exists. . 4. 0.3 Bases and Matrices of Vectors 121 In order to detect the values of λ.

− →v ∈ V3 .  The following properties are valid for the scalar product with − → u . − → − → − → − → Proposition 4.7 Let −→ u . and − → w ∈ V3 and λ ∈ R. the one vector is the null vector − →o . If a module is 0. cos φ = 0 and the scalar product is 0. their angle φ = 2 . The scalar product is an operator that associates a scalar to two vectors (V3 × V3 → R) according to the following formula: − →u−→v = ||−→u ||||− → v || cos φ. The following proposition addresses this question.) Let − →u .−→v ∈ V3 . The scalar product of these two vectors is equal − → − → to 0 ( u v = 0) if and only if they are perpendicular. 2?”.  π If the vectors are perpendicular. every vector is perpendicular to the null vector. A natural question would be “How does the latter definition relate to that given in Chap. the vectors are perpendicular. Thus the vectors are perpendicular.− → v ∈ V3 having both an arbitrary starting point O and let us indicate with φ their angle.122 4 Geometric Vectors Definition 4. Since a null vector has an undetermined direction.− → v . with − → u =u 1 i + u 2 j + u 3 k and − → v =v1 i + − → − → v2 j + v3 k . . Proof If −→ u− →v = 0 ⇒ ||− → u ||||− →v || cos φ = 0. Thus.6 Let −→u . If cos φ = 0 ⇒ φ = π2 (+kπ with k ∈ N).19 (Scalar Product (Dot Product). This equality is verified either if at least one of the modules in 0 or if cos φ = 0. Proposition 4. The scalar product is equal to: − → u−v = ||− → → u ||||− → v || cos φ = u 1 v1 + u 2 v2 + u 3 v3 . • commutativity: − → u− − →v = →− − →  v  u− → −  → • homogeneity: λ u v = λ→ → − → u − v =→ u λ− v     • w − associativity: − → → u−→v = − w− → →u − → v → −  →− • w − distributivity with respect to the sum of vectors: − → u +→ v =−w→ w− u +− → → v This is the second time in this book that the term “scalar product is used”.

32 The two vectors − → u = (2. j . The vector product is an operator that associates a vector to two vectors (V3 × V3 → V3 ) and is indicated with −→ u ⊗− →v . − →u− →v = u 1 v1 + u 2 v2 + u 3 v3 .g.20 (Vector Product (Cross Product). These two vectors are not perpendicular. 5. These two vectors are perpendicular.31 The two vectors − → u = (1. 1) have scalar product − → u−→ v = 1 (0) + 5 (6) − 3 (1) = 27. Thus.− →v ∈ V3 having both an arbitrary starting point O and let us indicate with φ their angle.4 Products of Vectors 123 Proof The scalar product can be expressed in the following way. i j . the two scalar products defined in this book are homonyms because they are the same concept from different perspectives. Definition 4.  In other words. is equal to 1 because the vectors are parallel (φ = 0 ⇒ cos φ = 1) and the module is unitary. is equal to 0 because the vectors are perpendicular.4. e. the scalar product of a basis − →− → vector by itself. 6. k } is taken orthonormal. e. −3) − → v = (1. Example 4.  −→ − → →  − − → − → → − − → u−→ v = u1 i + u2 j + u3 k v1 i + v2 j + v3 k = − →− → − →− → − →− → = (u 1 v1 ) i i + (u 1 v2 ) i j + (u 1 v3 ) i k + − →− → − →− → − →− → + (u 2 v1 ) j i + (u 2 v2 ) j j + (u 2 v3 ) j k + − →− → − →−→ − →− → (u 3 v1 ) k i + (u 3 v2 ) k j + (u 3 v3 ) k k − → −→ − → Considering that { i . The resulting vector has module according to the following formula: ||− → v || = ||− u ⊗− → → u ||||− → v || sin φ. while the scalar product of a a basis vector by − →− → another basis vector. . Example 4. 5.) Let − → u . −3) − → v = (0. 2. i i . 4) have scalar product − → u−→ v = 2 (1) + 5 (2) − 3 (4) = 0.g.
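The scalar products of Examples 4.31 and 4.32 can be verified with the coordinate formula u v = u1 v1 + u2 v2 + u3 v3. A minimal sketch, not part of the original text, assuming NumPy:

    import numpy as np

    u = np.array([1, 5, -3])
    v = np.array([0, 6, 1])
    print(u @ v)            # 27, as in Example 4.31

    u = np.array([2, 5, -3])
    v = np.array([1, 2, 4])
    print(u @ v)            # 0: the vectors are perpendicular, as in Example 4.32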

v is equal to o if and only if the two vectors are parallel. The vector product of the vectors − → − → − → u .9 Let − → u . If one of the vector is the null vector − →o then the vectors are parallel. If sin φ = 0 ⇒ φ = 0 and the vectors are parallel.  − → − → →  − − → − → → − − → u ⊗− → v = u 1 i + u 2 j + u 3 k ⊗ v1 i + v2 j + v3 k = − → − → − → − → − → − → = (u 1 v1 ) i ⊗ i + (u 1 v2 ) i ⊗ j + (u 1 v3 ) i ⊗ k + − → − → − → − → − → − → + (u 2 v1 ) j ⊗ i + (u 2 v2 ) j ⊗ j + (u 2 v3 ) j ⊗ k + − → − → − → − → − → − → (u 3 v1 ) k ⊗ i + (u 3 v2 ) k ⊗ j + (u 3 v3 ) k ⊗ k . ⎛−→ − → →⎞ − i j k A = ⎝ u1 u2 u3 ⎠ . The vector product :− → u ⊗− → v is equal to the (symbolic) determinant of the matrix A where. The orientation of − → − → u ⊗ v is given by the so-called right-hand-rule.8 Let − → u .  The following properties for the vector product are valid.  If the vectors are parallel φ = 0 ⇒ sin φ = 0. graphically represented below. Proof The vector product is equal to the null vector −→ o either if one of the two vectors is the null vector or if the sin φ = 0. • anticommutativity: − → − →v = −− →v ⊗ − →  −→ u ⊗− → − → − → u • homogeneity: λ u ⊗ v = u ⊗ λ v → −  → − w ⊗ − • distributivity with respect to the sum of vectors: − → u +→ v =−w ⊗→ u + − → w ⊗ v − → Proposition 4. Thus the vector product is equal to 0.−→v ∈ V3 and φ their angle.−→v ∈ V3 and φ their angle. Proposition 4.124 4 Geometric Vectors The direction of − → u ⊗− →v is perpendicular to that of −→ u and −→v . v1 v2 v3 Proof The vector product of − → u by − → v is calculated in the following way.

k } is orthonormal. i ⊗ i . v = 2− These vectors are parallel since − → → u . 4 10 2 Example 4.33 Let us consider the following two vectors − → u = (2. Since { i .g. 5. is equal to − →o since − → − → − → the vectors are parallel. j .34 Let us consider the following two vectors: − → u = (4. . 1) − → v = (4. Example 4.  The det (A) is said symbolic because it is based on a matrix composed of hetero- geneous elements (instead of being composed of only numbers). e. the vector product of two vectors composing the basis is the third vector. 2) . Hence. −2) − →v = (1. 10. for anticommutativity − → − → − → j ⊗ i =−k − → − → − → k ⊗ j =−i − → − → − → i ⊗ k =− j . Let us check their parallelism by cross product ⎛− → − → →⎞ − i j k − → − → − → − → − → − → − − → − → v ⊗ u = det ⎝ 2 → 5 1 ⎠ = 10 i + 4 j + 20 k − 20 k − 4 j − 10 i = o .4. the equation becomes: − → − → − → − → u ⊗− → v = (u 1 v2 ) k − (u 1 v3 ) j − (u 2 v1 ) k + − → − → − → + (u 2 v3 ) i + (u 3 v1 ) j − (u 3 v2 ) i = − → − → − → = (u 2 v3 ) i + (u 3 v1 ) j + (u 1 v2 ) k + − → − → − → − (u 3 v2 ) i − (u 1 v3 ) j − (u 2 v1 ) k = = det (A) . 2) . 1.4 Products of Vectors 125 − → − → The vector product of a basis vector by itself. 0. we obtain: − → − → − → i ⊗ j = k − → − → − → j ⊗ k = i − → − → − → k ⊗ i = j and. Thus.

Proof If the tree vectors are coplanar. Thus the scalar product between them is equal to 0. the − → − → vector product u ⊗ v is a vector perpendicular to both and to the plane that contains them.126 4 Geometric Vectors The scalar product of these two vectors is 4 (1) + 1 (0) + 2 (−2) = 0.e. 2 0 2 Definition 4. ).−→w ∈ − → − → V3 having components u = (u 1 . k } be an orthonormal basis. in the basis B. −→ u . The mixed product is an operator that associates a scalar to three vectors (V3 × V3 × V3 → R) and is defined as the scalar product of one of the three vectors by the vector product of the other two vectors: − → → v − u ⊗− → w Proposition 4. i.21 (Mixed Product (Triple Product). respectively. → v 1 2 3 v and w 1 w 2 w3 ).11 Let B = { i .−→ v . By definition of vector product. Since u ⊗ v is perpendicular to the three vectors − →u .10 Let − → u . u 3 ). The three vectors are coplanar if and only if their mixed product − →u ⊗− →v − → w is equal to 0. If we calculate the vector product we obtain ⎛− → − →− →⎞ i j k − → − → − → det ⎝ 4 1 −2 ⎠ = 2 i − 12 j − 2 k . The vectors are perpendicular. . → − →  If the mixed product − u ⊗→ v − w and − w is equal to 0.− → w are coplanar. Thus. − → = (w . v = (v . there is only one plane that contains all of them.− →v . the mixed product is equal to 0. − → →u ⊗−→v are perpendicular. The mixed product − u ⊗− →v − → w = det (A) where the matrix A is: ⎛ ⎞ u1 u2 u3 A = ⎝ v1 v2 v3 ⎠ .−→ w ∈ V3 having all an arbitrary starting point O.− →v . Let − →u . w1 w2 w3 Proof The vector product is equal to ⎛−→ − →− →⎞ i j k − → − → u ⊗ v = det ⎝ u 1 u2 u3 ⎠ = v1 v2 v3 . −→u ⊗− →v is perpendicular to both −→u and − →v and to − → − → the plane determined by them.  − → −→ − → Proposition 4. − w and − → →u ⊗− →v are perpen- dicular. j .−→ w ∈ V3 having all an arbitrary starting point O.− →v . . − → u and − →v are also coplanar.) Let − →u .−→w .− →v . Since − →w belongs to the same plane. Thus. u 2 .
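The fact that u ⊗ v is perpendicular to both u and v, used in the argument above, as well as the parallelism test of Example 4.33, can be confirmed numerically. A minimal sketch, not part of the original text, assuming NumPy; the non-parallel pair below is chosen only for illustration.

    import numpy as np

    # Example 4.33: v = 2u, so the vector product is the null vector.
    u = np.array([2, 5, 1])
    v = np.array([4, 10, 2])
    print(np.cross(u, v))          # [0 0 0]: the vectors are parallel

    # For a non-parallel pair, the cross product is perpendicular to both factors.
    a = np.array([4, 1, -2])
    b = np.array([1, 0, 2])
    c = np.cross(a, b)
    print(c, a @ c, b @ c)         # both dot products are 0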

u ⊗ v = det ( u2 u3 ; v2 v3 ) i − det ( u1 u3 ; v1 v3 ) j + det ( u1 u2 ; v1 v2 ) k

because of the I Laplace Theorem. The mixed product is then obtained by calculating the scalar product between w and u ⊗ v:

u ⊗ v w = det ( u2 u3 ; v2 v3 ) w1 − det ( u1 u3 ; v1 v3 ) w2 + det ( u1 u2 ; v1 v2 ) w3 = det (A)

for the I Laplace Theorem.

Example 4.35 The following three vectors are coplanar since w = 2u + 3v:

(1, 2, 1)
(0, 4, 2)
(2, 16, 8).

Let us check that the vectors are coplanar by verifying that the matrix associated to these vectors is singular:

det ( 1 2 1 ; 0 4 2 ; 2 16 8 ) = 0.

This result could have been seen immediately by considering that the third row is a linear combination of the first two.

Example 4.36 The following three vectors are coplanar since two of these vectors are parallel (hence only two directions are under consideration):

(2, 2, 1)
(6, 6, 3)
(5, 1, 2).

Let us check that the vectors are coplanar by verifying that the matrix associated to these vectors is singular:

det ( 2 2 1 ; 6 6 3 ; 5 1 2 ) = 0.

This result could have been seen immediately by considering that the second row is the first row multiplied by 3.
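The determinant expression of the mixed product can be reproduced numerically, for instance on the vectors of Example 4.35. A minimal sketch, not part of the original text, assuming NumPy:

    import numpy as np

    # Example 4.35: w = 2u + 3v, so the mixed product (the determinant) vanishes.
    u = np.array([1, 2, 1])
    v = np.array([0, 4, 2])
    w = np.array([2, 16, 8])
    print(np.linalg.det(np.array([u, v, w])))   # 0 (up to rounding)
    print(u @ np.cross(v, w))                   # 0: the same mixed product, computed as a triple product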

−2) 4. and − → w are linearly independent and. if possible. −3) − → t = (2. 0. the values of h that make − → u parallel to − → v. if they exist. v . 4) − → w = (1.1 Determine. j . Determine whether or not − → u . Determine whether or not −→ u . and − → w are − → linearly dependent and. 0. −3) − → − → − → in the orthonormal basis { i . 4.128 4 Geometric Vectors 4. if they exists. −4) . 2) − →v = (1. express v as a linear combination of the other two by means of non-null scalars (hint: express − →v = λ− → u + μw and find λ and μ). −3. w }.3 Let us consider the following vectors − → u = (3.4 Compute scalar and vector product of the following two vectors: − →u = (2.2 Determine. − → − → − → − → find a new basis of t in { u .− → v .5 Exercises 4. − → e1 + (2h − 1) − u = (3h − 5) − → → e2 + 3− → e3 − → − → − → v = e − e + 3e − → 1 2 3 4. . 2) − → v = (1. 0. − → u = 2−e1 − − → → e2 + 3− →e3 − → − → − → − → v = e1 + e2 − 2 e3 w = h− − → → e −−1 → e + (h − 1) − 2 →e 3 4. 2) − → v = (4. 0. 4. the values of h that make − → u . k }. −2) − →w = (1. 1. −6.− →v . 1.− →v . and − → w coplanar. if possible.5 Let us consider the following vectors − → u = (2.

4. −1) − → w = (1. −3. 0.6 Let us consider the following vectors − → u = (2. 2) − → v = (3.5 Exercises 129 4. Determine whether or not the vectors are linearly independent. 0. . 2) .

Chapter 5
Complex Numbers and Polynomials

5.1 Complex Numbers

As mentioned in Chap. 1, for a given set and an operator applied to its elements, if the result of the operation is still an element of the set regardless of the input of the operator, then the set is said closed with respect to that operator. For example, it is easy to verify that R is closed with respect to the sum as the sum of two real numbers is certainly a real number. On the other hand, if a square root of a negative number has to be calculated, the result is not determined and is not a real number: R is not closed with respect to the square root operation.

In order to represent these numbers, Gerolamo Cardano in the XVI century introduced the concept of imaginary numbers, see [11], by defining the imaginary unit j as the square root of −1: j = √−1. This means that the square roots of negative numbers can be represented. Imaginary numbers compose a set of numbers represented by the symbol I.

Example 5.1 √−9 = j3.

The basic arithmetic operations can be applied to imaginary numbers:
• sum: ja + jb = j (a + b)
• difference: ja − jb = j (a − b)
• product: ja jb = −ab
• division: ja/jb = a/b

Example 5.2 Let us consider the imaginary numbers j2 and j5. It follows that

j2 + j5 = j7
j2 − j5 = −j3
j2 j5 = −10
j2/j5 = 2/5.

respectively. Furthermore. Since 0 j = j0 = 0. If z 1 = a + jb and z 2 = c + jd. the zero has an interesting role. the set of complex numbers C contains numbers that are the sum of real and imaginary parts. the projections of the point on the real and imaginary axes. In addition. The representation of a complex number in a Gaussian plan must not be confused with the representation of a point in R2 .132 5 Complex Numbers and Polynomials It can be observed that while the set  is closed with respect to sum and difference operations. • sum: z 1 + z 2 = a + c + j (c + d) • product: z 1 z 2 = (a + jb) (c + jd) = ac + jad + jbc − bd = ac − bd + j (ad + bc) • division: zz21 = a+ jb c+ jd = (a+ jb)(c− jd) (c+ jd)(c− jd) = ac+bd+ j(bc−ad) c2 +d 2 . a is the real part of the complex number while jb is its imaginary part. the basic arithmetic operations can be applied to complex numbers. Example 5. Although in both cases there is a bijection between the set and the points of the plane. i. Definition 5. the zero is both a real and imaginary number.e. it is not closed with respect to product and division. For example. while the set R2 is the Cartesian product R × R. the product of two imaginary numbers is a real number. Complex numbers can be graphically represented as points in the so called Gaussian plane where real an imaginary parts are.3 The number a + jb = 3 + j2 is a complex number. it can be seen as the intersection of the two sets. The set of complex numbers is indicated with C.1 A complex number is a number that can be expressed as z = a + jb where a and b are real numbers and j is the imaginary unit.

• a= z+ż 2 • b= z−ż 2 . 1. In a similar way we can define a field of imaginary numbers.6 Let us consider the following conjugate complex numbers z = 3 + j2 and ż = 3 − j2. We know that the field of real number is the set R with its sum and product operations. z 1 + z 2 = 6 − j2 z 1 z 2 = 8 − j20 + j6 + 15 = 23 − j14 z1 z2 = 2− 4+ j3 j5 (4+ j3)(2+ j5) = (2− j5)(2+ j5) = −7+ 29 j26 .2 Let z = a + jb be a complex number. From the division of complex numbers. and division. we have defined the operations of sum and product over the set of complex numbers C. The following basic arithmetic operations can be defined for a complex number and its conjugate.1 Complex Numbers 133 Example 5. 2 + j2 8 8 8 Definition 5. The complex number a − jb is said conjugate of z and is indicated with ż.5 1 2 − j2 1 1 = = −j . In addition. . An important characterization of complex numbers can be done on the basis of the definitions of ordered set and field given in Chap. product.5. • sum: z + ż = a + jb + a − jb = 2a • difference: z − ż = a + jb − a + jb = j2b • product: z ż = (a + jb) (a − jb) = a 2 − jab + jab − j 2 b2 = a 2 + b2 Example 5.4 Let us consider the complex numbers z 1 = 4 + j3 and z 2 = 2 − j5l Let us compute their sum. From the first basic arithmetic operations we can extract that if z = a + jb. z a + jb (a + jb) (a − jb) a + b2 Example 5. It can easily be verified that the field properties are valid for sum and product over complex numbers. the inverse of a complex number z = a + jb can be easily verified as 1 1 a − jb a − jb = = = 2 . It follows that z + ż = 6 z − ż = j4 z ż = 9 + 5 = 15.
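These operations are directly available in Python, where complex literals use the suffix j for the imaginary unit. A minimal sketch, not part of the original text, reproducing Example 5.4 and the conjugate identities:

    z1 = 4 + 3j
    z2 = 2 - 5j
    print(z1 + z2)                     # (6-2j)
    print(z1 * z2)                     # (23-14j)
    print(z1 / z2)                     # approximately -0.241 + 0.897j, i.e. (-7 + j26)/29
    print(z1.conjugate())              # (4-3j)
    print((z1 + z1.conjugate()) / 2)   # (4+0j): the real part a
    print((z1 - z1.conjugate()) / 2)   # 3j: the imaginary part jb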

The real field R is totally ordered, i.e. the following properties are valid:
• ∀x1, x2 ∈ R with x1 ≠ x2: either x1 ≤ x2 or x2 ≤ x1
• ∀x1, x2 ∈ R: if x1 ≤ x2 and x2 ≤ x1 then x1 = x2
• ∀x1, x2, x3 ∈ R: if x1 ≤ x2 and x2 ≤ x3 then x1 ≤ x3
• ∀x1, x2, c ∈ R with c > 0: if x1 ≤ x2 then x1 + c ≤ x2 + c
• ∀x1, x2 ∈ R with x1 > 0 and x2 > 0: x1 x2 > 0.

Proposition 5.1 The imaginary field I is not totally ordered.

Proof Let us prove that the property ∀x1, x2 ∈ I with x1 > 0 and x2 > 0: x1 x2 > 0 is not valid in the imaginary field. Let us consider x1, x2 ∈ I with x1 = jb and x2 = jd, where b > 0 and d > 0. Then x1 x2 = j² bd with bd > 0, i.e. x1 x2 = −bd < 0. Since one of the total order requirements is not respected, the imaginary field is not totally ordered. □

It follows that the complex field is not totally ordered either. As an intuitive explanation of this fact, two complex numbers cannot in general be sorted, in the same way as there is no explicit criterion to sort two points in a plane.

A complex number can have an equivalent representation using a system of polar coordinates. The representation of a complex number as z = a + jb is said to be in rectangular coordinates. More specifically, from a complex number z = a + jb we can represent the same number in terms of radius (or module) ρ and phase θ, and indicate it as (ρ, ∠θ), where
ρ = √(a² + b²)
θ = arctan(b/a) if a > 0
θ = arctan(b/a) + π if a < 0.

From simple trigonometric considerations on the geometric representation of a complex number we can derive that
a = ρ cos θ
b = ρ sin θ.
Thus, we can represent a complex number as
z = a + jb = ρ (cos θ + j sin θ).

Example 5.7 Let us consider the two following complex numbers: z1 = 2 + j3 and z2 = −4 + j8. Let us represent these two numbers in polar coordinates. As for z1, the radius is ρ1 = √(2² + 3²) = √13 and the phase is θ1 = arctan(3/2) = 56.3°. As for z2, the radius is ρ2 = √((−4)² + 8²) = √80 and the phase is θ2 = arctan(8/(−4)) + 180° = 116.6°. Let us now compute the sum and the product of these two complex numbers. The sum is z1 + z2 = −2 + j11. The product can be computed from the polar representation of the numbers: z1 z2 = (√13 √80, ∠172.9°).

Let us consider two complex numbers z1 = ρ1 (cos θ1 + j sin θ1) and z2 = ρ2 (cos θ2 + j sin θ2) and compute their product:
z1 z2 = ρ1 (cos θ1 + j sin θ1) ρ2 (cos θ2 + j sin θ2) =
= ρ1 ρ2 (cos θ1 cos θ2 + j cos θ1 sin θ2 + j sin θ1 cos θ2 − sin θ1 sin θ2) =
= ρ1 ρ2 (cos (θ1 + θ2) + j sin (θ1 + θ2)).
It is here reminded that cos(α + β) = cos α cos β − sin α sin β and sin(α + β) = sin α cos β + cos α sin β.

This means that if two complex numbers are represented in polar coordinates, in order to compute their product it is enough to calculate the product of their modules and to sum their phases.

Example 5.8 Let us consider the following complex numbers z1 = (5, ∠30°) and z2 = (2, ∠45°). It follows that z1 z2 = (10, ∠75°) = 10 (cos(75°) + j sin(75°)).

From the product of two complex numbers in polar coordinates, it immediately follows that the nth power of a complex number is given by
zⁿ = ρⁿ (cos(nθ) + j sin(nθ)).
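The conversion between rectangular and polar coordinates, and the "multiply the modules, sum the phases" rule, can be verified on the numbers of Example 5.7 with a short Python sketch (illustrative only; cmath.polar returns phases in radians):

    import cmath, math

    z1 = 2 + 3j
    z2 = -4 + 8j

    rho1, theta1 = cmath.polar(z1)
    rho2, theta2 = cmath.polar(z2)
    print(rho1, math.degrees(theta1))   # ~3.606 = sqrt(13), ~56.3 degrees
    print(rho2, math.degrees(theta2))   # ~8.944 = sqrt(80), ~116.6 degrees

    # product computed in polar coordinates: product of modules, sum of phases
    prod_polar = cmath.rect(rho1 * rho2, theta1 + theta2)
    assert abs(prod_polar - z1 * z2) < 1e-9
    print(z1 + z2, z1 * z2)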

Example 5.9 Let us consider the complex number z = 2 + j2. The complex number z in polar coordinates is z = (√8, ∠45°). Let us calculate
z⁴ = 64 (cos(180°) + j sin(180°)) = −64.

From this formula, the nth root can be derived. More specifically, let us suppose that (z1)ⁿ = z2. If z1 = ρ1 (cos θ1 + j sin θ1) and z2 = ρ2 (cos θ2 + j sin θ2), then
z2 = (ρ1 (cos θ1 + j sin θ1))ⁿ ⇒ ρ2 (cos θ2 + j sin θ2) = ρ1ⁿ (cos(nθ1) + j sin(nθ1)).
From these formulas we can write
ρ2 = ρ1ⁿ, cos θ2 = cos(nθ1), sin θ2 = sin(nθ1)
⇒ ρ1 = ⁿ√ρ2, θ1 = (θ2 + 2kπ)/n,
where k ∈ N. Thus, the formula for the nth root is
ⁿ√z = ⁿ√ρ2 (cos((θ2 + 2kπ)/n) + j sin((θ2 + 2kπ)/n)).
In a more compact way and neglecting 2kπ, if z = ρ (cos θ + j sin θ), then
ⁿ√z = ⁿ√ρ (cos(θ/n) + j sin(θ/n)).

Example 5.10 Let us consider the complex number z = (8, ∠45°) and calculate
³√z = (2, ∠15°) = 2 (cos(15°) + j sin(15°)).

An alternative (and equivalent) formulation of the nth power of a complex number is the so-called De Moivre formula.

Theorem 5.1 (De Moivre's Formula) For every real number θ ∈ R and integer n ∈ N,
(cos θ + j sin θ)ⁿ = cos(nθ) + j sin(nθ).

As a remark, the De Moivre's formula is anterior with respect to the Euler's formula, see [12]. Nonetheless, it can be seen as an extension of the Euler's formula and, thus, may appear as its logical consequence; this is not, however, the original proof.

Finally, the following theorem broadens the interpretation of the concept of complex numbers.

Theorem 5.2 (Euler's Formula) For every real number θ ∈ R,
e^(jθ) = cos θ + j sin θ,
where e is the Euler's number 2.71828..., base of the natural logarithm.

A proof of the Euler's Formula is reported in Appendix B. The Euler formula is an important result that allows to connect exponential functions, sinusoidal functions, and complex numbers by means of their polar representation, see [13].

Example 5.11 For θ = 45° = π/4,
e^(jπ/4) = cos 45° + j sin 45° = √2/2 + j √2/2.

Proposition 5.2 Let z = ρ e^(jθ) = ρ (cos θ + j sin θ) be a complex number. It follows that
jz = ρ (cos(θ + 90°) + j sin(θ + 90°)) = ρ e^(j(θ+90°)).

Proof If we multiply z by j we obtain
jz = jρ e^(jθ) = jρ (cos θ + j sin θ) = ρ (j cos θ − sin θ) = ρ (− sin θ + j cos θ) =
= ρ (cos(θ + 90°) + j sin(θ + 90°)) = ρ e^(j(θ+90°)). □

This means that the multiplication of a complex number by the imaginary unit j can be interpreted as a 90° rotation of the complex number within the Gaussian plane.

Example 5.12 Let us consider the complex number z = 2 + j2 and multiply it by j: jz = j2 − 2 = −2 + j2.

Finally, if we calculate Euler's formula for θ = π we obtain
e^(jπ) + 1 = 0,
that is the so-called Euler's identity. This equation is historically considered as an example of mathematical beauty, as it contains the basic numbers of mathematics, i.e. 0, 1, e, j, and π, all these elements appearing only once in the equation, as well as the basic operations: sum, multiplication, and exponentiation.

Example 5.13 Let us consider the complex number z = 5∠15°, which can be written as 5 (cos 15° + j sin 15°). Let us multiply this number by j:
jz = j5 (cos 15° + j sin 15°) = 5 (− sin 15° + j cos 15°) = 5 (cos 105° + j sin 105°) =
= 5 (cos(90° + 15°) + j sin(90° + 15°)) = 5 e^(j(15°+90°)).
In the Gaussian plane this means a 90° rotation of the number.
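The identities of the last two pages can be checked numerically. The following short Python sketch (illustrative only) evaluates Euler's identity, Euler's formula at 45°, and the 90° rotation of Example 5.12:

    import cmath, math

    print(cmath.exp(1j * math.pi) + 1)     # ~0, up to rounding error (Euler's identity)
    print(cmath.exp(1j * math.pi / 4))     # ~0.7071 + 0.7071j = cos 45 + j sin 45
    print(1j * (2 + 2j))                   # (-2+2j): multiplication by j as a 90 degree rotation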

5.2 Complex Polynomials

5.2.1 Operations of Polynomials

Definition 5.3 Let n ∈ N and a0, a1, ..., an ∈ C. The function p(z) in the complex variable z ∈ C defined as
p(z) = a0 + a1 z + a2 z² + ... + an zⁿ = Σ_{k=0}^{n} ak z^k
is said complex polynomial in the coefficients ak and complex variable z. The order n of the polynomial is the maximum value of k corresponding to a non-null coefficient ak.

Example 5.14 The following function p(z) = 4z⁴ − 5z³ + z² − 6 is a polynomial.

Definition 5.4 Let p(z) = Σ_{k=0}^{n} ak z^k be a polynomial. If ∀k ∈ N with k ≤ n: ak = 0, the polynomial is said null polynomial.

Definition 5.5 Let p(z) = Σ_{k=0}^{n} ak z^k be a polynomial. If ∀k ∈ N with 0 < k ≤ n: ak = 0 and a0 ≠ 0, the polynomial is said constant polynomial.

Definition 5.6 (Identity Principle) Let p1(z) = Σ_{k=0}^{n} ak z^k and p2(z) = Σ_{k=0}^{n} bk z^k be two complex polynomials. The two polynomials are said identical, p1(z) = p2(z), if and only if the following two conditions are both satisfied:
• the order n of the two polynomials is the same
• ∀k ∈ N with k ≤ n: ak = bk.

Proposition 5.3 Let p1(z) = Σ_{k=0}^{n} ak z^k and p2(z) = Σ_{k=0}^{m} bk z^k be two complex polynomials with m < n. The two polynomials are identical if and only if
• ∀k ∈ N with k ≤ m: ak = bk
• ∀k ∈ N with m < k ≤ n: ak = 0.

Definition 5.7 Let p1(z) = Σ_{k=0}^{n} ak z^k and p2(z) = Σ_{k=0}^{m} bk z^k be two polynomials of orders n and m, respectively. The sum polynomial is the polynomial p3(z) = Σ_{k=0}^{n} ak z^k + Σ_{k=0}^{m} bk z^k.

For the order of the sum polynomial p3(z) = p1(z) + p2(z) the following holds:
• if m ≠ n, the order of p3(z) is the greatest among n and m;
• if m = n, the order of p3(z) is ≤ n.

Example 5.16 Let us consider the polynomials
p1(z) = z³ − 2z
p2(z) = 2z³ + 4z² + 2z + 2.
The sum polynomial is p3(z) = 3z³ + 4z² + 2. Hence, the sum polynomial is of order 3.

Example 5.17 To clarify the meaning of this statement, let us consider the polynomials p1(z) = 5z³ + 3z − 2 and p2(z) = 4z² + z + 8. It is obvious that the sum polynomial is of the same order as the greatest among the orders of the two polynomials, i.e. 3. On the other hand, if we consider two polynomials of the same order, such as p1(z) = 5z³ + 3z − 2 and p2(z) = −5z³ + z + 8, their sum results into a polynomial of the first order. The sum polynomial can thus have a lower order with respect to the starting ones.

Definition 5.8 Let p1(z) = Σ_{k=0}^{n} ak z^k and p2(z) = Σ_{k=0}^{m} bk z^k be two polynomials of orders n and m, respectively. The product polynomial is the polynomial p3(z) = (Σ_{k=0}^{n} ak z^k)(Σ_{k=0}^{m} bk z^k).

The order of the product polynomial p3(z) = p1(z) p2(z) is n + m.

Example 5.18 The following two polynomials
p1(z) = z² − 2z
p2(z) = 2z + 2
are of order 2 and 1, respectively. The product polynomial p3(z) = 2z³ + 2z² − 4z² − 4z = 2z³ − 2z² − 4z is of order 2 + 1 = 3.

Theorem 5.3 (Euclidean Division) Let p1(z) = Σ_{k=0}^{n} ak z^k and p2(z) = Σ_{k=0}^{m} bk z^k be two polynomials of orders n and m, respectively, with p2(z) ≠ 0. The division of the polynomial p1(z) (dividend) by p2(z) (divisor) results into
p3(z) = p1(z)/p2(z) = q(z) + r(z)/p2(z),
which can be rewritten as
p1(z) = p2(z) q(z) + r(z),
where q(z) is said polynomial quotient and r(z) is said polynomial remainder. The order r of the polynomial remainder is strictly less than the order m of the divisor p2(z): r < m.
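The sum, product, and Euclidean division of polynomials can be reproduced with NumPy's legacy polynomial helpers, where coefficients are listed from the highest power down. The following sketch (an added illustration, not the book's own material) checks Examples 5.16 and 5.18 and one sample division:

    import numpy as np

    p1 = np.array([1, 0, -2, 0])          # z^3 - 2z
    p2 = np.array([2, 4, 2, 2])           # 2z^3 + 4z^2 + 2z + 2
    print(np.polyadd(p1, p2))             # [3 4 0 2] -> 3z^3 + 4z^2 + 2   (Example 5.16)

    q1 = np.array([1, -2, 0])             # z^2 - 2z
    q2 = np.array([2, 2])                 # 2z + 2
    print(np.polymul(q1, q2))             # [2 -2 -4 0] -> 2z^3 - 2z^2 - 4z (Example 5.18)

    # Euclidean division of z^2 - z + 5 by z - 4: quotient z + 3, remainder 17
    quot, rem = np.polydiv(np.array([1, -1, 5]), np.array([1, -4]))
    print(quot, rem)                      # [1. 3.] [17.]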

Example 5.19 Let us consider the following polynomials:
p1(z) = z² − z + 5
p2(z) = z − 4.
We have that the order n of p1(z) is 2 and the order m of p2(z) is 1. It follows that p1(z) = p2(z) q(z) + r(z), where
q(z) = z + 3
r(z) = 17.

The following theorem shows that polynomial quotient and remainder are unique for a given pair of polynomials p1(z) and p2(z).

Theorem 5.4 (Uniqueness of Polynomial Quotient and Remainder) Let p1(z) = Σ_{k=0}^{n} ak z^k and p2(z) = Σ_{k=0}^{m} bk z^k be two complex polynomials with m < n. Then ∃! complex polynomial q(z) and ∃! complex polynomial r(z) having order r < m such that p1(z) = p2(z) q(z) + r(z).

Proof By contradiction, let us assume that two pairs of complex polynomials q(z), r(z) and q0(z), r0(z) exist such that
p1(z) = p2(z) q(z) + r(z)
p1(z) = p2(z) q0(z) + r0(z),
where the orders of r(z) and r0(z) are both < m. Thus, the following equality is verified:
0 = p2(z) (q(z) − q0(z)) + (r(z) − r0(z)) ⇒ r0(z) − r(z) = p2(z) (q(z) − q0(z)).
From the hypothesis we know that the order of p2(z) is m. Let us name l the order of (q(z) − q0(z)). The order of p2(z)(q(z) − q0(z)) is m + l ≥ m. Since the order of r0(z) − r(z) can be at most m − 1, the equation above violates the identity principle of two polynomials. Thus, we have reached a contradiction: the polynomial quotient and remainder must be unique. □

Example 5.20 Let us consider a special case of Theorem 5.4, where the order of p1(z) is n while the order of p2(z) is 1, i.e. p2(z) = (z − α). From Theorem 5.4 we know that ∃! q(z) and ∃! r(z) such that p1(z) = (z − α) q(z) + r(z). More specifically, the order m of p2(z) is 1 and the order r of the remainder is < m.

Thus, the order of the polynomial r(z) is 0, i.e. the polynomial remainder r(z) is either a constant or the null polynomial. To highlight that the polynomial remainder is a constant, let us indicate it with r. In the case p2(z) = (z − α), the Euclidean division is
p(z) = (z − α) q(z) + r.

Definition 5.9 Let p1(z) and p2(z) be two complex polynomials. The polynomial p1(z) is said to be divisible by p2(z) if ∃! polynomial q(z) such that p1(z) = p2(z) q(z) (with r(z) = 0 ∀z). In particular, p1(z) is divisible by (z − α) if ∃! polynomial q(z) such that p1(z) = (z − α) q(z).

It must be observed that the null polynomial is divisible by all polynomials, while all polynomials are divisible by a constant polynomial.

Theorem 5.5 (Polynomial Remainder Theorem or little Bézout's Theorem) Let p(z) = Σ_{k=0}^{n} ak z^k be a complex polynomial having order n ≥ 1. The polynomial remainder of the division of p(z) by (z − α) is r(z) = p(α).

Proof From the Euclidean division in Theorem 5.3 we know that p(z) = (z − α) q(z) + r(z), with the order of r(z) less than the order of (z − α), that is 1. Hence, the polynomial remainder r(z) has order 0, i.e. it is a constant r. Let us calculate the polynomial p(z) in α:
p(α) = (α − α) q(α) + r = r.
Hence, r = p(α). □

5.2.2 Roots of Polynomials

Definition 5.10 Let p(z) be a polynomial. The values of z such that p(z) = 0 are said roots or solutions of the polynomial.

Corollary 5.1 (Ruffini's Theorem) Let p(z) = Σ_{k=0}^{n} ak z^k be a complex polynomial having order n ≥ 1. The polynomial p(z) is divisible by (z − α) if and only if p(α) = 0 (α is a root of the polynomial).

Proof If p(z) is divisible by (z − α), then we may write p(z) = (z − α) q(z).

Thus, for z = α we have p(α) = (α − α) q(α) = 0.
If α is a root of the polynomial, then p(α) = 0. Considering that p(z) = (z − α) q(z) + r(z) and, for the little Bézout's Theorem, p(α) = r, it follows that r = 0 and that p(z) = (z − α) q(z), that is p(z) is divisible by (z − α). □

Example 5.21 Let us consider the division of polynomials
(−z⁴ + 3z² − 5)/(z + 2).
It can be easily verified that the polynomial remainder of this division is r = p(−2) = −9. On the contrary, in the case of the division of
(−z⁴ + 3z² + 4)/(z + 2)
we obtain r = p(−2) = 0. In the latter case the two polynomials are divisible.

A practical implication of the Polynomial Remainder and Ruffini's Theorems is the so-called Ruffini's rule, that is an algorithm for dividing a polynomial p(z) = Σ_{k=0}^{n} ak z^k by a first order polynomial (z − α). The algorithm consists of the following steps. At the beginning the coefficients are arranged in a row, with α written at the left of a second row:
    an  an−1  ...  a1  a0
α

and the coefficient corresponding to the maximum power in the polynomial, an, is initialized in the second row. Let us rename it as bn−1, as it is the coefficient of the maximum power of q(z):
    an        an−1  ...  a1  a0
α   bn−1 = an
From this point, each coefficient bk of q(z) can be recursively calculated as bk = ak+1 + bk+1 α for k = n − 2, ..., 0:
    an        an−1               ...  a1  a0
α   bn−1 = an  bn−2 = an−1 + bn−1 α  ...  b0 = a1 + b1 α
Finally, the remainder r is
r = a0 + b0 α.

Example 5.22 Let us consider the division of the polynomial −z⁴ + 3z² − 5 by (z + 2). By applying Ruffini's rule we obtain
     −1   0   3   0  −5
−2   −1   2  −1   2
Hence the polynomial quotient is −z³ + 2z² − z + 2 and the polynomial remainder, as expected from the Polynomial Remainder Theorem, is r = a0 + b0 α = −9.
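The tableau of Ruffini's rule translates directly into a few lines of code. The following Python sketch (an added illustration) implements the recursion just described and reproduces Example 5.22:

    def ruffini(coeffs, alpha):
        """Divide a polynomial by (z - alpha) with Ruffini's rule.

        coeffs: [a_n, ..., a_1, a_0], highest power first.
        Returns (quotient coefficients, remainder)."""
        b = [coeffs[0]]                        # b_{n-1} = a_n
        for a in coeffs[1:-1]:
            b.append(a + b[-1] * alpha)        # b_k = a_{k+1} + b_{k+1} * alpha
        remainder = coeffs[-1] + b[-1] * alpha  # r = a_0 + b_0 * alpha
        return b, remainder

    # Example 5.22: divide -z^4 + 3z^2 - 5 by (z + 2), i.e. alpha = -2
    q, r = ruffini([-1, 0, 3, 0, -5], -2)
    print(q, r)   # [-1, 2, -1, 2] and -9: quotient -z^3 + 2z^2 - z + 2, remainder -9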

Theorem 5.6 (Fundamental Theorem of Algebra) If p(z) = Σ_{k=0}^{n} ak z^k is a complex polynomial having order n ≥ 1, then this polynomial has at least one root.

A proof of this theorem is given in Appendix B.

As a first observation, since real numbers are special cases of complex numbers, the theorem is valid for real polynomials too. This means that a real polynomial always has at least one root; this root is not necessarily a real number but it could be a complex number, as in the case of x² + 1. Furthermore, for Ruffini's Theorem, if α ∈ C is a root of the polynomial, then p(z) is divisible by (z − α).

In order to give a second interpretation of the Fundamental Theorem of Algebra, let us consider the set of natural numbers N and the following natural polynomial (a polynomial where all the coefficients are natural numbers): 8 − x = 9. Although all the coefficients of this polynomial are natural numbers, the root of this polynomial is not a natural number. This statement can be written as: "the set of natural numbers is not closed with respect to the subtraction operation". For this reason we have the need of "expanding" the set of natural numbers to the set of relative numbers Z.

Now, let us consider the following relative polynomial (a polynomial where all the coefficients are relative numbers): −5x = 3. The root of this polynomial is not a relative number. Thus, we introduce the set of rational numbers Q and we conclude that "the set of relative numbers is not closed with respect to the division operation".

Now, let us consider the following rational polynomial (a polynomial where all the coefficients are rational numbers): x² = 2. The roots of this polynomial are not rational numbers. We can conclude that "the set of rational numbers is not closed with respect to the nth root operation". Thus, we need a further set expansion and we introduce the set of real numbers R.

Finally, let us consider the real polynomial x² = −1. The roots of this polynomial are not real numbers: the set of real numbers is not closed when the square root of a negative number is taken, i.e. also "the set of real numbers is not closed with respect to the nth root operation". In order to solve this equation we had to introduce the set of complex numbers.

The Fundamental Theorem of Algebra guarantees that, if we consider a complex polynomial, it will surely have at least one complex root. Hence, no further set expansion is needed: "the set of complex numbers is closed with respect to the operations of sum (subtraction), multiplication (division), and exponentiation (nth root)". Remembering that a field is essentially a set with its operations, the latter statement can be rewritten according to the equivalent and alternative formulation of the Fundamental Theorem of Algebra.

Theorem 5.7 (Fundamental Theorem of Algebra (Alternative Formulation)) The field of complex numbers is algebraically closed.

An algebraically closed field is a field that contains the roots of every non-constant polynomial.

Definition 5.11 Let p(z) be a complex polynomial and α one of its roots. The root is said single or simple if p(z) is divisible by (z − α) but not by (z − α)².

Definition 5.12 If a polynomial can be expressed as
p(z) = h (z − α1)(z − α2) ... (z − αn),
with h constant and α1 ≠ α2 ≠ ... ≠ αn, the polynomial is said to have n distinct roots α1, α2, ..., αn.

Theorem 5.8 (Theorem on the distinct roots of a polynomial) If p(z) = Σ_{k=0}^{n} ak z^k is a complex polynomial having order n ≥ 1, then this polynomial has at most n distinct solutions.

Proof Let us assume, by contradiction, that the polynomial has n + 1 distinct roots α1, α2, ..., αn+1. Since p(z) is divisible by (z − α1), we can write p(z) = (z − α1) q(z). Let us consider p(α2) = (α2 − α1) q(α2) with α1 ≠ α2. Since α2 is a root of the polynomial, p(α2) = 0. It follows that also q(α2) = 0. If α2 is a root of q(z), then q(z) is divisible by (z − α2), i.e. we can write q(z) = (z − α2) q1(z). Hence,
p(z) = (z − α1)(z − α2) q1(z).
Let us consider the root α3 ≠ α2 ≠ α1. Then p(α3) = 0. It follows that q1(α3) = 0 and we can write q1(z) = (z − α3) q2(z). Hence,
p(z) = (z − α1)(z − α2)(z − α3) q2(z).
If we iterate this procedure we obtain
p(z) = (z − α1)(z − α2) ... (z − αn+1) qn,
with qn constant. We have written an equality between a polynomial of order n (p(z) has order n by hypothesis) and a polynomial of order n + 1, against the identity principle. Hence, we have reached a contradiction. □

Corollary 5.2 If two complex polynomials p1(z) and p2(z) of order n ≥ 1 take the same value in n + 1 points, then the two polynomials are identical.

Example 5.23 Let us consider the following order 2 polynomial: z² + 5z + 4. For the Theorem on the distinct roots of a polynomial, this polynomial cannot have more than 2 distinct roots. In particular, the roots of this polynomial are −1 and −4, and it can be written as
z² + 5z + 4 = (z + 1)(z + 4).

Example 5.24 The polynomial z³ + 2z² − 11z − 12 cannot have more than 3 distinct roots. It can be verified that this polynomial has 3 roots, that is −1, −4, and 3, and the polynomial can be written as
z³ + 2z² − 11z − 12 = (z + 1)(z + 4)(z − 3).

Example 5.25 The polynomial z² + 2z + 5 cannot have more than 2 distinct roots. In this case the roots are not real roots but two distinct complex roots, −1 + j2 and −1 − j2. The polynomial can be written as
z² + 2z + 5 = (z + 1 − j2)(z + 1 + j2).

Example 5.26 Obviously, a polynomial can have both real and complex roots, as in the case of
z³ + z² + 3z − 5 = (z + 1 − j2)(z + 1 + j2)(z − 1).

Finally, the polynomial z⁴ − z³ − 17z² + 21z + 36 is of order 4 and cannot have more than 4 distinct roots. The roots are −1, −4, and 3, as for the polynomial above, but the root 3 is repeated twice. The polynomial can be written as
z⁴ − z³ − 17z² + 21z + 36 = (z + 1)(z + 4)(z − 3)(z − 3).
This situation is explained in the following definition.

Definition 5.13 Let p(z) be a complex polynomial in the variable z. A solution α is said multiple with algebraic multiplicity k ∈ N and k > 1 if p(z) is divisible by (z − α)^k but not divisible by (z − α)^(k+1).

Example 5.27 In the example above, 3 is a solution (or a root) of multiplicity 2, because the polynomial z⁴ − z³ − 17z² + 21z + 36 is divisible by (z − 3)² and not by (z − 3)³.

Theorem 5.9 Let p(z) = Σ_{k=0}^{n} ak z^k be a complex polynomial of order n > 1 in the variable z. If α1, α2, ..., αs are its roots having algebraic multiplicity h1, h2, ..., hs, then
h1 + h2 + ... + hs = n
and
p(z) = an (z − α1)^h1 (z − α2)^h2 ... (z − αs)^hs.

Proof Since the polynomial p(z) has roots α1, α2, ..., αs with multiplicity values h1, h2, ..., hs, we can write that ∃! q(z) such that
p(z) = q(z) (z − α1)^h1 (z − α2)^h2 ... (z − αs)^hs.
At first, we need to prove that q(z) is a constant. Let us assume, by contradiction, that q(z) is of order ≥ 1. For the Fundamental Theorem of Algebra, this polynomial has at least one root α ∈ C, i.e. ∃ q1(z) such that q(z) = (z − α) q1(z). If we substitute in the expression of p(z) we obtain
p(z) = (z − α) q1(z) (z − α1)^h1 (z − α2)^h2 ... (z − αs)^hs.
This means that α is also a root of p(z). Since by hypothesis the roots of p(z) are α1, α2, ..., αs, α must be equal to one of them. Let us consider a generic index i such that α = αi. In this case p(z) would be divisible by (z − αi)^(hi+1). Since this is against the definition of multiplicity of a root, we have reached a contradiction. It follows that q(z) is a constant q and the polynomial is
p(z) = q (z − α1)^h1 (z − α2)^h2 ... (z − αs)^hs.
Let us now rewrite the polynomial p(z) as p(z) = an zⁿ + an−1 z^(n−1) + ... + a2 z² + a1 z + a0. It follows that the addend of order n is
an zⁿ = q z^(h1+h2+...+hs).

For the identity principle,
h1 + h2 + ... + hs = n
and an = q. Hence, the polynomial can be written as
p(z) = an (z − α1)^h1 (z − α2)^h2 ... (z − αs)^hs. □

Example 5.28 The polynomial z⁴ − z³ − 17z² + 21z + 36 in the previous example has three roots, two of them having multiplicity 1 and one having multiplicity 2: h1 = 1, h2 = 1, h3 = 2, and an = 1. It follows that h1 + h2 + h3 = 4, that is the order of the polynomial. As shown above, the polynomial can be written as (z + 1)(z + 4)(z − 3)².

Definition 5.14 Let p(z) = Σ_{k=0}^{n} ak z^k be a complex polynomial of order n ≥ 1. A conjugate complex polynomial ṗ(z) is a polynomial whose coefficients are the conjugates of the coefficients of p(z): ṗ(z) = Σ_{k=0}^{n} ȧk z^k.

Proposition 5.5 Let p(z) be a complex polynomial of order n ≥ 1. If α1, α2, ..., αs are its roots having algebraic multiplicity h1, h2, ..., hs, then α̇1, α̇2, ..., α̇s with algebraic multiplicity h1, h2, ..., hs are roots of ṗ(z).

Example 5.29 Let us consider the polynomial
2z⁷ − 20z⁶ + 70z⁵ − 80z⁴ − 90z³ + 252z² − 30z − 200,
which can be written as 2 (z + 1)² (z − 4)(z² − 4z + 5)². The roots are −1 with multiplicity 2, 4 with multiplicity 1, 2 − j with multiplicity 2, and 2 + j with multiplicity 2. Hence, the sum of the multiplicity values is 2 + 1 + 2 + 2 = 7, that is the order of the polynomial, and an = 2.

Proposition 5.6 Let p(z) = Σ_{k=0}^{n} ak z^k be a complex polynomial of order n ≥ 1. If α1, α2, ..., αn are its roots, it follows that
• α1 + α2 + ... + αn = −an−1/an
• α1α2 + α1α3 + ... + αn−1αn = an−2/an
• α1 α2 ... αn = (−1)ⁿ a0/an.

5.2.2.1 How to Determine the Roots of a Polynomial

The previous sections explain what a root is, what type of roots exist, and how many roots are in a polynomial. It has not been explained yet how to determine these roots. In order to pursue this aim, let us consider polynomials of increasing orders.

The detection of the root of a polynomial of order 1, az − b, is a trivial problem:
az − b = 0 ⇒ α = b/a.

The roots of a polynomial of order 2, az² + bz + c, can be found analytically by the popular formula developed by ancient Indian, Babylonian and Chinese mathematicians:
α1 = −b/(2a) + √(b² − 4ac)/(2a)
α2 = −b/(2a) − √(b² − 4ac)/(2a).
Let us prove this formula.

Proof The roots are the solutions of the equation
az² + bz + c = 0 ⇒ az² + bz = −c.
Let us multiply both members of the equation by 4a:
4a²z² + 4abz = −4ac ⇒ (2az)² + 2(2az)b = −4ac.
Let us add b² to both members:
(2az)² + 2(2az)b + b² = b² − 4ac ⇒ ((2az) + b)² = b² − 4ac ⇒ (2az) + b = ±√(b² − 4ac).
From this equation we obtain
α1 = −b/(2a) + √(b² − 4ac)/(2a)
α2 = −b/(2a) − √(b² − 4ac)/(2a). □
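The formula just proved applies unchanged when the discriminant is negative, provided the square root is taken in C. A minimal Python sketch (illustrative only, using cmath so that complex roots are returned automatically):

    import cmath

    def quadratic_roots(a, b, c):
        """Roots of a z^2 + b z + c via the formula proved above."""
        disc = cmath.sqrt(b * b - 4 * a * c)
        return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

    print(quadratic_roots(1, 5, 4))   # (-1+0j), (-4+0j): roots of z^2 + 5z + 4 (Example 5.23)
    print(quadratic_roots(1, 2, 5))   # (-1+2j), (-1-2j): roots of z^2 + 2z + 5 (Example 5.25)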

The method for the calculation of the roots of a polynomial of order 3, az³ + bz² + cz + d, was introduced in the XVI century thanks to the studies of Girolamo Cardano and Niccolò Tartaglia. The solution of az³ + bz² + cz + d = 0 can be calculated by posing x = y − b/(3a) and thus obtaining a new equation
y³ + py + q = 0,
where
p = c/a − b²/(3a²)
q = d/a − bc/(3a²) + 2b³/(27a³).
The solutions of this equation are given by y = u + v, where
u = ∛(−q/2 + √(q²/4 + p³/27))
v = ∛(−q/2 − √(q²/4 + p³/27)),
and the form of the roots depends on
Δ = q²/4 + p³/27.
If Δ > 0 the roots are
α1 = u + v
α2 = u (−1/2 + j √3/2) + v (−1/2 − j √3/2)
α3 = u (−1/2 − j √3/2) + v (−1/2 + j √3/2).
If Δ = 0 the roots are
α1 = −2 ∛(q/2)
α2 = α3 = ∛(q/2).
If Δ < 0, in order to find the roots, the complex number −q/2 + j √(−Δ) must be expressed in polar coordinates as (ρ, ∠θ). The roots are
α1 = 2 √(−p/3) cos(θ/3)
α2 = 2 √(−p/3) cos((θ + 2π)/3)
α3 = 2 √(−p/3) cos((θ + 4π)/3).
The proofs of the solving formulas are not reported as they are outside the scopes of this book.

If the polynomial is of order 4, in the form ax⁴ + bx³ + cx² + dx + e, the detection of the roots was investigated by Lodovico Ferrari and Girolamo Cardano in the XVI century. A representation of the solving method is the following:
α1 = −b/(4a) − S + (1/2) √(−4S² − 2p + q/S)
α2 = −b/(4a) − S − (1/2) √(−4S² − 2p + q/S)
α3 = −b/(4a) + S + (1/2) √(−4S² − 2p − q/S)
α4 = −b/(4a) + S − (1/2) √(−4S² − 2p − q/S),
where
p = (8ac − 3b²)/(8a²)
q = (b³ − 4abc + 8a²d)/(8a³).

The value of S is given by
S = (1/2) √(−(2/3) p + (1/(3a)) (Q + Δ0/Q)),
where
Q = ∛((Δ1 + √(Δ1² − 4Δ0³))/2)
Δ0 = c² − 3bd + 12ae
Δ1 = 2c³ − 9bcd + 27b²e + 27ad² − 72ace.
The proof of the solving method is also not reported in this book, see [14].

From the formulas above, it is clear that the detection of the roots of a polynomial is in general a difficult task. More drastically, the detection of the roots of a polynomial having order 5 or higher is, in general, an impossible task. This fact is proved in the so-called Abel–Ruffini Theorem.

Theorem 5.10 (Abel–Ruffini's Theorem) There is no general algebraic solution to polynomial equations of degree five or higher with arbitrary coefficients.

This means that, if the roots of a polynomial of order ≥ 5 must be calculated, a numerical method must be implemented to find an approximated solution, as the problem has no analytic solution. A description of the numerical methods does not fall within the scopes of this book. However, some examples of numerical methods for finding the roots of a high order polynomial are the bisection and secant methods, see e.g. [14].
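In practice, one readily available numerical option is numpy.roots, which approximates all the roots as eigenvalues of the companion matrix; bisection or secant iterations on real intervals are simpler alternatives. The following sketch (an added illustration on an arbitrarily chosen degree-6 polynomial) shows the idea:

    import numpy as np

    coeffs = [1, 0, -3, 0, 0, 1, -2]        # z^6 - 3z^4 + z - 2, highest power first
    roots = np.roots(coeffs)
    print(roots)
    print(np.polyval(coeffs, roots))        # residuals, all close to zero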

5.3 Partial Fractions

Definition 5.15 Let p1(z) = Σ_{k=0}^{m} ak z^k and p2(z) = Σ_{k=0}^{n} bk z^k be two complex polynomials. The function Q(z) obtained by dividing p1(z) by p2(z),
Q(z) = p1(z)/p2(z),
is said rational fraction in the variable z.

The partial fraction decomposition (or partial fraction expansion) is a mathematical procedure that consists of expressing the fraction as a sum of rational fractions whose denominators are of lower order than that of p2(z):
Q(z) = p1(z)/p2(z) = Σ_k fk(z)/gk(z).
This decomposition can be of great help to break a complex problem into many simple problems. For example, the integration term by term can be much easier if the fraction has been decomposed, see Chap. 12.

Let us indicate with the term zeros the values α1, α2, ..., αm such that p1(αk) = 0, and with the term poles the values β1, β2, ..., βn such that p2(βk) = 0. Let us consider the case of a proper fraction, i.e. the order of p1(z) is ≤ than the order of p2(z) (m ≤ n), and let us distinguish three cases:
• rational fractions with distinct/single real poles
• rational fractions with multiple poles
• rational fractions with complex poles.

Rational fractions with only distinct real poles are characterized by a denominator of the kind
p2(z) = (z − β1)(z − β2) ... (z − βn),
i.e. the poles have null imaginary parts (the poles are real numbers). In this first case the rational fraction can be written as
Q(z) = p1(z)/((z − β1)(z − β2) ... (z − βn)) = A1/(z − β1) + A2/(z − β2) + ... + An/(z − βn),
where A1, A2, ..., An are constant coefficients.

If the rational fraction contains multiple poles, i.e. some poles are real numbers with multiplicity > 1, each multiple pole appears in the denominator as (z − βk)^h. In this second case the corresponding part of the rational fraction can be written as
p1(z)/(z − βk)^h = Ak1/(z − βk) + Ak2/(z − βk)² + ... + Akh/(z − βk)^h,
where Ak1, Ak2, ..., Akh are constant coefficients.

Rational fractions with quadratic terms are characterized by a denominator of the kind
p2(z) = (z² + ξz + ζ)(z − β2) ... (z − βn),
i.e. some poles are conjugate imaginary or conjugate complex numbers. In this third case the rational fraction can be written as
Q(z) = p1(z)/((z² + ξz + ζ)(z − β2) ... (z − βn)) = (Bz + C)/(z² + ξz + ζ) + A2/(z − β2) + ... + An/(z − βn),
where B, C, A2, ..., An are constant coefficients.

Obviously, the polynomial p2(z) can contain single and multiple poles as well as real and complex poles. In the case of multiple complex poles, the corresponding constant coefficients are indicated with Bkj and Ckj.

In order to find these coefficients, from the equation
p1(z) = Σ_k (fk(z)/gk(z)) p2(z)
the coefficients ak of the polynomial p1(z) are imposed to be equal to the corresponding ones on the right-hand side of the equation. This operation leads to a system of linear equations in the variables Akj, Bkj, and Ckj whose solution completes the partial fraction decomposition.

Example 5.30 Let us consider the following rational fraction:
(8z − 42)/(z² + 3z − 18).
This rational fraction has two single poles and can be written as
(8z − 42)/((z + 6)(z − 3)) = A1/(z + 6) + A2/(z − 3).
Thus, we can write the numerator as
8z − 42 = A1(z − 3) + A2(z + 6) = (A1 + A2) z − 3A1 + 6A2.
We can now set the following system of linear equations in the variables A1 and A2:
A1 + A2 = 8
−3A1 + 6A2 = −42,
whose solution is A1 = 10 and A2 = −2. Hence the partial fraction decomposition is
(8z − 42)/(z² + 3z − 18) = 10/(z + 6) − 2/(z − 3).

Example 5.31 Let us consider the following rational fraction:
4z²/(z³ − 5z² + 8z − 4).
This rational fraction has one single pole and one double pole. The fraction can be written as
4z²/((z − 1)(z − 2)²) = A1/(z − 1) + A21/(z − 2) + A22/(z − 2)².
The numerator can be written as
4z² = A1(z − 2)² + A21(z − 2)(z − 1) + A22(z − 1) =
= z² (A1 + A21) + z (−4A1 − 3A21 + A22) + (4A1 + 2A21 − A22),
and the following system of linear equations can be set:
A1 + A21 = 4
−4A1 − 3A21 + A22 = 0
4A1 + 2A21 − A22 = 0,
whose solution is A1 = 4, A21 = 0, and A22 = 16. The partial fraction decomposition is
4z²/(z³ − 5z² + 8z − 4) = 4/(z − 1) + 16/(z − 2)².

Example 5.32 Let us consider the following rational fraction:
(8z² − 12)/(z³ + 2z² − 6z).
This rational fraction has one pole in the origin and a quadratic factor in the denominator, which can be treated as in the third case above. The fraction can be written as
(8z² − 12)/(z(z² + 2z − 6)) = A1/z + (B1 z + C1)/(z² + 2z − 6).
The numerator can be written as
8z² − 12 = A1(z² + 2z − 6) + (B1 z + C1) z = z² (A1 + B1) + z (2A1 + C1) − 6A1,
and the following system of linear equations can be set:
A1 + B1 = 8
2A1 + C1 = 0
−6A1 = −12,
whose solution is A1 = 2, B1 = 6, and C1 = −4. The partial fraction decomposition is
(8z² − 12)/(z³ + 2z² − 6z) = 2/z + (6z − 4)/(z² + 2z − 6).

Let us now consider a rational fraction
Q(z) = p1(z)/p2(z)
where the order m of p1(z) is greater than the order n of p2(z). This rational fraction is called improper fraction. A partial fraction expansion can be performed also in this case, but some considerations must be made. By applying Theorem 5.4, we know that every polynomial p1(z) can be expressed as p1(z) = p2(z) q(z) + r(z) by means of unique q(z) and r(z) polynomials. Thus, we can express the improper fraction as
p1(z)/p2(z) = q(z) + r(z)/p2(z).
We know also that the order of p1(z) is equal to the sum of the orders of p2(z) and q(z). The polynomial q(z) is therefore of order m − n and can be expressed as
q(z) = E0 + E1 z + E2 z² + ... + E(m−n) z^(m−n),
so that the improper fraction can be expanded as
p1(z)/p2(z) = E0 + E1 z + ... + E(m−n) z^(m−n) + r(z)/p2(z),
and the partial fraction expansion can be applied to r(z)/p2(z), which is certainly proper as the order of r(z) is always less than the order of p2(z) for Theorem 5.4. The coefficients E0, E1, ..., E(m−n) can be determined at the same time as the coefficients resulting from the expansion of r(z)/p2(z), by posing the identity of the coefficients and solving the resulting system of linear equations.
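Whether the fraction is proper or improper, the coefficients are obtained from a system of linear equations, so a linear solver is a natural tool. The following Python sketch (an added illustration) reproduces the system of Example 5.30 and spot-checks the result:

    import numpy as np

    # (8z - 42)/((z + 6)(z - 3)) = A1/(z + 6) + A2/(z - 3)
    # coefficient of z:  A1 + A2   = 8
    # constant term:    -3A1 + 6A2 = -42
    M = np.array([[1.0, 1.0],
                  [-3.0, 6.0]])
    rhs = np.array([8.0, -42.0])
    A1, A2 = np.linalg.solve(M, rhs)
    print(A1, A2)        # 10.0 and -2.0, i.e. 10/(z + 6) - 2/(z - 3)

    # numerical spot check at an arbitrary point
    z = 0.37
    lhs = (8 * z - 42) / ((z + 6) * (z - 3))
    print(abs(lhs - (A1 / (z + 6) + A2 / (z - 3))) < 1e-12)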

Example 5.33 Let us consider the following improper fraction:
(4z³ + 10z + 4)/(2z² + z).
The rational fraction can be written as
(4z³ + 10z + 4)/(2z² + z) = (4z³ + 10z + 4)/(z(2z + 1)) = E1 z + E0 + A1/z + A2/(2z + 1).
The numerator can be expressed as
4z³ + 10z + 4 = z²(2z + 1) E1 + z(2z + 1) E0 + (2z + 1) A1 + z A2 =
= z³ 2E1 + z² (E1 + 2E0) + z (2A1 + A2 + E0) + A1.
We can then pose the system of linear equations
2E1 = 4
E1 + 2E0 = 0
2A1 + A2 + E0 = 10
A1 = 4,
whose solution is A1 = 4, A2 = 3, E0 = −1, and E1 = 2. Thus, the partial fraction expansion is
(4z³ + 10z + 4)/(2z² + z) = 2z − 1 + 4/z + 3/(2z + 1).

5.4 Exercises

5.1 Verify that, if z = a + jb, then
1/z = (a − jb)/(a² + b²).

5.2 Verify that (C, +, ·) is a field.

5.3 Sum and multiply the following complex numbers: z1 = −2 + j6 and z2 = 3 − j8.

5.4 Calculate ³√(5 + j5).

5.5 Calculate (2 − j)/(1 + 3j).

5.6 Expand in partial fractions the following rational fraction:
(−9z + 9)/(2z² + 7z − 4).

5.7 Expand in partial fractions the following rational fraction:
(3z + 1)/((z − 1)²(z + 2)).

5.8 Expand in partial fractions the following rational fraction:
5z/(z(z² + z + 8)).

5.9 Expand in partial fractions the following rational fraction:
(5z² + 4z − 1)/(z(z² + z + 6)²).

5.10 Expand in partial fractions the following rational fraction:
(6z⁴ − 4z + 3)/(z³ + 3z² + 3z + 2).

5.11 Expand in partial fractions the following rational fraction:
(5z⁶ − 4z⁴ + 3z² − 4z + 2)/((z² + z + 3)²(z − 1)).

Chapter 6
An Introduction to Geometric Algebra and Conics

6.1 Basic Concepts: Lines in the Plane

This chapter introduces the conics and characterizes them from an algebraic perspective. While in-depth geometrical aspects of the conics lie outside the scopes of this chapter, this chapter is an opportunity to revisit concepts studied in other chapters, such as matrix and determinant, and to assign a new geometric characterization to them.

In order to achieve this aim, let us start with considering the three-dimensional space. We have previously introduced, in Chap. 4, the concepts of point, segment, distance between two points, and direction of a line. We have also introduced, in Chap. 1, the concept of line as representation of the set R. If R² can be represented as the plane, within this space points, lines, and planes exist. Intuitively, a line is an infinite subset of R².

Definition 6.1 Let P and Q be two points of the plane and dPQ be the distance between the two points. The point M of the segment PQ such that dPM = dMQ is said middle point.

6.1.1 Equations of the Line

Let #»v ≠ #»o be a vector of the plane having components (l, m) and P0(x0, y0) be a point of the plane. From the algebra of vectors we know that the direction of a line is identified by the components of a vector having the same direction, i.e. a line can be characterized by two numbers which we will indicate here as (l, m). Let us think about the line passing through P0 and having direction (l, m).

Let us determine the equation of the line passing through P0 and having the direction of #»v. Let us consider a point P(x, y) of the plane. The segment P0P can be interpreted as a vector having components (x − x0, y − y0). For Proposition 4.4, the vectors #»v = (l, m) and P0P = (x − x0, y − y0) are parallel if and only if
det( (x − x0)  (y − y0) ; l  m ) = 0.
This situation occurs when
(x − x0) m − (y − y0) l = 0 ⇒ mx − ly − mx0 + ly0 = 0 ⇒ ax + by + c = 0,
where
a = m
b = −l
c = −mx0 + ly0.

Example 6.1 Let us consider the point P0 = (1, 1) and the vector #»v = (3, 4). Let us impose the parallelism between #»v and P0P:
det( (x − 1)  (y − 1) ; 3  4 ) = 4(x − 1) − 3(y − 1) = 4x − 4 − 3y + 3 = 4x − 3y − 1 = 0.

Definition 6.2 A line in a plane is a set of points P(x, y) ∈ R² such that ax + by + c = 0, where a, b, and c are three coefficients ∈ R.

The equation ax + by + c = 0 is said analytic representation of the line, or analytic equation of the line in its implicit form. It must be observed that a line having direction (l, m) has equation ax + by + c = 0, where a = m and b = −l; the coefficients a and b cannot both be null because (l, m) ≠ (0, 0). By applying some simple arithmetic operations,
ax + by + c = 0 ⇒ by = −ax − c ⇒ y = −(a/b)x − c/b ⇒ y = kx + q,
which is known as analytic equation of the line in its explicit form.

Let us now verify that the direction (a, b) identified by the coefficients of the implicit equation is perpendicular to the line. Let us calculate the scalar product of (x − x0, y − y0) and (a, b) and impose that it is null:
((x − x0), (y − y0)) · (a, b) = a(x − x0) + b(y − y0) = ax + by − ax0 − by0 = ax + by + c = 0.
This equation means that a line having equation ax + by + c = 0 is perpendicular to the direction (a, b). In other words, the direction identified by the coefficients (a, b) is perpendicular to the line. Analogously, a line having equation ax + by + c = 0 has direction (−b, a).

Example 6.2 The line having equation 5x + 4y − 2 = 0 has direction (−b, a) = (−4, 5).

Definition 6.3 The components of a vector parallel to a line are said direction numbers of the line.

Let us now write an alternative representation of the line. The equation
(x − x0) m − (y − y0) l = 0
can be rewritten as
(x − x0)/l = (y − y0)/m.
This equation yields the following system of linear equations:
x − x0 = lt
y − y0 = mt
⇒
x(t) = lt + x0
y(t) = mt + y0,
where t is a parameter. While t varies, the line is identified. The equations of the system are said parametric equations of the line.

Example 6.3 Let 5x − 4y − 1 = 0 be the equation of a line in the plane. We know that the direction numbers are (l, m) = (4, 5). In order to find x0 and y0, let us choose an arbitrary value for x0 and let us use the equation of the line to find the corresponding y0 value. For example, if we choose x0 = 1 we have y0 = 1. Hence, we can write the parametric equations as
x(t) = 4t + 1
y(t) = 5t + 1.

Let us now write the equation of the line in a slightly different way. Let P1(x1, y1) and P2(x2, y2) be two points of the plane. We would like to write the equation of the line passing through the two points. In order to do it, let us impose that, for a generic point P(x, y), the segment P1P2 = (x1 − x2, y1 − y2) is parallel to the segment PP2 = (x − x2, y − y2). If the two segments have the same direction they are aligned and thus belong to the same line. The parallelism is given by
det( (x − x2)  (y − y2) ; (x1 − x2)  (y1 − y2) ) = 0 ⇒ (x − x2)/(x1 − x2) = (y − y2)/(y1 − y2),
that is the equation of a line between two points.

Example 6.4 A line passing through the points P1(1, 5) and P2(−2, 8) has equation
(x + 2)/(1 + 2) = (y − 8)/(5 − 8),
which can equivalently be written as (5 − 8)(x + 2) − (1 + 2)(y − 8) = 0, that is x + y − 6 = 0.

6.1.2 Intersecting Lines

Let l1 and l2 be two lines of the plane having equations, respectively,
l1: a1x + b1y + c1 = 0
l2: a2x + b2y + c2 = 0.

We aim at studying the position of these two lines in the plane. If these two lines intersect in a point P0, it follows that the point P0 belongs to both lines. Equivalently, we may state that the coordinates (x0, y0) of this point P0 simultaneously satisfy the equations of the lines l1 and l2, i.e. (x0, y0) is the solution of the following system of linear equations:
a1x + b1y + c1 = 0
a2x + b2y + c2 = 0.

At first, we may observe that a new characterization of the concept of system of linear equations is given: a system of linear equations can be seen as a set of lines and its solution, when it exists, is the intersection of these lines. In this chapter we study lines in the plane, thus the system has two linear equations in two variables. A system having size 3 × 3 can be seen as the equation of three lines in the space and, by extension, an n × n system of linear equations represents lines in an n-dimensional space. In general, the solutions of a system of equations can be interpreted as the intersection of objects, even when not all the equations are equations of a line.

Let us focus on the case of two lines in the plane. The system above is associated to the following incomplete and complete matrices, respectively:
A = ( a1  b1 ; a2  b2 )
Ac = ( a1  b1  −c1 ; a2  b2  −c2 ).
Indicating with ρA and ρAc the ranks of the matrices A and Ac, respectively, and by applying the Rouchè-Capelli Theorem, the following cases are distinguished.
• case 1: If ρA = 2 (which yields that also ρAc = 2), the system is determined and has only one solution. Geometrically, this means that the lines intersect in a single point.
• case 2: If ρA = 1 and ρAc = 2, the system is incompatible and has no solutions. Geometrically, this means that the two lines are parallel; the system is of the kind
ax + by + c1 = 0
λax + λby + c2 = 0
with λ ∈ R.
• case 3: If ρA = 1 and ρAc = 1, the system is undetermined and has ∞ solutions. Geometrically, this means that the two lines are overlapped and the system is of the kind
ax + by + c = 0
λax + λby + λc = 0
with λ ∈ R.
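The rank-based classification of the three cases above lends itself to a direct numerical check. The following Python sketch (an added illustration; the three pairs of lines are the same ones used in the examples that follow) classifies two lines and, when they intersect, returns the intersection point:

    import numpy as np

    def classify(l1, l2):
        """l1, l2 are triples (a, b, c) for a*x + b*y + c = 0."""
        A  = np.array([l1[:2], l2[:2]], dtype=float)
        Ac = np.array([[l1[0], l1[1], -l1[2]],
                       [l2[0], l2[1], -l2[2]]], dtype=float)
        rA, rAc = np.linalg.matrix_rank(A), np.linalg.matrix_rank(Ac)
        if rA == 2:
            return "intersecting", np.linalg.solve(A, [-l1[2], -l2[2]])
        if rAc == 2:
            return "parallel", None
        return "overlapped", None

    print(classify((2, 1, -1), (4, -1, 2)))    # intersecting, point (-1/6, 4/3)
    print(classify((2, 1, -1), (4, 2, 2)))     # parallel
    print(classify((2, 1, -1), (4, 2, -2)))    # overlapped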

It can be observed that if det(A) ≠ 0 it follows that ρA = 2 and also ρAc = 2: the lines are intersecting. If det(A) = 0 it follows that ρA = 1 (ρA = 0 would mean that there are no lines in the plane): the lines can be parallel or overlapped.

Example 6.5 Let us find, if possible, the intersection point of the lines having equations 2x + y − 1 = 0 and 4x − y + 2 = 0. This means that the following system of linear equations must be set:
2x + y − 1 = 0
4x − y + 2 = 0.
The incomplete matrix is non-singular, as it has determinant −6. Thus, the lines are intersecting. The solution of the system is
x = det( 1  1 ; −2  −1 )/(−6) = −1/6
y = det( 2  1 ; 4  −2 )/(−6) = 4/3.
The intersection point is (−1/6, 4/3).

Example 6.6 Let us find, if possible, the intersection point of the lines having equations 2x + y − 1 = 0 and 4x + 2y + 2 = 0. The associated incomplete matrix
( 2  1 ; 4  2 )
is singular and thus ρA = 1, while the rank of the complete matrix
( 2  1  1 ; 4  2  −2 )
is 2. The system is incompatible and thus the two lines are parallel.

Example 6.7 Let us find, if possible, the intersection point of the lines having equations 2x + y − 1 = 0 and 4x + 2y − 2 = 0. It can easily be seen that the second equation is the first one multiplied by 2. Thus, the two equations represent the same line: the two lines are overlapped and have infinite points in common.

6.1.3 Families of Straight Lines

Definition 6.4 A family of intersecting straight lines is the set of the infinite lines of the plane that contain a common point. The intersection point of all the infinite lines is said center of the family.

Definition 6.5 A family of parallel straight lines is the set of the infinite lines of the plane having the same direction.

Theorem 6.1 Let l1, l2, and l3 be three lines of the plane having equations
l1: a1x + b1y + c1 = 0
l2: a2x + b2y + c2 = 0
l3: a3x + b3y + c3 = 0.
The lines l1, l2, and l3 belong to the same family if and only if
det(A) = det( a1  b1  c1 ; a2  b2  c2 ; a3  b3  c3 ) = 0.

Proof If l1, l2, and l3 belong to the same family of intersecting lines, their equations are simultaneously verified by the center of the family. The associated system of linear equations
a1x + b1y + c1 = 0
a2x + b2y + c2 = 0
a3x + b3y + c3 = 0,
composed of three equations in two variables, has only one solution and, for the Rouchè-Capelli Theorem, the rank of the matrix A is 2. The solution is then the intersection point of the family. This means that det(A) = 0.
If l1, l2, and l3 belong to the same family of parallel lines, all the lines have the same direction, i.e. each pair of equations is of the kind ax + by + c = 0 and λax + λby + c′ = 0. This means that any 2 × 2 submatrix extracted from the first two columns of A is singular and, for the I Laplace Theorem, det(A) = 0. Thus, if the lines belong to the same family of either intersecting or parallel lines, then det(A) = 0.
Conversely, if det(A) = 0, the rank of the matrix A is < 3. If the rank is 2, the system is compatible and determined, i.e. the three lines pass through a common point and belong to the same family of intersecting lines. If the rank is 1, each pair of equations is of the kind ax + by + c = 0 and λax + λby + λc = 0, i.e. the lines have the same direction and belong to the same family of parallel (or overlapped) lines. Hence, if det(A) = 0 the three lines belong to the same family of either intersecting or parallel lines. □

If det(A) = 0, then at least one row of A is a linear combination of the other two. This means that ∃ a pair of real scalars (λ, μ) ≠ (0, 0) such that
a3 = λa1 + μa2
b3 = λb1 + μb2
c3 = λc1 + μc2.
If we substitute these values in the third equation of the system of linear equations above, we obtain
(λa1 + μa2) x + (λb1 + μb2) y + (λc1 + μc2) = 0,
which can be rewritten as
λ (a1x + b1y + c1) + μ (a2x + b2y + c2) = 0,
which is said equation of the family of lines. When the parameters (λ, μ) vary, a line of the family is identified. Without a loss of generality, let us assume λ ≠ 0. It follows that
λ (a1x + b1y + c1) + μ (a2x + b2y + c2) = 0 ⇒ (a1x + b1y + c1) + k (a2x + b2y + c2) = 0,
with k = μ/λ. Each value of k identifies a line: if k = 0 the line l1 is obtained, while if k = ∞ the line l2 is identified.

If the lines l1 and l2 are parallel, they can be written as
ax + by + c1 = 0
νax + νby + c2 = 0,
and the family of lines becomes
(ax + by + c1) + k (νax + νby + c2) = 0 ⇒ a (1 + νk) x + b (1 + νk) y + c1 + c2 k = 0 ⇒ ax + by + h = 0

with
h = (c1 + c2 k)/(1 + νk).
Thus, if the lines are parallel, all the lines of the family have the same direction.

Example 6.8 A family of straight lines is, for example, (5x + 3y − 1) λ + (4x − 2y + 6) μ = 0.

Example 6.9 The center of the family (2x + 2y + 4) l + (2x − 4y + 8) m = 0 is the solution of the system of linear equations
2x + 2y + 4 = 0
2x − 4y + 8 = 0,
that is x = −8/3 and y = 2/3.

6.2 An Intuitive Introduction to the Conics

Definition 6.6 The orthogonal projection of a point P of the space on a plane is a point of this plane obtained by connecting P with the plane by means of a line orthogonal to the plane (orthogonal to all the lines contained in the plane).

Following the definition of orthogonal projection of a point on a plane, we can perform the orthogonal projection for all the infinite points contained in a line. In this way, we can perform the orthogonal projection of a line on a plane.

Definition 6.7 An angle between a line and a plane is the angle ≤ 90° between that line and its orthogonal projection on the plane.

Now, let us consider two lines that are chosen to be not parallel. Let us indicate one of the two lines with z and let us refer to it as "axis", while the other one is simply indicated as "line". The angle between the line and the axis is indicated as θ. Let us imagine that the line performs a full rotation (360°) around the axis. This rotation generates a solid object which we will refer to as "cone". In Fig. 6.1 the axis is denoted with an arrow.

If we take into consideration a plane in the space, it can intersect the cone in three ways, according to the angle φ between this plane and the axis z. More specifically, the following cases can occur:

Fig. 6.1 Cone as rotational solid

• if 0 ≤ φ < θ the intersection is an open figure, namely hyperbola
• if φ = θ the intersection is an open figure, namely parabola
• if θ < φ ≤ π/2 the intersection is a closed figure, namely ellipse.
The special ellipse corresponding to φ = π/2 is named circumference.

Besides the cases listed above, three more cases can be distinguished:
• a plane intersecting the cone with 0 ≤ φ < θ and passing through the intersection between the axis of the cone and the line that generates it, that is two intersecting lines
• a plane tangent to the cone with φ = θ, that is a line

These figures, coming from the intersection of a cone with a plane, are said conics. A graphical representation of the conics is given in Fig. 6.2.

Fig. 6.2 Conics as conical sections

• a plane intersecting the cone with θ < φ ≤ π/2 and passing through the intersection between the axis of the cone and the line that generates it, that is a point.
These three cases are a special hyperbola (two intersecting lines), a special parabola (a line), and a special ellipse (a point), respectively. These special conics are said degenerate conics.

6.3 Analytical Representation of a Conic

Now that the conics have been introduced, let us define them again in a different way.

Definition 6.8 A set of points whose location satisfies or is determined by one or more specified conditions is said locus of points.

The easiest example of a locus of points is a line, where the condition is given by its equation ax + by + c = 0. If we consider a point Q(xQ, yQ) not belonging to this line, from basic geometry we know that the distance between Q and the line is given by, see e.g. [15],
|a xQ + b yQ + c| / √(a² + b²).

Definition 6.9 (Conic as a locus of points) Let F be a point of a plane, namely focus, and d be a line in the same plane, namely directrix. Let us indicate with dPF the distance from P to F and with dPd the distance from P to d. A conic C is the locus of points P ∈ R² of the plane such that the ratio dPF/dPd is constant:
C = { P | dPF/dPd = e },
where e is a constant, namely eccentricity of the conic.

Let us consider the generic point P of the plane. If the coordinates of the focus F are (α, β) and the equation of the directrix is indicated with ax + by + c = 0, where x and y are variables while a, b, and c are coefficients, we can rewrite dPF/dPd = e in the following way:
dPF/dPd = e ⇒ dPF = e dPd ⇒ dPF² = e² dPd² ⇒
(x − α)² + (y − β)² = e² (ax + by + c)²/(a² + b²).   (6.1)
This is a second order (quadratic) algebraic equation in the two variables x and y, which is called analytical representation of a conic.

6.4 Simplified Representation of Conics

This section works out the analytical representation of a conic given in Eq. (6.1) in the case of a reference system that allows a simplification of the calculations. This simplified representation allows a better understanding of the conic equations and a straightforward graphic representation by means of conic diagrams.

6.4.1 Simplified Representation of Degenerate Conics

Let us consider, at first, the special case where the focus belongs to the directrix, F ∈ d. Without a loss of generality, let us choose a reference system having its origin in F, that is α = β = 0, and the directrix coinciding with the ordinate axis, that is the equation ax + by + c = 0 becomes x = 0 (equivalently a = 1, b = 0, c = 0). In this specific case the analytical representation of the conic can be rewritten as
dPF² = e² dPd² ⇒ (x − α)² + (y − β)² = e² (ax + by + c)²/(a² + b²) ⇒
x² + y² = e² x² ⇒ (1 − e²) x² + y² = 0.
From this equation, and considering that the eccentricity by its definition can take only non-negative values, we can distinguish three cases.
• 1 − e² > 0 ⇒ 0 ≤ e < 1: the equation of the conic becomes kx² + y² = 0, with k = 1 − e² > 0. This equation has only one solution in R², that is (0, 0). The conic is one point.
• 1 − e² = 0 ⇒ e = 1: the equation of the conic becomes 0x² + y² = 0 ⇒ y² = 0. This is the equation of the abscissa's axis (y = 0) counted twice. The conic is two overlapped lines.
• 1 − e² < 0 ⇒ e > 1: the equation of the conic becomes −kx² + y² = 0, with k = −(1 − e²) > 0. If we solve this equation we obtain y = ±√k x, that is the equations of two intersecting lines. The conic is two intersecting lines.
These three situations correspond to the degenerate conics.

6.4.2 Simplified Representation of Non-degenerate Conics

Let us now consider the general case F ∉ d. Without a loss of generality, let us choose a reference system such that the directrix has equation x − h = 0, with h constant ∈ R,

.6 if p (z) = nk=0 ak z k is a complex (real is a special case of complex) polynomial of order n ≥ 1 and α1 .5) ⇒ F 2 e2 − 1 = e2 a 2 e2 − 1 ⇒ F 2 = e2 a 2 ⇒ e2 = Fa 2 . respectively. (6.2) ⇒ x2 + F 2 − 2F x + y 2 −  e x −2 e h − 2e 2hx2 = 2 2 2 2 2 0⇒ ⇒ 1 − e x + y − 2 F − he x + F − e h 2 = 0 2 2 2 Let us now consider the intersection of the conic with the abscissa’s axis. If we fix the reference axis so that its origin is between the intersections of the conic with the line y = 0. .6. . For the Proposition 5.     1 − e2 x 2 − 2 F − he2 x + F 2 − e2 h 2 = 0 This is a second order polynomial in the real variable x. y = 0. in this case the sum of the roots a − a is equal to   2 F − he2   = a − a = 0.6 it also occurs that α1 α2 . 0). 0). . in this case. . for the Proposition 5. 0) and (−a.8. 1 − e2 From this equation we obtain   F F − he2 = 0 ⇒ h = 2 (6. . + αs = − aan−1 n .3) e given that we suppose 1 − e2 = 0 ⇒ e = 1. α2 . means F 2 − e2 h 2   = −a 2 . the analytical representation of the conic can be re-written as: ⇒ (x − α)2 + (y − β)2 = e2 (ax+by+c) 2 2 dPF = e2 dPd 2 a 2 +b2 ⇒ ⇒ (x − F)2 + y 2 = e2 (x − h)2 ⇒  ⇒ x 2 + F 2 − 2F x + y 2 = e2 x 2 + h 2 − 2hx (6.4 Simplified Representation of Conics 171 and F has coordinates (F. αs = (−1)n a0 an which. Under these conditions. . . 1 − e2 From this equation we obtain   F 2 − e2 h 2 = a 2 e2 − 1 . αn are its roots it follows that α1 + α2 + . the two intersection points  can be written as (a. .3) into (6. (6. This polynomial has at most two real (distinct) solutions for the Theorem 5. Thus.4) we have  2   2   F 2 − e2 eF2 = a 2 e2 − 1 ⇒ c2 − Fe2 = a 2 e2 − 1 ⇒     2 (6. Furthermore.4) Substituting Eq.

From this simplified equation 2 we can easily see that the fraction ax 2 is always non-negative.5) into (6. respectively.6) F and substituting Eqs.3) we obtain a2 h= (6. This means that a hyperbola has y values that can be in the ]−∞. From this statement it follows that y2 +1>0 b2 regardless of the value of y. Let us pose b2 = F 2 − a 2 > 0 and let us substitute into the equation of the conic in Eq. since by2 is always non-negative. (6.5) and (6. +∞[ interval. This formulation of a conic has been done under the hypothesis that e = 1. x2 y2 x2 y2 −b2 x 2 + a 2 y 2 + a 2 b2 = 0 ⇒ − + + 1 = 0 ⇒ − − 1 = 0. . (6.6) into the general equation of a conic in Eq.7) can be considered in only two cases: a 2 < F 2 (e > 1) and a 2 > F 2 (e < 1).7). a2 This happens when x ≥ a and x ≤ −a. In other words. it follows that e > 1. it follows that x2 − 1 ≥ 0. a graphic of an hyperbola (in its simplified equation form) can be only in the area marked in figure. For Eq.172 6 An Introduction to Geometric Algebra and Conics Substituting Eq. (6. (6.2) we have      2 2 F2 2 2 2 1− x 2 + y 2 − 2 c − aF Fa 2 x + c2 − Fa 2 aF = 0 ⇒ a2  2 ⇒ 1 − Fa 2 x 2 + y 2 − 2 (F − F) x + F 2 − a 2 = 0 ⇒ (6. Equation of the Hyperbola   If a 2 < F 2 . (6. (6. a2 b2 a2 b2 This is the analytical equation of the hyperbola. 2 In a similar way. This means that Eq.7)     ⇒ a 2 − F 2 x 2 + a 2 y 2 + a 2 F 2 − a 2 = 0.5) it occurs that a 2 = F 2 .

2 as shown in Eq.6. In order to narrow down the areas where the graphic of the hyperbola (of the simplified equation) can be plotted. i. Obviously. see e. this equation is equivalent to the equation ax + by + c = 0. Fa > a 2 ⇒ a > aF . namely angular coefficient. In order to calculate the position of the directrix let√us consideronly the right hand side of the graphic. 0 and  √  − a 2 + b2 . Thus. It can be easily verified that there is a symmetric directrix associated to the left hand side of the graphic.6).e. respectively. the foci are in the marked area of the figure. (6. 2 We know that F > a. we may think about the infinite lines of the plane passing through the origin. it follows that the directrix of the right hand side part of the graphic falls outside and at the left of the marked area. Taking into consideration that. Each line is univocally identified by its angular coefficient m. The foci have coordinates a 2 + b2 .4 Simplified Representation of Conics 173 The position of the foci can be easily identified by considering that in an hyperbola F 2 > a 2 . [16]. let us calculate those values of m that result into an intersection of the line with the hyperbola. y = mx . 0 . This means that we want to identify the values of m that satisfy the following system of equations:  2 x2 − by2 − 1 = 0 a2 . the equation of the directrix is x = aF . that the equation of a line passing for the origin the Cartesian system of coordinates can be written in its implicit form as y = mx with x variable and m coefficient. If we now consider from the equations of the line and from basic analytical geometry.g. It follows that F > a or F < −a.

By using the simplified analytical equation of the hyperbola we can now plot the conic. The figure below highlights the areas where the hyperbola can be plotted. . By solving the inequality in the variable m we have 1 m2 b2 b b 2 > 2 ⇒ m 2 < 2 ⇒− <m< . respectively. a2 b2 a2 b2 Since x 2 must be non-negative (and it is actually positive because the multiplica- tion must be 1) it follows that the values of m satisfying the inequality   1 m2 − >0 a2 b2 are those values that identify the region of the plane where the hyperbola can be plotted.174 6 An Introduction to Geometric Algebra and Conics Substituting the second equation into the first one we obtain   x2 m2 x 2 1 m2 − =1⇒ − x 2 = 1. The corresponding two lines having equation y = − ab x and y = ab x. a b a a a This means that the graphic of the (simplified) hyperbola delimited by the lines having angular coefficient − ab and ab respectively. as shown in the following figure. are named asymptotes of the hyperbola.

a 4 Let us work the equation of the hyperbola out to obtain an alternative definition. where b 3 = . 0) and (−F. The coordinates of the foci are (F.6. F 5 5 The asymptotes have equations y = ab x and y = − ab x. An hyperbola is the locus of points P such that dPF − dPF is constant and equal to 2a. a2 2 The directrices have equations x = F and x = − aF . a2 b2 42 32 This is the equation of an hyperbola in the simplified conditions mentioned above.4 Simplified Representation of Conics 175 Example 6. respectively. . Theorem 6. 0) where  F = a 2 + b2 = 5. respectively.2 Let F and F be two foci of an hyperbola. where a2 42 16 = = .10 Let us consider the equation 9x 2 − 16y 2 − 144 = 0 which can be re-written as x2 y2 x2 y2 − = − = 1.

2 Considering that x = aF is the equation of the directrix which. as shown above. the inequality is verified only for the left hand side of the hyperbola. Similarly. . The absolute values are to highlight that the expressions above have a geometric meaning. For the simplified equation of the hyperbola it happens that  2  x y 2 = b2 − 1 . dPF = a + Fax when a + Fax is positive and dPF = −a − Fax when a + Fax is negative. A symmetric consideration can be done for 2 a + Fax > 0 ⇒ x > − aF (right hand side of the hyperbola). In order to be always positive dPF = a − Fax when a − Fax is positive and dPF = −a + Fax when a − Fax is negative. a2 and that    b2 = F 2 − a 2 ⇒ F = a 2 + b2 . If we 2 solve the inequality a − Fax ≥ 0.e. i. we find that the inequality is verified when x < aF . they are distances. is on the left of the right hand side of the hyperbola. We can write now the distance of a point from the focus as  2  2  √ dPF = x − a 2 + b2 + b2 ax 2 − 1 =   √ 2 = x 2 + a 2 + b2 − 2 a 2 + b2 x − b2 + b2 ax 2 =  √  √ = x 2 + a 2 − 2 a 2 + b2 x + b2 ax 2 = a 2 − 2 a 2 + b2 x + ( a 2 ) = 2 a 2 + b2 x 2   √ 2 a 2 + b2 = a− a = |a − Fx a | and analogously  2  2  √ dPF = x+ a 2 + b2 + b2 ax 2 − 1 = |a + Fx a |.176 6 An Introduction to Geometric Algebra and Conics Proof The distance of a generic point of the conic from the focus is given by the equations  dPF =  (x − F)2 + y 2 dPF = (x − F)2 + y 2 .


In summary we have two possible scenarios:

• Right hand side: dPF = −a + Fx/a and dPF′ = a + Fx/a
• Left hand side: dPF = a − Fx/a and dPF′ = −a − Fx/a.

In both cases we obtain

|dPF − dPF′| = 2a. (6.8)

Equation (6.8) gives an alternative characterization of an hyperbola.
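This characterization can be verified numerically. The following short Python sketch is an added illustration (not part of the original text) and assumes only the standard math module; it samples points of the hyperbola x²/16 − y²/9 = 1, for which a = 4, b = 3 and F = √(a² + b²) = 5, and checks that |dPF − dPF′| = 2a = 8.

import math

a, b = 4.0, 3.0
F = math.sqrt(a**2 + b**2)              # focus abscissa: F = 5

for x in [4.0, 5.0, 7.5, -6.0]:
    y = b * math.sqrt(x**2 / a**2 - 1)  # a point of the hyperbola
    d_PF = math.hypot(x - F, y)         # distance from the focus (F, 0)
    d_PF2 = math.hypot(x + F, y)        # distance from the focus (-F, 0)
    print(x, abs(d_PF - d_PF2))         # always 8 = 2a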
Equation of the Parabola
If e = 1 the Eq. (6.7) cannot be used. However, if we substitute e = 1 into Eq. (6.2)
we obtain

y 2 − 2 (F − h) x + F 2 − h 2 = 0.

Without a loss of generality let us fix the reference system so that the conic passes
through its origin. Under these conditions, the term F 2 − h 2 = 0. This means that
F² = h² ⇒ h = ±F. However, h = F is impossible because we have supposed
F ∉ d with F : (F, 0) and d : x − h = 0. Thus, the statement h = F would be equivalent
to saying F ∈ d. Thus, the only possible value of h is −F. If we substitute it into
the equation above we have

y 2 − 4F x = 0

that is the analytical equation of the parabola.
Let us try to replicate the same procedure done for the hyperbola in order to
narrow down the area where a parabola can be plotted. At first let us observe that
this equation, written as y 2 = 4F x imposes that the focus coordinate F and the
variable x are either both positive or both negative. This means that the graphic of a
(simplified) parabola is plotted either all in the positive semi-plane x ≥ 0 or in the
negative semi-plane x ≤ 0. Let us consider the case when both F ≥ 0 and x ≥ 0.
Furthermore, the fact that ∀x there are two values of y (one positive and one
negative, respectively) makes the graphic of this parabola symmetric with respect to
the abscissa’s axis. Finally, if we look for the value of m such that the lines y = mx
intersect the parabola, we can easily verify that there are no impossible m values.
This means that the parabola has no asymptotes.


A graphic of the parabola is shown in the figure below.

Example 6.11 Let us consider the equation

2y 2 − 16x = 0.

This is the equation of a parabola and can be re-written as

y 2 − 4F x = y 2 − 8x = 0.

The coordinates of the focus are simply (2, 0) and the directrix has equation
x = −2.

Theorem 6.3 Let F be the focus and d be the directrix of a parabola, respectively.
A parabola is the locus of points P such that dPF = dPd .

Proof Considering that dPF = edPd with e = 1, it occurs that dPF = dPd , that is an
alternative definition of the parabola.
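As a quick numerical check of this property (an added sketch, not part of the original text), the points of the parabola of Example 6.11 are equidistant from the focus (2, 0) and from the directrix x = −2:

import math

F = 2.0                                  # y^2 = 8x, hence 4F = 8
for x in [0.0, 0.5, 2.0, 8.0]:
    y = math.sqrt(4 * F * x)             # a point of the parabola
    d_PF = math.hypot(x - F, y)          # distance from the focus (F, 0)
    d_Pd = abs(x + F)                    # distance from the directrix x = -F
    print(x, d_PF, d_Pd)                 # the two distances coincide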

Equation of the Ellipse  
If a 2 > F 2 , it follows that e < 1. Let us pose b2 = a 2 − F 2 > 0 and let us substitute
this piece of information into the equation of the conic in Eq. (6.7)

b²x² + a²y² − a²b² = 0 ⇒ x²/a² + y²/b² − 1 = 0.


This is the analytical equation of the ellipse. Analogously to what was done for the
hyperbola, this equation is defined when x²/a² ≥ 0 and when y²/b² ≥ 0. This occurs when
the following two inequalities are verified:

1 − y²/b² ≥ 0 ⇒ y² ≤ b² ⇒ −b ≤ y ≤ b
and
1 − x²/a² ≥ 0 ⇒ x² ≤ a² ⇒ −a ≤ x ≤ a.
This means that the graphic of the ellipse can be plotted only within a rectangle whose horizontal and vertical semi-lengths are a and b, respectively.

Since a is the horizontal (semi-)length of this rectangle and, for an ellipse,
a² − F² > 0 ⇒ −a < F < a, the focus is within the rectangle. The foci have
coordinates (F, 0) and (−F, 0), respectively, where F = √(a² − b²).
Since a > F, it follows that a² > aF ⇒ a²/F > a. Since, as shown in Eq. (6.6),
the directrix has equation x = a²/F, it follows that the directrix is outside the rectangle.
It can be easily checked that the ellipse, like the hyperbola, is symmetric with
respect to both the abscissa's and ordinate's axes (y = 0 and x = 0). Hence the ellipse
has two foci having coordinates (F, 0) and (−F, 0), respectively, and two directrices
having equations x = a²/F and x = −a²/F. Each focus-directrix pair is associated with
half of the graphic below (right and left hand side, respectively).


The graphic of the ellipse is plotted in the figure below.

Example 6.12 Let us consider the following equation

4x 2 + 9y 2 − 36 = 0.

This equation can be re-written as

x²/a² + y²/b² = x²/3² + y²/2² = 1

that is the simplified equation of an ellipse where a = 3 and b = 2. The foci have
coordinates (F, 0) and (−F, 0), respectively, where F = √(a² − b²) = √5. The directrix
has equation x = a²/F = 9/√5.

Let us now work the equations out to give the alternative definition also for the
ellipse.

Theorem 6.4 Let F and F′ be the two foci of an ellipse. An ellipse is the locus of
points P such that dPF + dPF′ is constant and equal to 2a.

Proof Considering that

dPF = √((x − F)² + y²)
dPF′ = √((x + F)² + y²),

that for the simplified equation of the ellipse it happens that
 
y² = b²(1 − x²/a²),


and that
  
b² = a² − F² ⇒ F = √(a² − b²),

we can write the distance of a point from the focus as
dPF = √((x − √(a² − b²))² + b²(1 − x²/a²)) =
= √(x² + a² − b² − 2x√(a² − b²) + b² − b²x²/a²) =
= √(a² − 2x√(a² − b²) + (a² − b²)x²/a²) =
= √((a − (√(a² − b²)/a)x)²) = a − Fx/a

and analogously

dPF′ = √((x + √(a² − b²))² + b²(1 − x²/a²)) = a + Fx/a.

If we now sum these two distances we obtain

dPF + dPF′ = 2a. (6.9)

In other words, the sum of these two distances does not depend on any variable,
i.e. it is a constant equal to 2a. 

Equation (6.9) represents an alternative definition of the ellipse.
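The same kind of numerical check can be performed for the ellipse. The sketch below is an added illustration (not part of the original text): for points of the ellipse of Example 6.12, where a = 3 and F = √5, the sum of the distances from the two foci is constantly 2a = 6.

import math

a, b = 3.0, 2.0
F = math.sqrt(a**2 - b**2)                  # F = sqrt(5)

for x in [-3.0, -1.0, 0.0, 2.5]:
    y = b * math.sqrt(1 - x**2 / a**2)      # a point of 4x^2 + 9y^2 - 36 = 0
    d_PF = math.hypot(x - F, y)
    d_PF2 = math.hypot(x + F, y)
    print(x, d_PF + d_PF2)                  # always 6 = 2a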

6.5 Matrix Representation of a Conic

In general, a conic is shifted and rotated with respect to the reference axes. In the
following sections a characterization of a conic in its generic conditions is given.
Let us consider again the analytical representation of a conic. If we work out the
expression in Eq. (6.1) we have:
  
(x² + α² − 2αx + y² + β² − 2βy)(a² + b²) = a²x² + b²y² + c² + 2abxy + 2acx + 2bcy ⇒
⇒ b²x² + a²y² − 2abxy − 2(αa² + αb² + ac)x − 2(βa² + βb² + bc)y +
+ (a²α² + a²β² + b²α² + b²β² − c²) = 0.

If we now perform the following replacements:


a1,1 = b²
a1,2 = −ab
a1,3 = −(αa² + αb² + ac)
a2,2 = a²
a2,3 = −(βa² + βb² + bc)
a3,3 = a²α² + a²β² + b²α² + b²β² − c²

we can write the analytical representation of a conic as

a1,1 x 2 + 2a1,2 x y + 2a1,3 x + a2,2 y 2 + 2a2,3 y + a3,3 = 0 (6.10)

which will be referred to as matrix representation of a conic.

Example 6.13 The equation

5x 2 + 26x y + 14x + 8y 2 + 5y + 9 = 0

represents a conic. However, from the knowledge we have at this point, we cannot
assess which kind of conic this equation represents and we cannot identify its other
features, such as the position of foci and directrices. In the following pages, a complete
and general characterization of the conics is presented.
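Even though the conic of Example 6.13 cannot be classified yet, its coefficients can already be arranged according to Eq. (6.10). The following sketch is an added illustration (not part of the original text) and assumes NumPy is available; it extracts a1,1, a1,2, a1,3, a2,2, a2,3, a3,3 from the second order equation and stores them in a symmetric array, anticipating the matrix associated with a conic used later in this chapter.

import numpy as np

# 5x^2 + 26xy + 14x + 8y^2 + 5y + 9 = 0 written as in Eq. (6.10)
a11, a12, a13 = 5.0, 26.0 / 2, 14.0 / 2   # 2*a12 = 26, 2*a13 = 14
a22, a23, a33 = 8.0, 5.0 / 2, 9.0         # 2*a23 = 5

Ac = np.array([[a11, a12, a13],
               [a12, a22, a23],
               [a13, a23, a33]])
print(Ac)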

6.5.1 Intersection with a Line

Let us consider two points of the plane, T and R, respectively, whose coordinates in
a Cartesian reference system are (xt , yt ) and (xr , yr ). From Sect. 6.1.1 we know that
the equation of a line passing from T and R is

(x − xt)/(xr − xt) = (y − yt)/(yr − yt)

where x and y are variables.
Since xr − xt and yr − yt are constants, let us pose l = xr − xt and m = yr − yt .
Hence, the equation of the line becomes

(x − xt)/l = (y − yt)/m (6.11)
where l and m are supposed to be ≠ 0.
Let us pose now
(y − yt)/m = t


where t is a parameter. It follows that
 
(y − yt)/m = t and (x − xt)/l = t ⇒ y = mt + yt and x = lt + xt.

Let us now consider again the matrix representation of a conic in Eq. (6.10). We
would like to find the intersections between the line and the conic. The search of
an intersection point is the search for a point that simultaneously belongs to two
objects. This means that an intersection point can be interpreted as the solution that
simultaneously satisfies multiple equations. In our case the intersection point of the
line with the conic is given by


⎧ y = mt + yt
⎨ x = lt + xt
⎩ a1,1 x² + 2a1,2 xy + 2a1,3 x + a2,2 y² + 2a2,3 y + a3,3 = 0

which leads to the equation

a1,1 (lt + xt)² + 2a1,2 (lt + xt)(mt + yt) + 2a1,3 (lt + xt) + a2,2 (mt + yt)² + 2a2,3 (mt + yt) + a3,3 = 0 ⇒
⇒ (a1,1 l² + 2a1,2 lm + a2,2 m²) t² +
+ 2[(a1,1 xt + a1,2 yt + a1,3) l + (a1,2 xt + a2,2 yt + a2,3) m] t +
+ (a1,1 xt² + a2,2 yt² + 2a1,2 xt yt + 2a1,3 xt + 2a2,3 yt + a3,3) = 0

that can be re-written as
αt 2 + 2βt + γ = 0 (6.12)

where

 
α = a1,1 l² + 2a1,2 lm + a2,2 m²
β = (a1,1 xt + a1,2 yt + a1,3) l + (a1,2 xt + a2,2 yt + a2,3) m
γ = a1,1 xt² + a2,2 yt² + 2a1,2 xt yt + 2a1,3 xt + 2a2,3 yt + a3,3.

It must be observed that γ is the Eq. (6.10) calculated in the point T. Let us now
consider Eq. (6.12). It is a second order polynomial in the variable t. As such we can
distinguish the following three cases:
• if the equation has two real solutions then the line crosses the conic, i.e. the line
intersects the conic in two distinct points of the plane (the line is said secant to the
conic)
• if the equation has two coinciding solutions then the line is tangent to the conic
• if the equation has two complex solutions then the line does not intersect the conic
(the line is said external to the conic)
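The three cases can be told apart by the sign of the reduced discriminant β² − αγ of Eq. (6.12). The following sketch is an added illustration (not part of the original text); it tests a line through a point T with direction (l, m) against a conic given by its coefficients.

def line_conic_position(a11, a12, a13, a22, a23, a33, xt, yt, l, m):
    # coefficients of alpha*t^2 + 2*beta*t + gamma = 0, Eq. (6.12)
    alpha = a11 * l**2 + 2 * a12 * l * m + a22 * m**2
    beta = (a11 * xt + a12 * yt + a13) * l + (a12 * xt + a22 * yt + a23) * m
    gamma = (a11 * xt**2 + a22 * yt**2 + 2 * a12 * xt * yt
             + 2 * a13 * xt + 2 * a23 * yt + a33)
    disc = beta**2 - alpha * gamma       # reduced discriminant of Eq. (6.12)
    if disc > 0:
        return "secant"
    if disc == 0:
        return "tangent"
    return "external"

# circle x^2 + y^2 - 1 = 0 against horizontal lines through (0, 0), (0, 1), (0, 2)
for yt in [0.0, 1.0, 2.0]:
    print(yt, line_conic_position(1, 0, 0, 1, 0, -1, 0.0, yt, 1.0, 0.0))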


6.5.2 Line Tangent to a Conic

Let us focus on the case when Eq. (6.12) has two coinciding solutions, i.e. when the
line is tangent to the conic, and let us write down the equation of a line tangent to a
conic.
The solution of Eq. (6.12) corresponds to a point T of the plane belonging to both
the line and the conic. Since this point belongs to the conic, its coordinates satisfy
Eq. (6.10) that, as observed above, can be written as γ = 0. Since γ = 0 Eq. (6.12)
can be written as
αt² + 2βt = 0.

Obviously this equation can be re-written as

t (αt + 2β) = 0

which has t = 0 as a solution. Since the solutions are coinciding with t = 0 it follows
that

    
β = 0 ⇒ (a1,1 xt + a1,2 yt + a1,3) l + (a1,2 xt + a2,2 yt + a2,3) m = 0.

This is an equation in two variables, l and m respectively. Obviously, this equation
has ∞ solutions. The solution (0, 0), albeit satisfying the equation, is unacceptable
because it has no geometrical meaning, see Eq. (6.11). Since ∞ solutions satisfy the
equation, if we find one solution all the others are proportional to it. A solution that
satisfies the equation is the following:

l = a1,2 xt + a2,2 yt + a2,3
m = −(a1,1 xt + a1,2 yt + a1,3).

If we substitute the values of l and m into Eq. (6.11) we obtain
   
(a1,1 xt + a1,2 yt + a1,3)(x − xt) + (a1,2 xt + a2,2 yt + a2,3)(y − yt) = 0 ⇒
⇒ a1,1 xt x + a1,2 yt x + a1,3 x − a1,1 xt² − a1,2 yt xt − a1,3 xt +
+ a1,2 xt y + a2,2 yt y + a2,3 y − a1,2 xt yt − a2,2 yt² − a2,3 yt = 0 ⇒
⇒ a1,1 xt x + a1,2 yt x + a1,3 x + a1,2 xt y + a2,2 yt y + a2,3 y +
− a1,1 xt² − 2a1,2 xt yt − a1,3 xt − a2,2 yt² − a2,3 yt = 0.
(6.13)
Let us consider again Eq. (6.10) and let us consider that T is a point of the conic.
Hence, the coordinates of T verify the equation of the conic:

a1,1 xt² + 2a1,2 xt yt + 2a1,3 xt + a2,2 yt² + 2a2,3 yt + a3,3 = 0 ⇒
⇒ a1,3 xt + a2,3 yt + a3,3 = −a1,1 xt² − a2,2 yt² − 2a1,2 xt yt − a2,3 yt − a1,3 xt.


Substituting this result into Eq. (6.13) we obtain

a1,1 xt x + a1,2 yt x + a1,3 x + a1,2 xt y + a2,2 yt y + a2,3 y + a1,3 xt + a2,3 yt + a3,3 = 0 ⇒
⇒ (a1,1 xt + a1,2 yt + a1,3) x + (a1,2 xt + a2,2 yt + a2,3) y + a1,3 xt + a2,3 yt + a3,3 = 0
(6.14)
that is the equation of a line tangent to a conic in the point T with coordinates (xt , yt ).
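Equation (6.14) translates directly into a small routine. The sketch below is an added illustration (not part of the original text): it returns the coefficients of the line tangent to a conic at a point T of the conic; for the circle x² + y² − 1 = 0 at T = (0, 1) it yields y − 1 = 0, as expected.

def tangent_line(a11, a12, a13, a22, a23, a33, xt, yt):
    # coefficients (A, B, C) of the tangent A*x + B*y + C = 0, Eq. (6.14)
    A = a11 * xt + a12 * yt + a13
    B = a12 * xt + a22 * yt + a23
    C = a13 * xt + a23 * yt + a33
    return A, B, C

# circle x^2 + y^2 - 1 = 0, tangent at T = (0, 1)
print(tangent_line(1, 0, 0, 1, 0, -1, 0.0, 1.0))    # (0.0, 1.0, -1.0), i.e. y = 1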

6.5.3 Degenerate and Non-degenerate Conics: A Conic
as a Matrix

The equation of a line tangent to a conic is extremely important to study and under-
stand conics. In order to do so, let us consider the very special case when Eq. (6.14)
is verified regardless of the values of x and y or, more formally ∀x, y the line is
tangent to the conic. This situation occurs when the coefficients of x and y are null
as well as the constant coefficient. In other words, this situation occurs when


⎧ a1,1 xt + a1,2 yt + a1,3 = 0
⎨ a1,2 xt + a2,2 yt + a2,3 = 0
⎩ a1,3 xt + a2,3 yt + a3,3 = 0.

Algebraically, this is a system of 3 linear equations in 2 variables. From the Rouché-
Capelli Theorem we know that, since the rank of the incomplete matrix is at most
2, if the determinant of the complete matrix is not null then the system is certainly
incompatible. In our case, if
             ⎛ a1,1 a1,2 a1,3 ⎞
det Ac = det ⎜ a1,2 a2,2 a2,3 ⎟ ≠ 0
             ⎝ a1,3 a2,3 a3,3 ⎠

then the system is surely incompatible, i.e. it is impossible to have a line tangent to
the conic regardless of x and y. On the contrary, if det Ac = 0 the system has at least
one solution, i.e. the special situation when ∀x, y the line is tangent to the conic is
verified.
Geometrically, a line can be tangent to a conic regardless of its x and y values
only when the conic is a line itself or a point of the line. These situations correspond
to a degenerate conic. From the considerations above, given a generic conic having
equation
a1,1 x 2 + 2a1,2 x y + 2a1,3 x + a2,2 y 2 + 2a2,3 y + a3,3 = 0,

2 a1.2 a2. More specifically the following cases can occur: • det Ac = 0: the conic is non-degenerate • det Ac = 0: the conic is degenerate. The tangent point T is also indicated. the overlap of the conic with the line is graphically represented with a thick line. .2 a2. the conic is part of the line or coincides with it.3 a3. This can be seen as an initial problem involving two sets is collapsed into a problem involving only one set. On the right.e. a null determinant can be seen as the presence of redundant pieces of information.14 The conic having equation 5x 2 + 2x y + 14x + 8y 2 + 5y + 9 = 0 is associated to the matrix ⎛ ⎞ 517 ⎝1 8 3⎠ 739 whose determinant is −44.186 6 An Introduction to Geometric Algebra and Conics we can associate to it the following matrix ⎛ ⎞ a1. i. In figure.1 a1. Example 6.3 The determinant of this matrix tells us whether or not this conic is degenerate. a1. the understanding of the determinant can also be revisited: a line tangent to a conic means that two figures in the plane have one point in common or that the intersection of two sets is a single point. Yet again. The figure below depicts the two situations. The conic is non-degenerate. On the left a line tangent to a non- degenerate conic is shown.3 ⎠ .3 a2.3 Ac = ⎝ a1. the line is tangent to the conic in all its points. This finding allows us to see how a conic can also be considered as a matrix. Furthermore. If this intersection point is the only point of the conic or if the intersection is the entire line we can conclude that the set of the points of the conic is contained within the set of the line.

The following theorem describes this fact. i.5 Matrix Representation of a Conic 187 6. Hence.5.2 . l The amount of asymptotic directions is a very important feature of conics. (6.6.11). However. A pair of these numbers. a parabola has one asymptotic direction.2 μ + a1. μ) = 1.2 2 − a1.12) and let us analyse again its meaning. the condition α = 0 can be written as   α = a1.2 μ2 + 2a1. If we pose μ = ml the equation becomes   a2.e. It represents the intersection of a line with a generic conic. If α = 0 the polynomial becomes of the first order. the relation between the solution μ and the asymptotic directions (l. when existent. 0).e.e.1 = 0 namely. More specifically the following cases can be distinguished: • Δ > 0: the equation has two real and distinct solutions. 0).1 a2.2 lm + a2. This equation would be verified for (l.2 m 2 = 0. are said asymptotic direction of the conic. Algebraically. m) = (0.4 Classification of a Conic: Asymptotic Directions of a Conic Let us consider again Eq. Let us solve the equation above as a second order polynomial in the variable μ. .1 l 2 + 2a1. (6. no asymptotic directions exist Obviously. equation of the asymptotic directions. two asymptotic direc- tions of the conic exist • Δ = 0: the equation has two real and coinciding solutions. an ellipse has no asymptotic directions. Theorem 6. . we need to find a solution (l. m) = (0. Let us divide the equation α = 0 by l 2 . one asymptotic direction of the conic exist • Δ < 0: the equation has two complex solutions. It is a second order polynomial because the conic has a second order equation. this solution cannot be considered for what written in Eq. i.5 An hyperbola has two asymptotic directions. In order to do it we need to discuss the sign of Δ = a1. i. m) is given by  m (1.

10). a1. Theorem 6. Example 6.2 It occurs that • if det I33 < 0 the conic is an hyperbola • if det I33 = 0 the conic is a parabola • if det I33 > 0 the conic is an ellipse Example 6.3 y + a3.2 − a1.3 = .2 I3.2 we can study a conic from directly its matrix representation in Eq.3 x + a2.2 a2.2 y 2 + a1.2 = − det a1.3 this conic is classified by the determinant of its submatrix   a1.1 x 2 + 2a1.2 a2.3 = 0 and associated matrix ⎛ ⎞ a1.3 ⎝ a1.16 We know that the conic having equation x2 y2 + =1 25 16 .3 = det = 38 > 0.1 a1.2 x y + a2.2 2 − a1.3 ⎠ .2 a2. since    2  a1.2 a2.1 a2. a1.6 (Classification Theorem) Given the a conic having equation a1. 18 The conic is an ellipse.1 a1.1 a2.15 Let us consider again the conic having equation 5x 2 + 2x y + 14x + 8y 2 + 5y + 9 = 0.2 a1. We already know that this conic is non-degenerate. Let us classify this conic by calculating the determinant of I33 :   51 det I3.3 a2. (6.2 = − a1.1 a1.2 Δ = a1.3 a3.188 6 An Introduction to Geometric Algebra and Conics Furthermore.

since it happens that   16 0 det I3.3 = det = 400 > 0. A such the matrix is non-singular. 0 0 −400 It can be immediately observed that this matrix is diagonal. the simplified equations are written by choosing a reference system that leads to a diagonal matrix.3 = det = −400 < 0. In order to do it. hence the conic is non-degenerate. It can be shown that all the simplified equations of ellipses and hyperbolas cor- respond to diagonal matrices. By applying the Classifi- cation Theorem we observe that since   16 0 det I3.6. . 0 25 the conic is an ellipse. Let us write its associated matrix: ⎛ ⎞ 16 0 0 Ac = ⎝ 0 25 0 ⎠ . 25 16 The matrix representation of this conic is 16x 2 − 25y 2 − 400 = 0 with associated matrix ⎛ ⎞ 16 0 0 Ac = ⎝ 0 −25 0 ⎠ . Let us verify at first that this conic is non-degenerate. Let us verify this statement by considering the simplified equation of an hyperbola x2 y2 − = 1.5 Matrix Representation of a Conic 189 is an ellipse in its simplified form. 0 −25 the conic is an hyperbola. let us write the equation in its matrix representation: 16x 2 + 25y 2 − 400 = 0. Furthermore. Let us verify it by applying the Classification Theorem. 0 0 −400 Since this matrix is non-singular it is non-degenerate. More rigorously.

their scalar product is null: 1 + μ1 μ2 = 0. μ1 ) and (1.18 Let us now write the equation of a generic conic and let us classify it: 6y 2 − 2x + 12x y + 12y + 1 = 0. The associated matrix is ⎛ ⎞ 001 A = ⎝0 2 0⎠. 02 i. If these two directions are perpendicular for Proposition 4. Proof For an hyperbola the equation of the asymptotic directions   a2. −1 6 0 whose determinant is −36 − 36 − 6 = −78 = 0. the associated matrix is ⎛ ⎞ 0 6 −1 Ac = ⎝ 6 6 6 ⎠ . Proposition 6. Hence.e.6. respectively. c 100 that is non-singular. The conic is non-degenerate. the conic is a parabola. These two directions can be seen as two vectors in the plane.190 6 An Introduction to Geometric Algebra and Conics Example 6. Let us classify it   06 det I3.1 The asymptotic directions of an hyperbola are perpendicular if and only if the trace of the associated submatrix I33 is equal to 0.3 = det = −36 < 0.3 = det = 0.1 = 0 has two distinct real roots μ1 and μ2 . If we apply the Classification Theorem we obtain   00 det I3. The associated asymptotic directions are (1.2 μ2 + 2a1. the conic is non-degenerate. .2 μ + a1. 66 The conic is an hyperbola. μ2 ). Example 6. At first.17 Let us consider the following simplified equation of a parabola: 2y 2 − 2x = 0. It can be observed that the simplified equation of a parabola corresponds to a matrix where only its elements on the secondary diagonal are non-null.

2 −1 ⇒ (1. More specifically the conic is a degenerate hyperbola. whose solutions are μ1 = −2 and μ2 = 21 .e. i. then a11 = −a2. By working out the original equation we obtain   2x 2 + 4y 2 + 6x y = 0 ⇒ 2 x 2 + 2y 2 + 3x y = 0 ⇒ 2 (x + y) (x + 2y) = 0. i.2 .6 μ1 μ2 = a2.2 The trace of I33 is a1.1 If tr (I33 ) = 0.2 = 2 − 2 = 0.6. In order to do it let us solve the equation of the asymptotic directions:   a2. 21 . Hence. By applying the classification theorem we find out that since   23 det I3.5 Matrix Representation of a Conic 191 a1.3 = det = −1 < 0 34 the conic is an hyperbola.1 + a2.2 .10 An hyperbola whose asymptotic directions are perpendicular is said to be equilateral. that is = −1. the conic is degenerate.19 The hyperbola having equation −2x 2 + 2y 2 − x + 3x y + 5y + 1 = 0 has perpendicular asymptotic directions since a11 + a2.1 1+ = 0 ⇒ a11 = −a2. −2) and 1.6 μ1 μ2 = a2. Let us find the asymptotic directions. .  Example 6. Let us see a few degenerate examples. For Proposition 5. Definition 6.2 = 0  a1.2 μ2 + 2a1. the two directions are perpendicular.2 . a2. The corresponding asymptotic directions   are (1.e. 000 The determinant of the matrix is null. a pair of intersecting lines.1 For Proposition 5. a1. Example 6.2 μ + a1.20 Let us consider the following conic 2x 2 + 4y 2 + 6x y = 0 and its associated matrix ⎛ ⎞ 230 Ac = ⎝ 3 4 0 ⎠ . Hence. μ1 ) (1. μ2 ) = 0.1 = 2μ2 + 3μ − 2 = 0.

3 = det = 7 > 0. a degen- erate parabola since   1 −1 det I3. The conic is a point. More specifically. The matrix associated to this conic is ⎛ ⎞ 1 −1 0 Ac = ⎝ −1 1 0 ⎠ 0 0 −9 whose determinant is null.22 Let us consider the conic having equation y 2 + x 2 − 2x y − 9 = 0. the only real point that satisfies the equation of this conic (and thus the only point with geometrical meaning) is (0. 14 i.e. −1 1 This equation of the conic can be written as (x − y + 3) (x − y − 3) = 0 that is the equation of two parallel lines: y = x +3 y = x − 3. 0).21 If we consider the conic having equation 2x 2 + 4y 2 + 2x y = 0.192 6 An Introduction to Geometric Algebra and Conics Hence the conic is the following pair of lines: y = −x y = − x2 . Example 6. Example 6. the conic is an ellipse. more specifically. . its associated matrix is ⎛ ⎞ 210 Ac = ⎝ 1 4 0 ⎠ 000 is singular and its classification leads to   21 det I3. The conic is degenerate and.3 = det = 0.

In the first case the rank of the matrix Ac is 2 while in the second one the rank of Ac is 1. since infinite points belong to a conic. Since the matrix associated to this conic is ⎛ ⎞ 120 Ac = ⎝ 2 4 0 ⎠ . 6. Considering that   12 det I3.5 Diameters. and Axes of Conics Definition 6. On the contrary. The conic can written as (x + 2y)2 = 0 that is the equation of two coinciding lines. Obviously. .3 = det = 0. a conic has infinite chords.5 Matrix Representation of a Conic 193 Example 6.23 the two parallel lines are also overlapping. Centres. 000 the conic is degenerate. In Example 6.5. Asymptotes.11 A chord of a conic is a segment connecting two arbitrary points of a conic. Degenerate conics belonging to the latter group are said twice degenerate. The last two examples show that a degenerate parabola can be of two kinds. in Example 6.6. 24 it follows that the conic is a parabola.23 Let us consider the conic having equation x 2 + 4y 2 + 4x y = 0.21 the parabola breaks into two parallel lines.


Definition 6.12 Let (l, m) be an arbitrary direction. A diameter of a conic conjugate
to the direction (l, m) (indicated with diam) is the locus of the middle points of all
the possible chords of a conic that have direction (l, m). The direction (l, m) is said
to be direction conjugate to the diameter of the conic.

It must be noted that, in this context, the term conjugate is meant with respect
to the conic. In other words, the directions of the line and of the diameter are conjugated
to each other with respect to the conic.

Definition 6.13 Two diameters are said to be conjugate if the direction of one diam-
eter is the conjugate direction of the other and vice-versa.

The conjugate diameters diam and diam′ are depicted in the figure below.


Proposition 6.2 Let diam be the diameter of a conic conjugated to a direction
(l, m). If this diameter intersects the conic in a point P the line containing P and
parallel to the direction (l, m) is tangent to the conic in the point P.

Proof Since P belongs to both the conic and the line, it follows that the line cannot
be external to the conic. It must be either tangent or secant.
Let us assume, by contradiction, that the line is secant to the conic and thus
intersects it in two points. We know that one point is P. Let us name the second
intersection point Q. It follows that the chord PQ is parallel to (l, m). The middle
point of this chord, by definition of diameter, is a point of the diameter diam. It
follows that P is the middle point of the chord PQ (end point and middle point at the
same time). We have reached a contradiction, which can be resolved only if the segment
reduces to a single point, i.e. only if the line is tangent to the conic. 

Proposition 6.3 Let us consider an ellipse and a non-asymptotic direction (l, m).
Let us consider now the two lines having direction (l, m) and tangent to the conic.
Let us name the tangent points A and B respectively.


The line passing through A and B is the diameter of a conic conjugate to the
direction (l, m).

Theorem 6.7 For each (l, m) non-asymptotic direction, the diameter of a conic
conjugated to this direction has equation
   
(a1,1 x + a1,2 y + a1,3) l + (a1,2 x + a2,2 y + a2,3) m = 0. (6.15)

It can be observed from Eq. (6.15) that for each direction (l, m) a new conjugate
diameter is associated. Hence, while these parameters vary a family of intersecting
straight lines is identified. For every direction (l, m) a conjugate diameter is identified.
Since infinite directions in the plane exist, a conic has infinite diameters.

Definition 6.14 The center of a conic is the intersection point of the diameters.

In order to find the coordinates of the center, we intersect two diameters that is
Eq. (6.15) for (l, m) = (1, 0) and (l, m) = (0, 1), respectively. The coordinates of the
center are the values of x and y that satisfy the following system of linear equations:

a1,1 x + a1,2 y + a1,3 = 0
(6.16)
a1,2 x + a2,2 y + a2,3 = 0.

The system is determined when

    ⎛ a1,1 a1,2 ⎞
det ⎝ a1,2 a2,2 ⎠ ≠ 0.

In this case the conic is either an hyperbola or an ellipse. These conics have a
center and infinite diameters passing through it (family of intersecting straight lines).
On the contrary, if

 
    ⎛ a1,1 a1,2 ⎞
det ⎝ a1,2 a2,2 ⎠ = 0

the conic is a parabola and two cases can be distinguished:
• a1,1/a1,2 = a1,2/a2,2 ≠ a1,3/a2,3 : the system is incompatible, the conic has no center within the
plane;
• a1,1/a1,2 = a1,2/a2,2 = a1,3/a2,3 : the system is undetermined and has infinite centres.

In the first case, the two equations of system (6.16) are represented by two parallel
lines (hence with no intersections). Since there are no x and y values satisfying
the system also Eq. (6.15) is never satisfied. Hence, the conic has infinite parallel
diameters (a family of parallel straight lines).
In the second case, the parabola is degenerate. The rows of the equations of the
system are proportionate, i.e. a1,2 = λa1,1 , a2,2 = λa1,2 , and a2,3 = λa1,3 with λ ∈ R.
The two equations of system (6.16) are represented by two coinciding lines.
It follows that
   
(a1,1 x + a1,2 y + a1,3) l + (a1,2 x + a2,2 y + a2,3) m = 0 ⇒
⇒ (a1,1 x + a1,2 y + a1,3)(l + λm) = 0 ⇒ a1,1 x + a1,2 y + a1,3 = 0.

In this degenerate case, the parabola has thus only one diameter (all the diameters are
overlapped on the same line) having equation a1,1 x + a1,2 y + a1,3 = 0 and where
each of its points is a center.

Example 6.24 If we consider again the hyperbola having equation

6y 2 − 2x + 12x y + 12y + 1 = 0,

we can write the equation of the diameters

(6y − 1) l + (6x + 6y + 6) m = 0.

In order to find two diameters let us write the equations in the specific case
(l, m) = (0, 1) and (l, m) = (1, 0). The two corresponding diameters are

6y − 1 = 0
6x + 6y + 6 = 0.

Let us now simultaneously solve the equations to obtain the coordinates of the center.
The coordinates are (−7/6, 1/6).
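The same center can be obtained numerically by solving system (6.16). A minimal sketch, added here as an illustration (not part of the original text) and assuming NumPy is available:

import numpy as np

# 6y^2 - 2x + 12xy + 12y + 1 = 0: a11 = 0, a12 = 6, a13 = -1, a22 = 6, a23 = 6
I33 = np.array([[0.0, 6.0],
                [6.0, 6.0]])
rhs = -np.array([-1.0, 6.0])             # -(a13, a23)
print(np.linalg.solve(I33, rhs))         # approximately (-7/6, 1/6)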

Example 6.25 Let us consider the conic having equation

4y 2 + 2x + 2y − 4 = 0.


The associated matrix ⎛ ⎞
00 1
Ac = ⎝ 0 4 1 ⎠
1 1 −4

has determinant −4. The conic is non-degenerate and, more specifically, a parabola
since  
00
det I3,3 = det = 0.
04

This parabola has no center. If we tried to look for one we would need to solve
the following system of linear equations

1=0
4y + 1 = 0

which is incompatible. The parabola has infinite diameters parallel to the line having
equation 4y + 1 = 0 ⇒ y = −1/4.

Example 6.26 Let us consider again the conic having equation

x 2 + 4y 2 + 4x y = 0.

We know that this conic is a degenerate parabola consisting of two coinciding
lines. Let us write the system of linear equations for finding the coordinates of the
center 
x + 2y = 0
2x + 4y = 0.

This system is undetermined and has ∞ solutions. The conic has ∞ centres
belonging to the diameter. The latter is the line having equation x + 2y = 0.

Proposition 6.4 Let the conic C be an ellipse or an hyperbola. The diameters of
this conic are a family of intersecting straight lines whose intersection point is the
center of symmetry of the conic C .

Proof Let (l, m) be a non-asymptotic direction and diam the diameter conjugated to
(l, m). Let us indicate with (l′, m′) the direction of the diameter diam. The diameter
diam intersects the conic in two points A and B, respectively. Thus, the segment AB
is a chord. We may think of the infinite chords parallel to AB and thus of a diameter
diam′ conjugated to the direction (l′, m′). The diameter diam′ intersects the conic
in the points C and D, respectively, and the segment AB in its middle point M (by
definition of diameter). The point M is also the middle point of the segment CD.


We can make the same consideration for any arbitrary direction (l, m), i.e. every
chord is intersected in its middle point by a chord having direction conjugate to
(l, m). Hence, the diameters are a family of straight lines intersecting in M and M
is the center of symmetry. 

Definition 6.15 An axis of a conic is a diameter when it is perpendicular to its
conjugate direction (equivalently: an axis is a diameter whose direction is conjugated
to its perpendicular direction).

In order to calculate the equation of an axis, let us prove the following theorem.

Theorem 6.8 An ellipse or an hyperbola has two axes, perpendicular to each other.

Proof Let us consider a direction (l, m) and Eq. (6.15):
   
(a1,1 x + a1,2 y + a1,3) l + (a1,2 x + a2,2 y + a2,3) m = 0 ⇒
⇒ a1,1 lx + a1,2 ly + a1,3 l + a1,2 mx + a2,2 my + a2,3 m = 0 ⇒
⇒ (a1,1 l + a1,2 m) x + (a1,2 l + a2,2 m) y + (a1,3 l + a2,3 m) = 0.
    
The direction of the diameter is (−(a1,2 l + a2,2 m), a1,1 l + a1,2 m). The diame-
ter is an axis if its direction is perpendicular to (l, m) and hence if their scalar product
is null:
   
(a1,2 l + a2,2 m) l − (a1,1 l + a1,2 m) m = 0 ⇒
a1,2 l² + a2,2 lm − a1,1 lm − a1,2 m² = 0 ⇒
a1,2 l² + (a2,2 − a1,1) lm − a1,2 m² = 0.

If we solve this equation in the variable l we have that the discriminant is
((a2,2 − a1,1) m)² + 4a1,2² m²


which is always positive. Hence, it always has two solutions, which give the directions of
the two axes, respectively:

l1 = [−(a2,2 − a1,1) m + √(((a2,2 − a1,1) m)² + 4a1,2² m²)] / (2a1,2)
l2 = [−(a2,2 − a1,1) m − √(((a2,2 − a1,1) m)² + 4a1,2² m²)] / (2a1,2).

Since ellipses and hyperbolas have two perpendicular axis directions, they have two
axes perpendicular to each other. 

When the directions of the axes are determined the equation of the axis is obtained
by imposing that the axis passes through the center of the conic, which is the center
of the family of diameters. Let the direction of an axis be (l, m) and let C be the center
of the conic having coordinates (xc, yc); the equation of the corresponding axis is

(x − xc)/l = (y − yc)/m.
Corollary 6.1 In a circumference every diameter is axis.

Proof If the conic is a circumference it has equation x 2 + y 2 = R, thus a1,1 = a2,2
and a1,2 = 0. It follows that the equation
 
a1,2 l² + (a2,2 − a1,1) lm − a1,2 m² = 0

is always verified. Hence every diameter is axis. 

Example 6.27 Let us consider the conic having equation

9x 2 + 4x y + 6y 2 − 10 = 0.

The associated matrix ⎛ ⎞
92 0
⎝2 6 0 ⎠
0 0 −10

is non-singular. Hence, the conic is non-degenerate.
The submatrix  
92
26

has positive determinant. Hence, the conic is an ellipse.
The equation of the family of diameters is

(9x + 2y) l + (2x + 6y) m = (9l + 2m) x + (2l + 6m) y = 0.


From the equation of the family of diameters we can obtain the coordinates of the
center by solving the following system of linear equations

9x + 2y = 0
.
2x + 6y = 0

The coordinates of the center are (0, 0). In order to find the direction of the axes
(l, m) it is imposed that

− (2l + 6m) l + (9l + 2m) m = 0 ⇒ −2l 2 + 3lm + 2m 2 = 0.

Posing μ = m/l and dividing the equation by l² we obtain

2μ² + 3μ − 2 = 0

whose solutions are
μ = 1/2
μ = −2.
Hence the directions of the axes are (1, 1/2) and (1, −2). The corresponding equa-
tions of the axes, i.e. equations of lines having these directions and passing through
the center of the conic are

x − 2y = 0
2x + y = 0.
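These results can be cross-checked with a few lines of code (an added illustration, not part of the original text, assuming NumPy is available): the quadratic 2μ² + 3μ − 2 = 0 is solved for μ = m/l and the two resulting axis directions are perpendicular, as expected.

import numpy as np

# axis directions of 9x^2 + 4xy + 6y^2 - 10 = 0: roots of 2*mu^2 + 3*mu - 2 = 0
mu1, mu2 = np.roots([2.0, 3.0, -2.0])
d1 = np.array([1.0, mu1])                # direction (1, -2) or (1, 1/2)
d2 = np.array([1.0, mu2])
print(mu1, mu2)                          # -2.0 and 0.5
print(np.dot(d1, d2))                    # approximately 0: perpendicular directions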

Theorem 6.9 Let Ac be the matrix associated to a parabola and
(a1,1 x + a1,2 y + a1,3) l + (a1,2 x + a2,2 y + a2,3) m = 0 be the equation of the family of
its parallel diameters. The axis of a parabola is only one and is parallel to its diam-
eters. The coefficients of the equation ax + by + c = 0 of the axis are equal to the
result of the following matrix product

(a b c) = (a1,1 a1,2 0) Ac.

Corollary 6.2 The coefficients of the equation of the axis of a parabola can also be
found as
(a b c) = (a2,1 a2,2 0) Ac.

Example 6.28 The conic having equation

x 2 + 4x y + 4y 2 − 6x + 1 = 0


has associated matrix ⎛ ⎞
1 2 −3
⎝ 2 4 0 ⎠
−3 0 1

which is non-singular. The conic is non-degenerate.
The submatrix  
12
24

is singular. The conic is a parabola.
The family of diameters have equation

(x + 2y − 3) l + (2x + 4y) m = 0.

To find the axis of the conic let us calculate
        ⎛  1 2 −3 ⎞
(1 2 0) ⎜  2 4  0 ⎟ = (5 10 −3).
        ⎝ −3 0  1 ⎠

The equation of the axis is 5x + 10y − 3 = 0.
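The matrix product of Theorem 6.9 is immediate to reproduce (an added sketch, not part of the original text, assuming NumPy is available):

import numpy as np

Ac = np.array([[ 1.0, 2.0, -3.0],
               [ 2.0, 4.0,  0.0],
               [-3.0, 0.0,  1.0]])
print(np.array([1.0, 2.0, 0.0]) @ Ac)    # [ 5. 10. -3.], i.e. 5x + 10y - 3 = 0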

Although an in-depth understanding of the axis of the parabola is outside the scope
of this book, it may be useful to observe that the procedure is, at first glance,
very different with respect to that described for the ellipse and the hyperbola. However,
the procedures are conceptually identical after the assumption that the center of the
parabola exists and is outside the plane where the parabola lies. This center is a so-
called point at infinity, which is a notion of Projective Geometry, see [17].

Definition 6.16 The vertex of a conic is the intersection of the conic with its axes.

Example 6.29 Let us find the vertex of the parabola above. We simply need to find
the intersections of the conic with its axis, i.e. we need to solve the following system
of equations: 
x 2 + 4x y + 4y 2 − 6x + 1 = 0
5x + 10y − 3 = 0.

By substitution we find that x = 17/75 and y = 14/75.
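The substitution can also be carried out symbolically (an added sketch, not part of the original text, assuming SymPy is available):

import sympy as sp

x, y = sp.symbols('x y')
conic = x**2 + 4*x*y + 4*y**2 - 6*x + 1
axis = 5*x + 10*y - 3
print(sp.solve([conic, axis], [x, y]))   # [(17/75, 14/75)]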

Definition 6.17 An asymptote of a conic is a line passing through its center and
having as its direction the asymptotic direction.

An ellipse has no asymptotes because it has no asymptotic directions. An hyper-
bola has two asymptotes, one for each asymptotic direction. A parabola, although it has
one asymptotic direction, has no (univocally defined) center, and thus no asymptote.

5 Matrix Representation of a Conic 203 Example 6. The asymptotic directions       are then 1. 1 − 23 and 1. Hence. respectively.31 Let us consider the equation of the following conic x 2 − 2y 2 + 4x y − 8x + 6 = 0. Let us search for the asymptotic directions of the conic by solving the equation −2μ2 + 4μ + 1 = 0. The determinant of the matrix   1 2 2 −2 is −6 that is < 0. Hence the conic is an hyperbola. The associated matrix is ⎛ ⎞ 1 2 −4 ⎝ 2 −2 0 ⎠ −4 0 6 is non-singular.   The solutions are 1 − 23 and 1 + 23 . The equations of the asymptotes are x + 19 y + 17 25 = 25 1 −2 and x + 19 y+ 17 25 = 1 25 1 2 Example 6. .6. The center of the conic can be easily found by soling the following system of linear equations:  −2x + 23 y − 21 = 0 3 2 x + 2y + 25 = 0 whose solution is xc = − 19 25 and yc = − 17 25 . 1 + 23 .30 Let us consider again the hyperbola having equation −2x 2 + 2y 2 − x + 3x y + 5y + 1 = 0   We know that the asymptotic directions are (1. the conic is non-degenerate. 21 . −2) and 1.

6.3 x + a2. The equation of the canonic form of an ellipse or an hyperbola is L X 2 + MY 2 + N = 0 . Let us now write the equations of the asymptotes: x− 4 y − 43 3 =  1 1 − 23 x− 4 y − 43 3 =  .3 y + a3. and its associated matrices Ac and I3.6 Canonic Form of a Conic Let us consider a generic conic having equation a1. The expression of a conic in its canonic form is a technique of obtaining the conic in its simplified formulation by changing the reference system.2 y 2 + 2a2.3 . the canonic form is obtained by choosing a reference system whose center coincides with the center of the conic and whose axes coincide with the axes of the conic.e. ellipses and hyperbolas. the transformation itself can be computationally onerous as it requires the solution of a nonlinear system of equation. In some cases. Two different procedures are illustrated in the following paragraphs.e.2 x y + 2a1. the first is for conics having a center.204 6 An Introduction to Geometric Algebra and Conics Let us find the coordinates of the center  x + 2y − 4 = 0 2x − 2y = 0 which yields to xc = 4 3 and yc = 43 . i.3 = 0. the trans- formation into a canonic form can be extremely convenient and lead to a simplified mathematical description of the conic. the second is for parabolas.1 x 2 + 2a1. 1 1 + 23 6. On the other hand.4 can be used in a straightforward way. i.5. Canonic Form of Ellipse and Hyperbola If the conic is an ellipse or an hyperbola and thus has a center. by rotating and translating the reference system so that the equations in Sect.

2 The ellipse of the two equations is obviously the same but it corresponds.32 The conic having equation x 2 − 2x y + 3y 2 − 2x + 1 = 0   is non-degenerate since det (Ac ) = −1 and is ellipse since det I3. M = 2 + √ 2.17) L = 2 + 2.5 Matrix Representation of a Conic 205 where ⎧ ⎪ ⎨ L M N = det (Ac )   L M = det I3. This means that we have two canonic form for the ellipse. M = 2 − 2. Canonic Form of the Parabola If the conic is a parabola it does not have a center in the same plane where the conic lies. The trace of I3. and the ordinate’s axis perpendicular to the abscissa’s axis and tangent to the vertex.6. to an ellipse whose longer axis is aligned to the ordinate’s axis. in the other case. we may write ⎧ ⎪ ⎨ L M N = −1 LM = 2 ⎪ ⎩ L + M = 4. in one case. Example 6. N = − 21 (6. In order to have the conic written in its canonic form. to an ellipse whose longer part is aligned the abscissa’s axis and.3 . N = − 21 . To write it in its canonic form.3 = 2. the reference system is chosen to have its origin coinciding with the vertex of the parabola. The solutions of this system are √ √ L = 2 − √ 2.3 ⎪ ⎩   L + M = tr I3. that is  √  2  √  1 2− 2 X + 2 + 2 Y2 − = 0 2 and  √   √  1 2 + 2 X 2 + 2 − 2 Y 2 − = 0. the abscissa’s axis aligned with the axis of the parabola. The equation of a parabola in its canonic form is MY 2 + 2B X = 0 .3 is 4.

if they exist. 6. if they exist.6 Exercises 6.3 Identify. 6. the intersection points of the following two lines: 3x − 2y + 4 = 0 4x + y + 1 = 0. The two equations refer to a parabola which lies on the positive (right) semiplane and on the negative (left) semiplane.2 Write the equation of the line passing through the points P1 having coordinates (1.5 Identify.3 = 2. the intersection points of the following two lines: 3x − 2y + 4 = 0 9x − 6y + 12 = 0.33 The parabola having equation x 2 − 2x y + y 2 − 2x + 6y − 1 = 0   has det (Ac ) = −4 and tr I3. The equations of this parabola in its canonic form are √ 2Y 2 + 2√ 2X = 0 2Y 2 − 2 2X = 0. To find M and B we write the system  −M B 2 = −4 M = 2. . −3) and P2 having coordinates (−4. Example 6. 6. 5). if they exist.3 . the intersection points of the following two lines: 3x − 2y + 4 = 0 9x − 6y + 1 = 0. respectively.4 Identify.206 6 An Introduction to Geometric Algebra and Conics where  −M B 2 = det (Ac )   M = tr I3. 6.1 Identify the direction of the line having equation 4x − 3y + 2 = 0 6.

if possible. and asymptotes. if possible. if possible. 6. center. 6. Identify. Identify. axes. and asymptotes. 6.9 Check whether or not the conic having equation x 2 + y 2 + 2x y − 8x − 6 = 0 is degenerate or not. if it exists. diameters. and asymptotes.8 Check whether or not the conic having equation 4x 2 + 2y 2 + 2x y − 4y − 6 = 0 is degenerate or not.11 Check whether or not the conic having equation x 2 − 16y 2 + 6x y + 5x − 40y + 24 = 0 is degenerate or not. center.6 Find. center. the center of the family (3x + y + 4 = 0) l + (2x − 4y + 2 = 0) m = 0. axes. Identify.6 Exercises 207 6. diameters.10 Check whether or not the conic having equation x 2 + 2x y − 7x − 8y + 12 = 0 is degenerate or not.7 Check whether or not the conic having equation 4x 2 − 2y 2 + 2x y − 4y + 8 = 0 is degenerate or not. and asymptotes. diameters. center. and asymptotes. axes. 6. axes. axes. diameters. center. if possible. 6. if possible. Identify. diameters.6. Identify. .

Part II Elements of Linear Algebra .

w#» + λ #» v ∈ V3 . the sum of two natural numbers is always a natural number. Obviously if λ ∈ R and #» v ∈ V3 .1 Basic Concepts This chapter recaps and formalizes concepts used in the previous sections of this book. DOI 10. as we know. an algebraic structure is a collection of objects (a set) which can be related to each other by means of a composition law. An internal binary operation or internal composition law is a function (mapping) f : A × A → A.3 The couple composed of a set A and an internal composition law ∗ defined over the set is said algebraic structure. Let us focus at the beginning on internal composition laws. 1. 1.1 Let A be a nonempty set.1007/978-3-319-40341-0_7 .e. i. a formal characterization of the abstract algebraic structures and their hierarchy. © Springer International Publishing Switzerland 2016 211 F. Example 7. Consequently. Furthermore. An external binary operation or external composition law is a function (mapping) f : A × B → A where the set B = A. As mentioned in Chap. Linear Algebra for Computational Sciences and Engineering. this chapter reorganizes and describes in depth the topics mentioned at the end of Chap. Definition 7. Example 7.2 The product of a scalar by a vector is an external composition law over the sets R and V3 . Neri.1 The sum + is an internal composition law over the set of natural num- bers N. i. Definition 7.Chapter 7 An Overview on Algebraic Structures 7.2 Let A and B be two nonempty sets. Definition 7. This chapter is thus a revisited summary of concepts previously introduced and used and provides the mathematical basis for the following chapters.e. algebra is the study of the connections among objects.

Example 7. Definition 7.7 A semigroup (A.6 The neutral element of N with respect to the sum + is 0.6 Let a and b are two elements of A. Definition 7.5 The couple of a set A with an associative internal composition law ·. ·) is a commutative semigroup since both associative and commutative properties are valid for real numbers.4 The couple composed of the set of square matrices Rn. An internal composition law ∗ is said to be commutative when a · b = b · a. and then a · (b · c).1 Let (A. Definition 7.212 7 An Overview on Algebraic Structures 7. c of a semi- group can be univocally represented as a · b · c without the use of parentheses. (R.8 Let B ⊂ A. The neutral element of N with respect to the product is 1. and then (a · b) · c or b · c first.5 As we know from the Fundamental Theorem of Algebra the set of natural numbers N. Given the string a · b · c we can choose to calculate a · b first.3 The couple composed of the set of real numbers R and the multipli- cation. is said semigroup and is indicated with (A. Definition 7. The subset B is said to be closed with respect to the composition law · when ∀b.2 Semigroups and Monoids If we indicate with ∗ a generic operator and a and b two set elements. Example 7.n and the mul- tiplication of matrices is a semigroup because the multiplication is associative but it is not commutative. ·). is closed with respect to the sum +. Definition 7. b. Example 7. We would attain the same result. ·) is that special element e ∈ A such that ∀a ∈ A : a · e = e · a = a.9 The neutral element of a semigroup (A. the internal composition law of a and b is indicated with a ∗ b. Proposition 7. . Example 7. and c be three elements of A.4 Let a. which can be seen as a subset of R. b. b ∈ B : b · b ∈ B. It can easily be observed that the composition of three elements a. An internal composition law ∗ is said to be associative when (a ∗ b) ∗ c = a ∗ (b ∗ c) . An associative composition law is usually indicated with ·. ·) be a semigroup and e its neutral element. Definition 7. The neutral element is unique. ·) having a commutative internal composition law is said commutative semigroup.

the inverse element of a is that element b ∈ M such that a · b = e and is indicated with a −1 . ·).9 The monoid (Q.  Definition 7. +) is a monoid where the neutral element is 0. It follows that a −1 . Example 7. ·). Since also e is neutral element e · e = e · e = e . b−1 . . The inverse element of a. by contradiction. if exists. Hence e = e . If the inverse element of a exists.11 Let (M. ·) be a monoid and e its neutral element. Let a ∈ M.7.2 Semigroups and Monoids 213 Proof Let us assume. Proposition 7. ·) has neutral element e = 1. For all the elements a ∈ Q its inverse is a1 since it always occurs that a a1 = 1 regardless of a. a is said to be invertible.10 If we consider the monoid (R. Furthermore.   Example 7.10 A semigroup (A. by contradiction that also e is a neutral element. · composed of square matrices and their product is a monoid and its neutral element is the identity matrix. Proposition 7. If a ∈ M. that c is also an inverse element of a. It follows that e · e = e · e = e.  −1 −1 a =a and (a · b)−1 = b−1 · a −1 . is unique. ·) be a monoid and e its neutral element. ·) be a monoid and a ∈ M.2 Let (M. Proof Let b ∈ M be the inverse of a.3 Let (M.  Definition 7. Let us assume. It can be easily observed that neutral element of a monoid is always invertible and its inverse is the neutral element itself. Let a and b be two invertible elements of this monoid.8 The semigroup Rn. Example 7. Definition 7. ·) having neutral element is said monoid and is indicated with (M.7 The semigroup (R.12 Let (M. the neutral element is e = 1 and its inverse is 11 = 1 = e. Hence it follows that a·b =e =a·c and b = b · e = b · (a · c) = (b · a) · c = e · c = c. Example 7. ·) be a monoid and e its neutral element. and a · b are invertible.n .

Let us now look for the inverse element of a. This example shows how. (a · b) is invertible and its inverse is b−1 · a −1 . the operator ∗ is associative.   Hence. Hence the invertible elements of the monoid (Z. which introduce a new algebraic structure. i. this operator is a ∗ b = a + b − ab. Hence one element i such that i · a −1 = a −1 · i = e exists. it follows that a · a −1 = a −1 · a = e.214 7 An Overview on Algebraic Structures Proof Since a −1 is the inverse element of a. ∗) are 0 and 2. a−1 In order to have a −1 ∈ Z. Hence.11 Let us show how a monoid can be generated by means of a non- standard operator. This means that the inverse a −1 is an element of Z such that a ∗ a −1 = 0. A special case would be that all the elements of the monoid are invertible. a −1 = a. ∗) is a monoid. For two elements a and b ∈ Z. This means that a −1 is −1 invertible and its unique inverse is a. a must be either 0 or 2. The neutral element is 0 since a ∗ 0 = a + 0 − a0 = a = 0 ∗ a. The associativity can be verified by checking that (a ∗ b) ∗ c = a ∗ (b ∗ c) which is a ∗ (b ∗ c) = a + (b ∗ c) − a (b ∗ c) = a + (b + c − bc) − a (b + c − bc) = = a + b + c − bc − ab − ac + abc = (a + b − ab) + c − c (a + b − ab) = = (a ∗ b) + c − c (a ∗ b) = (a ∗ b) ∗ c. (Z. ∗) must verify associativity and must have a neutral element.  Since a and b are invertible then  −1 −1    b · a  · (a · b) = b−1· a −1 ·a · b = b−1 · e · b = b−1 · b = e (a · b) · b−1 · a −1 = a · b · b−1 · a −1 = a · e · a −1 = a · a −1 = e. In order to be a monoid. Let us define an operator ∗ over Z. This occurs when a a ∗ a −1 = a + a −1 − aa −1 = 0 ⇒ a −1 = .e.  Example 7. . Let us prove that (Z. in general only some elements of a monoid are invertible.

7. ·) is said to be abelian (or commutative) if the operation · is also commutative.6 Let (G. Proposition 7. Proof Let g and h ∈ G and g = g −1 and h = h −1 .12 The monoid (R. Hence. Definition 7. This semigroup satisfies the cancellation law if a·b =a·c ⇒b =c b · a = c · a ⇒ b = c. +) is a group since the sum is associative. Example 7. ·) is a monoid such that all its elements are invertible.15 Let (G. 0 is the neutral element. Also. If ∀g ∈ G : g = g −1 . Definition 7. +) is an abelian group because the sum of vectors is commutative and associa- tive. ·) be a semigroup and a. ·) be a group. ·) is abelian. neutral element and inverse element for all its elements.3 Groups and Subgroups Definition 7. Proposition 7. If g and h ∈ G. then the group (G.16 Let (S. Proposition 7. (gh)z = g z · h z . the operator it is abelian.13 A group (G.  Definition 7. . +) is abelian because the sum is commutative. .5 The following properties of the power are valid • g m+n = g m · g n • g m·n = (g m )n . This means that a group should have associative property. has neutral element #» o and for every vector #» v another vector w#» = − #» v such #» #» #» that v + w = o exist. and for every element a there is an element b = −a such that a + b = 0. ·) is not a group because the element a = 0 has no inverse. The power g z is defined as gz = g · g · . . If g ∈ G and z ∈ Z.4 Let (G.13 The group (R.14 A group (G. On the contrary the monoid (R. · g where the · operation is performed z −1 times (g on the right hand sire of the equation appears z times). It follows that g · h = g −1 · h −1 = (h · g)−1 = h · g.3 Groups and Subgroups 215 7. (V3 . and c ∈ S. ·) be a group. b. Example 7. ·) be a group.

+) is a group. i. ·). are not natural numbers. the inverse element of 10 is 2 and the inverse element of 8 is 4. that the set is closed with respect to +12 and that the inverse elements are also contained in the set.7 A group (G. When the subgroup contains other elements is said proper subgroup. 8. 3. . · composed of square matrices and their product.  Definition 7. Hence. 5. for all the groups the cancellation law is valid. 2. 2. 4. 6. . Example 7. Since in a group all the elements have an inverse element. ·) always satisfies the cancellation law. .15 Considering that (R. 4. we can see that the statement AB = AC ⇒ A = C is true only under the condition that A is invertible. The pair (H. 10}. if b · a = c · a then b · a · a −1 = c · a · a −1 ⇒ b = c. 9. and c be ∈ (G. ·) is said to be a subgroup of (G.g. A trivial example of subgroup is composed of only the neutral element e. Hence. ·) if the following conditions are verified: • H is closed with respect to ·. +) would not be its subgroup because inverse elements. ∀x. 6. ·).16 If we consider the group (Z.e. 8. ∈ H • ∀x ∈ H : x −1 ∈ H In other words. the couple ([−5. −1. +12 ) where H = {0. ·) be a group. y ∈ H : x · y ∈ H • e. 11} and +12 is a cyclic sum. 10.17 Let us consider the group (Z 12 . +) is not a proper subgroup because the set is not closed with respect to the sum. ·) satisfies the cancellation law.n . −2. It follows that if a · b = a · c then a −1 · a · b = a −1 · a · c ⇒ b = c. . neutral element of (G. e. 1. 5] . e. +12 ) where Z 12 = {0. Analogously. a subgroup is a couple composed of a subset of a group that still is a group.216 7 An Overview on Algebraic Structures  semigroup (R. ⊗). i.17 Let (G. A similar consideration can be done for the semigroup (V3 . If we consider the Example 7. if the sum exceeds 1 of a quantity δ the result of +12 is δ. 10 + 7 = 6. the structure (N. A subgroup of this group is (H.e. −3. +). Example 7. Proof Let a. b. Proposition 7. the statement is in general not true. e its neutral element and H be a nonempty set such that H ⊂ G.g. 7. For example 11 + 1 = 0.14 The monoid Rn. It can be easily shown that the neutral element 0 ∈ H . Example 7. etc.

18 Let us consider the group (Z. 11}. 7. . 7. if we calculate the coset H + 2 we obtain H . ·) be a group and (H. Therefore we have the same subgroup. H + 11 leads to the same set {1. . −13. . Furthermore. . 4. H + 4. . 20. −18 . Let us now consider 1 ∈ Z 12 and build the coset H +1: {1.1 Cosets Definition 7. 8. The same results would have been achieved also with H + 0.3. 5. and of equivalence class. . Example 7. 3. is a set containing all the infinite possible elements that are equivalent to a. H + 3. 5. starting from this group and subgroup only two cosets can be generated. due to the commutativity of the operator. right and left cosets coincide. in Definition 1.21: an equivalence relation is a relation that is reflexive. −5. 22 . H + 9. −15. . ·) its subgroup. 7. The set H g = {h · g : h ∈ H } is said right coset of H in G while the set g H = {g · h : h ∈ H } is said left coset of H in G. A right coset is a set of the type g + h with a fixed g ∈ G and ∀h ∈ H . H + 6. +12 ) and its subgroup (H. The equivalence class of an element a. −8. 2.18 Let (G.3 Groups and Subgroups 217 7. 10}.19 Let us consider again the group (Z 12 .2 Equivalence and Congruence Relation Before introducing the concept of congruence relation. This occurs because 2 ∈ H and H is closed with respect to +12 . H + 8. 17. This fact is expressed saying that the index of H in Z 12 is 2.7. . Let us fix a certain g ∈ G. 10. . − 3. let us recall the definition of equivalence relation. symmetric and transitive. if the group is abelian. as in this case. 9.20. in Definition 1. It must be noted that the neutral element 0 is included in the subgroup. 15. the operation H + 1. 6. 5. 3. −10. Example 7. +12 ) where H = {0. . . Obviously. H + 5. 12.3. 11}. Also. indicated with [a]. −20. 7. . Hence. ∀h ∈ H that is {2. Since the operation is commutative (2 + h = h + 2) the left coset is the same. and H + 10. H + 7. +) and its subgroup (Z5. 9.}.}. +) where Z5 is the set of Z elements that are divisible by 5: Z5 = {0. It must be remarked that a coset is not in general a subgroup. An example of right coset is the set of 2 + h.

c ≡ a and c ≡ b.e. Theorem 7. b ∈ A either [a] and [b] are disjoint or coincide • A is the union of all the equivalence classes Proof To prove the first point. the subsets composing the partition never overlap). let us consider that the intersection [a] ∩ [b] is either empty or is nonempty. This means that • given a. Hence [a] = [b]. . In other words. since a ≡ b it follows that a ∈ [b]. 7. The equivalence relation ≡ partitions the set A. By symmetry and transitivity a ≡ b. 7. In Fig.  Let U A be the union of the equivalence classes of the elements of A.218 7 An Overview on Algebraic Structures Fig. The following theorem is a general result about equivalence classes. Hence. due to the reflexivity (it is always true that a ≡ a) always belongs to at least one class: a ∈ [a].1 Let A be a set and ≡ be an equivalence relation defined over A. hence c ∈ [a] and c ∈ [b]. if two equivalence classes are not disjoint. there exists and element c such that c ∈ [a] ∩ [b].e. A ⊂ U A . Since by symmetry b ≡ a.21). it follows that b ∈ [a] and [b] ⊂ [a]. it is reported in this section because it is of crucial importance for group theory. i. Definition 7. This means that [a] ⊂ [b].1 Example of four partitions We can now introduce the following concept. In the second case. Hence. then the they are coinciding. an element a ∈ A. every element of A is also element of U A .1 a set composed of sixteen elements and four partitions is represented. From the definition of equivalence class (see Definition 1. To prove the second point we need to consider that. there is a unique set X ∈ P such that x ∈ X (i. However.19 A partition P of A is a collection of subsets of A having the following properties: • every set in P is not the empty set • for every element x ∈ A.

Therefore. a = h 1 · h 2 · c. the equivalence class [a] is a subset of A.  Definition 7. Hence. then there exists a value h 1 such that a = h 1 · b. Hence. that k = h 1 · h 2 ∈ H .8 Let ∼ be the congruence relation modulo H of a group (G. Analogously. the congruence relation means that b is divisible by a. Thus. . symmetry and transi- tivity. i. the congruence relation is reflexive. A is the union of all the equivalence classes. ·) be a group and (H. Consequently. Within H there exists the neutral element e such that a = e · a ⇒ a ∼ a. Proof To prove the equivalence we need to prove reflexivity. Hence. 1.3 Groups and Subgroups 219 An equivalence class [a] by definition (see Definition 1.20 Let (G. We know. Proposition 7.7. that is [a]. Hence. ·). Although the reasoning of this section is performed in an abstract way if we consider the operation · as a multiplication of numbers. a also belongs to A. since H is a group. It follows that the congruence is an equivalence relation.  Thus.e.21) is [a] = {x ∈ A|x ≡ a}. If ∃h ∈ H such that a = h · b. then there exists a value h 2 such that b = h 2 · c. ·) its subgroup. we can write h −1 · a = b ⇒ b = h̃ · a ⇒ b ∼ a. A = U A . U A ⊂ S. all the intersections between two classes are empty). The right congruence relation modulo H a ∼ b (or simply right congruence) is the relation defined as : ∃h ∈ H such that b = h · a. if we consider an element a ∈ U A . Hence. the left congruence is b = a · h. then for the definition of group the inverse element h −1 ∈ H also exists. the meaning of the congruence relation is the divisibility of numbers. a belongs to only one class (because. the congruence relation is symmetric. which means a ∼ c. as shown in the previous point. Hence. if a ∼ b. if b ∼ c. 3. Hence ∃k ∈ H such that a = k · c. Hence the congruence relation is transitive. Simply. The following proposition gives an important property of the congruence relation which allows us to make a strong statement about groups. Let a and b ∈ G. It follows that the congruence relation is an equivalence. if we impose h ∈ N. for every a ∈ G the congruence relation ∼ identifies an equivalence class [a] = H a. The immediate consequence is that there is a bijection between each element of the set G and its associated coset. 2.

Two right (left) cosets of H in G either coincide or are disjoint. 6. 3. 6. +12 ) where H = {0. 7. We know that two cosets can be generated H + 0 = {0. 5.220 7 An Overview on Algebraic Structures 7.1. Example 7. Every right (left) coset of H in G has the same order as (H. The set G is equal to the union of all the right (left) cosets: G = ∪i=1 n H gi . 10}. 2. Let us convey that an algebraic structure is finite if its set is finite and let us name order of an algebraic structure the cardinality of the set associated to it. Let us consider a generic y ∈ H g. Let h 1 = h 2 . it follows that for the cancellation law h 1 · g = h 2 · g. Since both injection and surjection properties are verified. which is y. This fact would be enough to conclude that the number of cosets is equal to the number of the elements of G. Lemma 7. This statement can be rephrased as the order of a right coset of H in G has the same order as (H. Let us define a function φ : H → H g. Hence. 8. 4. +12 ) and its subgroup (H. Lemma 7. for a given y ∈ H g. This means that φ is an injective function.1 Let (G. ·) its subgroup.3 Let (G.  .20 Let us consider again the group (Z 12 . 2. let us revisit the previous results in the context of group theory by stating the following lemmas. ·) its subgroup. Lemma 7. ·) be a group and (H. 4. This means that the function is surjective. φ (h) = h · g.3. Let us consider φ (h). 11} The union of the cosets is clearly Z 12 . Thus the cardinality of H is equal to the cardinality of the right coset H g. It is equal to h · g. Considering that φ (h 1 ) = h 1 · g and φ (h 2 ) = h 2 · g. ·). 8. 10} H + 1 = {1. More specifically. Proof Let H g be a right coset. We know that there exists a bijection between the elements of two equivalent sets. Nonetheless. from the fact that congruence is an equivalence relation the following lemmas immediately follow. Since y belongs to a coset then a certain h value belonging to the subgroup H such that y = h · g exists.3 Lagrange’s Theorem We may think that an equivalence class is univocally associated to each element of G. ·) be a group and (H. ·). ∃ a value h ∈ H such that y = φ (h). by applying Theorem 7. ·) its subgroup. 9. ·) be a group and (H.2 Let (G. it follows that φ is bijective.

The set R is closed with respect to these two operations and contains neutral elements with respect to both sum and product. Definition 7. let us report again Definition 1. The Lagrange’s Theorem is hence verified since |G| 12 = =2 |H | 6 is an integer number. . We know that we have 2 cosets of cardinality 6 that is also the cardinality of H .4 Rings As we know from Chap. . . Hence.21 (Ring) A ring R is a set equipped with two operations called sum and product. 8. this union is disjoint. for Lemma 7. In addition.22 Let us consider again the group (Z 12 .21 By using the same numbers of Example 7. it follows that G = H g1 ∪ H g2 ∪ . For the Lemma 7. From Lemma 7. ·) its sub- group.2 (Lagrange’s Theorem) Let (G.7. Since it is disjoint we can write that the cardinality of G (here indicated as |G|) is equal to the sum of the cardinalities of each coset |H g1 | + |H g2 | + .1. Theorem 7. +12 ) and its subgroup (H. Example 7. the cardinality of each coset is equal to the cardinality of the corresponding subgroup. H g2 .20. namely axioms of a ring. Proof Let H g1 . 6. For convenience. the order of H divides the order of G. indicated with 0 R and 1 R respectively. ∪ H gk and.e. |G| = k |H | . • commutativity (sum): x1 + x2 = x2 + x1 • associativity (sum): (x1 + x2 ) + x3 = x1 + (x2 + x3 ) . the following properties. .2. Then. 7. i. 10}. + |H gk |. . .3 Groups and Subgroups 221 The proof for a left coset is analogous.3. +12 ) where H = {0. ·) be a finite group and (H. 2. H gk be the right cosets of H in G. The sum is indicated with a + sign while the product operator is simply omitted (the product of x1 by x2 is indicated as x1 x2 ). we can immediately see that the cardinality of H as well as the cardinality of each coset is 6. . 1 a ring is a set equipped with two operators. must be valid. The cardinality of Z 12 is 12.30.  Example 7. the ratio of the cardinality of G by the cardinality of H is an integer number. 4. .

The latter observations constitute the theoretical foundation of the following proposition. • there exists only one neutral element 0 R with respect to the sum • for every element a ∈ R there exists only one element −a (this element is said opposite element) such that a + (−a) = 0 R • cancellation law is valid: a + b = c + b ⇒ a = c • there exists only one neutral element 1 R with respect to the product Before entering into ring theory in greater details. a ring has two neutral elements. the result of a0 R is always 0 R . Hence c = c + c.n . ) are rings. At first. +. Hence. ) be a ring. +. Then. +. (R.222 7 An Overview on Algebraic Structures • neutral element (sum): x + 0 R = x • inverse element (sum): ∀x ∈ R : ∃ (−x) |x + (−x) = 0 • associativity (product): (x1 x2 ) x3 = x1 (x2 x3 ) • distributivity 1: x1 (x2 + x3 ) = x1 x2 + x1 x3 • distributivity 2: (x2 + x3 ) x1 = x2 x1 + x3 x1 • neutral element (product): x1 R = 1 R x = x Example 7. ). +. Rn. The structure Rn. Considering that it is always true that if the same quantity is added and subtracted the result stays unvaried: c = c + c − c. +. composed of square matrices with sum and product is a ring. Proposition 7. Proposition 7.10 Let (R. ).  commutativity  is a requirement for the sum but not for the product. +.n . +.  . and (C. ). ) be a ring. ). +) and a monoid (R. let us convey that a − b = a + (−b). (Q.n . +. Furthermore the existence of the inverse element with respect to the product for all the elements of the set is also not a requirement.24 The algebraic structure Rn. Finally.23 The algebraic structures (Z. +. It follows that ∀a ∈ R a0 R = 0 R a = 0 R Proof If c = a0 R then for distributivity a (0 R + 0 R ) = a0 R + a0 R = c + c. The neutral elements are the zero matrix for the sum and the identity matrix for the product. The following properties are valid. is a ring although the product is not commutative.   Example 7.9 Let (R. we can write c = c + c − c = c − c = 0R . A few remarks can immediately be made on the basis only of the ring definition. is a ring although square matrices are not always invertible. a ring can be seen as the combination of a group (R. Hence.

) be a ring. ) be a ring.  Definition 7. Definition 7. .22 Let (R. b ∈ R a (−b) = − (ab) Proof Let us directly check the result of the operation a (−b) + ab: a (−b)+ab = a (−b + b) = a0 R = 0 R . b ∈ R. b ∈ R (−a) (−b) = ab Proof (−a) (−b) = a (−1 R ) b (−1 R ) = a (−1 R ) (−1 R ) b = a1 R b = ab.27 Let X be a nonempty set and P (X ) be its power set.12 Let (R. It follows that ∀a.25 Some examples  of commutative rings are (Z.24 Let (R. . .7. see Appendix A. +. The ring is said to be commu- tative when ab = ba. that the algebraic structure RR . Example 7.1 Let (R. It follows that ∀a ∈ R a (−1 R ) = (−1 R ) a = −a. aaaa. +. If 0 R = 1 R the ring is said to be degenerate. It follows that (−1 R ) (−1 R ) = 1 R . This special structure is called Boolean Ring and constitutes the theoretical foundation of Boolean Algebra.26 Let us indicate with RR be a set of all the possible functions defined on R and having codomain  R. (Q. ) be a ring. ) be a ring and a ∈ R. Example 7. ). ∩) where Δ is the symmetric difference is a commutative ring.n . and (R. ) while Rn. Example 7. ) be a ring. Corollary 7. +. ) be a ring. +. is a commutative ring and 0 R = 0 and 1 R = 1. +. Definition 7. It follows that ∀a.23 Let (R. The n th power of a is the product of a calculated n times: a n = aaaa . Corollary 7. ) be a ring and a.4 Rings 223 Proposition 7. +. +.  Analogously it can be checked that (−a) b = − (ab). +. It  can be verified by checking the ring axioms. +.2 Let (R. It can be proved that the algebraic structure (P (X ) . is not a commutative ring. +.11 Let (R. ). By using this proposition the following two corollaries can be easily proved. Δ. Proposition 7. +. +.

+.1 Cancellation Law for Rings The basic definitions and properties of rings did not contain any statement about cancellation law with respect to the product. +. +. It follows that • a n+m = a n a m • a nm = (a n )m . +. +. ) is a subring of (Q. Theorem 7. ) be a ring and S ⊂ R with S = ∅. b ∈ R. ) is a ring and a ∈ R. ) that is a subring of (C. the can- cellation law of the product is not verified. For example the square of a binomial (a + b)2 can be written as       2 2 2 2 2 2 (a + b) = 2 a b + 2 0 a b + 1 1 a 0 b2 = a 2 + ab + b2 0 1 2 2 1 2 that is the formula of the square of a binomial a 2 + b2 + 2ab. it is not true that (ab)n = a n bn . If (S. +. ) be a commutative ring and a. if (R. ) that is a subring of (R. and n. m ∈ N. ) is a ring then (S.29 It can be observed that (Z.224 7 An Overview on Algebraic Structures Proposition 7. +.25 Let (R. Example 7. +. in general. 7.28 Newton’s binomial is a powerful formula that allows to represent a binomial of any power as a sum of monomials.13 Let (R.4. ) be a ring. In general. The latter equation is valid only if the ring is commutative. ). +. a ∈ R. ) is said subring. +. It occurs that ∀n ∈ N n   n (a + b)n = a n−i bi i i=0   n where is said binomial coefficient and defined as i   n n! = i i! (n − 1)! with the initial/boundary condition that     n n = = 1. Definition 7. The reason is that. .3 (Newton’s binomial) Let (R. For commutative rings the following theorem is also valid. 0 n Example 7.

The latter example suggests that there is a relation between the cancellation law (more specifically its failure) and the fact that the product of two elements different from the neutral element is null. the cancellation law of the product would be ca = cb ⇒ a = b. i. +. The cancellation law can be not valid also in the case of commutative rings. If we now pose a = 0. 01 It results that   01 ab = = ac 01 and still b = c. of the square matrices of size 2 and let us pose   01 a= .31 Let us consider the ring of real functions RR .   Example 7.30 If (R. 01 and   10 c= . +.7.   Let us consider the ring R2. 01   01 b= . b.4 Rings 225 Example 7. b = f (x). ) and a. and two functions belonging to RR :  x if x ≥ 0 f (x) = 0 if x ≤ 0 and  x if x ≤ 0 g (x) = 0 if x ≥ 0. 0 R .2 . +.e. c ∈ R. . and c = g (x) and calculate ca = 0 = cb while a = b.

we can conclude that R2.26 Let (R. c ∈ R with c = 0 R ac = bc ⇒ a = b. Proposition 7. b. When we multiply a by b we obtain   00 ab = .14 Let (R. The element a is said zero divisor if there exists an element b ∈ R with b = 0 R such that ab = 0 R . ) is an integral domain (hence without zero divisors) and c = 0 R it follows that necessarily (a − b) = 0 R . ) be a ring and a ∈ R such that a = 0 R . Since (R. +. that is the cancellation law.226 7 An Overview on Algebraic Structures Definition 7. and b is a zero divisor of a. and the matrices   01 a= 05 and   7 10 b= . A ring without zero divisors is a special ring formally introduced in the following definition. It follows that ac = bc ⇒ (ac − bc) = 0 R ⇒ (a − b) c = 0 R . +.2 . ) be an integral domain. This means that although a = 0 R and Hence.   Example 7. 00  b = 0 R still it happens that ab = 0 R .  . c ∈ R with c = 0 R such that ac = bc. b. neither a nor b are the null matrix. 0 0 Obviously. Proof Let us consider three elements a. This fact means that a = b. Definition 7.32 Let us consider the ring R2. if a ring contains zero divisors then the cancellation law of the product is not valid. Then it occurs that for all a. +. +. In other words.27 A commutative ring which does not contain any zero divisor is said integral domain. +.2 .

+. ) and (C. a field can be seen as the combination of two groups. The element a is said to be invertible if there exists and element b ∈ R such that ab = 1 R . Since in fields all the elements are invertible a is also invertible and a −1 a = 1 F . The element b is said inverse of the element a. Let us consider a generic element b ∈ F such that ab = 0 F . . +. +. Proof For simplicity of notation let us proof this proposition in the case of a com- mutative ring but the non-commutative case is analogous. Example 7.35 In the case of the ring (Q. +. It follows that b = b1 R = b (ac) = (ba) c = 1 R c = c.36 From the previous example we can easily state that (Q. is unique. ).  Example 7. b is not a zero divisor of a. c are both inverse elements of a. Proposition 7. Therefore. +. ) are −1 and 1. all its elements are invertible. +. This means that there are no zero divisors. ) are fields. Proposition 7. +. The inverse element of a. Proof Let (F. +. since b = 0 F .15 Let (R.  The concept of field is the arrival point of this excursus on algebraic structures and the basic instrument to introduce the topics of the following chapters. an integral domain is a commutative ring that does not contain zero divisors. By definition. are those square matri- ces of order n that are non-singular. ) be a field and a ∈ F with a = 0 F . ) is a field. Definition 7. In other words. Hence.29 A commutative ring (F. that a has two inverse elements and b. Also (R. +.28 Let (R.n . Example 7. all the elements of Q except 0 can be expressed as a fraction. ) be a ring and a ∈ R.   Example 7. ) be a ring and a ∈ R. +. Let us assume. ) such that all the elements of F except the neutral element with respect to the sum 0 F are invertible is said field.16 Every field is an integral domain. when it exists. It follows that   b = 1 F b = a −1 a b = a −1 (ab) = a −1 0 F = 0 F .7.34 The invertible elements of the ring Rn.4 Rings 227 Definition 7. by contra- diction.33 The only invertible elements of the ring (Z.

g.   Definition 7.2 . ∗ ). ∗ be two rings. a separate definition must be given. ·) and G  . ) and R2. ∗ ). this section shows how a mapping can be defined over algebraic structures. y ∈ G it follows that φ (x · y) = φ (x) ∗ φ (y)   is said group homomorphism from G to G  (or from (G. The mapping φ : N → 2N defined as φ (x) = 2x. +. ·) to G  . e.31 Let (R. ⊕. from ancient Greek “omos” and “morph”. literally means “the same form”.   Definition 7. Let us check that φ (x + y) = φ (x) + φ (y): . ∗ be two groups. this section focuses on a specific class of mappings that has interesting properties and practical implications. This fact is very easy to verify in this case. A mapping f : R → R  such that for all x.30 Let (G. In particular. Regarding algebraic structures endowed with two operators. +. +. semigroup homomorphism and monoid homomorphism.228 7 An Overview on Algebraic Structures 7.5 Homomorphisms and Isomorphisms Subsequent to this basic introduction to group ring theory.37 Let us consider the groups (N. ) to R  . +) and (2N. Let us better clarify this fact by means of the following example. Example 7.   Example 7. A mapping φ : G → G  such that for all x. +. the concepts of the other algebraic structures endowed with an operator can be defined. . ) and R  .2 defined as   x 0 φ (x) = 0x where x ∈ R is an homomorphism. +) where 2N is the set of positive even numbers. An homomorphism is a transformation that preserves the structure between two algebraic structures. In a similar way. y ∈ R it follows that • f (x + y) = f (x) ⊕ f (y) • f (x y) = f (x) ∗ f (y) • f (1 R ) = 1 R    is said ring homomorphism from R to R  (or from (R. Homomorphism. Let us show that this mapping is an homomorphism: φ (x · y) = φ (x + y) = 2 (x + y) = 2x + 2y = φ (x) + φ (y) = φ (x) ∗ φ (y) . The mapping φ : R → R2.38 Let us consider the rings (R. ⊕.

this mapping is an homomorphism. considering that 1R = 1. 1. y1 ) + (x2 . + and the mapping φ : R3 → R2 φ (x) = φ (x. 0 xy 0x 0y Finally. y. On the other hand. 1) we cannot detect the point in R3 that generated it because it could be (1. y1 . The case of a bijective homomorphism is special and undergoes a separate defi- nition. For example if we consider the vector (1. we may think of an homomorphism as a transformation defined over the elements of an algebraic structure A that has a results elements of another algebraic structure B. in such a way that to each part of one structure there is a corresponding part in the other structure. (1.2 . if we start from a vector (x. 1. Intuitively. y2 ) = φ (x1 ) + φ (x2 ) . If we consider two vectors x1 = (x1 . z) that generated it. 1.     Example 7.   10 φ (1) = = 1R2. z 2 ) it follows that φ (x1 + x2 ) = (x1 + x2 . 3. the main feature of an homomorphism is a mapping that transforms a group into a group. y) it is impossible to find the vector (x. this mapping is not bijective. we cannot obtain starting from the algebraic structure B the elements of A. (1. . a ring into a ring etc. 2). However. Hence. z 1 ) and x2 = (x2 . More formally. y) . In general. there is no requirement for an homomorphism to be a bijection.32 A bijective homomorphism is said isomorphism. where corresponding means that the two parts play similar roles in their respective structures. this mapping is not injective. y1 + y2 ) = (x1 . 1. It can be easily shown that this mapping is an homomorphism.7. 8). y. z) = (x. (1. Definition 7. + and R2 .39 Let us consider the groups R3 . Hence. 1). An intuitive way to describe ismorphisms is by means of the following quote of the mathematician Douglas Hofstadter: The word isomorphism applies when two complex structures can be mapped onto each other. Let us see the following example.5 Homomorphisms and Isomorphisms 229       x+y 0 x 0 y0 φ (x + y) = = + = f (x) + f (y) .56723) etc. y2 . 0 x+y 0x 0y Let us now check that φ (x y) = φ (x) φ (y):      xy 0 x 0 y0 φ (x y) = = = φ (x) φ (y) . 01 In other words.

. Hence.is stronger than the prefix homo-. The mapping φ : N → 10N is φ (x) = 10x . 11 within the context of graph theory. The following examples give an idea of what an isomorphism is and how it can be the theoretical foundation of several computational techniques.   Example 7. Let us show that the mapping is an homomorphism: φ (x + y) = 10x+y = 10x 10 y = φ (x) φ (y) . It is a surjection since every positive number can be expressed as 10x . the prefix iso. it can be remarked that there is a relation between the fact that an homomorphism is an isomorphism and that the mapping is invertible. If we think about the words. It is an injection since if x1 = x2 then 10x1 = 10x2 . see [19]. an isomorphism transforms an algebraic structure into a structurally identical algebraic structure. In this light. Example 7.41 Let  us indicate  with R+ the set of positive real numbers. Let us con- sider the groups R+ . the word ‘iso’ means ‘identical’. the latter unlike the first is also reversible. i. +) and the mapping f : R+ → R. +) and 10N . where 10N is the set of the powers of 10 with natural exponent. Isomorphism is an extremely important concept in mathematics and is a precious help in problem solving. While ‘homo’ means ‘same’. This mapping is an homomorphism since f (x y) = log (x y) = log (x) + log (y) = f (x) + f (y) . of the same kind. For this reason.40 Let us consider the groups (N. the problem can be transformed into and isomorphic one and solved within the isomorphic algebraic structure. An extensive example of isomorphism is presented in Chap. see [18]. The solution in the isomorphic domain is then antitransformed to the original problem domain. In order to verify that this homomorphism is an isomorphism let us show that this mapping is an injection and a surjection. f (x) = log (x) where log is the logarithm. then these algebraic structures are said isomorphic.42 The Laplace Transform is an isomorphism between differential equa- tions and complex algebraic equations. and (R. The mapping is an isomorphism because it is injective as if x1 = x2 then log (x1 ) = log (x2 ) and it is surjective as every positive number can be expressed a logarithm of a real number. while an homomorphism transforms an algebraic structure into a structure of the same kind. When a problem is very hard to be solved in its domain (its algebraic structure).e. Example 7.230 7 An Overview on Algebraic Structures If an isomorphism holds between two algebraic structures. this mapping is an isomorphism. Obviously.

semigroup. 3. +.2 Considering theset of the square matrices Rn.3 Let (Z. +8 ) with H = {0. 2. . 2. 6} verify whether or not (A. 5. +) and R+ . 4. ∗) be an algebraic structure where the operator ∗ is defined as a ∗ b = a + 5b. 7} and +8 be the cyclic sum. f (x) = e x is an homomorphism and an isomorphism. If the structure is a monoid verify whether or not this monoid is a group. if they exist. monoid. verify whether or not Rn. its inverse elements. Verify whether or not (Q. 2. if they exist.6 Exercises 231 7. 4. If the structure is a monoid verify whether or not this monoid is a group.1 Considering the set A = {0.   7. Prove that ∀a. 1. b ∈ R (−a) b = − (ab).4 Let (Q. +) is an alge- braic structure. . Verify whether or not the mapping f : R → R+ . 1. ∗) be an algebraic structure where the operator ∗ is defined as a ∗ b = a + b + 2ab.6 Exercises 7.6 Let (R. Verify whether or not (H. 7. 6} is a subgroup. monoid. Verify whether or not (Z. 6. 7. or group. · is an algebraic structure.n and the product of matrices ·. 7. or group.n .7. its inverse elements.5 Let Z 8 be {0. ) be a ring. Represent the cosets and verify the Lagrange’s Theorem in the present case. 7. ∗) is a monoid and identify. semigroup.7 Let us consider the groups (R. ∗) is a monoid and identify. 7. 4.

© Springer International Publishing Switzerland 2016 233 F. As in the case of the product of a scalar by a vector. DOI 10. namely vector space axioms are verified. 4. μ ∈ K : λ (μu) = (λμ) u = λμu • distributivity 1: ∀u. v. • E is closed with respect to the internal composition law: ∀u. 9 we will refer with K to either the set of real numbers R or the set of complex numbers C). ) if and only if the following ten axioms. v ∈ E : u + v ∈ E • E is closed with respect to the external composition law: ∀u ∈ E and ∀λ ∈ K : λu ∈ E • commutativity for the internal composition law: ∀u. Let “·” be an external composition law. Linear Algebra for Computational Sciences and Engineering. see Chap. Let “+” be an internal composition law. +. Neri. v ∈ E : u + v = v + u • associativity for the internal composition law: ∀u. +. E × E → E. Definition 8. μ ∈ K : (λ + μ) u = λu + λu • neutral elements for the external composition law: ∀u ∈ E : ∃!1 ∈ K|1u = u where o is the null vector.1 Basic Concepts This chapter revisits the concept of vector bringing it to an abstract level.1 (Vector Space) Let E to be a non-null set (E = ∅) and K to be a scalar set (in this chapter and in Chap. ·) is said vector space of the vector set E over the scalar field (K.1007/978-3-319-40341-0_8 . w ∈ E × E : u + (v + w) = (u + v) + w • neutral element for the internal composition law: ∀u ∈ E : ∃!o ∈ E|u + o = u • opposite element for the internal composition law: ∀u ∈ E : ∃!−u ∈ E|u + −u = o • associativity for the external composition law: ∀u ∈ E and ∀λ. the symbol of external composition law · will be omitted. v ∈ E and ∀λ ∈ K : λ (u + v) = λu + λv • distributivity 2: ∀u ∈ E and ∀λ.Chapter 8 Vector Spaces 8. Throughout this chapter. Let us name vectors the elements of the set E. The triple (E. for analogy we will refer to vectors using the same notation as for numeric vectors. K × E → E.

·). Proposition 8. v ∈ U : u + v ∈ U • ∀λ ∈ K and ∀u ∈ U : λu ∈ U. o ∈ U. with K we will refer to the scalar field (K. • The set of numeric vectors Rn with n ∈ N. i. it follows that ∀u ∈ U : ∃λ|λu = o. +.n . ·) if (U. ·) is a vector subspace of (E. +. ·) be a vector space. • ∀u. ·) is a vector subspace. ).1 Let (E. U ⊂ E. +.2 Let (E. ·) be a vector space over a field K. the set U is closed with respect to the external composition law. (E. 8. which still is a vector space.n . ·). The triple (U. • The set ofmatrices R  m. The triple (U.  Proposition 8. Example 8.234 8 Vector Spaces In order to simplify the notation.  If (U. +. and U = ∅. ·). Proof Since the elements of U are also elements of E. +. are valid. • The set of geometric vector V3 . If U is also closed with respect to the composition laws then (U. +. Thus. and U = ∅. +. +. +. +. +. the sum between vectors and the product of a scalar by a numeric vector.3 For every vector space (E. +. ·) if and only if U is closed with respect to both the composition laws + and ·. In the latter case if n = 1. +. +. ·) of (E.  Proposition 8. Every vector subspace (U. the sum between vectors and the product of a scalar by a geometric vector. ·) is a vector space and since U ⊂ E. then it is a vector space. the sum between matrices and the product of a scalar by a matrix. Rm.e. ·) be a vector space. Thus. +. (Rn . U is vector subspace of (E. ·) is a vector space over the same field K with respect to both the composition laws.1 The following triples are vector spaces. Since (U. including the closure with respect of the composition laws.e. the set of numeric vectors is the set of real numbers. ·) is a vector subspace of (E. in this chapter and in Chap. ·) contains the null vector. +. +. +.2 (Vector Subspace) Let (E. U ⊂ E. +. ). ·) is a vector subspace of (E. Proof Considering that 0 ∈ K. . ·) and ({o}. ). +. +. i. numeric vectors with sum and product is a vector space but a vector spaces is a general (abstract) concept that deals with the sets and composition laws that respect the above-mentioned ten axioms. 9. .2 Vector Subspaces Definition 8. +. they are vectors that satisfy the eight axioms regarding internal and external composition laws. ·). at least two vector subspaces exist. (V3 . the ten axioms.

(1) Let us consider two arbitrary vectors belonging to U. Let us calculate u1 + u2 = (x1 + x2 . y1 + y2 . z1 + z2 ) . z1 + z2 ) is . +. y2 . These two vectors are such that 3x1 + 4y1 − 5z1 = 0 3x2 + 4y2 − 5z2 = 0. 3λx + 4λy − 5λz = = λ (3x + 4y − 5z) = λ0 = 0. u2 ∈ U : u1 + u2 ∈ U. z1 ) and (x2 . Since the null vector o ∈ / U. +. (U. We know that 3x + 4y − 5z = 0. 3 (x1 + x2 ) + 4 (y1 + y2 ) − 5 (z1 + z2 ) = = 3x1 + 4y1 − 5z1 + 3x2 + 4y2 − 5z2 = 0 + 0 = 0. +. This means that ∀λ ∈ K and ∀u ∈ U : λu ∈ U. · . y1 + y2 .   Example 8. we proved that (U. u1 = (x1 . z1 ) and u2 = (x2 . The sum vector (x1 + x2 . · and its subset U ⊂ R3 : U = {(x. z2 ).   Thus. y. +. y1 .2 Let us consider the vector space R3 . +. In correspondence to the vector u1 + u2 . Let us calculate λu = (λx.2 Vector Subspaces 235   Example 8. ·) is not a vector space. λy. y. z) ∈ R3 |8x + 12y − 7z + 1 = 0}.3 Let us consider the vector space R3 . y1 . (2) Let us consider an arbitrary vector u = (x. Although it is not necessary. This means that ∀u1 . y. ·) is a vector subspace of R3 . +. · . y2 . z2 ). We have to prove the closure with respect to the two composition laws. z) ∈ U and an arbitrary scalar λ ∈ R. z) ∈ R3 |3x + 4y − 5z = 0}   and let us prove that (U. ·) is a vector subspace R3 . +.8. · and its subset U ⊂ R3 : U = {(x. let us consider two vectors (x1 . let us check that the set U is not closed with respect to the internal composition law. λz) . In correspondence to the vector λu. In order to do this.

then u + v ∈ U. +. ·).  Corollary 8. +. The set U is not closed with respect to the internal composition law. ·) and (V. it belongs to their intersection U ∩ V . ·) is a vector space. (2) Let u be an arbitrary vector ∈ U ∩V and λ an arbitrary scalar ∈ K. Analogously. +.e. Hence. If (U. Hence. ·). +.1 Let (E. λz) ∈ / U. Thus. +. ·). it can be easily proved that (U ∪ V. Proof For the Proposition 8. +. z) ∈ U.1 Let (E. ·) and (V. +. ·) is a vector subspace of (E. (1) Let u.1 it would be enough to prove the closure of the set U ∩ V with respect to the composition laws to prove that (U ∩ V. ·) their union is in general not a subspace of (E. Analogously. i.e. This means that U ∩ V is closed with respect to the + operation. y1 + y2 . v be two arbitrary vectors ∈ U ∩ V . (U ∩ V. +. ·) is a vector space. ·) is a vector subspace of (E. Since (V. ·). +. λy. +. ·) be a vector space. (λx. 8 (x1 + x2 ) + 4 (y1 + y2 ) − 5 (z1 + z2 ) + 1 = 0. then (U ∩ V. . If (U. If u ∈ U ∩V then u ∈ U and u ∈ V . then U ∩ V is always a non-empty set as it contains at least the null vector. If u ∈ U ∩ V then u ∈ U and u ∈ V . This means that U ∩ V is closed with respect to the · operation. +. +. ·) is a vector space. it belongs to their intersection U ∩ V . +. It must be observed than if (U. then λu ∈ U. Since (U. ·) are two vector subspaces of (E. if v ∈ U ∩ V then v ∈ U and v ∈ V .236 8 Vector Spaces 8 (x1 + x2 ) + 4 (y1 + y2 ) − 5 (z1 + z2 ) + 1 = = 8x1 + 4y1 − 5z1 + 8x2 + 4y2 − 5z2 + 1. Theorem 8. +. Thus. ·). v ∈ U and (U. On the contrary. ·) be a vector space. This means that (x1 + x2 . +. i. λu belongs to both U and V . then λu ∈ V . v ∈ V and (V. If we consider that 8x2 + 4y2 − 5z2 + 1 = 0 it follows that 8x1 + 4y1 − 5z1 = 0. ·). λy. then (λx. ·) and (V. +. ·) if U ⊂ V or V ⊂ U. +. ·) are two vector subspaces of (E. if (x. since U ∩ V is closed with respect to both the composition laws. Since u. λz). ·) is a vector subspace of (E. then u + v ∈ V . ·) is a vector space. is 8 (λx) + 4 (λy) − 5 (λz) + 1 which in general is not equal to zero. +. ·) is a vector subspaces of (E. +. z1 + z2 ) ∈ / U. +. +. y. ·) are two vector subspaces of (E. +. with lambda scalar. +. Since both u. It follows that u + v belongs to both U and V . +.

1. y) values satisfying the following system of linear equations:  −5x + y = 0 3x + 2y = 0. ·) is a vector subspace of R2 . · . y) values satisfying the following system of linear equations:  x − 5y = 0 3x + 2y = 2   that is 10 . y) ∈ R2 | − 5x + y = 0} V = {(x.e. · and its subsets U ⊂ R2 and V ⊂ R2 : U = {(x. As stated by Theorem 8. 2 . ·) is not a vector space.   It can be easily shown that while (U. +.4 Let us consider the vector space R2 . +. i. +.  2It can be easily shown that both (U.   Example 8. +. A geometric interpretation of the system above is the intersection of two lines passing through the origin of a reference system and  vector. Since U ∩ V does not contain the null vector. · and its subsets U ⊂ R2 and V ⊂ R2 : U = {(x. ·). y) ∈ R2 |3x + 2y = 0}. Hence the only solution is (0. Regardless of this fact the intersection U ∩ V can be still calculated and the set composed of those (x. This means that U ∩ V is composed of those (x. (U ∩ V. · . ·) 17 17 is not a vector space. +.6 Let us consider the vector space R2 .8. In this case. +. +.   Example 8.2 Vector Subspaces 237   Example 8.5 Let us consider the vector space R2 . y) values belonging to both the sets. that is the null vector o. y) ∈ R2 |x − 5y = 0} V = {(x. +. thatis the null a vector subspace of R2 . y) ∈ R2 |3x + 2y − 2 = 0}. ·) and (V. +. +. +. 0). ·) is intersecting in it. . +. obviously (U ∩ V. satisfying both the conditions above. · and its subsets U ⊂ R2 and V ⊂ R2 : U = {(x. (V. y) ∈ R2 | − 5x + y = 0} V = {(x. This is an homogeneous system of linear equations which is is determined (the incomplete matrix associated to the system is non-singular). ·) are vector subspaces of R . +. y) ∈ R2 | − 15x + 3y = 0}. · . the vector subspace is the special one ({o}. The intersection U ∩ V is composed of those (x.

we obtain 4x + y = 0 ⇒ y = −4x. the system has ∞ solutions. since the rank of the associated matrix is 2 and there are 3 variables. Hence. By applying Rouchè-Capelli Theorem. Since the system is homogeneous the system is compatible and at least the null vector is its solution. By substituting the second equation into the first one. · .  3It can be easily verified that both (U. This is an homogeneous system of 2 linear equations in 3 variables. The system can be interpreted as two overlapping lines. is a vector space. The intersection U ∩V is composed of those (x. y. ·) is a vector subspace of R2 . Besides the null vector. The system above can be interpreted as the intersection of two planes. ·) and (V. Hence. a) satisfies the system of linear equations. −4a.   Example 8. y) satisfying the following system of linear equations:  −5x + y = 0 −15x + 3y = 0.238 8 Vector Spaces  2It can be easily shown that both (U.e. · . since the rank is lower than the number of variables the system is undetermined. −4a. by applying Rouchè-Capelli Theorem.1. · and its subsets U ⊂ R2 and V ⊂ R3 : U = {(x. U ∩ V = U = V and (U ∩ V. The second equations can be written as x = z. if we fix a value of x = a. +. +. +. ·) and (V. +. The intersection set U ∩ V is given by  3x + y + z = 0 x − z = 0. +. it has ∞ solutions. ·). a) |a ∈ R}. · . y. z) ∈ R3 |x − z = 0}.7 Let us consider the vector space R3 . i. +. This means that all the points in U are also points of V and the two sets coincide  U = V . ·) are vector subspaces of R . . This means that the intersection set U ∩ V contains ∞ elements and is given by: U ∩ V = {(a. +. ·) are vector subspaces of R . +. The resulting triple (U ∩ V. +. z) ∈ R3 |3x + y + z = 0} V = {(x. a vector (a. for Theorem 8. +.

  2  Example 8. (2) Let w be an arbitrary vector ∈ S and λ an arbitrary scalar ∈ K. Thus. Proof For the Proposition 8.1 it would be enough to prove the closure of the set S with respect to the composition laws to prove that (S. Since U and V are vector spaces. +. +. From the definition of the sum subset we can write that ∃u1 ∈ U and ∃v1 ∈ V |w1 = u1 + v1 ∃u2 ∈ U and ∃v2 ∈ V |w2 = u2 + v2 .2 Vector Subspaces 239 Definition 8. Thus λw = λu + λv ∈ S. since U and V are vector spaces.8. ·) is a vector subspace of (E. +. ·) be two vector subspaces of (E. u1 + u2 ∈ U and v1 + v2 ∈ V . ·). ·) are two vec- tor subspaces of (E. · and its subsets U ⊂ R2 and V ⊂ R2 : U = {(x. +. Theorem 8. ·) be a vector space. ·). ·) be a vector space. +. (1) Let w1 . ·) is a vector subspace of (E. λu ∈ U and λv ∈ V . +.8 Let us consider the vector space R . ·) and (V. If (U. y) ∈ R2 |y = 0} V = {(x. w2 be two arbitrary vectors belonging to S. The sum subset is a set S = U + V defined as S = U + V = {w ∈ E|∃u ∈ U. From the definition of sum set S = U + V = {w ∈ E|∃u ∈ U. +. v ∈ V |w = u + v}. +. +. . +. we obtain λw = λ (u + v) = λu + λv where. y) ∈ R2 |x = 0}. then (S = U + V. If we compute the product of λ by w. +. ·) and (V. The sum of w1 and w2 is equal to w1 + w2 = u1 + v1 + u2 + v2 = (u1 + u2 ) + (v1 + v2 ) . according to the definition of sum subset w1 + w2 = (u1 + u2 ) + (v1 + v2 ) ∈ S. ·). +. Let (U.3 Let (E.2 Let (E. v ∈ V |w = u + v}. ·). +.

  For Theorem 8. 2a) |a ∈ R} V = {(b.240 8 Vector Spaces It can be easily proved that (U. (S.2. +. +. z) ∈ R3 |x. · . The two subsets can be rewritten as U = {(a. ·) is a vector subspace of R2 . · . y. +. b) |a. U ∩ V = {o}. 0) ∈ R3 |x ∈ R}. ·) and (V. it is not only the null vector. y ∈ R} V = {(x. The two subsets can be rewritten as U = {(a. The triples (U. i. · and its subsets U ⊂ R3 and V ⊂ R3 : U = {(x. (S. The triples (U. i. If we now calculate the sum subset S = U + V we obtain S = U + V = {(a. +. · and its subsets U ⊂ R2 and V ⊂ R2 : U = {(x. −3b) |b ∈ R}. 2a − 3b) |a. ·) are vector subspaces. +.e. ·) are vector subspaces. by varying  2b) ∈ R the entire R is generated. ·) and (V. ·) and (V. +. In this case the vector subspace is R2 . b ∈ R}. y) ∈ R2 |3x + y = 0}.9 Let us consider the vector space R2 .10 Let us consider the vector space R3 .e. 2 2 R . b ∈ R} that2 coincides  with the entire R2 . · . Hence again the sum vector subspace is (a. +.   Example 8. For Theorem 8. The intersection of the two subspaces is U ∩ V = {(x. Also in this case U ∩ V = {o}.   Example 8. ·) are vector subspaces. 0. 0. +. +. +. +. Again. 0) |a ∈ R} V = {(0. 0) ∈ R3 |x. The sum is again the entire R3 . · .  ·) is a vector subspace of R . b) |b ∈ R}. . The latter is obviously a vector space. z ∈ R}. y) ∈ R2 | − 2x + y = 0} V = {(x. +. +.2. The only intersection vector is the null vector. +. Let us calculate the sum subset S = U + V : S = U + V = {(a + b.

·) be a vector space. +. +. The triple (S = U ⊕ V. +. o = u1 + v1 − u2 − v2 = (u1 − u2 ) + (v1 − v2 ) . ·) and (V. ·) and (V.  Example 8.4 Let (E. This is a contradiction. +. +.3 Let (E.8. ·) be a vector space. if (v1 − v2 ) ∈ V also − (v1 − v2 ) ∈ V (for the axiom of the opposite element). ·) be two vector subspaces of (E. u2 ∈ U and ∃v1 . ·) is a vector space o ∈ S. +. +. Let (U. For hypothesis ∀w ∈ S : ∃!u ∈ U and ∃!v ∈ V |w = u + v.   (u1 − u2 ) = o u1 = u2 ⇒ (v1 − v2 ) = o v1 = v2  If S = {w ∈ E|∃!u ∈ U and ∃!v ∈ V |w = u + v} let us assume by contradiction that U ∩ V = {o} and that ∃t ∈ U ∩ V with t = o. Thus. ·) are said supplementary. In addition. +. Since (U. ·). Under this hypothesis. ·). +. +. ·). +. Thus. ·) and (V.11 As shown above the sum associated to the sets U = {(a. ·) is a vector subspace of (E. ·). If U ∩ V = {o} the subset sum S = U + V is indicated as S = U ⊕ V and named subset direct sum. ·) is a vector subspace of (E. +. +. Since (S. +. ·) and (V. +. ·) and (V. Let (U. Nonetheless. ·) are two vector subspaces of (E. Proof If S is a subset direct sum then U ∩ V = {o}. Theorem 8. ·). b) |b ∈ R}. +. ·) be two vector subspaces of (E. ·) are two vector spaces (and thus their sets are closed with respect to the internal composition law). +. ·) is a direct sum vector subspace (U ⊕ V. and the subspaces (U. By contradiction. let us assume that ∃u1 . Since U ∩ V = {o} we cannot have (u1 − u2 ) = − (v1 − v2 ) as the two terms in the equation would belong to both sets and thus their intersection unless both the terms in the equation are equal to the null vector o. +. where (u1 − u2 ) ∈ U and (v1 − v2 ) ∈ V since (U. if t ∈ U ∩ V also −t ∈ U ∩ V . Since (U ∩ V. +. +. v2 ∈ V with u1 = u2 and v1 = v2 such that w = u1 + v1 and w = u2 + v2 . we can express the null vector o ∈ U ∩ V as o = t + (−t) . 0) |a ∈ R} V = {(0. +. ·) if and only if S = U + V = {w ∈ E|∃!u ∈ U and ∃!v ∈ V |w = u + v}. o can be also expressed as o = o + o. +.1 (U ∩ V. for the The- orem 8. +. . +. ·) is a vector space. as it is a special case of (S. We expressed o that is an element of S as sum of two different pairs of vectors.2 Vector Subspaces 241 Definition 8. ·). The sum vector subspace (U + V.

v2 . y. z) ∈ R3 |x. v2 . . λn is the vector λ1 v1 + λ2 v2 + . . . we know that it can be expressed as (1. . b ∈ R}. vn ) ⊂ E or synthetically with L: L (v1 . . . there is only one way to obtain (a. . . 8. . For example the if we consider the vector (1. c) can be obtained in infinite ways.5 Let (E. . b. vn ∈ E and the scalars λ1 . λn ∈ K}. The sum set is the entire R2 generates as S = U + V = {(a. 0) ∈ R3 |x. . 0. . . Let the vectors v1 . λ2 . . This system has ∞ solutions depending on x1 and x2 . . . . vn ) = {λ1 v1 + λ2 v2 + . b). b) from (a. whose intersection is not only the null vector.242 8 Vector Spaces is a direct sum since U ∩ V = {o}. 3) = (x1 . The set containing the totality of all the possibly linear combinations of the vectors v1 . . . . v2 . . z ∈ R}. vn by means of n scalars λ1 . +. . .3 Linear Span Definition 8. . vn ∈ E by means of n scalars is named linear span (or simply span) and is indicated with L (v1 . . . 0. The linear combination of the n vectors v1 . 3). Definition 8. 0) + (x2 . . y. . for the sets U = {(x. v2 . 2. . This example can be inter- preted as the intersection of two planes. . . +. which is a line (infinite points) passing through the origin of a reference system. v2 . + λn vn |λ1 . z) which leads to the following system of linear equations: ⎧ ⎪ ⎨x1 + x2 = 1 y=2 ⎪ ⎩ z = 3. Obviously. + λn vn . y ∈ R} V = {(x. each vector (a. ·) be a vector space. λ2 . b) |a. . . On the contrary. λn ∈ K. . 2.6 Let (E. ·) be a vector space. . λ2 . 0) and (0. . .

(2) Let u be an arbitrary vector ∈ L and μ and arbitrary scalar ∈ K. Thus. 1) span the entire R2 since any point (x. v2 = (0. . + μn vn . are said to span the vector space (E. u + w = λ1 v1 + λ2 v2 + . . . Hence u + w ∈ L.1. ·). .8. λ2 . the vectors are said to span the set E or. +. . . . vn ) = E. . Hence. + λn vn . 2) . ·) is a vector subspace. Let us compute μu μu = μ (λ1 v1 + λ2 v2 + . equivalently. Thus. . ·). . v2 . y) ∈ R2 can be generated from λ1 v1 + λ2 v2 + λ3 v3 with λ1 . (1) Let u and w be two arbitrary distinct vectors ∈ L. + μn vn = = (λ1 + μ1 ) v1 + (λ2 + μ2 ) v2 + . . u = λ1 v1 + λ2 v2 + . vn ) with the composition laws is a vector subspace of (E. . +. vn = (1. + λn vn + μ1 v1 + μ2 v2 + . . .12 The vectors v1 = (1. λ3 ∈ R.4 The span L (v1 .3 Linear Span 243 In the case L (v1 . + (λn + μn ) vn . Example 8. u = λ1 v1 + λ2 v2 + . 0) . + λn vn w = μ1 v1 + μ2 v2 + . . it is enough to prove the closure of L with respect to the composition laws. Theorem 8. + λn vn ) = = μλ1 v1 + μλ2 v2 + . . v2 .  . Let us compute u + w. . for the Theorem 8. . . . . Proof In order to prove that (L. μu ∈ L. . +. . . + μλn vn .

 Example 8. . . Proposition 8. vn ∈ E. vn ∈ E. . . 4 are generally valid and thus also in the context of vector spaces. . 0 such that o = λ1 v1 + λ2 v2 . ·) be a vector space. . λn λn λn Since one vector has been expressed as linear combination of the others we reached a contradiction. . If v1 = o. λn = 0. . . 1. . Definition 8. . 1) v4 = (2. ·) be a vector space. 1. vn are linearly dependent if and only if at least one of them can be expressed as linear combination of the others. v2 . Under this hypothesis. Proof See proof of Theorem 4. . 2) . v2 . 0. v2 . . . Let the vectors v1 . vn ∈ E. . v2 . . . These vectors are said to be linearly dependent if the null vector o can be expressed as linear combination by means of the scalars λ1 . i. . . we can guess that λn = 0 and write λ1 λ2 λn−1 vn = − v1 − v2 + . . . . 1. then the vectors v1 . .. Let the vectors v1 . 0. v4 ) . 0.4 Let (E. +. . . Proof Let us assume.8 Let (E. Let the vectors v1 . ·) be a vector space. These vectors span R3 and (L (v1 . . . vn ∈ E. . 0. ·) be a vector space. Definition 8. 1) v2 = (1. . . These vectors are said to be linearly independent if the null vector o can be expressed as linear combination only by means of the scalars 0. . if v3 is not a linear combination of v1 and v2 .7 Let (E. . v2 . . by contradiction that the vectors are linearly dependent. +. . . exists a n-tuple λ1 . + λn vn . Proposition 8. 1) v3 = (2.5 Let (E. λ2 . . . v2 . . 1) v3 = (0. λ2 . v2 . − vn−1 . v3 . 1) v2 = (1. . 0. Let us consider the following vectors ∈ R3 : v1 = (1. . vn−1 . . . . . . The vectors v1 . 0) . . . .if vn is not a linear combination of v1 . . .13 Let us consider the following vectors ∈ R3 : v1 = (1. Let the vectors v1 . v2 . +. . 0. 0. 0 .244 8 Vector Spaces Example 8. . . . +. .e.1. λn = 0. vn are linearly independent. ·) is a vector space. +.14 Linear dependence and independence properties seen for V3 in Chap. if v2 is not linear combination of v1 . 1.

. Let the vectors v1 . . . . . 1) v3 = (0. μk+1 . μ2 . . + μk−1 vk−1 + μk+1 vk+1 + . λ2 . . λ2 . . . . . . μk+1 . λ2 . . . . 0. λk−1 . . . there is a unique way to express one vector as linear combination of the others: ∀vk ∈ L. ·) be a vector space. . . −1. λ2 = 1. +. 0|vk = λ1 v1 +λ2 v2 . . 0|vk = λ1 v1 + λ2 v2 + . . vn ∈ L. v2 . . Theorem 8. If the n vectors are linearly dependent while n − 1 are linearly independent. λ2 . λn = 0. 0.8. . . λn = μ1 . . . . . . . . 1. . 1) v2 = (1. . + μn vn where λ1 . λk−1 . 1. λk−1 . .+λk−1 vk−1 + λk+1 vk+1 + . + λk−1 − μk−1 vk−1 + λk+1 − μk−1 vk+1 + + . . 2) . . . . . . 0. μk−1 . λn = 0. . . 0. 0|vk . . In this case we can write v3 = λ1 v1 + λ2 v2 with λ1 . There is no way to express any of these vectors as a linear combination of the other two. μn = 0. . . 0. 0|vk = μ1 v1 + μ2 v2 .5 Let (E. . . . + λk−1 vk−1 + λk+1 vk+1 + . 1. . . + λn vn Proof Let us assume by contradiction that the linear combination is not unique: • ∃λ1 . . . . . . we can write that     o = (λ1 − μ1 ) v1 + (λ2 − μ2 ) v2 . λk+1 . λ3 = 1. . μn = 0. . Let us consider now the following vectors v1 = (1.3 Linear Span 245 These vectors are linearly dependent since o = λ1 v1 + λ2 v2 + λ3 v3 with λ1 . . ∃!λ1 . . . 0. . μk−1 . . . μ2 . . . . . . . . λk+1 . Under this hypothesis. . . . The vectors are linearly independent. . . + (λn − μn ) vn Since the n − 1 vectors are linearly independent . + λn vn • ∃μ1 . . λk+1 .

Thus. λ2 = 1. . .  Example 8. Example 8.15 Let us consider again the following linearly dependent vectors ∈ R3 v1 = (1. 1) + λ2 (2. Let us express  as a linear combination of the other two vectors. ⎪ ⎪ ⎪ ⎪ ⎩ ⎩ λn − μn = 0 λn = μn . 2) . . 0. 0. 0. 1) v2 = (1. Since v3 = 2v1 . 0. 1. 1) v2 = (1. 2) . the vectors are linearly dependent. 2) = λ1 (1. ⎨. 1.246 8 Vector Spaces ⎧ ⎧ ⎪ ⎪ λ1 − μ1 = 0 ⎪ ⎪ λ1 = μ1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ λ2 − μ2 = 0 ⎪ ⎪ λ2 = μ2 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨. If we try to express v2 as a linear combination of v1 and v3 we obtain (1. 1) + λ2 (1. 1. 1) which results into the system ⎧ ⎪ ⎨λ1 + λ2 = 2 λ2 = 1 ⎪ ⎩ λ1 + λ2 = 2 which is determined and has only one solution. ⎪ ⎪. 2) . 0. Any pair of them is linearly independent. . 1) v3 = (2. . 1) v3 = (2. 1. We write (2. 1. 0. 1) = λ1 (1. the linear combination is unique. 1. .16 Let us consider the following vectors v1 = (1. 1. λk−1 − μk−1 = 0 ⇒ λk−1 = μk−1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ λk+1 − μk+1 = 0 ⎪ ⎪ λk+1 = μk+1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪. that is λ1 . . .

+ hs+2.2 v2 + +. . +. . .1 v1 + hs+1. . . v2 . . + λs vs + λs+1 vs+1 + . . . . For hypothesis. . i. . v2 . .s vs .1 v1 + hn. . Let s ∈ N with s < n. + ls vs = l1 v1 + l2 v2 + .6 ∀v ∈ L (v1 . v2 . . i. v2 . We have attempted to express a vector as the sum of two parallel vectors which is obviously impossible unless the three vectors are all parallel.   +λn hn. .1 + . v2 . then L (v1 . . vs ) (the two spans coincide). vn ) = L (v1 .s vs ⎪ ⎨v s+2 = hs+2. ·) be a vector subspace of (E. . . . . + λn hn..2 v2 + . + hn. + λs vs +   +λs+1 hs+1. vs ) : w = l1 v1 + l2 v2 + .2 v2 + .s vs + . . . . + hn. vn ) : v ∈ L (v1 . . . vs ).s vs ⎪. v2 . v2 . . We can interpret geometrically this fact by considering v1 and v3 as parallel vectors while v2 has a different direction. ∀w ∈ L (v1 . Proof Without a generality loss. + hs+1.1 v1 + hn. + hs+1. +. Theorem 8.s vs =   = λ1 + λs+1 hs+1. . . . . vn ) . . ·). . . . . . .e. . . +   + λs + λs+1 hs+1. . . . + λn hn. . If s vectors v1 . + λn hs+1.2 + . vn ) are linearly independent. .3 Linear Span 247 which results into the system ⎧ ⎪ ⎨λ1 + 2λ2 = 1 0=1 ⎪ ⎩ λ1 + 2λ2 = 1 which is obviously impossible. . . .6 again. v2 . .1 v1 +   + λ2 + λs+1 hs+1. This fact occurred since the remaining n − 1 vectors. v1 and v3 were not linearly independent. . .8.e. . . . vs are linearly independent while each of the remaining n − s vectors is linear combination of the linearly independent s vectors. . . This means that ∀v ∈ L (v1 . . + λn vn = = λ1 v1 + λ2 v2 + .s + . .s vs For the Definition 8. . let us assume that the first s vectors in the vector subspace L (v1 . . vs ). . . . + ls vs + 0vs+1 + 0vs+2 + . L(v1 .1 v1 + hs+1.1 v1 + hs+2. . v2 .2 v2 + . . vn ) ⊂ L (v1 . ⎧ ⎪ ⎪ vs+1 = hs+1. .2 v2 + . . we can express the remaining vectors as linear combinations of the first s vectors. . . Let us consider the Definition 8. vn ) : v = λ1 v1 + λ2 v2 + . . v2 . .2 v2 + . . . + 0vn . . v2 . ..6 Let (L (v1 . . . ⎪ ⎪ ⎩ vn = hn. .

. vσ (n) . v2 . . . vs ) ⊂ L(v1 . . . vs ) ⊂ L (v1 . λ3 ∈ R.17 The following vectors v1 = (0. λ4 ∈ R. Theorem 8. v2 . . This means that . v2 . . . . . . . we can generate any vector w ∈ R3 by linear combination w = λ1 v1 + λ2 v2 + λ3 v3 with λ1 . . 0) are linearly independent and span the entire set R3 . vn ). vn ). . v4 ) = L (v1 . v2 .248 8 Vector Spaces This means that ∀w ∈ L (v1 . . + μn vn . . . . . . . . it results that v4 is linear combination of v1 . . v2 . . i. i. vn ) : v ∈ L vσ (1) . vσ (n) . vs ) = L (v1 . . . . λ2 v2 . vn ) : v = λ1 v1 + λ2 v2 + . vn ) ⊂ L vσ (1) . . The sets generated by the linearly  independent vectors are equal: L (v1 . . . . L(v1 . .  From the Definition 8. . . vσ (2) . This fact can be written as L (v1 . . . . . . v2 . . 1. vs ) and L (v1 . L (v1 . . vσ (2) . v2 . vn ). . . . .   Hence. . .  Example 8. then L(v1 . It follows that any vector w ∈ R3 can be obtained by linear combination of w = λ1 v1 + λ2 v2 + λ3 v3 + λ4 v4 with λ1 . . λ2 . . λ3 . 0. . − σ (n) be another permutation of the same first n numbers ∈ N. . 1. . vn ). . we know that ∀v ∈ L (v1 .6. v3 . v2 . vσ (n) . Due to the commutativity we can rearrange the terms of the sum in a way such that v = λσ (1) vσ (1) + λσ (2) vσ (2) + λσ (n) vσ (n) . Proof From the Definition 8. . Since L (v1 . . v3 ) = R3 . λ2 . vσ (2) . v2 .6. . v2 . we know that ∀w ∈ L vσ (1) . This can be easily verified by posing λ4 = 0. . . . .7 Let 1 − 2 − . . vs ) : w ∈ L (v1 . . . v2 .e. . . 0. + λn vn . and v3 (it is their sum). This means that   ∀v ∈ L (v1 . v2 . Let σ (1) − σ (n) − . vn ) ⊂ L (v1 . . . v2 . . vn ) = L vσ (1) . . v2 . . − n be the fundamental permutation of the first n numbers ∈ N. If we add the vector v4 = (1. . . . vσ (2) . vσ (n) : w = μσ (1) vσ (1) + μσ (2) vσ (2) + μσ (n) vσ (n) . . 1) . λ2 v2 .e. . . v2 . 1) v2 = (0. 0) v3 = (1. . Due to the commutativity we can rearrange the terms of the sum in a way such that v = μ1 v1 + μ2 v2 + . . . .

. . . . . . ∀v ∈ L (v1 . 1) v2 = (0. . . . . v2 . . 0. . . . vn ): there exist scalars μ1 . w. . . v2 . .+μn vn . . w. . . . . . 1) span the same set. . . . kw = λ1i . . + λi vi + . . w. that is R3 .+μi vi +. . μn such that we can write v as v = μ1 v1 + μ2 v2 + . . . . . vn ) . − vn . . . vn ) be a span. . . μi+1 . 1. . . Proof Since w = λ1 v1 + λ2 v2 + . . vn ) ⊂ L (v1 . + λn vn with λi = 0. . . vn ) and w is such that w = λ1 v1 + λ2 v2 + . . μ2 . .8. v2 . . . . Analogously. .3 Linear Span 249   ∀w ∈ L vσ (1) . vi . Hence. . + λn vn and λi = 0. v2 . . . . vn ) = L vσ (1) .  Example 8. L vσ (1) . − vi−1 − vi+1 . v ∈ L (v1 . . . . . . . . 0) and v1 = (1. vσ (2) . . . vn ) . vn : . 0) v3 = (0. vn ). w . . . 0. . . 2. μw . . . vn ): there exist scalars μ1 . . then L (v1 . vn ) = L (v1 . . If w ∈ L (v1 . . + ki−1 vi−1 + ki+1 vi+1 + · · · + kn vn + kw w λ with kj = μj − λji for j = 1. v2 . . v2 . . then 1 λ1 λ2 λi−1 λi+1 λn vi = w − v1 − v2 − . . . . L (v1 . . . .   Hence. vn : v = k1 v1 + k2 v2 + . . . v2 . . v2 . n and j = i. . . v2 . . . 0. . . . . . . μ2 . . . . . . 0) v3 = (1. vσ (n) . . . vσ (2) . . . . . . . .6 Let L (v1 . We can substitute vi with the expression above and thus express any vector v as linear combination of v1 . + λi vi + . .18 This theorem simply states that both set of vectors v1 = (0. Thus. . . vn ). . . v2 . . vσ (2) . + μn vn . . . vσ (n) : w ∈ L (v1 . . μn such that we can write v as v = μ1 v1 +μ2 v2 +. Proposition 8. . v2 . . + μw w + . . We can substitute w with the expression of w above and thus express any vector v as linear combination of v1 . v2 . λi λi λi λi λi λi ∀v ∈ L (v1 . . . vn ) and consequently L (v1 . w. . 0) v2 = (0. μi−1 . vi . . . . v2 . . . Moreover. 0. . . . vσ (n)  ⊂ L (v1 . v2 . . . 1. . . .

v2 . vi . . . v ∈ L(v1 . 4. . 3 Lemma 8. 0) + μ3 (1. 1. we have L (v1 . 0) v3 = (1. In order to achieve this aim let us consider a vector u = (3. . λ3 = 1. v2 . n} such that the two following statements are verified: . . 3. μ2 . . vn ) and consequently L (v1 . 1. .250 8 Vector Spaces v = k1 v1 + k2 v2 + · · · + ki vi + . . v2 . vn be n linearly depen- dent vectors and v1 = o. . . Hence. . . 4. We can check that L (v1 . v3 with λ1 . 1. . . . 1. ki = μw λi . . . 1) v2 = (0. 2. . Moreover. 1. L (v1 . . We can generate u by means of v1 . w) = R3 .1 (Linear Dependence Lemma) Let v1 . vn ) ⊂ L (v1 . . . . + kn vn with kj = μj + μw λj for j = 1.19 If we consider again v1 = (0. μ3 = 2. v2 . v2 . vn ) = L (v1 . 0. 1) . 1. . . . We can obtain u also by means of v1 . v3 ) = R3 and w = λ1 v1 + λ2 v2 + λ3 v3 with λ1 . . v2 . λ3 = 5. . . then ∃ index j ∈ {2. . . n and j = i. w. . . λ2 . 1) + μ2 (0. . . . v2 . w. . . vn ). 0. vn ).  Example 8. . 5). . v2 . 4. . 0) and w = (1. . . . λ2 . . 3. Thus. 0. . 5) = μ1 (0. vi . . It is enough to write (3. 1) which leads to the following system of linear equations ⎧ ⎪ ⎨μ3 = 3 μ2 + μ3 = 4 ⎪ ⎩ μ1 + μ3 = 5 whose solution is μ1 . v2 . 1. . w. . v2 . . .

vj−1 . . . .   Hence. . . . . . . . It follows that L(v1 . . . . vj . v2 . − vj−1 . . These vectors are linearly dependent. . . vn . . vj−1    L v1 . . . . 1. vn ) : there exist scalars μ1 . . we can write λ1 λ2 λj−1 vj = − v1 − v2 − . . vj−1 . . . λn = 0. . . + μj vj + . v2 . vj−1 . v2 . This means that  v ∈ L(v1 . .. . . + μn vn with ki = μi − μj λλji for i = 1. . v2 . . . . . . . . . . . μn such that v = μ1 v1 + μ2 v2 + . Let us consider the largest non-null scalar λj (all the scalars whose index is > j are null). vn ) ⊂ L v1 . 1. . . . . vj+1 . . . 0) v3 = (1. .e. . .. . . v2 . j − 1. + kj−1 vj−1 + μj+1 vj+1 + . . vn ) and thus L(v1 . . 0 such that o = λ1 v1 + λ2 v2 + . the two spans coincide: L v1 . . vj . . . . . v2 . . vj−1 . . .   Hence v ∈ L v1 . .8. v2 . . v2 . λ2 .  In order to prove the second statement let us consider a generic vector v. vn = L(v1 . ∃λ1 . . Hence. vn . the vectors v1 . . . . . + μn vn . . vj ∈ L v1 . . . v2 . vj−1 . . vn . . 0. 1) v2 = (0. vn ). . .  Example 8. . . . . .  Analogously. vj+1 . v2 . . . v2 . + μi−1 vi−1 + 0vj + μi+1 vi+1 . . . . vn = L v1 . 1) . + μn vn which can be re-written as v = μ1 v1 + μ2 v2 + . vn it follows that v = μ1 v1 + μ2 v2 + . v2 . . . . . . . after the removal of vj . . vn still span the vector space. . . . 0. + λn vn . . . If we substitute the expression of vj into the span. . . . . vn Proof Since v1 . . v2 . . . . . . . we obtain that. . vj . . vj . . . μ2 . 0) v4 = (1. .e. ∀v ∈ L v1 . . vj . . . . vj−1 . . vj+1 . vj+1 . + μi−1 vi−1 + μi+1 vi+1 . . . . . . . v2 . vn ) ⊂ L v1 . . + μn vn . . λj λj λj   i. v2 . ∀v ∈ L (v1 . . Since v1 = o. v2 . 0. vj−1 . vj+1 . . . vj−1 . . vn are n linearly dependent vectors. v = k1 v1 + k2 v2 + . . . . . i. λn is non-null. vj+1 . . . .3 Linear Span 251   vj∈ L v1 . .20 Let us consider again the vectors v1 = (0. . vj+1 . . . at least one scalar among λ2 . . . .
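The Linear Dependence Lemma suggests a simple computational procedure: scan the vectors in order and discard any vector that is a linear combination of the ones already kept; the span is unchanged. The sketch below is an illustrative Python (NumPy) addition with a small hypothetical set of test vectors (they are not the vectors of the examples above).

```python
import numpy as np

def remove_dependent(vectors, tol=1e-10):
    """Drop every vector that is a linear combination of its predecessors
    (a numerical illustration of the Linear Dependence Lemma)."""
    kept = []
    for v in vectors:
        M = np.array(kept + [v]).T
        if kept and np.linalg.matrix_rank(M, tol=tol) == len(kept):
            # v depends on the previously kept vectors: skip it
            continue
        kept.append(v)
    return kept

v1, v2, v3 = [0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]
v4 = [1.0, 2.0, 2.0]          # v4 = v1 + v2 + v3, hence linearly dependent
vs = [v1, v2, v3, v4]

reduced = remove_dependent(vs)
print(len(reduced))                                   # 3
print(np.linalg.matrix_rank(np.array(vs).T))          # 3
print(np.linalg.matrix_rank(np.array(reduced).T))     # 3: the span is unchanged
```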

A basis B = {v1 . vn } of (E. . 0) (1. . . Definition 8. 1)). This vector span can generate any vector ∈ R3 . . 0). as seen above L (v1 . . . • v1 . 1. v2 . 0). .252 8 Vector Spaces We know that v4 can be expressed as a linear combination of the other vectors. v3 ) . +. The vector space (E. Furthermore. vn are linearly independent • v1 . v2 . that is the linear independence lemma. 0. ·) is a set of vectors ∈ E that verify the following properties. +. v2 . v2 . . v2 . v2 . . vn ). v3 ) = L (v1 . A basis B of R3 (0. 0. . v2 . . . . . E = L (v1 . Example 8. vn . i. .10 Let (E. ·) = (E. (0. . +. Example 8. In this case we say that the vectors v1 . . +. v2 . it is important to consider that for example a set R∞ can be defined as the Cartesian product of R performed and infinite amount of times.21 A finite-dimensional vector space is (L((1. . .22 Let us consider the vector space R3 . This fact means that v4 ∈ L (v1 . +. . 0. 8. v2 . such that the vector space (L. . 1. infinite- dimensional sets and vector space do not fall within this scopes of this book. 0. vn span E. +. . .4 Basis and Dimension of a Vector Space Definition 8. . ·) be a finite-dimensional vector space. (0. v2 . ·) be a vector space.9 Let (E. +. 0) as they are linearly independent and all the numbers in R3 can be derived by their linear combination. 1) (0. v4 ) = R3 . . . vn ). ·) where the span L is L (v1 .e. ·) is said finite-dimensional if ∃ a finite number of vectors v1 . ·). Although. . v3 . . vn span the vector space. .
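Checking that a set of vectors is a basis of R3 amounts to checking that the matrix having those vectors as columns is non-singular; the coordinates of any vector in the basis are then obtained by solving a linear system. The following Python (NumPy) sketch is an illustrative addition (the test vector w is arbitrarily chosen by us) and verifies this for the basis of Example 8.22.

```python
import numpy as np

# Candidate basis of R^3 from Example 8.22, placed as columns of a matrix.
B = np.array([[0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0]]).T

print(np.linalg.matrix_rank(B))   # 3: the vectors are linearly independent
print(abs(np.linalg.det(B)) > 0)  # True: they also span R^3

# Coordinates of an arbitrary vector w in this basis: solve B @ lam = w.
w = np.array([4.0, -1.0, 7.0])
lam = np.linalg.solve(B, w)
print(np.allclose(B @ lam, w))    # True
```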

+ kn vn . . ·) be a vector space and v1 . there exists a tuple λ1 . − λn vn ) + λ2 v2 + . 0. i. . 1. w2 . λn = 0. . λ2 . Hence. . . is given by (0. . . We can now write 1 v1 = (w1 − λ2 v2 − . it follows that λ1 . . . v2 .3. . . any vector u ∈ E that would be represented as u = λ1 v1 + λ2 v2 + . vn ) = E. Since w1 . +. . w2 . . 0) (1. v2 . . a basis always spans a vector space while a set of vectors spanning a space is not necessarily a basis. 1) (0. . . v2 . . vn be its n vectors.3 (Steinitz’s Lemma) Let (E. . .2 Let (E. 3) as they still allow to generate all the numbers in R3 but are not linearly independent. ws be s linearly independent vectors ∈ E. . − λn vn ) . the span L.e. . Since w1 = o. . . vn . v2 . . . . v2 . . λ2 . i. . 0. If one of these vectors is equal to the null vector o. . 0. ws are linearly independent. It follows that s ≤ n. + λn vn = = (w1 − λ2 v2 − . . 0. . Proof Let us assume by contradiction that s > n. . we know that w1 = o. . . Lemma 8. . . Let w1 . . . . . . Hence. Lemma 8. . Since L (v1 . 2. i.4 Basis and Dimension of a Vector Space 253 A set of vectors spanning R3 . Without a loss of generality let us assume that λ1 = 0. + λn vn = w1 + k2 v2 + . . +. vn ) spans E and w1 ∈ E. vn ) = E its span. ·) be a finite-dimensional vector space and ∈ L (v1 . . these vectors are linearly dependent. the number of a set of linearly independent vectors cannot be higher than the number of vectors spanning the vector space. . .e. + λn vn . L (w1 . λn such that w1 = λ1 v1 + λ2 v2 + . . they are all different from the null vector o. 0) (1. .e. This means that any vector u ∈ E can be represented of linear combination of w1 .8. Proof See the proof of Proposition 4. . . λ1 Thus. . .

. we have that any vector u ∈ E can be expressed as u = h1 w1 + h2 w2 + h3 w3 . 4. in R3 at most three linearly independent vectors may exist. . . 1) v2 = (0. We can see that these linearly independent vectors are less than the vectors span- ning R3 . .e. If we take into consideration w1 and w3 would not be enough to generate any vector ∈ R3 . This is against the hypothesis and a contradiction has been reached. . 2. Since by contradiction s > n there are no more v vectors while still there are s − n w vectors. We can then assume that μ2 = 0 and state that any vector u ∈ E can be expressed as u = l1 w1 + l2 w2 + l3 v3 .254 8 Vector Spaces We can now express w2 ∈ E as w2 = μ1 w1 + μ2 v2 + . Since wn+1 has been expressed as linear combination of the others then for The- orem 4. 0. + μn vn where μ1 . . 0. 1. . we know from Theorem 4. i. 0. Reiterating until the nth step. + hn wn . .23 As shown above. 0) v4 = (1. + ln vn . . . at least three vectors are needed to span R3 . is given by v1 = (0. 0) w3 = (0. wn+1 ∈ E and hence can be written as wn+1 = h1 w1 + h2 w2 + h3 w3 . Thus. Taking R3 as an example. μn = 0. .  Example 8. Let us consider a set of linearly independent vectors ∈ R3 : w1 = (1. . the vector .6 that four vectors are always linearly dependent.1 the vectors are linearly dependent. . 3) . 5. μ2 . 2. 0 (it would happen that w2 = o and hence the vectors linearly dependent). In particular. . the span L. . Conversely. . This is essentially the sense of Steinitz’s lemma. + hn wn . 1) . 0) w2 = (0. a set of vectors spanning R3 . For example. 0) v3 = (1.

. w2 . +. . +. B = {w1 . . v2 . . It follows that s ≤ n. . ·) or simply with dim (E). t cannot be generated by w1 and w3 . . +. ·) be a finite-dimensional vector space and ∈ L(v1 .4 Basis and Dimension of a Vector Space 255 t = (50. . +. vn are linearly independent vectors ∈ E. w3 } is a basis since its vectors are linearly independent and span R3 . This system is impossible. +. . ws } be two arbitrary bases of (E. Proof The vectors composing the basis are linearly independent. . ws are linearly independent vectors ∈ E. ws } be its basis. ·) be a finite-dimensional vector space. . vn } and B2 = {w1 . v3 . . Theorem 8. let us write (50. Proof Let B1 = {v1 .3. For the Steinitz’s Lemma. which is less than the number of vectors v1 . 0. All the bases of a vector spaces have the same order. . v4 spanning R3 . .9 Let (E. . We have seen that at least three vectors are needed to span R3 . . v2 . vn ). . There is totally three vectors in this basis. . w2 . it follows immediately that s ≤ n. 20) could not be generated as a linear combination of w1 and w3 . ws ∈ L (v1 . thus a basis of R3 can be composed of at most three vectors. v2 . .8. .  Example 8. +. . +. ·) and is indicated with dim (E. vn ∈ L (w1 . . v2 . . w2 . Hence. w2 . . w2 . .12 Let (E. 1) which leads to ⎧ ⎪ ⎨λ1 = 50 4λ1 + 2λ3 = 0 ⎪ ⎩ λ3 = 20.11 The number of vectors composing a basis is said order of a basis. . If B1 is a basis of (E. then L (v1 . ·). . If w1 . . 2. . v2 . . w2 . For the Lemma 8. Definition 8. then v1 . . . . . ·). n ≤ s.3. vn ) = E. Definition 8. +. . s ≤ n. w2 . v2 . i. ·). If v1 . 0) + λ3 (0. v2 . . . For the Lemma 8. .e. they will have the same order. The order of a basis of (E. then L (w1 . the bases have the same order s = n. .24 From the previous example. Theorem 8. . . . 0.8 Let (E. . . . 20) = λ1 (1. +. ·) be a finite-dimensional vector space. . . ·) is said dimension of (E. then w1 . . vn ) = E its span and B = {w1 . In order to check this. . We can conclude that w1 and w3 do not span R3 . ws ) = E. If B2 is a basis of (E. We know that in R3 at most three vectors can be linearly independent. Thus each basis must have three vectors. ws ). 4.25 If we consider two bases of R3 . .  Example 8.
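The same verification can be carried out numerically by comparing ranks: t belongs to L (w1, w3) if and only if appending t as an extra column does not increase the rank. The following Python (NumPy) sketch is an illustrative addition using the vectors of the check above.

```python
import numpy as np

w1 = np.array([1.0, 4.0, 0.0])
w3 = np.array([0.0, 2.0, 1.0])
t  = np.array([50.0, 0.0, 20.0])

A = np.column_stack([w1, w3])
# t is in span(w1, w3) iff rank of A equals rank of the augmented matrix [A | t]
r_A  = np.linalg.matrix_rank(A)
r_At = np.linalg.matrix_rank(np.column_stack([A, t]))
print(r_A, r_At)        # 2 3 -> t cannot be generated by w1 and w3
print(r_A == r_At)      # False
```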

. i. . v2 . . . . . that n is not the maximum number of linearly independent vectors and that s linearly independent vectors exist. . v2 . v2 . ·) = n of a vector space (or simply dim (E)) is the maximum number of linearly independent vectors and the minimum number of vectors spanning a space. . v2 . . Since the s linearly independent vectors belong to L (v1 . w2 . . +. r < n of them must be linearly independent vectors. vn ) = E. +. by contradiction. with n < s. v2 . Theorem 8. Thus. . . . . . .e. ·) where L (v1 .3. . vn ) and still have an equal linear span. . . . . v2 . . vn span a vector space. . We already know that in R3 at most three linearly independent vectors there exist and at least three vectors are needed to span the vector space. The vectors v1 . its elements span a vector subspace (L (v1 . . that ∃r < n vectors that span the vector space w1 . . wr ) = E. v2 . then ∃ a basis B = {v1 . vn ) = E. vσ (2) . vn are linearly independent. v2 . . . v2 . for the hypotheses dim (E. . w2 . . dim (E. . the number of linearly independent vectors n cannot be higher than the number of vectors r that generate the vector subspace. For definition of dimension dim R3 = 3. Let v1 . This is impossible because r < n. . . . . .e. ·) = n. vn ) . v2 .e. L (v1 .11 Let (E. in the case of R3 . These vectors are indicated with vσ (1) . The dimension dim (E. vσ (r) . then L (v1 . . the number of lin- early independent vectors s cannot be higher than the number of vectors n that span the vector subspace. Since B is a basis.  In order to prove that n is also the minimum number of vectors spanning a space. .10 Let (E. +. . . +. If B is a basis of a vector space. We reached a contradiction. ·) = n). . ·) = n. For the Linear Dependence Lemma we can remove one vector from L (v1 . We reached a contradiction. Thus. . ∃ n linearly independent vectors. . . This means that the maximum number of linearly independent vectors is n. . vσ (2) . This means that n is the minimum number of vectors spanning a space.256 8 Vector Spaces Theorem 8. v2 .e. Since. let us assume. vn span the vector space (i. We can reiterate the reasoning until only linearly independent vectors compose the linear span:   L vσ (1) . . vn are linearly dependent. .26 Again. vn ) = E) if and only if v1 . v2 . . +. . . . . Proof If dim (E. the dimension of the vector space is the maximum number of linearly independent vectors and the minimum number of spanning vectors. i. . vn ). . . . The basis. . . . v2 . ·) be a finite-dimensional vector space and let n be its dimension (i. ·) be a finite-dimensional vector space. . vn be n vectors ∈ E. . . Let us assume. +.3. For the Lemma 8. by definition. +. vn }. . its elements are linearly independent. . Proof If v1 . L (w1 . Let us assume. s ≤ n. . the order of a basis B is n.  Example 8. by contradiction. we know that each basis is composed of three vectors. vn ) . . that v1 . by contradiction. Amongst these n linearly dependent vectors. wr . . they also belong to E. vσ (r) = L (v1 . For the Lemma 8. . contains n linearly independent vectors and its order is the number of its vectors n. . .

10. .4 Basis and Dimension of a Vector Space 257 This means that also that the r vectors vσ (1) . . Let us assume. v2 . . it is impossible that r < n.v2 and v3 are linearly independent. We can always represent a vector of R3 since v1 . vσ (2) . vn do not span E. v2 . u) = E. 2) . . . 1) + λ2 (0. the maximum number of linearly independent vectors in E is n. −19. . . λ3 = 40. . . . . .8. . .27 Let us consider the following linearly independent vectors ∈ R3 v1 = (1. . . . 0) + λ3 (1.10. Thus. the minimum number of vectors spanning the vector subspace is n. 8. vσ (r) span E:   L vσ (1) . vσ (r) = E. . +. 8. 8. Thus. . vn . vn linearly independent.e. 0. u) = L (v1 . . . . v2 . L(v1 . vσ (2) . We have then n+1 vectors spanning E. . 0. . . 0. . . This fact can be stated as the vectors v1 . 1) v2 = (0. . . Thus. v2 . ·) = n. λ2 . v2 . 2) can be expressed as (21. 0. . . vn are not enough to span E and more vectors are needed (at least one more vector is needed). Any vector ∈ R3 can be generated as linear combination of v1 . by contradiction. vn ) . i. v2 .v2 and v3 . v2 . 2. 2) which leads to the following system ⎧ ⎪ ⎨λ1 + λ3 = 21 λ2 = 8 ⎪ ⎩ λ1 + 2λ3 = 2 whose solution is λ1 . . 2. vn . . we add a vector u ∈ E to obtain span of the vector space: L (v1 . that the vectors v1 . . . . These vectors must be linearly independent because if u is a linear combination of the others for the Linear Dependence Lemma can be removed from the span: L (v1 . . 2) = λ1 (1.  By hypothesis. for Theorem 8. On the other hand.  Example 8. for Theorem 8. we now consider v1 . . since dim (E. vn ) = E. For example the vector t = (21. it is impossible that n + 1 linearly independent vectors exist. 0) v3 = (1. ·) = n. +. Since dim (E.

0) + λ3 (1. ·) where   x − 3y + 2z = 0 E = (x. In order to determine span and basis of this vector space. This expression can be written as (3α + 7β. Example 8. 8.5. 8. 2. 1) w2 = (0. 0. In addition. Thus. let us assume that y = α. 2. Thus. By applying the Proposition 8. E = L ((3. 0) = o and thus linearly independent. 1) . 0) . 0) . 0. (3. x+y−z =0 . 1) + λ2 (0. Hence. 1. α. 2) = λ1 (1. ·) where E = {(x. 2. 1. 1) and let us attempt to express t = (21. 1. the two vectors are linearly independent and compose a basis B = {(3. (7. and let us solve the equation with respect to x : x = 3α + 7β. +. 0. 1. 1. z) ∈ R | 3 . 1) which leads to the following system ⎧ ⎪ ⎨λ1 + λ3 = 21 λ2 + λ3 = 8 ⎪ ⎩ λ1 + λ3 = 2 which is impossible. +. β) = = α (3.29 Let us consider now the vector space (E. Linearly dependent vectors cannot span a vector space. β) = (3α. (7. y. y. 1)}. 2. 0. 0) + β (7. z = β. α. β). dim (E. 0. 0) w3 = (1. +. 2) (21. Example 8. 1) is not linear combination of (3.258 8 Vector Spaces Let us now consider the following linearly dependent vectors w1 = (1. 0.28 Let us consider the vector space (E. 0). there are ∞2 solutions of the kind (3α + 7β. 0. α. z) ∈ R3 |x − 3y − 7z = 0}. (7. 0) + (7β. 1)). Hence. ·) = 2.

4 Basis and Dimension of a Vector Space 259 Since the rank of the system is 2. 4) . Example 8. A solution is: −3 2 1 2 1 −3 det . +. +.8. 0). As previously seen in Proposition 8.30 Let us consider now the vector space (E. y. 3. det = (1. +. the vector space is composed only of the null vector. 1 −1 1 −1 1 1 Hence.31 Let us consider the vector space (E. ·) = 1. ⎪ ⎩ ⎪ ⎩ ⎪ ⎭ 3x + z = 0 Since the incomplete matrix is non-singular. ·) where ⎧ ⎧ ⎫ ⎪ ⎨ ⎨x + 2y + 3z = 0 ⎪ ⎪ ⎬ E = (x. it has ∞1 solutions. ·) where ⎧ ⎧ ⎫ ⎪ ⎨ ⎪ ⎨x − y = 0 ⎪ ⎬ E = (x. 4)}. • if the rank of the system is 1. y. ·) = 2. +. ·) = 1. Since there are no linearly independent vectors. ·) = 0. z) ∈ R3 | y + z = 0 . i. +. +. 0)) = E.2. 0. the vector space axioms would not be verified. 4)) = E and B = {(1. 3. The geometrical interpretation of this vector subspace is a plane passing through the origin. then dim (E. z) ∈ R3 | x + y + 3z = 0 . +. The geometrical interpre- tation of this vector subspace is the origin of a system of coordinates in the three dimensional space (only one point). was not included in the vector subspace. ·) = 0. the null vector. the only solution of the linear sys- tem is (0. the system has ∞2 solutions and dim (E. The geometrical interpretation of this vector subspace is a line passing through the origin. ⎪ ⎩ ⎪ ⎩ ⎪ ⎭ x−y=0 The matrix associated to this system of linear equations is ⎛ ⎞ 1 2 3 ⎝1 1 3⎠ 1 −1 0 . dim (E. 0. • if the rank of the system is 3. 3. • if the rank of the system is 2. Hence. if the origin. − det . More generally. It follows that dim (E. if a vector subspace of R3 is identified by 3 linear equations in 3 variables.e. In this special case. Example 8. L ((0. L ((1. the system has ∞1 solutions and dim (E.
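The dimension of a vector subspace of R3 defined by homogeneous equations can be computed numerically as the dimension of the null space of the coefficient matrix. The sketch below is an illustrative Python (NumPy) addition (the helper function and its name are ours); it uses the system of Example 8.29.

```python
import numpy as np

def nullspace_basis(A, tol=1e-12):
    """Orthonormal basis of the solution set of A x = 0, obtained via the SVD."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T          # columns span the null space

# E = {(x, y, z) : x - 3y + 2z = 0 and x + y - z = 0}, two planes through the origin
A = np.array([[1.0, -3.0,  2.0],
              [1.0,  1.0, -1.0]])
N = nullspace_basis(A)
print(N.shape[1])               # 1 = dim(E): rank 2 in R^3 leaves one free parameter
print(np.allclose(A @ N, 0))    # True: every basis column solves the system
```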

λ2 = 0. 1. β) which can written as (−2α − 3β.260 8 Vector Spaces whose determinant is null. +. 0. 1. 1. 0. β) = α (−2. this means that λ1 v1 + λ2 v2 = o only if λ1 . −1). This means that L ((1. The dimension of this vector space is dim (E. Hence. ⎪ ⎩ ⎪ ⎩ ⎪ ⎭ 3x + 6y + 9z = 0 The matrix associated to this system of linear equations ⎛ ⎞ 123 ⎝2 4 6⎠ 369 is singular as well as all its order 2 submatrices. −α) = α (1. 1) . 1. ·) where ⎧ ⎧ ⎫ ⎪ ⎨ ⎨x + 2y + 3z = 0 ⎪ ⎪ ⎬ E = (x. 0) + λ2 (−3. −1)) = E and a basis of this vector space is B = {(1. In our case this means that λ1 (−2. from the last equation we can write x = y = α. α. If we pose y = α and z = β we have x = −2α − 3β. y. The latter statement is true because this vector is not the null vector (and hence linearly independent). The rank of this matrix 2. 0) + (−3β. By substituting in the first equation we have α + 2α + 3z = 0 ⇒ z = −α. 1) = o . z) ∈ R3 | 2x + 4y + 6z = 0 . These couples of vectors solve the system of linear equations. Example 8. α. ·) = 1. 1. In order to find a general solution. In order to show that these two vectors compose a basis we need to verify that they are linearly independent.32 Let us consider the vector space (E. By definition. Hence the rank of the matrix is 1 and the system has ∞2 solutions. 0. the solutions are proportional to (−2α − 3β. Hence the system has ∞1 solutions. 0) + β (−3. −1)}. α. 0. α. β) = (−2α. +. The infinite solutions of the system are proportional to (α.

1)}. Let us now consider the vector t = (0. we can write the following system of linear equations ⎧ ⎪ ⎨2λ + μ + 2ν = 0 −λ + ν = 2 ⎪ ⎩ 3λ − μ − 2ν = 0. 0. 1) v = (3. 1. μ. 1. This means that we need to find the coefficients λ. 2. Let us express t in the basis B. 0) . v} and thus could span a vector space having dimension 2. 3) + μ (1. 1. −1.34 Let us consider the following vectors ∈ R3 : u = (2. 0. −1. Example 8. λ2 = 0. ν such that t = λu + μv + νw: t = λu + μv + νw = = λ (2.33 Let us consider the following vectors ∈ R3 : u = (2. This means that the vectors are linearly independent and hence compose a basis B = {(−2. −λ + ν. Example 8. These two vectors could compose a basis B = {u. these vectors are linearly independent and compose a basis B. 2) The associated matrix A has rank 2. the vectors are linearly independent. (−3. The only values that satisfy the equations are λ1 . 1. This is equivalent to say that the system of linear equations ⎧ ⎪ ⎨−2λ1 − 3λ2 = 0 λ1 = 0 ⎪ ⎩ λ2 = 0 is determined. −2) = = (2λ + μ + 2ν. −1) + ν (2. −1. −2) The associated matrix is ⎛ ⎞ 2 −1 3 A = ⎝ 1 0 −1 ⎠ 2 1 −2 whose rank is 3. −1) w = (2. Thus. 3) v = (1. 0). 3λ − μ − 2ν) . 0. λ2 = 0. These vectors cannot span R3 .8.4 Basis and Dimension of a Vector Space 261 only if λ1 . +. ·) = 2. 0. The dimension is then dim (E. 0. Hence. . Hence.
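As a numerical counterpart of the last two examples, the rank of the matrix whose columns are u, v and w confirms their linear independence, and the coefficients expressing t in the basis B are obtained by solving the associated linear system. The following Python (NumPy) sketch is an illustrative addition.

```python
import numpy as np

u = np.array([2.0, -1.0,  3.0])
v = np.array([1.0,  0.0, -1.0])
w = np.array([2.0,  1.0, -2.0])

M = np.column_stack([u, v, w])
print(np.linalg.matrix_rank(M))      # 3: u, v, w are linearly independent, hence a basis

# coefficients lambda, mu, nu such that t = lambda*u + mu*v + nu*w
t = np.array([0.0, 2.0, 0.0])
coeffs = np.linalg.solve(M, t)
print(np.allclose(M @ coeffs, t))    # True: t is expressed in the basis {u, v, w}
```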

• step 1: If v1 = o. . we can observe that b is a linear combination of the other four vectors: b = λ 1 u + λ2 v + λ3 w + λ4 a where λ1 = λ2 = λ4 = 0 and λ3 = 2. a) = E. . +. . a is not a linear combination of u. v. . the system has only one solution λ. Hence. +. . . w. v is not a linear combination of u or. 1) a = (3. v. b). in other words v ∈ / L (u). vk−1 ). . . it is removed. ·) be a finite-dimensional vec- tor space and w1 . w. The first vector. . If some vectors are removed a basis of (E. ·) be a finite-dimensional vector space and L (v1 . ws be s linearly independent vectors of the vector space. v2 . For the Linear Depen- dence Lemma the n remaining vectors span the vector space. The second vector. . they can be extended to a basis (by adding other linearly independent vectors). . v. 2) Let us consider the span L (u. Hence v is left in the span.12 (Basis Reduction Theorem) Let (E. . For the Proposition 8. u = o and therefore it is left in the span. and w. Proof Let us consider the span L (v1 . Theorem 8. . . . . . 2. 2. 2. If w1 . .13 (Basis Extension Theorem) Let (E.262 8 Vector Spaces The matrix associated to this system is non-singular. the vectors are linearly independent. w2 . Also. −1) w = (1. v2 . ν that allows to express t in the basis B. . 4. they compose a basis. . v2 . We can verify that w is not a linear combination of u and v.35 Let us consider the following vectors ∈ R4 : u = (2. 4. v2 .e. it is removed otherwise left in the span The procedure is continued until there are vectors available. v.  Example 8. . vm ) and apply the following iterative procedure. 0. . 3) v = (0. μ. L (v1 . On the contrary. otherwise left in the span • step k: if vk ∈ L (v1 . Theorem 8. vn ) = E. 1. 0. vm ) = E one of its spans. 3) b = (2. ·) is obtained.5. . Hence. a. The updated span is L (u. The vectors are linearly independent and hence compose a basis B = {u. i. b is removed from the span. . Hence. −1. w2 . a} of R4 . w. ws are not already a basis. . +. 1.

ws . v1 . vk ) = E (the indices have properly arranged). ∃v1 . . ws . v2 .8. . 0) which leads to the following system of linear equations ⎧ ⎪ ⎪ 5λ1 + 0λ2 = 1 ⎪ ⎨2λ + 6λ = 0 1 2 ⎪ ⎪ 0λ + 0λ 2 =0 ⎪ ⎩ 1 0λ1 + 0λ2 = 0. v2 . . ws . . . . 1) .36 Let us consider the following vectors belonging to R4 : v1 = (1. i. . ws ) then the span is left unchanged otherwise the span is updated to L (w1 . Let us check whether or not v1 ∈ L (w1 . 0) = λ1 (5. . v2 . that is λ1 . λ2 = 15 . . . It can be easily shown that these vectors are linearly independent and compose a basis of R4 . . . . 2. . Let us now apply the Basis Extension Theorem to find another basis. w2 . ·) is finite-dimensional ∃ a list of vectors v1 . v2 . vn ) = E. w2 ) since ∃λ1 . . . 0) w2 = (0. . . w2 . 0. λ2 ∈ R such that v1 = λ1 w1 + λ2 w2 . . the new span is composed of linearly independent vectors. 0. • step 1: If v1 ∈ L (w1 . . . . . . vn ) = E (the vectors were already spanning E). − 15 1 . . w2 . . Since (E. . . vk ) Considering how the span has been constructed. .  Example 8. w2 . 0. . 0) v2 = (0. 1. from the construction procedure. 0. . 6. . ws . w2 ). vk−1 ) (after having properly renamed the indices) then the span is left unchanged otherwise the span is updated to L (w1 . 0) . 1. 0) v4 = (0. . L (w1 . . 0) + λ2 (0. +. .4 Basis and Dimension of a Vector Space 263 Proof Let us consider a list of linearly independent vectors w1 . The vector v1 ∈ L (w1 . 0. . 0. This result was found by simply imposing (1. . 6. Let us apply the following iterative procedure. 0) v3 = (0. 2. v1 ) • step k: If vk ∈ L (w1 . . 0. ws . 0. . w2 . v2 . the new list of vectors also spans E. v2 . we found a new basis. v2 . vn such that L (v1 . . . . . . 0. . v1 .e. . Let us now consider the following vectors belonging to R4 : w1 = (5. These two vectors are linearly independent. v1 . . w2 . . 0. 0. . Hence. 0. Since L (v1 . . vn that spans E.

u2 . +. . ·) be vector subspaces of (E. we do not add v1 to the span. v3 . ·). +. v4 }. Then. . . w2 . Since we added all the vec- tors of the span (unless already contained in it through linear combinations of the others). λ2 = 0. L (w1 . w2 . 0. +. . ·) and (V. 0) which leads to the following system of linear equations ⎧ ⎪ ⎪ 5λ1 + 0λ2 = 0 ⎪ ⎨ 2λ1 + 6λ2 = 0 ⎪0λ1 + 0λ2 = 1 ⎪ ⎪ ⎩ 0λ1 + 0λ2 = 0. the system is impossible. Hence. 0. . The vector belongs to the span since v2 = λ1 w1 + λ2 w2 with λ1 . w2 .264 8 Vector Spaces The last two equations are always verified. 0. ·). ∃ two bases BU = {u1 . The last equation is always verified. 2. Thus we do not add it to the span. We check now whether or not v3 ∈ L (w1 . +. v3 are linearly independent. . By applying the same reasoning we can easily find that also w1 . For the Theorem 8. . v4 are linearly independent and can be added to the span. we have found a new basis. The third equation is never verified. 1. that is B = {w1 .1. v3 . Proof Let us suppose that dim (U) = r and dim (V ) = s . In order to achieve this aim we need to check whether or not ∃λ1 . The vector v3 can be added to the span which becomes w1 = (5. dim (U + V ) + dim (U ∩ V ) = dim (U) + dim (V ) . This means that w1 . ·). +. λ2 = 15 . ·) be a finite-dimensional vector space.e. respectively. . ur } BV = {v1 . 6. 0) = λ1 (5. 2. 0) + λ2 (0. Let (U. − 151 . Thus. +. w2 ). 1. Hence. λ2 ∈ R such that v3 = λ1 w1 + λ2 w2 : (0. Hence. λ2 values do not exist. v2 . . Theorem 8. . ·) is a vector subspace of (E. 0. (U ∩ V. Let us suppose that one of its bases is BU∩V = {t1 . i. this is a system of two equations in two variables which is determined and its solutions are λ1 . . tl }. +.14 (Grassmann’s Formula) Let (E. v3 . 0) v3 = (0. 6. . those λ1 . . t2 . 0. ·) and (V. vs } of (U. 0) w2 = (0. w2 . 0. w2 ). +. 0) . We check now whether or not v2 ∈ L (w1 . v4 ) = E. 16 .

. tl . . vs } . . ur } . . . . . . . vs ) = U + V. + μl tl + bl+1 vl+1 + bl+2 vl+2 + . ul+2 . the r+s−l vectors t1 . . . where ul+1 . ul+2 . . . . we can obtain BV from BU∩V . Let us impose that α1 t1 + α2 t2 + . . + βr ur + +γl+1 vl+1 + γl+2 vl+2 + . Since all the vectors contained in BU∩V are also vectors in V . . vl+1 . + λl tl + al+1 ul+1 + al+2 ul+2 + . +. For the Definition 8. by means of a linear combination we can represent all the vectors w ∈ U + V . . . . ul+1 . vr are some vectors from BV after the indices have been rearranged. . . t2 . (λl + μl ) tl + +al+1 ul+1 + al+2 ul+2 + . . + ar ur + bl+1 vl+1 + bl+2 vl+2 + . Let us check now the linear independence of these r + s − l vectors. ur . . ul+2 . . . . S = U + V = {w ∈ E|∃u ∈ U. . t2 . . . . v ∈ V |w = u + v}. t2 . vs span (U + V. . In other words. . ·): L (t1 . vl+2 . ul+1 . . . ul+1 . . we can obtain BU from BU∩V . . tl . vl+1 . ur are some vectors from BU after the indices have been rearranged. . . . tl . . . vl+2 . . t2 . + bs vs Hence. . . + ar ur +μ1 t1 + μ2 t2 + .8. by adding one by one the vectors from BU : BU = {t1 . . . + bs vs = = (λ1 + μ1 ) t1 + (λ2 + μ2 ) t2 + . . . vl+1 . . where vl+1 . ur . . .4 Basis and Dimension of a Vector Space 265 Since all the vectors contained in BU∩V are also vectors in U. . + γs vs = o. . ul+2 . tl . by adding one by one the vectors from BV : BV = {t1 . . . for the Theorem of Basis extension. vl+2 . . . + αl tl + +βl+1 ul+1 + βl+2 ul+2 + . . . for the Theorem of Basis extension. vl+2 . . w = u + v = λ1 t1 + λ2 t2 + . . Thus. . we can write the generic w = u + v by means of infinite scalars. . . . . . . .3.

This means that d can be expressed as linear combination of t1 . It follows that dim (U + V )+dim (U ∩ V ) = 1+1 = dim (U)+dim (V ) = 1+1 .e. . ul+2 . = βr = 0 and α1 = α1 . The intersection U ∩ V = (0. ul+1 . d ∈ V . the above r + s − l vectors are linearly independent. . . vl+2 . . 0. d ∈ U. + γs vs = o. . . . It follows that dim (U + V ) = r + s − l where r = dim (U). αl = αl . . . only one linearly independent vector and thus one line. + αl tl + +γl+1 vl+1 + γl+2 vl+2 + . α2 = α2 . 0)} and V = R3 then dim (U) = 0 dim (V ) = 3. . .e. . vl+1 . respectively. . 0) is the origin. + αl tl + +βl+1 ul+1 + βl+2 ul+2 + . ·). . and l = dim (U ∩ V ). vl+2 . It follows that dim (U + V ) + dim (U ∩ V ) = 2 + 0 = dim (U) + dim (V ) = 1 + 1 – the two vectors in U and V .37 Let us consider the vector space R . . . . . Let us verify/interpret the Grassmann’s formula in the following cases: • if U = {(0. . . represent two coinciding lines. . vl+2 . . . we can write the expression above as α1 t1 + α2 t2 + . d ∈ U ∩ V . + γs vs ) Since d can be expressed as linear combination of vl+1 . = αl = γl+1 = γl+2 = . vs only by means of null coef- ficients. . Hence. . . It follows that dim (U + V ) + dim (U ∩ V ) = 3 + 0 = dim (U) + dim (V ) = 0 + 3. 0) = U and U + V = R3 = V . = γs = 0. respectively. 0. . i. tl . . to represent d as linear combination of them. represent two lines passing through the origin. vs compose a basis they are linearly independent. tl . . tl are linearly independent. Since t1 . . + αl tl Since t1 . . while the sum U + V is the plane that contains the two vectors. . t2 . . . α1 = α2 = . . βl+1 = βl+2 = . . . Since d can be expressed as linear combination of the elements of a basis of U. . + βr ur = d = = − (γl+1 vl+1 + γl+2 vl+2 + . Thus. . .266 8 Vector Spaces Hence. . respectively. . . In this case U ∩ V = (0. Thus these vectors compose a basis BU+V . . Thus. there is only one way. Since the null vector o can be expressed as linear combination of the vec- tors t1 . . +. U ∩ V = U + V = U = V . for the Theorem 8. Both intersection and sum coincide with the vector. . s = dim (V ). . Hence.5. . we could write where α1 t1 + α2 t2 + . t l : d = α1 t1 + α2 t2 + . ur . . . .   3  Example 8. vs . . we can distinguish two subcases – the two vectors in U and V . ·) and (V. vl+1 . t 2 . +. t2 . . +. • If the dimension of both U and V is 1. i. · and two vector subspaces: (U. t2 . 0.
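Grassmann's formula can be checked numerically on concrete subspaces given by spanning vectors: dim (U + V) is the rank of the juxtaposed spanning matrices, while dim (U ∩ V) can be read off the null space of [BU | −BV]. The sketch below is an illustrative Python (NumPy) addition; the two planes U and V are our own choice, and the intersection-dimension shortcut assumes that BU and BV have linearly independent columns.

```python
import numpy as np

def dim_span(*matrices):
    return np.linalg.matrix_rank(np.column_stack(matrices))

def dim_intersection(BU, BV, tol=1e-12):
    """dim(U ∩ V): solutions of BU a = BV b, assuming full column rank of BU, BV."""
    M = np.hstack([BU, -BV])
    _, s, _ = np.linalg.svd(M)
    return M.shape[1] - int(np.sum(s > tol))   # dimension of the null space of M

# U and V: two distinct planes through the origin in R^3.
BU = np.column_stack([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # U = xy-plane
BV = np.column_stack([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])   # another plane

dim_U, dim_V = 2, 2
dim_sum = dim_span(BU, BV)            # dim(U + V)
dim_int = dim_intersection(BU, BV)    # dim(U ∩ V)
print(dim_sum, dim_int)               # 3 1
print(dim_sum + dim_int == dim_U + dim_V)   # True: Grassmann's formula
```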

. rm ∈ Kn   r1 = a1. 8. .n . . . Hence.2 ..n .1 a1. .2 . a2.. .1 . am. Hence.. Hence.   cn = a1.n : ⎛ ⎞ a1.2 . i. am. dim (U + V ) + dim (U ∩ V ) = 3 + 1 = dim (U) + dim (V ) = 2 + 2 – the planes coincide. am.n . . dim (U + V ) + dim (U ∩ V ) = 2 + 1 = dim (U) + dim (V ) = 1 + 2 • If the dimension of both U and V is 2..e. .2 .n . a1.. a2. It follows that U ∩ V = (0. a2.2 . This matrix contains m row vectors r1 .1 am. am.2 . a2. .. a1. . c2 . . . a1. .5 Row and Column Spaces Let us consider a matrix A ∈ Km. 0) and U + V = R3 . two linearly independent vectors and thus two planes passing through the origin.. . dim (U + V ) + dim (U ∩ V ) = 2 + 2 = dim (U) + dim (V ) = 2 + 2.. am. . .1 . .2 .1 .e. am.4 Basis and Dimension of a Vector Space 267 • if the dimension of U is 1 while that of V is 2. i. 0. ..1 .2 . .. It follows that U ∩ V is a line while U + V = R3 ..2 . . . . . ⎠ am. Hence. r2 .n .n   r2 = a2.. .8.. . . . dim (U + V )+dim (U ∩ V ) = 3+0 = dim (U)+dim (V ) = 1+2 – the line lays in the plane.1   c2 = a1. ⎟ . one line passing through the origin and one plane passing through the origin. . intersection and sum are the same coinciding plane. a2.n ⎟ A=⎜ ⎝ ..n and n column vectors c1 .1 . cn ∈ Km   c1 = a1. a2....n ⎜ a2. .1 a2.e. .. . we can distinguish two subcases – the planes do not coincide. i. we can distinguish two subcases: – the line does not lay in the plane.. It follows that U ∩ V = U + V and U + V = U = V .. It follows that U ∩ V = U and U + V = V .   rn = am. .

·) generated by row vectors and column vectors.3 and let us suppose that the maximum number of linearly independent column vectors is q = 2 and the third is linear combination of the other two.1 . 0.13 The row space and column space of a matrix A ∈ Km. of the matrix.2 a3. ⎝ a3.1 a2.1 + 0.2       r2 = a2. These row vectors can be written as .1 a4.2       r3 = a3.1 + μa1.3 ⎟ ⎜ A=⎝ ⎟ a3.1 a1. λa2.n .2 a1.15 Let a matrix A ∈ Km. a4.1 a3.1 .3 a1.3 = λa2.2       r4 = a4.n are the vector spaces (Kn .2 .1 + μa2.3 = λa4.2 = a2. a2. a1. a1.2 .2 This equation can be written as ⎧ ⎪ ⎪ a1. Theorem 8.3 a4.1 .1 . ·) and (Km . a3.2 ⎠ a4. +.2 . 0.1 + 0.1 .2 ⎪ ⎨a 2. 0.1 .3 ⎛ ⎞ a1.2 . λa2. μa1.1 .2 . λa4.1 .2 . +. a2.1 + μa3. λa1. The maximum number p of linearly inde- pendent row vectors is at most equal to the maximum number q of linearly column vectors.2 = a1.2 ⎜ a2.3 ⎠ a4. Let us suppose that c1 and c2 are linearly independent while c3 is linear combination of them: c3 = λc1 + μc2 ⇒ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ a1.1 + μa1. We can write the row vectors as       r1 = a1.1 ⎠ ⎝ a3. Proof Without a loss of generality let us consider a matrix A ∈ K4. λa3.1 + 0.1 a4.268 8 Vector Spaces Definition 8.3 = λa1.1 + μa2. λa3.2 a2.1 + μa4. a3.2 ⎟ . 0.3 ⎠ ⎝ a3. respectively. μa3.2 .3 a3.1 a1. λa4.2 a4.1 ⎟ + μ ⎜ a2.2 . λa1.1 + 0.2 ⎪ ⎪ a = λa3. μa2.1 + μa4. a4.1 + μa3.2 . μa4.3 ⎜ a2.3 ⎟ ⎜ ⎟ ⎜ ⎟ ⇒⎜ ⎟ = λ ⎜ a2.2 = a3.2 .2 = a4.2 ⎪ ⎩ 3.

1. λ) and (0. The maximum number q of linearly inde- pendent column vectors is at the most equal to the maximum number p of linearly independent row vectors.8. 1) ⎟ ⎜ 1 (0. λ) + a4. This is due to the fact that one column vector was supposed to be linear combination of the other two column vectors. Example 8.38 Let us consider the following matrix: ⎛ ⎞ 325 ⎜1 1 2⎟ A=⎜ ⎟ ⎝0 1 1⎠ 101 where the third column is the sum (and thus linear combination) of the first two columns. (0.1 (1. 0. 1) 0 (0. λ) + a3.2 . 1. 0.2 . λ) + a1.1 (1. 1. μ) r3 = a3. 1. 0 0⎠ ⎝0 1 0⎠ 1 0 1 0 0 0 This can written as ⎛ ⎞ ⎛ ⎞ 3 (1. 1) ⎠ + ⎝ 1 (0. 1. (0.n . 0. 1. 1. (0. The rows of the matrix can be written as ⎛ ⎞ 3 2 3+2 ⎜1 1 1 + 1⎟ A=⎜ ⎝0 1 0 + 1⎠ ⎟ 1 0 1+0 and then ⎛ ⎞ ⎛ ⎞ 3 0 3 0 2 2 ⎜1 0 1⎟ ⎜0 1 1⎟ A=⎜ ⎝0 ⎟+⎜ ⎟.16 Let a matrix A ∈ Km.2 . 1 (1.  Theorem 8. the row vectors are linear combination of (1.2 . μ) r4 = a4. (0.5 Row and Column Spaces 269 r1 = a1. 0. 1) ⎟ A=⎜ ⎟ ⎜ ⎟ ⎝ 0 (1. 1) 2 (0.1 (1. 1) ⎠ . 0. Corollary 8. Hence. 1) . λ) + a2. μ) r2 = a2. 0. 0. 1. We have p = q = 2.2 The dimension of a row (column) space is equal to the rank of the associated matrix. 1) ⎜ 1 (1. μ) .1 (1. 0. μ). 0. 1.
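The equality between the maximum number of linearly independent rows and columns can be observed numerically: the rank of the matrix of Example 8.38 equals the rank of its transpose. The short Python (NumPy) check below is an illustrative addition.

```python
import numpy as np

# Matrix of Example 8.38: the third column is the sum of the first two.
A = np.array([[3.0, 2.0, 5.0],
              [1.0, 1.0, 2.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])

row_rank = np.linalg.matrix_rank(A)     # rank of the 4x3 matrix
col_rank = np.linalg.matrix_rank(A.T)   # rank of its transpose
print(row_rank, col_rank)               # 2 2: row rank and column rank coincide
```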

. These two vectors are orthogonal if xy = 0. . z ∈ E : x (y + z) = xy + xz • homogeneity: ∀x. Definition 8. y ∈ E and ∀λ ∈ R (λx) y = x (λy) = λxy • identity property: ∀x ∈ E : xx ≥ 0 and xx = 0 if and only if x = 0 Definition 8.6 Euclidean Spaces Definition 8. . • are also Euclidean spaces. Analogously. +. is a Euclidean space. i. y. xn are linearly independent. . . i. Every vector belonging to a Euclidean space is orthogonal to the null vector: ∀x ∈ En : xo = 0. The following properties of the scalar product are valid: • commutativity: ∀x. Proposition 8. 0. It must be remarked that since the scalar product is not an external composition law. an element of the set E. 1). y) = xy ∈ R. the matrix has two linearly indepen- dent columns and two linearly independent rows.e. x2 . Proposition 8. i.e. 1.15 The triple (En . . the set of geometric vectors with the opera- tions of 1) sum between vectors. . +. R2 . y ∈ En where En is a Euclidean space. In the case of Euclidean spaces the result of a scalar product is a scalar and thus not a vector (not an element of E). 2) scalar product. +.16 Let x. •) be a Euclidean vector space and x1 . . .39 The triple (V3 . Proof Let us consider the linear combination of the vectors x1 . xn be n non-null vectors belonging to it. +. x2 .14 Let (E.270 8 Vector Spaces This means that all the rows can be expressed as linear combination of two vectors. y ∈ E : xy = yx • distributivity: ∀x. +. Example 8. . ∀ indexes i. ·) be a finite-dimensional vector space over the field R. • and R3 . 8. then the vectors x1 . y) ∈ E × E : φ (x. λ2 . . i. . •) be a Euclidean space. Hence. .e. The mapping φ : E × E → R is said scalar product (or inner product) when ∀ (x. +. . . 1) and (0. xn are all orthogonal to each other.7 Let (En . (1. . . +. . ). + λn xn = o.8 Let (En . . λn ∈ R and let us impose that λ1 x1 + λ2 x2 + . j : xi is orthogonal to xj . . x2 . In vector spaces the result of the external composition law is a vector. If the vectors x1 .e. It can be appreciated that 2 is also the rank of the matrix A. a Euclidean space is not a vector space. x2 . . respectively. xn by means of the scalars λ1 . .if in the context  of Euclidean   vectors we indicate with • the scalar product of algebraic vectors. . •) is said Euclidean space.

. 0. +.41 The following vectors of the Euclidean space R3 . 0) v3 = (0. we would have linearly dependent vectors and that not all the pairs would be orthogonal. .40 The following vectors of the Euclidean space R . +. 0) = λ1 (1. •) be a Euclidean space and U ⊂ En with U = ∅. If we impose that the linear combination is equal to the null vector. •). Due to the orthogonality. + λn xn ) = 0. Proposition 8. 1) which leads to  λ1 − 5λ2 = 0 5λ1 + λ2 = 0. by xn to find that the expression is equal to 0 only if λn = 0. the vectors are linearly independent. • v1 = (1. If (U. The linear combination can be multiplied by x2 to find that the expression is equal to 0 only if λ2 = 0. +. ∀u ∈ U} with sum and product of a scalar by a vector is a vector space (U o . . •) be a Euclidean space. The orthogonal set U o = {x ∈ En |xu = 0. 5) v2 = (−5. +. +. • are orthog- onal: v1 = (1. The system is determined and the vectors are linearly independent. If we added another arbitrary vector v4 . ·). this expression is equal to λ1 x1 x1 = 0. +.6 Euclidean Spaces 271 Let us now multiply (scalar product) this linear combination by x1 and we find that x1 (λ1 x1 + λ2 x2 + . 0) v2 = (0. 1) are obviously all orthogonal (each pair of vectors is orthogonal) and linearly inde- pendent. . +. For hypothesis x1 = 0. Hence. .17 Let (En . . 1. 0. then the expression can be equal to 0 only if λ1 = 0. Definition 8.. we find that all the scalars are null. 5) + λ2 (−5.   2  Example 8.   Example 8.9 Let (En .8. 1) Let us check the linear dependence: (0. •) is also a Euclidean space then it is said Euclidean subspace of (En .

Let us multiply the equation by e and obtain xe − αee = 0 ⇒ xe = α  e 2 ⇒ xe α= .272 8 Vector Spaces Proof In order to prove that (U o . e. This means that λu ∈ U o . β ∈ R such that x can be expressed as x = αe + βy. Hence. 2) is √ √ 1+4= 5. β such that x = αe + βy always exist. for hypothesis.  y 2 Since. y = o the scalars α.4 says that a vector can always be decomposed along two orthogonal directions. (U o . This means that (x1 + x2 ) ∈ U o . indicated with Definition 8.  e 2 Let us write again the equation x = αe + βy and write it as x − βy = αe. is equal to xx. Let us calculate λxu = 0 ∀u ∈ U. for every vector x ∈ En .  √ a vector x ∈ En . Let λ ∈ R and x ∈ U o . +. +. y = o. ·) is a vector space. Proof The equation x = αe + βy is equivalent to x − αe = βy. Lemma 8. • (1.42 The module of the following vector of R2 . The module of the vector x. Let us multiply the equation by y and obtain xy − βyy = 0 ⇒ xy = β  y 2 ⇒ xy β= . ·) is a vector space we have to prove that the set U o is closed with respect to sum and product of a scalar by a vector. x2 ∈ U o . .18 Let  x . Let us calculate (x1 + x2 ) u = 0 + 0 = 0 ∀u ∈ U. ∃ two scalars α.   Example 8.4 Given two orthogonal vectors e. +. Let us consider two vectors x1 .  In essence Lemma 8.

•) be a Euclidean vector space. 1). then the Cauchy-Schwarz inequality becomes 0 = 0. the following inequality holds:  xy ≤ x  y  . y 2 2 It follows that 7 3 (5. Proof Considering that x = αe + y and ey = 0. 1) we need to find xe 7 α= =  e 2 2 and xy 3 β= =− . The two modules are  2   √ 7 2 7 49 √ 52 + 22 = 29 ≥ + = = 24.43 Let us consider the vector of R2 . that is always true. 1) − (−1. we can write  x 2 = xx = (αe + βy) (αe + βy) = = αe 2 +  βy 2 ≥ αe 2 .19 For the equation x − αe + βy the vector αe is named orthogonal projection of the vector x on the direction of e. +. 1) 2 2 Definition 8. 2). +. • : x = (5. Proposition 8.17 (Cauchy-Schwarz Inequality) Let (En . For every x and y ∈ En .  Example 8. If we want to decompose this vector along the orthogonal directions of e = (1.10 The maximum length of the orthogonal projection of a vector x is its module  x . This inequality leads to  αe ≤ x  . 2) = (1. .6 Euclidean Spaces 273   Example 8. 2 2 2 Theorem 8. Proof If either x or y is equal to the null vector.8.5. 2) and its orthogonal pro- jection αe 27 (1. 1) and y = (−1.44 Let us consider again the vector x = (5.
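The decomposition of Example 8.43 along two orthogonal directions, as well as the Cauchy-Schwarz inequality, can be reproduced with a few lines of Python (NumPy). The code below is an illustrative addition and simply recomputes α and β.

```python
import numpy as np

# Example 8.43: decompose x along the orthogonal directions e and y.
x = np.array([5.0, 2.0])
e = np.array([1.0, 1.0])
y = np.array([-1.0, 1.0])

alpha = np.dot(x, e) / np.dot(e, e)     # 3.5  = 7/2
beta  = np.dot(x, y) / np.dot(y, y)     # -1.5 = -3/2
print(alpha, beta)
print(np.allclose(alpha * e + beta * y, x))     # True: x = alpha*e + beta*y

# Cauchy-Schwarz on the same pair: |xy| <= ||x|| ||y||
print(abs(np.dot(x, y)) <= np.linalg.norm(x) * np.linalg.norm(y))   # True
```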

The modules of these two vectors are  x = 14 and  y = 2. Let us now indicate with αy the orthogonal projection of the vector x onto the direction of y.29. 2. Hence.274 8 Vector Spaces In the general case. 0). let us now compute the module of the equation:  x + y 2 = x 2 +  y 2 +2  xy  For the Cauchy-Schwarz inequality 2  xy ≤ 2  x  y .  xy ≤ x  y  . Furthermore. On the basis of this statement we can write xy = (αy + z) y = αyy = α  y 2 = α  y  y  .  x + y 2 ≤ x 2 +  y 2 +2  x  y = ( x  +  y )2 ⇒ ⇒ x + y ≤ x  +  y  .  Example 8. respectively. +.10  αy ≤ x . since α is a scalar it follows that  α  y  y = αy  y  . For every x and y ∈ En . Theorem 8. let us suppose that neither x nor y are equal to the null vector.18 (Minkowski’s Inequality) Let (En . For the Lemma 8. •) be a Euclidean space. It follows that 5 ≤ √ The 2 14 ≈ 5.4 we can always express x = αy + z where yz = 0. Considering that  5 = 5.45 Let us consider the following vectors: √ x = (3. the following inequality holds:  x + y ≤ x  +  y  . For the Proposition 8. Hence. Proof By the definition of module  x + y 2 = (x + y) (x + y) = xx + xy + yx + yy =  x 2 +  y 2 +2xy Still considering that the module of a scalar is a scalar. 1.  . √ scalar product is xy = 5. By computing the module of this equation we obtain  xy = α  y  y  where obviously the module of a scalar is the scalar itself. √ 1) and y = (1.

. namely Gram matrix. For every x and y ∈ En . It can be easily verified that the scalar product xy = 0. . 1) which has square module  x + y = 11. an ). . an are linearly dependent. an ) is equal to 0. Example 8. 2.4 one vector can be expressed as linear combination of the others. . −1. i.19 (Pythagorean Formula) Let (En . . a2 . and x. ⎠. If an is multiplied by a generic ai for i = 1. . . n − 1 we obtain an ai = λ1 a1 ai + λ2 a2 ai + .. the modules of these two vectors are  x = 14 and  y = 2.  x 2 = 6 and  y 2 = 5.15. . . . λn−1 ∈ R. λ2 . . . . Theorem 8. .. Proof If the vectors are linearly dependent for the Proposition 8. . . . respectively. 3. 2. . .46 Let us consider again the vectors: x = (3. 1. .. . 0). Definition 8. the vectors are orthogonal. 1) and y = (1. ..e.. •) be a Euclidean space. .√As shown above.20 Let a1 . the following equality holds:  x + y 2 = x 2 +  y 2 Proof By the definition of module  x + y 2 = (x + y) (x + y) = xx + xy + yx + yy =  x 2 +  y 2 since for the perpendicularity xy = 0. Theorem 8. . . 2. + λn−1 an−1 where λ1 .47 Let us consider the vectors x = (2. . . a2 an ⎟ ⎜ ⎟ ⎝ . a2 . +. . a2 . √ √ √ The√sum x + y = (4. a1 an ⎜ a2 a1 a2 a2 .. If we calculate the modules of these vectors we obtain. then the Gram determinant G (a1 . It follows that 26 ≈ 5. . an an The Gram determinant is indicated with G (a1 .10 ≤ 2 + 14 ≈ 5.20 If the vectors a1 . which is obviously equal to 6 + 5. Thus. . . an a1 an a2 . a2 .6 Euclidean Spaces 275 Example 8. let us write an = λ1 a1 + λ2 a2 + .8. .. . respectively. ⎛ ⎞ a1 a1 a1 a2 . 0). . an be n vectors belonging to a Euclidean space. y orthogonal. The Gram determinant or Gramian is the determinant of the following matrix. . The sum is x + y = (3. . . 1. 1)√and y = (1. + λn−1 an−1 ai . .. 1) has module 26.

if we substitute the elements of the last line of the Gram matrix with the linear combination above. 0) . an are linearly independent the Gram deter- minant G (a1 .276 8 Vector Spaces Thus. 1) v2 = (0. 2. 1. Hence. 0) v3 = (0. The Gram matrix is ⎛ ⎞ ⎛ ⎞ v1 v1 v1 v2 v1 v3 100 ⎝ v2 v1 v2 v2 v2 v3 ⎠ = ⎝ 0 1 0 ⎠ . . 2. a2 . v3 v1 v3 v2 v3 v3 228 The Gramian determinant is G (v1 . . . v3 v1 v3 v2 v3 v3 001 whose determinant is 1 > 0. we have expressed the nth row as a linear combi- nation of all the other rows by means of the scalars λ1 . a2 . . 1. . . Thus. Example 8.  Example 8. λn−1 . Let us calculate the Gram matrix ⎛ ⎞ ⎛ ⎞ v1 v1 v1 v2 v1 v3 102 ⎝ v2 v1 v2 v2 v2 v3 ⎠ = ⎝ 0 1 2 ⎠ . λ2 = 2. the Gramian determinant associated to linearly dependent vectors is null. λ2 . 1) v2 = (0. It can be easily checked that these vectors are linearly dependent since v3 = λ1 v1 + λ2 v2 with λ1 . an ) > 0. . 0. Theorem 8. 0. the Gramian is 0. as stated in the theorem above. . .49 Let us consider the following linearly independent vectors: v1 = (0. . . . v2 . The proof of this theorem is given in Appendix B.21 If the vectors a1 . 2) . . v3 ) = 8 + 0 + 0 − 4 − 4 − 0 = 0. 0.48 Let us consider the following vectors: v1 = (0. 0) v3 = (1.
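The Gram determinant is straightforward to compute numerically as det (V V^T), where the rows of V are the given vectors. The following Python (NumPy) sketch is an illustrative addition; the linearly dependent triple is reconstructed from the Gram matrix shown in Example 8.48 and should be read as an assumption rather than a quotation of the text.

```python
import numpy as np

def gram_determinant(*vectors):
    V = np.array(vectors, dtype=float)
    return np.linalg.det(V @ V.T)       # Gram matrix of all pairwise scalar products

# Linearly dependent triple (consistent with the Gram matrix of Example 8.48): Gramian 0
print(np.isclose(gram_determinant([0.0, 0.0, 1.0],
                                  [0.0, 1.0, 0.0],
                                  [0.0, 2.0, 2.0]), 0.0))   # True

# Linearly independent triple of Example 8.49: Gramian > 0
print(gram_determinant([0.0, 0.0, 1.0],
                       [0.0, 1.0, 0.0],
                       [1.0, 0.0, 0.0]))                    # 1.0
```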

In addition, it must be observed that the latter two theorems give a generalization of the Cauchy-Schwarz inequality. It can be seen that the Gramian associated to two vectors x and y is

det [ xx  xy ; xy  yy ] = ||x||² ||y||² − (xy)² ≥ 0 ⇒ xy ≤ ||x|| ||y||.

Definition 8.21 Let a vector x ∈ En. The versor of the vector x is

x̂ = x / ||x||.

Definition 8.22 Let (En, +, •) be a Euclidean space. An orthonormal basis of a Euclidean space is a basis composed of versors where each arbitrary pair of vectors is orthogonal. Obviously the Gramian associated to an orthonormal basis is 1 since the Gram matrix would be the identity matrix.

Theorem 8.22 Every Euclidean space has an orthonormal basis.

In a Euclidean space having dimension equal to n, every basis B = {x1, x2, . . ., xn} can be transformed into an orthonormal basis Be = {e1, e2, . . ., en} by applying the so-called Gram-Schmidt method. This method consists of the following steps:

• The first versor can be obtained as

e1 = x1 / ||x1||.

• The second versor will have the direction of y2 = x2 + λ1 e1. We impose the orthogonality:

0 = e1 y2 = e1 x2 + λ1 e1 e1 = e1 x2 + λ1 ||e1||².

Considering that the module of a versor is equal to 1 (||e1|| = 1), it follows that λ1 = −e1 x2 and y2 = x2 − (e1 x2) e1. Hence,

e2 = y2 / ||y2||.

2). 1) + λ2 (2.  y3  • The nth versor has direction given by yn = xn + λ1 e1 + λ2 e2 + . + λn−1 en−1 − en−1 xn en−1 .1) It follows that λ2 = −e2 x3 and y3 = x3 + λ1 e1 − e2 x3 e2 . By imposing the orthogonality we obtain λn−1 = −en−1 xn .  yn  This method is a simple and yet powerful instrument since orthogonal bases are in general much more convenient to handle and make mathematical models simpler (all the angles among reference axes is the same and the scalar product of any pair is null) but not less accurate. yn en = .50 Let us consider the following two vectors of R2 : x1 = (3. Example 8. 2) = o . . . Hence. The latter can be shown by checking that λ1 (3. y3 e3 = . Hence. It can be easily shown that these two vectors span the entire R2 and that they are linearly independent. (8.278 8 Vector Spaces • The third versor will have the direction of y3 = x3 + λ1 e1 + λ2 e2 We impose the orthogonality: 0 = y3 e2 = x3 e2 + λ1 e1 e2 + λ2 e2 e2 = x3 e2 + λ2 . 1) and x2 = (2.

6 Euclidean Spaces 279 only if λ1 . 2) √ . +. 10 10 Let us find the orthogonal direction by imposing that y2 e1 = 0.2) . 0. +. we can calculate the versor y2 (−0.  y2  1. 3 1 y2 e1 = 0 = x2 e1 + λ1 e1 e1 = (2.√ .2) 0. +. e2 are an orthonormal basis of R2 . 1. 10 10 which leads to 6 2 8 λ1 = − √ − √ = − √ . 2) + − √ √ . 10 10 10 The vector y2 is 8 3 1 24 8 y2 = x2 + λ1 e1 = (2. we know that product of vectors).2 e2 = = = − .√ = (2. 1. e2 }.26 1.  x1  10 10 10 For the calculation of the second vector the direction must be detected at first.4. · . The direction is that given by 3 1 y2 = x2 + λ1 e1 = (2. This statement is equivalent to say that the homogeneous system of linear equations  3λ1 + 2λ2 = 0 λ1 + 2λ2 = 0 is determined. x2 } is a basis of the vector space R2 .4 1. B = {x1 . R . • is a Euclidean space.  2in R the scalar product can be defined (as a scalar 2 Furthermore. λ2 = 0. Let us now apply the Gram-Schmidt method to find an orthonormal basis BU = {e1 . Hence.   Thus.26   The vectors e1 . · . 10 10 10 10 10 Finally.√ . − = (−0. . 2) + λ1 √ . 2) + − .26 1.8. 1) 3 1 e1 = = √ = √ . √ + λ1 . .4. The first vector is x1 (3. Hence.
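The Gram-Schmidt method described above translates directly into a short procedure. The following Python (NumPy) sketch is an illustrative addition (the function name and structure are ours); applied to the vectors of Example 8.50 it returns the orthonormal basis computed above.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise a list of linearly independent vectors (Gram-Schmidt)."""
    basis = []
    for x in vectors:
        y = x - sum(np.dot(e, x) * e for e in basis)   # subtract projections on previous versors
        basis.append(y / np.linalg.norm(y))
    return basis

# Example 8.50: x1 = (3, 1), x2 = (2, 2).
e1, e2 = gram_schmidt([np.array([3.0, 1.0]), np.array([2.0, 2.0])])
print(e1)    # [0.9486833  0.31622777]  = (3, 1)/sqrt(10)
print(e2)    # [-0.31622777  0.9486833] = (-0.4, 1.2) normalised
print(np.isclose(np.dot(e1, e2), 0.0))                                   # True: orthogonal
print(np.isclose(np.linalg.norm(e1), 1.0), np.isclose(np.linalg.norm(e2), 1.0))  # True True
```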

Prove that if U ⊂ V or V ⊂ U. 8. y. and (U ⊕ V. 8. z) ∈ R3 | . ·) is a vector space where   U = (x.1 Let (U.9 Find a basis and determine the dimension of the vector space (E. ⎪ ⎩ ⎪ ⎩ ⎪ ⎭ 4x + 8y + 3z = 0 . ·).7 Determine whether or not (U ∩ V. ·). +. +. 9x − 2y − 8z = 0 8. ·). 8. z) ∈ R3 |5x + 5y − 5z = 0 .280 8 Vector Spaces 8. y. ·) where   E = (x. +. z) ∈ R3 |8x + 4y − 2z = 0 .6 Given the sets U and V of the two problems above determine whether or not (U ⊕ V. then (U ∪ V. +. y. +. z) ∈ R | 5y + 13z = 0 3 . +. y. z) ∈ R3 |6x + 8y − 4z = 0 and   V = (x. ·) where   −3x − 5y + 2z = 0 E = (x.5 Given the sets U and V of the two problems above determine whether or not (U + V. z) ∈ R3 |5x + 5y − 5z − 5 = 0 . ·) is a vector space where   V = (x. z) ∈ R3 |5x + 2y − 3z = 0 . ·) where ⎧ ⎧ ⎫ ⎪ ⎨ ⎪ ⎨6x − y + z = 0 ⎪ ⎬ E = (x.10 Find a basis and determine the dimension of the vector space (E. +. 8. +. ·) are vector spaces where   U = (x. 8. ·) and (V. +. +. +. 8. +. ·) is a vector space.3 Determine whether or not (V. +. +.4 Given the sets U and V of the two problems above determine whether or not (U ∩ V. y. ·) is a vector space. 8. y. ·) is a vector space.8 Find a basis and determine the dimension of the vector space (E. y. 8.7 Exercises 8. ·). +. +. (U + V. ·) is a vector subspace of (E.2 Determine whether or not (U. ·) be two vector subspaces of (E.

4. 1.7 Exercises 281 8. . 3) v = (0. verify their linear independence. 1) v = (1. −2) a = (3.11 Let us consider the following vectors ∈ R4 : u = (2.12 Given the following three vectors ∈ R3 : u = (0. starting from these three vectors. Find a basis by eliminating a linearly dependent vector. Then. 2. −1) w = (0. 2. 1) b = (0. 5. 0.8. 2) . 2. find an orthonormal basis by applying Gram-Schmidt method. 1. 8. 1. 0. 0. 2. 1. ) . 0) w = (1. 1.

Chapter 9
Linear Mappings

9.1 Introductory Concepts

Although the majority of the topics in this book (all the topics taken into account excluding only complex polynomials) are related to linear algebra, the subject “linear algebra” has never been introduced in the previous chapters. More specifically, while the origin of the term algebra has been mentioned in Chap. 1, the use of the adjective linear has never been discussed. As seen in Chap. 4, a vector is generated by a segment of a line. Hence, the subject linear algebra studies “portions” of lines and their interactions.

Before entering into the formal definitions of linearity, let us illustrate the subject at the intuitive level. Linear algebra can be seen as a subject that studies vectors. If we consider that vector spaces are still vectors endowed with composition laws, that systems of linear equations are vector equations, that matrices are collections of row (or column) vectors, and that a number can be interpreted as a single element vector, we see that the concept of vector is the elementary entity of linear algebra. Before entering into the details of this subject let us define again the concept of mapping shown in Chap. 1 by using the notions of vector and vector space.

Definition 9.1 Let (E, +, ·) and (F, +, ·) be two vector spaces defined over the scalar field K. Let f : E → F be a relation. Let the domain U ⊂ E be a set such that ∀u ∈ U : ∃!w ∈ F | f (u) = w and let us indicate it with dom( f ). If dom( f ) = E, the relation f is said mapping. A vector w such that w = f (u) is said to be the mapped (or transformed) of u through f.

Definition 9.2 Let f be a mapping E → F, where E and F are sets associated to the vector spaces (E, +, ·) and (F, +, ·). The image of f, indicated with Im ( f ), is a set defined as

Im ( f ) = {w ∈ F | ∃u ∈ E such that f (u) = w}.

is a set defined as f −1 (V ) = {v ∈ E| f (v) ∈ W } . indicated with f −1 (V ). Example 9.4 Let f be a mapping E → W ⊂ F. The reverse image of R through f is R2 . The reverse image of ]0. · an example of map- ping R2 → R3 is f (x. z) = (x + 2y + −z + 2.2 Let (R. +. i. The image of V through f .7 From the vector spaces R2 . its image is equal to the entire set of the vector space. v ∈ E such that f (v) = f v ⇒ v = v .5 From the vector space R2 . +. −4x + 6y + 8. y) = (x + 2y + 2. +. v ∈ E such that v = v : Definition 9. ·) be a vector space.3 Let f be a mapping V ⊂ E → F. Definition 9. The reverse image of W through f . +. Example 9. An example of mapping R → R is f (x) = e x . The reverse image of [1. The domain of the mapping dom( f ) = R while the image Im ( f ) = R. y) = x + 2y + 2. · and R3 .     Example 9. +.e. 6y − 4z + 2). The domain of the mapping dom( f ) = R while the image Im ( f ) = ]0. An alternative and equivalent definition  of injective mapping is: f is injective if ∀v.9 The mapping R → R. An example of mapping R → R is f (x) = x 2 + 2x + 2. +. indicated with f (V ).1 Let (R. +. ∞[. x − y). +. ·) be two vector spaces. +. The reverse image of R through f is R.8 The mapping R → R.      The mapping f is said injective if ∀v.6 f (v) = f v . ∞[. · and (R. ∞[ through f is R. Definition 9. ∞[. ∞[ through f is R. · an example of map- ping R3 → R2 is f (x. y. +. y) = (6x − 2y + 9. Example 9. its image is not equal to the entire set of the vector space. is a set defined as f (V ) = {w ∈ F|∃v ∈ V such that f (v) = w} .   Example 9.284 9 Linear Mappings Definition 9.6 From the vector spaces R2 .e. · an example of mapping R2 → R2 is f (x.   Example 9. The domain of the mapping dom( f ) = R2 while the image Im ( f ) = R. f (x) = e x is not surjective because Im ( f ) = ]0.3 Let (R. f (x) = 2x + 2 is surjective because Im ( f ) = R. 8y − 3). ·) be a vector space. The domain of the mapping dom( f ) = R while the image Im ( f ) = [1. Example 9. An example of mapping R → R is f (x) = 2x + 2. i. ·) be a vector space. Example 9.5 The mapping f is said surjective if the image of f coincides with F: Im ( f ) = F. . · and R3 . An example of map- ping R2 → R is f (x.4 Let R2 .     Example 9.

this mapping is bijective.11 The mapping R → R. Definition 9. ·).9 Let f be a mapping E → F. . ∀v. +.8 Let f be a mapping E → F.9. Definition 9. . x2 such that x1 = x2 . .14 The mapping R → R.10 The mapping R → R. + λn vn ) = = λ1 f (v1 ) + λ2 f (v2 ) + .1 Introductory Concepts 285 Example 9. Example 9. Hence. . x2 with x1 = x2 such that x12 = x22 . v ∈ E : f λv + λ v = λ f (v) + λ f v or extended in the following way ∀λ1 . +. .12 The mapping R → R. where E and F are sets associated to the vector spaces (E. this mapping is bijective. For example if x1 = 3 and x2 = −3. + λn f (vn ) . Example 9. thus x1 = x2 . ·) and (F. The mapping is also surjective as its image is R. where E and F are sets associated to the vector spaces (E. Hence. f (x) = x 2 is not injective because ∃x1 . f (x) = x 3 is injective because ∀x1 . it occurs that x12 = x22 = 9. λ2 . +.13 The mapping R → R. Example 9. The mapping f is said affine mapping if the mapping g (v) = f (v) − f (o) is linear. . Example 9. this mapping is not bijective. . . . . Example 9. the mapping R → R. x2 such that x1 = x2 . it follows that x13 = x23 .7 The mapping f is said bijective if f is injective and surjective. Hence. f (x) = e x is injective but not surjective. . vn f (λ1 v1 + λ2 v2 + . it follows that e x1 = e x2 . The mapping f is said linear mapping if the following properties are valid:     • additivity: ∀v. λn ∀v1 . f (x) = e x is injective because ∀x1 .15 As shown above. Definition 9. ·). +. f (x) = x 3 is injective. . v2 . λ ∈ K. ·) and (F. v ∈ E : f v + v = f (v) + f v • homogeneity: ∀λ ∈ K and ∀v ∈ E : f (λv) = λ f (v) The two properties of linearity can be combined and written in the following compact way:     ∀λ. f (x) = 2x + 2 is injective and surjective. .

Example 9.16 Let us consider again the following mapping f : R → R,

∀x : f (x) = e^x,

and let us check its linearity. We know from calculus that an exponential function is not linear. Let us consider two vectors (numbers in this case) x1 and x2. Let us calculate f (x1 + x2) = e^(x1 + x2). Obviously, from basic calculus we know that e^(x1 + x2) ≠ e^(x1) + e^(x2). The additivity is not verified. Hence, the mapping is not linear.

Example 9.17 Let us consider the following mapping f : R → R,

∀x : f (x) = 2x,

and let us check its linearity. Let us consider two vectors (numbers in this case) x1 and x2. We have that

f (x1 + x2) = 2 (x1 + x2)
f (x1) + f (x2) = 2x1 + 2x2.

It follows that f (x1 + x2) = f (x1) + f (x2). Hence, this mapping is additive. Let us check the homogeneity by considering a generic scalar λ. We have that

f (λx) = 2λx
λ f (x) = λ2x.

It follows that f (λx) = λ f (x). Hence, since also homogeneity is verified, this mapping is linear.

Example 9.18 Let us consider the following mapping f : R → R,

∀x : f (x) = x + 2,

and let us check its linearity. Let us consider two vectors (numbers in this case) x1 and x2. We have that

f (x1 + x2) = x1 + x2 + 2
f (x1) + f (x2) = x1 + 2 + x2 + 2 = x1 + x2 + 4.

If two vectors are v = (x. y. The mapping is linear. y. y. If it is linear then the two properties of additivity  and homogeneity are valid. 0) = = (23x − 51z. 5x + 5y + 5z) and let us check its linearity. This is against Proposition 9. 5x + 5y + 5z) is linear.1. 0. λz) = (λ2x − λ5z. y. 1. y. λy. 0. z  . . Hence the mapping is not linear. 3x + 4y − 5z. z) and v = x  . 32x + 5y − 6z + 1. 4y − 5z) = λ f (v) .1 Introductory Concepts 287 It follows that f (x1 + x2 ) = f (x1 ) + f (x2 ). 32x + 5y − 6z + 1. g (x. 0. This means that f (x) is an affine mapping. λ4y − λ5z) = = λ (2x − 5z. Example 9. 4y  − 5z  = f (v) + f v and f (λv) = f (λx. 4y − 5z) + 2x − 5z  . z : f (x. 3x + 4y − 5z. 0) = (0. If we calculate f (0. Still. 0) = oR4 . y  . Hence. z + z  = 2 x + x  − 5 z + z  . 3x + 4y − 5z.9. 1. Example 9. f (0) = 2 and g (x) = f (x) − f (0) = x which is a linear mapping. Nonetheless. y. z) = (2x − 5z.20 Let us consider now the following mapping f : R3 → R4 ∀x. Hence. z) = f (x. z : f (x. 0) = = (23x − 51z. 4 y + y  − 5 z + z  =       (2x − 5z. z) − f (0. 0. y + y  . then              f v + v = f x + x  . z) = (23x − 51z.19 Let us consider the following mapping f : R3 → R2 ∀x. 5x + 5y + 5z) − (0. the mapping is affine. y. this mapping is not linear. 32x + 5y − 6z. 4y − 5z) and let us check its linearity.
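The additivity and homogeneity checks carried out symbolically in Examples 9.16–9.20 can also be probed numerically. The following sketch is only an illustration and is not part of the original text; the function name check_linearity, the random test points and the tolerance are choices made here.

```python
import math
import random

def check_linearity(f, dim=1, trials=100, tol=1e-9):
    """Numerically test additivity and homogeneity of f : R^dim -> R^dim.
    A failed test proves non-linearity; passing all tests only suggests linearity."""
    for _ in range(trials):
        u = [random.uniform(-5, 5) for _ in range(dim)]
        v = [random.uniform(-5, 5) for _ in range(dim)]
        lam = random.uniform(-5, 5)
        # additivity: f(u + v) == f(u) + f(v)
        add_ok = all(abs(a - b) < tol
                     for a, b in zip(f([x + y for x, y in zip(u, v)]),
                                     [x + y for x, y in zip(f(u), f(v))]))
        # homogeneity: f(lam * u) == lam * f(u)
        hom_ok = all(abs(a - b) < tol
                     for a, b in zip(f([lam * x for x in u]),
                                     [lam * y for y in f(u)]))
        if not (add_ok and hom_ok):
            return False
    return True

print(check_linearity(lambda v: [2 * x for x in v]))        # True:  f(x) = 2x is linear
print(check_linearity(lambda v: [x + 2 for x in v]))        # False: f(x) = x + 2 is affine, not linear
print(check_linearity(lambda v: [math.exp(x) for x in v]))  # False: f(x) = e^x is not linear
```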

f (x. Considering that oE = (0. Furthermore.1 Let f be a linear mapping E → F.2 Let f be a linear mapping E → F. Proposition 9. Hence this mapping is linear. and v = (x. which in this case is a number we have f (−v) = f (−x) = (2) (−x) = −2x = − f (v) . f (x) = 2x. 0). +. +. Example 9. Proof We can write f (−v) = f (−1v) = −1 f (v) = − f (v) .22 Let us consider the following linear mapping f : R → R.  Example 9. considering the vector v = x. It follows that f (−v) = − f (v) . oF = 0.  Proposition 9. Let us indicate with oE and oF the null vectors of the vector spaces (E. ·). With reference to the notation of Proposition 9. let us check the two propositions: . we can easily see that oE = 0 and oF = 0. if we calculate f (oE ) = f (0) = (2) (0) = 0 = oF .1. ·) and (F.21 Let us consider the following mapping f : V3 → R v : f ( #» ∀ #» v ) = ( #» u #» v) Let us check the additivity of this mapping:     f #» v  = #» v + #» u #» v + #» v = = #» v + #» u #» v ) + f ( #» v  = f ( #» u #» v). It follows that f (oE ) = oF .288 9 Linear Mappings Example 9. respectively. y) = 2x + y. Then.23 Let us consider the following linear mapping f : R2 → R. Proof Simply let us write f (oE ) = f (0oE ) = 0 f (oE ) = oF . Let us check now the homogeneity v ) = #» f (λ #» v ) = λ ( #» u (λ #» v ) = λ f ( #» u #» v). y).

Example 9. Example 9. f (x. f (x) = 2x is an endomorphism since both the sets are R. y) = (x. Example 9. f (x. Example 9. 0) = (2) (0) + 0 = 0 = oF and f (−v) = f (−x. Definition 9. Obviously for a vector (x.e.27 The linear mapping f : R → R.26 The linear mapping f : R2 → R2 . This explains why identity mappings make sense only for endomorphisms. f (x. It can easily be proved that this mapping is linear and is an endomorphism. i. Example 9. y) = (0.28 The linear mapping f : R2 → R. y) = 2x − 4y is not an endomorphism since R2 = R.24 The linear mapping f : R → R.9.29 The linear mapping f : R2 → R2 . f (x. If E = F. It can easily be proved that this mapping is linear.32 If we consider a linear mapping f : R2 → R. . y) = 0 is a null mapping. 0) is a null map- ping.11 A null mapping O : E → F is a mapping defined in the following way: ∀v ∈ E : O (v) = oF . y) = (x.30 The linear mapping f : R → R. Example 9. y) is an identity mapping. Example 9.12 An identity mapping I : E → F is a mapping defined in the following way: ∀v ∈ E : I (v) = v. y) ∈ R2 we cannot have f (x. Example 9.25 The linear mapping f : R2 → R. y) = (2x + 3y. y) as it would not be defined within R. the linear mapping is said endomorphism.10 Let f be a linear mapping E → F. Example 9. −y) = (2) (−x) − y = −2x − y = − (2x + y) = − f (v) .1 Introductory Concepts 289 f (oE ) = f (0. f : E → E.31 The linear mapping f : R2 → R2 . we cannot define an identity mapping since. f (x) = 0 is a null mapping.2 Endomorphisms and Kernel Definition 9. f (x) = x is an identity mapping. Definition 9. f (x. 9. 9x − 2y) is an endomorphism.

e. λn = 0. . + λn f (vn ) = f (oE ) . Let us check the linear dependence of the mapped vectors: f (v1 ) = (1. vn ∈ E are linearly dependent then ∃ scalars λ1 . + λn f (vn ) with λ1 . . . . . . . λ2 . . . y) = (x + y. . For the linearity of the mapping we can write f (λ1 v1 + λ2 v2 + . + f (λn vn ) = = λ1 f (v1 ) + λ2 f (v2 ) + . + λn vn . . . f (v2 ) . . . . 1) v2 = (0. f (vn ) ∈ F are also linearly dependent. 2) f (v2 ) = (4. Since for Proposition 9. It must be observed that there is no dual proposition for linearly independent vectors. If v1 . . . v2 . . v2 . 0 such that oE = λ1 v1 + λ2 v2 + . . . f (vn ) are linearly depen- dent. . . 4) and the linear mapping f : R2 → R2 defined as f (x.290 9 Linear Mappings Proposition 9.e. f (vn ) ∈ F. . Let us apply the linear mapping to this equation: f (oE ) = f (λ1 v1 + λ2 v2 + . + λn vn ) = = f (λ1 v1 ) + f (λ2 v2 ) + . . . . . These vectors are also linearly dependent since f (v2 ) = 4 f (v1 ). . .33 To understand the meaning of the proposition above let us consider the following vectors of R2 : v1 = (0. . f (v2 ) . The two vectors are clearly linearly dependent as v2 = 4v1 . . + λn vn ) . i. .3 Let f : E → F be a linear mapping. . . . . . . λn = 0. . . 0. f (v2 ) . . Proof If v1 . f (v1 ) . . 0. . λ2 . . . . x + 2y) . 0. vn ∈ E are linearly independent then we cannot draw conclusions on the linear dependence of f (v1 ) .  Example 9. . if v1 . . . . vn ∈ E are linearly dependent then f (v1 ) . . . v2 .1 f (oE ) = oF we have oF = λ1 f (v1 ) + λ2 f (v2 ) + . . 8) . i.
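Proposition 9.3 can be observed numerically on Example 9.33: the mapping f (x, y) = (x + y, x + 2y) sends the linearly dependent pair v2 = 4 v1 into a pair of vectors that is again linearly dependent. A minimal sketch (illustrative only; the array names are chosen here):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0]])       # matrix of f(x, y) = (x + y, x + 2y)

v1 = np.array([0.0, 1.0])
v2 = 4 * v1                      # v2 is linearly dependent on v1

w1, w2 = A @ v1, A @ v2
print(w1, w2)                    # [1. 2.] [4. 8.]
print(np.allclose(w2, 4 * w1))   # True: the mapped vectors stay linearly dependent
```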

+. then λv ∈ U . ·) and the linear mapping f : R → R defined as f (x) = 5x. +. The theorem above states that ( f (R) . y) = (x + y. +. ·) is a vector subspace of (F. +. +. 1) v2 = (1. ·) is a vector space. ·). The set f (U ) is closed with respect to the external composition law. Hence. Hence. ·) is obviously a vector space. ·) is a vector space. ·) is a vector subspace of (F. +. The set f (U ) is closed with respect to the internal composition law. We have a case where the transformed of linearly independent vectors are linearly dependent. It follows that the triple ( f (U ) . ·) be a vector subspace of (E. 2) f (v2 ) = (1. Since for hypothesis (U. then v+v ∈ U . f v + v ∈ f (U ). ·) is a vector space.36 Let us consider the vector space R2 . In this case. +. +.1 Let f : E → F be a linear mapping and (U. Theorem 9. .34 Let us now consider the following linearly independent vectors of R2 : v1 = (0. +. 0) and the linear mapping: f : R2 → R2 defined as f (x. if we consider two vectors w. +. ·). ·).   Example 9.2 Endomorphisms and Kernel 291 Example 9. 2x + 2y) .   Since for hypothesis (U. +. f (λv) ∈ f (U ). · and the linear mapping f : R2 → R defined  as f (x) = 6x + 4y. 2) . +. w ∈ f (U ) then     w + w = f (v) + f v = f v + v . the application of the theorem is straightforward since f (R) = R and (R. Since the set f (U ) is closed with respect to both the composition laws the triple ( f (U ) . We obtain f (v1 ) = (1.  Example 9.35 Let us consider the vector space (R. ·) is also a vector space. ·) we have to show that the set f (U ) is closed with respect to the two composition laws. Let us now consider a generic scalar λ ∈ K and calculate λw + λ f (v) = f (λv) . Let f (U ) be the set of all the transformed vectors of U . ·) is a vector subspace of (F. +. +. By definition. Proof In order to prove that ( f (U ) . +. Thus. the fact that a vector w ∈ (U ) means that ∃v ∈ U such that f (v) = w. In this case f R2 = R and (R.9.

Example 9. ·) is a vector space. We can write for the linearity of f     f v + v = f (v) + f v . The linear mapping f simply projects the points of this plane into another  plane. +. This can be easily calculated imposing f (x) = 0. ·). z) ∈ R3 | x + 2y + z = 0} and the linear mapping f : R3 → R2 defined as f (x. The set U . +. Since (W.2 Let f : E → F bea linear mapping.  Definition 9. +. Hence. 4y + 5z) . 8. +.292 9 Linear Mappings Example 9. f (v) + f v ∈ W . the set f −1 (W ) is closed  −1  with respect to the second composition law. 5x = 0. Let us consider a generic scalar λ ∈ K and calculate f (λv) = λ f (v) . +. The following examples better clarify the concept of kernel.13 Let f : E → F be a linear mapping. · is a vector subspace of (E. If (W. i.  that is the two-dimensional space R2 .e. ·) is a vector subspace of (F. +. +. The kernel of f is the set ker ( f ) = {v ∈ E| f (v) = oF } . y.     Since (W. ·) we have to prove the closure of f −1 (W ) with respect to the two composition laws. as we know from Chap. ·). then f −1 (W ) . Thus. z) = (3x + 2y. · is a vector subspace of (E. ·). · is a vector subspace of (E. +. the set f −1 (W ) is closed with respect to the first composition law. · is a vector space. Corollary 9. f (W ) . +. ·) is a vector space. Since f (λv) ∈ W the λv ∈ f −1 (W ). +. If a vector v ∈ f −1 (W ) the f (v) ∈ W . ·) where U = {(x.37 Let us consider the vector space (U. f (U ) = R2 and obviously R2 . Thus. Hence. Thus. In this case the kernel ker = {0}. can be interpreted as a plane of the space passing through the origin of the reference system. . y. ·). +.1 Let f : E → F be a linear mapping. λ f (v) ∈ W . f v + v ∈ W and v + v ∈ f −1 (W ). Theorem 9.38 Let us find the kernel of the linear mapping f : R → R defined as f (x) = 5x. +. +. If f (E) = Im ( f ) then Im ( f ) is a vector subspace of (F.   Proof In order to prove that f −1 (W ) .

y) values such that f (x. z) such that the mapping is equal to oF . y) = 0. ker ( f ) = {(0. More formally. y. y.41 Let us consider now the linear mapping f : R3 → R3 defined as f (x. To find the kernel means to find the (x. y. ∀α ∈ R. i. x − y − z. z) = (x + y + z. 5α) . x − y − z. These solutions are all proportional to (α. y) = 5x − y.2 Endomorphisms and Kernel 293 Example 9. Example 9.e. Thus. those (x. α ∈ R}. the kernel is ker ( f ) = {(α. z) = (x + y + z. x + y + 2z) . To find the kernel means to solve the following system of linear equations: ⎧ ⎪ ⎨x + y + z = 0 x−y−z =0 ⎪ ⎩ 2x + 2y + 2z = 0. Example 9. 2x + 2y + 2z) . This is an equation in two variables. For the Rouché Capelli Theorem this equation has ∞1 solutions.9. We can easily verify that ⎛ ⎞ 1 1 1 det ⎝ 1 −1 −1 ⎠ = 0 2 2 2 . y. 5α). This means that the kernel of this linear mapping is the composed of the point a line belonging to R2 and passing through the origin. 0)} = {oE }. It can be easily verified that this homogeneous system of linear equations is determined. 0. y) values that satisfy the equation 5x − y = 0.40 Let us consider the linear mapping f : R3 → R3 defined as f (x. z) such that ⎧ ⎪ ⎨x + y + z = 0 x−y−z =0 ⎪ ⎩ x + y + 2z = 0. The kernel of this mapping is the set of (x.39 Let us consider the linear mapping f : R2 → R defined as f (x. This means that the kernel of this mapping is the set of (x.

Thus. Let us consider a generic scalar λ ∈ K and calculate f (λv) = λ f (v) = λoF = oF . This confirms the statement of the theorem above. −1) . −1) . +. α ∈ R}. 2x + 2y + 2z) . 1. If we pose x = α we find out that the infinite solutions of the system are α (0. We already know that ker ( f ) = {α (0. ·) is a vector space. ·) is a vector subspace of (E.  As shown in the examples above. 1. the calculation of ker f can always be con- sidered as the solution of a homogeneous system of linear equations. z) = (x + y + z.3 Let f : E → F be a linear mapping. ker ( f ) is closed with respect to the first composition law. Hence. This means that (ker ( f ) . Proof Let us consider two vectors v. ·). it can be observed that (ker ( f ) . ker f always contains the null vector. i. v ∈ E. the kernel of the mapping is ker ( f ) = {α (0. v ∈ ker ( f ). This means at first that ker ( f ) ⊂ R3 and. . It follows that f (v) = f v if and only if v − v ∈ ker ( f ). +. ·) is a vector subspace of (E. Thus. +. Thus. From its formulation. f (λv) ∈ ker ( f ) and ker ( f ) is closed with respect to the second composition law. Example 9. The next example further clarifies this fact. more specifically.   Proof If f (v) = f v then     f (v) − f v = oF ⇒ f (v) + f −v = oF ⇒   ⇒ f v − v = oF . +. α ∈ R}. Thus. +.     f v + v = f (v) + f v = oF + oF = oF   and f v + v ∈ ker ( f ). The triple (ker ( f ) . 1). ·) is a vector space having dimension one. (ker ( f ) . −1. x − y − z. +.e. ∀α ∈ R. If a vector v ∈ ker ( f ) then f (v) = oF .4   Let f : E → F be a linear mapping and v. Theorem 9. this system is undetermined and has ∞1 solutions. Thus. y. ·).  Theorem 9.42 Let us consider again the linear mapping f : R3 → R3 defined as f (x. can be seen as a line of the space passing through the origin of the reference system.294 9 Linear Mappings and the rank of the system is ρ = 2.
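As the examples show, the calculation of a kernel amounts to solving a homogeneous system of linear equations. As a computational check only (a sketch using SymPy; the variable names below are chosen here), the kernels of Examples 9.40 and 9.41 can be recovered as null spaces of the coefficient matrices:

```python
from sympy import Matrix

# Coefficient matrices of the homogeneous systems in Examples 9.40 and 9.41.
A_940 = Matrix([[1, 1, 1],
                [1, -1, -1],
                [1, 1, 2]])   # f(x, y, z) = (x + y + z, x - y - z, x + y + 2z)

A_941 = Matrix([[1, 1, 1],
                [1, -1, -1],
                [2, 2, 2]])   # f(x, y, z) = (x + y + z, x - y - z, 2x + 2y + 2z)

print(A_940.nullspace())   # []: the system is determined, ker(f) = {o}
print(A_941.nullspace())   # one basis vector proportional to (0, 1, -1): ker(f) = L((0, 1, -1))
print(A_941.rank())        # 2: the rank is 2, so the homogeneous system has infinitely many solutions
```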

Theorem 9. As stated from the theorem the mapped values are the same.2 Endomorphisms and Kernel 295 For the definition of kernel v − v ∈ ker ( f ). x − y − z.5 Let f : E → F be a linear mapping. Let us calculate the mapping of these two vectors:  = (6 + 4 − 7. 5. ker ( f ) = {oE }. for definition of injective mapping this means that v = oE . . Since f is injective. y.43 Again for the linear mapping f : R3 → R3 defined as f (x. 6 − 4 + 7. For definition of kernel ∀v ∈ ker ( f ) : f (v) = oF . The mapping f is injective if and only if ker ( f ) = {oE }. v = (6. e. every vector v in the kernel is oE . f (oE ) = oF . 4.e. 2x + 2y + 2z) . 2x + 2y + 2z) we consider two vectors v. z) = (x + y + z. 5. Thus. −1. i. 1) which belongs to ker ( f ). Since. 6) f (v) f v = (6 + 5 − 8.  If v − v ∈ ker ( f ) then     f v − v = oF ⇒ f (v) − f v = oF ⇒   ⇒ f (v) = f v . 12 + 10 − 16) = (3. Proof Let us assume that f is injective. −7) v = (6. −8) . v = v . On the other hand. −8) = (0. v ∈ E such that f (v) = f v it follows that v = v then f is injective. 9.44 Let us consider once again. z) = (x + y + z. Hence. 12 + 8 − 14) = (3. the mapping f : R3 → R3 defined as f (x. −7) − (6.  Example 9. y. v ∈ E such that f (v) = f v ⇒ f (v) − f v = oF . However.   Let us assume   that ker = ( )  E f {o }. 4. 6) . since for hypothesis ker ( f ) = {oE } then v − v = oE .9.1. 6 − 5 + 8. ∀v.  Example 9. For the definition of kernel v − v ∈ ker ( f ). For every two vectors v. v ∈ ker ( f ). It follows from the linearity of f that f v − v = oF . 9.g. f (v) = f (oE ). for the Proposition 9. x − y − z. Hence. Let us calculate their difference v − v = (6.

its only solution is the null vector) then the kernel is only the null vector and the mapping is injective. 0) . i. 0) . −9) it follows that v = v . f is not a mapping while in the case m = 0.e.46 A linear mapping f : R → R defined as f (x) = mx with m finite and m = 0 is injective. 1. 8. z) = (x + y + z. In other words this mapping is not injective. 0. A mapping is injective if it always occurs that the transformed of different vectors are different. The function is not injective and its kernel is ker ( f ) = {α (1) . y. the entire set R. . as expected.45 If we consider the mapping f : R3 → R3 defined as f (x. As a further comment we can say that if the homogeneous system of linear equa- tions associated to a mapping is determined (i. The transformed vectors are  = (0. α ∈ R}. −1)} = oE = (0. x + y + 2z) we know that its kernel is oE . −8) v = (0.296 9 Linear Mappings We know that ker ( f ) = {α (0. Example 9. 0) f (v) f v = (0. 0. f is a mapping and is also linear. It can be observed that its kernel is ker ( f ) = {0}. We can observe that if we take two different vectors and calculate their transformed vectors we will never obtain the same vector. In the special case m = ∞. In this case if we consider the following vectors v = (0. in correspondence to v = v we have f (v) = f v . From the theorem above we expect that this mapping is not injective. This mapping is injective.e. Example 9. x − y − z.   Thus. 0. 9.

1) f (w) = (1. . 0) v = (0. Since for hypothesis f is injective.9. . . . 0 such that oF = λ1 f (v1 ) + λ2 f (v2 ) + · · · + λn f (vn ) .6 Let f : E → F be a linear mapping. 0. . . f (vn ) are also linearly independent vectors ∈ F. . . Hence we can see that there is a relation between the kernel and the deep meaning of the mapping. . λn = 0. 1. The transformed of these vectors are f (u) = (1. . . . f (vn ) must be linearly independent. . This topic will be discussed more thoroughly below. . . . vn are linearly independent. . . it follows that oE = λ1 v1 + λ2 v2 + · · · + λn vn with λ1 . f (v2 ) . 0. . Theorem 9. . v2 . . If m = 0 the matrix associated to system is singular and has null rank. 2) . λn = 0. 1) f (v) = (1. 1) . x + y + 2z) and the following linearly independent vectors of R3 : u = (1. f (v2 ) . we can observe that for m = 0 the mapping transforms a line into a line while for m = 0 the mapping transforms a line into a constant (a point). Hence the system has ∞1 solutions Intuitively. If f is injective then f (v1 ) . . 0. . x − y − z.1 and linearity of f we can write this expression as f (oE ) = f (λ1 v1 + λ2 v2 + · · · + λn vn ) . y. Proof Let us assume. v2 . . . Let v1 . 0) w = (0. 0. 0. This is impossible because v1 . λ2 .2 Endomorphisms and Kernel 297 This result can be achieved looking at the equation mx = 0 as a system of one equation in one variable. . λ2 . . Hence we reached a contradiction and f (v1 ) . . .  Example 9. vn be n linearly independent vectors ∈ E. For the Proposition 9. . by contradiction that ∃λ1 .47 Let us consider the injective mapping f : R3 → R3 defined as f (x. −1. 1. . . −1. z) = (x + y + z. .

x − y − z. +.e.3 Linear Mappings and Matrices Definition 9. It follows that the vectors are linearly independent. Theorem 9. The rank of this mapping is two. Hence. +.6: while linearly dependent is always preserved by a linear mapping. α ∈ R} we can immediately see that (ker ( f ) . μ. i. −1) . 2x + 2y + 2z) and its kernel ker ( f ) = {α (0. i. ·) is a vector space having dimension one. . In order to approach the calculation of the image.298 9 Linear Mappings Let us check their linear dependence by finding.) Let f : E → F be a linear mapping where (E. Let (E. its only solution is (0. 0). We can put into relationship Proposition 9. Definition 9. (Im ( f ) .14 Let f : E → F be a linear mapping and Im ( f ) its image. by mappings whose kernel is null. ·) be a finite-dimensional vector space whose dimension is dim (E) = n. This is equivalent to solving the following homogeneous system of linear equa- tions: ⎧ ⎪ ⎨λ + μ + ν = 0 λ−μ−ν =0 ⎪ ⎩ λ + μ + 2ν = 0. 1. y.48 If we consider again the linear mapping f : R3 → R3 defined as f (x. The dimension of the image dim (Im ( f )) is said rank of a mapping. ·) are vector spaces defined on the same scalar field K. 0. +.15 Let f : E → F be a linear mapping and ker ( f ) its kernel. The dimension of the image dim (ker ( f )) is said nullity of a mapping. +. let us intuitively consider that this mapping transforms vectors of R3 into vectors that are surely linearly dependent as the third component is always twice the value of the first component.e. linearly independent is preserved only by injective mapping. if they exist. ν such that o = λ f (u) + μf (v) + ν f (w) . This means that this mapping transforms points of the space into points of a plane. Example 9. 9. ·) is a vector space having dimension two. z) = (x + y + z. the nullity of the mapping is one. The system is determined.3 and Theorem 9.7 (Rank-Nullity Theorem. +. ·) and (F. thus. the values of λ.

e. ws } with 0 < r < n and 0 < s < n. . It follows that dim (Im ( f )) ≤ n. . In order to prove this fact. Since. .6. Hence. . let us prove that the equality considers only finite numbers. Thus. ·) is finite-dimensional. ∃ a basis B = {e1 . . then dim (ker ( f )) ≤ dim (E) = n. . +. . +. Proof At first. then f injective. ∀v ∈ E : f (v) = oF and Im ( f ) = {oF }.9. . . the ker ( f ) is a subset of E. the equality contains only finite numbers. Hence. Hence. f (e2 ) . en } such that every vector v ∈ E can be expressed as v = λ1 e1 + λ2 e2 + . .e.3 Linear Mappings and Matrices 299 Under these hypotheses the sum of rank and nullity of a mapping is equal to the dimension of the vector space (E. . f (en ) in Im ( f ) are linearly independent for Theorem 9. + λn f (en ) . +. i. e2 . dim (Im ( f )) = 0 and dim (ker ( f )) + dim (Im ( f )) = dim (E). . also the vectors f (e1 ) . Im ( f ) = L ( f (e1 ) . dim (ker ( f )) is a finite number. ur } dim (Im ( f )) = s ⇒ ∃BIm = {w1 . . by definition of kernel. +. . w2 . Let us apply the linear transformation f to both the terms in the equation f (v) = f (λ1 e1 + λ2 e2 + . u2 . It follows that dim (Im ( f )) = n and dim (ker ( f )) + dim (Im ( f )) = dim (E). . e2 . . since dim (E) is a finite number we have to prove that also dim (ker ( f )) and dim (Im ( f )) are finite numbers. . en }. . . Since (E. + λn en . ·): dim (ker ( f )) + dim (Im ( f )) = dim (E) . Since these vectors also span (Im ( f ) . . f (e2 ) . ·) is B = {e1 . f (en )) . . . . In the remaining cases. . they compose a basis. ker ( f ) = {oE }. . Thus. • If dim (ker ( f )) = n. . . . Let us consider now two special cases • If dim (ker ( f )) = 0. ker ( f ) = E. dim (ker ( f )) is = 0 and = n. + λn en ) = = λ1 f (e1 ) + λ2 f (e2 ) + . i. ·). Hence. if a basis of (E. . We can write dim (ker ( f )) = r ⇒ ∃Bker = {u1 .

. . for the Theorem 9.4 u = x − h 1 v1 − h 2 v2 − · · · − h s vs ∈ ker ( f ) . . . v2 . b2 . . u1 . u2 . we can rearrange the equality as x = h 1 v1 + h 2 v2 + · · · + h s vs + l1 u1 + l2 u2 + · · · + lr ur . vs . v2 . . Let us calculate the linear mapping of this equality and apply the linear properties f (oE ) = oF = f (a1 v1 + a2 v2 + · · · + as vs + b1 u1 + b2 u2 + · · · + br ur ) = = a1 f (v1 ) + a2 f (v2 ) + · · · + as f (vs ) + b1 f (u1 ) + b2 f (u2 ) + · · · + br f (ur ) = = a1 w1 + a2 w2 + · · · + as ws + b1 f (u1 ) + b2 f (u2 ) + · · · + br f (ur ) . lr . a2 . u2 . . . . . We know that f is not injective because r = 0. .. . Since x has been arbitrarily chosen. . . . . . l2 . . . br and let us express the null vector as linear combination of the other vectors oE = a1 v1 + a2 v2 + · · · + as vs + b1 u1 + b2 u2 + · · · + br ur . . as . ur ) . u1 . On the other hand. b1 .. we can conclude that the vectors v1 . . . . Moreover. . ∀x ∈ E. . . . Let us consider the scalars a1 . . h 2 . . vs . ur span E: E = L (v1 . ws ∈ Im ( f ) ⇒ ∃vs ∈ E| f (vs ) = ws . h s f (x) = h 1 w1 + h 2 w2 + · · · + h s ws = = h 1 f (v1 ) + h 2 f (v2 ) + · · · + h s f (vs ) = = f (h 1 v1 + h 2 v2 + · · · + h s vs ) . Let us check the linear independence of these vectors. If we express u as a linear combination of the elements of Bker by means of the scalars l1 . . the corresponding linear mapping f (x) can be expressed as linear combination of the elements of BIm by means of the scalars h 1 . . .300 9 Linear Mappings By definition of image w1 ∈ Im ( f ) ⇒ ∃v1 ∈ E| f (v1 ) = w1 w2 ∈ Im ( f ) ⇒ ∃v2 ∈ E| f (v2 ) = w2 .

. Since u1 .e. that is dim (ker ( f )) + dim (Im ( f )). . i.3 Linear Mappings and Matrices 301 We know that since u1 . . Since w1 . . Let us consider the two linear mappings R3 → R3 studied above. As a geometrical interpretation. b2 . . . as = 0. . x − y − z. 0)}. u2 . . u2 . . . . w2 . which will be renamed f 1 and f 2 to avoid confusion in the notation. . f (ur ) = oF . ur ∈ ker ( f ) then f (u1 ) = oF f (u2 ) = oF .e. f 1 (x. that dim (E) = n and we know that this basis is composed of r + s vectors. . they are linearly independent.   We know that ker ( f 1 ) = {(0. . and dim (E). for the hypothesis. v2 . x + y + 2z) and f 2 (x. It follows that f (oE ) = oF = a1 w1 + a2 w2 + · · · + as ws . z) = (x + y + z. 0. . y. vs . br = 0. . y. .  Example 9. . x − y − z. It follows that v1 . . . Hence. For the rank-nullity theorem dim (Im ( f )) = 3.9. . . . u1 . We know. also b1 . . . a2 . ws compose a basis.. . Hence. dim (ker ( f )) + dim (Im ( f )) = r + s = n = dim (E) . 2x + 2y + 2z) . . It follows that a1 . . . Since these vectors also span E.. 0. z) = (x + y + z. dim (Im ( f )) is the hardest to calculate and this theorem allows an easy way to find it. they are linearly independent. ur are linearly independent. 0. dim (ker ( f )) = 0 and dim R3 = 3. . . . . 0. u2 . Usually. this mapping transforms points (vectors) of the (three-dimensional) space into points of the space. i.49 The rank-nullity theorem expresses a relation among dim (ker ( f )). ur compose a basis. 0 and that oE = a1 v1 + a2 v2 + · · · + as vs + b1 u1 + b2 u2 + · · · + br ur = = b1 u1 + b2 u2 + · · · + br ur . dim (Im ( f )). . . . . they compose a basis.

As stated above. 5x + 10y + 5z) . y) = x + y. is one. the detection of the kernel is very straightforward: 5x = 0 ⇒ x = 0 ⇒ ker ( f 1 ) = {0}.51 Let us consider the linear mapping R3 → R3 defined as f (x.e. α ∈ R from which it follows that ker ( f 2 ) = α (1.50 Let us check the rank-nullity theorem for the following mappings: f1 : R → R f 1 (x) = 5x and f 2 : R2 → R f 2 (x. z) such that ⎧ ⎪ ⎨x + 2y + z = 0 3x + 6y + 3z = 0 ⎪ ⎩ 5x + 10y + 5z = 0. Regarding f 1 . the mapping f 2 transforms points of the space into points of a plane in the space. z) = (x + 2y + z. y) = α (1. y. the kernel is calculated as x + y = 0 ⇒ (x. Example 9. it follows that dim (Im ( f 2 )) = 1. Thus ∞2 solutions exists. dim (Im ( f 1 ) . This mapping transforms the points of a line (x axis) into another line (having equation 5x). ·) = 1. Since dim (R. Since dim R2 = 2. γ ∈ R we have that the solution of the system of linear equations is . It can be checked that the rank of this homogeneous system of linear equations is ρ = 1. i. we know that ker ( f 2 ) = α (0. For the rank-nullity theorem dim (Im ( f )) = 2. +. −1. 1). This means that the mapping f 2 transforms the points of the plane (R2 ) into the points of a line in the plane. ·). i. It follows that the rank of f 1 is zero.   We can observe that dim (ker ( f 2 )) = 1. Example 9. −1) . y.302 9 Linear Mappings Regarding   f 2 . If we pose x = α and z = γ with α. Regarding f 2 . we have that the nullity. 3x + 6y + 3z. The kernel of this linear mapping is the set of points (x. dim (ker ( f )) = 1 and dim R3 = 3.e. +. −1) .

9.3 Linear Mappings and Matrices 303 .

α+γ (x. y.γ . z) = α. 2 that is also the kernel of the mapping: . − .

e. Since dim R3 . For example. 0. f is surjective. If f is surjective then it is also injective. We can conclude that the mapping f transforms the points of the space (R3 ) into the points of a line of the space. · = 3. 2   It follows that dim (ker ( f ) . dim (ker ( f )) = 0 that is equivalent to say that f is injective. If f is injective ker ( f ) = {oE }. If f is injective then it is also surjective. see Theorem 9.5. 0) Corollary 9. x + y + z.  Example 9. This means that Im ( f ) = E. Thus.2 Let f : E → E be an endomorphism where (E. a mapping f : A → B is injective when it occurs that ∀v1 . − . z) = (x − y + 2z.γ . +.52 In order to better understand the meaning of this corollary let us remind the meaning of injective and surjective mappings. If the mapping is linear. i.  If f is surjective dim (Im ( f )) = n = dim (E). Proof Let dim (E) = n. If we consider endomorphisms f : R3 → R3 we can have four possible cases: • the corresponding system of linear equations is determined (has rank ρ = 3): the dimension of the kernel is dim (ker ( f )) = 0 and the mapping transforms points of the space into points of the space • the corresponding system of linear equations has rank ρ = 2: the dimension of the kernel is dim (ker ( f )) = 1 and the mapping transforms points of the space into points of a plane in the space • the corresponding system of linear equations has rank ρ = 1: the dimension of the kernel is dim (ker ( f )) = 2 and the mapping transforms points of the space into points of a line in the space • the corresponding system of linear equations has rank ρ = 0 (the mapping is the null mapping): the dimension of the kernel is dim (ker ( f )) = 3 and the mapping transforms points of the space into a constant. ·) is a finite- dimensional vector space. it is injective if and only if its kernel is the null vector. −5x + 2y + z) . if v1 = v2 then f (v1 ) = f (v2 ). it follows from the rank-nullity theorem that dim (Im ( f )) = 1. In general. α+γ ker ( f ) = α. y. +. that is (0. dim (ker ( f )) = 0 and dim (Im ( f )) = n = dim (E). v2 ∈ A. the mapping f : R3 → R3 defined as f (x. Thus. +. ·) = 2.

+. y) = (x + y) . B = Im ( f ). Example 9. From the rank-nullity theorem we know that dim (E) = 2 = dim (Im ( f )) = dim (F) = 3 where F = R3 . a mapping which is not an endomorphism could be surjective and not injective or injective and not surjective. The mapping is surjective.54 Let us consider the mapping f : R2 → R f (x.53 Let us give an example for a non-injective mapping f : R3 → R3 f (x. This means that the mapping is injective. 3x + 2y) . +. In our case B = R3 and the image of the mapping is also Im ( f ) = R3 . Hence. This mapping is then not surjective. an injective linear mapping is always also surjective. 2) with α ∈ R. We know that dim (Im ( f ) . We can easily see that this mapping is not injective   dim (ker ( f ) . 0)} and its dimension is dim (ker ( f )) = 0. · = 3. On the other hand. 2x − 2y + 4z) . . +. i. Example 9. ·) = 3 and that this mapping transforms points of the space (R3 ) into points of the space (R3 ). Obviously. Thus the mapping is not surjective. since It follows that dim (Im ( f ) . only (0. Thus. z) = (x − y + 2z.304 9 Linear Mappings is injective (the matrix associated to the system is non-singular and then the system is determined). The kernel is then ker ( f ) = {(0. In other words. Example 9. This means that dim (Im ( f )) = 1. dim (ker ( f )) = 1 and for the rank-nullity theorem dim (E) = 2 = dim (ker ()) + dim (Im ( f )) = 1 + 1. x + y + z.55 Let us consider the mapping f : R2 → R3 f (x. Let us check the injection of this mapping by determining its kernel: ⎧ ⎪ ⎨x + y = 0 x−y=0 ⎪ ⎩ 3x + 2y = 0. y. the equiv- alence of injection and surjection would not be valid for non-linear mappings. y) = (x + y. +. 0) is solution of the system. ·) = 1. The rank of the system is ρ = 2. this mapping is not an endomorphism and not injective since its kernel is ker ( f ) = α (1. x − y.e. the corollary above states that injective endomorphisms are also surjective and vice-versa. ·) = 2 = dim R3 . Hence. This statement is the definition of surjective mapping. Obviously.

It follows that the mapping is not surjective. +. +. .9.n is associated to this linear mapping.3 Linear Mappings and Matrices 305 The latter two examples naturally lead to the following corollaries. Proof For the rank-nullity theorem we know that dim (E) = dim (ker ( f )) + dim (Im ( f )) ⇒ ⇒ dim (Im ( f )) = dim (E) − dim (ker ( f )) ≤ dim (E) < dim (F) . it follows that dim (Im ( f )) ≤ dim (F). ·) and (F. . In other words. Let us consider that dim (E) < dim (F). Thus. Let us consider that dim (E) > dim (F). Hence from the rank-nullity theorem dim (E) = dim (ker ( f )) + dim (Im ( f )) ⇒ dim (ker ( f )) = dim (E) − dim (Im ( f )) ≥ dim (E) − dim (F) > dim (F) − dim (F) = 0.  Corollary 9. . ·) and (F. . + xn en . e2 . Corollary 9. ker ( f ) cannot be only the null vector. e2 . A matrix Am. two bases are associated to these two vector spaces B E = {e1 . This means that the mapping cannot be injective. For the definition of basis ∀x ∈ E : x = x1 e1 + x2 e2 + . . . +. ·) finite-dimensional vector spaces. The mapping cannot be surjective. en . . ·) are finite- dimensional vector spaces defined on the same field K and whose dimension is n and m. +. . Proof Since Im ( f ) ⊂ F. If we apply the linear mapping we obtain f (x) = x1 f (e1 ) + x2 f (e2 ) + . . en }   B F = e1 . Hence. dim (ker ( f )) > 0. It follows that the mapping is not injective. ·) finite-dimensional vector spaces. respectively. .4 Let f : E → F be a linear mapping with (E. ·) and (F. + xn f (en ) .  9. . +.3 Let f : E → F be a linear mapping with (E. . .1 Matrix Representation of a Linear Mapping Let f : E → F be a linear mapping where (E. +. Thus.3. dim (Im ( f )) < dim (F).

m em  +         + .306 9 Linear Mappings On the other hand. .2 e2 + . . a1. + a1..2 x2 + . . . . . . + a2. . we can write x1 f (e1 ) + x2 f (e2 ) + . . ⎟. . .1 x1 + am. by grouping.. . + an. .2 e2 + .1 e1 + a1.n n ⎪ ⎪ ..1 x1 + a1. + ym em  . .n ⎟ A=⎜ ⎝ . . ⎠ am.. . + xn f (en ) = f (x) = y1 e1 + y2 e2 + . + am. .1 e1 + a2. . + xn an.n . + a1. .2 e2 + . + a x 2 2.2 . . . .n xn ⎪ ⎨y = a x + a x + . we obtain     x1 a1. . .m em By substituting.m em  + x2 a2. .2 ..⎠. . . en ∈ E ⇒ f (en ) ∈ F ⇒ f (en ) = an... + ym em  . + a2.2 e2 + .. .1 a2. .1 1 2.1 e1 + a1. ∀x ∈ E : f (x) = y1 e1 + y2 e2 + . .m em = y1 e1 + y2 e2 + .2 x2 + .2 2 2.1 e1 + an. . . . We can further consider that e1 ∈ E ⇒ f (e1 ) ∈ F ⇒ f (e1 ) = a1.1 e1 + an.m em  e2 ∈ E ⇒ f (e2 ) ∈ F ⇒ f (e2 ) = a2. .2 e2 + . + ym em and.n xn This system is then a matrix equation y = Ax where ⎛ ⎞ y1 ⎜ y2 ⎟ y=⎜ ⎝. + an.. ⎪ ⎩ ym = am.1 am.2 e2 + . By equating the two expressions of f (x). we obtain ⎧ ⎪ ⎪ y1 = a1..1 e1 + a2.. a2.2 .m em . . .n ⎜ a2. ⎟ ym ⎛ ⎞ a1. . am.1 a1.. . + a1. .

. It must be observed that in the case of endomorphism the matrix associated to the linear mapping is clearly square. the elements of the first column are the components of f (e2 ).  mapping f : R → R defined by the bases 3 2 Example 9.⎠.56 Let us considerthe linear   {e } BR = 1 . etc.9. xn The elements of the first column are the components of f (e1 ). e3 and BR = e1 . e2 ..3 Linear Mappings and Matrices 307 and ⎛⎞ x1 ⎜ x2 ⎟ x=⎜ ⎟ ⎝. e2 as well as the matrix 3 2 .

4. By definition of kernel   ker ( f ) = x ∈ R3 | f (x) = oR2 =   = (x1 . 3) . it has ∞1 solutions proportional to (1. Since the dimension of the kernel is not 0. x2 . then the mapping is not injective. x3 ) = (0. Hence. This means  x1 − 2x2 + x3 = 0 3x1 + x2 − x3 = 0 This system has rank 2. 0) . ker ( f ) = L ((1. 4. f (e3 )) = = L ((1. 3x1 + x2 − x3 ) . As shown above the image is spanned by the columns of the associated matrix: Im ( f ) = L ( f (e1 ) . x2 . 3 1 −1 This representation of the mapping is equivalent to f : R3 → R2 defined as f (x1 . Let us find the Im ( f ). (−2. . 1) . Let us find the ker ( f ). x3 ) = (x1 − 2x2 + x3 . Thus. −1)) . f (e2 ) . x3 ) ∈ R3 | f (x1 . 1 −2 1 A= . 7) . 7)) and it has dimension equal to 1. (1. x2 .
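The kernel and the image of the mapping of Example 9.56 can be read directly from its matrix: the kernel is the null space, the image is spanned by the columns, and their dimensions sum to the dimension of the departure space. A short SymPy sketch (illustrative only; variable names chosen here):

```python
from sympy import Matrix

A = Matrix([[1, -2, 1],
            [3, 1, -1]])           # matrix of f(x1, x2, x3) = (x1 - 2x2 + x3, 3x1 + x2 - x3)

print(A.nullspace())               # one basis vector proportional to (1, 4, 7): ker(f) = L((1, 4, 7))
print(A.columnspace())             # two independent columns: they span Im(f) = R^2
print(A.rank(), A.cols - A.rank()) # 2 1: dimension of the image and of the kernel

x = Matrix([-2, 1, 0])
print(A * x)                       # Matrix([[-4], [-5]]): the value f(-2, 1, 0)
```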

(3) (−2) + (1) 1 + (−1) 0) = (−4. 1. Since the dimension of Im ( f ) is 2 as well as the dimension of R2 . the mapping is surjective. Hence. 1. 0): f (−2. This is in agreement with the rank- nullity theorem as 1 + 2 = 3 = dim R3 . 0) = ((1) (−2) + (−2) 1 + (1) 0. two are linearly independent.308 9 Linear Mappings It can be easily seen that out of these three vectors. the dimension of Im ( f ) is equal   to 2. −5) The same result can be achieved by calculating the linear combination of the columns of the matrix associated to the mapping: . Let us compute the mapping in the point (−2.

.

.

.

i.57 Let us consider now  the linear  bases BR3 = {e1 . 3x1 − x2 . x2 ) : f (x1 . y3 ) = (2x1 − 4x2 . 2x1 + 4x2 ) that leads to f (e1 ) = 2e1 + 3e2 − 2e3 f (e2 ) = −4e1 − e2 + 4e3 . . by applying the definition. e2 } and BR2 = e1 . e2 . let us solve the following homogeneous system of linear equations. 1 −2 1 −4 −2 +1 +0 = 3 1 −1 −5  mapping f : R → R defined by the 2 3 Example 9. Let us find the kernel of this mapping. −2 4 The mapping can equivalently be expressed as ⎧ ⎪ ⎨ y1 = 2x1 − 4x2 y2 = 3x1 − x2 ⎪ ⎩ y3 = −2x1 + 4x2 or ∀ (x1 . e3 as well as the matrix ⎛ ⎞ 2 −4 A = ⎝ 3 −1 ⎠ . x2 ) = (y1 . y2 .e.

Hence. 2. Hence.58 Let us find the image of the mapping: Im ( f ) = ( f (e1 ) . 1 0 4 Let us find the kernel of f . Example 9. 1. The mapping is not surjective. ker ( f ) = {oR2 } and its dimension is 0. f is not injective. 2. These two vectors are linearly independent. −2) . ⎪ ⎩ −2x1 + 4x2 = 0 It can be easily seen that only oR2 solves the system. −1)} . f (e2 )) = ((2. x2 . the dimension of the image is 2 unlike the dimension of R3 (that is 3). 3. x1 + 4x3 ) . This means that the mapping is injective. 2.59 Let us consider the linear mapping f : R3 → R3 defined as f (x1 . Thus. (−1. 4)) . It follows that the dimension of the kernel is 1. ker ( f ) = L ((4. ⎪ ⎩ x1 + 4x3 = 0 The system is undetermined and has ∞1 solutions proportional to (4. 1) . This means that we have to solve the following system of linear equation ⎧ ⎪ ⎨x1 − x2 + 2x3 = 0 x2 + 2x3 = 0 . these two vectors compose a basis. 0)) . Hence. x2 + 2x3 . −1). x3 ) = (x1 − x2 + 2x3 .3 Linear Mappings and Matrices 309 ⎧ ⎪ ⎨2x1 − 4x2 = 0 3x1 − x2 = 0 . In other words. The matrix associated to this mapping is ⎛ ⎞ 1 −1 2 A = ⎝0 1 2⎠. (−4. If we consider the image Im ( f ) = L ((1. −1.9. 0. Example 9. −1)) and Bker( f ) = B {(4.

+.2 A Linear Mappings as a Matrix: A Summarizing Scheme For a given linear mapping f : E → F. +. equivalently. . . ·) kernel of the mapping is spanned by the row vectors of the matrix A: ker ( f ) = L (K1 . if we pose dim (E) = n dim (F) = m the mapping is identified by a matrix A whose size is m × n. In ) that is the vector space (Im ( f ) . ⎠ Km and in terms of column vectors as   A = I1 . This means that the vector space (ker ( f ) . K2 . In . Km ) .. In order to find the set kernel ker ( f ) the following system of linear equations must be solved Ax = o. . . I2 . . ·) vector spaces. . . . This matrix can be represented in terms of row vectors as ⎛ ⎞ K1 ⎜ K2 ⎟ A=⎜ ⎟ ⎝ .3. ·) and (F. 9. it follows that f (x) = Ax which. ·) image of the mapping is spanned by the column vectors of the matrix A. +. with (E. . . . . The mapping is not surjective. .310 9 Linear Mappings has dimension 2 as the three columns of the matrix A are linearly dependent (the third can be obtained as the sum of the first multiplied by 4 and the second column multiplied by 2). If we consider a vector x of n elements. I2 . can be expressed as Im ( f ) = L (I1 . +..

e2 . .1 e1 + p2. . e2 . .2 e2 + .n e2 + .1 en +   +x2 p1. . . . + pn.2 2 n. +. . . If a vector x ∈ E. .n e1 + p2. . This means that dim (ker ( f )) = n − ρ. As shown. .3.n en =   = x1 p1.1 e2 + .. + pn. Among the m vectors spanning the ker ( f ) only n − ρ are linearly independent..2 1 2. en } and B  = e1 . since e1 . . + p e 2 1. . 9. . ⎪ ⎪ ⎩  en = p1. ·).2 + .2 n ⎪.3 Change of a Basis by Matrix Transformation Let (E. . .n en Hence. .n e1 + .2 e1 + p2. .n e2 + . . . .1 e1 + p2. + pn. . Obviously. it can be expressed as a linear combination of the vectors of either bases: x = x1 e1 + x2 e2 + . en two basis associated to such vector space. . they indeed coincide. . by combining the last two equations we obtain x1 e1 + x2 e2 + . .2 en + . . ·) be a fine-dimensional vector space and let B = {e1 . en ∈ E they could be expressed as linear combinations of the elements of the basis B: ⎧  ⎪ ⎪ e1 = p1.3 Linear Mappings and Matrices 311 Let us indicate with ρ the rank of the matrix A. although the two concepts may appear distinct. +.1 + x2 p1. + xn en = x1 e1 + x2 e2 + .9. . From the rank-nullity theorem we can immediately check that dim (Im ( f )) = ρ that is the number of linearly independent column vectors in the matrix A and the number of linearly independent vectors in (Im ( f ) . + xn en =   = x1 p1. . + pn. + pn. On the other hand. .   +xn p1.1 e2 + .n e1 + p2. e2 . it is not a coin- cidence that the term rank recurs to describe the rank of a matrix and the dimension of the of the image space of a linear mapping.1 en ⎪ ⎨e  = p e + p e + . . . + xn en . + xn p1.

n ⎜ p2. . + xn pn.. xn ⎛ ⎞ p1. + xn pn.1 pn. xn Hence. The elements of the first column of matrix P are the components of e1 in the basis B. pn..1 p2....2 + .1 + x2 p2.. + xn p2.n e2 + .2 + . .1 + x2 pn...n ⎪ ⎪. .. + xn p1.⎠.2 + . ⎪ ⎩ xn = x1 pn..2 . The elements of the second column of matrix P are the components of e2 in the basis B. .1 + x2 p1..n ⎪ ⎨x = x  p + x  p + ... by means of a matrix multiplication a basis change can be performed. This equation leads to the system of linear equations ⎧ ⎪ ⎪ x1 = x1 p1.n with ⎛ ⎞ x1 ⎜ x2 ⎟ x=⎜ ⎟ ⎝. ⎠ pn... .1 2 2. . The elements of the n th column of matrix P are the components of en in the basis B. + x  p = x = Px 2 1 2..2 . .n en .   + x1 pn.2 n 2. .. . ⎟.1 + x2 pn.. On the basis of this consideration we can write x = Px ⇒ P−1 x = P−1 Px ⇒ ⇒ P−1 x = Ix ⇒ P−1 x = x . .n and ⎛ ⎞ x1 ⎜ x ⎟ x = ⎜ 2 ⎟ ⎝.. .312 9 Linear Mappings   + x1 p2.1 p1.. ...2 + . p1.2 . p2. It follows that the n column vectors in the matrix P are linearly independent and the matrix P is non-singular and thus invertible..n ⎟ P=⎜ ⎝ . . .⎠..

4. v3 }. The transformation matrix is ⎛ ⎞ 101 P = ⎝0 5 0⎠ 210 whose inverse matrix is ⎛ ⎞ 0 −0. μ. ν = 0.6 = x .60 The following three vectors v1 = (1.5 P−1 = ⎝ 0 0. By using the same notation of Chap. 0. 0. 210 The vector in the new basis is the solution of the system of linear equations. 0.1 −0. . 0. 1) = λ (1. 4 where.2. in the case of V3 . We already knew that to express a vector in a new basis a solution of a system of linear equations that is a matrix inversion was required. 0. the inversion of the matrix associated to the system: ⎛ ⎞ 101 P = ⎝0 5 0⎠.2 0 ⎠ .3 Linear Mappings and Matrices 313 Example 9. 5. 0) are linearly independent and are thus a basis of R3 . 1) in the new basis B = {v1 . 1. 1) + ν (1. 2) + μ (0. This vector is λ. 0. This fact was known already from Chap. 0) which leads to the following system of linear equations ⎧ ⎪ ⎨λ + ν = 1 5μ = 1 ⎪ ⎩ 2λ + μ = 1 that is. 5. 1 0. 0. it was shown how to express a vector in a new basis.1 0. v2 . 1. 0. 4 if we want to express x in the basis of v1 .9. essentially.4.6) . 1) v3 = (1. and v3 we may write (1. 2) v2 = (0. we can simply write P−1 x = (0.2.5 If we want to express the vector x = (1. v2 .
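The change of basis of Example 9.60 is, in practice, a single matrix inversion (or, equivalently, the solution of one linear system). A brief numerical check, given only as an illustration with array names chosen here:

```python
import numpy as np

# Columns of P are the new basis vectors v1, v2, v3 expressed in the old basis.
P = np.array([[1.0, 0.0, 1.0],
              [0.0, 5.0, 0.0],
              [2.0, 1.0, 0.0]])

x = np.array([1.0, 1.0, 1.0])      # coordinates in the old basis
x_new = np.linalg.solve(P, x)      # same result as P^{-1} x, but numerically preferable
print(x_new)                       # [0.4 0.2 0.6]
print(P @ x_new)                   # [1. 1. 1.]: mapping back recovers x
```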

. Let B = {e1 . Definition 9. This fact is indicated as A ∼ A . The non-singularity of the matrix P is an obvious consequence of the fact that its column are vectors composing a basis. . en we can write: y = Ax ⇒ Py = APx ⇒   ⇒ P−1 Py = P−1 APx ⇒ y = P−1 AP x ⇒ y = A x . In conclusion. .  det (P) Regarding the trace. e2 . Then. considering the basis change x = Px and y = Py with P = e1 . . en . +. . where A = P−1 AP.  . The definition above is given from the perspective of vector spaces and linear mappings.16 Let f : E → E be an endomorphism and (E. The matrices A and A (as well as the associated endomorphisms) are said similar. . . the change of basis by matrix transformations is a generalization to all vector sets and an extension to n variables of what has been shown for vectors in the space. e2 .8 Let A and A be two similar matrices. . we can remember that tr (AB) = tr (BA)       tr A = tr P−1 AP = tr P−1 (AP) =     = tr (AP) P−1 = tr A (P) P−1 = tr (A) .314 9 Linear Mappings Thus. Let us consider another basis of the vectors space: B  = e1 . These two matrices have same determinant and trace. Definition 9. e2 . . The same definition can be given in the following alternative way from the perspective of the matrices. .   Proof Considering the similarity and remembering that det P−1 = det(P) 1       det A = det P−1 AP = det P−1 det (A) det (P) =   1 = det (A) det P−1 det (P) = det (A) det (P) = det (A) . we can write Px = x ⇒ x = P−1 x. . Theorem 9. Let this endomorphism be representable by the matrix equation  y = Ax. . ·) be a finite- dimensional space whose dimension is n.17 Two square matrices A and A are said to be similar is there exists a non-singular matrix P such that A = P−1 AP. en } be a basis of this vector space.

Proposition 9.4 The similarity between matrices is an equivalence relation since, for three matrices A, A′ and A′′, the following properties are valid:
• reflexivity: A ∼ A
• symmetry: A ∼ A′ ⇒ A′ ∼ A
• transitivity: A ∼ A′ and A′ ∼ A′′ ⇒ A ∼ A′′

9.3.4 Geometric Mappings

Let us consider a mapping f : R² → R². This mapping can be interpreted as an operator that transforms a point in the plane into another point in the plane. Under these conditions, the mapping is said geometric mapping in the plane.

Let us now consider the following mapping f : R² → R²:

⎛ y1 ⎞   ⎛ s 0 ⎞ ⎛ x1 ⎞   ⎛ s x1 ⎞
⎝ y2 ⎠ = ⎝ 0 s ⎠ ⎝ x2 ⎠ = ⎝ s x2 ⎠.

It can be easily seen that this mapping is linear. This mapping is called uniform scaling. If the diagonal elements of the matrix are not equal, this linear mapping is called non-uniform scaling and is represented by

⎛ y1 ⎞   ⎛ s1 0  ⎞ ⎛ x1 ⎞   ⎛ s1 x1 ⎞
⎝ y2 ⎠ = ⎝ 0  s2 ⎠ ⎝ x2 ⎠ = ⎝ s2 x2 ⎠.

In the following figures, the basic points are indicated with a solid line while the transformed points are indicated with a dashed line.

The following linear mapping is called rotation and is represented by

⎛ y1 ⎞   ⎛ cos θ −sin θ ⎞ ⎛ x1 ⎞   ⎛ x1 cos θ − x2 sin θ ⎞
⎝ y2 ⎠ = ⎝ sin θ  cos θ ⎠ ⎝ x2 ⎠ = ⎝ x1 sin θ + x2 cos θ ⎠.

The following linear mapping is called shearing and is represented by

⎛ y1 ⎞   ⎛ 1  s1 ⎞ ⎛ x1 ⎞   ⎛ x1 + s1 x2 ⎞
⎝ y2 ⎠ = ⎝ s2 1  ⎠ ⎝ x2 ⎠ = ⎝ s2 x1 + x2 ⎠.

If, as in figure, the coefficient s2 = 0 then this mapping is said horizontal shearing. If s1 = 0 the mapping is a vertical shearing.

The following two linear mappings are said reflection with respect to the vertical and horizontal axes, respectively:

⎛ y1 ⎞   ⎛ −1 0 ⎞ ⎛ x1 ⎞   ⎛ −x1 ⎞
⎝ y2 ⎠ = ⎝  0 1 ⎠ ⎝ x2 ⎠ = ⎝  x2 ⎠

and

⎛ y1 ⎞   ⎛ 1  0 ⎞ ⎛ x1 ⎞   ⎛  x1 ⎞
⎝ y2 ⎠ = ⎝ 0 −1 ⎠ ⎝ x2 ⎠ = ⎝ −x2 ⎠.

The reflection with respect to the origin of the reference system is given by

⎛ y1 ⎞   ⎛ −1  0 ⎞ ⎛ x1 ⎞   ⎛ −x1 ⎞
⎝ y2 ⎠ = ⎝  0 −1 ⎠ ⎝ x2 ⎠ = ⎝ −x2 ⎠.

Let us consider now the following mapping:

y = f (x) = x + t

where

y1 = x1 + t1
y2 = x2 + t2

and t = (t1, t2). This operation, namely translation, moves the points a constant distance in a specific direction (see figure below). Unlike the previous geometric mappings, a translation is not a linear mapping as the linearity properties are not valid and a matrix representation by means of R2,2 matrices is not possible. More specifically, a translation is an affine mapping.

In order to give a matrix representation to affine mappings let us introduce the concept of homogeneous coordinates: we algebraically represent each point x of the plane by means of three coordinates where the third is identically equal to 1:

    ⎛ x1 ⎞
x = ⎜ x2 ⎟.
    ⎝ 1  ⎠

We can now give a matrix representation to the affine mapping translation in a plane:

⎛ y1 ⎞   ⎛ 1 0 t1 ⎞ ⎛ x1 ⎞   ⎛ x1 + t1 ⎞
⎜ y2 ⎟ = ⎜ 0 1 t2 ⎟ ⎜ x2 ⎟ = ⎜ x2 + t2 ⎟.
⎝ y3 ⎠   ⎝ 0 0 1  ⎠ ⎝ 1  ⎠   ⎝ 1       ⎠

All the linear mappings can be written in homogeneous coordinates simply adding a row and a column to the matrix representing the mapping. For example, the scaling and rotation can be respectively performed by multiplying the following matrices by a point x:

⎛ s1 0  0 ⎞
⎜ 0  s2 0 ⎟
⎝ 0  0  1 ⎠

and

⎛ cos θ −sin θ 0 ⎞
⎜ sin θ  cos θ 0 ⎟.
⎝ 0      0     1 ⎠

If we indicate with M the 2 × 2 matrix representing a linear mapping in the plane and with t the translation vector of the plane, the generic geometric mapping is given by the matrix

⎛ M t ⎞
⎝ 0 1 ⎠.
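In homogeneous coordinates every geometric mapping of the plane, including translation, becomes one 3 × 3 matrix product, so transformations can be composed by matrix multiplication. The following sketch is an illustration only; the angle, the offsets and the array names are chosen here and are not part of the original text.

```python
import numpy as np

theta = np.pi / 2                       # rotate by 90 degrees
t1, t2 = 3.0, -1.0                      # then translate by t = (3, -1)

R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])   # rotation in homogeneous coordinates

T = np.array([[1.0, 0.0, t1],
              [0.0, 1.0, t2],
              [0.0, 0.0, 1.0]])                         # translation in homogeneous coordinates

p = np.array([1.0, 0.0, 1.0])           # the point (1, 0) in homogeneous coordinates
print(np.round(T @ R @ p, 10))          # [ 3. -0.  1.]... rotation sends (1,0) to (0,1), translation to (3, 0)
```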

+. .9. For example. if we think to vectors in the space. and Eigenspaces Definition 9. ⎝ 0 0 1 0⎠ 0 0 0 1 ⎛ ⎞ cos θ 0 sin θ 0 ⎜ 0 1 0 0⎟ ⎜ ⎟. ⎝0 0 1 t3 ⎠ 0 0 0 1 9. we are requiring that the mapping changes only the module of a vector while it keeps its original direction. a slightly different meaning. ⎝ − sin θ 0 cos θ 0 ⎠ 0 0 0 1 and ⎛ ⎞ 1 0 0 0 ⎜ 0 cos θ − sin θ 0⎟ ⎜ ⎟. Every vector x such that f (x) = λx with λ scalar and x ∈ E \ {oE } is said eigenvector of the endomorphism f related to the eigenvalue λ. ·) is a finite- dimensional vector space defined on the scalar field K whose dimension is n. on each occasion.4 Eigenvalues.18 Let f : E → E be an endomorphism where (E. These concepts appear in various contexts and have practical implementation in various engineering problems. In other words. the rotations around the three axes are given by ⎛ ⎞ cos θ − sin θ 0 0 ⎜ sin θ cos θ 0 0⎟ ⎜ ⎟. The most difficult aspect of these concepts is that they can be observed from different perspectives and take. ⎝ 0 sin θ cos θ 0⎠ 0 0 0 1 The translation in the space is given by the following matrix ⎛ ⎞ 1 0 0 t1 ⎜0 1 0 t2 ⎟ ⎜ ⎟.3 Linear Mappings and Matrices 319 The geometric mappings in the space can be operated in a similar way by adding one dimension. As an initial interpretation by imposing f (x) = λx we are requiring that a linear mapping simply scales the input vector. Eigenvectors. The concepts of eigenvectors and eigenvalues are extremely important as well as tricky in mathematics.

regardless of the value of λ. an eigenvector (x. On the other hand. . Example 9. we have infinite eigenvectors associated to λ. A scalar λ with a corresponding vector (x. the detection eigenvectors and eigenvalues is trivial because the endomorphisms are already in the form f (x) = λx. By combining the last two equations we have   x + y = λx (1 − λ) x + y = 0 ⇒ 2x = λy 2x − λy = 0.62 Let us consider the endomorphism f : R → R defined as f (x) = 5x. This scalar is the eigenvalue. Example 9. In this case. it follows that (x. By definition. respectively. y) = λ (x. 0). respectively. If this situation occurs. Let us consider the following endomorphism f : R2 → R2 f (x. if we fix the value of λ such that the matrix associated to the system is singular. the search of eigenvalues and eigenvectors is not trivial. verify the following equation f (x. y) .320 9 Linear Mappings The search of eigenvectors is the search of those vectors belonging to the domain (the identification of a subset of the domain) whose linear mapping application behaves like a multiplication of a scalar by a vector. 0) = oE is not an eigenvector. the equations of the system are verified. y) = (x + y. y) and an eigenvalue λ. any vector x (number in this specific case) is a potential eigenvector and λ = 5 would be the eigenvalue. 2x) . y) = (0. y) = (0. Since the system is homogeneous the only way for it to be determined is if (x.63 When the endomorphism is between multidimensional vector spaces. Since by definition an eigenvector x ∈ E \ {oE }. For endomorphisms R → R. y) that satisfy the homogeneous system of linear equations are an eigenvalue and its eigenvector.

x2 ∈ V (λ). Hence.  Definition 9. Let us consider a scalar h ∈ K. It follows that f (x1 + x2 ) = f (x1 ) + f (x2 ) = λx1 + λx2 = λ (x1 + x2 ) .9 Let f : E → E be an endomorphism. For the definition of V (λ) x1 ∈ V (λ) ⇒ f (x1 ) = λx1 x2 ∈ V (λ) ⇒ f (x2 ) = λx2 . and Eigenspaces 321 Theorem 9.19 The vector subspace (V (λ) . y) = (x + y. since (hx) ∈ V (λ). ·).4 Eigenvalues. Example 9. the set V (λ) is closed with respect to the internal composition law. +. +. . It follows that f (hx) = h f (x) = h (λx) = λ (hx) . +.64 Let us consider again the endomorphism f : R2 → R2 f (x. the set V (λ) is closed with respect to the external composition law. Hence. +. since (x1 + x2 ) ∈ V (λ).9. We know that the condition for determining eigenvalues and eigenvectors is given by the system of linear equations  (1 − λ) x + y = 0 2x − λy = 0. Let us consider two generic vectors x1 . ·) is a vector subspace of (E. ·). 2x) . The dimension of the eigenspace is said geometric multiplicity of the eigenvalue λ and is indicated with γm . Eigenvectors. ·) defined as above is said eigenspace of the endomorphism f related to the eigenvalue λ. The set V (λ) ⊂ E with λ ∈ K defined as V (λ) = {oE } ∪ {x ∈ E| f (x) = λx} with the composition laws is a vector subspace of (E. Proof Let us prove the closure of V (λ) with respect tot he composition laws. We can conclude that (V (λ) . From the definition of V (λ) we know that x ∈ V (λ) ⇒ f (x) = λx.

322 9 Linear Mappings In order to identify an eigenvalue we need to pose that the matrix associated to the system is singular: .

. ·) is a vector space (and referred to as eigenspace) and a subspace of R2 . we derive two more equations. From this equation. . The solutions are λ1 = −1 and λ2 = 2. . x2 . l2 . this system is undetermined and ha ∞1 solutions of the type (α. . The set α (1. Proof Let us assume. xp be p eigenvectors of the endomorphism f related to the eigenvalues λ1 . . + λr lr −1 xr−1 while the second is obtained by calculating the linear mapping of the terms f (xr ) = λr xr = f (l1 x1 + l2 x2 + .10 Let f : E → E be an endomorphism and let (E. + lr −1 xr−1 . . we can express one of them as lineal combination of the others by means of l1 . 2 −λ This means that (1 − λ) (−λ) − 2 = 0 ⇒ λ2 − λ − 2 = 0. . · . + lr −1 xr−1 ) = l1 f (x1 ) + l2 f (x2 ) + . +. Thus. lr −1 scalars: xr = l1 x1 + l2 x2 + . Theorem 9. Let us choose λ1 for the homogeneous system above:   (1 − λ1 ) x + y = 0 2x + y = 0 ⇒ 2x − λ1 y = 0 2x + y = 0. . +. The first is obtained by mul- tiplying the terms by λr λr xr = λr l1 x1 + λr l2 x2 + . . by contradiction. + lr −1 f (xr−1 ) = λ1l1 x1 + λ2 l2 x2 + . −2) can be interpreted as a set. . Let x1 . As expected. The solutions of this polynomial would be the eigenvalues of this endomorphism. . λ2 . . More specifically this can be interpreted as a line within the plane (R2 ). The generic solution α (1. It follows that the eigenvectors are all linearly independent. −2α) = α (1. . . Without a loss of generality. . The theorem above says that(α (1. . . that the eigenvectors are linearly dependent. . . . Let these eigenvalues be all distinct. . that are the eigenvalues of the endomorphism. . let us assume that the first r < p eigenvectors are the minimum number of linearly dependent vectors. . λ p . −2) . + λr −1lr −1 xr−1 . (1 − λ) 1 det = 0. −2) with the parameter α ∈ R. −2) is indicated with V (λ1 ) to emphasize that it has been built after having chosen the eigenvalue λ1 . ·) be a finite- dimensional vector space having dimension n. +.

Thus, the last two equations are equal. If we subtract the second equation from the first one, we obtain

oE = l1 (λr − λ1) x1 + l2 (λr − λ2) x2 + ... + l_{r−1} (λr − λ_{r−1}) x_{r−1}.

Hence, the null vector is expressed as a linear combination of r − 1 linearly independent vectors. This may occur only if the scalars are all null:

l1 (λr − λ1) = 0
l2 (λr − λ2) = 0
...
l_{r−1} (λr − λ_{r−1}) = 0.

Since by hypothesis the eigenvalues are all distinct,

(λr − λ1) ≠ 0, (λr − λ2) ≠ 0, ..., (λr − λ_{r−1}) ≠ 0.

Hence, it must be that l1 = l2 = ... = l_{r−1} = 0. It follows that

xr = l1 x1 + l2 x2 + ... + l_{r−1} x_{r−1} = oE.

This is impossible because an eigenvector is non-null by definition. Thus, we reached a contradiction. □

Example 9.65 We know that for the endomorphism f : R² → R²

f (x, y) = (x + y, 2x)

the eigenspace associated with λ1 = −1 is V (λ1) = L ((1, −2)), i.e. the set of vectors α (1, −2) with α ∈ R. Let us determine V (λ2). For λ2 = 2 the system is

(1 − λ2) x + y = 0          −x + y = 0
2x − λ2 y = 0         ⇒     2x − 2y = 0

whose generic solution is α (1, 1) with α ∈ R. We can easily check that (1, 1) is an eigenvector associated to λ2. Theorem 9.10 states that since λ1 ≠ λ2 (they are distinct roots of the same polynomial), the corresponding eigenvectors are linearly independent. Hence, (1, −2) and (1, 1) are linearly independent.

Let us verify that these vectors are eigenvectors. Denoting by A = (1 1; 2 0) the matrix associated with the mapping,

A (1, −2) = (1·1 + 1·(−2), 2·1 + 0·(−2)) = (−1, 2) = λ1 (1, −2)

and

A (1, 1) = (1·1 + 1·1, 2·1 + 0·1) = (2, 2) = λ2 (1, 1),

with λ1 = −1 and λ2 = 2. These two vectors, separately, span two eigenspaces:

V (−1) = L ((1, −2))
V (2) = L ((1, 1)).

The geometric multiplicity of both the eigenvalues is equal to 1.

Let us graphically visualize some of these results. The vector space (V (λ2), +, ·) is the infinite line of the plane R² having the same direction as the eigenvector and its transformed (the dotted line in the figure of the original text). If we consider the eigenvector x = (1, 1) (solid line) and the transformed vector f (x) = (2, 2) (dashed line), the two vectors lie on the same line. In general, we can interpret an eigenvector x as a vector such that its transformed f (x) is parallel to x. This fact can be expressed by stating that all the vectors on the dotted line, such as (3, 3), (30, 30), and (457, 457), except for the null vector (0, 0), are eigenvectors. If, instead, we consider a vector which is not an eigenvector and its transformed, they are not parallel: for example, for v = (1, 3) the transformed is f (v) = (4, 2), and the two vectors are not parallel.
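The same verification can be reproduced numerically. The following short script is an illustrative sketch, not part of the original text; it assumes the NumPy library is available. It computes the eigenvalues and eigenvectors of the matrix associated with f (x, y) = (x + y, 2x) and checks that Ax = λx for each pair.

import numpy as np

A = np.array([[1.0, 1.0],
              [2.0, 0.0]])   # matrix of f(x, y) = (x + y, 2x)

eigenvalues, eigenvectors = np.linalg.eig(A)   # columns of "eigenvectors" are the eigenvectors

for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]
    # A v and lam * v must coincide (up to numerical precision)
    print(lam, np.allclose(A @ v, lam * v))

NumPy returns eigenvectors normalized to unit length, so they appear as scalar multiples of (1, −2) and (1, 1), in agreement with the eigenspaces found above.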

Example 9.66 Let us consider the following endomorphism f : R² → R²

f (x, y) = (x + 3y, x − y).

An eigenvalue of this endomorphism is a value λ such that f (x, y) = λ (x, y):

x + 3y = λx          (1 − λ) x + 3y = 0
x − y = λy      ⇒    x + (−1 − λ) y = 0.

The system is undetermined (the associated incomplete matrix is singular) when

(1 − λ)(−1 − λ) − 3 = λ² − 4 = 0.

This equation is verified for λ1 = 2 and λ2 = −2. In order to calculate the eigenvector associated to λ1 = 2 we write

(1 − λ1) x + 3y = 0          −x + 3y = 0
x + (−1 − λ1) y = 0     ⇒    x − 3y = 0

whose solution is α (3, 1) with α ∈ R. For the calculation of the eigenvector associated to λ2 = −2 we write

(1 − λ2) x + 3y = 0          3x + 3y = 0
x + (−1 − λ2) y = 0     ⇒    x + y = 0

whose solution is α (1, −1) with α ∈ R. Again, we have distinct eigenvalues and linearly independent eigenvectors (3, 1) and (1, −1). These two vectors, separately, span two eigenspaces:

V (2) = L ((3, 1))
V (−2) = L ((1, −1)).

The geometric multiplicity of λ1 and λ2 is 1.

Example 9.67 For the following mapping f : R² → R² defined as

f (x, y) = (3x, 3y)

let us impose that f (x, y) = λ (x, y). The following system of linear equations is imposed:

3x = λx          (3 − λ) x = 0
3y = λy     ⇒    (3 − λ) y = 0.

The determinant of the matrix associated to the system, (3 − λ)², is null only for λ = 3, which is a double eigenvalue. If we substitute λ = 3 we obtain

0x = 0
0y = 0

which is always verified. The rank of the associated matrix is ρ = 0 while the system is still of two equations in two variables. Hence, there exist ∞² solutions of the type (α, β) with α, β ∈ R, i.e. the generic solution of the system is α (1, 0) + β (0, 1). This means that every vector, except for the null vector, is an eigenvector. The eigenspace is spanned by the two vectors (1, 0) and (0, 1):

V (3) = L ((1, 0), (0, 1)),

that is the entire plane R². The geometric multiplicity of the eigenvalue λ = 3 is 2.

Example 9.68 Let us consider the following linear mapping f : R³ → R³ defined as

f (x, y, z) = (x + y, 2y + z, −4z).

Let us search the eigenvalues:

x + y = λx             (1 − λ) x + y = 0
2y + z = λy      ⇒     (2 − λ) y + z = 0
−4z = λz               (−4 − λ) z = 0

whose determinant of the associated matrix is

det ( 1 − λ     1       0   )
    (   0     2 − λ     1   )
    (   0       0    −4 − λ )  = (1 − λ)(2 − λ)(−4 − λ).

The eigenvalues are all distinct: λ1 = 1, λ2 = 2, and λ3 = −4.

Let us substitute λ1 = 1 into the system:

y = 0
y + z = 0
−5z = 0.

The equations give y = 0 and z = 0 while x remains free, i.e. the solution is α (1, 0, 0) with α ∈ R. The eigenspace is V (1) = L ((1, 0, 0)) and the geometric multiplicity of λ1 = 1 is 1.

Let us substitute λ2 = 2 into the system:

−x + y = 0
z = 0
−6z = 0

whose solution is α (1, 1, 0) with α ∈ R. The eigenspace is V (2) = L ((1, 1, 0)).

Let us substitute λ3 = −4 into the system:

5x + y = 0
6y + z = 0
0z = 0.

The last equation is always verified. By posing x = α we have y = −5α and z = 30α, i.e. the solution is α (1, −5, 30) and the eigenspace is V (−4) = L ((1, −5, 30)).
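As a numerical cross-check of this example (again an illustrative sketch relying on NumPy, not part of the original text), the eigenvalues of the matrix associated with the mapping can be computed directly:

import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, -4.0]])   # matrix of f(x, y, z) = (x + y, 2y + z, -4z)

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)                 # expected: 1, 2, -4

for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]
    print(lam, np.allclose(A @ v, lam * v))

Since the matrix is upper triangular, the eigenvalues coincide with its diagonal entries, in agreement with the factorized determinant above.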

Example 9.69 Let us analyse another case where the eigenvalues are not distinct, i.e. the following linear mapping f : R³ → R³

f (x, y, z) = (x + z, 2y, −x + 3z).

By applying the definition of eigenvalue we write

x + z = λx             (1 − λ) x + z = 0
2y = λy         ⇒      (2 − λ) y = 0
−x + 3z = λz           −x + (3 − λ) z = 0.

This system is undetermined when

det ( 1 − λ     0      1   )
    (   0     2 − λ    0   )
    (  −1       0    3 − λ )  = (2 − λ)³ = 0,

that is when λ = 2. By substituting into the system we have

−x + z = 0
0y = 0
−x + z = 0.

The second equation is always verified while the first and the third say that x = z. Equivalently, we can see that this system has rank ρ = 1 and thus ∞² solutions. If we pose x = α and y = β, the generic solution is (α, β, α) = α (1, 0, 1) + β (0, 1, 0). The eigenspace is thus spanned by the vectors (1, 0, 1) and (0, 1, 0), and can be interpreted as a plane of the space. Although λ = 2 is a triple eigenvalue, at most two linearly independent eigenvectors can be found, i.e. any three eigenvectors of this mapping are linearly dependent.

Example 9.70 Finally, let us calculate eigenvalues, eigenvectors, and eigenspace for the following mapping f : R³ → R³ defined as

f (x, y, z) = (5x, 5y, 5z).

From the system of linear equations

5x = λx             (5 − λ) x = 0
5y = λy       ⇒     (5 − λ) y = 0
5z = λz             (5 − λ) z = 0

we can see that the determinant associated to the matrix is (5 − λ)³ = 0. It follows that the only eigenvalue is λ = 5 and it is triple.

By substituting the eigenvalue into the system we have

0x = 0
0y = 0
0z = 0

whose rank is ρ = 0. We have ∞³ solutions of the type (α, β, γ) with α, β, γ ∈ R, that is

α (1, 0, 0) + β (0, 1, 0) + γ (0, 0, 1).

This means that the eigenspace is the entire space R³ and that every vector of the space, except for the null vector, is an eigenvector. The span of the resulting eigenspace is

V (5) = L ((1, 0, 0), (0, 1, 0), (0, 0, 1)).

The geometric multiplicity of the eigenvalue λ = 5, as well as its algebraic multiplicity, is obviously 3.

9.4.1 Method for Determining Eigenvalues and Eigenvectors

This section conceptualizes in a general fashion the method for determining eigenvalues for any Rⁿ → Rⁿ endomorphism. Let f : E → E be an endomorphism defined over K and let (E, +, ·) be a finite-dimensional vector space having dimension n. A matrix A ∈ R_{n,n} related to a basis B = {e1, e2, ..., en} is associated to the endomorphism. If the matrix A is

    ( a11  a12  ...  a1n )
A = ( a21  a22  ...  a2n )
    ( ...             ... )
    ( an1  an2  ...  ann )

then

f (e1) = a11 e1 + a12 e2 + ... + a1n en
f (e2) = a21 e1 + a22 e2 + ... + a2n en
...
f (en) = an1 e1 + an2 e2 + ... + ann en.

An eigenvector x ∈ E \ {oE} can be expressed as a linear combination of the vectors of the basis:

x = x1 e1 + x2 e2 + ... + xn en.

We can then calculate the linear mapping of this vector in two ways. On the one hand,

f (x) = λx = λ (x1 e1 + x2 e2 + ... + xn en) = x1 λ e1 + x2 λ e2 + ... + xn λ en,

and on the other hand, by linearity,

f (x) = f (x1 e1 + x2 e2 + ... + xn en) = x1 f (e1) + x2 f (e2) + ... + xn f (en).

By substituting the expressions of f (e1), f (e2), ..., f (en) indicated above into the last equation, and by considering that two vectors expressed in the same basis have the same components, we obtain

a11 x1 + a12 x2 + ... + a1n xn = λ x1
a21 x1 + a22 x2 + ... + a2n xn = λ x2
...
an1 x1 + an2 x2 + ... + ann xn = λ xn

⇒

(a11 − λ) x1 + a12 x2 + ... + a1n xn = 0
a21 x1 + (a22 − λ) x2 + ... + a2n xn = 0
...
an1 x1 + an2 x2 + ... + (ann − λ) xn = 0.

This is a homogeneous system of linear equations in the components of the eigenvectors related to the eigenvalue λ. Since it is homogeneous, this system always has at least one solution, namely the null solution. The null solution is not relevant in this context as eigenvectors cannot be null. If this system has infinitely many solutions, then we can find eigenvectors and eigenvalues.

In order to have more than one solution, the determinant of the matrix associated to this system must be null:

det ( a11 − λ    a12    ...    a1n   )
    (   a21    a22 − λ  ...    a2n   )
    (   ...                    ...   )
    (   an1      an2    ...  ann − λ )  = det (A − Iλ) = 0.

Equivalently to what was written above, a vector x is an eigenvector if Ax = λx. Thus, when the eigenvalue has been determined, the corresponding eigenvector is found by solving the homogeneous system of linear equations (A − Iλ) x = o, where the eigenvalue λ is obviously a constant.

If the calculations are performed, an order n polynomial in the variable λ is obtained:

p (λ) = (−1)^n λ^n + (−1)^(n−1) k_{n−1} λ^(n−1) + ... + (−1) k1 λ + k0.

This polynomial is said characteristic polynomial of the endomorphism f. In order to find the eigenvalues we need to find those values of λ such that p (λ) = 0 and λ ∈ K. As a further remark, since the characteristic polynomial has order n ≥ 1, it has at least one (possibly complex) root. However, although the equation p (λ) = 0 has n roots, some of the values of λ satisfying it can be ∉ K and thus not eigenvalues. This can be the case for an endomorphism defined over the field R: some roots of the characteristic polynomial can be complex and thus not be eigenvalues.

As can be observed from the examples and on the basis of vector space theory, if the system (A − Iλ) x = o has ∞^k solutions, the geometric multiplicity of the eigenvalue λ is γm = k. This is the number of linearly independent eigenvectors associated to the eigenvalue λ, that is the dimension of the associated eigenspace. This statement can be reformulated by considering that (A − Iλ) represents a matrix associated to an endomorphism Rⁿ → Rⁿ. The kernel of this endomorphism is the solution of the system (A − Iλ) x = o. Hence, the geometric multiplicity γm is the dimension of the kernel. If we indicate with ρ the rank of (A − Iλ), then for a fixed eigenvalue λ it follows that γm = n − ρ.
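The whole procedure of this section can be summarized in a few lines of code. The sketch below is an illustration written for this text (it assumes NumPy; the function name is chosen here and does not come from the book): it builds the characteristic polynomial of a matrix, extracts its real roots, and computes the geometric multiplicity of each eigenvalue as γm = n − ρ, where ρ is the rank of A − λI.

import numpy as np

def eigen_summary(A, tol=1e-9):
    """Return (eigenvalue, algebraic multiplicity, geometric multiplicity) triples."""
    n = A.shape[0]
    coeffs = np.poly(A)                                   # characteristic polynomial coefficients
    roots = np.roots(coeffs)
    real_roots = roots[np.abs(roots.imag) < tol].real     # keep only the eigenvalues in R
    summary = []
    for lam in np.unique(np.round(real_roots, 6)):
        algebraic = int(np.sum(np.isclose(real_roots, lam)))
        rank = np.linalg.matrix_rank(A - lam * np.eye(n))
        geometric = n - rank                              # gamma_m = n - rho
        summary.append((lam, algebraic, geometric))
    return summary

print(eigen_summary(np.array([[1.0, 1.0], [2.0, 0.0]])))  # eigenvalues -1 and 2 of Example 9.63

For the matrix of Example 9.63 the sketch reports the eigenvalues −1 and 2, each with algebraic and geometric multiplicity 1.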

Definition 9.5 Let f : E → E be an endomorphism and let p (λ) be the order n characteristic polynomial related to the endomorphism. Let λ0 be a zero of the characteristic polynomial. The zero λ0 is said to have algebraic multiplicity r ≤ n if p (λ) is divisible by (λ − λ0)^r and not by (λ − λ0)^(r+1). The algebraic multiplicity is indicated with αm.

Proposition 9.20 Let f : E → E be an endomorphism and let p (λ) be the order n characteristic polynomial related to the endomorphism. Let γm and αm be the geometric and algebraic multiplicity, respectively, of an eigenvalue of the endomorphism. It follows that

1 ≤ γm ≤ αm ≤ n.

Example 9.71 Let us consider an endomorphism f : R² → R² over the field R represented by

f (x, y) = (3x + 2y, 2x + y)

corresponding to the following matrix

A = ( 3  2 )
    ( 2  1 ).

Let us compute the eigenvalues. At first let us calculate the characteristic polynomial:

det ( 3 − λ     2   )
    (   2     1 − λ )  = (3 − λ)(1 − λ) − 4 = 3 − 3λ − λ + λ² − 4 = λ² − 4λ − 1.

The roots of the polynomial are λ1 = 2 + √5 and λ2 = 2 − √5. They are both eigenvalues. In order to find the eigenvectors we need to solve the two following systems of linear equations:

( 3 − λ1     2    ) ( x1 )   ( 0 )
(   2      1 − λ1 ) ( x2 ) = ( 0 )

and

( 3 − λ2     2    ) ( x1 )   ( 0 )
(   2      1 − λ2 ) ( x2 ) = ( 0 ).

Example 9.72 Let us consider now the endomorphism f : R³ → R³ defined over the field R and associated to the following matrix

    (  0   1   0 )
A = ( −2  −3  21 )
    (  0   0   0 ).

In order to find the eigenvalues we have to calculate

det (  −λ       1      0 )
    (  −2    −3 − λ   21 )
    (   0       0     −λ )  = −λ (λ² + 3λ + 2).

Hence, the eigenvalues are λ1 = 0, λ2 = −2, and λ3 = −1. By substituting the eigenvalues into the matrix (A − Iλ) we obtain three homogeneous systems of linear equations whose solutions are the eigenvectors.

Definition 9.21 Let f : E → E be an endomorphism and λ1, λ2, ..., λn the eigenvalues associated to it. The set of the eigenvalues Sp = {λ1, λ2, ..., λn} is said spectrum of the endomorphism while

sr = max_i |λi|

is said spectral radius.

Theorem 9.11 (Gerschgorin's Theorem) Let A ∈ C_{n,n} be a matrix associated to an endomorphism f : Cⁿ → Cⁿ. Let us consider the following circular sets in the complex plane:

Ci = { z ∈ C : |z − a_{i,i}| ≤ Σ_{j=1, j≠i}^n |a_{i,j}| }
Di = { z ∈ C : |z − a_{i,i}| ≤ Σ_{j=1, j≠i}^n |a_{j,i}| }.

For each eigenvalue λ it follows that

λ ∈ ( ∪_{i=1}^n Ci ) ∩ ( ∪_{i=1}^n Di ).

Although the proof and the details of Gerschgorin's Theorem are out of the scope of this book, it is worthwhile picturing the meaning of this result. This theorem, for a given endomorphism and thus its associated matrix, allows to make an estimation of the region of the complex plane where the eigenvalues will be located. Obviously, since real numbers are a special case of complex numbers, this result is valid also for matrices ∈ R_{n,n}.
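The Gerschgorin discs are straightforward to compute. The following fragment is an illustrative sketch using NumPy, not taken from the book: it prints the centre and the row/column radii of each disc and can be used to bound the spectrum of a given matrix.

import numpy as np

def gerschgorin_discs(A):
    """Return, for each index i, (centre, row radius, column radius)."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    discs = []
    for i in range(n):
        centre = A[i, i]
        row_radius = np.sum(np.abs(A[i, :])) - np.abs(A[i, i])    # radius of C_i
        col_radius = np.sum(np.abs(A[:, i])) - np.abs(A[i, i])    # radius of D_i
        discs.append((centre, row_radius, col_radius))
    return discs

A = np.array([[0, 1, 0],
              [-2, -3, 21],
              [0, 0, 0]])       # the matrix of Example 9.72
for centre, r_row, r_col in gerschgorin_discs(A):
    print("centre", centre, "row radius", r_row, "column radius", r_col)

Every eigenvalue lies in the union of the row discs and in the union of the column discs; for the matrix of Example 9.72 this is consistent with the eigenvalues 0, −1 and −2 found above.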

9.5 Diagonal Matrices

As mentioned in Chap. 2, a diagonal matrix is a matrix whose extra-diagonal elements are all zeros. Hence, a diagonal matrix is of the form

( γ1   0   0  ...   0 )
(  0  γ2   0  ...   0 )
(  0   0  γ3  ...   0 )
( ...              ... )
(  0   0   0  ...  γn ).

Definition 9.22 Let f : E → E be an endomorphism. The endomorphism is diagonalizable if there exists a basis spanning (E, +, ·) with respect to which the matrix defining this mapping is diagonal.

If we remember that a basis spanning (E, +, ·) corresponds to a transformation matrix P whose columns are the vectors composing the basis, we can give an equivalent definition from the perspective of matrices.

Definition 9.23 A square matrix A is diagonalizable if it is similar to a diagonal matrix: ∃ a non-singular matrix P such that D = P⁻¹AP, where D is a diagonal matrix and P is said transformation matrix.

Theorem 9.12 Let A and A′ be two similar matrices. These two matrices have the same characteristic polynomial: det (A′ − Iλ) = det (A − Iλ).

Proof If the two matrices are similar, then ∃ a non-singular matrix P such that A′ = P⁻¹AP. Hence, considering that the identity matrix, being the neutral element of the product between matrices, is similar to itself,

det (A′ − Iλ) = det (P⁻¹AP − Iλ) = det (P⁻¹AP − P⁻¹IP λ) = det (P⁻¹ (A − Iλ) P) =
= (1 / det (P)) det (A − Iλ) det (P) = det (A − Iλ). □

This means that two similar matrices have also the same eigenvalues. We knew this result already since two similar matrices have the same determinant. Hence, for a diagonalizable matrix A, there exists a diagonal matrix D with the same eigenvalues.

Theorem 9.13 Let f : E → E be a finite-dimensional endomorphism associated to a matrix A having order n. Let λ1, λ2, ..., λn be a set of scalars and x1, x2, ..., xn be n vectors having n dimensions. Let P be the n × n matrix whose columns are the vectors x1, x2, ..., xn:

P = ( x1  x2  ...  xn ),

and let D be a diagonal matrix whose diagonal elements are the scalars λ1, λ2, ..., λn:

    ( λ1   0  ...   0 )
D = (  0  λ2  ...   0 )
    ( ...          ... )
    (  0   0  ...  λn ).

It follows that if A and D are similar, i.e. D = P⁻¹AP, then λ1, λ2, ..., λn are the eigenvalues of the mapping and x1, x2, ..., xn are the corresponding eigenvectors.

Proof If A and D are similar then

D = P⁻¹AP ⇒ PD = AP.

It follows that

AP = A ( x1  x2  ...  xn ) = ( Ax1  Ax2  ...  Axn )

and

PD = ( x1  x2  ...  xn ) D = ( λ1 x1  λ2 x2  ...  λn xn ).

Since AP = PD, it follows that

( Ax1  Ax2  ...  Axn ) = ( λ1 x1  λ2 x2  ...  λn xn ),

that is

Ax1 = λ1 x1
Ax2 = λ2 x2
...
Axn = λn xn.

This means that λ1, λ2, ..., λn are the eigenvalues of the mapping and x1, x2, ..., xn are the corresponding eigenvectors. □

The result above can be rephrased in the following theorem.

Theorem 9.14 Let f : E → E be an endomorphism having n dimensions (it is thus finite-dimensional) associated to the matrix A. If the mapping is diagonalizable then D = P⁻¹AP where

D = diag (λ1, λ2, ..., λn)

with λ1, λ2, ..., λn eigenvalues of the mapping (not necessarily distinct) and

P = ( x1  x2  ...  xn )

with x1, x2, ..., xn eigenvectors corresponding to the eigenvalues λ1, λ2, ..., λn. In other words, if the matrix (and thus the endomorphism) is diagonalizable, the columns of the matrix P are the eigenvectors and the diagonal matrix D is calculated as D = P⁻¹AP. The transformation matrix associated to the basis transformation is the matrix P whose columns are the eigenvectors of the mapping.

Proposition 9.6 Let f : E → E be an endomorphism defined by an order n matrix A. If the matrix is diagonalizable, then its eigenvectors are linearly independent.

Proof If the matrix (and thus the endomorphism) is diagonalizable, then there exists a non-singular matrix P such that D = P⁻¹AP, with D diagonal. For Theorem 9.13, the columns of P are the eigenvectors of the mapping. Since the matrix P is invertible, it is non-singular. Thus, the eigenvectors, i.e. the columns of P, are linearly independent. □

The diagonalization can be seen from two equivalent perspectives. According to the first one, the diagonalization is a matrix transformation that aims at generating a diagonal matrix. According to the second one, the diagonalization is a basis transformation into a reference system where the contribution of each component is only along one axis. As a combination of these two perspectives, if the matrix A is seen as the matrix associated to a system of linear equations, the diagonalization transforms the original system into an equivalent system (i.e. having the same solutions) where all the variables are uncoupled, i.e. every equation is in only one variable.

The following theorem describes the conditions under which an endomorphism is diagonalizable.

Theorem 9.15 (Diagonalization Theorem) Let f : E → E be an endomorphism defined by a matrix A. The matrix (and thus the endomorphism) is diagonalizable if and only if one of the following conditions occurs:
• all the eigenvalues are distinct;
• the algebraic multiplicity of each eigenvalue coincides with its geometric multiplicity.

The first condition can be easily justified by considering that the eigenvectors associated to distinct eigenvalues are linearly independent. If the n eigenvectors are all linearly independent, they can be placed as columns of the matrix P, which would result non-singular. It would easily follow that D = P⁻¹AP is diagonal and A diagonalizable.

Example 9.73 Let us consider an endomorphism associated to the following matrix

    ( 1   2  0 )
A = ( 0   3  0 )
    ( 2  −4  2 ).

Let us find the eigenvalues:

det ( 1 − λ     2      0   )
    (   0     3 − λ    0   )
    (   2      −4    2 − λ )  = (1 − λ)(2 − λ)(3 − λ).

The roots of the characteristic polynomial are λ1 = 3, λ2 = 2, and λ3 = 1. Since the eigenvalues are all distinct, the matrix is diagonalizable. In order to find the corresponding eigenvectors we need to solve the following systems of linear equations. For λ1 = 3,

( −2   2   0 ) ( x1 )   ( 0 )
(  0   0   0 ) ( x2 ) = ( 0 )
(  2  −4  −1 ) ( x3 )   ( 0 )

whose ∞¹ solutions are proportional to (−1, −1, 2).

For λ2 = 2,

( −1   2  0 ) ( x1 )   ( 0 )
(  0   1  0 ) ( x2 ) = ( 0 )
(  2  −4  0 ) ( x3 )   ( 0 )

whose ∞¹ solutions are proportional to (0, 0, 1). For λ3 = 1,

( 0   2  0 ) ( x1 )   ( 0 )
( 0   2  0 ) ( x2 ) = ( 0 )
( 2  −4  1 ) ( x3 )   ( 0 )

whose ∞¹ solutions are proportional to (−1, 0, 2). Hence, the transformation matrix P is given by

    ( −1  0  −1 )
P = ( −1  0   0 )
    (  2  1   2 ).

The inverse of this matrix is

      (  0  −1  0 )
P⁻¹ = (  2   0  1 )
      ( −1   1  0 ).

It can be observed that

P⁻¹AP = (  0  −1  0 ) ( 1   2  0 ) ( −1  0  −1 )   ( 3  0  0 )
        (  2   0  1 ) ( 0   3  0 ) ( −1  0   0 ) = ( 0  2  0 ) = D.
        ( −1   1  0 ) ( 2  −4  2 ) (  2  1   2 )   ( 0  0  1 )
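The computation of Example 9.73 can be verified numerically. The snippet below is an illustrative sketch only (assuming NumPy): it builds P from the eigenvectors found above and checks that P⁻¹AP is the expected diagonal matrix.

import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 0.0],
              [2.0, -4.0, 2.0]])

# eigenvectors for the eigenvalues 3, 2 and 1, placed as columns of P
P = np.array([[-1.0, 0.0, -1.0],
              [-1.0, 0.0, 0.0],
              [2.0, 1.0, 2.0]])

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))     # expected: diag(3, 2, 1)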

Example 9.74 Let us consider an endomorphism associated to the following matrix

    ( −8  18  2 )
A = ( −3   7  1 )
    (  0   0  1 ).

The characteristic polynomial

p (λ) = det (Iλ − A) = (2 + λ)(λ − 1)²

has roots λ1 = −2 with algebraic multiplicity 1 and λ2 = 1 with algebraic multiplicity 2.

In order to find the eigenvector associated to λ1 = −2 we need to solve the system of linear equations

( −6  18  2 ) ( x1 )   ( 0 )
( −3   9  1 ) ( x2 ) = ( 0 )
(  0   0  3 ) ( x3 )   ( 0 ).

The ∞¹ solutions of this system are proportional to (3, 1, 0). Hence, the eigenvalue λ1 has geometric multiplicity 1.

In order to find the eigenvectors associated to λ2 = 1 we need to solve the system of linear equations

( −9  18  2 ) ( x1 )   ( 0 )
( −3   6  1 ) ( x2 ) = ( 0 )
(  0   0  0 ) ( x3 )   ( 0 ).

The ∞¹ solutions of this system are proportional to (2, 1, 0). Hence, the eigenvalue λ2 has geometric multiplicity 1. Since for λ2 the algebraic and geometric multiplicities are not the same, the endomorphism is not diagonalizable.

Example 9.75 Let us finally consider an endomorphism associated to the following matrix

    ( 8  −18   0 )
A = ( 3   −7   0 )
    ( 0    0  −1 ).

The characteristic polynomial is p (λ) = (1 + λ)² (λ − 2). Hence the eigenvalues are λ1 = 2 with algebraic multiplicity 1 and λ2 = −1 with algebraic multiplicity 2.

In order to find the eigenvectors associated to λ1 = 2 we need to solve the system of linear equations

( 6  −18   0 ) ( x1 )   ( 0 )
( 3   −9   0 ) ( x2 ) = ( 0 )
( 0    0  −3 ) ( x3 )   ( 0 )

whose ∞¹ solutions are proportional to (3, 1, 0). Thus, the geometric multiplicity of λ1 is 1.

In order to find the eigenvectors associated to λ2 = −1 we need to solve the system of linear equations

( 9  −18  0 ) ( x1 )   ( 0 )
( 3   −6  0 ) ( x2 ) = ( 0 )
( 0    0  0 ) ( x3 )   ( 0 ).

Since the first two rows are linearly dependent and the third is null, the rank of the system is 1 and the system has ∞² solutions. The generic solution depends on two parameters α, β ∈ R and is (2α, α, β). The eigenvectors can be written as (2, 1, 0) and (0, 0, 1). Hence, the geometric multiplicity of λ2 is 2. Thus, since the algebraic multiplicity of each eigenvalue coincides with its geometric multiplicity, this endomorphism is diagonalizable. A diagonal matrix of the endomorphism is

    ( 2   0   0 )
D = ( 0  −1   0 )
    ( 0   0  −1 )

and the corresponding transformation matrix is

    ( 3  2  0 )
P = ( 1  1  0 )
    ( 0  0  1 ).
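The criterion of the Diagonalization Theorem can be checked mechanically: for each eigenvalue one compares the algebraic multiplicity with n − rank(A − λI). The sketch below is illustrative only (it assumes NumPy and is not part of the original text); it applies the test to the matrices of Examples 9.74 and 9.75.

import numpy as np

def is_diagonalizable(A, tol=1e-9):
    n = A.shape[0]
    eigenvalues = np.linalg.eigvals(A)
    for lam in np.unique(np.round(eigenvalues, 6)):
        algebraic = int(np.sum(np.isclose(eigenvalues, lam, atol=1e-6)))
        geometric = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
        if algebraic != geometric:
            return False
    return True

A74 = np.array([[-8.0, 18.0, 2.0], [-3.0, 7.0, 1.0], [0.0, 0.0, 1.0]])
A75 = np.array([[8.0, -18.0, 0.0], [3.0, -7.0, 0.0], [0.0, 0.0, -1.0]])
print(is_diagonalizable(A74))   # expected: False
print(is_diagonalizable(A75))   # expected: True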

Proposition 9.7 Let A ∈ R_{n,n}. If A is symmetric then it is diagonalizable. For the proof see [20].

9.6 Power Method

When a linear mapping is highly multivariate, the matrix associated to it is characterized by many columns (as many as the dimensions of the domain). In the case of an endomorphism, the number of variables is equal to the order of the associated matrix. Under these conditions, the calculation of the roots of the characteristic polynomial can be computationally onerous. This section describes an iterative method for determining one eigenvalue without calculating the roots of the characteristic polynomial.

Definition 9.24 Let λ1, λ2, ..., λn be the eigenvalues associated to an endomorphism f : Rⁿ → Rⁿ. The eigenvalue λ1 is said dominant eigenvalue if |λ1| > |λi| for i = 2, ..., n. The eigenvectors associated to the eigenvalue λ1 are said dominant eigenvectors.

The Power Method is an iterative method easily allowing the calculation of the dominant eigenvalue for endomorphisms having a dominant eigenvalue, see [21]. Let us indicate with A the matrix associated to the endomorphism. The method processes an initial eigenvector guess x0 and is described in Algorithm 6.

Algorithm 6 Power Method
x1 = A x0
x2 = A x1 = A² x0
...
xk = A x_{k−1} = A^k x0

Example 9.76 Let us consider an endomorphism f : R² → R² represented by the following matrix

    ( 2  −12 )
A = ( 1   −5 ).

The characteristic polynomial is

det ( 2 − λ    −12   )
    (   1    −5 − λ  )  = (2 − λ)(−5 − λ) + 12 = λ² + 3λ + 2

whose roots are λ1 = −1 and λ2 = −2. Hence, by the definition of dominant eigenvalue, λ2 is the dominant eigenvalue. The corresponding eigenvectors are derived from

( 4  −12 ) ( x1 )   ( 0 )
( 1   −3 ) ( x2 ) = ( 0 )

whose ∞¹ solutions are proportional to (3, 1).

Let us reach the same result by means of the Power Method, where the initial guess is x0 = (1, 1):

x1 = A x0 = (−10, −4) = −4 (2.5, 1)
x2 = A x1 = (28, 10) = 10 (2.8, 1)
x3 = A x2 = (−64, −22) = −22 (2.91, 1)
x4 = A x3 = (136, 46) = 46 (2.96, 1)
x5 = A x4 = (−280, −94) = −94 (2.98, 1)
x6 = A x5 = (568, 190) = 190 (2.99, 1).

We have found an approximation of the dominant eigenvector. In order to find the corresponding eigenvalue we need to introduce the following theorem.

Theorem 9.16 (Rayleigh's Theorem) Let f : Rⁿ → Rⁿ be an endomorphism and x one of its eigenvectors. The corresponding eigenvalue λ is given by

λ = (Ax · x) / (x · x).

Proof Since x is an eigenvector, then Ax = λx. Hence,

(Ax · x) / (x · x) = (λx · x) / (x · x) = λ. □

In the example above, the dominant eigenvalue is therefore approximated by

λ = (A (2.99, 1) · (2.99, 1)) / ((2.99, 1) · (2.99, 1)) ≈ −20.01 / 9.94 ≈ −2.

We have found an approximation of the dominant eigenvalue.

Since iterative multiplications can produce large numbers, the solutions are usually normalized with respect to the highest number in the vector. For example, if we consider the matrix

( 2  4  5 )
( 3  1  2 )
( 2  2  2 )

and apply the Power Method with an initial guess (1, 1, 1), we obtain x1 = (11, 6, 6). Instead of carrying this vector, we can divide the vector elements by 11 and use the modified x1 = (1, 0.54, 0.54) for the following iteration. This normalization is named scaling, and the Power Method that applies the scaling at each iteration is said Power Method with scaling.
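A compact implementation of the Power Method with scaling is sketched below (Python with NumPy; this is an illustrative implementation written for this text, not code from the book, and the function name is chosen here). At every iteration the current vector is multiplied by A and normalized by its entry of largest absolute value; the Rayleigh quotient of the last iterate approximates the dominant eigenvalue (Theorem 9.16).

import numpy as np

def power_method(A, x0, iterations=20):
    """Power Method with scaling; returns (approximate eigenvalue, approximate eigenvector)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iterations):
        x = A @ x
        x = x / np.max(np.abs(x))           # scaling step
    eigenvalue = (A @ x) @ x / (x @ x)       # Rayleigh quotient
    return eigenvalue, x

A = np.array([[2.0, -12.0],
              [1.0, -5.0]])
lam, v = power_method(A, [1.0, 1.0])
print(lam)    # approximately -2, the dominant eigenvalue
print(v)      # proportional to (3, 1), the dominant eigenvector

With the matrix of Example 9.76 and the same initial guess (1, 1), the sketch reproduces the dominant eigenvalue −2 and a vector proportional to (3, 1).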

Let us now give a rigorous proof of the convergence of the Power Method.

Theorem 9.17 Let f : Rⁿ → Rⁿ be a diagonalizable endomorphism having a dominant eigenvalue and let A be the matrix associated to it. It follows that ∃ a non-null vector x0 such that, for k = 1, 2, ..., the vector xk = A^k x0 approaches the dominant eigenvector.

Proof Since A is diagonalizable, for Theorem 9.14 a basis of eigenvectors x1, x2, ..., xn of Rⁿ exists. These eigenvectors are associated to the corresponding eigenvalues λ1, λ2, ..., λn. Since these eigenvectors compose a basis, they are linearly independent. Without a loss of generality, let us assume that λ1 is the dominant eigenvalue and x1 the corresponding dominant eigenvector. We can choose an initial guess x0 such that

x0 = c1 x1 + c2 x2 + ... + cn xn

where the scalars c1, c2, ..., cn are such that c1 ≠ 0. Let us multiply the terms in this equation by A:

A x0 = A (c1 x1 + c2 x2 + ... + cn xn) = c1 A x1 + c2 A x2 + ... + cn A xn = c1 λ1 x1 + c2 λ2 x2 + ... + cn λn xn.

If we multiply the terms of the equation k times by the matrix A we obtain

A^k x0 = c1 λ1^k x1 + c2 λ2^k x2 + ... + cn λn^k xn,

that leads to

A^k x0 = λ1^k ( c1 x1 + c2 (λ2/λ1)^k x2 + ... + cn (λn/λ1)^k xn ).

Since λ1 is the largest eigenvalue in absolute value, the fractions

λ2/λ1, λ3/λ1, ..., λn/λ1

are all smaller than 1 in absolute value. Hence, the fractions

(λ2/λ1)^k, (λ3/λ1)^k, ..., (λn/λ1)^k

approach 0 as k approaches infinity. It follows that

A^k x0 = λ1^k ( c1 x1 + c2 (λ2/λ1)^k x2 + ... + cn (λn/λ1)^k xn )  ⇒  A^k x0 ≈ λ1^k c1 x1

with c1 ≠ 0. This means that as k grows the method converges to a vector that is proportional to the dominant eigenvector. □

9.7 Exercises

9.1 Determine whether or not the following mapping f : R³ → R² is linear ∀x, y, z:

f (x, y, z) = (4x + 6y − 2z, 7x − 3y + 2z).

9.2 Find kernel, image, and discuss injection and surjection of the following mapping f : R³ → R² defined by the bases B_{R³} = {e1, e2, e3} and B_{R²} = {e1, e2} as well as the matrix

A = ( 5  −6   8 )
    ( 7  13  −2 ).

9.3 Find kernel, image, and discuss injection and surjection of the following mapping f : R³ → R² defined by the bases B_{R³} = {e1, e2, e3} and B_{R²} = {e1, e2} as well as the matrix

e3 as well as the matrix ⎛ ⎞ 2 −1 1 A = ⎝ 2 3 −2 ⎠ . image. e2 .4 Find kernel. and discuss injection and surjection of the following  map-  ping f : R2 → R3 defined by the bases BR2 = {e1 . . 2 0 −3 9. 4 −1 −12 A= . image. . e3 as well as the matrix ⎛ ⎞ −2 −1 A = ⎝ 3 5 ⎠.7 Find eigenvalues and eigenvectors of the following mapping f : R3 → R3 defined over the field R . e2 } and BR3 = e1 . e2 . 5 1 2 9. e3 } and BR3 = e1 . and discuss injection and surjection of the following  map- ping f : R3 → R3 defined by the bases BR3 = {e1 .6 Find kernel.5 Find kernel. image. 6 5 −3 9. e2 . e3 } and BR3 = e1 . e3 as well as the matrix ⎛ ⎞ 14 −12 −12 A = ⎝ −2 4 −3 ⎠ . e2 . 5 5 9. . e2 . and discuss injection and surjection of the following  map- ping f : R3 → R3 defined by the bases BR3 = {e1 .

21 −2 −3 9. 4 3 −3 9. ⎛ ⎞ −9 k 3 A = ⎝ 0 k 0 ⎠. 1 2 −1 9. apply the Power Method to determine the dominant eigen- vector and the Rayleigh’s Theorem to find the corresponding eigenvalue. . apply the Power Method to determine the dominant eigen- vector and the Rayleigh’s Theorem to find the corresponding eigenvalue. if possible.13 Verify that the endomorphism associated to the matrix A is has a dominant eigenvalue. 3 0 −1 9.14 Verify that the endomorphism associated to the matrix A is has a dominant eigenvalue. the endomorphism associated to the following matrix ⎛ ⎞ 4 1 21 A = ⎝ 1 3 −2 ⎠ .346 9 Linear Mappings ⎛ ⎞ 2 −3 0 A = ⎝ −1 0 0 ⎠ . If possible. if possible.11 Diagonalize. the endomorphism associated to the following matrix ⎛ ⎞ 1 2 1 A = ⎝0 2 0⎠. 555 9. 1 −2 1 9. ⎛ ⎞ −1 −6 0 A = ⎝ 2 7 0 ⎠. If possible. the endomorphism associated to the following matrix ⎛ ⎞ 820 A = ⎝4 1 0⎠.12 Detect the values of k ∈ R such that the endomorphism associated to the matrix A is diagonalizable.10 Diagonalize. −1 1 1 9.8 Find eigenvalues and eigenvectors of the following mapping f : R3 → R3 defined over the field R ⎛ ⎞ 5 −1 0 A = ⎝ 0 3 −2 ⎠ .9 Diagonalize. if possible.

7 Exercises 347 ⎛ ⎞ 1 2 0 A = ⎝ 0 −7 1 ⎠ . 0 0 −2 . ⎛ ⎞ 1 1 0 A = ⎝ 3 −1 0 ⎠ .9.15 Try to apply the Power Method for four iterations to the following matrix notwithstanding the fact that this matrix does not have a dominant eigenvalue. 0 0 0 9.

However. this chapter gives some basics of complexity theory and discrete mathematics and will attempt to answer to the question: “What is a hard problem?” We already know that the hardness of a problem strictly depends on the solver. etc. the contents of this chapter are related to algebra as they are ancillary concepts that help (and in some cases allow) the under- standing of algebra. method. For example. Neri. such as sugar.g. we will refer to the harness of a problem for a computational machine. butter. In a human context. each cooking recipe is an algorithm as it provides for a number of instructions: ingredients. DOI 10. © Springer International Publishing Switzerland 2016 349 F. Due to the physics of the machines. this chapter offers a set of math- ematical and computational instruments that will allow us to introduce several con- cepts in the following chapters. Since complex problems can be decomposed into a sequence of decision problems. The recipe of a chocolate cake is an algorithm that allows in a finite number of steps to transforms a set of inputs. one by one.1007/978-3-319-40341-0_10 . Linear Algebra for Computational Sciences and Engineering. 10.Chapter 10 An Introduction to Computational Complexity This chapter is not strictly about algebra. It can be proved that all the problems can be decomposed as a sequence of decision problems.1 Complexity of Algorithms and Big-O Notation A decision problem is a question whose answer can be either “yes” or “no”. a computing machine can tackle a complex problem by solving. An algorithm is a finite sequence of instructions that allows to solve a given problem. into a delicious dessert. More specifically. the same problem. e. not every problem can be solved by an algorithm. learning how to play guitar. cocoa. measures. oven temperature. In this chapter we will refer to problems that usually humans cannot solve (not in a reasonable time at least). each decision problem that compose it. computers ultimately can solve only decision problems. Hence. However. can be fairly easy for someone and extremely hard for someone else. Moreover. flour.

unless otherwise specified..350 10 An Introduction to Computational Complexity Definition 10. formulated by the British mathematician Alan Turing in 1936. 100 . it can also be seen as a natural number.1 The vast majority of the problems are uncomputable. ∀x ∈ Alg : x ∈ N.. can be interpreted as a real number between 0 and 1. . .011 . This function can be expressed by means of a table with infinite columns. Since an input can be seen as a binary string. As stated above. Proof Let Alg and Pr o be the sets of all possible algorithms and problems. they are represented within a machine as a sequence of binary numbers (binary string). . equivalently. whether the program will stop running or continue to run forever. which will be a positive integer. 1}. A example of undecidable problem is the so-called problem “halting problem”. . The elements of the set Pr o are problems. A problem is said decidable (or computable) if an algorithm that solves the prob- lem exists or. 1}. This means that for each instance of the input the corresponding problem solution is returned. The number can be converted into base 10 number. every decision problem can be represented as a real number: ∀y ∈ Pr o : y ∈ R.. 0 1 1 . Hence every algorithm can be represented as a natural number. respec- tively. Theorem 10. . . Throughout this chapter.e. i. if the problem can be solved in finite (even if enormous) amount of time. An infinite binary string can be interpreted as a real number. . 0 . As such. The function p is then p : N → {0. A problem is said undecidable (or uncomputable) oth- erwise. a problem can be seen as a function p that processes some inputs to ultimately give an answer “yes” or “no”. Obviously {“yes”. In other words. . The halting problem is the problem of determining. The second line of this table is an infinite binary string. the binary string .. 0 1 2 . Hence. The elements of of the set Alg are algorithms. set of instructions that run within a computer. 0 . At most one action is allowed for any given situation. We may think of them as com- puter programs. . We may think that this sequence is a binary number itself. from a description of an arbitrary computer program and an input. In other words.1 A Turing Machine is a conceptual machine able to to write and read on a tape of infinite length and to execute a set of definite operations. each problem can be represented as a sequence of decision problems. If we put a decimal (strictly speaking binary) point right before the first digit. “no” } can be modelled as {0. every time we will refer to a machine and to an algorithm running within it we will refer to the Turing Machine. These operations compose the algorithms.

Hence. In this book we will focus on the time complexity. the number of the algorithms is much smaller that the number of problems. the efficiency of an algorithm can be seen as a function of the length of the input. it is fairly easy to understand that a very accurate algorithm that requires 100 years to return the solution cannot be used in a . the minimum distance from surrounding buildings. Here we enter the field of analysis of algorithms and complexity theory. i. Let us imagine that a number of projects are then produced. Two types of complexities are taken into account. In order to assess the efficiency of an algorithm the complexity of the algorithm must be evaluated. Not all the projects can be taken into account because they may violate some law require- ments. the algorithm must be efficient. Let us suppose we want to build a house. the right output. Secondly. An algorithm is said feasible if its execution time is acceptable with respect to the necessity of solving a problem. Before entering into the details let us consider that the algorithm is automatically performed by a machine.e.e. Let us imagine that as a first action we hire several engineers and architects to perform the initial project design. Obviously this problem can be decomposed as a set of pairwise comparisons: for a given problem and two algorithms solving it how can we assess which algorithm is better and which one is worse? Before explaining this we should clarify what is meant with “better”. This means that the vast majority of problems cannot be solved. b ∈ N.g. Once an algorithms has been developed. e. i. • space complexity: amount of memory necessary for the algorithm to return the correct result • time complexity: number of elementary operations that the processor must perform to return the correct result Hence. an algorithm must be correct. ∃ infinite numbers c ∈ R such that a < c < b. it is fundamental to estimate prior to the algorithm’s execution on a machine. For example we may exclude a project due to its excessive cost or associ- ated construction time. However. Moreover we know that N ⊂ R and that the cardinality of N is much smaller than the cardinality of R because ∀a. In order to compare different projects we have to make clear what our preferences are and what would would be the most important requirement. in general. firstly. Under these conditions. are uncomputable. we will limit ourselves to a few simple considerations. the must produce. In this case. the time will take to complete the task.  Although.10.1 Complexity of Algorithms and Big-O Notation 351 We know that R is uncountably infinite while N is countably infinite. the safety requirements. We do not intend to develop these topics in this book as it is about algebra. a natural question will be to assess which algorithm solves the problem in the best way. the majority of problems is undecidable. the use of appropriate materials. This estimation (or algorithm examination) is said analysis of feasibility of the algorithm. etc. Even if the project is done within a perfect respect of laws and regulations. A similar consideration can be done for algorithms. Although this concept is fairly vague in this formulation. For a given problem there can be many algorithms that solve it. we may not consider it viable because of personal necessities. in this chapter we are interested in the study of decidable problems and the comparison of several algorithms that solve them. at each input.

This piece of information is static. However, for a given problem and algorithm it can be defined for a growing amount of homogeneous inputs; let us consider that the problem is scalable. For example, a distribution network can be associated to a certain problem and algorithm: the variation of the amount of users makes the problem vary in size, and the time required by the algorithm to solve the problem varies accordingly. The main issue is to assess the impact of the variation in size of the problem on the time required by the machine to solve the problem itself. Here we enter the field of analysis of algorithms and complexity theory.

In order to estimate the machine time of an algorithm execution, we need to compute the number of elementary operations needed to return the correct result, where the execution time of an elementary operation is assumed to be constant. Since the amount of operations within a machine is directly proportional to the execution time, the efficiency of an algorithm can be seen as a function of the length of the input. More formally, for a given problem, each algorithm is characterized by its own function t (input), which makes correspond, to the inputs, the number of elementary operations needed to solve the problem.

Example 10.1 Let a and b be two vectors of size 5, a = (a1, a2, a3, a4, a5) and b = (b1, b2, b3, b4, b5) respectively. Let us perform the scalar product between the two vectors:

ab = a1 b1 + a2 b2 + a3 b3 + a4 b4 + a5 b5.

The algorithm calculating the scalar product performs the five products (ai bi) and four sums. Hence, the complexity of the scalar product solver for vectors composed of 5 elements is 5 + 4 = 9. More generally, a scalar product between two vectors of length n requires the calculation of n products and n − 1 sums. Hence the time complexity of a scalar product is 2n − 1 elementary operations.

If we now consider the matrix product between two matrices having size n, we have to perform a scalar product for each element of the product matrix, i.e. we need to compute n² scalar products. This means that the time complexity of the product between matrices of size n is n² (2n − 1) = 2n³ − n². For example, if n = 5 the complexity of the scalar product is 9 while that of the matrix product is 225. If n = 10, the complexity of the scalar product becomes 19 while the complexity of the matrix product becomes 1900.

This example shows how algorithms can be characterized by a different complexity: if we double the dimensionality, we approximately double the corresponding time of the scalar product, while we make the calculation time of the product between matrices much bigger. The actual interest of the mathematician/computer scientist is to estimate which kind of relation exists between dimensionality and time complexity.
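The growth just described can be made tangible with a few lines of code. The fragment below is an illustrative sketch in Python, not part of the original text; it counts the elementary operations of the two algorithms for increasing sizes, reproducing the figures 9, 19, 225 and 1900 quoted above.

def scalar_product_operations(n):
    return 2 * n - 1             # n products and n - 1 sums

def matrix_product_operations(n):
    return n * n * (2 * n - 1)   # one scalar product per entry of the n x n result

for n in (5, 10, 100):
    print(n, scalar_product_operations(n), matrix_product_operations(n))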

It is not too relevant to distinguish the details of the growth of the complexity, while it is fundamental to understand how the complexity grows when the dimensionality increases. The following trends of complexity are usually taken into account:
• k : constant
• log (n) : logarithmic
• n : linear
• n² : quadratic
• n³ : cubic
• n^t : polynomial
• k^n : exponential
• n! : factorial

These trends are indicated with the symbol O (·) and named big-O notation. For example, if an algorithm presents a linear growth of the number of elementary operations in dependence on the problem dimensionality, it is said to have a O (n) complexity. As can be easily observed, the big-O notation gives us an understanding of the order of magnitude of the complexity of the problem. In the previous example we have linear and cubic growth, respectively. An increase of the complexity according to n or to 30n are substantially comparable; in other words, the corresponding problems (and algorithms) belong to the same class. On the contrary, a growth k^n is a completely different problem.

10.2 P, NP, NP-Hard, NP-Complete Problems

Definition 10.2 The class of problems that can be exactly solved by an algorithm within a finite time and such that the time complexity is of the kind O (n^k), with k a finite number ∈ R, is said to have a polynomial complexity. This class of problems composes a set that will be indicated with P.

These problems are important because they are the problems that can surely be solved by a machine in a finite time. Those problems that can be solved by an algorithm within polynomial time are often referred to as feasible problems, while the solving algorithm is often said to be efficient. It must be remarked that, although we may use the words feasible and efficient, the solution of the problem can be extremely time-consuming if k is a large number. For example, if k = 10252 the waiting time on a modern computer would likely be unreasonable even though the problem is feasible and the algorithm is efficient, see e.g. [22].

Definition 10.3 A Nondeterministic Turing Machine is a Turing Machine where the rules can have multiple actions for a given situation (one situation is prescribed into two or more rules by different actions), see [21].

Example 10.2 A simple Turing Machine may have the rule "If the light is ON turn right" and no more rules about the light being ON. A Nondeterministic Turing Machine may have both the rules "If the light is ON turn right" and "If the light is ON turn left".

Definition 10.4 An algorithm that runs with a polynomial time complexity on a Nondeterministic Turing Machine is said to have a Nondeterministic Polynomial complexity. The corresponding class of algorithms is indicated with NP.

An alternative and equivalent way to define and characterize NP problems is given by the following definition.

Definition 10.5 Let A be a problem. The problem A is a Nondeterministic Polynomial problem, i.e. an NP problem, if and only if a given solution requires at most a polynomial time to be verified.

It can be easily observed that all the problems in P are also in NP: P ⊂ NP. Obviously, many problems in NP are not also in P. In general, many problems require a non-polynomial, e.g. exponential, time to be solved but only a polynomial time to be verified.

Example 10.3 Let us consider the discrete set of numbers {1, 2, 3, ..., 781}. We want to compute the sum of these numbers. This is a very easy task because from basic arithmetic we know that

Σ_{i=1}^{n} i = n (n + 1) / 2.

Hence the problem above would simply be (781 · 782) / 2. In general the sum of the first n natural numbers requires a constant number of operations regardless of n, i.e. this problem is characterized by O (k).

Let us consider, now, the sum of n generic numbers such as {−31, 13, 6, −4, 57, 22, 81}. The sum of these numbers requires n − 1 operations, hence the complexity of this problem is linear, O (n).

For the same set of numbers, we may want to know whether or not there is a subset such that its sum is 24. If we asked this question to a machine, 2⁷ − 1 = 127 groupings should in general be checked; for a set of n numbers, 2ⁿ − 1 possible non-empty subsets exist. Hence, this is not a P problem. More precisely, this is an EXP problem, since the time complexity associated to the problem is exponential. However, if a solution is given, e.g. {6, −4, 22}, verifying it only requires summing its elements and comparing the result with 24, which takes a handful of operations.
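The asymmetry between solving and verifying can be seen concretely. The sketch below is illustrative Python, not from the book: it searches for a subset summing to 24 by enumerating the non-empty subsets — exponential work — while verifying one proposed subset costs only a few additions.

from itertools import combinations

numbers = [-31, 13, 6, -4, 57, 22, 81]
target = 24

# Solving: exponential enumeration of all non-empty subsets
solution = None
for size in range(1, len(numbers) + 1):
    for subset in combinations(numbers, size):
        if sum(subset) == target:
            solution = subset
            break
    if solution:
        break
print("found:", solution)

# Verifying a given candidate: linear in the size of the candidate
candidate = (6, -4, 22)
print("valid:", sum(candidate) == target)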

Equivalently. such as the above-mentioned halting problem. Thus. NP.7 A decision problem H is said NP-hard when is at least as hard as the hardest NP problem. We have intuitively shown that this problem is in NP.2 P. On the other hand. −4. It can be proved by . solve by an algorithm in B and anti-transform the solution of B back to the A space. This means that the vast majority of NP-hard problems are not in NP. 6. given a candidate solution of m numbers. This means that the search of the solution requires more than a polynomial time while the verification of any solution can be performed in a polynomial time. Given the set of integer numbers {−31. that is to have an initial understanding of computational complexity. For example. we can give an alternative definition of NP-hard problems. NP-complete problems lay in the intersection between NP and NP-hard problems. The fact that the search of the solution is very different in terms of hardness/complexity is an argument to claim that they are different. for the purpose of this book. Definition 10. Definition 10. Some mathematicians and computer scientists are investigating whether or not P and NP coincide.6 A decision problem H is said NP-hard when every NP problem L can be reduced to H within a polynomial time. NP-Complete Problems 355 take only 2 operations to verify whether or not their sum is 24. These problems are the hardest NP problems that can be found. A special role is played by a subset of NP-hard that is composed also by NP problems. the fact that the verification of a solution is equally easy make the look the two sets as the same concept. Whether or not P and NP are coinciding sets is an open question in computer science that is still a hot topic of discussion and is beyond the scopes of this book. we will perform m − 1 operations to verify it. especially from modern games. NP-Hard. Several problems can be solved by transforming the original problem into a dif- ferent problem. This fact is the basic concept behind the following definition. Many examples can be made. undecidable problems. 81} we want to find a non-null subset whose sum is 24. Definition 10. are always NP-hard. However. The set of step just describe are said reduction of the problem A to the problem B. In general. 57. 22. The class of NP-hard problems is very large (actually infinite) and includes all the problems that are at least as hard as the hardest NP problem.10. and in NP. We could then transform the input from A to B space. This problem belongs to NP. Let us consider a generic problem A and let us assume we want to solve it. In this way we obtain a solution of the our original problem. the same example shown above will be considered again. This problem may be very hard to solve but can be transformed into a mirror problem B that is easier.8 A problem is NP-complete if the problem is both NP-hard. 13.

the Huffman cod- ing. the standard representation of this sentence requires 136 bits in total.g. are also NP-hard. This sentence contains 17 characters. namely Boolean satisfiability problem. As an outcome. the following diagram is presented. 10. In this section. Let us consider the sentence “Mississippi river”. see [24]. Let us imagine we ideally sort them from the easiest (on the left) to the most difficult (on the right). . The sets of P and NP problems are highlighted as well as the class of problems EXP that can be solved within an exponential time. and a representation of arithmetic operations to efficiently exploit the architecture of a machine. a technique to efficiently represent data within a machine. see e. A famous proof of this fact is performed by reduction using an important problem in computational complexity. Since the details are outside the scopes of this book. [23] for the proof. it is related to it. the subset sum problem is NP-complete. Considering that each character requires 8 bits.356 10 An Introduction to Computational Complexity reduction that problems of this type.3 Representing Information An important challenge in applied mathematics and theoretical computer science is the efficient representation and manipulation of the information. albeit not coinciding with the computational complexity. namely subset sum problem.1 Huffman Coding The Huffman coding is an algorithm to represent data in a compressed way by reserving the shorter length representation for frequent pieces of information and longer for less frequent pieces of information. Also. All the problems of the universe are represented over a line. This challenge. NP-hard and NP-complete (grey rectangle) sets are represented. let us explain the algorithmic functioning by means of an example and a graphical representation. EXP NP NP-hard P NP-complete decidable undecidable 10. The solid part of this line represents the set of decidable problems while the dashed part undecidable problems.3. the polish and reverse polish notations. In order to clarify the contents of this section and give an intuitive representation of computational complexity. are given.

Let us see, for this example, how the Huffman coding can lead to a substantial saving of the memory requirements. In order to do that, let us write the occurrences of each letter appearing in the sentence:

M → 1, I → 5, S → 4, P → 2, R → 2, V → 1, E → 1, − → 1

where "−" indicates the blank space and → relates each letter to its occurrences. The first step of the Huffman coding simply consists of sorting the letters from the most frequent to the least frequent:

I 5, S 4, P 2, R 2, M 1, V 1, E 1, − 1.

From this diagram, we now connect the vertices associated to the least occurrences and sum the occurrences, obtaining the new nodes MV (2) and E− (2). The operation can be iterated, producing the nodes MVE− (4) and PR (4), then PRMVE− (8) and IS (9), and eventually the complete scheme: a binary tree whose top circle ISPRMVE− carries the number 17, i.e. the total number of characters of the sentence, and whose leaves are the letters I 5, S 4, P 2, R 2, M 1, V 1, E 1 and − 1.

Now let us label each edge of the construction by indicating with 0 the edge incident on the left of a circle and with 1 the edge incident on the right of the circle. Then, for each letter, the bits are concatenated starting from the top circle down to the letter. The resulting Huffman coding of the letters is

I = 01, S = 00, P = 100, R = 101, M = 1100, V = 1101, E = 1110, − = 1111.

It can be observed that the letters with the highest frequency have the shortest bit representation. On the contrary, the letters with the lowest frequency have the longest bit representation. If we write again the "Mississippi river" sentence by means of the Huffman coding, we will need 46 bits instead of the 136 bits of the standard 8-bit representation. It must be appreciated that this massive memory saving has been achieved without any loss in the delivered information and only by using an intelligent algorithmic solution.
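The construction above can be automated. The following sketch is illustrative Python using only the standard library (the variable and function names are chosen here, not taken from the book): it builds a Huffman code for a sentence by repeatedly merging the two least frequent nodes, exactly as done by hand above. Because ties between equal frequencies can be broken differently, the individual codewords may differ from the ones listed above, but the total encoded length is the same.

import heapq
from collections import Counter

def huffman_code(text):
    """Return a dictionary mapping each character to its Huffman codeword."""
    frequencies = Counter(text)
    # Each heap item: (frequency, tie-breaker, {character: codeword})
    heap = [(freq, i, {char: ""}) for i, (char, freq) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # least frequent node
        f2, _, right = heapq.heappop(heap)   # second least frequent node
        merged = {c: "0" + code for c, code in left.items()}
        merged.update({c: "1" + code for c, code in right.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

sentence = "Mississippi river"
code = huffman_code(sentence)
total_bits = sum(len(code[c]) for c in sentence)
print(code)
print(total_bits, "bits instead of", 8 * len(sentence))   # 46 bits instead of 136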

10.3.2 Polish and Reverse Polish Notation

At the abstract level, a simple arithmetic operation can be interpreted as an operation involving two operands and one operator. For example, the sum of a and b involves the operands a and b and the operator +. Normally, this operation is represented as

a + b

where the operator is written between the two operands. This way of writing the operation is named infix notation. Although this notation may appear naturally understandable for a human, it is indeed not the most efficient for an electronic calculator. The logician Jan Łukasiewicz proposed an alternative notation for arithmetic operations consisting of representing the operator before the operands or after the operands:

+ a b        a b +

These two notations are said prefix and postfix notation, respectively. They are also named polish and reverse polish notation, respectively, see [25].

These notations have essentially three advantages. In order to understand the first advantage let us focus on the a + b example. If this operation is performed in a machine, the operands must first be loaded into the memory and then lifted into the computing unit, where the operation can be performed. Hence, the most natural way for a compiler to prioritize the instructions is the following:

Algorithm 7 a + b from the compiler perspective
Load memory register R1 with a
Load memory register R2 with b
Add what is in R1 with what is in R2 and write the answer into the memory register R3

Since a machine has to execute the instructions in this order, the most efficient way to pass the information to a machine is by following the same order. In this way, the machine does not need to interpret the notation and can immediately start performing the operations as they are written.

The second advantage is that the polish notation, and especially the reverse polish notation, work in the same way as the stack memory of a calculator works. A stack memory saves (pushes) and extracts (pops) the items sequentially. The stack memory architecture requires that the first item to be popped is the last that has been pushed. If we consider for example the arithmetic expression (the symbol ∗ indicates the multiplication)

a + b ∗ c,

More specifically, from the perspective of the stack memory, the calculation of the arithmetic expression above is achieved by the following steps:

Algorithm 8 a + b ∗ c from the stack memory perspective
push a
push b
push c
pop c
pop b
compute d = b ∗ c
push d
pop d
pop a
compute e = a + d
push e
pop e

The same set of instructions is graphically represented in the diagram below, where the variables d and e are indicated just for explanation convenience (they are not actual variables).

[Diagram: three snapshots of the stack memory with the processor operations between them. At first the stack holds a, b, c (with c on top); the processor computes d = b ∗ c and the stack then holds a, d; the processor computes e = a + d and the stack finally holds the result.]

This sequence of instructions is essentially the very same as the one represented by the reverse polish notation, which corresponds exactly to the most efficient way to utilize a stack memory. In other words, the reverse polish notation explains exactly and in a compact way what the memory stack of a machine is supposed to do.
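As an illustration (a sketch added here, not part of the original text), the following Python function evaluates an expression written in reverse polish notation with exactly the push/pop discipline described above; the function name eval_rpn and the use of numeric operands are assumptions made for the example.

def eval_rpn(tokens):
    # Evaluate a list of tokens in reverse polish notation with a stack:
    # operands are pushed, every operator pops the two topmost items
    # and pushes the result back.
    ops = {"+": lambda x, y: x + y,
           "-": lambda x, y: x - y,
           "*": lambda x, y: x * y,
           "/": lambda x, y: x / y}
    stack = []
    for tok in tokens:
        if tok in ops:
            right = stack.pop()     # the last operand pushed
            left = stack.pop()
            stack.append(ops[tok](left, right))
        else:
            stack.append(float(tok))
    return stack.pop()

# a + b * c with a = 2, b = 3, c = 4, written in postfix as "a b c * +"
print(eval_rpn("2 3 4 * +".split()))   # 14.0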

The third advantage is that the reverse polish notation allows to univocally write all the arithmetic expressions without the need of writing parentheses. This was the reason why Łukasiewicz originally introduced it, i.e. to simplify the notation in logical proofs. Let us consider the following expression:

(a + b) ∗ c.

In this case the parentheses indicate that the sum must be performed before the multiplication. If we removed the parentheses, the arithmetic expression would mean something else, that is the multiplication of b by c first and then the sum of the result with a. Thus, when we write with infix notation, the use of parentheses can be necessary to avoid ambiguities. In polish or reverse polish notation, arithmetic expressions can be written without ambiguities and without the aid of parentheses. In particular, the expression (a + b) ∗ c in reverse polish notation is cab + ∗. The operations of a complex arithmetic expression described in reverse polish notation are performed from the most internal towards the most external.

Example 10.4 Let us consider the following arithmetic expression:

5 ∗ (4 + 3) + 2 ∗ 6.

In reverse polish notation it can be written as

5 4 3 + ∗ 2 6 ∗ +.

10.4 Exercises

10.1 Calculate the complexity of a scalar product.

10.2 Calculate the complexity of a matrix product.

10.3 Express in Huffman coding the sentence "Susie sells seashells by the seashore".

Chapter 11
Graph Theory

In this chapter we introduce a notion of fundamental importance for modelling in a schematic way a large amount of problems. This is the concept of a graph. This concept applies not only to computer science and mathematics, but even in fields as diverse as chemistry, physics, biology, the theory of transport, operational research, electrical circuits, telephone networks, civil engineering, artificial intelligence, sociology, industrial organization, mapping. We will present some concepts of graph theory, those that seem most relevant for our purposes, omitting many others. A complete discussion of graph theory, on the other hand, would require more than a chapter, see [26]. These pages are only an invitation to this fascinating theory.

11.1 Motivation and Basic Concepts

Historically, graph theory was born in 1736 with Leonhard Euler when he solved the so-called Königsberg bridges problem. The problem consisted of the following. The Prussian town of Königsberg (Kaliningrad in our days) was divided into 4 parts by the Pregel river, one of these parts being an island in the river. The four regions of the town were connected by seven bridges (Fig. 11.1). On sunny Sundays the people of Königsberg used to go walking along the river and over the bridges. The question was the following: is there a walk using all the bridges once that brings the pedestrian back to the starting point? The problem was solved by Euler, who showed that such a walk is not possible. For this purpose, Euler considered the model in Fig. 11.2. The importance of this result lies above all in the idea that Euler introduced to solve this problem. Euler realized that to solve the problem it was necessary to identify its essential elements, neglecting accessory or irrelevant items, and that is precisely what gave rise to the theory of graphs.

[Fig. 11.1: The Königsberg bridges]

[Fig. 11.2: Königsberg bridges graph]

As it will be shown later, the general result obtained by Euler allows to solve the seven bridges problem.

Definition 11.1 A digraph or directed graph G is a pair of sets (V, E) consisting of a finite set V ≠ ∅ of elements called vertices (or nodes) and a set E ⊆ V × V of ordered pairs of distinct vertices called arcs or directed edges.

The edges describe the links of the vertices. A directed edge represents an element e = (v, w) ∈ E as an edge oriented from the vertex v, called starting point, to the vertex w, called end point.

Example 11.1 The graph G = (V, E) represented in the following figure is a digraph, where V = {v1, v2, v3, v4, v5, v6} and E = {(v1, v2), (v2, v1), (v1, v3), (v5, v6)}.

[Figure: a digraph on the vertices v1, v2, v3, v4, v5, v6.]

Definition 11.2 Let G be a graph composed of the sets (V, E). Two vertices w and v ∈ V are said adjacent if (v, w) ∈ E.

Definition 11.3 Let G be a graph composed of the sets (V, E). If the vertices w and v are adjacent, they are also called neighbours. The set of neighbours of a vertex v is said its neighbourhood N(v).

Definition 11.4 Let G be a graph composed of the sets (V, E). Two edges are adjacent edges if they have a vertex in common.

A graph SG composed of (SV ⊂ V, SE ⊂ E) is said subgraph of G.

To model a problem, it is often not necessary that the edges of a graph are oriented: for example, to make a map of the streets of a city where every road is passable in both senses it is enough to indicate the arc between two points of the city and not the direction of travel.

Definition 11.5 An undirected graph, or simply graph, G is a pair G = (V, E) which consists of a finite set V ≠ ∅ and a set E of unordered pairs of elements (not necessarily distinct) of V.
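The definitions above translate directly into a data structure. The following Python sketch (an illustration, not from the text) stores a graph as a list of edges and computes the neighbourhood N(v) of Definition 11.3; the edge list reproduces the one of Example 11.1, and the choice of following only outgoing edges in the directed case is a convention assumed for the example.

def neighbourhood(edges, v, directed=False):
    # N(v): the set of vertices adjacent to v.
    # For a digraph only the edges leaving v are followed here
    # (one possible convention, stated as an assumption).
    nbs = set()
    for a, b in edges:
        if a == v:
            nbs.add(b)
        elif b == v and not directed:
            nbs.add(a)
    return nbs

E = [("v1", "v2"), ("v2", "v1"), ("v1", "v3"), ("v5", "v6")]
print(neighbourhood(E, "v1", directed=True))    # {'v2', 'v3'}
print(neighbourhood(E, "v6", directed=False))   # {'v5'}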

Example 11.2 An undirected graph is given by G = (V, E), where V = {v1, v2, v3, v4, v5, v6} and E = {(v1, v2), (v1, v3), (v6, v1)}.

[Figure: an undirected graph on the vertices v1, v2, v3, v4, v5, v6.]

As a further characterization, if vi and vj are vertices of a directed graph then in general (vi, vj) ≠ (vj, vi) (they would be two different edges). On the contrary, in an undirected graph it follows that (vi, vj) = (vj, vi). We mostly consider undirected graphs. Hence, when we speak of a graph without further specification, it is assumed to be undirected. In general, given a graph G = (V, E), when it is not specified whether it is directed or undirected, we will refer to an undirected graph.

Definition 11.6 The order of a graph is the number of its vertices, while the size of a graph is the number of its edges. Usually a graph of order n and size m is denoted by G(n, m).

Proposition 11.1 Let G(n, m) be a graph of order n and size m. The graph G(n, m) can have at most n(n − 1)/2 edges:

0 ≤ m ≤ n(n − 1)/2,

where n(n − 1)/2 = n!/(2!(n − 2)!) is the Newton's binomial coefficient "n choose 2".

In the extreme cases:

1. The case m = 0 corresponds to the null graph (which is often indicated with Nn). A null graph consists of n vertices v1, v2, ..., vn without any connection.

Example 11.3 An example of null graph is shown in the following figure.

[Figure: the null graph on the vertices v1, v2, v3, v4, v5, v6, with no edges.]

2. The case m = n(n − 1)/2 corresponds to the case in which each vertex is connected with all the others.

Definition 11.7 A complete graph on n vertices, indicated with Kn, is a graph G(n, n(n − 1)/2), i.e. a graph having n vertices and an edge for each pair of distinct vertices.

Example 11.4 The chart below shows the graphs K1, ..., K4.

[Figure: the complete graphs K1, K2, K3, and K4.]
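As a quick numerical check (an added sketch, not from the text), the following Python lines build Kn as the list of all unordered pairs of n vertices and confirm that its size equals n(n − 1)/2:

from itertools import combinations

def complete_graph(n):
    # K_n: vertices 1..n and one edge for each unordered pair of them.
    vertices = list(range(1, n + 1))
    edges = list(combinations(vertices, 2))
    return vertices, edges

for n in range(1, 5):
    _, edges = complete_graph(n)
    print(n, len(edges), n * (n - 1) // 2)   # the last two numbers always agree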

It may be observed that in a directed graph, since E ⊆ V × V, the maximum number of edges is n², where n is the number of nodes.

Definition 11.8 An edge of the type (vi, vi) is called (self-)loop.

Definition 11.9 A walk of a graph G is a sequence of vertices v1, v2, ..., vn where (v1, v2), (v2, v3), ..., (vi, vi+1), ..., (vn−1, vn) are edges of the graph. Vertices and edges may appear more than once.

Definition 11.10 A trail is a finite alternating sequence of vertices and edges, beginning and ending with vertices, such that no edge appears more than once. A vertex, however, may appear more than once.

Example 11.5 In the graph below the sequence v1, v4, v5, v6 is a trail, while the sequence v1, v4, v5, v4, v6 in the same graph is a walk (and not a trail).

[Figure: a graph on the vertices v1, ..., v6 in which the edges of the trail v1, v4, v5, v6 are marked "trail".]

The initial and final vertices of a trail are called terminal vertices. A trail may begin and end at the same vertex; a trail of this kind is called a closed trail. A trail that is not closed (i.e. whose terminal vertices are distinct) is called an open trail.

Definition 11.11 An open trail in which no vertex appears more than once is called a path (also simple path or elementary path).
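The distinctions among walks, trails, and paths can be checked mechanically. The following Python sketch (an illustration, not from the text) classifies a vertex sequence with respect to a given undirected edge set; the edge list used at the end is an assumption made for the example, not the edge set of the figure above.

def classify(sequence, edges):
    # Classify a vertex sequence as 'path', 'trail', 'walk' or 'not a walk'.
    # Edges are stored as frozensets so that (u, v) and (v, u) coincide.
    edge_set = {frozenset(e) for e in edges}
    steps = [frozenset((u, v)) for u, v in zip(sequence, sequence[1:])]
    if any(s not in edge_set for s in steps):
        return "not a walk"       # some consecutive pair is not an edge
    if len(set(steps)) < len(steps):
        return "walk"             # an edge is repeated
    if len(set(sequence)) < len(sequence):
        return "trail"            # no repeated edge, but a repeated vertex
    return "path"                 # open trail with no repeated vertex

edges = [("v1", "v4"), ("v4", "v5"), ("v5", "v6"), ("v1", "v3")]
print(classify(["v1", "v4", "v5", "v6"], edges))   # path
print(classify(["v4", "v5", "v4"], edges))         # walk (an edge is repeated)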

Definition 11.12 A graph in which there is more than one edge that connects two vertices is said multigraph. Otherwise, in the case where, given any two vertices, there is at most one edge that connects them, the graph is said simple.

Example 11.6 The graph corresponding to the bridges of Königsberg is a multigraph.

Definition 11.13 The length of a trail is the number of its edges.

In the trail above, the length is 3. It immediately follows that an edge which is not a self-loop is a path of length one. It should also be noted that a self-loop can be included in a walk but not in a path.

Definition 11.14 The distance d(vi, vj) between two vertices vi and vj of a graph G is the minimal length among all the trails (if they exist) which link them. If the vertices are not linked, d(vi, vj) = ∞. A trail of minimal length between two vertices of a graph is said to be geodesic.

Example 11.7 In the graph above, a geodesic trail is marked out; another, longer trail joins the same terminal vertices. The geodesic trail has length 3 while the latter has length 4.

Proposition 11.2 The distance in a graph satisfies all the properties of a metric distance. For all vertices u, v, and w:
• d(v, w) ≥ 0, with d(v, w) = 0 if and only if v = w
• d(v, w) = d(w, v)
• d(u, w) ≤ d(u, v) + d(v, w)

Definition 11.15 The diameter of a graph G is the maximum distance between two vertices of G.

Definition 11.16 A circuit or cycle in a graph is a closed trail. A circuit is also called elementary cycle, circular path, and polygon.

It can be easily seen that the length of a circuit v1, v2, ..., vn = v1 is n − 1.

Definition 11.17 If a graph G has circuits, the girth of G is the length of the shortest cycle contained in G and the circumference is the length of the longest cycle contained in G.

Definition 11.18 A circuit is said even (odd) if its length is even (odd).

Definition 11.19 A graph is connected if we can reach any vertex from any other vertex by travelling along a trail. More formally: a graph G is said to be connected if there is at least one trail between every pair of vertices in G. Otherwise, G is disconnected.

It is easy to see that one may divide each graph in connected subgraphs.
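The distance of Definition 11.14 can be computed by a breadth-first search, which also provides a connectivity test: two vertices belong to the same connected subgraph exactly when their distance is finite. The Python sketch below is an illustration added here; the adjacency dictionary and vertex names are assumptions made for the example.

from collections import deque

def distance(adjacency, source, target):
    # Breadth-first search distance; float('inf') when the vertices
    # are not linked, as in Definition 11.14.
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        vertex, d = queue.popleft()
        for nb in adjacency[vertex]:
            if nb == target:
                return d + 1
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, d + 1))
    return float("inf")

adj = {"v1": ["v2", "v3"], "v2": ["v1", "v4"], "v3": ["v1"],
       "v4": ["v2"], "v5": []}
print(distance(adj, "v1", "v4"))   # 2
print(distance(adj, "v1", "v5"))   # inf: the two vertices are not linked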

Definition 11.20 A connected subgraph containing the maximum number of (connected) edges is said connected component (or simply component) of the graph G.

Definition 11.21 A vertex that is not connected to any part of a graph is said to be isolated.

Example 11.8 In the graph below, the subgraph v1, v2, v3, v4 is a connected component, while the subgraph v1, v2, v3 is connected but is not a connected component.

[Figure: a graph on the vertices v1, ..., v6 whose connected components include v1, v2, v3, v4.]

A null graph of more than one vertex is disconnected. Obviously, the concept of connected component of a graph must not be confused with the concept of vector component seen in the previous chapters.

Definition 11.22 The rank of a graph ρ is equal to the number of vertices n minus the number of components c: ρ = n − c.

Definition 11.23 The nullity of a graph ν is equal to the number of edges m minus the rank ρ: ν = m − ρ.

Obviously, combining the two definitions above, it follows that ν = m − n + c. Moreover, if the graph is connected, then ρ = n − 1 and ν = m − n + 1.

Proposition 11.3 Let G be a graph. If G is not connected, then its rank is equal to the sum of the ranks of each of its connected components and its nullity is equal to the sum of the nullities of each of its connected components.

Furthermore, for a graph G, the equation m = ρ + ν is the rank-nullity theorem for graphs, see Theorem 9.7. Let us consider a mapping f that consists of connecting n fixed nodes by means of m edges. The number of edges m plays the role of the dimension of the vector space on which f acts, the rank ρ plays the role of the dimension of the image of f, and the nullity ν that of the dimension of its kernel.
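The rank and nullity of a graph can be computed directly from its connected components. The following Python sketch (added as an illustration, not from the text) finds the components with a depth-first search and then evaluates ρ = n − c and ν = m − ρ; the six-vertex graph at the end is an assumption made for the example.

def components(vertices, edges):
    # Grow one connected component at a time with a depth-first search.
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    remaining, comps = set(vertices), []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            for nb in adjacency[stack.pop()]:
                if nb in remaining:
                    remaining.discard(nb)
                    comp.add(nb)
                    stack.append(nb)
        comps.append(comp)
    return comps

V = ["v1", "v2", "v3", "v4", "v5", "v6"]
E = [("v1", "v2"), ("v2", "v3"), ("v3", "v4"), ("v1", "v4"), ("v5", "v6")]
c = len(components(V, E))   # 2 components
rho = len(V) - c            # rank: 6 - 2 = 4
nu = len(E) - rho           # nullity: 5 - 4 = 1, so that m = rho + nu
print(c, rho, nu)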

Graphs are mathematical entities like sets, matrices, and vectors. As such, a set of operations can be defined for graphs. Among all the possible operations, some are listed in the following.

(1) Vertex removal. A vertex vi can be removed from the graph G. Let us indicate with the notation G − vi the subgraph obtained from G by removing vi with all the edges passing through vi. Unless vi is an isolated vertex, if we removed only the vertex we would obtain something that is not a graph, because there would be edges deprived of one of their end points. Hence, along with vi, all the edges that pass through vi are removed.

Example 11.9 If we consider the following graph

[Figure: a graph on the vertices v1, v2, v3, v4, v5.]

and we remove the vertex v5, we obtain

[Figure: the same graph restricted to the vertices v1, v2, v3, v4.]

(2) Edge removal. The edge removal results in the removal of the edge but not of the vertices.

Example 11.10 If from the graph in the example above the edge (v4, v5) is removed, the following graph is obtained:

[Figure: the graph of Example 11.9 without the edge between v4 and v5.]

(3) Union of Graphs. The union G(VU, EU) = G1 ∪ G2 of two graphs G1 = (V1, E1) and G2 = (V2, E2) is a graph such that VU = V1 ∪ V2 and EU = E1 ∪ E2.