

A BRIDGE TO
LINEAR
ALGEBRA
Dragu Atanasiu
University of Borås, Sweden

Piotr Mikusiński
University of Central Florida, USA

World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI • TOKYO



Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data


Names: Atanasiu, Dragu, author. | Mikusiński, Piotr, author.
Title: A bridge to linear algebra / by Dragu Atanasiu (University of Borås, Sweden),
Piotr Mikusiński (University of Central Florida, USA).
Description: New Jersey : World Scientific, 2019. | Includes index.
Identifiers: LCCN 2018061427| ISBN 9789811200229 (hardcover : alk. paper) |
ISBN 9789811201462 (pbk. : alk. paper)
Subjects: LCSH: Algebras, Linear--Textbooks. | Algebra--Textbooks.
Classification: LCC QA184.2 .A83 2019 | DDC 512/.5--dc23
LC record available at https://lccn.loc.gov/2018061427

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

Copyright © 2019 by World Scientific Publishing Co. Pte. Ltd.


All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or
mechanical, including photocopying, recording or any information storage and retrieval system now known or to
be invented, without written permission from the publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center,
Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from
the publisher.

For any available supplementary material, please visit


https://www.worldscientific.com/worldscibooks/10.1142/11276#t=suppl

Printed in Singapore



January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page v

We dedicate this book to our wives,

Delia and Grażyna



Contents

Preface

1 Basic ideas of linear algebra
1.1 2 × 2 matrices
1.2 Inverse matrices
1.3 Determinants
1.4 Diagonalization of 2 × 2 matrices

2 Matrices
2.1 General matrices
2.2 Gaussian elimination
2.3 The inverse of a matrix

3 The vector space R2
3.1 Vectors in R2
3.2 The dot product and the projection on a vector line in R2
3.3 Symmetric 2 × 2 matrices

4 The vector space R3
4.1 Vectors in R3
4.2 Projections in R3

5 Determinants and bases in R3
5.1 The cross product
5.2 Calculating inverses and determinants of 3 × 3 matrices
5.3 Linear dependence of three vectors in R3
5.4 The dimension of a vector subspace of R3

6 Singular value decomposition of 3 × 2 matrices

7 Diagonalization of 3 × 3 matrices
7.1 Eigenvalues and eigenvectors of 3 × 3 matrices
7.2 Symmetric 3 × 3 matrices

8 Applications to geometry
8.1 Lines in R2
8.2 Lines and planes in R3

9 Rotations
9.1 Rotations in R2
9.2 Quadratic forms
9.3 Rotations in R3
9.4 Cross product and the right-hand rule

10 Problems in plane geometry
10.1 Lines and circles
10.2 Triangles
10.3 Geometry and trigonometry
10.4 Geometry problems from the International Mathematical Olympiads

11 Problems for a computer algebra system

12 Answers to selected exercises

Bibliography

Index

Preface

As teachers in the classroom, we have noticed that some hardworking students have
trouble finding their feet when tackling linear algebra for the first time. One of us has
been teaching students in Sweden for many years using a more accessible method
which became the eventual foundation and inspiration for this book.
Why do we need yet another linear algebra text?
To give introductory level mathematics students greater opportunities for success in grasping, practicing, and internalizing the foundational tools of linear algebra. We present these tools in concrete examples before students encounter the more abstract concepts, properties, and operations.
TO STUDENTS:
This book is intended to be read, with or without help from an instructor, as an introduction to the general theory presented in a standard linear algebra course. Students are encouraged to read it before or in parallel with a standard linear algebra textbook, as a study guide, practice book, or reference source for whenever they have trouble understanding the general theory. This book can also be recommended as a student aid, and its material assigned by an instructor as a reference source for students needing some coaching, clarification, or PRACTICE!
It is our goal to provide a "lifesaver" for students drowning in a standard linear algebra course. When students get confused, lost, or stuck on a general result, they can find a particular case of that result in this book worked out with all the details, and consequently easy to read. Then the general result will make much more sense.
We welcome students to use this guide to become more comfortable, confident,
and successful in understanding the concepts and tools of linear algebra.
GOOD LUCK!
TO INSTRUCTORS:
Let’s face it, many students experience difficulties when they learn linear
algebra for the first time. For example, they struggle to understand concepts like
linear independence and bases. In order to help students we propose the following
pedagogical approach: We present in depth all major topics of a standard course
in linear algebra in the context of R2 and R3, including linear independence, bases, dimension, change of basis, the rank theorem, the rank-nullity theorem, orthogonality, projections, determinants, eigenvalues and eigenvectors, diagonalization, spectral decomposition, rotations, and quadratic forms. We also give an elementary and very detailed presentation of the singular value decomposition of a 3 × 2 matrix.
Students gain understanding of these ideas by studying concrete cases and
solving relatively simple but nontrivial problems, where the essential ideas are not
lost in computational complexities of higher dimensions.
The only part where we do not restrict ourselves to particular cases is when we present the algebra of matrices, Gaussian elimination, and inverse matrices.
There are many repetitions, in order to facilitate understanding of the presented ideas. For example, dimension is first defined for R2 and for vector subspaces of dimension 2 in R3, and then for R3. QR factorization is first presented for 2 × 2 matrices, then for 3 × 2 matrices, and finally for 3 × 3 matrices.
Our approach uses more geometry than most books on linear algebra. In our
opinion, this is a very natural presentation of linear algebra. Using concepts of
linear algebra we obtain powerful tools to solve plane geometry problems. At the
same time, geometry offers a way to use and understand linear algebra. This book
proves that there is no conflict between analytic geometry and linear algebra, contrary to the impression given by some older books.
When writing this book we were influenced by the recommendations of the
Linear Algebra Curriculum Study Group.

Now a few words about the content of the book.

Chapter 1 presents most of the basic ideas of this book in the context of 2 × 2 matrices. We have tried to make this chapter dynamic by introducing elementary matrices, the inverse of a matrix, the determinant, the LU decomposition, and eigenvalues and eigenvectors from the very beginning, hoping that students will find this approach attractive and that it will stimulate their curiosity about the rest of the book.
Chapter 2 is about the algebra of general matrices, Gaussian elimination, and inverse matrices. This chapter is less abstract and easier to understand.
Chapters 3, 4, and 5 form the kernel of this book. Here we present vectors in
R2 and R3 , linear independence, bases, dimension and orthogonality. We can say
that Chapter 3 is about the vector space R2 , Chapter 4 about the vector subspaces of
dimension 2 of R3 and Chapter 5 is about the vector space R3 .
Some applications are also discussed. In Chapter 3 we present QR factorization for 2 × 2 matrices, and in Chapter 4 we present the least squares method and QR factorization for 3 × 2 matrices. In Chapter 5 we discuss
practical methods for calculating determinants of 3 × 3 matrices.
Chapter 6 is a short chapter about the singular value decomposition of 3 × 2 matrices. The purpose of this chapter is to give students more opportunities to work with matrices. It will also help students understand the singular value decomposition in the general case.
Chapter 7 is about diagonalization in R3 . We include complete calculations for
many determinants and solve numerous systems of equations. At the end of the
chapter we present 3 × 3 symmetric matrices and QR factorization for 3 × 3 matrices.
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page xi

PREFACE xi

Chapter 8 gives a presentation of classical analytic geometry compatible with the


concepts of linear algebra.
Chapter 9 is about rotations in R2 and R3. Quadratic forms in R2 are also discussed here because our presentation makes use of rotations.
Chapter 10 contains, for readers interested in geometry, completely solved
problems in plane geometry. Among them are four problems given at International
Mathematical Olympiads. In our solutions we use concepts and tools from linear
algebra, including vectors, norm, linear independence, and rotations.
Because the use of technology is also important for students, we give some
examples using Maple in an appendix at the end of the book. This part is not
emphasized, since practically all examples and exercises in the book are designed
for “paper and pencil” calculations. We believe that the experience of working
through these examples improves understanding of the presented material.
In several places of this book we refer to the book Core Topics in Linear Alge-
bra, which presents the standard topics of an introductory course in linear algebra.
These two books can be used in parallel, with A Bridge to Linear Algebra providing a
wealth of examples for the ideas discussed in Core Topics in Linear Algebra. On the
other hand, the books are written so that they can be used independently. When the
reader is directed to the book Core Topics in Linear Algebra, actually any standard
book for an introductory course in linear algebra can be used.
ACKNOWLEDGEMENTS:
We would like to thank Joseph Brennan from the University of Central Florida for fruitful discussions that influenced the final version of the book. We acknowledge the effort and time spent by our colleagues from the University of Borås, Anders Bengtsson, Martin Bohlén, Anders Mattsson, and Magnus Lundin, who critiqued portions of earlier versions of the manuscript. We also benefitted from the comments of the reviewers. We are indebted to the students from the University of Borås who were the inspiration for writing this book. We would like to thank Delia Dumitrescu for drawing the hand needed for the right-hand rule and designing the figures for the problems from the International Mathematical Olympiads. We are grateful to the World Scientific Publishing team, including Rochelle Kronzek-Miller, Lai Fun Kwong, Rok Ting Tan, Yolande Koh, and Uthrapathy Janarthanan, for their support and assistance. Finally, we would like to express our gratitude to the TeX-LaTeX Stack Exchange community for helping us on several occasions with LaTeX questions.

Chapter 1

Basic ideas of linear algebra

1.1 2 × 2 matrices
The introduction of matrices is one of the great ideas of linear algebra. Matrices were
invented to solve some mathematical problems, like systems of linear equations, in
a shorter, more transparent and more elegant way. In this chapter we describe some
operations on matrices. The purpose of this chapter is to provide motivation and an
opportunity for the reader to work with matrices. The ideas introduced here will be
generalized and discussed in a more systematic way in the following chapters.
Solving linear equations is one of the basic problems of mathematics. Linear equations are also among the most common models for real life problems. The simplest linear equation is

$$ax = b, \tag{1.1}$$

where $a$ and $b$ are known real numbers and $x$ is the unknown quantity. The equation has a unique solution if and only if $a \neq 0$. The solution is $x = \frac{b}{a}$.

Now we consider the system of equations:

$$\begin{cases} ax + by = e \\ cx + dy = f \end{cases} \tag{1.2}$$

where $a$, $b$, $c$, $d$, $e$, and $f$ are known real numbers and $x$ and $y$ are to be determined. This looks much more complicated than the equation $ax = b$. Linear algebra gives us tools that allow us to treat (1.2), and in fact many other more complicated problems, as a special case of the basic equation

$$A\mathbf{x} = \mathbf{b}, \tag{1.3}$$

where $A$, $\mathbf{x}$, and $\mathbf{b}$ are no longer numbers, but many similarities between this equation and (1.1) remain. If we think of $\mathbf{x}$ as the solution of (1.2), then it should be represented by both $x$ and $y$. We will use the notation

$$\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}$$

and call $\mathbf{x}$ a 2 × 1 matrix or a 2 × 1 vector. The geometric interpretation of vectors will be discussed in Chapter 3. At this time we think of $\begin{bmatrix} x \\ y \end{bmatrix}$ as a way of representing a solution of the system (1.2). Similarly, we write $\mathbf{b} = \begin{bmatrix} e \\ f \end{bmatrix}$.
By definition

$$\begin{bmatrix} a_1 \\ b_1 \end{bmatrix} = \begin{bmatrix} a_2 \\ b_2 \end{bmatrix} \quad \text{if and only if} \quad a_1 = a_2 \text{ and } b_1 = b_2.$$

If we go back to the system (1.2) we quickly realize that $A$ has to contain the information about all coefficients, that is, $a$, $b$, $c$, and $d$. To capture this information we will write

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.$$

Such an array is called a 2 × 2 matrix.
We also have by definition

$$\begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} = \begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix}$$

if and only if

$$a_1 = a_2, \quad b_1 = b_2, \quad c_1 = c_2, \quad \text{and } d_1 = d_2.$$

Consequently, $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \neq \begin{bmatrix} 1 & 2 \\ 4 & 3 \end{bmatrix}$.
Now (1.2) can be written as $A\mathbf{x} = \mathbf{b}$ or

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} e \\ f \end{bmatrix}$$

if we define

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix}. \tag{1.4}$$

Definition 1.1.1. The vector $\begin{bmatrix} ax + by \\ cx + dy \end{bmatrix}$ is called the product of the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and the vector $\begin{bmatrix} x \\ y \end{bmatrix}$.

Example 1.1.2. The system

$$\begin{cases} x + 3y = 6 \\ 2x + y = 1 \end{cases}$$

can be written as

$$\begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 6 \\ 1 \end{bmatrix},$$

where

$$\begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x + 3y \\ 2x + y \end{bmatrix}.$$

By a 1 × 2 matrix we mean a row $\begin{bmatrix} a_1 & a_2 \end{bmatrix}$ of two real numbers $a_1$ and $a_2$. As in the case of other matrices, we write $\begin{bmatrix} a_1 & a_2 \end{bmatrix} = \begin{bmatrix} b_1 & b_2 \end{bmatrix}$ if and only if $a_1 = b_1$ and $a_2 = b_2$.
The operation in (1.4) can be viewed as the result of combining two simpler operations. To this end we define the product of a 1 × 2 matrix $\begin{bmatrix} a_1 & a_2 \end{bmatrix}$ by a 2 × 1 matrix $\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$:

$$\begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = a_1 b_1 + a_2 b_2. \tag{1.5}$$

Example 1.1.3.

$$\begin{bmatrix} 5 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ -6 \end{bmatrix} = 5 \cdot 2 + 4 \cdot (-6) = -14.$$
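Computations like (1.5) are easy to check with a few lines of code. The following Python sketch is an illustration added here, not part of the book (whose computer examples use Maple):

```python
def row_times_col(row, col):
    # (1.5): [a1 a2][b1; b2] = a1*b1 + a2*b2
    a1, a2 = row
    b1, b2 = col
    return a1 * b1 + a2 * b2

# Example 1.1.3: [5 4] times [2; -6]
print(row_times_col([5, 4], [2, -6]))  # -14
```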

Using the operation defined in (1.5), the operation introduced in (1.4) can be written as

$$\begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \\[4pt] \begin{bmatrix} a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} a_1 b_1 + a_2 b_2 \\ a_3 b_1 + a_4 b_2 \end{bmatrix}.$$

This might look like a more complicated expression than (1.4), but it is actually a convenient way of interpreting the product of a 2 × 2 matrix and a 2 × 1 matrix and it will serve us well in more complicated situations considered later.

Example 1.1.4. We want to calculate

$$\begin{bmatrix} 4 & -9 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 6 \\ 1 \end{bmatrix}.$$

Since

$$\begin{bmatrix} 4 & -9 \end{bmatrix} \begin{bmatrix} 6 \\ 1 \end{bmatrix} = 15 \quad \text{and} \quad \begin{bmatrix} 3 & 2 \end{bmatrix} \begin{bmatrix} 6 \\ 1 \end{bmatrix} = 20,$$

we obtain

$$\begin{bmatrix} 4 & -9 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 6 \\ 1 \end{bmatrix} = \begin{bmatrix} 15 \\ 20 \end{bmatrix}.$$
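The row-by-row description of the matrix–vector product translates directly into code. This is an illustrative Python sketch (not from the book): each entry of the result is a row of the matrix times the column vector.

```python
def row_times_col(row, col):
    # (1.5): the product of a 1x2 and a 2x1 matrix
    return row[0] * col[0] + row[1] * col[1]

def mat_times_vec(m, v):
    # The product of a 2x2 matrix and a 2x1 vector, row by row as in the text.
    return [row_times_col(m[0], v), row_times_col(m[1], v)]

# Example 1.1.4
print(mat_times_vec([[4, -9], [3, 2]], [6, 1]))  # [15, 20]
```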

Using the operation defined in (1.5), we can also define the product of a 1 × 2 matrix $\begin{bmatrix} a_1 & a_2 \end{bmatrix}$ and a 2 × 2 matrix $\begin{bmatrix} b_1 & b_3 \\ b_2 & b_4 \end{bmatrix}$:

$$\begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 & b_3 \\ b_2 & b_4 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} & \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_3 \\ b_4 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} a_1 b_1 + a_2 b_2 & a_1 b_3 + a_2 b_4 \end{bmatrix}.$$

Example 1.1.5. To calculate

$$\begin{bmatrix} 7 & -1 \end{bmatrix} \begin{bmatrix} 2 & 4 \\ 8 & -2 \end{bmatrix}$$

we first find

$$\begin{bmatrix} 7 & -1 \end{bmatrix} \begin{bmatrix} 2 \\ 8 \end{bmatrix} = 6 \quad \text{and} \quad \begin{bmatrix} 7 & -1 \end{bmatrix} \begin{bmatrix} 4 \\ -2 \end{bmatrix} = 30.$$

Hence

$$\begin{bmatrix} 7 & -1 \end{bmatrix} \begin{bmatrix} 2 & 4 \\ 8 & -2 \end{bmatrix} = \begin{bmatrix} 6 & 30 \end{bmatrix}.$$

Finally we define the product of two 2 × 2 matrices, again using the operation defined in (1.5):

$$\begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_3 \end{bmatrix} & \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_2 \\ b_4 \end{bmatrix} \\[4pt] \begin{bmatrix} a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 \\ b_3 \end{bmatrix} & \begin{bmatrix} a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_2 \\ b_4 \end{bmatrix} \end{bmatrix}.$$

Note that the product of two 2 × 2 matrices can be equivalently expressed in one of the following three ways:

$$\begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \\[4pt] \begin{bmatrix} a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 \\ b_3 \end{bmatrix} & \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_2 \\ b_4 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} a_1 b_1 + a_2 b_3 & a_1 b_2 + a_2 b_4 \\ a_3 b_1 + a_4 b_3 & a_3 b_2 + a_4 b_4 \end{bmatrix}.$$

Example 1.1.6. We wish to calculate the product

$$\begin{bmatrix} 1 & 5 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 4 & -3 \\ -1 & 6 \end{bmatrix}.$$

We have

$$\begin{bmatrix} 1 & 5 \end{bmatrix} \begin{bmatrix} 4 \\ -1 \end{bmatrix} = -1, \quad \begin{bmatrix} 1 & 5 \end{bmatrix} \begin{bmatrix} -3 \\ 6 \end{bmatrix} = 27$$

and

$$\begin{bmatrix} 3 & 2 \end{bmatrix} \begin{bmatrix} 4 \\ -1 \end{bmatrix} = 10, \quad \begin{bmatrix} 3 & 2 \end{bmatrix} \begin{bmatrix} -3 \\ 6 \end{bmatrix} = 3.$$

This means that

$$\begin{bmatrix} 1 & 5 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 4 & -3 \\ -1 & 6 \end{bmatrix} = \begin{bmatrix} -1 & 27 \\ 10 & 3 \end{bmatrix}.$$
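As a check on Example 1.1.6, here is a short Python sketch (an added illustration, not part of the text) of the full 2 × 2 product: entry (i, j) of the result is row i of the first matrix times column j of the second.

```python
def mat_mul(a, b):
    # Entry (i, j) is row i of a times column j of b.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

print(mat_mul([[1, 5], [3, 2]], [[4, -3], [-1, 6]]))  # [[-1, 27], [10, 3]]
```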

It is important to remember that the product of matrices is not commutative, that is, the result usually depends on the order of the matrices.

Example 1.1.7. For the product

$$\begin{bmatrix} 4 & -3 \\ -1 & 6 \end{bmatrix} \begin{bmatrix} 1 & 5 \\ 3 & 2 \end{bmatrix}$$

we calculate

$$\begin{bmatrix} 4 & -3 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix} = -5, \quad \begin{bmatrix} 4 & -3 \end{bmatrix} \begin{bmatrix} 5 \\ 2 \end{bmatrix} = 14$$

and

$$\begin{bmatrix} -1 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix} = 17, \quad \begin{bmatrix} -1 & 6 \end{bmatrix} \begin{bmatrix} 5 \\ 2 \end{bmatrix} = 7.$$

This means that

$$\begin{bmatrix} 4 & -3 \\ -1 & 6 \end{bmatrix} \begin{bmatrix} 1 & 5 \\ 3 & 2 \end{bmatrix} = \begin{bmatrix} -5 & 14 \\ 17 & 7 \end{bmatrix},$$

while in the previous example we found that

$$\begin{bmatrix} 1 & 5 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 4 & -3 \\ -1 & 6 \end{bmatrix} = \begin{bmatrix} -1 & 27 \\ 10 & 3 \end{bmatrix}.$$

The results are completely different.
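The two products from Examples 1.1.6 and 1.1.7 can be compared mechanically. The sketch below (illustrative Python, not part of the text) confirms that reversing the order changes the result.

```python
def mat_mul(a, b):
    # 2x2 matrix product, entry by entry.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 5], [3, 2]]
B = [[4, -3], [-1, 6]]
print(mat_mul(A, B))  # [[-1, 27], [10, 3]]
print(mat_mul(B, A))  # [[-5, 14], [17, 7]]
print(mat_mul(A, B) == mat_mul(B, A))  # False: the order matters
```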

Now we consider products of three matrices. There are two ways we can calculate a product of three matrices, as the next example illustrates.

Example 1.1.8. Show that

$$\left( \begin{bmatrix} 2 & 3 \end{bmatrix} \begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix} \right) \begin{bmatrix} -8 \\ 2 \end{bmatrix} = \begin{bmatrix} 2 & 3 \end{bmatrix} \left( \begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -8 \\ 2 \end{bmatrix} \right).$$

Solution. First we calculate the product

$$\left( \begin{bmatrix} 2 & 3 \end{bmatrix} \begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix} \right) \begin{bmatrix} -8 \\ 2 \end{bmatrix}.$$

We find

$$\begin{bmatrix} 2 & 3 \end{bmatrix} \begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 7 \end{bmatrix}$$

and then

$$\begin{bmatrix} 1 & 7 \end{bmatrix} \begin{bmatrix} -8 \\ 2 \end{bmatrix} = 6.$$

Now we calculate

$$\begin{bmatrix} 2 & 3 \end{bmatrix} \left( \begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -8 \\ 2 \end{bmatrix} \right).$$

We find

$$\begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -8 \\ 2 \end{bmatrix} = \begin{bmatrix} 12 \\ -6 \end{bmatrix}$$

and then

$$\begin{bmatrix} 2 & 3 \end{bmatrix} \begin{bmatrix} 12 \\ -6 \end{bmatrix} = 6.$$
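Both groupings from Example 1.1.8 can be computed side by side. The Python sketch below (an added illustration, not part of the book) evaluates (aM)c and a(Mc) and gets the same number both ways.

```python
def row_times_col(row, col):
    return row[0] * col[0] + row[1] * col[1]

def row_times_mat(row, m):
    # A 1x2 matrix times a 2x2 matrix, computed column by column.
    return [row[0] * m[0][0] + row[1] * m[1][0],
            row[0] * m[0][1] + row[1] * m[1][1]]

def mat_times_vec(m, v):
    return [row_times_col(m[0], v), row_times_col(m[1], v)]

a, m, c = [2, 3], [[-1, 2], [1, 1]], [-8, 2]
left = row_times_col(row_times_mat(a, m), c)   # ([2 3] M) c
right = row_times_col(a, mat_times_vec(m, c))  # [2 3] (M c)
print(left, right)  # 6 6
```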

In the above example we get the same result regardless of the way the product is
calculated. This is always true as the next theorem shows.

Theorem 1.1.9. For any numbers $a_1, a_2, b_1, b_2, b_3, b_4, c_1, c_2$ we have

$$\left( \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \right) \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} \right).$$

Proof. The equality can be verified by simply calculating the products on both sides and comparing the results. On the left-hand side we have

$$\begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} = \begin{bmatrix} a_1 b_1 + a_2 b_3 & a_1 b_2 + a_2 b_4 \end{bmatrix}$$

and

$$\begin{bmatrix} a_1 b_1 + a_2 b_3 & a_1 b_2 + a_2 b_4 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = a_1 b_1 c_1 + a_2 b_3 c_1 + a_1 b_2 c_2 + a_2 b_4 c_2,$$

so

$$\left( \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \right) \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = a_1 b_1 c_1 + a_2 b_3 c_1 + a_1 b_2 c_2 + a_2 b_4 c_2.$$

We obtain the same result if we calculate

$$\begin{bmatrix} a_1 & a_2 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} \right).$$

The calculations are left as an exercise.

The result in the above theorem is an example of associativity of matrix multiplication. It is an important property of matrix multiplication and it allows us to write the product

$$\begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$$

without parentheses. In the next theorem we prove the associativity property for the product of three 2 × 2 matrices.

Theorem 1.1.10. For any numbers $a_1, a_2, a_3, a_4, b_1, b_2, b_3, b_4, c_1, c_2, c_3, c_4$ we have

$$\left( \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \right) \begin{bmatrix} c_1 & c_2 \\ c_3 & c_4 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 & c_2 \\ c_3 & c_4 \end{bmatrix} \right).$$

Proof. The equality can be verified by calculating the products on both sides and comparing the results. However, such an approach would lead to rather tedious calculations. We can significantly simplify our proof by employing Theorem 1.1.9.
First we observe that

$$\begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \\[4pt] \begin{bmatrix} a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \end{bmatrix}$$

and consequently

$$\left( \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \right) \begin{bmatrix} c_1 & c_2 \\ c_3 & c_4 \end{bmatrix} = \begin{bmatrix} \left( \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \right) \begin{bmatrix} c_1 \\ c_3 \end{bmatrix} & \left( \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \right) \begin{bmatrix} c_2 \\ c_4 \end{bmatrix} \\[4pt] \left( \begin{bmatrix} a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \right) \begin{bmatrix} c_1 \\ c_3 \end{bmatrix} & \left( \begin{bmatrix} a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \right) \begin{bmatrix} c_2 \\ c_4 \end{bmatrix} \end{bmatrix}.$$

Similarly,

$$\begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 & c_2 \\ c_3 & c_4 \end{bmatrix} \right) = \begin{bmatrix} \begin{bmatrix} a_1 & a_2 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 \\ c_3 \end{bmatrix} \right) & \begin{bmatrix} a_1 & a_2 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_2 \\ c_4 \end{bmatrix} \right) \\[4pt] \begin{bmatrix} a_3 & a_4 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 \\ c_3 \end{bmatrix} \right) & \begin{bmatrix} a_3 & a_4 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_2 \\ c_4 \end{bmatrix} \right) \end{bmatrix}.$$

According to Theorem 1.1.9, the two matrices on the right-hand side are equal.

We can see that matrix multiplication shares some properties with number multiplication, like associativity, but there are also some significant differences. For example, matrix multiplication is not commutative. The number one plays a very special role in number multiplication, namely, $1 \cdot a = a \cdot 1 = a$ for any real number $a$. It turns out that there is a matrix that plays the same role in matrix multiplication.

Theorem 1.1.11. For any numbers $a, b, c, d$ we have

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.$$

Proof. The equalities can be verified by direct calculations.

Besides matrix multiplication we will use addition of matrices of the same size. To add two matrices we simply add the corresponding entries of the matrices:

$$\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \end{bmatrix}$$

$$\begin{bmatrix} a_1 & a_2 \end{bmatrix} + \begin{bmatrix} b_1 & b_2 \end{bmatrix} = \begin{bmatrix} a_1 + b_1 & a_2 + b_2 \end{bmatrix}$$

$$\begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} + \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} = \begin{bmatrix} a_1 + b_1 & a_2 + b_2 \\ a_3 + b_3 & a_4 + b_4 \end{bmatrix}$$

We will also multiply matrices by real numbers. To multiply a matrix by a real number $t$ we multiply every entry of that matrix by $t$:

$$t \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} t a_1 \\ t a_2 \end{bmatrix}$$

$$t \begin{bmatrix} a_1 & a_2 \end{bmatrix} = \begin{bmatrix} t a_1 & t a_2 \end{bmatrix}$$

$$t \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} = \begin{bmatrix} t a_1 & t a_2 \\ t a_3 & t a_4 \end{bmatrix}$$
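Addition and multiplication by a number act entrywise, which makes them one-liners in code. An illustrative Python sketch (not from the book):

```python
def mat_add(a, b):
    # Add two 2x2 matrices entry by entry.
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_scale(t, a):
    # Multiply every entry of a 2x2 matrix by the number t.
    return [[t * a[i][j] for j in range(2)] for i in range(2)]

print(mat_add([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[6, 8], [10, 12]]
print(mat_scale(3, [[1, 2], [3, 4]]))               # [[3, 6], [9, 12]]
```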

1.1.1 Exercises

Find the products of the given matrices.

1. $\begin{bmatrix} 3 & -2 \end{bmatrix} \begin{bmatrix} 4 \\ 5 \end{bmatrix}$

2. $\begin{bmatrix} 5 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix}$

3. $\begin{bmatrix} 2 & -3 \end{bmatrix} \begin{bmatrix} 5 & 2 \\ 3 & -4 \end{bmatrix}$

4. $\begin{bmatrix} 1 & 7 \end{bmatrix} \begin{bmatrix} 2 & -2 \\ 4 & 3 \end{bmatrix}$

5. $\begin{bmatrix} 7 & -1 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ -5 \end{bmatrix}$

6. $\begin{bmatrix} 3 & 5 \\ -2 & 8 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix}$

7. $\begin{bmatrix} 4 & 1 \\ 5 & -1 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 5 & 9 \end{bmatrix}$

8. $\begin{bmatrix} 1 & 1 \\ -7 & 3 \end{bmatrix} \begin{bmatrix} 3 & -1 \\ 4 & 1 \end{bmatrix}$

9. $\begin{bmatrix} 2 & 1 \\ 5 & 9 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 5 & -1 \end{bmatrix}$

10. $\begin{bmatrix} 3 & -1 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -7 & 3 \end{bmatrix}$

11. $\begin{bmatrix} 7 & -2 \\ 5 & 3 \end{bmatrix} \begin{bmatrix} p & q \\ r & s \end{bmatrix}$

12. $\begin{bmatrix} 3 & 4 \\ 8 & 1 \end{bmatrix} \begin{bmatrix} p & q \\ r & s \end{bmatrix}$

13. $\begin{bmatrix} p & q \\ r & s \end{bmatrix} \begin{bmatrix} 7 & -2 \\ 5 & 3 \end{bmatrix}$

14. $\begin{bmatrix} p & q \\ r & s \end{bmatrix} \begin{bmatrix} 3 & 4 \\ 8 & 1 \end{bmatrix}$

15. Show by direct calculations that

$$\begin{bmatrix} a_1 & a_2 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} \right) = a_1 b_1 c_1 + a_2 b_3 c_1 + a_1 b_2 c_2 + a_2 b_4 c_2.$$

16. Show that the product

$$\begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 & c_2 \\ c_3 & c_4 \end{bmatrix} \right)$$

can be written in the form

$$\begin{bmatrix} \begin{bmatrix} a_1 & a_2 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 \\ c_3 \end{bmatrix} \right) & \begin{bmatrix} a_1 & a_2 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_2 \\ c_4 \end{bmatrix} \right) \\[4pt] \begin{bmatrix} a_3 & a_4 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_1 \\ c_3 \end{bmatrix} \right) & \begin{bmatrix} a_3 & a_4 \end{bmatrix} \left( \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} c_2 \\ c_4 \end{bmatrix} \right) \end{bmatrix}.$$

17. Show that

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.$$

18. Show that

$$\left( s \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \right) \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \left( s \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \right) = s \left( \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \right).$$

19. Show that

$$\left( s \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \right) \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \left( s \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \right) = s \left( \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \right).$$

20. Show that if $A$ is a 2 × 2 matrix and $B$ and $C$ are 2 × 1 vectors, then $A(B + C) = AB + AC$.

21. Show that if $A$, $B$, and $C$ are 2 × 2 matrices, then $A(B + C) = AB + AC$.

22. Show that if $A$ and $B$ are 2 × 2 matrices and $C$ is a 2 × 1 vector, then $(A + B)C = AC + BC$.

23. Show that if $A$, $B$, and $C$ are 2 × 2 matrices, then $(A + B)C = AC + BC$.

1.2 Inverse matrices

When solving a linear equation $ax = b$, with $a \neq 0$, we multiply both sides of the equation by $\frac{1}{a}$ to obtain the solution $x = \frac{b}{a}$. We are now going to describe a generalization of this idea to matrix equations of the form $A\mathbf{x} = \mathbf{b}$.

Definition 1.2.1. If

$$\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \tag{1.6}$$

then the matrix $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$ is called an inverse of the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$. A matrix that has an inverse is called an invertible matrix.

Note that, if $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$ is an inverse matrix of $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$, then $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is an inverse matrix of $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$. If (1.6) holds, we can say that the matrices $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$ and $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ are inverses of each other.

Example 1.2.2. Since

$$\begin{bmatrix} 2 & 0 \\ 0 & \frac{1}{7} \end{bmatrix} \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 7 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 7 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & \frac{1}{7} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$

the matrices

$$\begin{bmatrix} 2 & 0 \\ 0 & \frac{1}{7} \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 7 \end{bmatrix}$$

are inverses of each other.

Example 1.2.3. Since

$$\begin{bmatrix} 1 & 5 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & -5 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & -5 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 5 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$

the matrices

$$\begin{bmatrix} 1 & 5 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & -5 \\ 0 & 1 \end{bmatrix}$$

are inverses of each other.

Example 1.2.4. Since

$$\begin{bmatrix} 6 & 8 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} \frac{3}{2} & -4 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} \frac{3}{2} & -4 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} 6 & 8 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$

the matrices

$$\begin{bmatrix} 6 & 8 \\ 2 & 3 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} \frac{3}{2} & -4 \\ -1 & 3 \end{bmatrix}$$

are inverses of each other.
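Claims like the one in Example 1.2.4 are easy to verify numerically. In the Python sketch below (added for illustration; 3/2 is written as the exact float 1.5) both products come out to the identity matrix.

```python
def mat_mul(a, b):
    # 2x2 matrix product, entry by entry.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

I = [[1, 0], [0, 1]]
A = [[6, 8], [2, 3]]
B = [[1.5, -4], [-1, 3]]  # the claimed inverse of A from Example 1.2.4

print(mat_mul(A, B) == I and mat_mul(B, A) == I)  # True
```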

Theorem 1.2.5. If a matrix has an inverse, then that inverse is unique.

Proof. We need to show that, if

$$\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and

$$\begin{bmatrix} s & t \\ u & v \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} s & t \\ u & v \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$

then

$$\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} s & t \\ u & v \end{bmatrix}.$$

Indeed, from the above assumptions and Theorem 1.1.11, we have

$$\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} s & t \\ u & v \end{bmatrix} \right) = \left( \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) \begin{bmatrix} s & t \\ u & v \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} s & t \\ u & v \end{bmatrix} = \begin{bmatrix} s & t \\ u & v \end{bmatrix}.$$

The inverse of a matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ will be denoted $\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}$. With the aid of inverse matrices we can easily solve matrix equations.

Theorem 1.2.6. If the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible, then the equation

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} e \\ f \end{bmatrix} \tag{1.7}$$

has a unique solution, which is

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \begin{bmatrix} e \\ f \end{bmatrix}. \tag{1.8}$$

Proof. First we show that the numbers $x$ and $y$ defined by (1.8) satisfy equation (1.7). Indeed, from

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \begin{bmatrix} e \\ f \end{bmatrix}$$

we obtain

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \begin{bmatrix} e \\ f \end{bmatrix} \right) = \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \right) \begin{bmatrix} e \\ f \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} e \\ f \end{bmatrix} = \begin{bmatrix} e \\ f \end{bmatrix}.$$

Now suppose that we have

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} e \\ f \end{bmatrix}.$$

Then

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \begin{bmatrix} e \\ f \end{bmatrix}.$$

Example 1.2.7. Solve the system of equations

$$\begin{cases} 6x + 8y = 3 \\ 2x + 3y = 2 \end{cases}$$

Solution. The above system can be written as a matrix equation

$$\begin{bmatrix} 6 & 8 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}.$$

In Example 1.2.4 we found that

$$\begin{bmatrix} 6 & 8 \\ 2 & 3 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{3}{2} & -4 \\ -1 & 3 \end{bmatrix}.$$

Consequently, the unique solution is

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{3}{2} & -4 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} -\frac{7}{2} \\ 3 \end{bmatrix},$$

that is, $x = -\frac{7}{2}$ and $y = 3$.

Note that once we know that $\begin{bmatrix} 6 & 8 \\ 2 & 3 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{3}{2} & -4 \\ -1 & 3 \end{bmatrix}$, all we need to solve the system

$$\begin{cases} 6x + 8y = 1 \\ 2x + 3y = 5 \end{cases}$$

is to calculate the product $\begin{bmatrix} \frac{3}{2} & -4 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 5 \end{bmatrix} = \begin{bmatrix} -\frac{37}{2} \\ 14 \end{bmatrix}$. The solution is $x = -\frac{37}{2}$ and $y = 14$.
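Once the inverse is known, each new right-hand side costs only one matrix–vector product. The Python sketch below (illustrative, with the fractions written as exact decimals) reproduces both solutions above.

```python
def mat_times_vec(m, v):
    # The product of a 2x2 matrix and a 2x1 vector.
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

inv = [[1.5, -4], [-1, 3]]  # inverse of [[6, 8], [2, 3]], from Example 1.2.4

# 6x + 8y = 3, 2x + 3y = 2
print(mat_times_vec(inv, [3, 2]))  # [-3.5, 3]   i.e. x = -7/2, y = 3
# 6x + 8y = 1, 2x + 3y = 5
print(mat_times_vec(inv, [1, 5]))  # [-18.5, 14] i.e. x = -37/2, y = 14
```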
This indicates that being able to decide whether a matrix is invertible, and to find the inverse of an invertible matrix, is important. We will consider different ways these problems can be solved. The first one uses elementary matrices.

Elementary matrices

Definition 1.2.8. A 2 × 2 matrix is called an elementary matrix if it has the form of one of the following matrices

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} s & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 0 & s \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix},$$

where $s$ and $t$ are arbitrary numbers with $s \neq 0$.

The product of an elementary matrix and an arbitrary matrix behaves in a predictable fashion:

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} c & d \\ a & b \end{bmatrix}$$

$$\begin{bmatrix} s & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} sa & sb \\ c & d \end{bmatrix}$$

$$\begin{bmatrix} 1 & 0 \\ 0 & s \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ sc & sd \end{bmatrix}$$

$$\begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ c + ta & d + tb \end{bmatrix}$$

$$\begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a + tc & b + td \\ c & d \end{bmatrix}$$

We will use elementary matrices to find inverse matrices. The process will usually require several multiplications by elementary matrices. To be able to do it quickly and correctly you should know the above identities very well.

Example 1.2.9.

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 2 & 7 \end{bmatrix} = \begin{bmatrix} 2 & 7 \\ 3 & 1 \end{bmatrix}$$

$$\begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 2 & 7 \end{bmatrix} = \begin{bmatrix} 15 & 5 \\ 2 & 7 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 0 \\ 0 & 5 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 2 & 7 \end{bmatrix} = \begin{bmatrix} 3 & 1 \\ 10 & 35 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 0 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 2 & 7 \end{bmatrix} = \begin{bmatrix} 3 & 1 \\ 14 & 11 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 4 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 2 & 7 \end{bmatrix} = \begin{bmatrix} 11 & 29 \\ 2 & 7 \end{bmatrix}$$

Example 1.2.10. Calculate the product

$$\begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}.$$

Solution.

$$\begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 22 & 8 \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 8 & 2 \\ 22 & 8 \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 22 & 8 \\ 8 & 2 \end{bmatrix} = \begin{bmatrix} 46 & 14 \\ 8 & 2 \end{bmatrix}$$
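The step-by-step evaluation above, working from the right, can be mirrored in code. An illustrative Python sketch (not part of the text):

```python
def mat_mul(a, b):
    # 2x2 matrix product, entry by entry.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

factors = [[[1, 3], [0, 1]],
           [[0, 1], [1, 0]],
           [[2, 0], [0, 1]],
           [[1, 0], [5, 1]],
           [[4, 1], [2, 3]]]

# Multiply from the right, as in the worked solution.
product = factors[-1]
for m in reversed(factors[:-1]):
    product = mat_mul(m, product)
print(product)  # [[46, 14], [8, 2]]
```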

Elementary matrices are invertible and their inverses are elementary matrices:

Theorem 1.2.11.

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}^{-1} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

$$\begin{bmatrix} s & 0 \\ 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{1}{s} & 0 \\ 0 & 1 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 0 \\ 0 & s \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{s} \end{bmatrix}$$

$$\begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 \\ -t & 1 \end{bmatrix}$$

$$\begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & -t \\ 0 & 1 \end{bmatrix}$$

Proof. We have

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$$\begin{bmatrix} s & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{s} & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{s} & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} s & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 0 \\ 0 & s \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{s} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{s} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & s \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -t & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -t & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ t & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$$\begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & -t \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & -t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

If the matrices $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$ are invertible, then the product $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$ is an invertible matrix and we have

$$\left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \right)^{-1} = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}^{-1} \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}.$$

Indeed, we have

$$\left( \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}^{-1} \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \right) \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \right) = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}^{-1} \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}^{-1} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}^{-1} \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

A similar argument shows that

$$\left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \right) \left( \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}^{-1} \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \right) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

This property is not limited to two matrices.

Theorem 1.2.12. The product $A_1 \cdots A_m$ of invertible $2 \times 2$ matrices is invertible and we have
\[
(A_1 \cdots A_m)^{-1} = A_m^{-1} \cdots A_1^{-1}.
\]

Proof. The proof follows from the argument for two matrices presented above.

From Theorem 1.2.12 and the fact that elementary matrices are invertible it follows that every matrix that is a product of elementary matrices is invertible and the inverse is a product of elementary matrices.

Example 1.2.13. Write the matrix
\[
\left( \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix} \right)^{-1}
\]
as a product of elementary matrices.

Solution.
\begin{align*}
\left( \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix} \right)^{-1}
&= \begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}^{-1}\begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix}^{-1} \\
&= \begin{bmatrix} 1 & 0 \\ -5 & 1 \end{bmatrix}\begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & -3 \\ 0 & 1 \end{bmatrix}
\end{align*}

The following is the first theorem that addresses the question of invertibility of $2 \times 2$ matrices.

Theorem 1.2.14. For an arbitrary $2 \times 2$ matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ the following conditions are equivalent:

(i) The matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible;

(ii) The only solution of the equation $\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ is the trivial solution $x = 0$ and $y = 0$;

(iii) There are elementary matrices $E_1, \ldots, E_m$ such that
\[
E_1 \cdots E_m \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]

Proof. If the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible and we have
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\]
then
\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]
This proves that (i) implies (ii).

Now assume that (ii) holds. If $a = 0$ and $c = 0$, then
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\]
contradicting (ii). Thus (ii) implies that $a \neq 0$ or $c \neq 0$.

If $a \neq 0$, then
\[
\begin{bmatrix} \frac{1}{a} & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & \frac{b}{a} \\ c & d \end{bmatrix}
\]
and
\[
\begin{bmatrix} 1 & 0 \\ -c & 1 \end{bmatrix}\begin{bmatrix} 1 & \frac{b}{a} \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & \frac{b}{a} \\ 0 & d - \frac{bc}{a} \end{bmatrix} = \begin{bmatrix} 1 & \frac{b}{a} \\ 0 & \frac{ad-bc}{a} \end{bmatrix}.
\]
We must have $ad - bc \neq 0$ because, if $ad - bc = 0$, then we would have
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} -\frac{b}{a} \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\]
contradicting (ii). Since
\[
\begin{bmatrix} 1 & 0 \\ 0 & \frac{a}{ad-bc} \end{bmatrix}\begin{bmatrix} 1 & \frac{b}{a} \\ 0 & \frac{ad-bc}{a} \end{bmatrix} = \begin{bmatrix} 1 & \frac{b}{a} \\ 0 & 1 \end{bmatrix}
\]
and
\[
\begin{bmatrix} 1 & -\frac{b}{a} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & \frac{b}{a} \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\]
we have
\[
\begin{bmatrix} 1 & -\frac{b}{a} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & \frac{a}{ad-bc} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -c & 1 \end{bmatrix}\begin{bmatrix} \frac{1}{a} & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \tag{1.9}
\]
This proves that (ii) implies (iii) when $a \neq 0$.

If $a = 0$, then $c \neq 0$. In this case we use the equality
\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} c & d \\ a & b \end{bmatrix},
\]
and apply the proof given in the case $a \neq 0$ to the matrix $\begin{bmatrix} c & d \\ a & b \end{bmatrix}$. Thus (ii) implies (iii).

Now we show that (iii) implies (i). If
\[
E_1 \cdots E_m \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\]

then
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} = (E_1 \cdots E_m)^{-1} E_1 \cdots E_m \begin{bmatrix} a & b \\ c & d \end{bmatrix} = (E_1 \cdots E_m)^{-1}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = (E_1 \cdots E_m)^{-1},
\]
which gives us
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} = (E_1 \cdots E_m)^{-1} = E_m^{-1} \cdots E_1^{-1}.
\]
Now it is easy to see that
\[
E_1 \cdots E_m \begin{bmatrix} a & b \\ c & d \end{bmatrix} = E_1 \cdots E_m E_m^{-1} \cdots E_1^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
and
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} E_1 \cdots E_m = E_m^{-1} \cdots E_1^{-1} E_1 \cdots E_m = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]
This means that the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible and
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = E_1 \cdots E_m.
\]

The above theorem suggests a method for calculating the inverse of an arbitrary invertible $2 \times 2$ matrix.

Corollary 1.2.15. If
\[
E_1 \cdots E_m \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\]
where $E_1, \ldots, E_m$ are elementary matrices, then
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = E_1 \cdots E_m \quad \text{and} \quad \begin{bmatrix} a & b \\ c & d \end{bmatrix} = E_m^{-1} \cdots E_1^{-1}.
\]

Example 1.2.16. Write the matrix $\begin{bmatrix} 2 & 3 \\ 5 & 4 \end{bmatrix}$ and its inverse as products of elementary matrices and find the inverse of the matrix.

Solution. Since
\[
\begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & 3 \\ 5 & 4 \end{bmatrix} = \begin{bmatrix} 1 & \frac{3}{2} \\ 5 & 4 \end{bmatrix},
\]
\[
\begin{bmatrix} 1 & 0 \\ -5 & 1 \end{bmatrix}\begin{bmatrix} 1 & \frac{3}{2} \\ 5 & 4 \end{bmatrix} = \begin{bmatrix} 1 & \frac{3}{2} \\ 0 & -\frac{7}{2} \end{bmatrix},
\]
\[
\begin{bmatrix} 1 & 0 \\ 0 & -\frac{2}{7} \end{bmatrix}\begin{bmatrix} 1 & \frac{3}{2} \\ 0 & -\frac{7}{2} \end{bmatrix} = \begin{bmatrix} 1 & \frac{3}{2} \\ 0 & 1 \end{bmatrix},
\]
\[
\begin{bmatrix} 1 & -\frac{3}{2} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & \frac{3}{2} \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\]
the inverse of the matrix $\begin{bmatrix} 2 & 3 \\ 5 & 4 \end{bmatrix}$ is
\begin{align*}
\begin{bmatrix} 2 & 3 \\ 5 & 4 \end{bmatrix}^{-1}
&= \begin{bmatrix} 1 & -\frac{3}{2} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -\frac{2}{7} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -5 & 1 \end{bmatrix}\begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & -\frac{3}{2} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -\frac{2}{7} \end{bmatrix}\begin{bmatrix} \frac{1}{2} & 0 \\ -\frac{5}{2} & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & -\frac{3}{2} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} \frac{1}{2} & 0 \\ \frac{5}{7} & -\frac{2}{7} \end{bmatrix} \\
&= \begin{bmatrix} \frac{1}{2} - \frac{15}{14} & -\frac{3}{2}\left(-\frac{2}{7}\right) \\ \frac{5}{7} & -\frac{2}{7} \end{bmatrix} \\
&= \begin{bmatrix} -\frac{4}{7} & \frac{3}{7} \\ \frac{5}{7} & -\frac{2}{7} \end{bmatrix}
\end{align*}
and
\begin{align*}
\begin{bmatrix} 2 & 3 \\ 5 & 4 \end{bmatrix}
&= \left( \begin{bmatrix} 1 & -\frac{3}{2} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -\frac{2}{7} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -5 & 1 \end{bmatrix}\begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{bmatrix} \right)^{-1} \\
&= \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 1 & 0 \\ -5 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 1 & 0 \\ 0 & -\frac{2}{7} \end{bmatrix}^{-1}\begin{bmatrix} 1 & -\frac{3}{2} \\ 0 & 1 \end{bmatrix}^{-1} \\
&= \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -\frac{7}{2} \end{bmatrix}\begin{bmatrix} 1 & \frac{3}{2} \\ 0 & 1 \end{bmatrix}.
\end{align*}
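The procedure of Example 1.2.16 can be automated: reduce the matrix to the identity by elementary row operations and apply the same operations to the identity matrix. The sketch below (plain Python with exact fractions; the function and helper names are ours) follows the same pivot path as the example and assumes the $(1,1)$ entry is nonzero.

```python
# A sketch of the inversion procedure from Corollary 1.2.15 / Example 1.2.16.
from fractions import Fraction as F

def inverse_2x2(M):
    """Invert a 2x2 matrix, assuming the (1,1) pivot path of Example 1.2.16 works."""
    A = [[F(x) for x in row] for row in M]
    E = [[F(1), F(0)], [F(0), F(1)]]      # accumulates the elementary operations
    def scale(i, s):                       # multiply row i by s
        A[i] = [s * x for x in A[i]]
        E[i] = [s * x for x in E[i]]
    def add_multiple(i, j, t):             # row i += t * row j
        A[i] = [x + t * y for x, y in zip(A[i], A[j])]
        E[i] = [x + t * y for x, y in zip(E[i], E[j])]
    scale(0, 1 / A[0][0])                  # make the (1,1) entry 1
    add_multiple(1, 0, -A[1][0])           # clear the (2,1) entry
    scale(1, 1 / A[1][1])                  # make the (2,2) entry 1
    add_multiple(0, 1, -A[0][1])           # clear the (1,2) entry
    return E

assert inverse_2x2([[2, 3], [5, 4]]) == [[F(-4, 7), F(3, 7)], [F(5, 7), F(-2, 7)]]
```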

In the next example we use what we learned in this section to solve a system of
linear equations.

Example 1.2.17. Solve the system of equations
\[
\begin{cases} 2x + 3y = 1 \\ 5x + y = 2 \end{cases}
\]
using elementary matrices.

Solution. The system can be written as a matrix equation:
\[
\begin{bmatrix} 2 & 3 \\ 5 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.
\]
Now we solve the equation by multiplying both sides of the equation by appropriately chosen elementary matrices. Multiplying by $\begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 1 \end{bmatrix}$ gives
\[
\begin{bmatrix} 1 & \frac{3}{2} \\ 5 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ 2 \end{bmatrix},
\]
then multiplying by $\begin{bmatrix} 1 & 0 \\ -5 & 1 \end{bmatrix}$ gives
\[
\begin{bmatrix} 1 & \frac{3}{2} \\ 0 & -\frac{13}{2} \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \end{bmatrix},
\]
then multiplying by $\begin{bmatrix} 1 & 0 \\ 0 & -\frac{2}{13} \end{bmatrix}$ gives
\[
\begin{bmatrix} 1 & \frac{3}{2} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{13} \end{bmatrix},
\]
and finally multiplying by $\begin{bmatrix} 1 & -\frac{3}{2} \\ 0 & 1 \end{bmatrix}$ gives
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{5}{13} \\ \frac{1}{13} \end{bmatrix}.
\]
The solution is $x = \frac{5}{13}$ and $y = \frac{1}{13}$.
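The same sequence of row operations can be carried out on the augmented data directly. The sketch below (plain Python with exact fractions; variable names are ours) repeats the four steps of the example on the coefficient matrix and the right-hand side together.

```python
# Example 1.2.17 carried out step by step with exact arithmetic.
from fractions import Fraction as F

A = [[F(2), F(3)], [F(5), F(1)]]
b = [F(1), F(2)]

# Multiply row 1 by 1/2.
s = 1 / A[0][0]
A[0] = [s * x for x in A[0]]; b[0] = s * b[0]
# Subtract 5 times row 1 from row 2.
t = -A[1][0]
A[1] = [x + t * y for x, y in zip(A[1], A[0])]; b[1] = b[1] + t * b[0]
# Multiply row 2 by -2/13.
s = 1 / A[1][1]
A[1] = [s * x for x in A[1]]; b[1] = s * b[1]
# Subtract 3/2 times row 2 from row 1.
t = -A[0][1]
A[0] = [x + t * y for x, y in zip(A[0], A[1])]; b[0] = b[0] + t * b[1]

assert b == [F(5, 13), F(1, 13)]   # x = 5/13, y = 1/13, as in the example
```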

In the definition of an invertible $2 \times 2$ matrix we require that
\[
\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]
This seems to imply that it is necessary to verify that both equalities
\[
\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
hold. Actually, as the next theorem shows, if we verify one of these equalities, then the other one follows.

Theorem 1.2.18. For an arbitrary $2 \times 2$ matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ the following conditions are equivalent:

(i) The matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible;

(ii) There is a matrix $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$ such that $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$;

(iii) There is a matrix $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$ such that $\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.

Proof. Clearly (i) implies (ii) and (iii).

If (ii) holds, then the equality
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
gives us
\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]
This implies, by Theorem 1.2.14, that the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible. This shows that (ii) implies (i).

Similarly, if (iii) holds, then the equality
\[
\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
gives us
\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]
This implies, by Theorem 1.2.14, that the matrix $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$ is invertible. Now, the equality
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
implies that
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}^{-1}.
\]
This means that
\[
\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]
Consequently, the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible, completing the proof that (iii) implies (i).

LU-decomposition of 2 × 2 matrices

Now we present a representation of $2 \times 2$ matrices in a form that is useful in applications.

Definition 1.2.19. A matrix of the form $\begin{bmatrix} a & 0 \\ b & c \end{bmatrix}$ is called a lower triangular matrix. A matrix of the form $\begin{bmatrix} a & b \\ 0 & c \end{bmatrix}$ is called an upper triangular matrix.

Lower triangular and upper triangular matrices are used in the so-called LU-decomposition of matrices.

Definition 1.2.20. By an LU-decomposition (or an LU-factorization) of a $2 \times 2$ matrix $A$ we mean the representation of $A$ in the form
\[
A = LU
\]
where $U$ is an upper triangular matrix and $L$ is a lower triangular matrix with every entry on the main diagonal equal to 1.

An LU-decomposition of a $2 \times 2$ matrix will have the form
\[
\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ l & 1 \end{bmatrix}\begin{bmatrix} u_1 & u_2 \\ 0 & u_3 \end{bmatrix}.
\]

When finding an LU-decomposition of a $2 \times 2$ matrix it is useful to note that
\[
\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ l & 1 \end{bmatrix}\begin{bmatrix} u_1 & u_2 \\ 0 & u_3 \end{bmatrix} \quad \text{if and only if} \quad \begin{bmatrix} 1 & 0 \\ -l & 1 \end{bmatrix}\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} = \begin{bmatrix} u_1 & u_2 \\ 0 & u_3 \end{bmatrix},
\]
because $\begin{bmatrix} 1 & 0 \\ l & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 \\ -l & 1 \end{bmatrix}$.

Example 1.2.21. Find an LU-decomposition of the matrix $\begin{bmatrix} 2 & 7 \\ 5 & 3 \end{bmatrix}$.

Solution. Since
\[
\begin{bmatrix} 1 & 0 \\ -\frac{5}{2} & 1 \end{bmatrix}\begin{bmatrix} 2 & 7 \\ 5 & 3 \end{bmatrix} = \begin{bmatrix} 2 & 7 \\ 0 & -\frac{29}{2} \end{bmatrix},
\]
we have
\[
\begin{bmatrix} 2 & 7 \\ 5 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \frac{5}{2} & 1 \end{bmatrix}\begin{bmatrix} 2 & 7 \\ 0 & -\frac{29}{2} \end{bmatrix}.
\]

Not every $2 \times 2$ matrix has an LU-decomposition.

Example 1.2.22. Show that the matrix $\begin{bmatrix} 0 & 1 \\ 4 & 3 \end{bmatrix}$ has no LU-decomposition.

Solution. Suppose, to the contrary, that the matrix has an LU-decomposition, that is, there are numbers $l$, $u_1$, $u_2$, and $u_3$ such that
\[
\begin{bmatrix} 0 & 1 \\ 4 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ l & 1 \end{bmatrix}\begin{bmatrix} u_1 & u_2 \\ 0 & u_3 \end{bmatrix}.
\]
Since
\[
\begin{bmatrix} 1 & 0 \\ l & 1 \end{bmatrix}\begin{bmatrix} u_1 & u_2 \\ 0 & u_3 \end{bmatrix} = \begin{bmatrix} u_1 & u_2 \\ l u_1 & l u_2 + u_3 \end{bmatrix},
\]
we must have $u_1 = 0$. But then
\[
\begin{bmatrix} 0 & 1 \\ 4 & 3 \end{bmatrix} = \begin{bmatrix} 0 & u_2 \\ 0 & l u_2 + u_3 \end{bmatrix},
\]
which is not true.

Example 1.2.23. Let $A = \begin{bmatrix} p & q \\ r & s \end{bmatrix}$. Assuming that $p \neq 0$, find an LU-decomposition of $A$.

Solution. Since
\[
\begin{bmatrix} 1 & 0 \\ -\frac{r}{p} & 1 \end{bmatrix}\begin{bmatrix} p & q \\ r & s \end{bmatrix} = \begin{bmatrix} p & q \\ 0 & s - \frac{qr}{p} \end{bmatrix},
\]
we have
\[
\begin{bmatrix} p & q \\ r & s \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \frac{r}{p} & 1 \end{bmatrix}\begin{bmatrix} p & q \\ 0 & s - \frac{qr}{p} \end{bmatrix}.
\]
The above shows that a matrix $\begin{bmatrix} p & q \\ r & s \end{bmatrix}$ has an LU-decomposition as long as $p \neq 0$.
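The formula of Example 1.2.23 is short enough to code directly. The sketch below (plain Python with exact fractions; the function name is ours) computes $L$ and $U$ for any matrix with $p \neq 0$.

```python
# A sketch of the LU-decomposition formula from Example 1.2.23.
from fractions import Fraction as F

def lu_2x2(p, q, r, s):
    """Return (L, U) with [[p, q], [r, s]] = L U, assuming p != 0."""
    if p == 0:
        raise ValueError("no LU-decomposition when p = 0")
    l = F(r, 1) / p
    L = [[F(1), F(0)], [l, F(1)]]
    U = [[F(p), F(q)], [F(0), F(s) - F(q) * l]]   # u3 = s - qr/p
    return L, U

L, U = lu_2x2(2, 7, 5, 3)
assert L == [[1, 0], [F(5, 2), 1]]
assert U == [[2, 7], [0, F(-29, 2)]]   # matches Example 1.2.21
```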

The next example illustrates how we can use LU-decomposition to solve systems
of linear equations.

Example 1.2.24. Use LU-decomposition to solve the system
\[
\begin{cases} 2x_1 + x_2 = b_1 \\ 5x_1 + 2x_2 = b_2 \end{cases}.
\]

Solution. The system can be written as a matrix equation
\[
\begin{bmatrix} 2 & 1 \\ 5 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}. \tag{1.10}
\]
Since
\[
\begin{bmatrix} 1 & 0 \\ -\frac{5}{2} & 1 \end{bmatrix}\begin{bmatrix} 2 & 1 \\ 5 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 0 & -\frac{1}{2} \end{bmatrix},
\]
we have
\[
\begin{bmatrix} 2 & 1 \\ 5 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \frac{5}{2} & 1 \end{bmatrix}\begin{bmatrix} 2 & 1 \\ 0 & -\frac{1}{2} \end{bmatrix}.
\]
Consequently, the equation (1.10) is equivalent to
\[
\begin{bmatrix} 1 & 0 \\ \frac{5}{2} & 1 \end{bmatrix}\begin{bmatrix} 2 & 1 \\ 0 & -\frac{1}{2} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}.
\]
First we let
\[
\begin{bmatrix} 2 & 1 \\ 0 & -\frac{1}{2} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \tag{1.11}
\]
and solve first the system
\[
\begin{bmatrix} 1 & 0 \\ \frac{5}{2} & 1 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}.
\]
Since $y_1 = b_1$ and $\frac{5}{2} y_1 + y_2 = b_2$, we get $y_1 = b_1$ and $y_2 = b_2 - \frac{5}{2} b_1$. (This step is called forward substitution.)

Now the matrix equation (1.11) becomes
\[
\begin{bmatrix} 2 & 1 \\ 0 & -\frac{1}{2} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 - \frac{5}{2} b_1 \end{bmatrix}
\]
or
\[
\begin{cases} 2x_1 + x_2 = b_1 \\ -\frac{1}{2} x_2 = b_2 - \frac{5}{2} b_1 \end{cases}.
\]
We first get
\[
x_2 = -2b_2 + 5b_1
\]
(this step is called back substitution and will be discussed later in this book in connection with Gauss elimination) and then
\[
x_1 = \frac{1}{2} b_1 - \frac{1}{2} x_2 = -2b_1 + b_2.
\]
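The two substitution steps of Example 1.2.24 translate into a few lines of code. The sketch below (plain Python with exact fractions; the function and variable names are ours) uses the $L$ and $U$ found in the example.

```python
# Forward and back substitution for the system of Example 1.2.24.
from fractions import Fraction as F

def solve_via_lu(b1, b2):
    # Forward substitution with L = [[1, 0], [5/2, 1]]:
    y1 = b1
    y2 = b2 - F(5, 2) * y1
    # Back substitution with U = [[2, 1], [0, -1/2]]:
    x2 = y2 / F(-1, 2)
    x1 = (y1 - x2) / 2
    return x1, x2

x1, x2 = solve_via_lu(F(1), F(2))
assert x1 == -2 * F(1) + F(2)      # x1 = -2*b1 + b2
assert x2 == 5 * F(1) - 2 * F(2)   # x2 = 5*b1 - 2*b2
```

The point of the method is that once $L$ and $U$ are known, each new right-hand side $(b_1, b_2)$ costs only the two cheap substitution steps.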

1.2.1 Exercises

Calculate the following products.

1. $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 3 & 8 \\ 4 & 5 \end{bmatrix}$

2. $\begin{bmatrix} 9 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 3 & 8 \\ 4 & 5 \end{bmatrix}$

3. $\begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}\begin{bmatrix} 3 & 8 \\ 4 & 5 \end{bmatrix}$

4. $\begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 3 & 8 \\ 4 & 5 \end{bmatrix}$

5. $\begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix}\begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}$

6. $\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 7 & 1 \end{bmatrix}\begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}$

7. $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ -1 & 1 \end{bmatrix}$

8. $\begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 5 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$

9. $\begin{bmatrix} 1 & 7 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} w & x \\ y & z \end{bmatrix}$

10. $\begin{bmatrix} 1 & 0 \\ 4 & 1 \end{bmatrix}\begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} w & x \\ y & z \end{bmatrix}$

11. $\begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} w & x \\ y & z \end{bmatrix}$

12. $\begin{bmatrix} 1 & -4 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 7 \end{bmatrix}\begin{bmatrix} w & x \\ y & z \end{bmatrix}$

Write the inverse of the given matrix as a product of elementary matrices and find the inverse.

13. $\begin{bmatrix} 9 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 8 \\ 0 & 1 \end{bmatrix}$

14. $\begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & -4 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 7 \end{bmatrix}$

15. $\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 7 & 1 \end{bmatrix}\begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}$

16. $\begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 4 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 4 \\ 0 & 1 \end{bmatrix}$

Find a matrix $P$ satisfying the given equation for some $a$, $b$, and $c$.

17. $P \begin{bmatrix} 2 & 4 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} a & 0 \\ b & c \end{bmatrix}$

18. $P \begin{bmatrix} 1 & 3 \\ 5 & 8 \end{bmatrix} = \begin{bmatrix} a & b \\ 0 & c \end{bmatrix}$

19. $P \begin{bmatrix} 3 & 2 \\ 2 & 5 \end{bmatrix} = \begin{bmatrix} a & 0 \\ b & 1 \end{bmatrix}$

20. $P \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & b \\ 0 & c \end{bmatrix}$

Use elementary matrices to calculate the inverse of the given matrix.

21. $\begin{bmatrix} 1 & 4 \\ 5 & 22 \end{bmatrix}$  22. $\begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix}$  23. $\begin{bmatrix} 0 & 8 \\ 3 & -12 \end{bmatrix}$  24. $\begin{bmatrix} 4 & 5 \\ 3 & 7 \end{bmatrix}$

Write the given matrix as a product of elementary matrices.

25. $\begin{bmatrix} 36 & 5 \\ 7 & 1 \end{bmatrix}$  26. $\begin{bmatrix} 4 & 0 \\ -1 & 1 \end{bmatrix}$  27. $\begin{bmatrix} 3 & -2 \\ 15 & -8 \end{bmatrix}$  28. $\begin{bmatrix} 5 & 4 \\ 2 & 1 \end{bmatrix}$  29. $\begin{bmatrix} 2 & 4 \\ 2 & 8 \end{bmatrix}$  30. $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$

31. Show that the matrix $\begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}$ is not invertible.

32. Show that the matrix $\begin{bmatrix} a & 0 \\ c & 0 \end{bmatrix}$ is not invertible.

33. Show that if the matrix $A$ is invertible and the matrix $B$ is not invertible, then the matrix $AB$ is not invertible.

34. Show that if the matrix $A$ is not invertible and the matrix $B$ is invertible, then the matrix $AB$ is not invertible.

35. Show that, if the matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is not invertible and $a \neq 0$ or $c \neq 0$, then $\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & ka \\ c & kc \end{bmatrix}$ for some real number $k$.

36. Use elementary matrices to show that the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is not invertible if and only if one of the following conditions occurs:

(a) $\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & ka \\ c & kc \end{bmatrix}$ for some real number $k$,

(b) $\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 0 & b \\ 0 & d \end{bmatrix}$.

Find a number $a$ such that the given matrix $A$ is not invertible and then determine a product of elementary matrices $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$ such that $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} A = \begin{bmatrix} 1 & k \\ 0 & 0 \end{bmatrix}$.

37. $\begin{bmatrix} 2 & 8 \\ 5 & a \end{bmatrix}$  38. $\begin{bmatrix} 1 & a \\ 5 & 3 \end{bmatrix}$  39. $\begin{bmatrix} a & 2 \\ 7 & 3 \end{bmatrix}$  40. $\begin{bmatrix} 3 & 2 \\ 8 & a \end{bmatrix}$

Solve the given system of equations using elementary matrices.

41. $\begin{cases} 2x + y = 7 \\ 3x - 2y = 3 \end{cases}$  42. $\begin{cases} 3x + 2y = 1 \\ 5x + 4y = 0 \end{cases}$  43. $\begin{cases} 4x + 3y = u \\ 2x + y = v \end{cases}$  44. $\begin{cases} 2x + y = u \\ x + 4y = v \end{cases}$

Determine the LU-decomposition of the given matrix.

45. $\begin{bmatrix} 5 & 7 \\ 3 & 8 \end{bmatrix}$  46. $\begin{bmatrix} 7 & 1 \\ 2 & 2 \end{bmatrix}$  47. $\begin{bmatrix} 3 & 2 \\ 5 & 3 \end{bmatrix}$  48. $\begin{bmatrix} 4 & 1 \\ 1 & 1 \end{bmatrix}$  49. $\begin{bmatrix} p & p \\ q & p \end{bmatrix}$, $p \neq 0$

50. Suppose that the matrices $A$, $B$ and $C$ are invertible. Show that the matrix $ABC$ is invertible and we have
\[
(ABC)^{-1} = C^{-1} B^{-1} A^{-1}.
\]

51. Suppose that the matrices $A$, $B$, $C$, and $D$ are invertible. Show that the matrix $ABCD$ is invertible and we have
\[
(ABCD)^{-1} = D^{-1} C^{-1} B^{-1} A^{-1}.
\]

1.3 Determinants

Theorem 1.2.14 characterizes invertible $2 \times 2$ matrices. In the proof of that theorem the number $ad - bc$ seems to play a significant role.

Definition 1.3.1. The number $ad - bc$ is called the determinant of the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and is denoted by $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, that is,
\[
\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc.
\]

Example 1.3.2.
\[
\det \begin{bmatrix} 6 & 3 \\ 13 & 4 \end{bmatrix} = 6 \cdot 4 - 13 \cdot 3 = 24 - 39 = -15
\]

If $a = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$ and $b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$, then the matrix $\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}$ can be written as $\begin{bmatrix} a & b \end{bmatrix}$. We will use the notation
\[
\det \begin{bmatrix} a & b \end{bmatrix} = \det \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} = a_1 b_2 - a_2 b_1.
\]
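Definition 1.3.1 fits in one line of code; the trivial sketch below (plain Python; the function name is ours) will be reused informally in later examples of this section.

```python
# The determinant of a 2x2 matrix, following Definition 1.3.1.
def det2(A):
    """ad - bc for A = [[a, b], [c, d]]."""
    return A[0][0] * A[1][1] - A[1][0] * A[0][1]

assert det2([[6, 3], [13, 4]]) == -15   # Example 1.3.2
```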

In the following theorem we list some useful properties of determinants.

Theorem 1.3.3. For any $a$, $b$, $c$ in $\mathbb{R}^2$ and $t$ in $\mathbb{R}$ we have

(a) $\det \begin{bmatrix} a & a \end{bmatrix} = 0$;

(b) $\det \begin{bmatrix} a & b \end{bmatrix} = -\det \begin{bmatrix} b & a \end{bmatrix}$;

(c) $t \det \begin{bmatrix} a & b \end{bmatrix} = \det \begin{bmatrix} ta & b \end{bmatrix} = \det \begin{bmatrix} a & tb \end{bmatrix}$;

(d) $\det \begin{bmatrix} a + c & b \end{bmatrix} = \det \begin{bmatrix} a & b \end{bmatrix} + \det \begin{bmatrix} c & b \end{bmatrix}$;

(e) $\det \begin{bmatrix} a & b + c \end{bmatrix} = \det \begin{bmatrix} a & b \end{bmatrix} + \det \begin{bmatrix} a & c \end{bmatrix}$;

(f) $\det \begin{bmatrix} a + tb & b \end{bmatrix} = \det \begin{bmatrix} a & b \end{bmatrix}$;

(g) $\det \begin{bmatrix} a & b + ta \end{bmatrix} = \det \begin{bmatrix} a & b \end{bmatrix}$.

Moreover, for any real numbers $a_1$, $a_2$, $b_1$ and $b_2$, we have

(h) $\det \begin{bmatrix} a_1 + t a_2 & b_1 + t b_2 \\ a_2 & b_2 \end{bmatrix} = \det \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}$;

(i) $\det \begin{bmatrix} a_1 & b_1 \\ a_2 + t a_1 & b_2 + t b_1 \end{bmatrix} = \det \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}$;

(j) $\det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} = \det \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}$.

Proof. These identities follow easily from the definition of the determinant. We leave the proofs as exercises.

The following theorem describes an essential property of determinants. The property is not just convenient for calculations, but it has important theoretical consequences. The proof of Theorem 1.3.5 is a good example of that.

Theorem 1.3.4. For arbitrary numbers $a, b, c, d, \alpha, \beta, \gamma, \delta$ we have
\[
\det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \right) = \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} \det \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}.
\]

Proof. Since
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \begin{bmatrix} a\alpha + b\gamma & a\beta + b\delta \\ c\alpha + d\gamma & c\beta + d\delta \end{bmatrix},
\]
we have
\begin{align*}
\det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \right)
&= \det \begin{bmatrix} a\alpha + b\gamma & a\beta + b\delta \\ c\alpha + d\gamma & c\beta + d\delta \end{bmatrix} \\
&= (a\alpha + b\gamma)(c\beta + d\delta) - (a\beta + b\delta)(c\alpha + d\gamma) \\
&= ad(\alpha\delta - \beta\gamma) + bc(\beta\gamma - \alpha\delta) \\
&= (ad - bc)(\alpha\delta - \beta\gamma) \\
&= \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} \det \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}.
\end{align*}

Theorem 1.3.5. If the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible, then $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} \neq 0$.

Proof. If the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible, then
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]
From Theorem 1.3.4 we have
\[
\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} \det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \right) = \det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} \right) = \det \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = 1.
\]
Hence $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} \neq 0$.

The proof of Theorem 1.2.14 suggests a method for finding the inverse of an invertible matrix. Consider an invertible matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$. If $a \neq 0$, then we have
\[
\begin{bmatrix} 1 & -\frac{b}{a} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & \frac{a}{ad-bc} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -c & 1 \end{bmatrix}\begin{bmatrix} \frac{1}{a} & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\]
and consequently we must have
\[
\begin{bmatrix} 1 & -\frac{b}{a} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & \frac{a}{ad-bc} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -c & 1 \end{bmatrix}\begin{bmatrix} \frac{1}{a} & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}.
\]
To find the inverse of the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ we calculate the product of the elementary matrices on the left-hand side:
\begin{align*}
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}
&= \begin{bmatrix} 1 & -\frac{b}{a} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & \frac{a}{ad-bc} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -c & 1 \end{bmatrix}\begin{bmatrix} \frac{1}{a} & 0 \\ 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & -\frac{b}{a} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & \frac{a}{ad-bc} \end{bmatrix}\begin{bmatrix} \frac{1}{a} & 0 \\ -\frac{c}{a} & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & -\frac{b}{a} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} \frac{1}{a} & 0 \\ -\frac{c}{ad-bc} & \frac{a}{ad-bc} \end{bmatrix} \\
&= \begin{bmatrix} \frac{d}{ad-bc} & -\frac{b}{ad-bc} \\ -\frac{c}{ad-bc} & \frac{a}{ad-bc} \end{bmatrix}.
\end{align*}
It turns out that the obtained matrix is the inverse of an invertible matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ also in the case when $a = 0$.

Theorem 1.3.6. If $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc \neq 0$, then the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible and
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \begin{bmatrix} \frac{d}{ad-bc} & -\frac{b}{ad-bc} \\ -\frac{c}{ad-bc} & \frac{a}{ad-bc} \end{bmatrix} = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.
\]

Proof. It is easy to verify that
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} \frac{d}{ad-bc} & -\frac{b}{ad-bc} \\ -\frac{c}{ad-bc} & \frac{a}{ad-bc} \end{bmatrix} = \begin{bmatrix} \frac{d}{ad-bc} & -\frac{b}{ad-bc} \\ -\frac{c}{ad-bc} & \frac{a}{ad-bc} \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]
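The formula of Theorem 1.3.6 translates directly into code. The sketch below (plain Python with exact fractions; the function name is ours) raises an error exactly when the determinant vanishes.

```python
# The 2x2 inverse formula of Theorem 1.3.6.
from fractions import Fraction as F

def inverse(A):
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[F(d, 1) / det, F(-b, 1) / det],
            [F(-c, 1) / det, F(a, 1) / det]]

assert inverse([[2, 3], [5, 4]]) == [[F(-4, 7), F(3, 7)], [F(5, 7), F(-2, 7)]]
```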

The formula in the above theorem allows us to solve more efficiently problems that require finding the inverse of a matrix.

Example 1.3.7. Solve the system
\[
\begin{cases} x + 3y = 6 \\ 2x + y = 1 \end{cases}.
\]

Solution. Since the system is equivalent to the equation
\[
\begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 6 \\ 1 \end{bmatrix}
\]
and
\[
\det \begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix} = 1 \cdot 1 - 3 \cdot 2 = -5 \neq 0,
\]
we obtain
\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 6 \\ 1 \end{bmatrix} = -\frac{1}{5}\begin{bmatrix} 1 & -3 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} 6 \\ 1 \end{bmatrix} = -\frac{1}{5}\begin{bmatrix} 3 \\ -11 \end{bmatrix} = \begin{bmatrix} -\frac{3}{5} \\ \frac{11}{5} \end{bmatrix}.
\]

In other words, the solution of the system is $x = -\frac{3}{5}$ and $y = \frac{11}{5}$.

Example 1.3.8. Find a $2 \times 2$ matrix $X$ such that
\[
\begin{bmatrix} 4 & 3 \\ 1 & 1 \end{bmatrix} X = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}.
\]

Solution. If
\[
\begin{bmatrix} 4 & 3 \\ 1 & 1 \end{bmatrix} X = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix},
\]
then
\[
X = \begin{bmatrix} 4 & 3 \\ 1 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}.
\]
Since $\begin{bmatrix} 4 & 3 \\ 1 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & -3 \\ -1 & 4 \end{bmatrix}$, we have
\[
X = \begin{bmatrix} 1 & -3 \\ -1 & 4 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} -2 & -1 \\ 3 & 2 \end{bmatrix}
\]

Theorems 1.3.5 and 1.3.6 yield:

Theorem 1.3.9. The matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible if and only if $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} \neq 0$.

Cramer’s Rule

Theorems 1.2.6 and 1.3.6 lead to the following result that gives the solution of a linear system in an explicit form in terms of determinants. The theorem is known as Cramer’s Rule.

Theorem 1.3.10. If $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} \neq 0$, then the system
\[
\begin{cases} ax + by = e \\ cx + dy = f \end{cases}
\]
has a unique solution for any real numbers $e$ and $f$. The solution is
\[
x = \frac{\det \begin{bmatrix} e & b \\ f & d \end{bmatrix}}{\det \begin{bmatrix} a & b \\ c & d \end{bmatrix}} \quad \text{and} \quad y = \frac{\det \begin{bmatrix} a & e \\ c & f \end{bmatrix}}{\det \begin{bmatrix} a & b \\ c & d \end{bmatrix}}.
\]

Proof. Since
\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}\begin{bmatrix} e \\ f \end{bmatrix} = \begin{bmatrix} \frac{d}{ad-bc} & -\frac{b}{ad-bc} \\ -\frac{c}{ad-bc} & \frac{a}{ad-bc} \end{bmatrix}\begin{bmatrix} e \\ f \end{bmatrix} = \begin{bmatrix} \frac{ed-bf}{ad-bc} \\ \frac{af-ce}{ad-bc} \end{bmatrix},
\]
we have
\[
x = \frac{\det \begin{bmatrix} e & b \\ f & d \end{bmatrix}}{\det \begin{bmatrix} a & b \\ c & d \end{bmatrix}} \quad \text{and} \quad y = \frac{\det \begin{bmatrix} a & e \\ c & f \end{bmatrix}}{\det \begin{bmatrix} a & b \\ c & d \end{bmatrix}}.
\]

Example 1.3.11. In this example we solve the system
\[
\begin{cases} 2x + 5y = 4 \\ x + 3y = 3 \end{cases}
\]
using Cramer’s Rule. Since
\[
\det \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix} = 2 \cdot 3 - 1 \cdot 5 = 1,
\]
\[
\det \begin{bmatrix} 4 & 5 \\ 3 & 3 \end{bmatrix} = 4 \cdot 3 - 3 \cdot 5 = -3,
\]
and
\[
\det \begin{bmatrix} 2 & 4 \\ 1 & 3 \end{bmatrix} = 2 \cdot 3 - 1 \cdot 4 = 2,
\]
we have
\[
x = \frac{\det \begin{bmatrix} 4 & 5 \\ 3 & 3 \end{bmatrix}}{\det \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}} = -3 \quad \text{and} \quad y = \frac{\det \begin{bmatrix} 2 & 4 \\ 1 & 3 \end{bmatrix}}{\det \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}} = 2.
\]
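Cramer's Rule is equally short in code. The sketch below (plain Python with exact fractions; the function name is ours) implements Theorem 1.3.10 and reproduces the example.

```python
# Cramer's Rule (Theorem 1.3.10) for the system ax + by = e, cx + dy = f.
from fractions import Fraction as F

def cramer(a, b, c, d, e, f):
    det = a * d - b * c
    if det == 0:
        raise ValueError("Cramer's Rule requires a nonzero determinant")
    x = F(e * d - f * b, 1) / det
    y = F(a * f - c * e, 1) / det
    return x, y

x, y = cramer(2, 5, 1, 3, 4, 3)   # the system of Example 1.3.11
assert (x, y) == (-3, 2)
```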

From Theorem 1.3.10 we obtain the following useful result.

Theorem 1.3.12. Let $u$ and $v$ be vectors in $\mathbb{R}^2$ such that $\det \begin{bmatrix} u & v \end{bmatrix} \neq 0$. Then for any vector $x$ in $\mathbb{R}^2$ there exist unique real numbers $\alpha$ and $\beta$ such that
\[
x = \alpha u + \beta v.
\]

Proof. If $u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$, $v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$, and $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, then the equation $x = \alpha u + \beta v$ is equivalent to the system of linear equations
\[
\begin{cases} u_1 \alpha + v_1 \beta = x_1 \\ u_2 \alpha + v_2 \beta = x_2 \end{cases}.
\]
If $\det \begin{bmatrix} u & v \end{bmatrix} = \det \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix} \neq 0$, then the system has a unique solution by Theorem 1.3.10.

The next result is a consequence of Theorems 1.2.14 and 1.3.9. It will be used in some arguments in the next section.

Theorem 1.3.13. The following conditions are equivalent:

(a) $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = 0$;

(b) The system
\[
\begin{cases} ax + by = 0 \\ cx + dy = 0 \end{cases} \tag{1.12}
\]
has a nontrivial solution, that is, a solution different from the trivial solution $x = y = 0$.

We can easily find nontrivial solutions of the system (1.12) when $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = 0$. Indeed, if $d \neq 0$ or $c \neq 0$, then $x = d$ and $y = -c$ is a nontrivial solution of the system.

Similarly, if $a \neq 0$ or $b \neq 0$, then $x = b$ and $y = -a$ is a nontrivial solution. Finally, if $a = b = c = d = 0$, then any pair of numbers $x$ and $y$ is a solution of the system.

If $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = 0$, then the system
\[
\begin{cases} ax + by = e \\ cx + dy = f \end{cases}
\]
either has no solutions or infinitely many solutions, as illustrated by the following two examples.

Example 1.3.14. The system
\[
\begin{cases} x + y = 1 \\ 2x + 2y = 3 \end{cases}
\]
has no solutions. Indeed, if some $x$ and $y$ were a solution of the system, then we would have $x + y = 1$. But then $2x + 2y = 2$, which is not possible in view of the second equation of the system.

Example 1.3.15. We will show that the system
\[
\begin{cases} x + y = 1 \\ 2x + 2y = 2 \end{cases}
\]
has infinitely many solutions. It is easy to check that $x = t$ and $y = 1 - t$ is a solution of the system for any real number $t$.

The Leontief model

The Leontief model is a model introduced to describe the economics of a whole country or a large region. It was proposed by Wassily Leontief, who won the Nobel prize in economics in 1973. The following theorem gives us the mathematical foundation of the model in the case of $2 \times 2$ matrices.

Theorem 1.3.16. Let $C = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ be a matrix with nonnegative entries and such that $a_{11} + a_{21} < 1$ and $a_{12} + a_{22} < 1$. Then

(a) The matrix $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ is invertible;

(b) All entries of the matrix $\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \right)^{-1}$ are nonnegative;

(c) For every vector $d = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}$ with nonnegative entries the equation $x = Cx + d$ has a unique solution $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ with nonnegative entries.

Proof. If
\[
a_{11} \ge 0, \quad a_{21} \ge 0, \quad a_{12} \ge 0, \quad a_{22} \ge 0
\]
and
\[
a_{11} + a_{21} < 1 \quad \text{and} \quad a_{12} + a_{22} < 1,
\]
then
\[
\det \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \right) = (1 - a_{11})(1 - a_{22}) - a_{12} a_{21} > a_{12} a_{21} - a_{12} a_{21} = 0,
\]
and thus the matrix
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\]
is invertible. Since
\[
\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \right)^{-1} = \frac{1}{(1 - a_{11})(1 - a_{22}) - a_{12} a_{21}}\begin{bmatrix} 1 - a_{22} & a_{12} \\ a_{21} & 1 - a_{11} \end{bmatrix},
\]
all entries of the matrix $\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \right)^{-1}$ are nonnegative. Moreover, if $d_1 \ge 0$ and $d_2 \ge 0$, then the solution of the equation
\[
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}
\]
is
\[
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \right)^{-1}\begin{bmatrix} d_1 \\ d_2 \end{bmatrix}.
\]
Consequently, $x_1 \ge 0$ and $x_2 \ge 0$.

Example 1.3.17. Solve the equation $x = Cx + d$ for
\[
C = \begin{bmatrix} \frac{1}{10} & \frac{2}{5} \\ \frac{3}{10} & \frac{1}{5} \end{bmatrix} \quad \text{and} \quad d = \begin{bmatrix} 70 \\ 5 \end{bmatrix}.
\]

Solution. Since
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} \frac{1}{10} & \frac{2}{5} \\ \frac{3}{10} & \frac{1}{5} \end{bmatrix} = \begin{bmatrix} \frac{9}{10} & -\frac{2}{5} \\ -\frac{3}{10} & \frac{4}{5} \end{bmatrix}
\]
and
\[
\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} \frac{1}{10} & \frac{2}{5} \\ \frac{3}{10} & \frac{1}{5} \end{bmatrix} \right)^{-1} = \begin{bmatrix} \frac{4}{3} & \frac{2}{3} \\ \frac{1}{2} & \frac{3}{2} \end{bmatrix},
\]
we obtain the solution
\[
\begin{bmatrix} \frac{4}{3} & \frac{2}{3} \\ \frac{1}{2} & \frac{3}{2} \end{bmatrix}\begin{bmatrix} 70 \\ 5 \end{bmatrix} = \begin{bmatrix} \frac{290}{3} \\ \frac{85}{2} \end{bmatrix}.
\]
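The computation in Example 1.3.17 is a good candidate for exact arithmetic. The sketch below (plain Python; variable names are ours) forms $I - C$, inverts it with the $2 \times 2$ formula of Theorem 1.3.6, and applies it to $d$.

```python
# Solving x = Cx + d as (I - C)^{-1} d, with the data of Example 1.3.17.
from fractions import Fraction as F

C = [[F(1, 10), F(2, 5)], [F(3, 10), F(1, 5)]]
d = [F(70), F(5)]

# M = I - C, then invert with the 2x2 inverse formula.
M = [[1 - C[0][0], -C[0][1]], [-C[1][0], 1 - C[1][1]]]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
Minv = [[M[1][1] / det, -M[0][1] / det],
        [-M[1][0] / det, M[0][0] / det]]

x = [Minv[0][0] * d[0] + Minv[0][1] * d[1],
     Minv[1][0] * d[0] + Minv[1][1] * d[1]]
assert x == [F(290, 3), F(85, 2)]   # the output vector of the example
```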

We illustrate an application of the above theorem in economics with a very simple example. We consider two products, product A and product B. We assume that product A is used in the production of product A and in the production of product B. Similarly, product B is used in the production of product A and in the production of product B. The variables in Theorem 1.3.16 are interpreted as follows:

$x_1$ = the total output of product A

$x_2$ = the total output of product B

$a_{11} x_1$ = the amount of product A used in the production of A

$a_{12} x_2$ = the amount of product A used in the production of B

$a_{21} x_1$ = the amount of product B used in the production of A

$a_{22} x_2$ = the amount of product B used in the production of B

$d_1$ = the remaining amount of product A

$d_2$ = the remaining amount of product B

We have
\[
x_1 = a_{11} x_1 + a_{12} x_2 + d_1 \quad \text{and} \quad x_2 = a_{21} x_1 + a_{22} x_2 + d_2.
\]
The matrix $C = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ is called the consumption matrix, the vector $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ is called the output vector or the production vector, and the vector $d = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}$ is called the demand vector. Theorem 1.3.16 guarantees that, under the assumption of the theorem, for any demand vector with nonnegative entries there is a unique output vector with nonnegative entries.

1.3.1 Exercises

Find the determinants of the given matrices.

1. $\begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$  2. $\begin{bmatrix} 4 & -2 \\ 5 & 7 \end{bmatrix}$  3. $\begin{bmatrix} 1 & 5 \\ 3 & 4 \end{bmatrix}$  4. $\begin{bmatrix} 3 & 2 \\ -3 & -2 \end{bmatrix}$  5. $\begin{bmatrix} a & b \\ c & 0 \end{bmatrix}$  6. $\begin{bmatrix} a & b \\ 0 & c \end{bmatrix}$  7. $\begin{bmatrix} a & ta \\ b & tb \end{bmatrix}$  8. $\begin{bmatrix} a & b \\ ta & tb \end{bmatrix}$

Find the inverses of the following matrices when possible.

9. $\begin{bmatrix} 3 & 2 \\ 1 & 5 \end{bmatrix}$  10. $\begin{bmatrix} 1 & -3 \\ 2 & -4 \end{bmatrix}$  11. $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$  12. $\begin{bmatrix} 1 & \frac{2}{3} \\ 0 & \frac{5}{3} \end{bmatrix}$  13. $\begin{bmatrix} a & b \\ b & a \end{bmatrix}$  14. $\begin{bmatrix} a+1 & -1 \\ 1 & a \end{bmatrix}$  15. $\begin{bmatrix} 2 & a \\ -a & 5 \end{bmatrix}$  16. $\begin{bmatrix} a & -2 \\ 2 & a \end{bmatrix}$  17. $\begin{bmatrix} 3 & a \\ 2 & b \end{bmatrix}$  18. $\begin{bmatrix} 4 & 1 \\ a & b \end{bmatrix}$

Use Theorem 1.2.6 to solve the following systems of equations.

19. $\begin{cases} x + 2y = 2 \\ 3x + 4y = 1 \end{cases}$  20. $\begin{cases} 2x - y = 1 \\ x + 2y = -1 \end{cases}$  21. $\begin{cases} 7x - 2y = 0 \\ 5x + 3y = 4 \end{cases}$  22. $\begin{cases} 5x + 4y = 3 \\ 2x + 3y = 0 \end{cases}$

Show that the following identities hold for any real numbers $a_1$, $a_2$, $b_1$, $b_2$, and $t$.

23. $\det \begin{bmatrix} a_1 + t a_2 & b_1 + t b_2 \\ a_2 & b_2 \end{bmatrix} = \det \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}$

24. $\det \begin{bmatrix} a_1 & b_1 \\ a_2 + t a_1 & b_2 + t b_1 \end{bmatrix} = \det \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}$

Use Theorem 1.3.10 to solve the following systems of equations.

25. $\begin{cases} 3x + 2y = 1 \\ 2x + 3y = 1 \end{cases}$  26. $\begin{cases} 5x - 4y = 1 \\ 3x + 5y = 0 \end{cases}$  27. $\begin{cases} 7x + 2y = 0 \\ x + 7y = -1 \end{cases}$  28. $\begin{cases} 5x + 3y = 2 \\ 3x + 4y = -1 \end{cases}$  29. $\begin{cases} (a-4)x - 5y = 1 \\ 2x + ay = 1 \end{cases}$  30. $\begin{cases} (a-1)x - y = 1 \\ x + ay = 0 \end{cases}$  31. $\begin{cases} 3x + ay = s \\ 2x + 5y = t \end{cases}$  32. $\begin{cases} 3x + ay = s \\ x + 2y = t \end{cases}$  33. $\begin{cases} 2x + ay = s \\ bx + 2y = t \end{cases}$  34. $\begin{cases} 4x + ay = s \\ 5x + by = t \end{cases}$

1.4 Diagonalization of 2 × 2 matrices

Matrices of the form $\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}$ are called diagonal matrices. They are easy to work with. For example, we have
\[
\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \alpha x \\ \beta y \end{bmatrix}.
\]
Multiplying diagonal matrices is equally easy:
\[
\begin{bmatrix} \alpha_1 & 0 \\ 0 & \beta_1 \end{bmatrix}\begin{bmatrix} \alpha_2 & 0 \\ 0 & \beta_2 \end{bmatrix} = \begin{bmatrix} \alpha_1 \alpha_2 & 0 \\ 0 & \beta_1 \beta_2 \end{bmatrix} \tag{1.13}
\]
In this section we show how these and other properties of diagonal matrices can be useful when dealing with matrices that are not diagonal.

From (1.13) we obtain the following nice and useful property of diagonal matrices
\[
\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}^3 = \begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix} = \begin{bmatrix} \alpha^3 & 0 \\ 0 & \beta^3 \end{bmatrix}
\]
and more generally
\[
\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}^n = \underbrace{\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix} \cdots \begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}}_{n \text{ times}} = \begin{bmatrix} \alpha^n & 0 \\ 0 & \beta^n \end{bmatrix}.
\]
This property can be extended to any $2 \times 2$ matrix that can be written in the form $PDP^{-1}$ where $P$ is an invertible $2 \times 2$ matrix and $D$ is a diagonal matrix. Indeed, for such a matrix we have
\begin{align*}
\left( PDP^{-1} \right)^3 &= PDP^{-1} PDP^{-1} PDP^{-1} \\
&= PD(P^{-1}P)D(P^{-1}P)DP^{-1} \\
&= PD^3 P^{-1}.
\end{align*}
More generally, if $n$ is any natural number, we have
\[
\left( PDP^{-1} \right)^n = \underbrace{PDP^{-1} PDP^{-1} \cdots PDP^{-1}}_{n \text{ times}} = PD^n P^{-1},
\]
which means that
\[
\left( P \begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix} P^{-1} \right)^n = P \begin{bmatrix} \alpha^n & 0 \\ 0 & \beta^n \end{bmatrix} P^{-1}
\]
for any numbers $\alpha$ and $\beta$ and any invertible $2 \times 2$ matrix $P$.

It is not clear when a matrix can be written in the form $PDP^{-1}$. Moreover, even if we know that a matrix has such a representation, it is not obvious how to find $P$ and $D$. In this section we investigate these questions. We begin by considering an example.

Example 1.4.1. We want to write the matrix
\[
A = \begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}
\]
as a product
\[
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}^{-1} \tag{1.14}
\]
where $\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}$ is an invertible matrix.

Multiplying from the right both sides of the equation (1.14) by the matrix $\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}$ we get
\begin{align*}
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}
&= \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}^{-1}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix} \\
&= \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}.
\end{align*}
This means that equation (1.14) is equivalent to the equation
\[
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}. \tag{1.15}
\]
Since
\[
\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix} = \begin{bmatrix} \alpha u_1 & \beta v_1 \\ \alpha u_2 & \beta v_2 \end{bmatrix},
\]
the equation (1.15) can be written as
\[
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix} = \begin{bmatrix} \alpha u_1 & \beta v_1 \\ \alpha u_2 & \beta v_2 \end{bmatrix}. \tag{1.16}
\]
Equation (1.16) is equivalent to the following two equations:
\[
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} \alpha u_1 \\ \alpha u_2 \end{bmatrix} = \alpha \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \tag{1.17}
\]
and
\[
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} \beta v_1 \\ \beta v_2 \end{bmatrix} = \beta \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}. \tag{1.18}
\]
So far we have shown that the equation
\[
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}^{-1}
\]
is equivalent to finding real numbers $\alpha$ and $\beta$ and vectors $\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$ and $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ such that
\[
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \alpha \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \beta \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
\]
and such that the matrix $\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}$ is invertible.

The equation (1.17) can be written as
\[
\begin{bmatrix} 5 - \alpha & -1 \\ 2 & 2 - \alpha \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
or
\[
\begin{cases} (5 - \alpha) u_1 - u_2 = 0 \\ 2 u_1 + (2 - \alpha) u_2 = 0 \end{cases}. \tag{1.19}
\]
We are interested in a solution such that $u_1 \neq 0$ or $u_2 \neq 0$. This means that, by Theorem 1.3.13, we must have
\[
\det \begin{bmatrix} 5 - \alpha & -1 \\ 2 & 2 - \alpha \end{bmatrix} = 0.
\]
Similarly, equation (1.18) can be written as
\[
\begin{bmatrix} 5 - \beta & -1 \\ 2 & 2 - \beta \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
or
\[
\begin{cases} (5 - \beta) v_1 - v_2 = 0 \\ 2 v_1 + (2 - \beta) v_2 = 0 \end{cases}. \tag{1.20}
\]
Since we are interested in a solution such that $v_1 \neq 0$ or $v_2 \neq 0$, we must have
\[
\det \begin{bmatrix} 5 - \beta & -1 \\ 2 & 2 - \beta \end{bmatrix} = 0.
\]
Consequently, both $\alpha$ and $\beta$ are roots of the equation
\[
\det \begin{bmatrix} 5 - \lambda & -1 \\ 2 & 2 - \lambda \end{bmatrix} = 0.
\]
We calculate the determinant and obtain the equation
\[
\lambda^2 - 7\lambda + 12 = 0,
\]
which has roots $\alpha = 3$ and $\beta = 4$.

With $\alpha = 3$ the system (1.19) becomes
\[
\begin{cases} 2u_1 - u_2 = 0 \\ 2u_1 - u_2 = 0 \end{cases}
\]
and has a nontrivial solution $\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$. With $\beta = 4$ the system (1.20) becomes
\[
\begin{cases} v_1 - v_2 = 0 \\ 2v_1 - 2v_2 = 0 \end{cases}
\]
and has a nontrivial solution $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.

We note that
\[
\det \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix} = \det \begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix} = -1 \neq 0,
\]
so the matrix is invertible and we have
\[
\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix}^{-1} = -\begin{bmatrix} 1 & -1 \\ -2 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 1 \\ 2 & -1 \end{bmatrix}.
\]
Consequently, the desired representation of the matrix $\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}$ is
\[
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 3 & 0 \\ 0 & 4 \end{bmatrix}\begin{bmatrix} -1 & 1 \\ 2 & -1 \end{bmatrix}.
\]

We illustrate the advantage of the obtained representation by calculating $\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}^{77}$, which is practically impossible to do by direct calculations with the matrix $\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}$. On the other hand, since
\[
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 3 & 0 \\ 0 & 4 \end{bmatrix}\begin{bmatrix} -1 & 1 \\ 2 & -1 \end{bmatrix},
\]
we have
\begin{align*}
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}^{77}
&= \left( \begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 3 & 0 \\ 0 & 4 \end{bmatrix}\begin{bmatrix} -1 & 1 \\ 2 & -1 \end{bmatrix} \right)^{77} \\
&= \begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 3 & 0 \\ 0 & 4 \end{bmatrix}^{77}\begin{bmatrix} -1 & 1 \\ 2 & -1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 3^{77} & 0 \\ 0 & 4^{77} \end{bmatrix}\begin{bmatrix} -1 & 1 \\ 2 & -1 \end{bmatrix} \\
&= \begin{bmatrix} 3^{77} & 4^{77} \\ 2 \cdot 3^{77} & 4^{77} \end{bmatrix}\begin{bmatrix} -1 & 1 \\ 2 & -1 \end{bmatrix} \\
&= \begin{bmatrix} -3^{77} + 2 \cdot 4^{77} & 3^{77} - 4^{77} \\ -2 \cdot 3^{77} + 2 \cdot 4^{77} & 2 \cdot 3^{77} - 4^{77} \end{bmatrix}.
\end{align*}
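The power computation at the end of Example 1.4.1 can be cross-checked by machine. The sketch below (plain Python with exact integer arithmetic; helper names are ours) computes $PD^nP^{-1}$ and compares it with direct repeated multiplication by $A$.

```python
# A^n via the diagonalization A = P D P^{-1} from Example 1.4.1.
def matmul(A, B):
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def power_via_diagonalization(n):
    P = [[1, 1], [2, 1]]
    P_inv = [[-1, 1], [2, -1]]
    D_n = [[3**n, 0], [0, 4**n]]          # D^n for D = diag(3, 4)
    return matmul(matmul(P, D_n), P_inv)

# Cross-check against direct repeated multiplication of A = [[5, -1], [2, 2]].
A = [[5, -1], [2, 2]]
direct = [[1, 0], [0, 1]]
for _ in range(77):
    direct = matmul(direct, A)
assert power_via_diagonalization(77) == direct
```

The diagonalized form needs only the two scalar powers $3^n$ and $4^n$, while the direct computation needs $n$ matrix multiplications.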

Now we will show that the method used in the above example leads to a more general result. We observe that finding the roots of the equation
\[
\det \begin{bmatrix} 5 - \lambda & -1 \\ 2 & 2 - \lambda \end{bmatrix} = 0
\]
was crucial in the presented solution. The roots $\alpha = 3$ and $\beta = 4$ turned out to be the numbers needed for the representation
\[
\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}^{-1}.
\]
This observation leads to one of the most important ideas in linear algebra.
Definition 1.4.2. A real number $\lambda$ is called an eigenvalue of a $2 \times 2$ matrix $A$ if the equation
$$A\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \lambda\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$
has a nontrivial solution, that is, a solution $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix}$.

Example 1.4.3. Since
$$\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 6 \end{bmatrix} = 3\begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
and
$$\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = 4\begin{bmatrix} 1 \\ 1 \end{bmatrix},$$
$\lambda = 3$ and $\lambda = 4$ are eigenvalues of the matrix $\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}$.

The following theorem gives us a practical way of finding eigenvalues of a $2 \times 2$ matrix.

Theorem 1.4.4. The real number $\lambda$ is an eigenvalue of the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ if
$$\det \begin{bmatrix} a-\lambda & b \\ c & d-\lambda \end{bmatrix} = 0.$$

Proof. First note that the equation
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \lambda\begin{bmatrix} x \\ y \end{bmatrix}$$
can be written as
$$\begin{bmatrix} a-\lambda & b \\ c & d-\lambda \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
The above equation has a nontrivial solution if
$$\det \begin{bmatrix} a-\lambda & b \\ c & d-\lambda \end{bmatrix} = 0,$$
by Theorem 1.3.13.
Example 1.4.5. Find the eigenvalues of the matrix $\begin{bmatrix} 1 & 5 \\ 3 & 3 \end{bmatrix}$.

Solution. Since
$$\det \begin{bmatrix} 1-\lambda & 5 \\ 3 & 3-\lambda \end{bmatrix} = (1-\lambda)(3-\lambda) - 15 = \lambda^2 - 4\lambda - 12,$$
we need to solve the quadratic equation
$$\lambda^2 - 4\lambda - 12 = 0.$$
The solutions are $\lambda = 6$ and $\lambda = -2$, which are the eigenvalues of the matrix $\begin{bmatrix} 1 & 5 \\ 3 & 3 \end{bmatrix}$.

Example 1.4.6. Since
$$\det \begin{bmatrix} 1-\lambda & 3 \\ 2 & -1-\lambda \end{bmatrix} = \lambda^2 - 7,$$
the eigenvalues of the matrix $A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}$ are $\lambda = \sqrt{7}$ and $\lambda = -\sqrt{7}$.
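The computations in the last two examples follow a fixed recipe: form the characteristic polynomial $\lambda^2 - (a+d)\lambda + (ad-bc)$ and solve the quadratic. A small Python sketch of this recipe (the function name `eigenvalues_2x2` is ours, not from the text):

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]] via the characteristic
    polynomial lambda^2 - (a + d)*lambda + (a*d - b*c) = 0."""
    trace, det = a + d, a * d - b * c
    disc = trace**2 - 4 * det
    if disc < 0:
        return []                     # no real eigenvalues
    r = math.sqrt(disc)
    return sorted({(trace - r) / 2, (trace + r) / 2})

print(eigenvalues_2x2(1, 5, 3, 3))    # [-2.0, 6.0]  (Example 1.4.5)
print(eigenvalues_2x2(1, 3, 2, -1))   # the two values are -sqrt(7) and sqrt(7)
```

Returning a sorted set collapses a repeated root to a single value, matching the convention that such a matrix has only one eigenvalue.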

Definition 1.4.7. Let $\lambda$ be an eigenvalue of a $2 \times 2$ matrix $A$. A vector $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ such that $A\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \lambda\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ is called an eigenvector corresponding to the eigenvalue $\lambda$.

Example 1.4.8. Since
$$\begin{bmatrix} 1 & 5 \\ 3 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 6 \end{bmatrix} = 6\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
and
$$\begin{bmatrix} 1 & 5 \\ 3 & 3 \end{bmatrix}\begin{bmatrix} -5 \\ 3 \end{bmatrix} = \begin{bmatrix} 10 \\ -6 \end{bmatrix} = -2\begin{bmatrix} -5 \\ 3 \end{bmatrix},$$
the vector $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector corresponding to the eigenvalue $\lambda = 6$ and the vector $\begin{bmatrix} -5 \\ 3 \end{bmatrix}$ is an eigenvector corresponding to the eigenvalue $\lambda = -2$.
Note that eigenvectors are not unique. If $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$, then for any real number $t$ we have
$$A\left(t\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = t\left(A\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = t\left(\lambda\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \lambda\left(t\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right),$$
so $t\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ is also an eigenvector of $A$ corresponding to the eigenvalue $\lambda$ as long as $t \neq 0$.

Similarly, if $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and $\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$ are eigenvectors of $A$ corresponding to the eigenvalue $\lambda$, then it is easy to verify that the vector $\begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \end{bmatrix}$ is also an eigenvector of the matrix $A$ as long as $\begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix}$.

Definition 1.4.9. A matrix $A$ is called diagonalizable if there is a diagonal matrix $D$ and an invertible matrix $P$ such that
$$A = PDP^{-1}.$$

In other words, a matrix $A = \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}$ is diagonalizable if there are real numbers $\alpha$ and $\beta$ and an invertible matrix $\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}$ such that
$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}^{-1}.$$
Note that we are not assuming that $\alpha$ and $\beta$ are different.

Example 1.4.10. In Example 1.4.1 we have shown that
$$\begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 3 & 0 \\ 0 & 4 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix}^{-1}.$$
Therefore the matrix
$$A = \begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}$$
is diagonalizable.

Not all matrices are diagonalizable.


Example 1.4.11. If possible, diagonalize the matrix $A = \begin{bmatrix} 2 & 0 \\ 3 & 2 \end{bmatrix}$.

Solution. The eigenvalues of $A$ are the roots of the equation
$$\det \begin{bmatrix} 2-\lambda & 0 \\ 3 & 2-\lambda \end{bmatrix} = \lambda^2 - 4\lambda + 4 = 0.$$
We solve this quadratic equation and find that the only eigenvalue is $\lambda = 2$.

The eigenvectors corresponding to the eigenvalue $\lambda = 2$ are the solutions of the equation
$$\begin{bmatrix} 2 & 0 \\ 3 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = 2\begin{bmatrix} x \\ y \end{bmatrix},$$
which is equivalent to the equation
$$3x = 0.$$
So a vector $\begin{bmatrix} x \\ y \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda = 2$ if $x = 0$.

Consider two eigenvectors $\begin{bmatrix} 0 \\ a \end{bmatrix}$ and $\begin{bmatrix} 0 \\ b \end{bmatrix}$ corresponding to the eigenvalue $\lambda = 2$. Since $\det \begin{bmatrix} 0 & 0 \\ a & b \end{bmatrix} = 0$, the matrix $\begin{bmatrix} 0 & 0 \\ a & b \end{bmatrix}$ is not invertible. Because this happens no matter what $a$ and $b$ we use, the matrix $A$ cannot be diagonalized.
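The repeated eigenvalue in this example can also be seen from the discriminant of the characteristic polynomial. A minimal Python check (our illustration, not from the text):

```python
# Characteristic polynomial of [[2, 0], [3, 2]]: lambda^2 - 4*lambda + 4.
# Its discriminant is zero, so lambda = 2 is the only eigenvalue.
a, b, c, d = 2, 0, 3, 2
trace, det = a + d, a * d - b * c
disc = trace**2 - 4 * det
print(disc, trace / 2)  # 0 2.0
```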

The next theorem describes a property that guarantees diagonalizability of a $2 \times 2$ matrix. Note that it is not an "if and only if" statement.

Theorem 1.4.12. Every $2 \times 2$ matrix with two different real eigenvalues is diagonalizable.

Proof. Let $A = \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}$. Assume there exist two different real numbers $\alpha$ and $\beta$ and nonzero vectors $\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$ and $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ such that
$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \alpha\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} \alpha u_1 \\ \alpha u_2 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \beta\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} \beta v_1 \\ \beta v_2 \end{bmatrix}. \tag{1.21}$$
We will show that the matrix $\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}$ is invertible. We use Theorem 1.2.14, that is, we assume that
$$\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{1.22}$$
and show that $x = y = 0$.
We first note that the equations in (1.21) can be written as a single equation
$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix} = \begin{bmatrix} \alpha u_1 & \beta v_1 \\ \alpha u_2 & \beta v_2 \end{bmatrix}, \tag{1.23}$$
which implies
$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \alpha u_1 & \beta v_1 \\ \alpha u_2 & \beta v_2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}. \tag{1.24}$$
From (1.24) and (1.22) we obtain
$$\begin{bmatrix} \alpha u_1 & \beta v_1 \\ \alpha u_2 & \beta v_2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},$$
which can be written as
$$\begin{cases} \alpha u_1 x + \beta v_1 y = 0 \\ \alpha u_2 x + \beta v_2 y = 0. \end{cases} \tag{1.25}$$
Equation (1.22) can be written as
$$\begin{cases} u_1 x + v_1 y = 0 \\ u_2 x + v_2 y = 0. \end{cases}$$
Multiplying these equations by $\alpha$ and subtracting them from the corresponding equations in (1.25) we get
$$\begin{cases} (\beta-\alpha)v_1 y = 0 \\ (\beta-\alpha)v_2 y = 0, \end{cases}$$
which gives us $y = 0$ because $\beta - \alpha \neq 0$ and at least one of the numbers $v_1$ and $v_2$ is different from 0.

By modifying the above argument appropriately we obtain $x = 0$. This allows us to conclude, by Theorem 1.2.14, that the matrix $\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}$ is invertible. Since $\begin{bmatrix} \alpha u_1 & \beta v_1 \\ \alpha u_2 & \beta v_2 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}$, multiplying (1.23) by $\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}^{-1}$ on the right produces the desired result:
$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}^{-1}.$$

Note that the above proof shows more than just that every $2 \times 2$ matrix with two different real eigenvalues is diagonalizable. It gives us a practical method for diagonalizing such a matrix. First we need to find the eigenvalues. If the eigenvalues are two different real numbers $\alpha$ and $\beta$, then we need to find an eigenvector $\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$ corresponding to the eigenvalue $\alpha$ and an eigenvector $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ corresponding to the eigenvalue $\beta$. Then the matrix $\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}$ is invertible and we have
$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}^{-1}.$$
In the next example we use this method to diagonalize a given $2 \times 2$ matrix.

Example 1.4.13. If possible, diagonalize the matrix $A = \begin{bmatrix} 5 & 7 \\ 3 & 9 \end{bmatrix}$.

Solution. The eigenvalues of $A$ are the roots of the equation
$$\det \begin{bmatrix} 5-\lambda & 7 \\ 3 & 9-\lambda \end{bmatrix} = \lambda^2 - 14\lambda + 24 = 0.$$
We solve this quadratic equation and find that the eigenvalues are $\lambda = 2$ and $\lambda = 12$.

The eigenvectors corresponding to the eigenvalue $\lambda = 2$ are the solutions of the equation
$$\begin{bmatrix} 5 & 7 \\ 3 & 9 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = 2\begin{bmatrix} x \\ y \end{bmatrix},$$
which is equivalent to the equation
$$\begin{bmatrix} 3 & 7 \\ 3 & 7 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
So the vector $\begin{bmatrix} x \\ y \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda = 2$ if $3x + 7y = 0$. This means that we can take $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 7 \\ -3 \end{bmatrix}$ as an eigenvector of $A$ corresponding to $\lambda = 2$.

The eigenvectors corresponding to the eigenvalue $\lambda = 12$ are the solutions of the equation
$$\begin{bmatrix} 5 & 7 \\ 3 & 9 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = 12\begin{bmatrix} x \\ y \end{bmatrix},$$
or, equivalently, of the equation
$$\begin{bmatrix} -7 & 7 \\ 3 & -3 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
So the vector $\begin{bmatrix} x \\ y \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda = 12$ if $x - y = 0$. This means that we can take $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ as an eigenvector of $A$ corresponding to $\lambda = 12$.

Now we have everything we need to diagonalize the matrix $A$:
$$A = \begin{bmatrix} 7 & 1 \\ -3 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 12 \end{bmatrix}\begin{bmatrix} 7 & 1 \\ -3 & 1 \end{bmatrix}^{-1}.$$
Since $\begin{bmatrix} 7 & 1 \\ -3 & 1 \end{bmatrix}^{-1} = \frac{1}{10}\begin{bmatrix} 1 & -1 \\ 3 & 7 \end{bmatrix}$, we have
$$A = \begin{bmatrix} 7 & 1 \\ -3 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 12 \end{bmatrix}\begin{bmatrix} \frac{1}{10} & -\frac{1}{10} \\[2pt] \frac{3}{10} & \frac{7}{10} \end{bmatrix}.$$

We could also write
$$A = \begin{bmatrix} 1 & 7 \\ 1 & -3 \end{bmatrix}\begin{bmatrix} 12 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 1 & 7 \\ 1 & -3 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 7 \\ 1 & -3 \end{bmatrix}\begin{bmatrix} 12 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} \frac{3}{10} & \frac{7}{10} \\[2pt] \frac{1}{10} & -\frac{1}{10} \end{bmatrix}.$$
It is important to note that the order of the eigenvectors matches the order of the eigenvalues in the diagonal matrix.
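The factorization just obtained can be checked by multiplying the three matrices back together. A short Python sketch (our addition, not part of the text; exact rational arithmetic via the `fractions` module avoids rounding issues):

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix, computed exactly with Fractions."""
    a, b = M[0]
    c, d = M[1]
    det = a * d - b * c
    return [[F(d, det), F(-b, det)], [F(-c, det), F(a, det)]]

P = [[7, 1], [-3, 1]]   # eigenvectors as columns
D = [[2, 0], [0, 12]]   # matching eigenvalues on the diagonal
A = matmul(matmul(P, D), inverse_2x2(P))
print(A)  # equals [[5, 7], [3, 9]]
```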

Applications

Example 1.4.14. Let $p$, $q$, $r$, and $s$ be real numbers such that $p > 0$, $q > 0$, $r > 0$, $s > 0$, $p + r = 1$ and $q + s = 1$. Show that the matrix
$$A = \begin{bmatrix} p & q \\ r & s \end{bmatrix}$$
has eigenvalues 1 and $k$, where $-1 < k < 1$.

Solution. We can write $r = 1 - p$ and $q = 1 - s$. The eigenvalues of $A$ are the solutions of the equation
$$\det \begin{bmatrix} p-\lambda & 1-s \\ 1-p & s-\lambda \end{bmatrix} = 0.$$
Since
$$\begin{aligned}
\det \begin{bmatrix} p-\lambda & 1-s \\ 1-p & s-\lambda \end{bmatrix}
&= (p-\lambda)(s-\lambda) - (1-p)(1-s) \\
&= \lambda^2 - (p+s)\lambda + p + s - 1 \\
&= (\lambda-1)(\lambda-p-s+1),
\end{aligned}$$
the eigenvalues are 1 and $k = p + s - 1$. Moreover,
$$-1 < k = p + s - 1 < 1,$$
because $0 < p < 1$ and $0 < s < 1$.


Example 1.4.15. Let $p$, $q$, $r$, and $s$ be real numbers such that $p > 0$, $q > 0$, $r > 0$, $s > 0$, $p + r = 1$ and $q + s = 1$, and let $A = \begin{bmatrix} p & q \\ r & s \end{bmatrix}$. If $x_0$ and $y_0$ are arbitrary real numbers and
$$\begin{bmatrix} x_{n+1} \\ y_{n+1} \end{bmatrix} = A\begin{bmatrix} x_n \\ y_n \end{bmatrix} \quad\text{for } n \geq 0,$$
express $x_n$ and $y_n$ in terms of $x_0$ and $y_0$ and find the limits $\lim_{n\to\infty} x_n$ and $\lim_{n\to\infty} y_n$.

Solution. The eigenvalues of $A$ are 1 and $k$ with $-1 < k < 1$. Let $\mathbf{u}$ be an eigenvector corresponding to the eigenvalue 1 and $\mathbf{v}$ an eigenvector corresponding to the eigenvalue $k$. Because $\det\begin{bmatrix} \mathbf{u} & \mathbf{v} \end{bmatrix} \neq 0$, there are real numbers $\alpha$ and $\beta$ such that
$$\begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \alpha\mathbf{u} + \beta\mathbf{v}$$
by Theorem 1.3.12. Since
$$\begin{bmatrix} x_n \\ y_n \end{bmatrix} = A^n\begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \alpha A^n\mathbf{u} + \beta A^n\mathbf{v} = \alpha\mathbf{u} + \beta k^n\mathbf{v},$$
if $\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$, we get
$$\lim_{n\to\infty} x_n = \alpha u_1 \quad\text{and}\quad \lim_{n\to\infty} y_n = \alpha u_2.$$
We note that, if we denote $P = \begin{bmatrix} \mathbf{u} & \mathbf{v} \end{bmatrix}$ and $D = \begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix}$, then
$$\begin{aligned}
\lim_{n\to\infty}\begin{bmatrix} x_n \\ y_n \end{bmatrix}
&= \lim_{n\to\infty} A^n\begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \\
&= \lim_{n\to\infty} P D^n P^{-1}\begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \\
&= \lim_{n\to\infty} P\begin{bmatrix} 1 & 0 \\ 0 & k^n \end{bmatrix} P^{-1}\begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \\
&= P\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} P^{-1}\begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \\
&= P\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} P^{-1}P\begin{bmatrix} \alpha \\ \beta \end{bmatrix} \\
&= P\begin{bmatrix} \alpha \\ 0 \end{bmatrix} = \alpha\mathbf{u}.
\end{aligned}$$
1.4.1 Exercises

Find an eigenvector of the given matrix $A$ corresponding to the given eigenvalue $\lambda$.

1. $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$, $\lambda = 0$

2. $A = \begin{bmatrix} 2 & 4 \\ 1 & 5 \end{bmatrix}$, $\lambda = 1$

3. $A = \begin{bmatrix} -2 & 2 \\ 4 & 5 \end{bmatrix}$, $\lambda = 6$

4. $A = \begin{bmatrix} 1 & 8 \\ 2 & 7 \end{bmatrix}$, $\lambda = -1$

Find the eigenvalues of the given matrix.

5. $\begin{bmatrix} 5 & 3 \\ 1 & 3 \end{bmatrix}$

6. $\begin{bmatrix} 2 & 3 \\ 5 & 4 \end{bmatrix}$

7. $\begin{bmatrix} 10 & 4 \\ 6 & 5 \end{bmatrix}$

8. $\begin{bmatrix} 3 & 3 \\ 1 & 5 \end{bmatrix}$

9. $\begin{bmatrix} 4 & 3 \\ 4 & 3 \end{bmatrix}$

10. $\begin{bmatrix} 2 & 5 \\ -1 & -4 \end{bmatrix}$

11. $\begin{bmatrix} 1 & 7 \\ 9 & 3 \end{bmatrix}$

12. $\begin{bmatrix} 3 & 5 \\ 12 & 10 \end{bmatrix}$

13. $\begin{bmatrix} a & b \\ 1-a & 1-b \end{bmatrix}$

14. $\begin{bmatrix} a & b \\ 3-a & 3-b \end{bmatrix}$

15. $\begin{bmatrix} a+3 & 2a \\ 7a & 14a+3 \end{bmatrix}$

16. $\begin{bmatrix} a+k & b \\ 2a & 2b+k \end{bmatrix}$
17. Find a matrix $A$ such that $\begin{bmatrix} 4 \\ -1 \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda = 3$ and $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda = 0$.

18. Find a matrix $A$ such that $\begin{bmatrix} 3 \\ 2 \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda = 2$ and $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda = -4$.

19. Let $s$ and $t$ be two real numbers. Find a matrix $A$ such that $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $s$ and $\begin{bmatrix} 1 \\ 4 \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $t$.

20. Let $s$ and $t$ be two real numbers. Find a matrix $A$ such that $\begin{bmatrix} 3 \\ 4 \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $s$ and $\begin{bmatrix} 2 \\ 0 \end{bmatrix}$ is an eigenvector of $A$ corresponding to the eigenvalue $t$.
21. Find real numbers $a$ and $b$ such that the matrix $\begin{bmatrix} a & b \\ 5 & 3 \end{bmatrix}$ has eigenvalues 1 and 2.

22. Assuming a matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ has only one eigenvalue $\alpha$, find $\alpha$.

If possible, diagonalize the given matrix.

23. $A = \begin{bmatrix} 9 & 3 \\ 5 & 7 \end{bmatrix}$

24. $A = \begin{bmatrix} 2 & 2 \\ 1 & 3 \end{bmatrix}$

25. $A = \begin{bmatrix} 2 & 4 \\ 3 & 13 \end{bmatrix}$

26. $A = \begin{bmatrix} 5 & 3 \\ 3 & 13 \end{bmatrix}$

27. For $A = \begin{bmatrix} 2 & 1 \\ 8 & 9 \end{bmatrix}$ find two different invertible matrices $P_1$ and $P_2$ and a diagonal matrix $D$ such that $A = P_1 D P_1^{-1}$ and $A = P_2 D P_2^{-1}$.

28. For $A = \begin{bmatrix} 3 & 3 \\ 8 & 13 \end{bmatrix}$ find two different invertible matrices $P_1$ and $P_2$ and two different diagonal matrices $D_1$ and $D_2$ such that $A = P_1 D_1 P_1^{-1}$ and $A = P_2 D_2 P_2^{-1}$.

Find the given products of matrices.

29. $\begin{bmatrix} 2 & 4 \\ 2 & 9 \end{bmatrix}^n$

30. $\begin{bmatrix} 3 & 2 \\ 4 & 10 \end{bmatrix}^n$

31. Consider sequences $(x_n)$ and $(y_n)$ defined by the following recurrence relations
$$\begin{cases} x_{n+1} = 3x_n + 7y_n \\ y_{n+1} = x_n + 9y_n \end{cases}$$
with $x_0 = y_0 = 1$. Find $x_{33}$ and $y_{33}$.

32. Consider sequences $(x_n)$ and $(y_n)$ defined by the following recurrence relations
$$\begin{cases} x_{n+1} = 2x_n + 2y_n \\ y_{n+1} = 5x_n + 11y_n \end{cases}$$
with $x_0 = 2$ and $y_0 = 3$. Find $x_n$ and $y_n$.

33. Consider sequences $(x_n)$ and $(y_n)$ defined by the following recurrence relations
$$\begin{cases} x_{n+1} = \frac{3}{5}x_n + \frac{1}{4}y_n \\[2pt] y_{n+1} = \frac{2}{5}x_n + \frac{3}{4}y_n \end{cases}$$
with $x_0 = 2$ and $y_0 = 3$. Find $\lim_{n\to\infty} x_n$ and $\lim_{n\to\infty} y_n$.



34. Consider sequences $(x_n)$ and $(y_n)$ defined by the following recurrence relations
$$\begin{cases} x_{n+1} = \frac{1}{2}x_n + \frac{1}{3}y_n \\[2pt] y_{n+1} = \frac{1}{2}x_n + \frac{2}{3}y_n \end{cases}$$
with $x_0 = 5$ and $y_0 = 2$. Find $\lim_{n\to\infty} x_n$ and $\lim_{n\to\infty} y_n$.

Chapter 2

Matrices

2.1 General matrices


In Chapter 1 we considered some of the fundamental ideas of linear algebra in the
context of 2 × 2 matrices. The discussed ideas and methods can be generalized and
used to solve problems that require matrices of larger size. In this chapter we study
algebraic properties of matrices of arbitrary size.

Basic definitions

Definition 2.1.1. By an $m \times n$ matrix we mean an array of numbers
$$A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}.$$
The matrices
$$\begin{bmatrix} a_{i1} & a_{i2} & \dots & a_{in} \end{bmatrix},$$
where $1 \leq i \leq m$, are called the rows of the matrix $A$. The matrices
$$\begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix},$$
where $1 \leq j \leq n$, are called the columns of the matrix $A$. The numbers $a_{ij}$ are referred to as entries of the matrix. An $n \times n$ matrix is called a square matrix.


An m × n matrix has m rows, n columns, and mn entries.

Example 2.1.2. The matrix
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}$$
has three rows
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \end{bmatrix}, \quad \begin{bmatrix} a_{21} & a_{22} & a_{23} & a_{24} \end{bmatrix}, \quad \begin{bmatrix} a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix},$$
four columns
$$\begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix}, \quad \begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \end{bmatrix}, \quad \begin{bmatrix} a_{13} \\ a_{23} \\ a_{33} \end{bmatrix}, \quad \begin{bmatrix} a_{14} \\ a_{24} \\ a_{34} \end{bmatrix},$$
and 12 entries.

Definition 2.1.3. Two matrices are equal if they have the same size and the corresponding entries are the same. More precisely, two $m \times n$ matrices
$$A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} b_{11} & b_{12} & \dots & b_{1n} \\ b_{21} & b_{22} & \dots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{m1} & b_{m2} & \dots & b_{mn} \end{bmatrix}$$
are equal if and only if $a_{ij} = b_{ij}$ for every $1 \leq i \leq m$ and every $1 \leq j \leq n$.

Example 2.1.4. The matrices
$$\begin{bmatrix} 1 & -1 & 2 & 5 \\ 0 & \pi & -\frac{1}{2} & 7 \\ \sqrt{2} & 0 & -0.77 & 13 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & -1 & 2 & 5 \\ 0 & \pi & -\frac{1}{2} & 7 \\ \sqrt{2} & 0 & 0.77 & 13 \end{bmatrix}$$
are not equal.


Sum of matrices

Definition 2.1.5. If $A$ and $B$ are two $m \times n$ matrices, the sum $A + B$ is the matrix whose entry in the $i$-th row and $j$-th column is $a_{ij} + b_{ij}$, where $a_{ij}$ is the entry of the matrix $A$ in the $i$-th row and $j$-th column and $b_{ij}$ is the entry of the matrix $B$ in the $i$-th row and $j$-th column:
$$\begin{bmatrix} a_{11} & \dots & a_{1n} \\ a_{21} & \dots & a_{2n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{bmatrix} + \begin{bmatrix} b_{11} & \dots & b_{1n} \\ b_{21} & \dots & b_{2n} \\ \vdots & & \vdots \\ b_{m1} & \dots & b_{mn} \end{bmatrix} = \begin{bmatrix} a_{11}+b_{11} & \dots & a_{1n}+b_{1n} \\ a_{21}+b_{21} & \dots & a_{2n}+b_{2n} \\ \vdots & & \vdots \\ a_{m1}+b_{m1} & \dots & a_{mn}+b_{mn} \end{bmatrix}$$

In other words, we add matrices by adding the corresponding entries.

Note that $A + B$ does not make sense if the matrices $A$ and $B$ are not of the same size.

Example 2.1.6.
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} = \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \\ a_{31}+b_{31} & a_{32}+b_{32} \end{bmatrix}$$
$$\begin{bmatrix} 3 & 4 & -1 \\ 2 & 5 & 1 \end{bmatrix} + \begin{bmatrix} 1 & 0 & 3 \\ 4 & -3 & 7 \end{bmatrix} = \begin{bmatrix} 4 & 4 & 2 \\ 6 & 2 & 8 \end{bmatrix}$$
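The entrywise rule of Definition 2.1.5 translates directly into code. A minimal Python sketch (the helper `mat_add` is ours, not from the text), applied to the numerical sum of Example 2.1.6:

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size (nested lists)."""
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "sizes must match"
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

print(mat_add([[3, 4, -1], [2, 5, 1]], [[1, 0, 3], [4, -3, 7]]))
# [[4, 4, 2], [6, 2, 8]]
```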

Definition 2.1.7. The $m \times n$ matrix all of whose entries are 0,
$$\begin{bmatrix} 0 & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & 0 \end{bmatrix},$$
is called the $m \times n$ zero matrix. The $m \times n$ zero matrix will be denoted by $0_{m,n}$ or simply by $0$ when the dimension of the matrix is clear from the context.
If $A$, $B$, and $C$ are three $m \times n$ matrices and $0$ is the $m \times n$ zero matrix, then
$$A + 0 = 0 + A = A,$$
$$A + B = B + A,$$
$$(A + B) + C = A + (B + C).$$

The above properties are immediate consequences of the definition of addition of matrices and the corresponding properties of addition of real numbers.

Scalar multiplication

Definition 2.1.8. If $A$ is an $m \times n$ matrix and $t$ is a real number, then $tA$ is the matrix whose entry in the $i$-th row and $j$-th column is $ta_{ij}$, where $a_{ij}$ is the entry of the matrix $A$ in the $i$-th row and $j$-th column:
$$t\begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} = \begin{bmatrix} ta_{11} & ta_{12} & \dots & ta_{1n} \\ ta_{21} & ta_{22} & \dots & ta_{2n} \\ \vdots & \vdots & & \vdots \\ ta_{m1} & ta_{m2} & \dots & ta_{mn} \end{bmatrix}$$

In other words, to multiply a matrix $A$ by a number $t$ we multiply every entry of $A$ by $t$. The operation of multiplication of a matrix by a number is referred to as scalar multiplication.

Example 2.1.9.
$$t\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} = \begin{bmatrix} ta_{11} & ta_{12} \\ ta_{21} & ta_{22} \\ ta_{31} & ta_{32} \end{bmatrix}$$
$$5\begin{bmatrix} 3 & 4 \\ 2 & 5 \end{bmatrix} = \begin{bmatrix} 15 & 20 \\ 10 & 25 \end{bmatrix}$$
If $A$ and $B$ are $m \times n$ matrices and $s$ and $t$ are real numbers, then
$$(s + t)A = sA + tA,$$
$$(st)A = s(tA),$$
$$t(A + B) = tA + tB.$$

The above properties are immediate consequences of the definitions of scalar multiplication and addition of matrices and the corresponding properties of addition and multiplication of real numbers.

Product of matrices

Products of matrices play a fundamental role in linear algebra. So far we considered the following products:
$$\begin{bmatrix} a_1 & a_2 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}, \quad \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}, \quad \begin{bmatrix} a_1 & a_2 \end{bmatrix}\begin{bmatrix} b_1 & b_3 \\ b_2 & b_4 \end{bmatrix}$$
as well as
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}.$$
The first product can be easily extended to higher dimensions:
$$\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = a_1b_1 + a_2b_2 + a_3b_3,$$
$$\begin{bmatrix} a_1 & a_2 & a_3 & a_4 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} = a_1b_1 + a_2b_2 + a_3b_3 + a_4b_4,$$
$$\begin{bmatrix} a_1 & a_2 & a_3 & a_4 & a_5 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \end{bmatrix} = a_1b_1 + a_2b_2 + a_3b_3 + a_4b_4 + a_5b_5,$$
and, in general,
$$\begin{bmatrix} a_1 & a_2 & \dots & a_m \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} = a_1b_1 + a_2b_2 + \dots + a_mb_m.$$
Two matrices can be multiplied if the number of columns of the one on the left is the same as the number of rows of the one on the right. So, if $A$ is a $k \times l$ matrix and $B$ is an $m \times n$ matrix, then the product $AB$ is well-defined if $l = m$.

Definition 2.1.10. The product of a $k \times m$ matrix $A$ and an $m \times n$ matrix $B$ is the $k \times n$ matrix $AB$ such that the entry in the $i$-th row and the $j$-th column is
$$\begin{bmatrix} a_{i1} & a_{i2} & \dots & a_{im} \end{bmatrix}\begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{mj} \end{bmatrix},$$
where $\begin{bmatrix} a_{i1} & a_{i2} & \dots & a_{im} \end{bmatrix}$ is the $i$-th row of the matrix $A$ and $\begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{mj} \end{bmatrix}$ is the $j$-th column of the matrix $B$.

For example,
$$\begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ c_1 & c_2 & c_3 & c_4 \\ f_1 & f_2 & f_3 & f_4 \end{bmatrix}\begin{bmatrix} b_1 & d_1 \\ b_2 & d_2 \\ b_3 & d_3 \\ b_4 & d_4 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} & \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \end{bmatrix}\begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \end{bmatrix} \\[6pt] \begin{bmatrix} c_1 & c_2 & c_3 & c_4 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} & \begin{bmatrix} c_1 & c_2 & c_3 & c_4 \end{bmatrix}\begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \end{bmatrix} \\[6pt] \begin{bmatrix} f_1 & f_2 & f_3 & f_4 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} & \begin{bmatrix} f_1 & f_2 & f_3 & f_4 \end{bmatrix}\begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \end{bmatrix} \end{bmatrix}$$
$$= \begin{bmatrix} a_1b_1 + a_2b_2 + a_3b_3 + a_4b_4 & a_1d_1 + a_2d_2 + a_3d_3 + a_4d_4 \\ c_1b_1 + c_2b_2 + c_3b_3 + c_4b_4 & c_1d_1 + c_2d_2 + c_3d_3 + c_4d_4 \\ f_1b_1 + f_2b_2 + f_3b_3 + f_4b_4 & f_1d_1 + f_2d_2 + f_3d_3 + f_4d_4 \end{bmatrix}$$

Here are some concrete examples of products of matrices.


Example 2.1.11.
$$\begin{bmatrix} 1 & 3 & 0 & 2 \\ 2 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} 4 & 1 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 6 \\ 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 9 & 6 \\ 9 & 3 & 8 \end{bmatrix}$$
$$\begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ 1 & -1 & 1 \end{bmatrix}\begin{bmatrix} 2 & 2 & 3 & 6 \\ 3 & 1 & 1 & 6 \\ 2 & 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 8 & 4 & 5 & 18 \\ 11 & 3 & 4 & 18 \\ 1 & 1 & 3 & 0 \end{bmatrix}$$
$$\begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} -1 \\ 4 \\ 2 \end{bmatrix}$$
$$\begin{bmatrix} 1 \\ 3 \end{bmatrix}\begin{bmatrix} 2 & 1 & -5 & 4 \end{bmatrix} = \begin{bmatrix} 2 & 1 & -5 & 4 \\ 6 & 3 & -15 & 12 \end{bmatrix}$$
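Definition 2.1.10 translates into a triple loop: entry $(i, j)$ of $AB$ is the product of row $i$ of $A$ with column $j$ of $B$. A short Python sketch (the helper `mat_mul` is ours, not from the text), applied to the first product of Example 2.1.11:

```python
def mat_mul(A, B):
    """General matrix product: (k x m) times (m x n) gives (k x n)."""
    k, m, n = len(A), len(B), len(B[0])
    assert len(A[0]) == m, "columns of A must equal rows of B"
    return [[sum(A[i][t] * B[t][j] for t in range(m)) for j in range(n)]
            for i in range(k)]

A = [[1, 3, 0, 2], [2, 0, 1, 0]]
B = [[4, 1, 1], [0, 2, 1], [1, 1, 6], [0, 1, 1]]
print(mat_mul(A, B))  # [[4, 9, 6], [9, 3, 8]]
```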

In Theorem 1.1.10 we showed that the product of $2 \times 2$ matrices is associative, that is, $A(BC) = (AB)C$ for any $2 \times 2$ matrices $A$, $B$, and $C$. This property is not limited to $2 \times 2$ matrices.

Theorem 2.1.12. If $A$ is an $m \times n$ matrix, $B$ is an $n \times p$ matrix, and $C$ is a $p \times q$ matrix, then we have
$$A(BC) = (AB)C.$$

Proof of a particular case. The identity can be obtained by direct calculations. We illustrate the idea of the proof by considering the case when $m = 1$, $n = 3$, $p = 2$, and $q = 1$.

Let
$$A = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}, \quad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}, \quad\text{and}\quad C = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}.$$
Then
$$\begin{aligned}
A(BC) &= \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\left(\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix}\right) \\
&= \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\begin{bmatrix} b_{11}c_1 + b_{12}c_2 \\ b_{21}c_1 + b_{22}c_2 \\ b_{31}c_1 + b_{32}c_2 \end{bmatrix} \\
&= a_1b_{11}c_1 + a_1b_{12}c_2 + a_2b_{21}c_1 + a_2b_{22}c_2 + a_3b_{31}c_1 + a_3b_{32}c_2
\end{aligned}$$
and
$$\begin{aligned}
(AB)C &= \left(\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}\right)\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} \\
&= \begin{bmatrix} a_1b_{11} + a_2b_{21} + a_3b_{31} & a_1b_{12} + a_2b_{22} + a_3b_{32} \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} \\
&= a_1b_{11}c_1 + a_2b_{21}c_1 + a_3b_{31}c_1 + a_1b_{12}c_2 + a_2b_{22}c_2 + a_3b_{32}c_2 \\
&= a_1b_{11}c_1 + a_1b_{12}c_2 + a_2b_{21}c_1 + a_2b_{22}c_2 + a_3b_{31}c_1 + a_3b_{32}c_2.
\end{aligned}$$

Definition 2.1.13. The $n \times n$ matrix with 1's on the main diagonal and 0's everywhere else is called a unit matrix or an identity matrix and denoted by $I_n$:
$$I_n = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & 1 \end{bmatrix}$$

$I_n$ is called an identity matrix because when we multiply a matrix $A$ by the identity matrix of the appropriate size, the result is the original matrix $A$.

Theorem 2.1.14. Let $A$ be an $m \times n$ matrix. Then
$$I_m A = A \quad\text{and}\quad AI_n = A.$$

Proof of a particular case. We verify the result for the matrix
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}.$$
Indeed, we have
$$I_2 A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} & \begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix} & \begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} a_{13} \\ a_{23} \end{bmatrix} \\[6pt] \begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} & \begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix} & \begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} a_{13} \\ a_{23} \end{bmatrix} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}$$
and
$$AI_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} & \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} & \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \\[6pt] \begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} & \begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} & \begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}.$$

Theorem 2.1.15. If $A$ is an $m \times n$ matrix and $B$ and $C$ are $n \times p$ matrices, then
$$A(B + C) = AB + AC.$$

Proof of a particular case. We illustrate the method of the proof by considering the case when $m = 1$, $n = 3$, and $p = 1$.
$$\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\left(\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} + \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}\right) = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\begin{bmatrix} b_1 + c_1 \\ b_2 + c_2 \\ b_3 + c_3 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} + \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}.$$

Theorem 2.1.16. If $A$ and $B$ are $m \times n$ matrices and $C$ is an $n \times p$ matrix, then
$$(A + B)C = AC + BC.$$

Proof of a particular case. To illustrate the method of the proof we consider the case when $m = 1$, $n = 3$, and $p = 1$.
$$\left(\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} + \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix}\right)\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} a_1 + b_1 & a_2 + b_2 & a_3 + b_3 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}$$
   
$$= \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} + \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}.$$

Theorem 2.1.17. If $A$ is an $m \times n$ matrix, $B$ is an $n \times p$ matrix, and $t$ is a real number, then
$$t(AB) = (tA)B = A(tB).$$

Proof of a particular case. We illustrate the method of the proof by considering the case when $m = 1$, $n = 3$, and $p = 1$.
$$t\left(\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}\right) = \begin{bmatrix} ta_1 & ta_2 & ta_3 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}\begin{bmatrix} tb_1 \\ tb_2 \\ tb_3 \end{bmatrix} = ta_1b_1 + ta_2b_2 + ta_3b_3.$$

We close this section with a simple theorem that is often quite useful in arguments where it is necessary to show that two matrices are equal.

Theorem 2.1.18. Let $A$ and $B$ be $m \times n$ matrices. If
$$A\mathbf{x} = B\mathbf{x}$$
for every vector $\mathbf{x}$ in $\mathbb{R}^n$, then $A = B$.

Proof of a particular case. We verify the result when $A$ and $B$ are $2 \times 3$ matrices.

If the equality
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$
     
holds for an arbitrary vector $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ in $\mathbb{R}^3$, then it must hold for the vectors $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, and $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$. Consequently, we have
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},$$
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},$$
and
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.$$
The above equalities can be written together as
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
Now we have
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}.$$
But this means that $A = B$.


Transpose of a matrix

Definition 2.1.19. The transpose of a matrix
$$A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}$$
is the matrix denoted by $A^T$ whose rows are the columns of $A$ in the same order, that is,
$$A^T = \begin{bmatrix} a_{11} & a_{21} & \dots & a_{m1} \\ a_{12} & a_{22} & \dots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \dots & a_{mn} \end{bmatrix}.$$

The first row of $A^T$ is the same as the first column of $A$, the second row of $A^T$ is the same as the second column of $A$, and so on. Note that the columns of $A^T$ are the same as the rows of $A$. If $A$ is an $m \times n$ matrix, then $A^T$ is an $n \times m$ matrix.

Example 2.1.20. Here are some examples of transposes of matrices of different sizes:
$$\begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 3 & 4 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 4 \end{bmatrix}$$
$$\begin{bmatrix} 1 & -1 & 0 \\ 2 & 3 & 5 \\ -3 & 4 & -7 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & -3 \\ -1 & 3 & 4 \\ 0 & 5 & -7 \end{bmatrix}$$
$$\begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix}$$
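Transposition is just an index swap: the $(j, i)$ entry of $A^T$ is the $(i, j)$ entry of $A$. A minimal Python sketch (the helper `transpose` is ours, not from the text):

```python
def transpose(A):
    """Transpose of a matrix given as a nested list: rows become columns."""
    return [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

print(transpose([[1, 3], [2, 0], [3, 4]]))   # [[1, 2, 3], [3, 0, 4]]
print(transpose([[1], [2], [3], [4]]))       # [[1, 2, 3, 4]]
```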

Note that the products $AA^T$ and $A^TA$ are always defined. If $A$ is an $m \times n$ matrix, then $AA^T$ is a square $m \times m$ matrix and $A^TA$ is a square $n \times n$ matrix.
Example 2.1.21. We have
$$\begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 3 & 4 \end{bmatrix}^T = \begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 4 \end{bmatrix} = \begin{bmatrix} 10 & 2 & 15 \\ 2 & 4 & 6 \\ 15 & 6 & 25 \end{bmatrix}$$
and
$$\begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 3 & 4 \end{bmatrix}^T\begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 4 \end{bmatrix}\begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 14 & 15 \\ 15 & 25 \end{bmatrix}$$

Example 2.1.22. We have


  T  
3 3 3 9 3 15 −3
  
 1  1
    =  1 3 1 5 −1 =  3 1 5 −1
 £ ¤  
 5  5  5  15 5 25 −5
−1 −1 −1 −3 −1 −5 1

and
T  
3 3 3
  
 1  1  £ ¤  1
    = 3 1 5 −1   = 3 · 3 + 1 · 1 + 5 · 5 + (−1) · (−1) = 36
 5  5   5
−1 −1 −1

For the next result we leave the student to verify some particular cases by direct
calculations in exercises.

Theorem 2.1.23. If A is a k × m matrix and B is an m × n matrix, then

(AB )T = B T A T .

Note that the order of matrices in the above equality changes. Since B T is an
n × m matrix and A T is an m × k, the product B T A T makes sense. For example, we
have
µ· ¸· ¸¶T · ¸T · ¸T
a 11 a 12 b 11 b 12 b 11 b 12 a 11 a 12
=
a 21 a 22 b 21 b 22 b 21 b 22 a 21 a 22

and
¸ b 11 b 12 T
    T
· b 11 b 12 · ¸T
 11 12 13 b 21 b 22  = b 21 b 22  a 11 a 12 a 13 .
a a a
a 21 a 22 a 23 a 21 a 22 a 23
b 31 b 32 b 31 b 32
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 68

68 Chapter 2: Matrices

These properties can be verified by direct calculations using the obvious equalities
· ¸ · ¸
£ ¤ b1 £ ¤ a1
a1 a2 = b1 b2
b2 a2

and    
£ ¤ b1 £ ¤ a1
a1 a2 a 3 b 2  = b 1 b 2 b 3 a 2  .
b3 a3

Definition 2.1.24. A square matrix A is called symmetric if A T = A.

Example 2.1.25. Here are some examples of symmetric matrices:

1 2 3 4
 
 
· ¸ 1 3 0
1 −2 3 −4 2 ,
2 0 −1 6
, 3 −1 −5 0 .
 
−2 1
0 2 −1
4 6 0 7

Note that both matrices A A T and A T A found in Example 2.1.21 are symmetric.
It is not difficult to show that for an arbitrary matrix A the matrices A A T and A T A
are symmetric.

2.1.1 Exercises

Perform the indicated operations on matrices.


· ¸
3 2 1
£ ¤ £ ¤
1. 1 2 −3 5 + 9 −2 4 8 5. 5
1 4 −1
2 9
   
1 2
 
 0  1  5
2.  −2
−1 +  1 6. 3 
   
1 0
3 −3 2 7
   
·
1 2 3
¸ ·
7 1 −4
¸ 5 1 4 −1
3. + 7. 2 1 7 + 3 2 8
1 0 −1 11 2 −3
2 3 0 −5
         
1 1 2 3 1 4 2 2 0 1 2 1
4. 0 1 −1 + −2 5 −1 8. 3 1 5 +8  1 −1 +2 −1 0
3 −2 0 −3 1 2 3 0 −3 1 1 −1
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 69

2.1. GENERAL MATRICES 69

· ¸
1 £ 1 1 1
¤  
9. 2 5 · ¸
4 1 1 0 1 1 0 0
15. 
· ¸ 1 0 0 1 0 1 0
5 £ ¤ 0 0 1
10. 1 7
2
1 −1 0 
 

1 1 2 1 1 1
 
¤ 3  1 1 1 1 3
16. 
£  
11. 2 −1 1 4  
2  0 1
1 0 1
5
 
· ¸ 1 1 · ¸
3
  1 2 5  2 0
17. 0 3
 2 3 2 1 1 1
£ ¤  1 0
12. −1 4 2 4 3 
 6

 
−3 · ¸· ¸ 1 1
2 0 1 2 5 
4 18. 0 3
1 1 3 2 1
1 0
4
 
 
2 £ ¤ 5 £ · ¸
1 2 7 1
13.   ¤ 1 1 0 0
19. 1 4 −1
1 0 1 1
5 2
   
7 £ · ¸
¤ 1
¤ 1 £
14. 2 2 1 1 2 20. x y z 3
1
1 2

Explain why the following products are not defined.


 
¸ 1 1 1 1 −1 
 
· 
1 1 −1 1  1 1
21. 1 0 1   1 1 

0 1 1 1 1 3 1 3
22.  
1 1 0
0 1
1 2

23. Show, without using Theorem 2.1.18, that if


· ¸· ¸ · ¸· ¸ ·¸
a 11 a 12 x 1 b 11 b 12 x 1 x1
= for every ,
a 21 a 22 x 2 b 21 b 22 x 2 x2
· ¸ · ¸
a 11 a 12 b 11 b 12
then = .
a 21 a 22 b 21 b 22
24. Show, without using Theorem 2.1.18, that if
      
a 11 a 12 a 13 x1 b 11 b 12 b 13 x1 x1
a 21 a 22 a 23  x 2  = b 21 b 22 b 23  x 2  for every x 2  ,
a 31 a 32 a 33 x3 b 31 b 32 b 33 x3 x3
   
a 11 a 12 a 13 b 11 b 12 b 13
then a 21 a 22 a 23  = b 21 b 22 b 23 .
a 31 a 32 a 33 b 31 b 32 b 33
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 70

70 Chapter 2: Matrices

25. Show, without using Theorem 2.1.18, that if

x1 x1 x1
    
£ ¤ x 2  £ ¤ x 2  x 2 
a 11 a 12 a 13 a 14 x 3  = b 11
 b 12 b 13 b 14  
x 3  for every 
x 3  ,

x4 x4 x4
£ ¤ £ ¤
then a 11 a 12 a 13 a 14 = b 11 b 12 b 13 b 14 .

26. Show, without using Theorem 2.1.18, that if

x1 x1 x1
     
· ¸ · ¸
a 11 a 12 a 13 a 14 x 2 
  = b 11 b 12 b 13 b 14 x 2  x 2 
  for every 
x 3  ,

a 21 a 22 a 23 a 24 x 3  b 21 b 22 b 23 b 24 x 3 
x4 x4 x4
· ¸ · ¸
a 11 a 12 a 13 a 14 b 11 b 12 b 13 b 14
then = .
a 21 a 22 a 23 a 24 b 21 b 22 b 23 b 24

Find A T for the given matrix A.

27. A = 1 2 −3

28. A = 1 2 3
        4 5 6
        7 8 9

29. A = 1 2 5
        3 4 2
        1 1 1

30. A = 1 1 2 1
        0 1 −1 9
        3 −2 0 3

31. Show that


      ( a 1 a 2   b 1 b 2 ) T     b 1 b 2  T   a 1 a 2  T
      ( a 3 a 4   b 3 b 4 )    =  b 3 b 4      a 3 a 4    .

32. Show that

                          b 11 b 12  T     b 11 b 12  T                   T
      ( a 11 a 12 a 13    b 21 b 22 )   =  b 21 b 22      a 11 a 12 a 13
      ( a 21 a 22 a 23    b 31 b 32 )      b 31 b 32      a 21 a 22 a 23    .

33. Let A be a 2 × 2 matrix. Show that det A = det A T .


    
                                        c 1                 (  b 11 b 12 b 13   c 1  )
34. Show that  (  a 1 a 2   b 11 b 12 b 13  )  c 2  =  a 1 a 2 (  b 21 b 22 b 23   c 2  ) .
               (            b 21 b 22 b 23  )  c 3

35. Show that for an arbitrary matrix A the matrix A A T is symmetric.


2.2 Gaussian elimination


In Chapter 1 we solved a system of linear equations by writing it in the matrix form
and then solving it by inverting the matrix. In this chapter we present a different
approach. First we note that solving the system

 a1 x + b1 y + c1 z = d1
 a2 x + b2 y + c2 z = d2
 a3 x + b3 y + c3 z = d3

is more difficult and time consuming than solving the system that has the form

 a1 x + b1 y + c1 z = d1
b2 y + c2 z = d2 . (2.1)
c3 z = d3

Then we observe that we will not affect the solution of the system of linear
equations if we multiply one of the equations by a number different from 0 or
multiply one of the equations by a number and then add the result to another equation.
Moreover, if necessary, we can always change the order of equations in the system.
It turns out that by manipulating the system as described above we can eventually
change it to the form (2.1) or a similar form that makes solving the system very easy.
This process is referred to as Gaussian elimination. In this chapter we discuss the
process of Gaussian elimination in detail and examine different possible outcomes
of the process.

Elementary operations
The operations used in the Gaussian elimination process are called elementary operations.

Definition 2.2.1. By elementary operations on a system of linear equations


we mean the following three operations:

• Interchange two equations.

• Multiply an equation by a nonzero constant.

• Multiply an equation by a constant and then add the result to another


equation.

Now we give a number of examples in order to illustrate and clarify the meaning
of these operations. In these examples we give a system of linear equations, then we
describe the elementary operation that will be applied to the system, and then show
the resulting system.
Example 2.2.2. 
 2x + 3y − 2z = 1
x + 2y + z = 3
3x + 4y − 5z = −1

interchange equation 1 and equation 3



 3x + 4y − 5z = −1
x + 2y + z = 3
2x + 3y − 2z = 1

Example 2.2.3. 
 2x + 3y − 2z = 1
x + 2y + z = 3
3x + 4y − 5z = −1

multiply equation 2 by 7

 2x + 3y − 2z = 1
7x + 14y + 7z = 21
3x + 4y − 5z = −1

Example 2.2.4. 
 2x + 3y − 2z = 1
x + 2y + z = 3
3x + 4y − 5z = −1

multiply equation 1 by 4 and then add to equation 3



 2x + 3y − 2z = 1
x + 2y + z = 3
11x + 16y − 13z = 3

In the next example we use two elementary operations to modify a system. It


is usually necessary to use several elementary operations to modify the system to a
form that is easy to solve.
Example 2.2.5. 
 2x + 3y − 2z = 1
x + 2y + z = 3
3x + 4y − 5z = −1

multiply equation 2 by −3 and then add to equation 3



 2x + 3y − 2z = 1
x + 2y + z = 3
− 2y − 8z = −10

add equation 3 to equation 2



 2x + 3y − 2z = 1
x − 7z = −7
− 2y − 8z = −10

It may seem that the second operation, namely add equation 3 to equation 2, is
not an elementary operation, since it is not listed in the definition at the beginning
of this section, but in fact, it is a special case of the third operation, since we could
describe it as multiply equation 3 by 1 and then add to equation 2.
Now we illustrate how we can use elementary operations to solve a system of
linear equations.

Example 2.2.6. Solve the system



 x + y + z = −1
2x + 4y + 3z = 0
3x + y + 5z = 1

Solution. First we eliminate the x-terms from the second and third equations.
multiply equation 1 by −2 and then add to equation 2

 x + y + z = −1
2y + z = 2
3x + y + 5z = 1

multiply equation 1 by −3 and then add to equation 3



 x + y + z = −1
2y + z = 2
− 2y + 2z = 4

To change the 2 in front of y in the second equation to 1 we

multiply equation 2 by 1/2

 x + y + z = −1
 y + (1/2)z = 1
 − 2y + 2z = 4

Next we eliminate the y-term from the third equation.

multiply equation 2 by 2 and then add to equation 3

 x + y + z = −1
 y + (1/2)z = 1
 3z = 6

To change the 3 in front of z in the third equation to 1 we

multiply equation 3 by 1/3

 x + y + z = −1
 y + (1/2)z = 1
 z = 2

Now we eliminate the z-terms from the first and second equations.

multiply equation 3 by −1/2 and then add to equation 2

 x + y + z = −1
 y = 0
 z = 2

multiply equation 3 by −1 and then add to equation 1

 x + y = −3
 y = 0
 z = 2

Finally, we eliminate the y-term from the first equation.

multiply equation 2 by −1 and then add to equation 1

 x = −3
 y = 0
 z = 2

The solution of the system is

 x = −3
 y = 0
 z = 2 .


Example 2.2.7. Solve the system



 2x + 3y − 2z = 1
x + 2y + z = 3
3x + 4y − 5z = −1

Solution. Because the second equation looks simpler than the first one we
interchange equation 1 and equation 2.

 x + 2y + z = 3
2x + 3y − 2z = 1
3x + 4y − 5z = −1

Next we eliminate the x-terms from the second and third equations.
multiply equation 1 by −2 and then add to equation 2

 x + 2y + z = 3
−y − 4z = −5
3x + 4y − 5z = −1

multiply equation 1 by −3 and then add to equation 3



 x + 2y + z = 3
−y − 4z = −5
− 2y − 8z = −10

To remove the minus sign in front of y from the second equation we


multiply equation 2 by −1

 x + 2y + z = 3
y + 4z = 5
− 2y − 8z = −10

Now we eliminate the y-terms from the third equation


multiply equation 2 by 2 and then add to equation 3

 x + 2y + z = 3
y + 4z = 5
0 + 0 = 0

Finally we eliminate the y-term from the first equation.


multiply equation 2 by −2 and then add to equation 1

x − 7z = −7
y + 4z = 5
0= 0

Since the last equation does not contribute anything, the system is equivalent
to the system with two equations:
x − 7z = −7
y + 4z = 5

This system has infinitely many solutions. The solutions can be described in the
form
 x = −7 + 7z
 y = 5 − 4z ,
where z is an arbitrary real number.
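A quick way to gain confidence in a parametric answer like this is to substitute it back into the original system for a few values of the free variable; a small sketch in Python (the function name is ours):

```python
from fractions import Fraction

def satisfies(x, y, z):
    """Check (x, y, z) against the original system of this example."""
    return (2*x + 3*y - 2*z == 1
            and x + 2*y + z == 3
            and 3*x + 4*y - 5*z == -1)

# For every choice of the free variable z, the point
# (x, y, z) = (-7 + 7z, 5 - 4z, z) solves the system.
for z in [Fraction(0), Fraction(1), Fraction(-3, 2)]:
    assert satisfies(-7 + 7*z, 5 - 4*z, z)
```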

The process of solving systems of linear equations using the Gaussian elimination
method can be simplified if we use matrices. We observe that the complete
information about a system of linear equations

 2x + 3y − 2z = 1
x + 2y + z = 3
3x + 4y − 5z = −1

is contained in the matrix  


2 3 −2 1
1 2 1 3 .
3 4 −5 −1
It suffices to remember that the first column corresponds to x, the second column
to y, the third column to z, and that the last column contains the numbers on the
other side of the = sign. This matrix is called the augmented matrix of the system.
We will solve the system of linear equations from the last example by first
converting it to the augmented matrix of the system and then performing elementary
operations on rows of the matrix. In this case we call these operations elementary
row operations.

Definition 2.2.8. By elementary row operations on a matrix we mean the following three operations:

• Row interchange: Interchange two rows of the matrix.

• Row scaling: Multiply a row of the matrix by a nonzero constant.

• Row replacement: Multiply a row of the matrix by a constant and then
add the result to another row of the matrix.
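The three elementary row operations are easy to express in code when a matrix is stored as a list of rows; a sketch (the function names are ours):

```python
from fractions import Fraction

def row_interchange(M, i, j):
    """Interchange rows i and j (0-indexed)."""
    M[i], M[j] = M[j], M[i]

def row_scale(M, i, c):
    """Multiply row i by a nonzero constant c."""
    assert c != 0
    M[i] = [c * a for a in M[i]]

def row_replace(M, i, c, j):
    """Multiply row i by c and then add the result to row j."""
    M[j] = [a + c * b for a, b in zip(M[j], M[i])]

# The first two operations of the example below, on its augmented matrix:
M = [[Fraction(v) for v in row]
     for row in [[2, 3, -2, 1], [1, 2, 1, 3], [3, 4, -5, -1]]]
row_interchange(M, 0, 1)
row_replace(M, 0, -2, 1)          # row 1 times -2 added to row 2
assert M[1] == [0, -1, -4, -5]
```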
Example 2.2.9. Solve the system



 2x + 3y − 2z = 1
x + 2y + z = 3 .
3x + 4y − 5z = −1

Solution. The augmented matrix of the system is


 
2 3 −2 1
1 2 1 3 .
3 4 −5 −1

interchange rows 1 and 2


 
1 2 1 3
2 3 −2 1
3 4 −5 −1

multiply row 1 by −2 and then add to row 2


 
1 2 1 3
0 −1 −4 −5
3 4 −5 −1

multiply row 1 by −3 and then add to row 3


 
1 2 1 3
0 −1 −4 −5
0 −2 −8 −10

multiply row 2 by −1
 
1 2 1 3
0 1 4 5
0 −2 −8 −10

multiply row 2 by 2 and then add to row 3


 
1 2 1 3
0 1 4 5
0 0 0 0

multiply row 2 by −2 and then add to row 1


 
1 0 −7 −7
0 1 4 5 .
0 0 0 0
The last matrix corresponds to the system



x − 7z = −7
y + 4z = 5
0= 0

and now we obtain the solutions as in Example 2.2.7.

We note that the two operations


multiply row 1 by −2 and then add to row 2
multiply row 1 by −3 and then add to row 3
in the above example do not depend on each other in any way and the order of these
two operations does not matter. We can combine them together. More precisely,
from the matrix  
1 2 1 3
2 3 −2 1
3 4 −5 −1

by applying the following operations


multiply row 1 by −2 and then add to row 2
multiply row 1 by −3 and then add to row 3
we get the matrix
 
1 2 1 3
0 −1 −4 −5 .
0 −2 −8 −10

The same thing is true for the last two operations in the above example. That is, the
operations
multiply row 2 by 2 and then add to row 3
multiply row 2 by −2 and then add to row 1
could be applied together to the matrix
 
1 2 1 3
0 1 4 5
0 −2 −8 −10

in order to obtain the matrix  


1 0 −7 −7
0 1 4 5 .
0 0 0 0

The Gaussian elimination algorithm


In the previous section we illustrated the Gaussian elimination method with a
couple of examples. Now we will discuss it in a more formal way. First we need to
introduce some new terminology.
Definition 2.2.10. By a leading term in a row of a matrix we mean the first


nonzero entry. In other words, a leading term in a row is a nonzero entry in
that row such that all entries to the left of it are 0. If a leading term is equal
to 1, then we call it a leading 1.
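In code, the leading term of a row is simply the first nonzero entry; a minimal sketch (the function name is ours):

```python
def leading_index(row):
    """Return the index of the leading term (first nonzero entry), or None
    for a zero row."""
    for j, a in enumerate(row):
        if a != 0:
            return j
    return None

# Rows of the first matrix in the example below:
assert leading_index([1, -1, 0, 9]) == 0
assert leading_index([0, 2, 0, 2]) == 1
assert leading_index([0, 0, -1, 5]) == 2
assert leading_index([0, 0, 0, 0]) is None
```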

Example 2.2.11. Here are some examples of matrices. The leading term of each
row is its first nonzero entry; in the first matrix, for instance, the leading terms
are 1, 2, and −1, and only the first of them is a leading 1.

 1 −1  0 9        1 3 −2 0 3        3 5  0 2 3 2 0
 0  2  0 2        0 0  0 1 7        0 0 −2 7 2 4 0
 0  0 −1 5        0 0  0 0 0        0 0  0 0 0 0 3
                  0 0  0 0 0        0 0  0 0 0 0 0

Definition 2.2.12. A matrix is in a reduced row echelon form (or Gauss-Jordan


form) if it satisfies all of the following conditions:

(a) All leading terms of the matrix are equal to 1;

(b) In each column with a leading 1 all other terms are equal to 0;

(c) Each leading 1 is in a column to the right of the leading 1 in the row
above it;

(d) Rows whose entries are all zero are below rows with nonzero entries.
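Conditions (a)–(d) can be checked mechanically; a sketch (the function names are ours):

```python
def leading_index(row):
    """Index of the first nonzero entry of a row, or None for a zero row."""
    for j, a in enumerate(row):
        if a != 0:
            return j
    return None

def is_rref(M):
    """Check conditions (a)-(d) of the definition above."""
    last_lead = -1
    seen_zero_row = False
    for row in M:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:            # (d) zero rows must be at the bottom
            return False
        if row[lead] != 1:           # (a) every leading term is 1
            return False
        if lead <= last_lead:        # (c) leading 1's move to the right
            return False
        last_lead = lead
        # (b) a leading 1 is the only nonzero entry in its column
        if any(other[lead] != 0 for other in M if other is not row):
            return False
    return True

assert is_rref([[1, 0, 0, 9], [0, 1, 0, 2], [0, 0, 1, 5]])
assert not is_rref([[1, 0, 0, 9], [0, 1, 0, 2], [0, 0, 2, 5]])   # (a) fails
```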

Example 2.2.13. Here are examples of matrices in a reduced row echelon form.

 1 0      1 0 0 9      1 3 −2 0 3      1 0 4 0 3 1 0 2
 0 1      0 1 0 2      0 0  0 1 7      0 1 7 0 2 0 0 2
          0 0 1 5      0 0  0 0 0      0 0 0 1 8 3 0 7
                       0 0  0 0 0      0 0 0 0 0 0 1 5

Example 2.2.14. Here are examples of matrices that are not in a reduced row echelon form.

1. The matrix
1 0 0 9
 
0 1 0 2
0 0 2 5
is not in a reduced row echelon form because condition (a) is not satisfied.
The leading entry in the third row is not 1.

2. The matrix
1 0 0
 
0 1 −1 
0 0 1
is not in a reduced row echelon form because condition (b) is not satisfied.
The third column has a leading 1 and another nonzero term.

3. The matrix  
1 3 0 0 3
0 0 0 1 7
 
 
0 0 1 0 0
0 0 0 0 0
is not in a reduced row echelon form because condition (c) is not satisfied.
The leading 1 in the third row is not in a column to the right of the leading 1
in the row above it.

4. The matrix
1 5 0 2 3 2 0 5
 
0 0 1 7 2 4 0 3
 
0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0
is not in a reduced row echelon form because condition (d) is not satisfied.
All terms in the third row are 0’s, but the fourth row has a nonzero term.

Definition 2.2.15. Every entry of a matrix in a reduced row echelon form
where a leading 1 is located is called a pivot position and every column
that contains a pivot position is called a pivot column.

Example 2.2.16. The pivot positions in the matrix

 1 5 0 2 3 2 0 5
 0 0 1 7 2 4 0 3
 0 0 0 0 0 0 1 2
 0 0 0 0 0 0 0 0

are the positions of the leading 1's. Columns 1, 3, and 7 are the pivot columns of this matrix.
Theorem 2.2.17. Every matrix can be reduced, using row operations, to a
matrix in a reduced row echelon form.
The reduced row echelon form does not depend on the row operations we
choose to get this form.

We do not prove this theorem in the book.


Because the reduced row echelon form is unique we can extend Definition 2.2.15:

Definition 2.2.18. Every entry in a matrix where a leading 1 is located in the


reduced row echelon form of the matrix is called a pivot position and every
column that contains a pivot position is called a pivot column.

The general algorithm for obtaining the reduced row echelon form of any matrix
is based on three basic ideas used in this process:

• If there is a nonzero term in a column, we can always move it to a desired


position in that column by applying an appropriate row interchange.

• Any nonzero term can be changed to 1 by an appropriate scaling.

• If a column has a term equal to 1, then any other nonzero term in that column
can be changed to 0 by an appropriate row replacement.

Now we describe the general Gaussian elimination process. Pivot columns and
pivot elements play an important role in this algorithm.

Step 1 Identify the first (from the left) nonzero column. (This is a pivot column.)

Step 2 If necessary, move a row with a nonzero entry in the pivot column to
the top using an appropriate row interchange. (After this operation the
entry at the top of the pivot column is in the pivot position.)

Step 3 If necessary, change the number in the pivot position to 1 using an


appropriate scaling. (The 1 in the pivot position is a leading 1.)

Step 4 Replace, if necessary, every term below the leading 1 by 0 using appropriate row replacements.

We are not done yet, but we take a break to look at an example.


Example 2.2.19. We consider the matrix

0 0 1 −1 0 1 2
 
0 0 3 0 3 −2 0
 .
0 2 0 1 3 −1 3
0 −1 2 0 1 0 5

Step 1 The second column is the first one that has nonzero terms. This is the pivot
column.
Step 2 We interchange rows 1 and 4 to get the nonzero term −1 at the top of the
pivot column:
0 −1 2 0 1 0 5
 
0 0 3 0 3 −2 0
0 2 0 1 3 −1 3 .
 

0 0 1 −1 0 1 2
Step 3 We multiply the first row by −1 to get 1 in the pivot position:

0 1 −2 0 −1 0 −5
 
0 0 3 0 3 −2 0
 .
0 2 0 1 3 −1 3
0 0 1 −1 0 1 2

Step 4 We add the first row multiplied by −2 to the third row to replace the 2 in the
pivot column by 0:
0 1 −2 0 −1 0 −5
 
0 0 3 0 3 −2 0
0 0 4 1 5 −1 13 .
 

0 0 1 −1 0 1 2

Now we continue with the general algorithm.

Step 5 Temporarily ignore the top row of the matrix.

Step 6 Apply steps 1–4 to the smaller matrix that remains. Continue then with
step 5 followed by steps 1–4 until there are no nonzero rows left.

We now return to our example.


Steps 5 and 6 We ignore (cover) the top row of the matrix after step 4, that is

0 1 −2 0 −1 0 −5
 
0 0 3 0 3 −2 0
 
0 0 4 1 5 −1 13
0 0 1 −1 0 1 2

and proceed with the algorithm on the submatrix under the top row:
interchange rows 2 and 4
(that is, interchange rows 1 and 3 in the submatrix under the top row)

0 1 −2 0 −1 0 −5
 
0 0 1 −1 0 1 2
 
0 0 4 1 5 −1 13
0 0 3 0 3 −2 0

multiply row 2 by −4 and then add to row 3


(that is, multiply row 1 by −4 and then add to row 2 in the submatrix)
multiply row 2 by −3 and then add to row 4
(that is, multiply row 1 by −3 and then add to row 3 in the submatrix under the top
row)
0 1 −2 0 −1 0 −5
 
0 0 1 −1 0 1 2
 
0 0 0 5 5 −5 5
0 0 0 3 3 −5 −6
Now we ignore the top two rows from the obtained matrix, that is,

0 1 −2 0 −1 0 −5
 
0 0 1 −1 0 1 2
 
0 0 0 5 5 −5 5
0 0 0 3 3 −5 −6

and proceed with the algorithm on the matrix under these rows:
multiply row 3 by 1/5
(that is, multiply row 1 by 1/5 in the new submatrix)

0 1 −2 0 −1 0 −5
 
0 0 1 −1 0 1 2
 
0 0 0 1 1 −1 1
0 0 0 3 3 −5 −6

multiply row 3 by −3 and then add to row 4


(that is, multiply row 1 by −3 and then add to row 2 in the new submatrix)

0 1 −2 0 −1 0 −5
 
0 0 1 −1 0 1 2
 
0 0 0 1 1 −1 1
0 0 0 0 0 −2 −9

Now we ignore the top three rows from the obtained matrix, that is,

0 1 −2 0 −1 0 −5
 
0 0 1 −1 0 1 2
 
0 0 0 1 1 −1 1
0 0 0 0 0 −2 −9

and proceed with the algorithm on the matrix under these rows which is the row 4
in the original matrix:
multiply row 4 by −1/2
(that is, multiply the only row of the new submatrix by −1/2)

0 1 −2 0 −1 0 −5
0 0 1 −1 0 1 2
0 0 0 1 1 −1 1
0 0 0 0 0 1 9/2

The obtained matrix has four pivot positions:


 
0 1 −2 0 −1 0 −5
0 0 1 −1 0 1 2
0 0 0 1 1 −1 1
0 0 0 0 0 1 9/2

Columns 2, 3, 4, and 6 are the pivot columns.

The matrix obtained in the example above is not yet in the reduced row echelon
form. Condition (b) is not satisfied: in addition to the leading 1’s there are other
nonzero terms in columns 3, 4, and 6. One more step is necessary.

Step 7 Use appropriate row replacements to replace with 0 all nonzero terms,
other than the leading 1’s, in all pivot columns starting from the last
pivot column to the right and then continuing with next pivot to the
left and so on.

Now we are finally ready to finish our example and obtain the reduced row ech-
elon form of the matrix.
Step 7 First we take care of column 6, the last pivot column on the right:

add row 4 to row 3

multiply row 4 by −1 and then add to row 2

0 1 −2 0 −1 0 −5
0 0 1 −1 0 0 −5/2
0 0 0 1 1 0 11/2
0 0 0 0 0 1 9/2

Now the only nonzero term in column 6 is the pivot term. Next we take care of
column 4. We skip column 5 because it is not a pivot column.

add row 3 to row 2


 
0 1 −2 0 −1 0 −5
0 0 1 0 1 0 3
0 0 0 1 1 0 11/2
0 0 0 0 0 1 9/2
Now the only nonzero term in column 4 is the pivot term. The final step is to replace the −2 in column 3 by 0.

multiply row 2 by 2 and then add to row 1


 
0 1 0 0 1 0 1
0 0 1 0 1 0 3
0 0 0 1 1 0 11/2
0 0 0 0 0 1 9/2

Now the matrix is in the reduced row echelon form.

Here are all the steps of the Gaussian elimination process put together.
Step 1 Identify the first (from the left) nonzero column. (This is the pivot col-
umn.)

Step 2 If necessary, move a row with a nonzero term in the pivot column to
the top using an appropriate row interchange. (The position at the top
of the pivot column is now in the pivot position.)

Step 3 If necessary, change the number in the pivot position to 1 using an


appropriate scaling. (The 1 in the pivot position is a leading 1.)

Step 4 Replace, if necessary, every term below the leading 1 by 0 using appro-
priate row replacements.

Step 5 Temporarily cover (ignore) the top row of the matrix.

Step 6 Apply steps 1–4 to the smaller matrix that remains. Continue then with
step 5 followed by steps 1–4 until there are no nonzero rows left.

Step 7 Use appropriate row replacements to replace with 0 all nonzero terms,
other than the leading 1’s, in all pivot columns starting from the last
pivot column on the right and working to the left.
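Steps 1–7 translate almost line by line into a program; a sketch using exact rational arithmetic (the function name is ours):

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row echelon form of M, following steps 1-7."""
    M = [[Fraction(a) for a in row] for row in M]
    rows, cols = len(M), len(M[0])
    pivots = []
    top = 0                                   # steps 5-6: rows above are ignored
    for col in range(cols):                   # step 1: scan columns left to right
        pivot = next((r for r in range(top, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue                          # no nonzero entry: not a pivot column
        M[top], M[pivot] = M[pivot], M[top]   # step 2: row interchange
        M[top] = [a / M[top][col] for a in M[top]]       # step 3: scaling
        for r in range(top + 1, rows):        # step 4: zeros below the leading 1
            M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[top])]
        pivots.append((top, col))
        top += 1
    for r, col in reversed(pivots):           # step 7: zeros above each leading 1
        for i in range(r):
            M[i] = [a - M[i][col] * b for a, b in zip(M[i], M[r])]
    return M

# Example 2.2.9: the augmented matrix reduces exactly as in the text.
R = rref([[2, 3, -2, 1], [1, 2, 1, 3], [3, 4, -5, -1]])
assert R == [[1, 0, -7, -7], [0, 1, 4, 5], [0, 0, 0, 0]]
```

Using `Fraction` keeps every intermediate matrix exact, so the result can be compared entry by entry with the hand computation.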

While it is important to understand the above algorithm, it would not make


much sense to try to memorize it. It is important to understand what the reduced
row echelon form is and what row operations are allowed. Following the above al-
gorithm exactly may not be the best strategy. For example, in the matrix
 
1 −1 0 1 2
 2 −3 0 5 1
−2 3 3 −2 1

instead of multiplying row 1 by −2 and adding it to row 2 and then multiplying row
1 by 2 and adding it to row 3, it makes more sense to start by adding row 2 to row 3
 
1 −1 0 1 2
2 −3 0 5 1
0 0 3 3 2

and then multiply row 1 by −2 and add to row 2


 
1 −1 0 1 2
0 −1 0 3 −3 .
0 0 3 3 2

The algorithm described above is designed in such a way that, if we already have the
desired values in a pivot column, then the following steps will not mess them up. For
example, in the above matrix the values in the first column are exactly the values we
want:
 
1 −1 0 1 2
0 −1 0 3 −3 .
 

0 0 3 3 2

When continuing the Gaussian elimination process we have to make sure that
the first column remains unchanged.
Using the idea presented above we show how to get the reduced row echelon
form of the matrix in Example 2.2.19 in a different way. From the original matrix

0 0 1 −1 0 1 2
 
0 0 3 0 3 −2 0
 
0 2 0 1 3 −1 3
0 −1 2 0 1 0 5

we obtain the matrix


0 1 −2 0 −1 0 −5
 
0 0 1 −1 0 1 2
 
0 0 4 1 5 −1 13
0 0 3 0 3 −2 0

proceeding as presented in Example 2.2.19. Then we replace by 0 the entries in the


third column in rows 1, 3, and 4, that is, above and below the leading 1 in the second
row. Note that this way we do not change columns 1 and 2.

multiply row 2 by 2 and then add to row 1


multiply row 2 by −4 and then add to row 3
multiply row 2 by −3 and then add to row 4

0 1 0 −2 −1 2 −1
 
0 0 1 −1 0 1 2
 
0 0 0 5 5 −5 5
0 0 0 3 3 −5 −6

multiply row 3 by 1/5

0 1 0 −2 −1 2 −1
 
0 0 1 −1 0 1 2
 
0 0 0 1 1 −1 1
0 0 0 3 3 −5 −6

Next we replace by 0 the entries in the fourth column in rows 1, 2, and 4, that is, above
and below the leading 1 in the third row. Note that this way we do not change
columns 1, 2, and 3 because the entries to the left of the leading 1 in the third row
are 0.
multiply row 3 by 2 and then add to row 1


multiply row 3 by 1 and then add to row 2
multiply row 3 by −3 and then add to row 4

0 1 0 0 1 0 1
 
0 0 1 0 1 0 3
 
0 0 0 1 1 −1 1
0 0 0 0 0 −2 −9

multiply row 4 by −1/2

0 1 0 0 1 0 1
0 0 1 0 1 0 3
0 0 0 1 1 −1 1
0 0 0 0 0 1 9/2

multiply row 4 by 1 and then add to row 3

 
0 1 0 0 1 0 1
0 0 1 0 1 0 3
0 0 0 1 1 0 11/2
0 0 0 0 0 1 9/2

The following observation is useful when obtaining 0 above and below a leading
1.

If there is a leading 1 in row p and column q, then the row replacement

multiply row p by α and then add to row r, where r ≠ p,

does not change columns 1, 2, . . . , q − 1 because all the entries to the left
of the leading 1 in the p-th row are 0.

The above observations lead to the following modified algorithm for the Gaussian
elimination process. Note that steps 1, 2, and 3 have not changed.
Step 1 Identify the first (from the left) nonzero column. (This is the pivot col-
umn.)

Step 2 If necessary, move a row with a nonzero entry in the pivot column to
the top using an appropriate row interchange. (The entry at the top of
the pivot column is now in the pivot position.)

Step 3 If necessary, change the entry in the pivot position to 1 using an ap-
propriate scaling. (The 1 in the pivot position is a leading 1.)

Step 4 Replace, if necessary, every entry above and below the leading 1 by 0
using appropriate row replacements.

Step 5 Temporarily cover (ignore) the top row of the matrix.

Step 6 Apply steps 1–3 to the smaller matrix that remains and step 4 to the
whole matrix. Continue then with step 5 followed by steps 1–4 until
there are no nonzero rows left.
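In code, the only change the modified algorithm requires is that the elimination loop of step 4 runs over every other row, above as well as below the new leading 1; a sketch (the function name is ours):

```python
from fractions import Fraction

def rref_modified(M):
    """Reduced row echelon form, clearing above AND below each leading 1
    as soon as it is created (the modified step 4)."""
    M = [[Fraction(a) for a in row] for row in M]
    rows, cols = len(M), len(M[0])
    top = 0
    for col in range(cols):
        pivot = next((r for r in range(top, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[top], M[pivot] = M[pivot], M[top]               # interchange
        M[top] = [a / M[top][col] for a in M[top]]        # scaling
        for r in range(rows):                 # every other row, above and below
            if r != top:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[top])]
        top += 1
    return M

# Example 2.2.20: the same reduced row echelon form as in the text.
R = rref_modified([[2, 4, 7, 3, 2, 1, 1],
                   [1, 2, 1, 2, 5, 0, 1],
                   [4, 8, 9, 7, 12, 1, 1]])
assert R == [[1, 2, 0, Fraction(11, 5), Fraction(33, 5), Fraction(-1, 5), 0],
             [0, 0, 1, Fraction(-1, 5), Fraction(-8, 5), Fraction(1, 5), 0],
             [0, 0, 0, 0, 0, 0, 1]]
```

Since the reduced row echelon form is unique (Theorem 2.2.17), both versions of the algorithm must return the same matrix.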

We consider another example to illustrate the differences between the original


algorithm and the modified algorithm.

Example 2.2.20. Find the reduced row echelon form of the matrix
 
2 4 7 3 2 1 1
1 2 1 2 5 0 1
4 8 9 7 12 1 1

Solution. First we follow the original algorithm:


interchange rows 1 and 2
 
1 2 1 2 5 0 1
2 4 7 3 2 1 1
4 8 9 7 12 1 1
multiply first row by −2 and then add to row 2
multiply first row by −4 and then add to row 3
 
1 2 1 2 5 0 1
0 0 5 −1 −8 1 −1
0 0 5 −1 −8 1 −3
multiply row 2 by 1/5

 1 2 1 2 5 0 1
 0 0 1 −1/5 −8/5 1/5 −1/5
 0 0 5 −1 −8 1 −3

multiply row 2 by −5 and then add to row 3


 
 1 2 1 2 5 0 1
 0 0 1 −1/5 −8/5 1/5 −1/5
 0 0 0 0 0 0 −2

multiply row 3 by −1/2
 
 1 2 1 2 5 0 1
 0 0 1 −1/5 −8/5 1/5 −1/5
 0 0 0 0 0 0 1

multiply row 3 by 1/5 and then add to row 2


multiply row 3 by −1 and then add to row 1
 
 1 2 1 2 5 0 0
 0 0 1 −1/5 −8/5 1/5 0
 0 0 0 0 0 0 1

multiply row 2 by −1 and then add to row 1


 1 2 0 11/5 33/5 −1/5 0
 0 0 1 −1/5 −8/5 1/5 0
 0 0 0 0 0 0 1

This is the reduced row echelon form of our matrix.


Now we apply the modified algorithm. After we obtain, using the original algo-
rithm, the matrix  
1 2 1 2 5 0 1
0 0 1 − −1 8 1 1
5 − 
5 5 5
0 0 5 −1 −8 1 −3
we replace by 0 the entries in the third column in rows 1 and 3, that is, above and
below the leading 1 in the second row.
multiply row 2 by −1 and then add to row 1
multiply row 2 by −5 and then add to row 3

 1 2 0 11/5 33/5 −1/5 6/5
 0 0 1 −1/5 −8/5 1/5 −1/5
 0 0 0 0 0 0 −2
multiply row 3 by −1/2

 1 2 0 11/5 33/5 −1/5 6/5
 0 0 1 −1/5 −8/5 1/5 −1/5
 0 0 0 0 0 0 1

multiply row 3 by −6/5 and then add to row 1
multiply row 3 by 1/5 and then add to row 2

 1 2 0 11/5 33/5 −1/5 0
 0 0 1 −1/5 −8/5 1/5 0
 0 0 0 0 0 0 1

As expected, this is the reduced row echelon form obtained using the original algorithm.

Solving systems of linear equations using Gaussian elimination


At the beginning of this section we presented some examples of systems of linear
equations that were solved by Gaussian elimination. In the first couple of examples
we worked directly with the equations. In the last example in that section we first
represented the system by a matrix and then worked with the matrices until we were
ready to give the final answer. Those examples were used to motivate the Gaussian
elimination algorithm presented in the previous section. In this section we present
more examples illustrating the use of Gaussian elimination to solve systems of linear
equations.

Example 2.2.21. Solve the system


 2x + 7y = 1
 5x + 3y = 2 .

Solution. We represent the system by its augmented matrix


2 7 1
5 3 2

and then apply the Gaussian elimination algorithm to obtain the reduced row
echelon form of the matrix:

multiply row 1 by 1/2

 1 7/2 1/2
 5 3 2
multiply row 1 by −5 and then add to row 2


 1 7/2 1/2
 0 −29/2 −1/2

multiply row 2 by −2/29

 1 7/2 1/2
 0 1 1/29

multiply row 2 by −7/2 and then add to row 1

 1 0 11/29
 0 1 1/29

The solution of the system is

x = 11/29 and y = 1/29.

Example 2.2.22. Solve the system



 2x + y + 3z + 2w = 1
3x + 2y + z + 2w = 3 .
x + 5z + 2w = −1

Solution. The augmented matrix of the system is


 
2 1 3 2 1
3 2 1 2 3 .
1 0 5 2 −1

Now we apply the Gaussian elimination algorithm to obtain the reduced row
echelon form of the matrix:
interchange rows 1 and 3
 
1 0 5 2 −1
3 2 1 2 3 
2 1 3 2 1
multiply row 1 by −3 and then add to row 2
multiply row 1 by −2 and then add to row 3


 
1 0 5 2 −1
0 2 −14 −2 6
0 1 −7 −2 3

interchange rows 2 and 3


 
1 0 5 2 −1
0 1 −7 −2 3
0 2 −14 −4 6

multiply row 2 by −2 and then add to row 3


 
1 0 5 2 −1
0 1 −7 −2 3
0 0 0 0 0

The original system is reduced to


 x + 5z + 2w = −1
 y − 7z − 2w = 3 .

The solution of the system is

x = −5z − 2w − 1 and y = 7z + 2w + 3.

Definition 2.2.23. The variables in a system of equations which correspond


to the pivot columns of the augmented matrix are called pivot variables or
basic variables. The other variables are called free variables.

In the above example we have two pivot columns, namely columns 1 and 2. The
variables corresponding to columns 1 and 2 are x and y. These are the basic vari-
ables of our system. The variables z and w corresponding to columns 3 and 4, which
are not pivot columns, are the free variables of the system. Note that in the solution
of the system the basic variables are expressed in terms of the free variables.
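Reading off the basic and free variables from the reduced row echelon form can be sketched in a few lines (the function and variable names are ours):

```python
def pivot_columns(R):
    """Columns of a reduced row echelon matrix that contain a leading 1."""
    cols = []
    for row in R:
        for j, a in enumerate(row):
            if a != 0:
                cols.append(j)
                break
    return cols

# Reduced augmented matrix of the example above; unknowns are x, y, z, w.
R = [[1, 0, 5, 2, -1],
     [0, 1, -7, -2, 3],
     [0, 0, 0, 0, 0]]
unknowns = ["x", "y", "z", "w"]
pivots = [j for j in pivot_columns(R) if j < len(unknowns)]
basic = [unknowns[j] for j in pivots]
free = [v for j, v in enumerate(unknowns) if j not in pivots]
assert basic == ["x", "y"] and free == ["z", "w"]
```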

Example 2.2.24. Solve the system



 2x + 2y + z = 0
3x + y + z = 1 .
x + 3y + z = 2

Solution. The augmented matrix of the system is


 
2 2 1 0
 3 1 1 1 .
1 3 1 2

Now we apply the Gaussian elimination algorithm to obtain the reduced row
echelon form of the matrix.
interchange rows 1 and 3
 
1 3 1 2
 3 1 1 1 
2 2 1 0
multiply row 1 by −3 and then add to row 2
multiply row 1 by −2 and then add to row 3
 
1 3 1 2
 0 −8 −2 −5 
0 −4 −1 −4

multiply row 2 by −1/8

 1 3 1 2
 0 1 1/4 5/8
 0 −4 −1 −4

multiply row 2 by 4 and then add to row 3

 1 3 1 2
 0 1 1/4 5/8
 0 0 0 −3/2

We have arrived at something that is not possible, because from the last row we get
0 = −3/2. The system has no solution.
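This kind of contradiction is easy to detect mechanically: an augmented matrix represents an inconsistent system exactly when some row is all zeros in the coefficient part but has a nonzero entry in the last column. A sketch (the function name is ours):

```python
from fractions import Fraction

def is_inconsistent(R):
    """True when some row of the augmented matrix R reads 0 = c with c != 0."""
    return any(all(a == 0 for a in row[:-1]) and row[-1] != 0 for row in R)

# Row echelon form reached in the example above:
R = [[1, 3, 1, 2],
     [0, 1, Fraction(1, 4), Fraction(5, 8)],
     [0, 0, 0, Fraction(-3, 2)]]
assert is_inconsistent(R)          # the last row says 0 = -3/2
```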

Echelon form of a matrix and back substitution


When solving a system of linear equations using Gaussian elimination it is not
necessary to reduce the augmented matrix representing a system all the way to the
Gauss-Jordan form to solve the system. The solution can be easily obtained “half
way” to the Gauss-Jordan form. We proceed with Gaussian elimination until we get
the so-called row echelon form of the matrix and then we use a method called back
substitution to solve the system. In this section we take a closer look at this method.
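Back substitution itself can be sketched in a few lines; this version assumes the system has a unique solution, that is, every unknown gets a leading term (the function name is ours):

```python
from fractions import Fraction

def back_substitute(R):
    """Solve a system whose augmented matrix R is in row echelon form and
    has a unique solution (one leading term per unknown)."""
    n = len(R[0]) - 1
    x = [Fraction(0)] * n
    for row in reversed(R):
        lead = next(j for j, a in enumerate(row[:-1]) if a != 0)
        # substitute the already-known unknowns, then solve for the lead one
        known = sum(row[j] * x[j] for j in range(lead + 1, n))
        x[lead] = (row[-1] - known) / row[lead]
    return x

# Example 2.2.25: echelon form of the augmented matrix of
#   2x + 7y = 1,  5x + 3y = 2.
x, y = back_substitute([[2, 7, 1], [0, Fraction(-29, 2), Fraction(-1, 2)]])
assert (x, y) == (Fraction(11, 29), Fraction(1, 29))
```

Note that no leading term here needs to be 1, which is exactly why stopping at the row echelon form saves scaling steps.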
Example 2.2.25. Solve the system


½
2x + 7y = 1
5x + 3y = 2

Solution. We start by transforming the augmented matrix of the system of
equations, that is, the matrix

 2 7 1
 5 3 2
5 3 2

multiply row 1 by −5/2 and then add to row 2

 2 7 1
 0 −29/2 −1/2

We have not found the reduced row echelon form of the augmented matrix, but
we can easily get the solution of the system from the above matrix. Indeed, when
converted back to a system of equations we get
 2x + 7y = 1
 −(29/2)y = −1/2 .

The second equation gives us y = 1/29. We substitute the obtained value of y back into
the first equation and obtain an equation for x:

2x + 7 · (1/29) = 1.

Solving for x we get x = 11/29.

Definition 2.2.26. A matrix is in a row echelon form if it satisfies all of the


following conditions:

(a) In any two rows with leading terms the leading term of the row above
is to the left of the leading term of the row below;

(b) In a column with a leading term all entries below the leading term are
0;

(c) Rows whose entries are all zero are below rows with nonzero entries.

It is clear that every matrix in reduced row echelon form is also in row echelon form.

Example 2.2.27. Here are some examples of matrices in a row echelon form, but

96 Chapter 2: Matrices

not in a reduced row echelon form:

 2 3 5     −2 3      3 4 0 3 1     4 2 3 2 0
 0 1 4 ,    0 1 ,    0 0 0 3 7 ,   0 2 7 1 2
 0 0 7      0 0      0 0 0 0 8     0 0 3 1 7
            0 0      0 0 0 0 0     0 0 0 3 5

Example 2.2.28. Here are some examples of matrices that are not in a row echelon
form:

 0 1     3 2 0     2 8 2 9     4 5
 1 3 ,   0 1 4 ,   0 0 5 2 ,   0 0 .
         0 1 7     0 1 4 5     0 1

As first indicated in Example 2.2.25, we can solve a system of linear equations


by first reducing the augmented matrix of the system to a row echelon form and
then solving the reduced system of equations by back substitution. We illustrate
this method with some more examples.

Example 2.2.29. Solve the system



 2x + y + z = 2
x + 3y + z = 1 .
−x + y + z = 3

Solution. We represent the system by its augmented matrix


 
2 1 1 2
 1 3 1 1 
−1 1 1 3

and then obtain a row echelon form of the matrix.


interchange rows 1 and 2
 
1 3 1 1
 2 1 1 2 
−1 1 1 3
multiply row 1 by −2 and then add to row 2
add row 1 to row 3
 
1 3 1 1
 0 −5 −1 0 
0 4 2 4

multiply row 2 by −1/5
 
1 3 1    1
0 1 1/5  0
0 4 2    4

multiply row 2 by −4 and then add to row 3


 
1 3 1    1
0 1 1/5  0
0 0 6/5  4

multiply row 3 by 5/6
 
1 3 1    1
0 1 1/5  0
0 0 1    10/3

When converted back to a system of equations we get



x + 3y + z = 1
    y + (1/5)z = 0 .
            z = 10/3

Consequently,

z = 10/3,

y = −(1/5)z = −(1/5) · (10/3) = −2/3,

and

x = 1 − 3y − z = 1 − 3 · (−2/3) − 10/3 = −1/3.
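The back substitution carried out above can be sketched in code. The function below is an illustrative sketch (the name `back_substitution` and the use of NumPy are not from the text); it assumes the coefficient matrix is upper triangular with nonzero diagonal entries, as in the row echelon form obtained in this example.

```python
import numpy as np

def back_substitution(U, b):
    """Solve Ux = b for an upper triangular U with nonzero diagonal,
    working from the last equation up, as in Example 2.2.29."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # subtract the already-known terms, then divide by the pivot
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# The triangular system obtained above:
# x + 3y + z = 1,  y + (1/5)z = 0,  z = 10/3
U = np.array([[1.0, 3.0, 1.0],
              [0.0, 1.0, 0.2],
              [0.0, 0.0, 1.0]])
b = np.array([1.0, 0.0, 10.0 / 3.0])
x = back_substitution(U, b)
# expected: x = -1/3, y = -2/3, z = 10/3
```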

Example 2.2.30. Solve the system



 x + y + z = −1
2x + 4y + 3z = 0
3x + y + 5z = 1

Solution. First we obtain a row echelon form of the augmented matrix of the system
and then convert it back to a system of equations:



x + y + z = −1
    y + (1/2)z = 1
            z = 2

So z = 2, y = 0, and x = −3.

Definition 2.2.31. Two matrices A and B of the same size are called equiva-
lent if A can be transformed into B by elementary row operations. If A and B
are equivalent, we write
A ∼ B.

Example 2.2.32. The work done in Example 2.2.24 can be summarized as follows:
     
2 2 1 0 1 3 1 2 1 3 1 2
 3 1 1 1 ∼ 3 1 1 1  ∼  0 −8 −2 −5 
1 3 1 2 2 2 1 0 0 −4 −1 −4
   1 3 1 2

1 3 1 2
∼ 0 1 1
4 8 ∼  0 1 14
5   5 
8 
0 −4 −1 −4 0 0 0 − 32

In future examples we will give the matrices obtained by elementary row oper-
ations without explicitly describing those operations. With sufficient experience it
should be clear what operations were used.

Example 2.2.33. Solve the system



 x + 2y + z = 1
2x + 2y + 3z = 4 .
3x + 2y + 5z = 7

Solution. Since    
1 2 1 1 1 2 1 1
2 2 3 4 ∼ 0 −2 1 2 ,
3 2 5 7 0 0 0 0
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 99

2.2. GAUSSIAN ELIMINATION 99

the system can be reduced to


x + 2y + z = 1
− 2y + z = 2

From the second equation we get y = (1/2)z − 1 and then from the first equation we
get x = −2z + 3, where z is a free variable.
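A parametric solution like this one can be verified numerically. The NumPy sketch below (illustrative, not part of the text) substitutes a few values of the free variable z into the original system of Example 2.2.33.

```python
import numpy as np

A = np.array([[1, 2, 1],
              [2, 2, 3],
              [3, 2, 5]], dtype=float)
rhs = np.array([1, 4, 7], dtype=float)

# Parametric solution: x = -2z + 3, y = z/2 - 1, with z free.
# Every choice of z should satisfy all three equations.
for z in (0.0, 1.0, -2.5):
    v = np.array([-2 * z + 3, z / 2 - 1, z])
    assert np.allclose(A @ v, rhs)
```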

2.2.1 Exercises

Find the system whose augmented matrix is the given matrix.

1.  2 1 3
    3 2 4

2.  2 3 −7 −7 0
    5 1 4 5 2
    3 2 4 4 1

Find the reduced row echelon form of the following matrices.

3.  2 7
    8 28

4.  2 1
    2 1

5.  1 2 3
    4 0 1

6.  1 −1 3
    −1 1 −3

7.  3 1 1
    4 2 1
    5 3 1

8.  3 0 1
    3 2 1
    1 1 3

9.  2 1 3
    2 3 1
    2 5 −1
    1 1 1
    0 1 2

10. 3 −2 0
    3 −1 3
    1 0 5

11. 2 1 3 1
    1 2 0 1
    1 1 2 1

12. 2 2 1 1
    1 3 1 1
    3 1 0 1

13. 2 −3 0 5 1
    −2 3 3 −2 1
    1 −1 0 1 2

14. 2 1 1 1 0 0
    1 1 1 0 1 0
    2 5 −1 0 0 1

15. 1 2 4 p
    1 1 1 q
    5 7 11 2p + 3q

16. 2 1 1 p
    3 1 2 q
    1 1 2 p − 5q

17. 2 1 0 3 p
    3 1 1 5 q
    5 3 −1 8 4p + q

18. 1 1 1 2 p
    2 4 1 3 q
    3 5 2 5 p + q

19. 2 1 3 p
    1 2 1 q
    1 1 2 r

20. 0 1 1 p
    1 0 1 q
    1 1 0 r
21. List all reduced row echelon forms of 4 × 2 matrices with 1 pivot.

22. List all reduced row echelon forms of 3 × 3 matrices with 1 pivot.

23. List all reduced row echelon forms of 3 × 3 matrices with 2 pivots.

24. List all reduced row echelon forms of 2 × 4 matrices with 2 pivots.

25. List all reduced row echelon forms of 3 × 4 matrices with 3 pivots.

26. List all reduced row echelon forms of 4 × 4 matrices with 3 pivots.

Solve the given system of linear equations using the Gaussian elimination method.

27. 3x + 2y = 1
    4x + 3y = 2

28. 2x + 3y = 0
    x + 2y = 1

29. 2x + y − z = 1
    3x + 2y + z = 4

30. x + y + 2z = 2
    2x + 3y + 5z = 7

31. 2x + y + z = 4
    x + 2y + z = 3
    x + y + 2z = 0

32. 3x + y + 2z = 7
    3x − y + z = 5
    x + 9y + 5z = 11

33. x + 3y + 2z = 1
    x + y + z = 2
    2x + 4y + 3z = 2

34. 2x + y = 0
    3x + 2y = 1
    2x + 3y = 1

35. 3x + y + 2z = p
    x + 3y + 2z = q
    x + y + z = r

36. 2x + y + z = p
    x + y + 2z = q
    x + 2y + z = r

37. x + y + 3z = p
    x − y + z = q
    3x − y + 5z = r

38. 4x + 3y = p
    3x + 4y = q
    x + y = r

39. 2x + y + z = p
    x + 2y + z = q
    3x + 3y + 2z = r
    5x + 4y + 3z = s

40. 3x + 2y + z = p
    x + 2y + 2z = q
    x + y + 2z = r
    x + y + z = s

41. 3x + 5y + 3z + 4w = p
    x + 2y + 2z + w = q
    x + y − z + 2w = r

42. 2x + y + 3z + w = p
    x + 2y + w = q
    x + y + 2z + w = r

2.3 The inverse of a matrix


The inverse of a matrix plays a special role in linear algebra. There are two basic
questions in connection with invertible matrices: how can we check if a matrix is in-
vertible and how do we find the inverse of an invertible matrix. While up to now we
only considered invertibility of 2 × 2 matrices, in this section we extend our consid-
erations to matrices of arbitrary sizes. We also apply what we learned in the previous
section to the problem of finding inverse matrices.

Elementary matrices
Elementary 2 × 2 matrices were introduced in Section 1.2. We noticed there that
elementary matrices can be used to find the inverse of a matrix. In this section we
extend those ideas to matrices of a larger size.

Definition 2.3.1. By an elementary matrix we mean a matrix obtained from


an identity matrix by one elementary row operation.

Example 2.3.2. The matrix


 
1 0 0
0 0 1 
0 1 0
 
1 0 0
is an elementary matrix since it can be obtained from the unit matrix 0 1 0 by
0 0 1
interchanging rows 1 and 2.
The matrix
1 0 0 0
 
0 1 0 0 
 
0 0 k 0 
0 0 0 1
is an elementary matrix since it can be obtained from the unit matrix

1 0 0 0
 
0 1 0 0
 
0 0 1 0
0 0 0 1

by multiplication of row 3 by k.
The matrix
1 0 0 0 0
 
0
 1 0 k 0

0 0 1 0 0
 
0 0 0 1 0
0 0 0 0 1
is an elementary matrix since it can be obtained from the unit matrix

1 0 0 0 0
 
0 1 0 0 0
 
0 0 1 0 0
 
0 0 0 1 0
0 0 0 0 1

by multiplying row 4 by k and adding it to row 2.

The following observation is very useful in calculations involving elementary


matrices.

The result of applying an elementary row operation to a matrix is the same as
multiplying the matrix on the left by the elementary matrix obtained from the
identity matrix by the same elementary row operation.

Example 2.3.3. Multiplying a 3 × 4 matrix by the elementary matrix


 
1 0 0
0 0 1 
0 1 0

is equivalent to the elementary row operation of interchanging rows 2 and 3:


    
1 0 0 a1 b1 c1 d1 a1 b1 c1 d1
0 0 1 a 2 b 2 c 2 d 2  = a 3 b 3 c 3 d 3  .
0 1 0 a3 b3 c3 d3 a2 b2 c2 d2

Multiplying a 4 × 2 matrix by the matrix

1 0 0 0
 
0 1 0 0
 
0 0 k 0
0 0 0 1

is equivalent to the elementary row operation of multiplication of row 3 by k:

 1 0 0 0   a1 b1     a1  b1
 0 1 0 0   a2 b2  =  a2  b2
 0 0 k 0   a3 b3     ka3 kb3  .
 0 0 0 1   a4 b4     a4  b4

Multiplying a 5 × 3 matrix by the matrix

1 0 0 0 0
 
0
 1 0 k 0

0 0 1 0 0
 
0 0 0 1 0
0 0 0 0 1

is equivalent to the elementary row operation of multiplying row 4 by k and adding


it to row 2:

 1 0 0 0 0   a1 b1 c1     a1        b1        c1
 0 1 0 k 0   a2 b2 c2     a2 + ka4  b2 + kb4  c2 + kc4
 0 0 1 0 0   a3 b3 c3  =  a3        b3        c3        .
 0 0 0 1 0   a4 b4 c4     a4        b4        c4
 0 0 0 0 1   a5 b5 c5     a5        b5        c5
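The correspondence between elementary row operations and left multiplication by elementary matrices can be checked numerically. The following is an illustrative NumPy sketch (the test matrices are made up for the check, not taken from the text).

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]], dtype=float)

# Elementary matrix that interchanges rows 2 and 3 of a 3-row matrix
E_swap = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0]], dtype=float)
swapped = E_swap @ A
assert np.array_equal(swapped, A[[0, 2, 1]])

# Elementary matrix that adds k times row 1 to row 2
k = -5.0
E_add = np.eye(3)
E_add[1, 0] = k
added = E_add @ A
assert np.allclose(added[1], A[1] + k * A[0])
```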

In the next example the situation is somewhat different from what we considered
so far in this section.

Example 2.3.4. Find a matrix P such that


   
a1 b1 c1 a1 b1 c1
P a 2 b 2 c 2  =  a 2 + j a 1 b 2 + j b 1 c 2 + j c 1 
a3 b3 c3 a 3 + ka 1 b 3 + kb 1 c 3 + kc 1

Solution. The matrix on the right is the result of applying two elementary opera-
tions: multiply row 1 by j and add to row 2 and then multiply row 1 by k and add
to row 3. To represent these two elementary operations we need two elementary
matrices:    
1 0 0 1 0 0
 j 1 0 and  0 1 0 .
0 0 1 k 0 1
Thus the matrix P is the product of these two matrices:
    
1 0 0 1 0 0 1 0 0
P =  j 1 0  0 1 0 =  j 1 0 .
0 0 1 k 0 1 k 0 1

Now we have
      
a1 b1 c1 1 0 0 a1 b1 c1 a1 b1 c1
P  a 2 b 2 c 2  =  j 1 0  a 2 b 2 c 2  =  a 2 + j a 1 b 2 + j b 1 c 2 + j c 1  .
a3 b3 c3 k 0 1 a3 b3 c3 a 3 + ka 1 b 3 + kb 1 c 3 + kc 1

Note that the matrix P does not depend on the order of the elementary matri-
ces, that is, we have
       
1 0 0 1 0 0 1 0 0 1 0 0 1 0 0
 j 1 0  0 1 0 =  0 1 0  j 1 0 =  j 1 0 .
0 0 1 k 0 1 k 0 1 0 0 1 k 0 1

 
1 0 0
The matrix P =  j 1 0 in the above example is not an elementary matrix since
k 0 1
we need to apply two elementary operations to the 3 × 3 identity matrix to obtain
P . When applying Gaussian elimination to a matrix, we often combine elemen-
tary operations that change a single column in the matrix. For example, we write
multiply row 3 by −2/5 and then add to row 1
multiply row 3 by 1/5 and then add to row 2
Note that the combination of these two elementary operations correspond to mul-
tiplication by a single matrix, namely, the matrix

 1 0 −2/5
 0 1  1/5 .
 0 0  1

In the remainder of this chapter we often represent matrices as products of elemen-


tary matrices. It is convenient to combine those elementary matrices that corre-
spond to multiplication of the same row and addition to the remaining rows into
a single matrix, as in the example above. Since traditionally the name “elementary
matrix” is reserved to mean a matrix obtained from an identity matrix by one el-
ementary row operation, we will use the name “simple matrix” to mean a matrix
from this larger class of matrices.

Definition 2.3.5. By a simple matrix we mean an elementary matrix or a


product of elementary matrices that correspond to multiplication of the
same row and addition to the remaining rows.

Example 2.3.6. The matrices

 1 p 0     1 0 0 p     1 0 0 0 0
 0 1 0 ,   0 1 0 q     p 1 0 0 0
 0 q 1     0 0 1 r ,   q 0 1 0 0
           0 0 0 1     r 0 0 1 0
                       s 0 0 0 1

are simple matrices for arbitrary real numbers p, q, r, s.

Example 2.3.7. We consider the matrix


 
2 3 2 5
A = 1 2 −1 3 .
1 1 3 2

Find a matrix P such that the product P A is the reduced row echelon form of the
matrix A. Express the matrix P as a product of simple matrices.

Solution. We have
    
0 1 0 2 3 2 5 1 2 −1 3
1 0 0 1 2 −1 3 = 2 3 2 5 ,
0 0 1 1 1 3 2 1 1 3 2
    
1 0 0 1 2 −1 3 1 2 −1 3
−2 1 0 2 3 2 5 = 0 −1 4 −1 ,
−1 0 1 1 1 3 2 0 −1 4 −1
    
1 0 0 1 2 −1 3 1 2 −1 3
0 −1 0 0 −1 4 −1 = 0 1 −4 1 ,
0 0 1 0 −1 4 −1 0 −1 4 −1
and finally
    
1 −2 0 1 2 −1 3 1 0 7 1
0 1 0 0 1 −4 1 = 0 1 −4 1 .
0 1 1 0 −1 4 −1 0 0 0 0
This means that we can take
    
1 −2 0 1 0 0 1 0 0 0 1 0
P = 0 1 0 0 −1 0 −2 1 0 1 0 0
0 1 1 0 0 1 −1 0 1 0 0 1

   
1 −2 0 1 0 0 0 1 0
= 0 1 0 0 −1 0 1 −2 0
0 1 1 0 0 1 0 −1 1
  
1 −2 0 0 1 0
= 0 1 0 −1 2 0
0 1 1 0 −1 1
 
2 −3 0
= −1 2 0 .
−1 1 1

It is easy to verify that


    
2 −3 0 2 3 2 5 1 0 7 1
P A = −1 2 0 1 2 −1 3 = 0 1 −4 1 .
−1 1 1 1 1 3 2 0 0 0 0

Invertible matrices

Definition 2.3.8. An n × n matrix A is invertible if there is a matrix B such


that
AB = B A = I n ,
where I n is the n × n identity matrix.

Theorem 2.3.9. If an n × n matrix A is invertible, then there is a unique ma-


trix B such that AB = B A = I n .

Proof. We need to show that, if

AB = B A = I n and AC = C A = I n ,

then B = C . Indeed, if B A = I n and AC = I n , then

B = B I n = B (AC ) = (B A)C = I n C = C .

Definition 2.3.10. Let A be an invertible n × n matrix. The unique matrix B


such that AB = B A = I n is called the inverse of the matrix A and is denoted
by A −1 .

Now we consider the problem of finding the inverse of an invertible matrix. We


start by noting that, since elementary row operations are reversible, elementary ma-
trices are invertible. Moreover, it is easy to find the inverse of an elementary matrix
or, more generally, a simple matrix.

Example 2.3.11. Assuming k 6= 0, find the inverses of the elementary matrices


     
k 0 0 1 0 0 1 0 0
 0 1 0 , 0 k 0 , and 0 1 0  .
0 0 1 0 0 1 0 0 k

Solution.

 k 0 0 −1    1/k 0 0
 0 1 0    =  0   1 0 ,
 0 0 1       0   0 1

 1 0 0 −1    1 0   0
 0 k 0    =  0 1/k 0 ,
 0 0 1       0 0   1

and

 1 0 0 −1    1 0 0
 0 1 0    =  0 1 0   .
 0 0 k       0 0 1/k

Example 2.3.12. Find the inverses of the elementary matrices


     
0 1 0 1 0 0 0 0 1
1 0 0 , 0 0 1 and 0 1 0  .
0 0 1 0 1 0 1 0 0

Solution.
 −1  
0 1 0 0 1 0
1 0 0 = 1 0 0 ,
0 0 1 0 0 1
 −1  
1 0 0 1 0 0
0 0 1 = 0 0 1 ,
0 1 0 0 1 0

and
 −1  
0 0 1 0 0 1
0 1 0 = 0 1 0 .
1 0 0 1 0 0
Note that for these elementary matrices the inverse is the same as the original
matrix.

Example 2.3.13. Find the inverse of the simple matrices


     
1 0 0 1 c 0 1 0 e
a 1 0 , 0 1 0 and 0 1 f  .
b 0 1 0 d 1 0 0 1

Solution.
 −1  
1 0 0 1 0 0
a 1 0 = −a 1 0 ,
b 0 1 −b 0 1
 −1  
1 c 0 1 −c 0
0 1 0 = 0 1 0 ,
0 d 1 0 −d 1
and
 −1  
1 0 e 1 0 −e
0 1 f  = 0 1 − f  .
0 0 1 0 0 1

The inverses found in the above three examples easily generalize to larger matri-
ces.

Example 2.3.14.
−1 
1 a 0 0 0 1 −a 0 0 0
 
0 1 0 0 0 0 1 0 0 0
   
0
 b 1 0 0 = 0 −b 1 0 0 .
 
0 c 0 1 0 0 −c 0 1 0
0 d 0 0 1 0 −d 0 0 1

In the process of determining the inverse of a matrix the following theorem is


often useful.

Theorem 2.3.15. If A1, . . . , Am are invertible n × n matrices, then the product
matrix A1 · · · Am is invertible and we have

(A1 · · · Am)^−1 = Am^−1 · · · A1^−1.

Note that the order of the matrices in the equality (A1 · · · Am)^−1 = Am^−1 · · · A1^−1 is
different on the left-hand side and the right-hand side.

Proof for m = 3. We have

A1 A2 A3 A3^−1 A2^−1 A1^−1 = A1 A2 A2^−1 A1^−1 = A1 A1^−1 = In

and

A3^−1 A2^−1 A1^−1 A1 A2 A3 = A3^−1 A2^−1 A2 A3 = A3^−1 A3 = In .

It should be clear why this argument easily generalizes to any number of matri-
ces.
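The statement of Theorem 2.3.15 can be checked numerically for m = 2. The following is an illustrative NumPy sketch with arbitrarily chosen invertible matrices (not from the text); note how reversing the order of the inverses matters.

```python
import numpy as np

A1 = np.array([[2.0, 1.0], [1.0, 1.0]])
A2 = np.array([[1.0, 3.0], [0.0, 1.0]])

# (A1 A2)^{-1} equals A2^{-1} A1^{-1} -- inverses in reversed order
lhs = np.linalg.inv(A1 @ A2)
rhs = np.linalg.inv(A2) @ np.linalg.inv(A1)
assert np.allclose(lhs, rhs)

# The order matters: inv(A1) @ inv(A2) is in general different
wrong = np.linalg.inv(A1) @ np.linalg.inv(A2)
assert not np.allclose(lhs, wrong)
```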

Example 2.3.16. Calculate the matrix

     1 0 0 0   1 0 3 0   1 0   0 0
 A = 0 0 1 0   0 1 2 0   0 1/5 0 0
     0 1 0 0   0 0 1 0   0 0   1 0
     0 0 0 1   0 0 7 1   0 0   0 1

and then write its inverse as a product of three simple matrices. Use the result to
calculate the inverse of the matrix A.

Solution. First we find that

     1 0   3 0
 A = 0 0   1 0
     0 1/5 2 0 .
     0 0   7 1
Now

         1 0 0 0   1 0 3 0   1 0   0 0  −1
 A^−1 =  0 0 1 0   0 1 2 0   0 1/5 0 0
         0 1 0 0   0 0 1 0   0 0   1 0
         0 0 0 1   0 0 7 1   0 0   0 1

         1 0   0 0 −1   1 0 3 0 −1   1 0 0 0 −1
      =  0 1/5 0 0      0 1 2 0      0 0 1 0
         0 0   1 0      0 0 1 0      0 1 0 0
         0 0   0 1      0 0 7 1      0 0 0 1

         1 0 0 0   1 0 −3 0   1 0 0 0
      =  0 5 0 0   0 1 −2 0   0 0 1 0
         0 0 1 0   0 0  1 0   0 1 0 0
         0 0 0 1   0 0 −7 1   0 0 0 1

         1 0 0 0   1 −3 0 0
      =  0 5 0 0   0 −2 1 0
         0 0 1 0   0  1 0 0
         0 0 0 1   0 −7 0 1

         1 −3  0 0
      =  0 −10 5 0 .
         0  1  0 0
         0 −7  0 1

It is easy to verify that

 1 0   3 0   1 −3  0 0     1 0 0 0
 0 0   1 0   0 −10 5 0  =  0 1 0 0
 0 1/5 2 0   0  1  0 0     0 0 1 0
 0 0   7 1   0 −7  0 1     0 0 0 1

and

 1 −3  0 0   1 0   3 0     1 0 0 0
 0 −10 5 0   0 0   1 0  =  0 1 0 0
 0  1  0 0   0 1/5 2 0     0 0 1 0 .
 0 −7  0 1   0 0   7 1     0 0 0 1

A characterization of invertible matrices


Now we turn our attention to the problem of determining whether a given matrix is
invertible. We start with the following important theorem.

Theorem 2.3.17. Let A be an n × n matrix with columns c1 , . . . , cn , that is,


A = [c1 . . . cn ]. The following conditions are equivalent:

(a) The matrix A is invertible;

(b) The equation


x 1 c1 + · · · + x n cn = 0
has only the trivial solution, that is, the solution x 1 = · · · = x n = 0;

(c) The reduced row echelon form of the matrix A is the identity matrix I n ;

(d) The matrix A can be written as a product of elementary matrices.



Proof for n = 4. First we assume that the matrix

                      a1 b1 c1 d1
 A = [c1 c2 c3 c4] =  a2 b2 c2 d2
                      a3 b3 c3 d3
                      a4 b4 c4 d4

is invertible. The equation

x 1 c1 + x 2 c2 + x 3 c3 + x 4 c4 = 0

can be written as a matrix equation

a1 b1 c1 d1 x1 0
    
 a2 b2 c2 d2  x 2  0
   =  .
 a3 b3 c3 d 3  x 3  0
a4 b4 c4 d4 x4 0

Since the matrix A is invertible, we have

x1 1 0 0 0 x1
    
x 2   0 1 0 0  x 2 
 =  
x 3   0 0 1 0  x 3 
x4 0 0 0 1 x4
−1 
a1 b1 c1 d1 a1 b1 c1 d1 x1
  
 a2 b2 c2 d2   a2 b2 c2 d2  x 2 
=
 a3
   
b3 c3 d3   a3 b3 c3 d 3  x 3 
a4 b4 c4 d4 a4 b4 c4 d4 x4
−1    
a1 b1 c1 d1 0 0

 a2 b2 c2 d2  0  0 
=    =  .
 a3 b3 c 3 d 3  0  0 
a4 b4 c4 d4 0 0

This shows that (a) implies (b).


Next we assume that the only solution of the equation x 1 c1 +x 2 c2 +x 3 c3 +x 4 c4 = 0
is x 1 = x 2 = x 3 = x 4 = 0. This means that the only solution of the equation

a1 b1 c1 d1 x1 0
    
 a2 b2 c2 d2  x 2  0
   =  
 a3 b3 c3 d 3  x 3  0
a4 b4 c4 d4 x4 0

is
x1 0
   
x 2  0
  =  .
x 3  0
x4 0

We note that the same is true for any matrix obtained from the matrix A by elemen-
tary row operations.
If a 1 = a 2 = a 3 = a 4 = 0, then

a1 b1 c1 d1 1 0 b1 c1 d1 1 0
       
 a2 b2 c2 d2  0  0 b2 c2 d2  0 0
   =    =  
 a3 b3 c3 d 3  0  0 b3 c3 d 3  0 0
a4 b4 c4 d4 0 0 b4 c4 d4 0 0

contrary to our assumption. So at least one of the numbers a 1 , a 2 , a 3 , or a 4 must


be different from 0. Without loss of generality, we can assume that a 1 6= 0. Now we
multiply the first row of the matrix

a1 b1 c1 d1
 
 a2 b2 c2 d2 
 
 a3 b3 c3 d3 
a4 b4 c4 d4

by 1/a1 and get

 1  b1/a1 c1/a1 d1/a1
 a2 b2    c2    d2
 a3 b3    c3    d3
 a4 b4    c4    d4

and then, by row replacement, we obtain

 1  b1/a1            c1/a1            d1/a1
 0  b2 − (a2/a1)b1   c2 − (a2/a1)c1   d2 − (a2/a1)d1
 0  b3 − (a3/a1)b1   c3 − (a3/a1)c1   d3 − (a3/a1)d1
 0  b4 − (a4/a1)b1   c4 − (a4/a1)c1   d4 − (a4/a1)d1

We will write this last matrix as

 1 b′1 c′1 d′1
 0 b′2 c′2 d′2
 0 b′3 c′3 d′3 .
 0 b′4 c′4 d′4

If b′2 = b′3 = b′4 = 0, then the vector with entries −b′1, 1, 0, 0 is a nontrivial solution, because

 1 b′1 c′1 d′1   −b′1     0
 0 0   c′2 d′2    1    =  0
 0 0   c′3 d′3    0       0
 0 0   c′4 d′4    0       0

and elementary row operations do not change the solutions, so also

 a1 b1 c1 d1   −b′1     0
 a2 b2 c2 d2    1    =  0
 a3 b3 c3 d3    0       0
 a4 b4 c4 d4    0       0
contrary to our assumption. So at least one of the numbers b′2, b′3, or b′4 must be
different from 0. Without loss of generality, we can assume that b′2 ≠ 0. Now we
continue as before with the matrix

 1 b′1 c′1 d′1
 0 b′2 c′2 d′2
 0 b′3 c′3 d′3
 0 b′4 c′4 d′4
and by multiplying the second row by 1/b′2 we get

 1 b′1 c′1       d′1
 0 1   c′2/b′2   d′2/b′2
 0 b′3 c′3       d′3
 0 b′4 c′4       d′4
and then by row replacement we get

 1 0 c′1 − (b′1/b′2)c′2   d′1 − (b′1/b′2)d′2
 0 1 c′2/b′2              d′2/b′2
 0 0 c′3 − (b′3/b′2)c′2   d′3 − (b′3/b′2)d′2 .
 0 0 c′4 − (b′4/b′2)c′2   d′4 − (b′4/b′2)d′2

We will write this last matrix as

 1 0 c″1 d″1
 0 1 c″2 d″2
 0 0 c″3 d″3 .
 0 0 c″4 d″4
If c″3 = c″4 = 0, then the vector with entries −c″1, −c″2, 1, 0 is a nontrivial solution, because

 1 0 c″1 d″1   −c″1     0
 0 1 c″2 d″2   −c″2  =  0
 0 0 0   d″3    1       0
 0 0 0   d″4    0       0

and consequently

 a1 b1 c1 d1   −c″1     0
 a2 b2 c2 d2   −c″2  =  0
 a3 b3 c3 d3    1       0
 a4 b4 c4 d4    0       0

contrary to our assumption. So at least one of the numbers c″3 or c″4 must be different
from 0. Without loss of generality, we can assume that c″3 ≠ 0. Now we continue as
before with the matrix

 1 0 c″1 d″1
 0 1 c″2 d″2
 0 0 c″3 d″3
 0 0 c″4 d″4
and by multiplying the third row by 1/c″3 we get

 1 0 c″1 d″1
 0 1 c″2 d″2
 0 0 1   d″3/c″3
 0 0 c″4 d″4

and then by row replacement

 1 0 0 d″1 − (c″1/c″3)d″3
 0 1 0 d″2 − (c″2/c″3)d″3
 0 0 1 d″3/c″3            .
 0 0 0 d″4 − (c″4/c″3)d″3

We will write this last matrix as

 1 0 0 d‴1
 0 1 0 d‴2
 0 0 1 d‴3 .
 0 0 0 d‴4

If d‴4 = 0, then the vector with entries −d‴1, −d‴2, −d‴3, 1 is a nontrivial solution, because

 1 0 0 d‴1   −d‴1     0
 0 1 0 d‴2   −d‴2  =  0
 0 0 1 d‴3   −d‴3     0
 0 0 0 0      1       0

and consequently

 a1 b1 c1 d1   −d‴1     0
 a2 b2 c2 d2   −d‴2  =  0
 a3 b3 c3 d3   −d‴3     0
 a4 b4 c4 d4    1       0

contrary to our assumption. So we must have d‴4 ≠ 0.


Now we multiply the last row of the matrix

1 0 0 d 1000
 

0 1 0 d 2000 
 
 
0 0 1 d 000 
 3 
0 0 0 d 4000
1
by d 4000
and get
1 0 0 d 1000
 

0 1 0 d 2000 
 
 
0 0 1 d 000 
 3 
0 0 0 1
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 115

2.3. THE INVERSE OF A MATRIX 115

and then by row replacement

 1 0 0 0
 0 1 0 0
 0 0 1 0 .
 0 0 0 1
This long argument shows that if x 1 = x 2 = x 3 = x 4 = 0 is the only solution of the
equation x 1 c1 + x 2 c2 + x 3 c3 + x 4 c4 = 0, then the reduced row echelon form of the
matrix A is the identity matrix I 4 . Therefore (b) implies (c).
Now we assume that the reduced row echelon form of the matrix A is the identity
matrix I4. Each step in the Gaussian elimination process corresponds to an elementary
matrix Ej. When the matrix A is multiplied by the product E = Em Em−1 . . . E2 E1
of all those elementary matrices, we obtain I4, that is, we have E A = I4. Consequently,

A = E^−1 = (Em Em−1 . . . E2 E1)^−1 = E1^−1 E2^−1 . . . Em−1^−1 Em^−1.

Therefore (c) implies (d), because the inverse of an elementary matrix is an elementary
matrix.
Finally, if the matrix A can be written as a product of elementary matrices, then
it is invertible, because elementary matrices are invertible and the product of invert-
ible matrices is an invertible matrix. Therefore (d) implies (a).
All essential ingredients of the general proof are present in the considered case
and generalizing the argument to larger matrices is easy, but tedious.

 
Example 2.3.18. Show that the matrix

     2 1 2
 A = 1 2 1
     1 2 2

is invertible. Write the matrix and its inverse as products of simple matrices and
calculate A^−1.

Solution. Since     
0 1 0 2 1 2 1 2 1
1 0 0 1 2 1 = 2 1 2 ,
0 0 1 1 2 2 1 2 2
    
1 0 0 1 2 1 1 2 1
−2 1 0 2 1 2 = 0 −3 0 ,
−1 0 1 1 2 2 0 0 1
    
 1  0   0   1 2 1     1 2 1
 0 −1/3 0   0 −3 0  =  0 1 0 ,
 0  0   1   0 0 1     0 0 1
    
1 −2 0 1 2 1 1 0 1
0 1 0 0 1 0 = 0 1 0 ,
0 0 1 0 0 1 0 0 1

and finally
    
1 0 −1 1 0 1 1 0 0
0 1 0 0 1 0 = 0 1 0 ,
0 0 1 0 0 1 0 0 1
we have

         1 0 −1   1 −2 0   1  0   0    1  0 0   0 1 0
 A^−1 =  0 1  0   0  1 0   0 −1/3 0   −2  1 0   1 0 0
         0 0  1   0  0 1   0  0   1   −1  0 1   0 0 1

         1 0 −1   1 −2 0   1  0   0   0  1 0
      =  0 1  0   0  1 0   0 −1/3 0   1 −2 0
         0 0  1   0  0 1   0  0   1   0 −1 1

         1 0 −1   1 −2 0    0    1   0
      =  0 1  0   0  1 0   −1/3  2/3 0
         0 0  1   0  0 1    0   −1   1

         1 0 −1    2/3 −1/3 0
      =  0 1  0   −1/3  2/3 0
         0 0  1    0   −1   1

          2/3  2/3 −1
      =  −1/3  2/3  0 .
          0   −1    1

Now we use the representation of A^−1 as a product of simple matrices to find a
representation of A as a product of simple matrices.

 A = (A^−1)^−1

     1 0 −1   1 −2 0   1  0   0    1  0 0   0 1 0  −1
   = 0 1  0   0  1 0   0 −1/3 0   −2  1 0   1 0 0
     0 0  1   0  0 1   0  0   1   −1  0 1   0 0 1

     0 1 0 −1    1 0 0 −1   1  0   0 −1   1 −2 0 −1   1 0 −1 −1
   = 1 0 0      −2 1 0      0 −1/3 0      0  1 0      0 1  0
     0 0 1      −1 0 1      0  0   1      0  0 1      0 0  1

     0 1 0   1 0 0   1  0 0   1 2 0   1 0 1
   = 1 0 0   2 1 0   0 −3 0   0 1 0   0 1 0 .
     0 0 1   1 0 1   0  0 1   0 0 1   0 0 1

According to Definition 2.3.8 an n × n matrix A is invertible if there is a matrix B


such that AB = I n and B A = I n . In the case when n = 2 we discovered that it is not
necessary to check both conditions. The same is true for square matrices of any size.

Theorem 2.3.19. If A and B are n × n matrices such that

AB = I n ,

then both matrices A and B are invertible and we have A −1 = B and B −1 = A.

Proof. Assume that A and B are n × n matrices such that

AB = I n .

If
B x = 0,
then
AB x = A0
and thus
x = I n x = AB x = A0 = 0.
This shows that the only solution of the equation B x = 0 is the trivial solution x = 0,
which implies that B is invertible, by Theorem 2.3.17. Let B −1 = C . Then

BC = C B = I n

and consequently
A = AI n = ABC = I n C = C ,
which means that
B A = AB = I n .

Example 2.3.20. Since


  
1 1 0 1 0 1
2 0 −1 0 0 −1 = I 3 ,
0 −1 0 2 −1 2
   
1 1 0 1 0 1
both matrices 2 0 −1 and 0 0 −1 are invertible and we have
0 −1 0 2 −1 2
 −1  
1 1 0 1 0 1
2 0 −1 = 0 0 −1
0 −1 0 2 −1 2

and
 −1  
1 0 1 1 1 0
0 0 −1 = 2 0 −1 .
2 −1 2 0 −1 0
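The matrices of Example 2.3.20 can be checked with a short NumPy sketch (illustrative, not part of the text). By Theorem 2.3.19, once AB = I3 is verified, BA = I3 is guaranteed; the sketch confirms both products numerically.

```python
import numpy as np

A = np.array([[1, 1, 0],
              [2, 0, -1],
              [0, -1, 0]], dtype=float)
B = np.array([[1, 0, 1],
              [0, 0, -1],
              [2, -1, 2]], dtype=float)

# AB = I3, as claimed in Example 2.3.20
assert np.allclose(A @ B, np.eye(3))
# By Theorem 2.3.19 the product in the other order must also be the identity
assert np.allclose(B @ A, np.eye(3))
```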

A matrix equation
In Theorem 1.2.6 we show how inverse matrices can be used to solve systems of
two linear equations with two unknowns. That method works equally well for more
general systems.

Theorem 2.3.21. Let A be an n × n invertible matrix and B an n × p matrix.


The equation
AX = B (2.2)
has a unique solution and that solution is the n × p matrix

X = A −1 B.

Proof. If A is invertible, we can multiply the equation AX = B by the matrix A −1 and


get
A −1 (AX ) = A −1 B

which simplifies to
X = A −1 B. (2.3)

This shows that, if the equation (2.2) has a solution, then it is given by the equa-
tion (2.3).
Now we need to check that X = A −1 B is a solution. Indeed, we have

AX = A(A^−1 B) = (A A^−1)B = In B = B.

Theorem 2.3.22. Let A be an n × n invertible matrix and B an n × p matrix.


Then the reduced row echelon form of the n × (n + p) matrix
[ A  B ]

is the matrix

[ In  A^−1 B ].

Proof. According to Theorem 2.3.17 there are elementary matrices E1, E2, . . . , Em−1, Em
such that

Em Em−1 . . . E2 E1 A = In .

Consequently,

Em Em−1 . . . E2 E1 [ A  B ] = [ Em Em−1 . . . E2 E1 A   Em Em−1 . . . E2 E1 B ] = [ In  A^−1 B ],

because Em Em−1 . . . E2 E1 = A^−1.

The above result applied to the case when B = In tells us that the reduced row
echelon form of the n × 2n matrix [ A  In ] is the matrix [ In  A^−1 ]. This suggests a
convenient method for finding the inverse of a matrix, as illustrated in the following
example.

Example 2.3.23. Determine if the matrix

2 1 1 1
 
1 2 1 1
A=
1

1 2 1
1 1 1 2

is invertible. If A is invertible, then find its inverse.

Solution. We find that

 2 1 1 1 1 0 0 0     1 0 0 0   4/5 −1/5 −1/5 −1/5
 1 2 1 1 0 1 0 0  ∼  0 1 0 0  −1/5  4/5 −1/5 −1/5
 1 1 2 1 0 0 1 0     0 0 1 0  −1/5 −1/5  4/5 −1/5 .
 1 1 1 2 0 0 0 1     0 0 0 1  −1/5 −1/5 −1/5  4/5

This means that the matrix A is invertible and the inverse is

          4/5 −1/5 −1/5 −1/5          4 −1 −1 −1
 A^−1 =  −1/5  4/5 −1/5 −1/5  = 1/5  −1  4 −1 −1
         −1/5 −1/5  4/5 −1/5         −1 −1  4 −1 .
         −1/5 −1/5 −1/5  4/5         −1 −1 −1  4
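The reduction of [ A  I ] to [ I  A^−1 ] can be sketched in code. The function below is an illustrative NumPy implementation of the method of Theorem 2.3.22 (the name `invert_via_gauss_jordan` is ours, and partial pivoting is added for numerical safety; the hand computation in the text does not need it).

```python
import numpy as np

def invert_via_gauss_jordan(A):
    """Row-reduce [A | I] to [I | A^{-1}] with partial pivoting."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))  # choose a nonzero pivot
        M[[col, pivot]] = M[[pivot, col]]              # interchange rows
        M[col] /= M[col, col]                          # scale pivot row to 1
        for row in range(n):
            if row != col:
                M[row] -= M[row, col] * M[col]         # clear rest of the column
    return M[:, n:]

# The matrix of Example 2.3.23 and its inverse (1/5)(5 I - J + ... ) pattern:
A = np.array([[2, 1, 1, 1],
              [1, 2, 1, 1],
              [1, 1, 2, 1],
              [1, 1, 1, 2]], dtype=float)
A_inv = invert_via_gauss_jordan(A)
expected = (np.full((4, 4), -1.0) + 5 * np.eye(4)) / 5  # 4/5 on diagonal, -1/5 elsewhere
assert np.allclose(A_inv, expected)
assert np.allclose(A @ A_inv, np.eye(4))
```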

Example 2.3.24. Find a matrix


 
x p
 y q 
z r

such that     
2 1 1 x p 2 1
 3 3 1  y q = 1 2 .
1 1 2 z r 3 4

Solution. Since

 2 1 1 2 1     1 0 0  3/5 −1
 3 3 1 1 2  ∼  0 1 0 −4/5  1 ,
 1 1 2 3 4     0 0 1  8/5  2

we have

 x p     2 1 1 −1  2 1      3/5 −1
 y q  =  3 3 1     1 2  =  −4/5  1 ,
 z r     1 1 2     3 4      8/5  2

by Theorem 2.3.22.
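The computation in this example can be verified numerically. The sketch below (illustrative, not the book's method) uses NumPy's built-in solver for the matrix equation AX = B instead of row reduction and checks the result against the entries obtained above.

```python
import numpy as np

A = np.array([[2, 1, 1],
              [3, 3, 1],
              [1, 1, 2]], dtype=float)
B = np.array([[2, 1],
              [1, 2],
              [3, 4]], dtype=float)

# X = A^{-1} B, computed with a solver rather than by forming A^{-1}
X = np.linalg.solve(A, B)
assert np.allclose(X, [[3/5, -1], [-4/5, 1], [8/5, 2]])
assert np.allclose(A @ X, B)
```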

LU-decomposition of 3 × 3 matrices

In Chapter 1 we introduced LU-decomposition of 2 × 2 matrices. The idea can be


easily generalized to 3 × 3 matrices.

Definition 2.3.25. By an LU-decomposition (or an LU-factorization) of a 3×3


matrix A we mean the representation of A in the form

A = LU

where U is an upper triangular matrix and L is a lower triangular matrix with


every entry on the main diagonal equal 1, that is,
  
1 0 0 u1 u2 u3
A = l 1 1 0  0 u 4 u 5  .
l2 l3 1 0 0 u6

When finding an LU-decomposition of a 3 × 3 matrix it is useful to note that, if


a 1 6= 0, then

  1      0 0   a1 b1 c1     a1 b1  c1
 −a2/a1  1 0   a2 b2 c2  =  0  b′2 c′2
 −a3/a1  0 1   a3 b3 c3     0  b′3 c′3

and, if b′2 ≠ 0,

 1  0        0   a1 b1  c1     a1 b1  c1      u1 u2 u3
 0  1        0   0  b′2 c′2  =  0  b′2 c′2  =  0  u4 u5 .
 0 −b′3/b′2  1   0  b′3 c′3     0  0   c″3     0  0  u6
2

 
1 1 2
Example 2.3.26. Find an LU-decomposition of the matrix 3 1 1.
2 4 3

Solution. Since
$$\begin{bmatrix}1 & 0 & 0\\ -3 & 1 & 0\\ -2 & 0 & 1\end{bmatrix}
\begin{bmatrix}1 & 1 & 2\\ 3 & 1 & 1\\ 2 & 4 & 3\end{bmatrix}
=\begin{bmatrix}1 & 1 & 2\\ 0 & -2 & -5\\ 0 & 2 & -1\end{bmatrix}$$
and
$$\begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1 & 1\end{bmatrix}
\begin{bmatrix}1 & 1 & 2\\ 0 & -2 & -5\\ 0 & 2 & -1\end{bmatrix}
=\begin{bmatrix}1 & 1 & 2\\ 0 & -2 & -5\\ 0 & 0 & -6\end{bmatrix},$$
we have
$$\begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1 & 1\end{bmatrix}
\begin{bmatrix}1 & 0 & 0\\ -3 & 1 & 0\\ -2 & 0 & 1\end{bmatrix}
\begin{bmatrix}1 & 1 & 2\\ 3 & 1 & 1\\ 2 & 4 & 3\end{bmatrix}
=\begin{bmatrix}1 & 1 & 2\\ 0 & -2 & -5\\ 0 & 0 & -6\end{bmatrix}.$$
Hence
$$\begin{bmatrix}1 & 1 & 2\\ 3 & 1 & 1\\ 2 & 4 & 3\end{bmatrix}
=\begin{bmatrix}1 & 0 & 0\\ 3 & 1 & 0\\ 2 & 0 & 1\end{bmatrix}
\begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1 & 1\end{bmatrix}
\begin{bmatrix}1 & 1 & 2\\ 0 & -2 & -5\\ 0 & 0 & -6\end{bmatrix},$$
because
$$\begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1 & 1\end{bmatrix}^{-1}
=\begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1 & 1\end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix}1 & 0 & 0\\ -3 & 1 & 0\\ -2 & 0 & 1\end{bmatrix}^{-1}
=\begin{bmatrix}1 & 0 & 0\\ 3 & 1 & 0\\ 2 & 0 & 1\end{bmatrix}.$$
Since
$$\begin{bmatrix}1 & 0 & 0\\ 3 & 1 & 0\\ 2 & 0 & 1\end{bmatrix}
\begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1 & 1\end{bmatrix}
=\begin{bmatrix}1 & 0 & 0\\ 3 & 1 & 0\\ 2 & -1 & 1\end{bmatrix},$$
we obtain the following LU-decomposition of the matrix:
$$\begin{bmatrix}1 & 1 & 2\\ 3 & 1 & 1\\ 2 & 4 & 3\end{bmatrix}
=\begin{bmatrix}1 & 0 & 0\\ 3 & 1 & 0\\ 2 & -1 & 1\end{bmatrix}
\begin{bmatrix}1 & 1 & 2\\ 0 & -2 & -5\\ 0 & 0 & -6\end{bmatrix}.$$
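The two elimination steps used above can be carried out mechanically. The following sketch (plain Python; `lu` is our own helper, not a standard function) performs the elimination without row exchanges and recovers the factorization just computed:

```python
def lu(A):
    """LU-decomposition without row exchanges.

    Returns (L, U) with L unit lower triangular and U upper triangular,
    assuming no zero pivot is encountered along the way.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [row[:] for row in A]            # U starts as a copy of A
    for k in range(n - 1):               # eliminate entries below the k-th pivot
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]  # multiplier l_ik
            for j in range(n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U

L, U = lu([[1, 1, 2], [3, 1, 1], [2, 4, 3]])
print(L)  # matches the L found above: rows (1,0,0), (3,1,0), (2,-1,1)
print(U)  # matches the U found above: rows (1,1,2), (0,-2,-5), (0,0,-6)
```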

 
Example 2.3.27. Find an LU-decomposition of the matrix $\begin{bmatrix}2 & 1 & 4\\ 1 & 1 & 1\\ 2 & 3 & 1\end{bmatrix}$.

Solution. Since
$$\begin{bmatrix}1 & 0 & 0\\ -\frac{1}{2} & 1 & 0\\ -1 & 0 & 1\end{bmatrix}
\begin{bmatrix}2 & 1 & 4\\ 1 & 1 & 1\\ 2 & 3 & 1\end{bmatrix}
=\begin{bmatrix}2 & 1 & 4\\ 0 & \frac{1}{2} & -1\\ 0 & 2 & -3\end{bmatrix}$$
and
$$\begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -4 & 1\end{bmatrix}
\begin{bmatrix}2 & 1 & 4\\ 0 & \frac{1}{2} & -1\\ 0 & 2 & -3\end{bmatrix}
=\begin{bmatrix}2 & 1 & 4\\ 0 & \frac{1}{2} & -1\\ 0 & 0 & 1\end{bmatrix},$$
we have
$$\begin{bmatrix}2 & 1 & 4\\ 1 & 1 & 1\\ 2 & 3 & 1\end{bmatrix}
=\begin{bmatrix}1 & 0 & 0\\ \frac{1}{2} & 1 & 0\\ 1 & 0 & 1\end{bmatrix}
\begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 4 & 1\end{bmatrix}
\begin{bmatrix}2 & 1 & 4\\ 0 & \frac{1}{2} & -1\\ 0 & 0 & 1\end{bmatrix}.$$
Consequently,
$$\begin{bmatrix}2 & 1 & 4\\ 1 & 1 & 1\\ 2 & 3 & 1\end{bmatrix}
=\begin{bmatrix}1 & 0 & 0\\ \frac{1}{2} & 1 & 0\\ 1 & 4 & 1\end{bmatrix}
\begin{bmatrix}2 & 1 & 4\\ 0 & \frac{1}{2} & -1\\ 0 & 0 & 1\end{bmatrix}.$$

 
Example 2.3.28. Find an LU-decomposition of the matrix $\begin{bmatrix}2 & 1 & 0\\ 3 & 1 & 0\\ 2 & 3 & 0\end{bmatrix}$.

Solution. Since
$$\begin{bmatrix}1 & 0 & 0\\ -\frac{3}{2} & 1 & 0\\ -1 & 0 & 1\end{bmatrix}
\begin{bmatrix}2 & 1 & 0\\ 3 & 1 & 0\\ 2 & 3 & 0\end{bmatrix}
=\begin{bmatrix}2 & 1 & 0\\ 0 & -\frac{1}{2} & 0\\ 0 & 2 & 0\end{bmatrix}$$
and
$$\begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 4 & 1\end{bmatrix}
\begin{bmatrix}2 & 1 & 0\\ 0 & -\frac{1}{2} & 0\\ 0 & 2 & 0\end{bmatrix}
=\begin{bmatrix}2 & 1 & 0\\ 0 & -\frac{1}{2} & 0\\ 0 & 0 & 0\end{bmatrix},$$
we obtain
$$\begin{bmatrix}2 & 1 & 0\\ 3 & 1 & 0\\ 2 & 3 & 0\end{bmatrix}
=\begin{bmatrix}1 & 0 & 0\\ \frac{3}{2} & 1 & 0\\ 1 & -4 & 1\end{bmatrix}
\begin{bmatrix}2 & 1 & 0\\ 0 & -\frac{1}{2} & 0\\ 0 & 0 & 0\end{bmatrix}.$$

As in the case of 2 × 2 matrices, not every 3 × 3 matrix has an LU-decomposition.



 
Example 2.3.29. Show that the matrix $A=\begin{bmatrix}1 & 1 & 2\\ 1 & 1 & 3\\ 0 & 1 & 4\end{bmatrix}$ has no LU-decomposition.

Solution. Suppose to the contrary that the matrix A has an LU-decomposition, that is, there are numbers $l_j$ and $u_k$ such that
$$\begin{bmatrix}1 & 1 & 2\\ 1 & 1 & 3\\ 0 & 1 & 4\end{bmatrix}
=\begin{bmatrix}1 & 0 & 0\\ l_1 & 1 & 0\\ l_2 & l_3 & 1\end{bmatrix}
\begin{bmatrix}u_1 & u_2 & u_3\\ 0 & u_4 & u_5\\ 0 & 0 & u_6\end{bmatrix}.$$
Since
$$\begin{bmatrix}1 & 0 & 0\\ l_1 & 1 & 0\\ l_2 & l_3 & 1\end{bmatrix}
\begin{bmatrix}u_1 & u_2 & u_3\\ 0 & u_4 & u_5\\ 0 & 0 & u_6\end{bmatrix}
=\begin{bmatrix}u_1 & u_2 & u_3\\ l_1u_1 & l_1u_2+u_4 & l_1u_3+u_5\\ l_2u_1 & l_2u_2+l_3u_4 & l_2u_3+l_3u_5+u_6\end{bmatrix},$$
we obtain $u_1 = u_2 = 1$, then $l_1 = 1$ and $l_2 = 0$, and then $u_4 = 0$. But this is not possible, because then we would have
$$1 = l_2u_2 + l_3u_4 = 0\cdot 1 + l_3\cdot 0 = 0.$$

The following theorem gives us a condition that guarantees existence of an LU-decomposition of a 3 × 3 matrix.

Theorem 2.3.30. If $r \ne ap$, then the matrix
$$A=\begin{bmatrix}1 & p & q\\ a & r & s\\ b & t & u\end{bmatrix}$$
has an LU-decomposition.

Proof. Since
$$\begin{bmatrix}1 & 0 & 0\\ -a & 1 & 0\\ -b & 0 & 1\end{bmatrix}
\begin{bmatrix}1 & p & q\\ a & r & s\\ b & t & u\end{bmatrix}
=\begin{bmatrix}1 & p & q\\ 0 & r-ap & s-aq\\ 0 & t-bp & u-bq\end{bmatrix}$$
and
$$\begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -\frac{t-bp}{r-ap} & 1\end{bmatrix}
\begin{bmatrix}1 & p & q\\ 0 & r-ap & s-aq\\ 0 & t-bp & u-bq\end{bmatrix}
=\begin{bmatrix}1 & p & q\\ 0 & r-ap & s-aq\\ 0 & 0 & u-bq-\frac{(s-aq)(t-bp)}{r-ap}\end{bmatrix},$$
we have
$$\begin{bmatrix}1 & p & q\\ a & r & s\\ b & t & u\end{bmatrix}
=\begin{bmatrix}1 & 0 & 0\\ a & 1 & 0\\ b & \frac{t-bp}{r-ap} & 1\end{bmatrix}
\begin{bmatrix}1 & p & q\\ 0 & r-ap & s-aq\\ 0 & 0 & u-bq-\frac{(s-aq)(t-bp)}{r-ap}\end{bmatrix}.$$

In the next example we use LU-decomposition to solve a system of three linear equations with three unknowns.

Example 2.3.31. Use the LU-decomposition
$$\begin{bmatrix}2 & 1 & 4\\ 1 & 1 & 1\\ 2 & 3 & 1\end{bmatrix}
=\begin{bmatrix}1 & 0 & 0\\ \frac{1}{2} & 1 & 0\\ 1 & 4 & 1\end{bmatrix}
\begin{bmatrix}2 & 1 & 4\\ 0 & \frac{1}{2} & -1\\ 0 & 0 & 1\end{bmatrix}$$
to solve the system
$$\begin{cases}2x_1 + x_2 + 4x_3 = 2\\ x_1 + x_2 + x_3 = 1\\ 2x_1 + 3x_2 + x_3 = 0\end{cases}.$$

Solution. The system can be written as a matrix equation
$$\begin{bmatrix}2 & 1 & 4\\ 1 & 1 & 1\\ 2 & 3 & 1\end{bmatrix}
\begin{bmatrix}x_1\\ x_2\\ x_3\end{bmatrix}
=\begin{bmatrix}2\\ 1\\ 0\end{bmatrix}$$
or
$$\begin{bmatrix}1 & 0 & 0\\ \frac{1}{2} & 1 & 0\\ 1 & 4 & 1\end{bmatrix}
\begin{bmatrix}2 & 1 & 4\\ 0 & \frac{1}{2} & -1\\ 0 & 0 & 1\end{bmatrix}
\begin{bmatrix}x_1\\ x_2\\ x_3\end{bmatrix}
=\begin{bmatrix}2\\ 1\\ 0\end{bmatrix}.$$
We let
$$\begin{bmatrix}2 & 1 & 4\\ 0 & \frac{1}{2} & -1\\ 0 & 0 & 1\end{bmatrix}
\begin{bmatrix}x_1\\ x_2\\ x_3\end{bmatrix}
=\begin{bmatrix}y_1\\ y_2\\ y_3\end{bmatrix}$$
and solve the equation
$$\begin{bmatrix}1 & 0 & 0\\ \frac{1}{2} & 1 & 0\\ 1 & 4 & 1\end{bmatrix}
\begin{bmatrix}y_1\\ y_2\\ y_3\end{bmatrix}
=\begin{bmatrix}2\\ 1\\ 0\end{bmatrix}.$$
By forward substitution we get
$$y_1 = 2,\quad y_2 = 0,\quad\text{and}\quad y_3 = -2.$$

Next we solve the equation
$$\begin{bmatrix}2 & 1 & 4\\ 0 & \frac{1}{2} & -1\\ 0 & 0 & 1\end{bmatrix}
\begin{bmatrix}x_1\\ x_2\\ x_3\end{bmatrix}
=\begin{bmatrix}2\\ 0\\ -2\end{bmatrix}.$$
By backward substitution we get
$$x_3 = -2,\quad x_2 = -4,\quad\text{and}\quad x_1 = 7.$$
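The two substitution passes can be written out in a few lines of plain Python (a sketch, with L, U, and b taken from this example):

```python
# L, U and the right-hand side b from Example 2.3.31.
L = [[1, 0, 0], [0.5, 1, 0], [1, 4, 1]]
U = [[2, 1, 4], [0, 0.5, -1], [0, 0, 1]]
b = [2.0, 1.0, 0.0]

# Forward substitution: solve L y = b (L has 1's on the diagonal).
y = [0.0] * 3
for i in range(3):
    y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))

# Backward substitution: solve U x = y.
x = [0.0] * 3
for i in reversed(range(3)):
    x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, 3))) / U[i][i]

print(y)  # [2.0, 0.0, -2.0]
print(x)  # [7.0, -4.0, -2.0]
```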

2.3.1 Exercises

Find the given products of matrices.

1. $\begin{bmatrix}1&0&0\\0&1&4\\0&0&1\end{bmatrix}\begin{bmatrix}a_1&b_1&c_1&d_1\\a_2&b_2&c_2&d_2\\a_3&b_3&c_3&d_3\end{bmatrix}$

2. $\begin{bmatrix}1&0&0\\0&1&0\\0&1&1\end{bmatrix}\begin{bmatrix}a_1&b_1&c_1&d_1\\a_2&b_2&c_2&d_2\\a_3&b_3&c_3&d_3\end{bmatrix}$

3. $\begin{bmatrix}1&0&0\\-1&1&0\\7&0&1\end{bmatrix}\begin{bmatrix}a_1&b_1&c_1&d_1\\a_2&b_2&c_2&d_2\\a_3&b_3&c_3&d_3\end{bmatrix}$

4. $\begin{bmatrix}1&0&-4\\0&1&-5\\0&0&1\end{bmatrix}\begin{bmatrix}a_1&b_1&c_1&d_1\\a_2&b_2&c_2&d_2\\a_3&b_3&c_3&d_3\end{bmatrix}$

5. $\begin{bmatrix}1&-a_1&0&0&0\\0&1&0&0&0\\0&-a_3&1&0&0\\0&-a_2&0&1&0\\0&-a_5&0&0&1\end{bmatrix}\begin{bmatrix}a_1\\1\\a_3\\a_2\\a_5\end{bmatrix}$

6. $\begin{bmatrix}1&0&0&0&0\\0&0&0&1&0\\0&0&1&0&0\\0&1&0&0&0\\0&0&0&0&1\end{bmatrix}\begin{bmatrix}a_1\\a_2\\a_3\\a_4\\a_5\end{bmatrix}$

7. $\begin{bmatrix}1&3&0&0\\0&1&0&0\\0&0&1&0\\0&8&0&1\end{bmatrix}\begin{bmatrix}a_1&b_1\\a_2&b_2\\a_3&b_3\\a_4&b_4\end{bmatrix}$

8. $\begin{bmatrix}1&0&-5&0\\0&1&0&0\\0&0&1&0\\0&0&1&1\end{bmatrix}\begin{bmatrix}a_1&b_1&c_1\\a_2&b_2&c_2\\a_3&b_3&c_3\\a_4&b_4&c_4\end{bmatrix}$

9. $\begin{bmatrix}1&2&0\\0&1&0\\0&5&1\end{bmatrix}\begin{bmatrix}1&0&1\\0&1&-2\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\3&1&0\\-1&0&1\end{bmatrix}$

10. $\begin{bmatrix}1&0&0\\0&3&0\\0&0&1\end{bmatrix}\begin{bmatrix}0&0&1\\0&1&0\\1&0&0\end{bmatrix}\begin{bmatrix}4&0&0\\0&1&0\\0&0&1\end{bmatrix}$

Find a matrix P such that the given equality holds.

11. $P\begin{bmatrix}a_1&b_1\\a_2&b_2\\a_3&b_3\end{bmatrix}=\begin{bmatrix}5a_1&5b_1\\4a_2&4b_2\\2a_3&2b_3\end{bmatrix}$

12. $P\begin{bmatrix}a_1\\a_2\\a_3\\a_4\end{bmatrix}=\begin{bmatrix}3a_1\\5a_2\\4a_3\\2a_4\end{bmatrix}$

13. $P\begin{bmatrix}a_1&b_1\\a_2&b_2\\a_3&b_3\\a_4&b_4\end{bmatrix}=\begin{bmatrix}a_3&b_3\\a_2&b_2\\a_1&b_1\\a_4&b_4\end{bmatrix}$

14. $P\begin{bmatrix}a_1\\a_2\\a_3\\a_4\end{bmatrix}=\begin{bmatrix}a_1\\a_4\\a_3\\a_2\end{bmatrix}$

15. $P\begin{bmatrix}a_1&b_1\\a_2&b_2\\a_3&b_3\end{bmatrix}=\begin{bmatrix}a_2&b_2\\a_3&b_3\\a_1&b_1\end{bmatrix}$

16. $P\begin{bmatrix}a_1\\a_2\\a_3\\a_4\end{bmatrix}=\begin{bmatrix}a_3\\a_1\\a_4\\a_2\end{bmatrix}$

17. $P\begin{bmatrix}a_1&b_1\\a_2&b_2\\a_3&b_3\\a_4&b_4\end{bmatrix}=\begin{bmatrix}a_1+ja_2&b_1+jb_2\\a_2&b_2\\a_3+ka_2&b_3+kb_2\\a_4+ma_2&b_4+mb_2\end{bmatrix}$

18. $P\begin{bmatrix}a_1\\a_2\\a_3\\a_4\\a_5\end{bmatrix}=\begin{bmatrix}a_1+3a_4\\a_2+a_4\\a_3+5a_4\\a_4\\a_5+7a_4\end{bmatrix}$

19. $P\begin{bmatrix}a_1\\a_2\\a_3\\a_4\\a_5\end{bmatrix}=\begin{bmatrix}a_1+2a_2\\a_2\\a_3+7a_2\\a_4+8a_2\\a_5+3a_2\end{bmatrix}$

20. $P\begin{bmatrix}a_1&b_1&c_1\\a_2&b_2&c_2\\a_3&b_3&c_3\\a_4&b_4&c_4\end{bmatrix}=\begin{bmatrix}a_1+5a_2&b_1+5b_2&c_1+5c_2\\a_2&b_2&c_2\\a_3&b_3&c_3\\a_4&b_4&c_4\end{bmatrix}$

Find a matrix P such that the matrix P A is the reduced row echelon form of A.

21. $A=\begin{bmatrix}-4&4\\11&-3\\1&-1\end{bmatrix}$

22. $A=\begin{bmatrix}1&1\\-1&1\\0&-2\end{bmatrix}$

23. $A=\begin{bmatrix}1&1&3\\-1&1&1\\1&-1&-1\end{bmatrix}$

24. $A=\begin{bmatrix}0&1&1\\-2&2&-2\\1&-1&1\end{bmatrix}$

25. Write the inverse of the matrix
$$A=\begin{bmatrix}1&0&0\\0&2&0\\0&0&1\end{bmatrix}\begin{bmatrix}0&0&1\\0&1&0\\1&0&0\end{bmatrix}$$
as a product of 2 elementary matrices. Calculate A and $A^{-1}$ and verify the result.

26. Write the inverse of the matrix
$$A=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\0&0&3\end{bmatrix}$$
as a product of 2 elementary matrices. Calculate A and $A^{-1}$ and verify the result.

27. Write the inverse of the matrix
$$A=\begin{bmatrix}1&0&4\\0&1&7\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&2&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}$$
as a product of 3 simple matrices. Calculate A and $A^{-1}$ and verify the result.

28. Write the inverse of the matrix
$$A=\begin{bmatrix}2&0&0\\0&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&2&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\1&1&0\\3&0&1\end{bmatrix}$$
as a product of 3 simple matrices. Calculate A and $A^{-1}$ and verify the result.

29. Write the inverse of the matrix
$$A=\begin{bmatrix}1&0&0&0\\0&3&0&0\\0&0&1&0\\0&0&0&1\end{bmatrix}\begin{bmatrix}0&0&1&0\\0&1&0&0\\1&0&0&0\\0&0&0&1\end{bmatrix}$$
as a product of 2 elementary matrices. Calculate A and $A^{-1}$ and verify the result.

30. Write the inverse of the matrix
$$A=\begin{bmatrix}1&0&0&0\\0&0&0&1\\0&0&1&0\\0&1&0&0\end{bmatrix}\begin{bmatrix}1&4&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{bmatrix}$$
as a product of 2 elementary matrices. Calculate A and $A^{-1}$ and verify the result.

31. Write the inverse of the matrix
$$A=\begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&0&1\\0&0&1&0\end{bmatrix}\begin{bmatrix}1&0&0&3\\0&1&0&5\\0&0&1&2\\0&0&0&1\end{bmatrix}$$
as a product of 2 simple matrices. Calculate A and $A^{-1}$ and verify the result.

32. Write the inverse of the matrix
$$A=\begin{bmatrix}1&3&0&0\\0&1&0&0\\0&2&0&1\\0&4&1&0\end{bmatrix}\begin{bmatrix}1&0&2&0\\0&1&1&0\\0&0&1&0\\0&0&1&1\end{bmatrix}$$
as a product of 2 simple matrices. Calculate A and $A^{-1}$ and verify the result.


33. Find a matrix $\begin{bmatrix}x&p&s\\y&q&t\end{bmatrix}$ such that
$$\begin{bmatrix}2&1\\3&4\end{bmatrix}\begin{bmatrix}x&p&s\\y&q&t\end{bmatrix}=\begin{bmatrix}2&1&1\\3&1&2\end{bmatrix}.$$

34. Find a matrix $\begin{bmatrix}x&p&s\\y&q&t\end{bmatrix}$ such that
$$\begin{bmatrix}3&1\\1&1\end{bmatrix}\begin{bmatrix}x&p&s\\y&q&t\end{bmatrix}=\begin{bmatrix}2&5&1\\0&4&2\end{bmatrix}.$$

35. Find a matrix $\begin{bmatrix}x&p\\y&q\\z&r\end{bmatrix}$ such that
$$\begin{bmatrix}2&3&1\\3&4&1\\1&0&2\end{bmatrix}\begin{bmatrix}x&p\\y&q\\z&r\end{bmatrix}=\begin{bmatrix}1&1\\1&1\\1&0\end{bmatrix}.$$

36. Find a matrix $\begin{bmatrix}x&p\\y&q\\z&r\end{bmatrix}$ such that
$$\begin{bmatrix}3&1&1\\1&2&1\\1&2&2\end{bmatrix}\begin{bmatrix}x&p\\y&q\\z&r\end{bmatrix}=\begin{bmatrix}0&2\\2&0\\0&0\end{bmatrix}.$$

Show that the given matrix A is invertible and find its inverse.

37. $A=\begin{bmatrix}2&1&1\\1&3&1\\-1&1&1\end{bmatrix}$

38. $A=\begin{bmatrix}4&3&3\\3&4&3\\3&3&4\end{bmatrix}$

39. $A=\begin{bmatrix}2&1&1\\2&1&2\\2&2&1\end{bmatrix}$

40. $A=\begin{bmatrix}1&1&1\\1&1&2\\1&2&3\end{bmatrix}$

41. $A=\begin{bmatrix}1&1&3&1\\1&1&1&2\\3&1&1&1\\1&0&1&1\end{bmatrix}$

42. $A=\begin{bmatrix}1&1&1&1\\1&2&1&1\\1&1&2&1\\1&1&1&2\end{bmatrix}$

Show that the given matrix A is not invertible.

43. $A=\begin{bmatrix}1&1&2&0\\1&2&1&2\\2&1&1&2\\1&1&1&1\end{bmatrix}$

44. $A=\begin{bmatrix}1&2&2&1\\1&1&1&1\\3&4&1&2\\3&5&2&2\end{bmatrix}$

Express the given matrix A as a product of simple matrices.

45. $A=\begin{bmatrix}1&1&1\\1&1&2\\2&1&1\end{bmatrix}$

46. $A=\begin{bmatrix}1&0&1\\1&2&0\\0&1&1\end{bmatrix}$

Express the inverse of the given matrix A as a product of simple matrices and then find $A^{-1}$.

47. $A=\begin{bmatrix}2&4&1\\1&2&2\\3&5&2\end{bmatrix}$

48. $A=\begin{bmatrix}2&1&1\\2&1&2\\2&2&1\end{bmatrix}$

Determine the LU-decomposition of the given matrix.

49. $\begin{bmatrix}1&0&1\\1&1&1\\1&1&0\end{bmatrix}$

50. $\begin{bmatrix}2&3&0\\0&1&1\\1&0&2\end{bmatrix}$

51. $\begin{bmatrix}2&1&1\\1&2&1\\1&1&2\end{bmatrix}$

52. $\begin{bmatrix}1&2&-1\\2&1&2\\1&1&1\end{bmatrix}$

Chapter 3

The vector space R2

The main objects of interest in this chapter are matrices of the form $\begin{bmatrix}u_1\\u_2\end{bmatrix}$. We are interested in their algebraic properties and their geometric interpretations. Many ideas introduced in this chapter will be generalized in the rest of this book and in the second book of linear algebra. A solid understanding of these ideas in the familiar context of R2 will make it easier to understand their generalizations.
Matrices of the form $\begin{bmatrix}u_1\\u_2\end{bmatrix}$ are usually called vectors. The set of all such vectors is denoted by R2. Elements of R2 are often denoted by $(u_1,u_2)$. We note that
$$\begin{bmatrix}u_1\\u_2\end{bmatrix}=(u_1,u_2)\ne\begin{bmatrix}u_1&u_2\end{bmatrix}.$$
In other words, $(u_1,u_2)$ is not the matrix $\begin{bmatrix}u_1&u_2\end{bmatrix}$.
We will use “a vector in R2”, “an element of R2”, “a 2 × 1 matrix”, and “a point in R2” as interchangeable phrases. Depending on the context, we choose the one that seems most intuitive and thus best facilitating understanding of the discussed idea.

Figure 3.1: Cartesian coordinates.


132 Chapter 3: The vector space R2

In order to interpret R2 geometrically, we identify elements of R2 with points on the plane. This identification is done by introducing on the plane two perpendicular lines, usually called the x-axis and the y-axis. The point of intersection of the axes, called the origin and denoted by 0, is identified with $\begin{bmatrix}0\\0\end{bmatrix}$. On each axis we choose a unit of length (usually the same on both axes). Then every point on the plane can be described by a unique pair of numbers $u_1$ and $u_2$, as illustrated in Figure 3.1. The numbers $u_1$ and $u_2$ are called the Cartesian coordinates of the point.

3.1 Vectors in R2

Algebraic operations and vector lines in R2

Since elements of R2 are matrices, they can be added and multiplied by numbers:
$$\begin{bmatrix}u_1\\u_2\end{bmatrix}+\begin{bmatrix}v_1\\v_2\end{bmatrix}=\begin{bmatrix}u_1+v_1\\u_2+v_2\end{bmatrix}
\quad\text{and}\quad
t\begin{bmatrix}u_1\\u_2\end{bmatrix}=\begin{bmatrix}tu_1\\tu_2\end{bmatrix}.$$

Example 3.1.1. We have
$$\begin{bmatrix}3\\2\end{bmatrix}+\begin{bmatrix}1\\5\end{bmatrix}=\begin{bmatrix}4\\7\end{bmatrix}
\quad\text{and}\quad
2\begin{bmatrix}3\\5\end{bmatrix}=\begin{bmatrix}6\\10\end{bmatrix}.$$

Figure 3.2: Addition in R2.

Addition in R2 has a clear geometric interpretation, as shown in Figure 3.2. Note that we label points as u and v instead of $\begin{bmatrix}u_1\\u_2\end{bmatrix}$ and $\begin{bmatrix}v_1\\v_2\end{bmatrix}$. We will do that often, when it is not essential to mention the coordinates. Observe that the geometric interpretation of addition allows us to add u and v without knowing the coordinates. It suffices to know their position relative to the origin (see Fig. 3.3).

3.1. VECTORS IN R2 133

Figure 3.3: To add u and v it suffices to know the position of u and v relative to the origin.

The geometric interpretation of multiplication by numbers is illustrated in Figure 3.4. Again, it is not necessary to know the coordinates of u, but only the position of u relative to the origin.

Figure 3.4: Multiplication by numbers.

Example 3.1.2. Choose arbitrary a and b in R2 and draw b − a.

Figure 3.5: Solutions for Examples 3.1.2 and 3.1.3.

Example 3.1.3. Choose arbitrary a and b in R2 and draw $2a-\frac{1}{2}b$.



Example 3.1.4. Choose arbitrary a, b, and c in R2 and draw $\frac{1}{2}a-b+\frac{3}{2}c$.

Figure 3.6: A solution for Example 3.1.4.

Using addition and multiplication by numbers we can describe lines in a convenient way. First consider a point u different from the origin, that is $u \ne 0$. Note that when the real number t varies, the point tu moves along the line through u and the origin. When we let t take all real values, we obtain the entire line through u and the origin.

Definition 3.1.5. Let u be a vector in R2. The set of all vectors of the form tu, where t is an arbitrary real number, is called the vector subspace spanned by u and is denoted by Span{u}. That is,

Span{u} = {tu : t in R}.

If u is different from the origin, then Span{u} will be called a vector line.

Figure 3.7: A segment of the vector line Span{u}.

In the definition of vector lines we have to assume that u is different from the origin, because otherwise Span{u} would not be a line, but a point, namely the origin. We adopt the convention that, when we say “a vector line Span{u},” we always implicitly assume that $u \ne 0$.

Theorem 3.1.6. For any two vectors $u \ne 0$ and $v \ne 0$ in R2 the following conditions are equivalent:

(a) Span{u} = Span{v};

(b) v = xu for some real number $x \ne 0$.

Proof. If Span{u} = Span{v}, then v is in Span{u}. This means that there is a real number x such that v = xu. Note that x cannot be 0, because then we would have v = xu = 0u = 0.

If v = xu for some real number $x \ne 0$, then we also have $u = \frac{1}{x}v$. Now, if w is in Span{v}, then there is a number s such that w = sv and thus w = (sx)u, which means that w is in Span{u}.

On the other hand, if w is in Span{u}, then there is a number t such that w = tu and thus $w = \frac{t}{x}v$, which means that w is in Span{v}. Consequently, Span{u} = Span{v}.

Definition 3.1.7. Let u be a vector in R2 . If v is a vector in Span{u} different


from the origin, then {v} is called a basis of Span{u}.

Linearly dependent and independent vectors in R2


If the vector u is in Span{v}, then there is a real number s such that u = sv. In this
case we say that the vector u is linearly dependent of the vector v. Similarly, if the
vector v is in Span{u}, the vector v is linearly dependent of the vector u.

Definition 3.1.8. Vectors u and v from R2 are linearly dependent if at least


one of the following conditions is true:

(a) the vector u is in Span{v};

(b) the vector v is in Span{u}.

Figure 3.8: Linearly dependent vectors.



Intuitively, vectors u and v in R2 are linearly dependent if u, v, and 0 are on the


same line.

Example 3.1.9. Since
$$\begin{bmatrix}6\\3\end{bmatrix}=3\begin{bmatrix}2\\1\end{bmatrix},$$
the vector $\begin{bmatrix}6\\3\end{bmatrix}$ is in $\mathrm{Span}\left\{\begin{bmatrix}2\\1\end{bmatrix}\right\}$ and thus the vectors $\begin{bmatrix}6\\3\end{bmatrix}$ and $\begin{bmatrix}2\\1\end{bmatrix}$ are linearly dependent.

Example 3.1.10. The vectors $u=\begin{bmatrix}4\\-8\end{bmatrix}$ and $v=\begin{bmatrix}-7\\14\end{bmatrix}$ are linearly dependent, because we have $u=-\frac{4}{7}v$ and consequently u is in Span{v}.

Example 3.1.11. The vectors $u=\begin{bmatrix}0\\0\end{bmatrix}$ and $v=\begin{bmatrix}v_1\\v_2\end{bmatrix}$ are linearly dependent, because we have u = 0v.

Similarly, the vectors $u=\begin{bmatrix}u_1\\u_2\end{bmatrix}$ and $v=\begin{bmatrix}0\\0\end{bmatrix}$ are linearly dependent, because we have v = 0u.

In other words, if at least one of the vectors u and v is the zero vector, then u and v are linearly dependent.

Example 3.1.12. The vectors $u=\begin{bmatrix}1\\2\end{bmatrix}$ and $v=\begin{bmatrix}3\\4\end{bmatrix}$ are not linearly dependent.

Indeed, if u was in Span{v}, then we would have u = av for some real number a. But then that number would have to satisfy both equations 1 = 3a and 2 = 4a. This is not possible, since it would mean that $a=\frac{1}{3}$ and at the same time $a=\frac{1}{2}$. This shows that u is not in Span{v}.

Arguing in a similar way we can show that v is not in Span{u}.

The next theorem gives us a useful method for verifying linear dependence without
using the spans.

Theorem 3.1.13. Vectors u and v in R2 are linearly dependent if and only if one of the following conditions holds:

(a) u = 0, or

(b) $u \ne 0$ and v = xu for a real number x.

Proof. If either (a) or (b) holds, then it is clear that the vectors u and v are linearly dependent.

Now suppose that the vectors u and v are linearly dependent. To show that either (a) or (b) holds, it suffices to show that, if $u \ne 0$, then the equation v = xu has a solution.

If v is in Span{u}, then there is a number x such that v = xu and we are done. If u is in Span{v}, there is a number y such that u = yv. Since $u \ne 0$, we must have $y \ne 0$ and thus $v = \frac{1}{y}u$.

The following theorem shows that linear dependence of nonzero vectors is equiv-
alent to the conditions in Theorem 3.1.6.

Theorem 3.1.14. For any two vectors $u \ne 0$ and $v \ne 0$ in R2 the following conditions are equivalent:

(a) The vectors u and v are linearly dependent;

(b) Span{u} = Span{v};

(c) v = xu for a real number $x \ne 0$.

Proof. This theorem is a consequence of Theorems 3.1.6 and 3.1.13.

The following theorem expresses linear dependence in a more algebraic language, namely, in terms of solutions of vector equations and determinants.

Theorem 3.1.15. Let $u=\begin{bmatrix}u_1\\u_2\end{bmatrix}$ and $v=\begin{bmatrix}v_1\\v_2\end{bmatrix}$. The following conditions are equivalent:

(a) The vectors u and v are linearly dependent;

(b) The equation
$$x\begin{bmatrix}u_1\\u_2\end{bmatrix}+y\begin{bmatrix}v_1\\v_2\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix} \tag{3.1}$$
has a nontrivial solution, that is, a solution different from the trivial solution x = y = 0;

(c)
$$\det\begin{bmatrix}u_1&v_1\\u_2&v_2\end{bmatrix}=0. \tag{3.2}$$

Proof. Assume that u and v are linearly dependent. If u is in Span{v}, then u = av for some real number a. Then u − av = 0 and thus x = 1 and y = −a is a nontrivial solution of equation (3.1). The case when v is in Span{u} can be treated in a similar way. Therefore (a) implies (b).

Now assume that equation (3.1) has a nontrivial solution. If $x \ne 0$, then we have
$$\begin{bmatrix}u_1\\u_2\end{bmatrix}=-\frac{y}{x}\begin{bmatrix}v_1\\v_2\end{bmatrix}$$
or, equivalently,
$$u_1=-\frac{y}{x}v_1\quad\text{and}\quad u_2=-\frac{y}{x}v_2.$$
Hence
$$\det\begin{bmatrix}u_1&v_1\\u_2&v_2\end{bmatrix}
=\det\begin{bmatrix}-\frac{y}{x}v_1&v_1\\-\frac{y}{x}v_2&v_2\end{bmatrix}
=-\frac{y}{x}v_1v_2+\frac{y}{x}v_2v_1=0.$$
If $y \ne 0$ we use y instead of x and modify the proof accordingly. Therefore (b) implies (c).

Finally we assume that (3.2) holds. If u = 0, then u is in Span{v} and u and v are linearly dependent. If $u \ne 0$, then at least one of the numbers $u_1$ and $u_2$ must be different from 0. If $u_1 \ne 0$, then from (3.2) we get $v_2=\frac{v_1}{u_1}u_2$ and, consequently,
$$\begin{bmatrix}v_1\\v_2\end{bmatrix}=\frac{v_1}{u_1}\begin{bmatrix}u_1\\u_2\end{bmatrix}$$
because the equality $v_1=\frac{v_1}{u_1}u_1$ is obvious.

This means that v is in Span{u} and consequently u and v are linearly dependent. If $u_2 \ne 0$ we use $u_2$ instead of $u_1$ and modify the proof accordingly. Therefore (c) implies (a).

Example 3.1.16. The vectors $\begin{bmatrix}3\\-9\end{bmatrix}$ and $\begin{bmatrix}-4\\12\end{bmatrix}$ are linearly dependent, because
$$\det\begin{bmatrix}3&-4\\-9&12\end{bmatrix}=3\cdot12-(-4)\cdot(-9)=0.$$
Since
$$\begin{bmatrix}3\\-9\end{bmatrix}=3\begin{bmatrix}1\\-3\end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix}-4\\12\end{bmatrix}=-4\begin{bmatrix}1\\-3\end{bmatrix},$$
we have
$$\begin{bmatrix}-4\\12\end{bmatrix}=-\frac{4}{3}\begin{bmatrix}3\\-9\end{bmatrix}.$$

Example 3.1.17. In Example 3.1.12 we argued that the vectors $u=\begin{bmatrix}1\\2\end{bmatrix}$ and $v=\begin{bmatrix}3\\4\end{bmatrix}$ are not linearly dependent. We can accomplish that more easily using the determinant:
$$\det\begin{bmatrix}1&2\\3&4\end{bmatrix}=1\cdot4-2\cdot3=-2\ne0.$$
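The determinant test of Theorem 3.1.15 is easy to turn into a small Python sketch (`are_linearly_dependent` is our own name; exact integer arithmetic is assumed — with floating-point entries one would compare against a tolerance instead):

```python
def are_linearly_dependent(u, v):
    """Vectors u, v in R^2 are linearly dependent iff det[u v] = 0 (Theorem 3.1.15)."""
    return u[0] * v[1] - u[1] * v[0] == 0

print(are_linearly_dependent((3, -9), (-4, 12)))  # True  (Example 3.1.16)
print(are_linearly_dependent((1, 2), (3, 4)))     # False (Example 3.1.17)
```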

Definition 3.1.18. If the vectors u and v are not linearly dependent, we say
that they are linearly independent. In other words, the vectors u and v are
linearly independent if u is not in Span{v} and v is not in Span{u}.

Figure 3.9: Linearly independent vectors.

Vectors u and v in Example 3.1.17 are linearly independent.


The following theorem is a reformulation of Theorem 3.1.15 in terms of linear
independence.

Theorem 3.1.19. Let $u=\begin{bmatrix}u_1\\u_2\end{bmatrix}$ and $v=\begin{bmatrix}v_1\\v_2\end{bmatrix}$. The following conditions are equivalent:

(a) The vectors u and v are linearly independent;

(b) The equation
$$x\begin{bmatrix}u_1\\u_2\end{bmatrix}+y\begin{bmatrix}v_1\\v_2\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}$$
has only the trivial solution, that is, x = y = 0;

(c)
$$\det\begin{bmatrix}u_1&v_1\\u_2&v_2\end{bmatrix}\ne0.$$

Theorem 1.3.9 connects invertibility of a matrix with a condition on its determinant. The above theorem relates the determinant of a matrix to linear independence of its columns. When we put these two facts together we obtain the following interesting and useful result.

Theorem 3.1.20. Let $a=\begin{bmatrix}a_1\\a_2\end{bmatrix}$ and $b=\begin{bmatrix}b_1\\b_2\end{bmatrix}$. The matrix $\begin{bmatrix}a_1&b_1\\a_2&b_2\end{bmatrix}=\begin{bmatrix}a&b\end{bmatrix}$ is invertible if and only if the column vectors a and b are linearly independent.

Bases in R2
If v is any nonzero vector on the vector line Span{u}, then any other vector from
Span{u} can be written as av for some real number a. In other words, the whole
vector line can be reconstructed from any nonzero vector on that vector line. Now
we are going to consider a similar question for the whole plane R2 instead of a vector line in R2. It leads to the idea of a basis. As we will see later, this is an idea of
fundamental importance in linear algebra.

Definition 3.1.21. A pair of vectors {a, b} in R2 is called a basis of R2 if the


vectors satisfy the following two conditions:

(a) a and b are linearly independent;

(b) For every c in R2 there are real numbers x and y such that c = xa + yb.

In other words, by a basis of R2 we mean a pair of linearly independent vectors


a and b such that every vector in R2 can be written in the form xa + yb, so that the

whole plane can be constructed this way from a and b.

The expression of the form xa + yb is referred to as a linear combination of the vectors a and b. Condition (b) in the above definition can be stated as “every vector in R2 is a linear combination of a and b.”
It turns out that it is not necessary to assume both conditions in the definition
of a basis in R2 . The next theorem shows that each one of the conditions implies the
other one.

Theorem 3.1.22. Let a and b be vectors in R2 . The following two conditions


are equivalent:

(a) a and b are linearly independent;

(b) For every c in R2 there are real numbers x and y such that c = xa + yb.

Proof. Let $a=\begin{bmatrix}a_1\\a_2\end{bmatrix}$ and $b=\begin{bmatrix}b_1\\b_2\end{bmatrix}$.

First we observe that, by Theorem 3.1.19, the vectors a and b are linearly independent if and only if $\det\begin{bmatrix}a_1&b_1\\a_2&b_2\end{bmatrix}\ne0$. Then we note that, if $\det\begin{bmatrix}a_1&b_1\\a_2&b_2\end{bmatrix}\ne0$, then the matrix equation
$$\begin{bmatrix}a_1&b_1\\a_2&b_2\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}c_1\\c_2\end{bmatrix}$$
has a unique solution for every $c=\begin{bmatrix}c_1\\c_2\end{bmatrix}$, by Theorem 1.3.10. But this means that for every c in R2 there are real numbers x and y such that c = xa + yb.

Now suppose that for every c in R2 there are real numbers x and y such that c = xa + yb. Then there are real numbers $x_1, y_1, x_2, y_2$ such that
$$\begin{bmatrix}1\\0\end{bmatrix}=x_1a+y_1b
\quad\text{and}\quad
\begin{bmatrix}0\\1\end{bmatrix}=x_2a+y_2b,$$
which can be written as
$$\begin{bmatrix}1&0\\0&1\end{bmatrix}=\begin{bmatrix}a&b\end{bmatrix}\begin{bmatrix}x_1&x_2\\y_1&y_2\end{bmatrix}.$$
But this means, by Theorem 1.2.18, that the matrix $\begin{bmatrix}a&b\end{bmatrix}$ is invertible, which is equivalent to the fact that the vectors a and b are linearly independent, by Theorem 3.1.20.

Note that from the above proof it follows that, if the vectors a and b are linearly
independent, then the numbers x and y such that c = xa + yb are unique for every c
in R2 . Those unique numbers x and y are called the coordinates of the vector c with
respect to the basis {a, b}.
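Finding the coordinates of c with respect to a basis {a, b} amounts to solving the 2 × 2 system in the proof above. A small Python sketch (using Cramer's rule; `coordinates` is our own helper and the sample vectors are our own):

```python
def coordinates(c, a, b):
    """Coordinates (x, y) of c with respect to the basis {a, b} of R^2.

    Solves x*a + y*b = c by Cramer's rule; requires det[a b] != 0,
    i.e. that a and b are linearly independent.
    """
    det = a[0] * b[1] - a[1] * b[0]
    x = (c[0] * b[1] - c[1] * b[0]) / det
    y = (a[0] * c[1] - a[1] * c[0]) / det
    return x, y

# c = 2a - b for a = (1, 1) and b = (-1, 1):
print(coordinates((3, 1), (1, 1), (-1, 1)))  # (2.0, -1.0)
```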

3.1.1 Exercises

1. Determine d such that b − a = d − c where $a=\begin{bmatrix}4\\1\end{bmatrix}$, $b=\begin{bmatrix}2\\3\end{bmatrix}$, and $c=\begin{bmatrix}3\\0\end{bmatrix}$. Draw a, b, c and d.

2. Determine d such that b − a = d − c where $a=\begin{bmatrix}0\\2\end{bmatrix}$, $b=\begin{bmatrix}3\\7\end{bmatrix}$, and $c=\begin{bmatrix}2\\-1\end{bmatrix}$. Draw a, b, c and d.

Draw vectors a and b and then draw the given vector.

3. $2a+\frac{1}{3}b$

4. $\frac{1}{2}a-b$

5. $-\frac{3}{4}a+2b$

6. $-\frac{2}{3}a+\frac{1}{2}b$

7. $-0.5a-0.75b$

8. $2(a+b)-\frac{1}{2}(a-b)$

Draw vectors a, b, and c and then draw the given vector.

9. $\frac{1}{2}a+b-\frac{5}{3}c$

10. $a-2b-\frac{1}{2}c$

11. $-\frac{3}{4}a+\frac{1}{3}b+3c$

12. $0.2a-0.5b+0.7c$

Show that the given vectors u and v are linearly dependent.

13. $u=\begin{bmatrix}5\\-5\end{bmatrix}$, $v=\begin{bmatrix}-7\\7\end{bmatrix}$

14. $u=\begin{bmatrix}8\\-4\end{bmatrix}$, $v=\begin{bmatrix}-10\\5\end{bmatrix}$

Show that the given vectors u and v are linearly independent.

15. $u=\begin{bmatrix}5\\-5\end{bmatrix}$, $v=\begin{bmatrix}7\\7\end{bmatrix}$

16. $u=\begin{bmatrix}8\\-2\end{bmatrix}$, $v=\begin{bmatrix}-10\\5\end{bmatrix}$

Find a real number a such that the given vectors u and v are linearly dependent.

17. $u=\begin{bmatrix}a\\3\end{bmatrix}$, $v=\begin{bmatrix}7\\-2\end{bmatrix}$

18. $u=\begin{bmatrix}a\\2\end{bmatrix}$, $v=\begin{bmatrix}8\\a\end{bmatrix}$

19. $u=\begin{bmatrix}a\\5\end{bmatrix}$, $v=\begin{bmatrix}-7\\a\end{bmatrix}$

20. $u=\begin{bmatrix}a\\3a\end{bmatrix}$, $v=\begin{bmatrix}7\\a\end{bmatrix}$

21. $u=\begin{bmatrix}a\\3\end{bmatrix}$, $v=\begin{bmatrix}3\\8+a\end{bmatrix}$

22. $u=\begin{bmatrix}2-a\\2\end{bmatrix}$, $v=\begin{bmatrix}2\\5-a\end{bmatrix}$

23. $u=\begin{bmatrix}5-a\\2\end{bmatrix}$, $v=\begin{bmatrix}3\\4-a\end{bmatrix}$

24. $u=\begin{bmatrix}a\\a^2-4\end{bmatrix}$, $v=\begin{bmatrix}3a\\a+2\end{bmatrix}$

3.2. THE DOT PRODUCT AND THE PROJECTION ON A VECTOR LINE IN R2 143

25. Show that $\left\{\begin{bmatrix}5\\2\end{bmatrix},\begin{bmatrix}-1\\4\end{bmatrix}\right\}$ is a basis in R2.

26. Show that $\left\{\begin{bmatrix}a\\1\end{bmatrix},\begin{bmatrix}-1\\a\end{bmatrix}\right\}$ is a basis in R2 for any real number a.

27. Show that $\left\{\begin{bmatrix}a\\b\end{bmatrix},\begin{bmatrix}c\\a\end{bmatrix}\right\}$ is a basis in R2 for any b < 0 and c > 0.

28. Show that $\left\{\begin{bmatrix}a\\b\end{bmatrix},\begin{bmatrix}-b\\a\end{bmatrix}\right\}$ is a basis in R2 for any b ∈ R such that $b \ne 0$.

Find the coordinates of the given vector u with respect to the given basis B.

29. $u=\begin{bmatrix}1\\2\end{bmatrix}$, $B=\left\{\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}1\\0\end{bmatrix}\right\}$

30. $u=\begin{bmatrix}3\\1\end{bmatrix}$, $B=\left\{\begin{bmatrix}1\\-1\end{bmatrix},\begin{bmatrix}1\\2\end{bmatrix}\right\}$

31. $u=\begin{bmatrix}1\\-2\end{bmatrix}$, $B=\left\{\begin{bmatrix}3\\1\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\right\}$

32. $u=\begin{bmatrix}1\\2\end{bmatrix}$, $B=\left\{\begin{bmatrix}3\\1\end{bmatrix},\begin{bmatrix}2\\4\end{bmatrix}\right\}$

33. Find the coordinates of the vector $\begin{bmatrix}1\\0\end{bmatrix}$ with respect to the basis $\left\{\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\right\}$ and to the basis $\left\{\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}1\\3\end{bmatrix}\right\}$.

34. Find the coordinates of the vector $\begin{bmatrix}2\\-1\end{bmatrix}$ with respect to the basis $\left\{\begin{bmatrix}2\\1\end{bmatrix},\begin{bmatrix}1\\0\end{bmatrix}\right\}$ and to the basis $\left\{\begin{bmatrix}2\\1\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\right\}$.

3.2 The dot product and the projection on a vector line in R2
The dot product in R2

Definition 3.2.1. For a vector $u=\begin{bmatrix}u_1\\u_2\end{bmatrix}$ we define
$$\|u\|=\left\|\begin{bmatrix}u_1\\u_2\end{bmatrix}\right\|=\sqrt{u_1^2+u_2^2}.$$
The number $\|u\|$ is called the norm of u.



Geometrically $\|u\|$ is the distance from an arbitrary point $u=\begin{bmatrix}u_1\\u_2\end{bmatrix}$ to the origin. The number $\|u-v\|$ is the distance from the point $u=\begin{bmatrix}u_1\\u_2\end{bmatrix}$ to the point $v=\begin{bmatrix}v_1\\v_2\end{bmatrix}$. Indeed,
$$\|u-v\|=\left\|\begin{bmatrix}u_1-v_1\\u_2-v_2\end{bmatrix}\right\|=\sqrt{(u_1-v_1)^2+(u_2-v_2)^2}.$$

Figure 3.10: Norm as the distance between points.

If $\|u\|=1$, then we say that u is a unit vector. If u is a nonzero vector, then the vector $\frac{1}{\|u\|}u$ is a unit vector. If we multiply a nonzero vector u by $\frac{1}{\|u\|}$, we say that we normalize the vector u.

Example 3.2.2.
$$\left\|\begin{bmatrix}3\\4\end{bmatrix}\right\|=\sqrt{3^2+4^2}=\sqrt{9+16}=\sqrt{25}=5.$$

Example 3.2.3. The distance between $\begin{bmatrix}5\\1\end{bmatrix}$ and $\begin{bmatrix}3\\4\end{bmatrix}$ is
$$\left\|\begin{bmatrix}5\\1\end{bmatrix}-\begin{bmatrix}3\\4\end{bmatrix}\right\|=\left\|\begin{bmatrix}2\\-3\end{bmatrix}\right\|=\sqrt{2^2+(-3)^2}=\sqrt{4+9}=\sqrt{13}.$$

Addition of vectors in R2, multiplication of a vector in R2 by a number, and the norm of a vector in R2 are similar to the familiar concepts from the algebra of real numbers. Now we are going to introduce an operation of “multiplication” of two vectors in R2. While it generalizes multiplication of real numbers, it is different from that operation, because the product of two vectors is not a vector, but a number.

Definition 3.2.4. For arbitrary u and v in R2, by the dot product of u and v we mean the real number u • v defined as follows:
$$u\bullet v=\begin{bmatrix}u_1\\u_2\end{bmatrix}\bullet\begin{bmatrix}v_1\\v_2\end{bmatrix}=u_1v_1+u_2v_2.$$

Example 3.2.5.
$$\begin{bmatrix}3\\5\end{bmatrix}\bullet\begin{bmatrix}2\\-6\end{bmatrix}=3\cdot2+5\cdot(-6)=6-30=-24.$$

At this point the dot product has no obvious geometric meaning, but later we
will learn that it has an important geometric interpretation.
Since
u • 0 = 0,

for every u in R2 , the vector 0 seems to play a role similar to the role of zero in mul-
tiplication of numbers. The dot product has other properties that are similar to the
properties of multiplication of numbers. For example, it is easy to verify that

u•v = v•u
u • (v + w) = u • v + u • w
t (u • v) = (t u) • v = u • (t v)

for arbitrary u, v, and w in R2 and real number t .


On the other hand,
$$\begin{bmatrix}1\\2\end{bmatrix}\bullet\begin{bmatrix}-6\\3\end{bmatrix}=0,$$

which is very different from our experience with real numbers, where the product of
two nonzero numbers is always different from zero.
Here is a nice and useful connection between the dot product and the norm:
$$u\bullet u=\|u\|^2\quad\text{or}\quad\|u\|=\sqrt{u\bullet u}.$$

Definition 3.2.6. Two vectors u and v in R2 are called orthogonal vectors if

||u − v|| = ||u + v||.

Orthogonality is closely related to perpendicularity of lines. In elementary ge-


ometry, lines are called perpendicular if they intersect at a right angle. If u and v are
different from the origin, then u and v are orthogonal if and only if the vector lines
Span{u} and Span{v} are perpendicular, see Figure 3.11.

Figure 3.11: The lines Span{u} and Span{v} are perpendicular.

Example 3.2.7. $\begin{bmatrix}3\\2\end{bmatrix}$ and $\begin{bmatrix}-6\\9\end{bmatrix}$ are orthogonal because
$$\left\|\begin{bmatrix}3\\2\end{bmatrix}-\begin{bmatrix}-6\\9\end{bmatrix}\right\|=\left\|\begin{bmatrix}9\\-7\end{bmatrix}\right\|=\sqrt{9^2+(-7)^2}=\sqrt{81+49}=\sqrt{130}$$
and
$$\left\|\begin{bmatrix}3\\2\end{bmatrix}+\begin{bmatrix}-6\\9\end{bmatrix}\right\|=\left\|\begin{bmatrix}-3\\11\end{bmatrix}\right\|=\sqrt{(-3)^2+11^2}=\sqrt{9+121}=\sqrt{130}.$$

Here is another interesting connection between the dot product and the norm.

Theorem 3.2.8 (Parallelogram law). For arbitrary u and v in R2 we have
$$u\bullet v=\frac{\|u+v\|^2-\|u-v\|^2}{4}.$$

Proof. Since
$$\|u+v\|^2=(u+v)\bullet(u+v)=u\bullet u+2(u\bullet v)+v\bullet v$$
and
$$\|u-v\|^2=(u-v)\bullet(u-v)=u\bullet u-2(u\bullet v)+v\bullet v,$$
we have
$$\|u+v\|^2-\|u-v\|^2=4\,u\bullet v.$$

From the above theorem we obtain a result which gives us the first hint of the geometric meaning of the dot product.

Theorem 3.2.9. Vectors u and v are orthogonal if and only if u • v = 0.

Proof. It is obvious from Theorem 3.2.8 that u • v = 0 if and only if ||u − v|| = ||u + v||.

Theorem 3.2.10 (The Pythagorean Theorem). If the vectors u and v are orthogonal, then
$$\|u+v\|^2=\|u\|^2+\|v\|^2.$$

Proof. If u and v are orthogonal, then u • v = 0 and thus
$$\|u+v\|^2=(u+v)\bullet(u+v)=u\bullet u+2(u\bullet v)+v\bullet v=\|u\|^2+\|v\|^2.$$

It is clear from the above proof and Theorem 3.2.9 that

u and v are orthogonal if and only if $\|u+v\|^2=\|u\|^2+\|v\|^2$.
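Theorem 3.2.9 gives a one-line computational test for orthogonality; the sketch below (our own helper name, exact integer arithmetic assumed) also confirms the Pythagorean identity for the vectors of Example 3.2.7:

```python
def dot(u, v):
    """The dot product of u and v in R^2."""
    return u[0] * v[0] + u[1] * v[1]

u, v = (3, 2), (-6, 9)
print(dot(u, v) == 0)  # True: u and v are orthogonal (Theorem 3.2.9)

# Pythagorean Theorem: ||u + v||^2 = ||u||^2 + ||v||^2 for orthogonal u and v.
s = (u[0] + v[0], u[1] + v[1])
print(dot(s, s) == dot(u, u) + dot(v, v))  # True: both sides equal 130
```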

Example 3.2.11. In Example 3.2.7 we have seen that $\begin{bmatrix}3\\2\end{bmatrix}$ and $\begin{bmatrix}-6\\9\end{bmatrix}$ are orthogonal. As expected from the Pythagorean Theorem, we have

$$\left\|\begin{bmatrix}3\\2\end{bmatrix}+\begin{bmatrix}-6\\9\end{bmatrix}\right\|^2 = \left\|\begin{bmatrix}-3\\11\end{bmatrix}\right\|^2 = (-3)^2+11^2 = 130$$

and

$$\left\|\begin{bmatrix}3\\2\end{bmatrix}\right\|^2 + \left\|\begin{bmatrix}-6\\9\end{bmatrix}\right\|^2 = (3^2+2^2) + ((-6)^2+9^2) = 130.$$

Now we consider $\begin{bmatrix}1\\2\end{bmatrix}$ and $\begin{bmatrix}3\\5\end{bmatrix}$. Since

$$\begin{bmatrix}1\\2\end{bmatrix}\bullet\begin{bmatrix}3\\5\end{bmatrix} = 1\cdot 3 + 2\cdot 5 = 13,$$

the vectors $\begin{bmatrix}1\\2\end{bmatrix}$ and $\begin{bmatrix}3\\5\end{bmatrix}$ are not orthogonal. As expected, in this case the numbers

$$\left\|\begin{bmatrix}1\\2\end{bmatrix}+\begin{bmatrix}3\\5\end{bmatrix}\right\|^2 = \left\|\begin{bmatrix}4\\7\end{bmatrix}\right\|^2 = 65$$

and

$$\left\|\begin{bmatrix}1\\2\end{bmatrix}\right\|^2 + \left\|\begin{bmatrix}3\\5\end{bmatrix}\right\|^2 = 5 + 34 = 39$$

are different.
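These numerical checks are easy to automate. The short Python sketch below (the helper functions `dot`, `norm`, `add`, and `sub` are ours, not part of the text) verifies the computations of Examples 3.2.7 and 3.2.11:

```python
import math

def dot(u, v):
    # dot product of two vectors in R^2
    return u[0] * v[0] + u[1] * v[1]

def norm(u):
    # Euclidean norm ||u|| = sqrt(u . u)
    return math.sqrt(dot(u, u))

def add(u, v):
    return [u[0] + v[0], u[1] + v[1]]

def sub(u, v):
    return [u[0] - v[0], u[1] - v[1]]

u, v = [3, 2], [-6, 9]               # the orthogonal pair of Example 3.2.7
assert dot(u, v) == 0                # u . v = 0
assert norm(sub(u, v)) == norm(add(u, v))   # ||u - v|| = ||u + v|| = sqrt(130)
assert math.isclose(norm(add(u, v)) ** 2, norm(u) ** 2 + norm(v) ** 2)

w, z = [1, 2], [3, 5]                # the non-orthogonal pair of Example 3.2.11
assert dot(w, z) == 13
assert norm(add(w, z)) ** 2 != norm(w) ** 2 + norm(z) ** 2   # 65 versus 39
```

The three assertions for u and v illustrate Theorem 3.2.9: the three orthogonality criteria hold or fail together.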

Note that the dot product can be interpreted as a product of matrices:

$$\begin{bmatrix}u_1\\u_2\end{bmatrix}\bullet\begin{bmatrix}v_1\\v_2\end{bmatrix} = \begin{bmatrix}u_1 & u_2\end{bmatrix}\begin{bmatrix}v_1\\v_2\end{bmatrix} = \begin{bmatrix}v_1 & v_2\end{bmatrix}\begin{bmatrix}u_1\\u_2\end{bmatrix}. \tag{3.3}$$

The above property can also be written as $\mathbf{u}\bullet\mathbf{v} = \mathbf{u}^T\mathbf{v} = \mathbf{v}^T\mathbf{u}$. This simple observation is often useful in calculations, see for example the proof of Theorem 3.2.17.

The projection on a vector line in R2


Consider a vector line Span{u} and a point b not on the line, see Figure 3.12. For
every point a on the vector line we can measure the distance between b and a. There
must be a point p on the vector line for which the distance is the smallest. If p is
such a point, we say that p minimizes the distance from b and we call p the best
approximation to the point b by elements of the vector line Span{u}. It turns out
that linear algebra provides useful tools for dealing with such problems.

Theorem 3.2.12. Consider a nonzero vector u in R². For any point b in R² there is a unique p on the vector line Span{u} such that

$$\|\mathbf{b}-\mathbf{p}\| \le \|\mathbf{b}-t\mathbf{u}\| \tag{3.4}$$

for all t in R. That unique point is

$$\mathbf{p} = \frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|^2}\,\mathbf{u}.$$
Figure 3.12: Best approximation to a point by elements of a vector line (p is the point on Span{u} minimizing the distance from b).

Proof. Since

$$\begin{aligned}
\|\mathbf{b}-t\mathbf{u}\|^2 &= (\mathbf{b}-t\mathbf{u})\bullet(\mathbf{b}-t\mathbf{u}) = \|\mathbf{b}\|^2 - 2t\,\mathbf{b}\bullet\mathbf{u} + t^2\|\mathbf{u}\|^2 \\
&= \|\mathbf{b}\|^2 - \frac{(\mathbf{b}\bullet\mathbf{u})^2}{\|\mathbf{u}\|^2} + \frac{(\mathbf{b}\bullet\mathbf{u})^2}{\|\mathbf{u}\|^2} - 2t\,\mathbf{b}\bullet\mathbf{u} + t^2\|\mathbf{u}\|^2 \\
&= \|\mathbf{b}\|^2 - \frac{(\mathbf{b}\bullet\mathbf{u})^2}{\|\mathbf{u}\|^2} + \left(\frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|} - t\|\mathbf{u}\|\right)^2,
\end{aligned}$$

the norm $\|\mathbf{b}-t\mathbf{u}\|$ is minimized when $\frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|} - t\|\mathbf{u}\| = 0$. Solving for t we get $t = \frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|^2}$. Moreover, since

$$\left\|\mathbf{b} - \frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|^2}\,\mathbf{u}\right\| < \|\mathbf{b}-t\mathbf{u}\|$$

for all $t \ne \frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|^2}$, the point minimizing the distance is unique.

The point

$$\mathbf{p} = \frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|^2}\,\mathbf{u}$$

is called the best approximation to the point b by elements of the vector line Span{u}. In calculating the best approximation we often use the identity

$$\frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|^2}\,\mathbf{u} = \frac{\mathbf{b}\bullet\mathbf{u}}{\mathbf{u}\bullet\mathbf{u}}\,\mathbf{u}.$$

We note that if the point b is on the vector line Span{u} then the best approximation to the point b by elements of the vector line Span{u} is b.
Figure 3.13: (b − p) • u = 0.

Theorem 3.2.13. Consider a point b and a vector line Span{u} in R². The best approximation p to the point b by elements of the vector line Span{u} can be characterized as the point p in Span{u} satisfying the equation

$$(\mathbf{b}-\mathbf{p})\bullet\mathbf{u} = 0.$$

In other words, p is the best approximation to b by elements of the line Span{u} if and only if (b − p) • u = 0.

Proof. The point p, being on the vector line Span{u}, is of the form tu. Note that

$$(\mathbf{b}-t\mathbf{u})\bullet\mathbf{u} = 0$$

if and only if

$$\mathbf{b}\bullet\mathbf{u} - t\|\mathbf{u}\|^2 = 0$$

if and only if

$$t = \frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|^2}.$$

Hence (b − p) • u = 0 if and only if

$$\mathbf{p} = \frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|^2}\,\mathbf{u},$$

which is the point obtained in Theorem 3.2.12.

In view of Theorem 3.2.13, the best approximation to the point b by elements of the vector line Span{u} is also called the projection of b on the vector line Span{u} and is denoted by $\operatorname{proj}_{\operatorname{Span}\{\mathbf{u}\}}\mathbf{b}$ or simply $\operatorname{proj}_{\mathbf{u}}\mathbf{b}$:

$$\operatorname{proj}_{\mathbf{u}}\mathbf{b} = \frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|^2}\,\mathbf{u} = \frac{\mathbf{b}\bullet\mathbf{u}}{\mathbf{u}\bullet\mathbf{u}}\,\mathbf{u}.$$
Example 3.2.14. Determine the projection of $\begin{bmatrix}1\\4\end{bmatrix}$ on the vector line $\operatorname{Span}\left\{\begin{bmatrix}3\\1\end{bmatrix}\right\}$.

Solution. The projection is

$$\frac{\begin{bmatrix}1\\4\end{bmatrix}\bullet\begin{bmatrix}3\\1\end{bmatrix}}{\left\|\begin{bmatrix}3\\1\end{bmatrix}\right\|^2}\begin{bmatrix}3\\1\end{bmatrix} = \frac{7}{10}\begin{bmatrix}3\\1\end{bmatrix} = \begin{bmatrix}\frac{21}{10}\\[2pt]\frac{7}{10}\end{bmatrix}.$$
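The projection formula of Theorem 3.2.12 translates into a one-line computation. The Python sketch below (the helper `proj` is ours, not from the text) reproduces Example 3.2.14:

```python
def proj(b, u):
    # projection of b on the vector line Span{u}, u nonzero:
    # p = (b . u / u . u) u
    c = (b[0] * u[0] + b[1] * u[1]) / (u[0] * u[0] + u[1] * u[1])
    return [c * u[0], c * u[1]]

p = proj([1, 4], [3, 1])                 # Example 3.2.14
assert abs(p[0] - 2.1) < 1e-12           # 21/10
assert abs(p[1] - 0.7) < 1e-12           # 7/10
```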

Example 3.2.15. Let q and u be points in R² with u ≠ 0. Find a point s such that p = ½(q + s), where p is the projection of the point q on Span{u}. Choose q and u and draw s.

Solution. Since the projection p of the point q on the vector line Span{u} is $\frac{\mathbf{q}\bullet\mathbf{u}}{\mathbf{u}\bullet\mathbf{u}}\,\mathbf{u}$, the point s must satisfy the equation

$$\frac{1}{2}(\mathbf{q}+\mathbf{s}) = \frac{\mathbf{q}\bullet\mathbf{u}}{\mathbf{u}\bullet\mathbf{u}}\,\mathbf{u}.$$

Solving for s we get

$$\mathbf{s} = 2\,\frac{\mathbf{q}\bullet\mathbf{u}}{\mathbf{u}\bullet\mathbf{u}}\,\mathbf{u} - \mathbf{q}.$$

Figure 3.14: A solution for Example 3.2.15.


Definition 3.2.16. Let q and u be points in R² with u ≠ 0. The point s such that

$$\operatorname{proj}_{\mathbf{u}}\mathbf{q} = \frac{1}{2}(\mathbf{q}+\mathbf{s})$$

is called the reflection of the point q across the vector line Span{u}.
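Definition 3.2.16 gives a direct recipe for computing reflections: s = 2 proj_u q − q. A short Python sketch (the helper `reflect` is ours, not part of the text):

```python
def reflect(q, u):
    # reflection of q across Span{u}: s = 2 * proj_u(q) - q
    c = (q[0] * u[0] + q[1] * u[1]) / (u[0] * u[0] + u[1] * u[1])
    return [2 * c * u[0] - q[0], 2 * c * u[1] - q[1]]

# reflecting across Span{[1, 1]} swaps the coordinates
assert reflect([2.0, 5.0], [1, 1]) == [5.0, 2.0]

# reflecting twice returns the original point (up to rounding)
r = reflect(reflect([2.0, 5.0], [3, 1]), [3, 1])
assert abs(r[0] - 2.0) < 1e-9 and abs(r[1] - 5.0) < 1e-9
```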

The projection matrix on a vector line in R2


In the following theorem we give another way to calculate the projection of a point
on a vector line. It uses the equality (3.3).

Theorem 3.2.17. Let u be a nonzero vector in R² and let

$$A = \frac{1}{\|\mathbf{u}\|^2}\,\mathbf{u}\mathbf{u}^T.$$

Then for any b in R² we have

$$\operatorname{proj}_{\mathbf{u}}\mathbf{b} = A\mathbf{b}.$$

Moreover, the matrix A is the unique matrix with this property.

Proof. Let $\mathbf{u} = \begin{bmatrix}u_1\\u_2\end{bmatrix} \ne \begin{bmatrix}0\\0\end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix}b_1\\b_2\end{bmatrix}$. Then

$$\begin{aligned}
\operatorname{proj}_{\mathbf{u}}\mathbf{b} &= \frac{\mathbf{b}\bullet\mathbf{u}}{\|\mathbf{u}\|^2}\,\mathbf{u} = \frac{\mathbf{u}\bullet\mathbf{b}}{\|\mathbf{u}\|^2}\,\mathbf{u} = \frac{\begin{bmatrix}u_1 & u_2\end{bmatrix}\begin{bmatrix}b_1\\b_2\end{bmatrix}}{\|\mathbf{u}\|^2}\begin{bmatrix}u_1\\u_2\end{bmatrix} \\
&= \frac{1}{\|\mathbf{u}\|^2}\begin{bmatrix}u_1\\u_2\end{bmatrix}\left(\begin{bmatrix}u_1 & u_2\end{bmatrix}\begin{bmatrix}b_1\\b_2\end{bmatrix}\right) = \frac{1}{\|\mathbf{u}\|^2}\left(\begin{bmatrix}u_1\\u_2\end{bmatrix}\begin{bmatrix}u_1 & u_2\end{bmatrix}\right)\begin{bmatrix}b_1\\b_2\end{bmatrix} \\
&= \frac{1}{\|\mathbf{u}\|^2}\,(\mathbf{u}\mathbf{u}^T)\mathbf{b}.
\end{aligned}$$

The uniqueness part of the theorem follows from the fact that, if Ax = Bx for every vector x in R², then A = B, by Theorem 2.1.18.
Definition 3.2.18. Let u be a nonzero vector in R². The matrix

$$\frac{1}{\|\mathbf{u}\|^2}\,\mathbf{u}\mathbf{u}^T \tag{3.5}$$

is called the projection matrix on the vector line Span{u}.

Example 3.2.19. Let $\mathbf{u} = \begin{bmatrix}1\\3\end{bmatrix}$. Determine the projection matrix on Span{u} and use it to calculate the projection of the vector $\mathbf{b} = \begin{bmatrix}2\\1\end{bmatrix}$ on Span{u}.

Solution. The projection matrix on Span{u} is

$$\frac{1}{\|\mathbf{u}\|^2}\,\mathbf{u}\mathbf{u}^T = \frac{1}{10}\begin{bmatrix}1\\3\end{bmatrix}\begin{bmatrix}1 & 3\end{bmatrix} = \frac{1}{10}\begin{bmatrix}1 & 3\\3 & 9\end{bmatrix}$$

and the projection of $\mathbf{b} = \begin{bmatrix}2\\1\end{bmatrix}$ on Span{u} is

$$\operatorname{proj}_{\mathbf{u}}\mathbf{b} = \frac{1}{\|\mathbf{u}\|^2}(\mathbf{u}\mathbf{u}^T)\mathbf{b} = \frac{1}{10}\begin{bmatrix}1 & 3\\3 & 9\end{bmatrix}\begin{bmatrix}2\\1\end{bmatrix} = \frac{1}{10}\begin{bmatrix}5\\15\end{bmatrix} = \begin{bmatrix}\frac12\\[2pt]\frac32\end{bmatrix}.$$
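The projection matrix (3.5) can be formed and applied numerically. The Python sketch below (our own helpers, not from the text) reproduces Example 3.2.19 and checks that a projection matrix is idempotent:

```python
def projection_matrix(u):
    # P = (1/||u||^2) * u u^T, as a nested list [[.,.],[.,.]]
    n2 = u[0] * u[0] + u[1] * u[1]
    return [[u[0] * u[0] / n2, u[0] * u[1] / n2],
            [u[1] * u[0] / n2, u[1] * u[1] / n2]]

def matvec(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

P = projection_matrix([1, 3])
assert P == [[0.1, 0.3], [0.3, 0.9]]          # (1/10) [1 3; 3 9]
p = matvec(P, [2, 1])
assert p == [0.5, 1.5]                        # Example 3.2.19

# projecting an already-projected point changes nothing
p2 = matvec(P, p)
assert abs(p2[0] - p[0]) < 1e-12 and abs(p2[1] - p[1]) < 1e-12
```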

The perp operation


Now we introduce another simple but useful operation in R2 .

Definition 3.2.20. For an arbitrary vector $\begin{bmatrix}a_1\\a_2\end{bmatrix}$ in R² we define

$$\begin{bmatrix}a_1\\a_2\end{bmatrix}^{\perp} = \begin{bmatrix}-a_2\\a_1\end{bmatrix}.$$

This operation is called the perp operation and $\mathbf{a}^{\perp}$ is read "a perp."

From the definition of the perp operation we easily obtain the following identities:

$$\|\mathbf{a}^{\perp}\| = \|\mathbf{a}\|, \qquad \mathbf{a}\bullet\mathbf{a}^{\perp} = 0, \qquad (\mathbf{a}^{\perp})^{\perp} = -\mathbf{a}.$$

Figure 3.15: The perp operation.

Example 3.2.21.

$$\begin{bmatrix}1\\2\end{bmatrix}^{\perp} = \begin{bmatrix}-2\\1\end{bmatrix}, \qquad \left\|\begin{bmatrix}1\\2\end{bmatrix}\right\| = \sqrt5 = \left\|\begin{bmatrix}-2\\1\end{bmatrix}\right\|,$$

$$\begin{bmatrix}1\\2\end{bmatrix}\bullet\begin{bmatrix}1\\2\end{bmatrix}^{\perp} = \begin{bmatrix}1\\2\end{bmatrix}\bullet\begin{bmatrix}-2\\1\end{bmatrix} = 1\cdot(-2) + 2\cdot 1 = 0, \qquad \left(\begin{bmatrix}1\\2\end{bmatrix}^{\perp}\right)^{\perp} = \begin{bmatrix}-2\\1\end{bmatrix}^{\perp} = \begin{bmatrix}-1\\-2\end{bmatrix}.$$
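The perp operation and its three identities can be checked mechanically; the arithmetic is exact since only integers are involved. A Python sketch (our own notation: `perp(a)` stands for a⊥):

```python
def perp(a):
    # a-perp: rotate a by 90 degrees counterclockwise
    return [-a[1], a[0]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

a = [1, 2]
assert perp(a) == [-2, 1]                    # Example 3.2.21
assert dot(a, perp(a)) == 0                  # a . a-perp = 0
assert dot(a, a) == dot(perp(a), perp(a))    # ||a-perp|| = ||a||
assert perp(perp(a)) == [-1, -2]             # (a-perp)-perp = -a
```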

The identity $\mathbf{a}\bullet\mathbf{a}^{\perp} = 0$ suggests that the perp operation is related to orthogonality. The following theorem exploits this property further.

Theorem 3.2.22. Let n be a vector in R² different from the origin. For any vector x in R² the following conditions are equivalent:

(a) x • n = 0;

(b) x is in Span{n⊥}.

Proof. Let $\mathbf{n} = \begin{bmatrix}n_1\\n_2\end{bmatrix}$ and $\mathbf{x} = \begin{bmatrix}x_1\\x_2\end{bmatrix}$ and assume that x • n = 0. Since n is different from the origin, at least one of the numbers n₁ and n₂ must be different from 0. Suppose n₁ ≠ 0. Since

$$\mathbf{x}\bullet\mathbf{n} = \begin{bmatrix}x_1\\x_2\end{bmatrix}\bullet\begin{bmatrix}n_1\\n_2\end{bmatrix} = x_1 n_1 + x_2 n_2 = 0,$$

we have

$$x_1 = \frac{x_2}{n_1}(-n_2).$$

This, combined with the obvious equality

$$x_2 = \frac{x_2}{n_1}\,n_1,$$

gives us

$$\mathbf{x} = \begin{bmatrix}x_1\\x_2\end{bmatrix} = \begin{bmatrix}\frac{x_2}{n_1}(-n_2)\\[2pt]\frac{x_2}{n_1}\,n_1\end{bmatrix} = \frac{x_2}{n_1}\begin{bmatrix}-n_2\\n_1\end{bmatrix} = \frac{x_2}{n_1}\,\mathbf{n}^{\perp},$$

which means that x is in Span{n⊥}. For the case when n₂ ≠ 0 the above argument requires only minor modifications. Therefore (a) implies (b).

Now we assume that x is in Span{n⊥}. Then x = t n⊥ for some real number t and consequently

$$\mathbf{x}\bullet\mathbf{n} = (t\mathbf{n}^{\perp})\bullet\mathbf{n} = t(\mathbf{n}^{\perp}\bullet\mathbf{n}) = 0.$$

Therefore (b) implies (a).

Corollary 3.2.23. Let u be a vector in R² different from the origin. Then x is in Span{u} if and only if $\mathbf{x}\bullet\mathbf{u}^{\perp} = 0$.

Proof. By Theorem 3.2.22, $\mathbf{x}\bullet\mathbf{u}^{\perp} = 0$ if and only if x is in Span{(u⊥)⊥}. This proves the result since

$$\operatorname{Span}\{(\mathbf{u}^{\perp})^{\perp}\} = \operatorname{Span}\{-\mathbf{u}\} = \operatorname{Span}\{\mathbf{u}\}.$$

Example 3.2.24. Describe all solutions of the equation

$$2x + 3y = 0$$

as the vector line Span{u}.

Solution. The equation 2x + 3y = 0 can be written as

$$\begin{bmatrix}x\\y\end{bmatrix}\bullet\begin{bmatrix}2\\3\end{bmatrix} = 0.$$

According to Theorem 3.2.22, $\begin{bmatrix}x\\y\end{bmatrix}$ satisfies the above equation if and only if $\begin{bmatrix}x\\y\end{bmatrix}$ is in the vector line

$$\operatorname{Span}\left\{\begin{bmatrix}2\\3\end{bmatrix}^{\perp}\right\} = \operatorname{Span}\left\{\begin{bmatrix}-3\\2\end{bmatrix}\right\}.$$

Theorem 3.2.25. Two nonzero orthogonal vectors a and b, that is, two nonzero vectors a and b such that a • b = 0, form a basis of R². Such a basis is called an orthogonal basis.

Proof. According to Theorem 3.2.22, if a • b = 0, then $\mathbf{b} = t\mathbf{a}^{\perp}$, where t is a nonzero real number. It is easy to verify that

$$\det\begin{bmatrix}\mathbf{a} & \mathbf{b}\end{bmatrix} = \det\begin{bmatrix}\mathbf{a} & t\mathbf{a}^{\perp}\end{bmatrix} = t\det\begin{bmatrix}\mathbf{a} & \mathbf{a}^{\perp}\end{bmatrix} = t\|\mathbf{a}\|^2 \ne 0,$$

because a ≠ 0. The result is now a consequence of Theorems 3.1.19 and 3.1.22.

Area of a triangle in R2
Now we discuss the geometric meaning of the determinant of a 2×2 matrix. First we
need to define the distance from a point to a line.

Definition 3.2.26. Let $\mathbf{a} = \begin{bmatrix}a_1\\a_2\end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix}b_1\\b_2\end{bmatrix}$ be points in R² such that b ≠ 0. By the distance from a to the vector line Span{b} we mean the number

$$\left\|\mathbf{a} - \operatorname{proj}_{\mathbf{b}}\mathbf{a}\right\|.$$

From Theorem 3.2.12 we obtain a formula for the distance of a point from a vector line in terms of a determinant.

Theorem 3.2.27. Let $\mathbf{a} = \begin{bmatrix}a_1\\a_2\end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix}b_1\\b_2\end{bmatrix}$ be points in R² such that b ≠ 0. The distance from a to the vector line Span{b} is

$$\frac{1}{\|\mathbf{b}\|}\left|\det\begin{bmatrix}a_1 & b_1\\a_2 & b_2\end{bmatrix}\right|.$$
Figure 3.16: The distance from a point to a line.

Proof. Let p be the projection of a on the vector line Span{b}, see Fig. 3.16. Then

$$\mathbf{p} = \operatorname{proj}_{\mathbf{b}}\mathbf{a} = \frac{\mathbf{a}\bullet\mathbf{b}}{\|\mathbf{b}\|^2}\,\mathbf{b},$$

by Theorem 3.2.12. If $\mathbf{a} = \begin{bmatrix}a_1\\a_2\end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix}b_1\\b_2\end{bmatrix}$, then

$$\begin{aligned}
\|\mathbf{a}-\mathbf{p}\|^2 &= \left\|\mathbf{a} - \frac{\mathbf{a}\bullet\mathbf{b}}{\|\mathbf{b}\|^2}\,\mathbf{b}\right\|^2 \\
&= \left(\mathbf{a} - \frac{\mathbf{a}\bullet\mathbf{b}}{\|\mathbf{b}\|^2}\,\mathbf{b}\right)\bullet\left(\mathbf{a} - \frac{\mathbf{a}\bullet\mathbf{b}}{\|\mathbf{b}\|^2}\,\mathbf{b}\right) \\
&= \|\mathbf{a}\|^2 - 2\,\frac{(\mathbf{a}\bullet\mathbf{b})^2}{\|\mathbf{b}\|^2} + \frac{(\mathbf{a}\bullet\mathbf{b})^2}{\|\mathbf{b}\|^4}\,\|\mathbf{b}\|^2 \\
&= \|\mathbf{a}\|^2 - \frac{(\mathbf{a}\bullet\mathbf{b})^2}{\|\mathbf{b}\|^2} \\
&= \frac{1}{\|\mathbf{b}\|^2}\left(\|\mathbf{a}\|^2\|\mathbf{b}\|^2 - (\mathbf{a}\bullet\mathbf{b})^2\right) \\
&= \frac{1}{\|\mathbf{b}\|^2}\left((a_1^2+a_2^2)(b_1^2+b_2^2) - (a_1b_1+a_2b_2)^2\right) \\
&= \frac{1}{\|\mathbf{b}\|^2}\,(a_1b_2 - a_2b_1)^2 \\
&= \left(\frac{1}{\|\mathbf{b}\|}\det\begin{bmatrix}a_1 & b_1\\a_2 & b_2\end{bmatrix}\right)^2.
\end{aligned}$$

Consequently, $\|\mathbf{a}-\mathbf{p}\| = \frac{1}{\|\mathbf{b}\|}\left|\det\begin{bmatrix}a_1 & b_1\\a_2 & b_2\end{bmatrix}\right|$.

The above theorem gives us a convenient formula for calculating the area of a triangle defined by two vectors in R².

Figure 3.17: The triangle defined by vectors a and b.

Corollary 3.2.28. Let $\mathbf{a} = \begin{bmatrix}a_1\\a_2\end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix}b_1\\b_2\end{bmatrix}$ be nonzero vectors in R². The area of the triangle 0ab is

$$\frac{1}{2}\left|\det\begin{bmatrix}a_1 & b_1\\a_2 & b_2\end{bmatrix}\right|.$$

Proof. Since the area of the triangle 0ab is

$$\frac{1}{2}\cdot(\text{the length of the base } 0\mathbf{b})\cdot(\text{the height from } \mathbf{a}),$$

the result is an immediate consequence of Theorem 3.2.27.

Figure 3.18: The area of the parallelogram is |det[a b]|.

From the derived formula for the area of a triangle we obtain an explicit geometric interpretation of |det[a b]| as the area of the parallelogram with vertices 0, a, b, and a + b, see Fig. 3.18.
Example 3.2.29. The area of the triangle 0ab where $\mathbf{a} = \begin{bmatrix}1\\2\end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix}3\\-5\end{bmatrix}$ is

$$\frac{1}{2}\left|\det\begin{bmatrix}1 & 3\\2 & -5\end{bmatrix}\right| = \frac{1}{2}\cdot|-11| = \frac{11}{2}.$$
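The determinant formula of Corollary 3.2.28 is simple enough to verify directly. A Python sketch (the helpers `det2` and `triangle_area` are ours) reproducing Example 3.2.29:

```python
def det2(a, b):
    # determinant of the 2x2 matrix with columns a and b
    return a[0] * b[1] - a[1] * b[0]

def triangle_area(a, b):
    # area of the triangle 0ab (Corollary 3.2.28)
    return abs(det2(a, b)) / 2

assert det2([1, 2], [3, -5]) == -11
assert triangle_area([1, 2], [3, -5]) == 5.5     # Example 3.2.29: 11/2

# the parallelogram with vertices 0, a, b, a+b has twice the area
assert abs(det2([1, 2], [3, -5])) == 2 * triangle_area([1, 2], [3, -5])
```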

3.2.1 Exercises

Calculate the norms of the given vectors.

1. $\begin{bmatrix}3\\4\end{bmatrix}$

2. $\begin{bmatrix}2\\7\end{bmatrix}$

3. $\begin{bmatrix}-5\\3\end{bmatrix}$

4. $\begin{bmatrix}1/\sqrt2\\1/\sqrt2\end{bmatrix}$

5. $\begin{bmatrix}1/\sqrt a\\1/\sqrt a\end{bmatrix}$, $a > 0$

6. $\begin{bmatrix}\sin\alpha\\\cos\alpha\end{bmatrix}$

Calculate the distance between the given vectors.

7. $\begin{bmatrix}5\\4\end{bmatrix}$ and $\begin{bmatrix}7\\1\end{bmatrix}$

8. $\begin{bmatrix}1\\2\end{bmatrix}$ and $\begin{bmatrix}5\\-1\end{bmatrix}$

9. $\begin{bmatrix}0\\a\end{bmatrix}$ and $\begin{bmatrix}a\\0\end{bmatrix}$

10. $\begin{bmatrix}a\\b\end{bmatrix}$ and $\begin{bmatrix}-b\\a\end{bmatrix}$

Calculate the following dot products.

11. $\begin{bmatrix}1\\2\end{bmatrix}\bullet\begin{bmatrix}7\\2\end{bmatrix}$

12. $\begin{bmatrix}1\\2\end{bmatrix}\bullet\begin{bmatrix}3\\0\end{bmatrix}$

13. $\begin{bmatrix}-4\\1\end{bmatrix}\bullet\begin{bmatrix}1\\4\end{bmatrix}$

14. $\begin{bmatrix}4\\1\end{bmatrix}\bullet\begin{bmatrix}-4\\1\end{bmatrix}$

For the given vectors u and v find a real number a such that u • v = 0 and then draw u and v.

15. $\mathbf{u} = \begin{bmatrix}4\\a\end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix}1\\2\end{bmatrix}$

16. $\mathbf{u} = \begin{bmatrix}1\\1\end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix}a\\-2\end{bmatrix}$

17. $\mathbf{u} = \begin{bmatrix}a\\-1\end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix}a\\1\end{bmatrix}$

18. $\mathbf{u} = \begin{bmatrix}4\\a\end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix}-a\\2\end{bmatrix}$

19. Let $\mathbf{a} = \begin{bmatrix}1\\1\end{bmatrix}$, $\mathbf{b} = \begin{bmatrix}2\\1+a\end{bmatrix}$, $\mathbf{c} = \begin{bmatrix}3\\5\end{bmatrix}$, and $\mathbf{d} = \begin{bmatrix}7\\a\end{bmatrix}$. Find a real number a such that (b − a) • (d − c) = 0 and then draw the vectors a, b, c, and d.
20. Let $\mathbf{a} = \begin{bmatrix}-1\\2\end{bmatrix}$, $\mathbf{b} = \begin{bmatrix}1\\2+a\end{bmatrix}$, $\mathbf{c} = \begin{bmatrix}1\\1\end{bmatrix}$, and $\mathbf{d} = \begin{bmatrix}-1\\a-1\end{bmatrix}$. Find a real number a such that (b − a) • (d − c) = 0 and then draw the vectors a, b, c, and d.

21. For $\mathbf{a} = \begin{bmatrix}-1\\2\end{bmatrix}$ and $\mathbf{n} = \begin{bmatrix}1\\3\end{bmatrix}$ draw all vectors x such that (x − a) • n = 0.

22. For $\mathbf{a} = \mathbf{n} = \begin{bmatrix}4\\1\end{bmatrix}$ draw all vectors x such that (x − a) • n = 0.

For the given vectors b and u find $\operatorname{proj}_{\mathbf{u}}\mathbf{b}$ using the formula in Theorem 3.2.12.

23. $\mathbf{b} = \begin{bmatrix}2\\1\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}1\\3\end{bmatrix}$

24. $\mathbf{b} = \begin{bmatrix}1\\-3\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}2\\5\end{bmatrix}$

25. $\mathbf{b} = \begin{bmatrix}x\\y\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}1\\1\end{bmatrix}$

26. $\mathbf{b} = \begin{bmatrix}x\\y\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}-3\\2\end{bmatrix}$

For the given vectors b and u, find the projection matrix for the projection on the vector line Span{u} and then use it to calculate $\operatorname{proj}_{\mathbf{u}}\mathbf{b}$.

27. $\mathbf{b} = \begin{bmatrix}0\\1\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}1\\1\end{bmatrix}$

28. $\mathbf{b} = \begin{bmatrix}1\\1\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}-3\\1\end{bmatrix}$

29. $\mathbf{b} = \begin{bmatrix}x\\y\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}2\\-1\end{bmatrix}$

30. $\mathbf{b} = \begin{bmatrix}x\\y\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}4\\-3\end{bmatrix}$

Find the reflection of the point b across the vector line Span{u}.

31. $\mathbf{b} = \begin{bmatrix}2\\1\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}1\\3\end{bmatrix}$

32. $\mathbf{b} = \begin{bmatrix}1\\-3\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}2\\5\end{bmatrix}$

33. $\mathbf{b} = \begin{bmatrix}x\\y\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}1\\1\end{bmatrix}$

34. $\mathbf{b} = \begin{bmatrix}x\\y\end{bmatrix}$ and $\mathbf{u} = \begin{bmatrix}-3\\2\end{bmatrix}$

35. Find a matrix A such that the reflection of the point b across the vector line $\operatorname{Span}\left\{\begin{bmatrix}-3\\1\end{bmatrix}\right\}$ is Ab.

36. Find a matrix A such that the reflection of the point b across the vector line $\operatorname{Span}\left\{\begin{bmatrix}2\\5\end{bmatrix}\right\}$ is Ab.

37. Describe all solutions of the equation 3x − 4y = 0 as a vector line Span{u}.

38. Describe all solutions of the equation 5x + 2y = 0 as a vector line Span{u}.

Find the equation of the vector line Span{u} for the given vector u.

39. $\mathbf{u} = \begin{bmatrix}1\\1\end{bmatrix}$

40. $\mathbf{u} = \begin{bmatrix}2\\-1\end{bmatrix}$

41. $\mathbf{u} = \begin{bmatrix}-3\\1\end{bmatrix}$

42. $\mathbf{u} = \begin{bmatrix}-1\\-1\end{bmatrix}$

Find the distance from the given point a to the vector line Span{b}.

43. $\mathbf{a} = \begin{bmatrix}0\\1\end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix}1\\1\end{bmatrix}$

44. $\mathbf{a} = \begin{bmatrix}2\\-1\end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix}1\\3\end{bmatrix}$

45. Find the area of the triangle defined by vectors $\mathbf{a} = \begin{bmatrix}3\\1\end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix}2\\5\end{bmatrix}$.

46. Find the area of the triangle defined by vectors $\mathbf{a} = \begin{bmatrix}-2\\-1\end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix}-3\\7\end{bmatrix}$.

47. Show that for arbitrary vectors a and b in R² we have $\mathbf{b}\bullet\mathbf{a}^{\perp} = \det\begin{bmatrix}\mathbf{a} & \mathbf{b}\end{bmatrix}$.

48. Show that the following conditions are equivalent:

(a) a and b are linearly independent;

(b) $\mathbf{b}\bullet\mathbf{a}^{\perp} \ne 0$;

(c) $\mathbf{a}\bullet\mathbf{b}^{\perp} \ne 0$.

49. Show that $\det\begin{bmatrix}\mathbf{a}^{\perp} & \mathbf{b}^{\perp}\end{bmatrix} = \det\begin{bmatrix}\mathbf{a} & \mathbf{b}\end{bmatrix}$.

50. Show that the vectors a and b are linearly independent if and only if the vectors $\mathbf{a}^{\perp}$ and $\mathbf{b}^{\perp}$ are linearly independent.

51. Let S be the solution set of the equation

$$\begin{bmatrix}a_1 & b_1\\a_2 & b_2\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix},$$

that is,

$$S = \left\{\begin{bmatrix}x\\y\end{bmatrix} \text{ in } \mathbf{R}^2 : \begin{bmatrix}a_1 & b_1\\a_2 & b_2\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}\right\}.$$

Show that one of the following is true:

(a) $S = \left\{\begin{bmatrix}0\\0\end{bmatrix}\right\}$;

(b) S is a vector line;

(c) S = R².

52. Let a and b be linearly independent vectors in R². Show that if x • a = 0 and x • b = 0, then x = 0.
53. If $\mathbf{p} = \operatorname{proj}_{\mathbf{b}}\mathbf{a}$, show that

$$\frac{\mathbf{a}\bullet\mathbf{b}}{\|\mathbf{a}\|\|\mathbf{b}\|} = \frac{\|\mathbf{p}\|}{\|\mathbf{a}\|} \ \text{ if } \mathbf{a}\bullet\mathbf{b} \ge 0 \qquad\text{and}\qquad \frac{\mathbf{a}\bullet\mathbf{b}}{\|\mathbf{a}\|\|\mathbf{b}\|} = -\frac{\|\mathbf{p}\|}{\|\mathbf{a}\|} \ \text{ if } \mathbf{a}\bullet\mathbf{b} \le 0.$$

54. If $\mathbf{p} = \operatorname{proj}_{\mathbf{b}}\mathbf{a}$, show that

$$\frac{\det\begin{bmatrix}\mathbf{a} & \mathbf{b}\end{bmatrix}}{\|\mathbf{a}\|\|\mathbf{b}\|} = \frac{\|\mathbf{a}-\mathbf{p}\|}{\|\mathbf{a}\|} \ \text{ if } \det\begin{bmatrix}\mathbf{a} & \mathbf{b}\end{bmatrix} \ge 0 \qquad\text{and}\qquad \frac{\det\begin{bmatrix}\mathbf{a} & \mathbf{b}\end{bmatrix}}{\|\mathbf{a}\|\|\mathbf{b}\|} = -\frac{\|\mathbf{a}-\mathbf{p}\|}{\|\mathbf{a}\|} \ \text{ if } \det\begin{bmatrix}\mathbf{a} & \mathbf{b}\end{bmatrix} \le 0.$$

55. (Cramer's Rule revisited) Let a and b be linearly independent vectors in R². For every c in R² the equation

$$x\mathbf{a} + y\mathbf{b} = \mathbf{c}$$

has a unique solution

$$x = \frac{\mathbf{c}\bullet\mathbf{b}^{\perp}}{\mathbf{a}\bullet\mathbf{b}^{\perp}} \qquad\text{and}\qquad y = \frac{\mathbf{c}\bullet\mathbf{a}^{\perp}}{\mathbf{b}\bullet\mathbf{a}^{\perp}}.$$

56. Using exercise 55 solve the system

$$\begin{cases}2x + 5y = 4\\x + 3y = 3\end{cases}.$$

57. Find a matrix A such that the reflection of the point b across the vector line
Span{u} is Ab. Show that A is a symmetric matrix.

3.3 Symmetric 2 × 2 matrices

Recall that a matrix A is called symmetric if $A^T = A$. In the case of 2 × 2 matrices, every symmetric matrix has the form

$$\begin{bmatrix}a & c\\c & b\end{bmatrix}.$$

In this section we discuss eigenvectors of symmetric matrices. We start with the following interesting theorem.

Theorem 3.3.1. Let A be a 2 × 2 matrix. If A has two orthogonal eigenvectors,


then A is symmetric.

Proof. Let $\begin{bmatrix}u_1\\u_2\end{bmatrix}$ and $\begin{bmatrix}v_1\\v_2\end{bmatrix}$ be orthogonal eigenvectors of the matrix A, that is,

$$A\begin{bmatrix}u_1\\u_2\end{bmatrix} = \alpha\begin{bmatrix}u_1\\u_2\end{bmatrix} \qquad\text{and}\qquad A\begin{bmatrix}v_1\\v_2\end{bmatrix} = \beta\begin{bmatrix}v_1\\v_2\end{bmatrix},$$

for some numbers α and β, and

$$\begin{bmatrix}u_1\\u_2\end{bmatrix}\bullet\begin{bmatrix}v_1\\v_2\end{bmatrix} = 0.$$

Let

$$\mathbf{q} = \frac{1}{\sqrt{u_1^2+u_2^2}}\begin{bmatrix}u_1\\u_2\end{bmatrix} = \begin{bmatrix}q_1\\q_2\end{bmatrix} \qquad\text{and}\qquad \mathbf{r} = \frac{1}{\sqrt{v_1^2+v_2^2}}\begin{bmatrix}v_1\\v_2\end{bmatrix} = \begin{bmatrix}r_1\\r_2\end{bmatrix}.$$

Note that

$$\|\mathbf{q}\| = \|\mathbf{r}\| = 1 \qquad\text{and}\qquad \mathbf{q}\bullet\mathbf{r} = 0$$

and

$$A\mathbf{q} = \alpha\mathbf{q} \qquad\text{and}\qquad A\mathbf{r} = \beta\mathbf{r}. \tag{3.6}$$

Equations (3.6) can be written as a single equation

$$A\begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix} = \begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix}\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix}. \tag{3.7}$$

Now, if we let

$$P = \begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix},$$

then we have

$$P^T P = \begin{bmatrix}\mathbf{q}^T\\\mathbf{r}^T\end{bmatrix}\begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix} = \begin{bmatrix}\mathbf{q}\bullet\mathbf{q} & \mathbf{q}\bullet\mathbf{r}\\\mathbf{r}\bullet\mathbf{q} & \mathbf{r}\bullet\mathbf{r}\end{bmatrix} = \begin{bmatrix}1 & 0\\0 & 1\end{bmatrix},$$

which means, by Theorem 1.2.18, that the matrix P is invertible and we have

$$P^{-1} = P^T.$$

Since equation (3.7) can be written as

$$AP = P\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix},$$

we get

$$A = A\begin{bmatrix}1 & 0\\0 & 1\end{bmatrix} = APP^{-1} = P\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix}P^{-1} = P\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix}P^T.$$

Now we can show that the matrix A is symmetric. Indeed, we have

$$A^T = \left(P\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix}P^T\right)^T = (P^T)^T\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix}^T P^T = P\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix}P^T = A.$$

Example 3.3.2. Find a matrix which has $\begin{bmatrix}1\\2\end{bmatrix}$ as an eigenvector corresponding to the eigenvalue λ = 1 and $\begin{bmatrix}-2\\1\end{bmatrix}$ as an eigenvector corresponding to the eigenvalue λ = 3. Explain why the result is a symmetric matrix.

Solution. The matrix

$$\begin{bmatrix}1 & -2\\2 & 1\end{bmatrix}\begin{bmatrix}1 & 0\\0 & 3\end{bmatrix}\begin{bmatrix}1 & -2\\2 & 1\end{bmatrix}^{-1}$$

will have the desired properties. Since

$$\begin{bmatrix}1 & -2\\2 & 1\end{bmatrix}^{-1} = \frac{1}{5}\begin{bmatrix}1 & 2\\-2 & 1\end{bmatrix} = \begin{bmatrix}\frac15 & \frac25\\[2pt]-\frac25 & \frac15\end{bmatrix},$$

we have

$$\begin{bmatrix}1 & -2\\2 & 1\end{bmatrix}\begin{bmatrix}1 & 0\\0 & 3\end{bmatrix}\begin{bmatrix}1 & -2\\2 & 1\end{bmatrix}^{-1} = \begin{bmatrix}1 & -2\\2 & 1\end{bmatrix}\begin{bmatrix}1 & 0\\0 & 3\end{bmatrix}\begin{bmatrix}\frac15 & \frac25\\[2pt]-\frac25 & \frac15\end{bmatrix} = \begin{bmatrix}\frac{13}{5} & -\frac45\\[2pt]-\frac45 & \frac75\end{bmatrix}.$$

This matrix is symmetric because the eigenvectors $\begin{bmatrix}1\\2\end{bmatrix}$ and $\begin{bmatrix}-2\\1\end{bmatrix}$ are orthogonal.
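The construction of Example 3.3.2 (A = V D V⁻¹ with the eigenvectors as the columns of V) can be checked numerically. A Python sketch with small 2×2 helpers of our own:

```python
def matmul(A, B):
    # product of two 2x2 matrices (nested lists)
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    # inverse of a 2x2 matrix via the adjugate formula
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

V = [[1, -2], [2, 1]]    # the eigenvectors of Example 3.3.2 as columns
D = [[1, 0], [0, 3]]     # the corresponding eigenvalues on the diagonal
A = matmul(matmul(V, D), inv2(V))

expected = [[13 / 5, -4 / 5], [-4 / 5, 7 / 5]]
assert all(abs(A[i][j] - expected[i][j]) < 1e-12 for i in range(2) for j in range(2))
assert abs(A[0][1] - A[1][0]) < 1e-12   # symmetric, as Theorem 3.3.1 predicts
```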

From the proof of Theorem 3.3.1 we can obtain the following useful result.

Theorem 3.3.3. If q and r are vectors in R² such that

$$\|\mathbf{q}\| = \|\mathbf{r}\| = 1 \qquad\text{and}\qquad \mathbf{q}\bullet\mathbf{r} = 0,$$

then the matrix $P = \begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix}$ is invertible and $P^{-1} = P^T$.

Matrices of the type described in the above theorem are important in theoretical
considerations and practical applications.

Definition 3.3.4. A 2 × 2 matrix P is called an orthogonal matrix if it is invertible and

$$P^T = P^{-1}.$$

It turns out that every orthogonal matrix satisfies the conditions in Theorem 3.3.3, which explains the name.

Theorem 3.3.5. If A is an orthogonal 2 × 2 matrix, then it has the form $A = \begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix}$ with

$$\|\mathbf{q}\| = \|\mathbf{r}\| = 1 \qquad\text{and}\qquad \mathbf{q}\bullet\mathbf{r} = 0.$$

Proof. If $A = \begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix}$ is an orthogonal 2 × 2 matrix, then

$$\begin{bmatrix}1 & 0\\0 & 1\end{bmatrix} = \begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix}^T\begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix} = \begin{bmatrix}\mathbf{q}\bullet\mathbf{q} & \mathbf{q}\bullet\mathbf{r}\\\mathbf{r}\bullet\mathbf{q} & \mathbf{r}\bullet\mathbf{r}\end{bmatrix} = \begin{bmatrix}\|\mathbf{q}\|^2 & 0\\0 & \|\mathbf{r}\|^2\end{bmatrix}.$$

Consequently, $\|\mathbf{q}\| = \|\mathbf{r}\| = 1$ and $\mathbf{q}\bullet\mathbf{r} = 0$.

Example 3.3.6. The columns of the matrix

$$A = \begin{bmatrix}1 & -2\\2 & 1\end{bmatrix}$$

are orthogonal, but A is not an orthogonal matrix because

$$\left\|\begin{bmatrix}1\\2\end{bmatrix}\right\| = \left\|\begin{bmatrix}-2\\1\end{bmatrix}\right\| = \sqrt5.$$

Since dividing every entry of the matrix by $\sqrt5$ does not affect the orthogonality of the columns of the matrix, the obtained matrix

$$\begin{bmatrix}\frac{1}{\sqrt5} & -\frac{2}{\sqrt5}\\[2pt]\frac{2}{\sqrt5} & \frac{1}{\sqrt5}\end{bmatrix}$$

is an orthogonal matrix.

Now we return to the main subject of this section, namely, symmetric matrices.
We proved that every 2 × 2 matrix with two orthogonal eigenvectors is symmetric. It
turns out that the converse is also true, that is, every symmetric 2 × 2 matrix has two
orthogonal eigenvectors. To prove that we will use the following simple lemma.

Lemma 3.3.7. If A is a symmetric 2 × 2 matrix, then

(Au) • v = u • (Av)

for every u and v in R2 .

Proof. The equality can be verified by direct calculations and is left as an exercise
(Exercise 27).

Theorem 3.3.8. Every symmetric 2 × 2 matrix has real eigenvalues and two orthogonal eigenvectors.

Proof. Let

$$A = \begin{bmatrix}a & c\\c & b\end{bmatrix}.$$

The roots of the equation

$$\det\begin{bmatrix}a-\lambda & c\\c & b-\lambda\end{bmatrix} = (a-\lambda)(b-\lambda) - c^2 = \lambda^2 - \lambda(a+b) + ab - c^2 = 0$$

are

$$\alpha = \frac12\left(a+b+\sqrt{(a-b)^2+4c^2}\right) \qquad\text{and}\qquad \beta = \frac12\left(a+b-\sqrt{(a-b)^2+4c^2}\right).$$

Since $(a-b)^2+4c^2 \ge 0$, α and β are real numbers, so A has real eigenvalues.

Now we consider two cases: α ≠ β and α = β.

First we assume that α ≠ β. Let u be an eigenvector corresponding to the eigenvalue α and let v be an eigenvector corresponding to the eigenvalue β. Then

$$(\alpha-\beta)(\mathbf{u}\bullet\mathbf{v}) = (\alpha\mathbf{u})\bullet\mathbf{v} - \mathbf{u}\bullet(\beta\mathbf{v}) = (A\mathbf{u})\bullet\mathbf{v} - \mathbf{u}\bullet(A\mathbf{v}) = 0,$$

where the last equality follows from Lemma 3.3.7. Since $(\alpha-\beta)(\mathbf{u}\bullet\mathbf{v}) = 0$ and α ≠ β, we must have u • v = 0.

Now we assume that α = β. This means that $(a-b)^2+4c^2 = 0$ and consequently a = b and c = 0. But then

$$A = \begin{bmatrix}a & 0\\0 & a\end{bmatrix}.$$

This yields α = β = a and every nonzero vector is an eigenvector of A, so we can take, for example, $\mathbf{u} = \begin{bmatrix}1\\0\end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix}0\\1\end{bmatrix}$.

From the proof of the above theorem we obtain the following useful result.

Corollary 3.3.9. Eigenvectors corresponding to different eigenvalues of a symmetric 2 × 2 matrix are orthogonal.

Definition 3.3.10. A representation of a 2 × 2 matrix A in the form

$$A = P\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix}P^T,$$

where P is a 2 × 2 orthogonal matrix and α and β are real numbers, is called an orthogonal diagonalization of A. If a matrix A has an orthogonal diagonalization, we say that A can be orthogonally diagonalized.

Example 3.3.11. Since, by Theorem 3.3.3, the inverse of an orthogonal matrix is equal to its transpose, we have

$$\begin{bmatrix}\frac{1}{\sqrt5} & -\frac{2}{\sqrt5}\\[2pt]\frac{2}{\sqrt5} & \frac{1}{\sqrt5}\end{bmatrix}^{-1} = \begin{bmatrix}\frac{1}{\sqrt5} & \frac{2}{\sqrt5}\\[2pt]-\frac{2}{\sqrt5} & \frac{1}{\sqrt5}\end{bmatrix}$$

and an orthogonal diagonalization of the matrix obtained in Example 3.3.2 is

$$\begin{bmatrix}\frac{1}{\sqrt5} & -\frac{2}{\sqrt5}\\[2pt]\frac{2}{\sqrt5} & \frac{1}{\sqrt5}\end{bmatrix}\begin{bmatrix}1 & 0\\0 & 3\end{bmatrix}\begin{bmatrix}\frac{1}{\sqrt5} & \frac{2}{\sqrt5}\\[2pt]-\frac{2}{\sqrt5} & \frac{1}{\sqrt5}\end{bmatrix} = \begin{bmatrix}\frac{13}{5} & -\frac45\\[2pt]-\frac45 & \frac75\end{bmatrix}.$$

From the proof of Theorem 3.3.1 and Theorem 3.3.8 we obtain the following fundamental property of symmetric matrices.

Theorem 3.3.12. A 2 × 2 matrix can be orthogonally diagonalized if and only if it is symmetric.

Example 3.3.13. Orthogonally diagonalize the symmetric matrix

$$A = \begin{bmatrix}7 & 3\\3 & -1\end{bmatrix}.$$

Solution. The eigenvalues of A are the roots of the equation

$$\det\begin{bmatrix}7-\lambda & 3\\3 & -1-\lambda\end{bmatrix} = \lambda^2 - 6\lambda - 16 = 0,$$

which are λ = −2 and λ = 8.

The eigenvectors corresponding to the eigenvalue λ = −2 are the solutions of the equation

$$\begin{bmatrix}7 & 3\\3 & -1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = -2\begin{bmatrix}x\\y\end{bmatrix}$$

or, equivalently, of the equation

$$\begin{bmatrix}9 & 3\\3 & 1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}.$$

This gives us 3x + y = 0, which can be written as y = −3x. The solutions are of the form

$$\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}x\\-3x\end{bmatrix} = x\begin{bmatrix}1\\-3\end{bmatrix},$$

so for an eigenvector corresponding to the eigenvalue λ = −2 we can take $\begin{bmatrix}1\\-3\end{bmatrix}$.

An eigenvector corresponding to the eigenvalue λ = 8 must be orthogonal to the vector $\begin{bmatrix}1\\-3\end{bmatrix}$, so we can take $\begin{bmatrix}3\\1\end{bmatrix}$ because, according to Theorem 3.2.22, a vector orthogonal to the vector $\begin{bmatrix}1\\-3\end{bmatrix}$ is in $\operatorname{Span}\left\{\begin{bmatrix}3\\1\end{bmatrix}\right\}$.

Since

$$\left\|\begin{bmatrix}1\\-3\end{bmatrix}\right\| = \left\|\begin{bmatrix}3\\1\end{bmatrix}\right\| = \sqrt{10},$$

we have

$$\begin{bmatrix}\frac{1}{\sqrt{10}} & \frac{3}{\sqrt{10}}\\[2pt]-\frac{3}{\sqrt{10}} & \frac{1}{\sqrt{10}}\end{bmatrix}^{-1} = \begin{bmatrix}\frac{1}{\sqrt{10}} & \frac{3}{\sqrt{10}}\\[2pt]-\frac{3}{\sqrt{10}} & \frac{1}{\sqrt{10}}\end{bmatrix}^{T} = \begin{bmatrix}\frac{1}{\sqrt{10}} & -\frac{3}{\sqrt{10}}\\[2pt]\frac{3}{\sqrt{10}} & \frac{1}{\sqrt{10}}\end{bmatrix}$$

and consequently

$$A = \begin{bmatrix}\frac{1}{\sqrt{10}} & \frac{3}{\sqrt{10}}\\[2pt]-\frac{3}{\sqrt{10}} & \frac{1}{\sqrt{10}}\end{bmatrix}\begin{bmatrix}-2 & 0\\0 & 8\end{bmatrix}\begin{bmatrix}\frac{1}{\sqrt{10}} & -\frac{3}{\sqrt{10}}\\[2pt]\frac{3}{\sqrt{10}} & \frac{1}{\sqrt{10}}\end{bmatrix}.$$
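An orthogonal diagonalization is easy to verify by multiplying P diag(−2, 8) Pᵀ back out. The Python sketch below (pure Python; the 2×2 helpers are ours) checks the factorization found in Example 3.3.13:

```python
import math

s = 1 / math.sqrt(10)
P = [[1 * s, 3 * s], [-3 * s, 1 * s]]   # columns: unit eigenvectors for -2 and 8
D = [[-2, 0], [0, 8]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]

A = matmul(matmul(P, D), transpose(P))
expected = [[7, 3], [3, -1]]
assert all(abs(A[i][j] - expected[i][j]) < 1e-12 for i in range(2) for j in range(2))

# P is orthogonal: P^T P = I
I = matmul(transpose(P), P)
assert all(abs(I[i][j] - (1 if i == j else 0)) < 1e-12 for i in range(2) for j in range(2))
```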

The spectral decomposition of a 2 × 2 symmetric matrix


Now we turn our attention to the so-called spectral decomposition of a 2 × 2 sym-
metric matrix.

Definition 3.3.14. Let A be a 2 × 2 matrix. By a spectral decomposition of A we mean a representation of A in the form

$$A = \frac{\alpha}{\|\mathbf{u}\|^2}\,\mathbf{u}\mathbf{u}^T + \frac{\beta}{\|\mathbf{v}\|^2}\,\mathbf{v}\mathbf{v}^T,$$

where u and v are nonzero orthogonal vectors and α and β are real numbers.

Theorem 3.3.15. If {u, v} is a basis of orthogonal eigenvectors of the 2 × 2 symmetric matrix A, then

$$A = \frac{\alpha}{\|\mathbf{u}\|^2}\,\mathbf{u}\mathbf{u}^T + \frac{\beta}{\|\mathbf{v}\|^2}\,\mathbf{v}\mathbf{v}^T,$$

where α is the eigenvalue corresponding to the eigenvector u and β is the eigenvalue corresponding to the eigenvector v, that is,

$$A\mathbf{u} = \alpha\mathbf{u} \qquad\text{and}\qquad A\mathbf{v} = \beta\mathbf{v}.$$

Proof. Let α and β be the eigenvalues of A corresponding to u and v, respectively. We define

$$\mathbf{q} = \frac{1}{\|\mathbf{u}\|}\,\mathbf{u} \qquad\text{and}\qquad \mathbf{r} = \frac{1}{\|\mathbf{v}\|}\,\mathbf{v}.$$

Then we get, as in the proof of Theorem 3.3.1,

$$A = \begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix}\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix}\begin{bmatrix}\mathbf{q}^T\\\mathbf{r}^T\end{bmatrix},$$

and, for an arbitrary x in R², we have

$$\begin{aligned}
A\mathbf{x} &= \begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix}\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix}\begin{bmatrix}\mathbf{q}^T\\\mathbf{r}^T\end{bmatrix}\mathbf{x} = \begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix}\begin{bmatrix}\alpha & 0\\0 & \beta\end{bmatrix}\begin{bmatrix}\mathbf{q}^T\mathbf{x}\\\mathbf{r}^T\mathbf{x}\end{bmatrix} = \begin{bmatrix}\mathbf{q} & \mathbf{r}\end{bmatrix}\begin{bmatrix}\alpha\,\mathbf{q}^T\mathbf{x}\\\beta\,\mathbf{r}^T\mathbf{x}\end{bmatrix} \\
&= \alpha\,\mathbf{q}\left(\mathbf{q}^T\mathbf{x}\right) + \beta\,\mathbf{r}\left(\mathbf{r}^T\mathbf{x}\right) = \alpha(\mathbf{q}\mathbf{q}^T)\mathbf{x} + \beta(\mathbf{r}\mathbf{r}^T)\mathbf{x} = \left(\alpha(\mathbf{q}\mathbf{q}^T) + \beta(\mathbf{r}\mathbf{r}^T)\right)\mathbf{x}.
\end{aligned}$$

Consequently, by Theorem 2.1.18, we get

$$A = \alpha(\mathbf{q}\mathbf{q}^T) + \beta(\mathbf{r}\mathbf{r}^T) = \frac{\alpha}{\|\mathbf{u}\|^2}\,\mathbf{u}\mathbf{u}^T + \frac{\beta}{\|\mathbf{v}\|^2}\,\mathbf{v}\mathbf{v}^T.$$

The following theorem is a converse of Theorem 3.3.15.

Theorem 3.3.16. If {u, v} is a basis of orthogonal vectors of R² and α and β are arbitrary real numbers, then the matrix

$$A = \frac{\alpha}{\|\mathbf{u}\|^2}\,\mathbf{u}\mathbf{u}^T + \frac{\beta}{\|\mathbf{v}\|^2}\,\mathbf{v}\mathbf{v}^T$$

is symmetric, u is an eigenvector of A corresponding to the eigenvalue α, and v is an eigenvector of A corresponding to the eigenvalue β.

Proof. Assume that

$$A = \frac{\alpha}{\|\mathbf{u}\|^2}\,\mathbf{u}\mathbf{u}^T + \frac{\beta}{\|\mathbf{v}\|^2}\,\mathbf{v}\mathbf{v}^T,$$

where u and v are nonzero orthogonal vectors and α and β are real numbers. Then

$$A^T = \left(\frac{\alpha}{\|\mathbf{u}\|^2}\,\mathbf{u}\mathbf{u}^T + \frac{\beta}{\|\mathbf{v}\|^2}\,\mathbf{v}\mathbf{v}^T\right)^T = \frac{\alpha}{\|\mathbf{u}\|^2}\left(\mathbf{u}\mathbf{u}^T\right)^T + \frac{\beta}{\|\mathbf{v}\|^2}\left(\mathbf{v}\mathbf{v}^T\right)^T = \frac{\alpha}{\|\mathbf{u}\|^2}\,\mathbf{u}\mathbf{u}^T + \frac{\beta}{\|\mathbf{v}\|^2}\,\mathbf{v}\mathbf{v}^T = A,$$

so A is a symmetric matrix. It is easy to verify that Au = αu and Av = βv.

Example 3.3.17. A symmetric 2 × 2 matrix A has the eigenvalues λ = 3 and λ = 7. Find the matrix A if $\begin{bmatrix}4\\1\end{bmatrix}$ is an eigenvector of A corresponding to λ = 3.

Solution. Since the matrix A is symmetric, eigenvectors of A corresponding to the eigenvalue λ = 7 must be orthogonal to $\begin{bmatrix}4\\1\end{bmatrix}$. For example, we can use $\begin{bmatrix}1\\-4\end{bmatrix}$, because, according to Theorem 3.2.22, a vector orthogonal to the vector $\begin{bmatrix}4\\1\end{bmatrix}$ is in $\operatorname{Span}\left\{\begin{bmatrix}1\\-4\end{bmatrix}\right\}$. Consequently, the matrix is

$$\frac{3}{17}\begin{bmatrix}4\\1\end{bmatrix}\begin{bmatrix}4 & 1\end{bmatrix} + \frac{7}{17}\begin{bmatrix}1\\-4\end{bmatrix}\begin{bmatrix}1 & -4\end{bmatrix} = \frac{3}{17}\begin{bmatrix}16 & 4\\4 & 1\end{bmatrix} + \frac{7}{17}\begin{bmatrix}1 & -4\\-4 & 16\end{bmatrix} = \frac{1}{17}\begin{bmatrix}55 & -16\\-16 & 115\end{bmatrix} = \begin{bmatrix}\frac{55}{17} & -\frac{16}{17}\\[2pt]-\frac{16}{17} & \frac{115}{17}\end{bmatrix}.$$
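The spectral formula of Theorem 3.3.15 is straightforward to evaluate. A Python sketch (the helpers `outer_scaled` and `matadd` are ours, not from the text) rebuilding the matrix of Example 3.3.17:

```python
def outer_scaled(c, u):
    # (c / ||u||^2) * u u^T
    n2 = u[0] * u[0] + u[1] * u[1]
    return [[c * u[i] * u[j] / n2 for j in range(2)] for i in range(2)]

def matadd(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

# eigenvalue 3 with eigenvector (4, 1), eigenvalue 7 with eigenvector (1, -4)
A = matadd(outer_scaled(3, [4, 1]), outer_scaled(7, [1, -4]))
expected = [[55 / 17, -16 / 17], [-16 / 17, 115 / 17]]
assert all(abs(A[i][j] - expected[i][j]) < 1e-12 for i in range(2) for j in range(2))

# check the eigenvector equation A u = 3 u
u = [4, 1]
Au = [A[0][0] * u[0] + A[0][1] * u[1], A[1][0] * u[0] + A[1][1] * u[1]]
assert abs(Au[0] - 3 * u[0]) < 1e-12 and abs(Au[1] - 3 * u[1]) < 1e-12
```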

Recall that for any nonzero vector a in R² we have

$$\operatorname{proj}_{\mathbf{a}}\mathbf{x} = \frac{1}{\|\mathbf{a}\|^2}\,\mathbf{a}\mathbf{a}^T\mathbf{x}$$

for all x in R². From Theorems 3.3.15 and 3.3.16 we obtain the following version of the spectral decomposition.

Corollary 3.3.18.

(a) Let A be a 2 × 2 matrix with two orthogonal eigenvectors u and v. Then, for every vector x in R², we have

$$A\mathbf{x} = \alpha\operatorname{proj}_{\mathbf{u}}\mathbf{x} + \beta\operatorname{proj}_{\mathbf{v}}\mathbf{x},$$

where α is the eigenvalue of A corresponding to u and β is the eigenvalue of A corresponding to v.

(b) If {u, v} is a basis of orthogonal vectors in R², α and β are real numbers, and A is a 2 × 2 matrix such that for every vector x in R² we have

$$A\mathbf{x} = \alpha\operatorname{proj}_{\mathbf{u}}\mathbf{x} + \beta\operatorname{proj}_{\mathbf{v}}\mathbf{x},$$

then A is a symmetric matrix such that u is an eigenvector of A corresponding to the eigenvalue α and v is an eigenvector of A corresponding to the eigenvalue β.

The above version of spectral decomposition gives it a clear geometric meaning, see Fig. 3.19.

Figure 3.19: $A\mathbf{x} = 3\operatorname{proj}_{\mathbf{u}}\mathbf{x} + \frac12\operatorname{proj}_{\mathbf{v}}\mathbf{x}$.

Example 3.3.19. Let α and β be two real numbers. Find a matrix A such that $\mathbf{u} = \begin{bmatrix}1\\1\end{bmatrix}$ is an eigenvector of A corresponding to an eigenvalue α and $\mathbf{v} = \begin{bmatrix}1\\-1\end{bmatrix}$ is an eigenvector of A corresponding to an eigenvalue β.

Solution 1. First we calculate $\operatorname{proj}_{\mathbf{u}}\mathbf{x}$ and $\operatorname{proj}_{\mathbf{v}}\mathbf{x}$ using the formula from Theorem 3.2.12, that is,

$$\operatorname{proj}_{\mathbf{w}}\mathbf{b} = \frac{\mathbf{b}\bullet\mathbf{w}}{\|\mathbf{w}\|^2}\,\mathbf{w} = \frac{\mathbf{b}\bullet\mathbf{w}}{\mathbf{w}\bullet\mathbf{w}}\,\mathbf{w}.$$

In our case we have

$$\operatorname{proj}_{\mathbf{u}}\mathbf{x} = \frac{\begin{bmatrix}x\\y\end{bmatrix}\bullet\begin{bmatrix}1\\1\end{bmatrix}}{\begin{bmatrix}1\\1\end{bmatrix}\bullet\begin{bmatrix}1\\1\end{bmatrix}}\begin{bmatrix}1\\1\end{bmatrix} = \frac{x+y}{2}\begin{bmatrix}1\\1\end{bmatrix}$$

and

$$\operatorname{proj}_{\mathbf{v}}\mathbf{x} = \frac{\begin{bmatrix}x\\y\end{bmatrix}\bullet\begin{bmatrix}1\\-1\end{bmatrix}}{\begin{bmatrix}1\\-1\end{bmatrix}\bullet\begin{bmatrix}1\\-1\end{bmatrix}}\begin{bmatrix}1\\-1\end{bmatrix} = \frac{x-y}{2}\begin{bmatrix}1\\-1\end{bmatrix}.$$

Consequently,

$$A\begin{bmatrix}x\\y\end{bmatrix} = \alpha\,\frac{x+y}{2}\begin{bmatrix}1\\1\end{bmatrix} + \beta\,\frac{x-y}{2}\begin{bmatrix}1\\-1\end{bmatrix}.$$

The first column of the matrix A is

$$A\begin{bmatrix}1\\0\end{bmatrix} = \frac{\alpha}{2}\begin{bmatrix}1\\1\end{bmatrix} + \frac{\beta}{2}\begin{bmatrix}1\\-1\end{bmatrix} = \begin{bmatrix}\frac{\alpha}{2}+\frac{\beta}{2}\\[2pt]\frac{\alpha}{2}-\frac{\beta}{2}\end{bmatrix}$$

and the second column of the matrix A is

$$A\begin{bmatrix}0\\1\end{bmatrix} = \frac{\alpha}{2}\begin{bmatrix}1\\1\end{bmatrix} - \frac{\beta}{2}\begin{bmatrix}1\\-1\end{bmatrix} = \begin{bmatrix}\frac{\alpha}{2}-\frac{\beta}{2}\\[2pt]\frac{\alpha}{2}+\frac{\beta}{2}\end{bmatrix}.$$

Hence

$$A = \begin{bmatrix}\frac{\alpha}{2}+\frac{\beta}{2} & \frac{\alpha}{2}-\frac{\beta}{2}\\[2pt]\frac{\alpha}{2}-\frac{\beta}{2} & \frac{\alpha}{2}+\frac{\beta}{2}\end{bmatrix}.$$

Solution 2. Using the formula obtained in Theorem 3.3.15 we get the same result, but the calculations are simpler:

$$A = \frac{\alpha}{2}\begin{bmatrix}1\\1\end{bmatrix}\begin{bmatrix}1 & 1\end{bmatrix} + \frac{\beta}{2}\begin{bmatrix}1\\-1\end{bmatrix}\begin{bmatrix}1 & -1\end{bmatrix} = \frac{\alpha}{2}\begin{bmatrix}1 & 1\\1 & 1\end{bmatrix} + \frac{\beta}{2}\begin{bmatrix}1 & -1\\-1 & 1\end{bmatrix} = \begin{bmatrix}\frac{\alpha}{2}+\frac{\beta}{2} & \frac{\alpha}{2}-\frac{\beta}{2}\\[2pt]\frac{\alpha}{2}-\frac{\beta}{2} & \frac{\alpha}{2}+\frac{\beta}{2}\end{bmatrix}.$$

The QR factorization of a 2 × 2 matrix


Now we consider a special factorization of 2 × 2 matrices that turns out to be useful
in various computations, including solving systems of equations. This idea can be
generalized to 3×3 square matrices (see the end of Chapter 7) and to square matrices
of any size (see Core Topics in Linear Algebra). The QR factorization can also be used
for non-square matrices. At the end of Chapter 4 we present QR factorization of 3×2
matrices.

Theorem 3.3.20. A 2 × 2 matrix with linearly independent columns can be written as a product of an orthogonal matrix and an upper triangular 2 × 2 matrix with positive entries on the main diagonal.

More precisely, if the columns of the 2 × 2 matrix $A = \begin{bmatrix}\mathbf{c}_1 & \mathbf{c}_2\end{bmatrix}$ are linearly independent, then A can be represented in the form

$$A = QR,$$

where Q is a 2 × 2 orthogonal matrix and $R = \begin{bmatrix}r_{1,1} & r_{1,2}\\0 & r_{2,2}\end{bmatrix}$ with $r_{1,1} > 0$ and $r_{2,2} > 0$.

Proof. Let $A = \begin{bmatrix}\mathbf{c}_1 & \mathbf{c}_2\end{bmatrix}$ be a 2 × 2 matrix with linearly independent columns. First we define

$$\mathbf{v}_1 = \mathbf{c}_1 \qquad\text{and}\qquad \mathbf{v}_2 = \mathbf{c}_2 - \operatorname{proj}_{\mathbf{v}_1}\mathbf{c}_2 = \mathbf{c}_2 - \frac{\mathbf{c}_2\bullet\mathbf{v}_1}{\mathbf{v}_1\bullet\mathbf{v}_1}\,\mathbf{v}_1.$$

We note that v₂ is nonzero (because the vectors c₁, c₂ are linearly independent), the vectors v₁ and v₂ are orthogonal, and we have

$$\mathbf{c}_2 = \mathbf{v}_2 + \frac{\mathbf{c}_2\bullet\mathbf{v}_1}{\mathbf{v}_1\bullet\mathbf{v}_1}\,\mathbf{v}_1.$$

Next we normalize the vectors v₁ and v₂:

$$\mathbf{u}_1 = \frac{1}{\|\mathbf{v}_1\|}\,\mathbf{v}_1 \qquad\text{and}\qquad \mathbf{u}_2 = \frac{1}{\|\mathbf{v}_2\|}\,\mathbf{v}_2.$$

If we denote

$$r_{1,1} = \|\mathbf{v}_1\|, \qquad r_{1,2} = \|\mathbf{v}_1\|\,\frac{\mathbf{c}_2\bullet\mathbf{v}_1}{\mathbf{v}_1\bullet\mathbf{v}_1}, \qquad r_{2,2} = \|\mathbf{v}_2\|,$$

then we have

$$\mathbf{c}_1 = r_{1,1}\mathbf{u}_1 \qquad\text{and}\qquad \mathbf{c}_2 = r_{1,2}\mathbf{u}_1 + r_{2,2}\mathbf{u}_2,$$

and consequently

$$\begin{bmatrix}\mathbf{c}_1 & \mathbf{c}_2\end{bmatrix} = \begin{bmatrix}\mathbf{u}_1 & \mathbf{u}_2\end{bmatrix}\begin{bmatrix}r_{1,1} & r_{1,2}\\0 & r_{2,2}\end{bmatrix}.$$

Note that $r_{1,1} > 0$ and $r_{2,2} > 0$.

Example 3.3.21. Find the QR factorization of the matrix $A = \begin{bmatrix}2 & 3\\4 & 1\end{bmatrix}$.

Solution. We follow the construction used in the proof of Theorem 3.3.20. Since

$$\begin{bmatrix}3\\1\end{bmatrix} - \frac{\begin{bmatrix}3\\1\end{bmatrix}\bullet\begin{bmatrix}2\\4\end{bmatrix}}{\begin{bmatrix}2\\4\end{bmatrix}\bullet\begin{bmatrix}2\\4\end{bmatrix}}\begin{bmatrix}2\\4\end{bmatrix} = \begin{bmatrix}3\\1\end{bmatrix} - \frac{1}{2}\begin{bmatrix}2\\4\end{bmatrix} = \begin{bmatrix}2\\-1\end{bmatrix},$$

we have

$$\begin{bmatrix}3\\1\end{bmatrix} = \frac{1}{2}\begin{bmatrix}2\\4\end{bmatrix} + \begin{bmatrix}2\\-1\end{bmatrix}.$$

Next we calculate the norms

$$\left\|\begin{bmatrix}2\\4\end{bmatrix}\right\| = 2\sqrt5 \qquad\text{and}\qquad \left\|\begin{bmatrix}2\\-1\end{bmatrix}\right\| = \sqrt5$$

and let

$$\mathbf{u}_1 = \frac{1}{2\sqrt5}\begin{bmatrix}2\\4\end{bmatrix} = \frac{1}{\sqrt5}\begin{bmatrix}1\\2\end{bmatrix} \qquad\text{and}\qquad \mathbf{u}_2 = \frac{1}{\sqrt5}\begin{bmatrix}2\\-1\end{bmatrix}.$$

Then we have

$$\begin{bmatrix}2\\4\end{bmatrix} = 2\sqrt5\,\mathbf{u}_1 \qquad\text{and}\qquad \begin{bmatrix}3\\1\end{bmatrix} = \sqrt5\,\mathbf{u}_1 + \sqrt5\,\mathbf{u}_2.$$

Consequently,

$$A = \begin{bmatrix}\mathbf{u}_1 & \mathbf{u}_2\end{bmatrix}\begin{bmatrix}2\sqrt5 & \sqrt5\\0 & \sqrt5\end{bmatrix} = \begin{bmatrix}\frac{1}{\sqrt5} & \frac{2}{\sqrt5}\\[2pt]\frac{2}{\sqrt5} & -\frac{1}{\sqrt5}\end{bmatrix}\begin{bmatrix}2\sqrt5 & \sqrt5\\0 & \sqrt5\end{bmatrix}.$$

Example 3.3.22. Find the QR factorization of the matrix $A = \begin{bmatrix} 2 & 1 \\ 1 & 5 \end{bmatrix}$.

Solution. Since
$$\begin{bmatrix} 1 \\ 5 \end{bmatrix} - \frac{\begin{bmatrix} 1 \\ 5 \end{bmatrix} \bullet \begin{bmatrix} 2 \\ 1 \end{bmatrix}}{\begin{bmatrix} 2 \\ 1 \end{bmatrix} \bullet \begin{bmatrix} 2 \\ 1 \end{bmatrix}} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \end{bmatrix} - \frac{7}{5} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = -\frac{9}{5} \begin{bmatrix} 1 \\ -2 \end{bmatrix},$$
we have
$$\begin{bmatrix} 1 \\ 5 \end{bmatrix} = \frac{7}{5} \begin{bmatrix} 2 \\ 1 \end{bmatrix} - \frac{9}{5} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \frac{7}{5} \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \frac{9}{5} \begin{bmatrix} -1 \\ 2 \end{bmatrix}.$$
In the above sum we changed $\begin{bmatrix} 1 \\ -2 \end{bmatrix}$ to $\begin{bmatrix} -1 \\ 2 \end{bmatrix}$ because the coefficient in front of the
vector u2 must be positive.
Next we calculate the norms
$$\left\| \begin{bmatrix} 2 \\ 1 \end{bmatrix} \right\| = \sqrt{5} \qquad\text{and}\qquad \left\| \begin{bmatrix} -1 \\ 2 \end{bmatrix} \right\| = \sqrt{5}$$
and define
$$u_1 = \frac{1}{\sqrt{5}} \begin{bmatrix} 2 \\ 1 \end{bmatrix} \qquad\text{and}\qquad u_2 = \frac{1}{\sqrt{5}} \begin{bmatrix} -1 \\ 2 \end{bmatrix}.$$
Consequently,
$$\begin{bmatrix} 2 \\ 1 \end{bmatrix} = \sqrt{5}\, u_1 \qquad\text{and}\qquad \begin{bmatrix} 1 \\ 5 \end{bmatrix} = \frac{7}{5}\sqrt{5}\, u_1 + \frac{9}{5}\sqrt{5}\, u_2.$$
Now it is easy to obtain the QR factorization of the matrix A:
$$A = \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} \sqrt{5} & \frac{7}{5}\sqrt{5} \\ 0 & \frac{9}{5}\sqrt{5} \end{bmatrix} = \begin{bmatrix} \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{bmatrix} \begin{bmatrix} \sqrt{5} & \frac{7}{5}\sqrt{5} \\ 0 & \frac{9}{5}\sqrt{5} \end{bmatrix}.$$

3.3.1 Exercises

1. A matrix A has $\begin{bmatrix} 3 \\ 2 \end{bmatrix}$ as an eigenvector corresponding to the eigenvalue λ = 2 and
$\begin{bmatrix} -2 \\ 3 \end{bmatrix}$ as an eigenvector corresponding to the eigenvalue λ = 5. Explain why A is
a symmetric matrix.

2. A matrix A has $\begin{bmatrix} a \\ b \end{bmatrix}$ as an eigenvector corresponding to the eigenvalue α and
$\begin{bmatrix} -b \\ a \end{bmatrix}$ as an eigenvector corresponding to the eigenvalue β. Explain why A is a
symmetric matrix.

For the given symmetric matrix A find matrices D and P such that D is a diagonal
matrix, P is invertible, $P^{-1} = P^T$, and $A = PDP^T$.

3. $A = \begin{bmatrix} 8 & 2 \\ 2 & 5 \end{bmatrix}$

4. $A = \begin{bmatrix} 5 & 2 \\ 2 & 5 \end{bmatrix}$

5. $A = \begin{bmatrix} 9 & 8 \\ 8 & -3 \end{bmatrix}$

6. $A = \begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix}$

7. $A = \begin{bmatrix} 27 & 5 \\ 5 & 3 \end{bmatrix}$

8. $A = \begin{bmatrix} 7 & 20 \\ 20 & 82 \end{bmatrix}$

9. $A = \begin{bmatrix} a+2 & a \\ a & a+2 \end{bmatrix}$

10. $A = \begin{bmatrix} a & a-k \\ a-k & a \end{bmatrix}$

Determine the spectral decomposition of the matrix A.

11. $A = \begin{bmatrix} 5 & 2 \\ 2 & 5 \end{bmatrix}$

12. $A = \begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix}$

13. $A = \begin{bmatrix} 8 & 2 \\ 2 & 5 \end{bmatrix}$

14. $A = \begin{bmatrix} 9 & 8 \\ 8 & -3 \end{bmatrix}$
15. Find a symmetric 2 × 2 matrix A with eigenvalues 2 and −1 such that $u = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$
is an eigenvector of A corresponding to the eigenvalue 2.

16. Find a 2 × 2 matrix A with eigenvalues 5 and 2 such that $u = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$ is an
eigenvector of A corresponding to the eigenvalue 5.

17. Let α and β be two real numbers. Find a 2×2 matrix A with eigenvalues α and
β such that $u = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$ is an eigenvector of A corresponding to the eigenvalue α
and $v = \begin{bmatrix} -1 \\ 2 \end{bmatrix}$ is an eigenvector of A corresponding to the eigenvalue β.

18. Let α and β be two real numbers. Find a 2×2 matrix A with eigenvalues α and
β such that $u = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$ is an eigenvector of A corresponding to the eigenvalue α
and $v = \begin{bmatrix} 4 \\ -3 \end{bmatrix}$ is an eigenvector of A corresponding to the eigenvalue β.

Orthogonally diagonalize the projection matrix on the given vector line.

19. $\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}$

20. $\mathrm{Span}\left\{ \begin{bmatrix} 2 \\ -1 \end{bmatrix} \right\}$

21. $\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 3 \end{bmatrix} \right\}$

22. $\mathrm{Span}\left\{ \begin{bmatrix} -4 \\ 1 \end{bmatrix} \right\}$

Orthogonally diagonalize the reflection matrix across the given vector line.

23. $\mathrm{Span}\left\{ \begin{bmatrix} 4 \\ 1 \end{bmatrix} \right\}$

24. $\mathrm{Span}\left\{ \begin{bmatrix} -3 \\ 1 \end{bmatrix} \right\}$
25. Find the eigenvalues and eigenvectors of the projection matrix on $\mathrm{Span}\left\{ \begin{bmatrix} a \\ b \end{bmatrix} \right\}$
where $\begin{bmatrix} a \\ b \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix}$.

26. Suppose A is a symmetric 2 × 2 matrix with two different eigenvalues α and
β and such that u is an eigenvector corresponding to the eigenvalue α. Show
that w ≠ 0 is an eigenvector corresponding to the eigenvalue β if and only if w
is in $\mathrm{Span}\{u^{\perp}\}$.

27. If A is a symmetric 2 × 2 matrix, show that (Au) • v = u • (Av) for every u and v in
R2 .
28. Let α and β be two real numbers. Find a matrix A such that $u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
is an eigenvector of A corresponding to the eigenvalue α and $v = \begin{bmatrix} u_2 \\ -u_1 \end{bmatrix}$ is an
eigenvector of A corresponding to the eigenvalue β.

Find the QR factorization of the given matrix.


29. $\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$

30. $\begin{bmatrix} 3 & -1 \\ 1 & 0 \end{bmatrix}$

31. Find the spectral decomposition of the matrix $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.

Chapter 4

The vector space R3

In Chapter 3 we interpreted elements of R2 as 2 × 1 matrices and used tools of linear


algebra to solve problems in R2 . In this chapter we consider the space R3 . We will
see that some operations introduced in R2 generalize easily to R3 and have similar
properties. On the other hand, there are some essential differences between R2 and
R3 .
Following the convention adopted in Chapter 3, elements of R3 will be denoted
as 3 × 1 matrices $\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$. In order to interpret algebraic operations in R3
geometrically, we identify elements of R3 with points in the space. This identification is done
by applying the same idea that was used in the construction of Cartesian coordi-
nates R2 . We introduce in the space three mutually perpendicular lines, called the
x-axis, y-axis, and z-axis, that intersect at one point called the origin. The origin is
 
identified with the vector $\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ and denoted by 0. On each axis we choose a unit of
length (usually the same on all three axes). Then every point in the space can be de-
scribed by a unique triple of numbers $u_1$, $u_2$, and $u_3$ as illustrated in Fig. 4.1. These
numbers are referred to as the Cartesian coordinates of the point.
 
Note that we use the same name “origin” and the same symbol 0 to denote
$\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$. This slight inconsistency will not cause any problems. In the case
of possible ambiguity we will write $\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ or $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$.
Depending on the context, we will say “an element of R3 ”, “a 3×1 matrix”, “a point
in R3 ”, or “a vector in R3 ”. We will consider these expressions equivalent. Sometimes

179

Figure 4.1: Cartesian coordinates in R3.

we will denote an element of R3 by $(u_1, u_2, u_3)$. We note that
$$\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = (u_1, u_2, u_3) \neq \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}.$$
In other words, $(u_1, u_2, u_3)$ is not the matrix $\begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}$.

4.1 Vectors in R3

Algebraic operations and the vector line in R3

Addition of vectors and multiplication of vectors by numbers are defined as for ar-
bitrary matrices, that is:

         
u1 v1 u1 + v 1 u1 t u1
u 2  + v 2  = u 2 + v 2  and t u 2  = t u 2  .
u3 v3 u3 + v 3 u3 t u3

These operations are natural extensions of the corresponding operations in R2


and have the same properties.

Figure 4.2: Addition and multiplication by a number in R3.

Definition 4.1.1. Let u be a vector in R3 . The set of all vectors of the form t u,
where t is an arbitrary real number, is called the vector subspace spanned by
u and is denoted by Span{u}. That is,

Span{u} = {t u : t in R} .

If u is different from the origin, then Span{u} will be called a vector line.

Note that the definitions of Span{u} and vector lines in R3 are identical with the
definitions in R2 . As before, we use the convention that when we say “vector line
Span{u}” we always implicitly assume that u is different from the origin.
Theorem 3.1.6 in Chapter 3 formulated for vectors in R2 is true for vectors in R3
and we can use the same proof.

Theorem 4.1.2. For any two vectors u ≠ 0 and v ≠ 0 in R3 the following con-
ditions are equivalent:

(a) Span{u} = Span{v};

(b) v = xu for some real number x ≠ 0.

Definition 4.1.3. Let u be a vector in R3 . If v is a vector in Span{u} different


from the origin, then {v} is a basis of Span{u}.

Linearly dependent and independent vectors in R3


Definition 3.1.8 of linear dependence of two vectors in R2 makes perfect sense in R3 .

Definition 4.1.4. The vectors u and v from R3 are linearly dependent if at


least one of the following conditions is true

(a) the vector u is in Span{v};

(b) the vector v is in Span{u}.

As in the case of vectors in R2 , vectors u and v in R3 are linearly dependent if u,


v, and 0 are on the same line.

Figure 4.3: Linearly dependent vectors.

The next two theorems are identical with the theorems formulated for vectors in
R2 in Chapter 3 (Theorems 3.1.13 and 3.1.14). Moreover, the proofs presented there
work in R3 without any modifications.

Theorem 4.1.5. Vectors u and v in R3 are linearly dependent if and only if one
of the following conditions holds:

(a) u = 0 or

(b) u ≠ 0 and we have v = xu for a real number x.

Theorem 4.1.6. For any two vectors u ≠ 0 and v ≠ 0 in R3 the following con-
ditions are equivalent:

(a) The vectors u and v are linearly dependent;

(b) Span{u} = Span{v};

(c) v = xu for some real number x ≠ 0.



   
Example 4.1.7. The vectors $\begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}$ and $\begin{bmatrix} -2 \\ -6 \\ 4 \end{bmatrix}$ are linearly dependent since
$$\begin{bmatrix} -2 \\ -6 \\ 4 \end{bmatrix} = -2 \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}.$$

The following theorem, while very similar to Theorem 3.1.15, is not identical:
condition (c) looks different. Like Theorem 3.1.15, this theorem describes practical
methods for verifying linear dependence.

   
Theorem 4.1.8. Let $u = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$ and $v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$. The following conditions are
equivalent:

(a) The vectors u and v are linearly dependent;

(b) The equation
$$x \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} + y \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \tag{4.1}$$
has a nontrivial solution, that is, a solution different from the trivial
solution x = y = 0;

(c)
$$\det \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix} = \det \begin{bmatrix} u_2 & v_2 \\ u_3 & v_3 \end{bmatrix} = \det \begin{bmatrix} u_1 & v_1 \\ u_3 & v_3 \end{bmatrix} = 0. \tag{4.2}$$

Proof. The ideas used in the proof of Theorem 3.1.15 still work, but their implemen-
tation requires some modification.
First assume that u and v are linearly dependent. If u is in Span{v}, then u = av
for some real number a. Then 1 · u − av = 0 and thus x = 1 and y = −a is a nontrivial
solution of the equation (4.1). The case when v is in Span{u} can be treated in a
similar way. Therefore (a) implies (b).
Now assume that the equation (4.1) has a nontrivial solution. We can suppose
that x ≠ 0. Then we have
$$\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = -\frac{y}{x} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$$

or, equivalently,
$$u_1 = -\frac{y}{x} v_1, \qquad u_2 = -\frac{y}{x} v_2, \qquad\text{and}\qquad u_3 = -\frac{y}{x} v_3.$$
Hence
$$\det \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix} = \det \begin{bmatrix} -\frac{y}{x} v_1 & v_1 \\ -\frac{y}{x} v_2 & v_2 \end{bmatrix} = -\frac{y}{x} v_1 v_2 + \frac{y}{x} v_2 v_1 = 0,$$
$$\det \begin{bmatrix} u_2 & v_2 \\ u_3 & v_3 \end{bmatrix} = \det \begin{bmatrix} -\frac{y}{x} v_2 & v_2 \\ -\frac{y}{x} v_3 & v_3 \end{bmatrix} = -\frac{y}{x} v_2 v_3 + \frac{y}{x} v_3 v_2 = 0,$$
$$\det \begin{bmatrix} u_1 & v_1 \\ u_3 & v_3 \end{bmatrix} = \det \begin{bmatrix} -\frac{y}{x} v_1 & v_1 \\ -\frac{y}{x} v_3 & v_3 \end{bmatrix} = -\frac{y}{x} v_1 v_3 + \frac{y}{x} v_3 v_1 = 0.$$

The argument is similar if we have y ≠ 0 instead of x ≠ 0. Therefore (b) implies (c).


Finally assume that (4.2) holds. If u = 0, then (a) is trivially true. If u ≠ 0, then
at least one of the numbers u1, u2, u3 must be different from 0. Without loss of
generality we can assume that u1 ≠ 0. Then
$$v_1 = \frac{v_1}{u_1} u_1, \qquad v_2 = \frac{v_1}{u_1} u_2, \qquad\text{and}\qquad v_3 = \frac{v_1}{u_1} u_3$$
and, consequently,
$$\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \frac{v_1}{u_1} \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}.$$
This means that v is in Span{u} and consequently u and v are linearly dependent. If
u2 ≠ 0 or u3 ≠ 0 instead, we modify the proof accordingly. Therefore (c) implies
(a).

   
Example 4.1.9. In Example 4.1.7 we found that the vectors $\begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}$ and $\begin{bmatrix} -2 \\ -6 \\ 4 \end{bmatrix}$ are
linearly dependent. According to the above theorem the same conclusion can be
derived from the following calculations:
$$\det \begin{bmatrix} 1 & -2 \\ 3 & -6 \end{bmatrix} = 1 \cdot (-6) - (-2) \cdot 3 = 0,$$
$$\det \begin{bmatrix} 3 & -6 \\ -2 & 4 \end{bmatrix} = 3 \cdot 4 - (-6) \cdot (-2) = 0,$$
$$\det \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} = 1 \cdot 4 - (-2) \cdot (-2) = 0.$$

  
Example 4.1.10. The vectors $\begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}$ and $\begin{bmatrix} -2 \\ -5 \\ 4 \end{bmatrix}$ are not linearly dependent because
$$\det \begin{bmatrix} 1 & -2 \\ 3 & -5 \end{bmatrix} = 1 \cdot (-5) - (-2) \cdot 3 = 1 \neq 0.$$
There is no need to calculate $\det \begin{bmatrix} 3 & -5 \\ -2 & 4 \end{bmatrix}$ and $\det \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix}$.
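The determinant test of Theorem 4.1.8 is easy to automate. Here is a short sketch in plain Python (the function name is ours, not the book's) that checks the three 2×2 determinants for a pair of vectors in R3.

```python
def linearly_dependent(u, v):
    """Return True when the vectors u and v in R^3 are linearly dependent,
    using condition (c) of Theorem 4.1.8: all three 2x2 determinants
    formed from pairs of coordinates must vanish."""
    pairs = [(0, 1), (1, 2), (0, 2)]
    return all(u[i] * v[j] - u[j] * v[i] == 0 for i, j in pairs)

print(linearly_dependent([1, 3, -2], [-2, -6, 4]))   # Example 4.1.7: True
print(linearly_dependent([1, 3, -2], [-2, -5, 4]))   # Example 4.1.10: False
```

Note that the code must check all three determinants before declaring dependence, while a single nonzero determinant already proves independence, exactly as remarked in Example 4.1.10.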

Definition 4.1.11. If the vectors u and v are not linearly dependent, we say
that they are linearly independent. In other words, the vectors u and v are
linearly independent if u is not in Span{v} and v is not in Span{u}.

Figure 4.4: Linearly independent vectors.

Intuitively, vectors u and v are linearly independent if the only common point of
the vector lines Span{u} and Span{v} is the origin 0, see Fig. 4.4.

The following theorem characterizing linearly independent vectors is a direct


consequence of Theorem 4.1.8.

   
Theorem 4.1.12. Let $u = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$ and $v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$. The following conditions are
equivalent:

(a) The vectors u and v are linearly independent;

(b) The equation
$$x \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} + y \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
has only the trivial solution, that is, x = y = 0;

(c) At least one of the numbers
$$\det \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix}, \qquad \det \begin{bmatrix} u_2 & v_2 \\ u_3 & v_3 \end{bmatrix}, \qquad \det \begin{bmatrix} u_1 & v_1 \\ u_3 & v_3 \end{bmatrix}$$
is different from 0.

The vector plane


Sets of all vectors in R3 that can be written as su + t v for some vectors u and v and
arbitrary numbers s and t play an important role in linear algebra. They are a natural
extension of Span{u} defined as the set of all vectors of the form t u, where t is an
arbitrary real number.

Definition 4.1.13. Let u and v be two vectors in R3 . The set of all vectors in
R3 of the form
su + t v,
where s and t are arbitrary real numbers, is called the vector subspace
spanned by u and v and is denoted by Span{u, v}. That is,

Span{u, v} = {su + t v : s, t in R} .

If the vectors u and v are linearly independent, then the vector subspace
Span{u, v} is called the vector plane spanned by the vectors u and v.

Note that every vector plane is a vector subspace, but not every vector subspace
is a vector plane.

Figure 4.5: The vector plane spanned by vectors u and v.

   
Example 4.1.14. Since the vectors $\begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$ are linearly independent,
$$\mathrm{Span}\left\{ \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \right\}$$
is a vector plane. On the other hand,
$$\mathrm{Span}\left\{ \begin{bmatrix} 2 \\ -2 \\ 4 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ -2 \end{bmatrix} \right\}$$
is a vector subspace that is not a vector plane, because the vectors $\begin{bmatrix} 2 \\ -2 \\ 4 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -2 \end{bmatrix}$
are linearly dependent. Actually it is easy to verify that
$$\mathrm{Span}\left\{ \begin{bmatrix} 2 \\ -2 \\ 4 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ -2 \end{bmatrix} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} 2 \\ -2 \\ 4 \end{bmatrix} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} -1 \\ 1 \\ -2 \end{bmatrix} \right\}.$$

As in the case of vector lines we adopt the convention that when we say “a vector
plane Span{u, v},” we implicitly assume that u and v are linearly independent. When
u and v are linearly dependent, then Span{u, v} is a vector line provided u ≠ 0 or
v ≠ 0. If u = v = 0, then Span{u, v} = {0}.
The same vector subspace can be spanned by different pairs of vectors. How can
we check that two pairs of vectors span the same vector subspace? We address this
question in the remainder of this section.

Theorem 4.1.15. Let a, b, u, v be vectors in R3 . The following two conditions


are equivalent

(a) Span{a, b} = Span{u, v};

(b) a, b are elements of Span{u, v} and u, v are elements of Span{a, b}.

Proof. If Span{a, b} = Span{u, v}, then clearly a, b are elements of Span{u, v} and u, v
are elements of Span{a, b}. This shows that (a) implies (b).
Now, if a, b are elements of Span{u, v}, then there are real numbers q, r, s, t such
that
a = qu + r v and b = su + t v
and an arbitrary element from Span{a, b} can be written in the form

xa + yb = x(qu + rv) + y(su + tv) = (xq + ys)u + (xr + yt)v.

Consequently, an arbitrary element from Span{a, b} is in Span{u, v}. Similarly, if u, v


are elements of Span{a, b}, then there are real numbers e, f , g , h such that

u = ea + f b and v = g a + hb

and an arbitrary element from Span{u, v} can be written in the form

αu + βv = α(ea + f b) + β(g a + hb) = (αe + βg )a + (α f + βh)b.

Consequently, an arbitrary element from Span{u, v} is in Span{a, b}. We can thus


conclude that Span{a, b} = Span{u, v}. This shows that (b) implies (a).

Example 4.1.16. Show that
$$\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} -1 \\ 7 \\ 0 \end{bmatrix}, \begin{bmatrix} 3 \\ 0 \\ 7 \end{bmatrix} \right\}.$$

Solution. We use Theorem 4.1.15. Since
$$\begin{bmatrix} -1 \\ 7 \\ 0 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} - 3 \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \qquad\text{and}\qquad \begin{bmatrix} 3 \\ 0 \\ 7 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + 2 \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix},$$
the vectors $\begin{bmatrix} -1 \\ 7 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 3 \\ 0 \\ 7 \end{bmatrix}$ are elements of the vector plane $\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \right\}$.
 

Moreover, since
$$\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \frac{2}{7} \begin{bmatrix} -1 \\ 7 \\ 0 \end{bmatrix} + \frac{3}{7} \begin{bmatrix} 3 \\ 0 \\ 7 \end{bmatrix} \qquad\text{and}\qquad \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = -\frac{1}{7} \begin{bmatrix} -1 \\ 7 \\ 0 \end{bmatrix} + \frac{2}{7} \begin{bmatrix} 3 \\ 0 \\ 7 \end{bmatrix},$$
the vectors $\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$ are elements of the vector plane $\mathrm{Span}\left\{ \begin{bmatrix} -1 \\ 7 \\ 0 \end{bmatrix}, \begin{bmatrix} 3 \\ 0 \\ 7 \end{bmatrix} \right\}$.
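The membership checks used in Example 4.1.16 can be mechanized: a vector c lies in Span{u, v} exactly when the system xu + yv = c is solvable, which we can test by solving two of the three scalar equations and verifying the third. A sketch in plain Python (function name and helper are ours):

```python
def in_span(c, u, v):
    """Check whether c in R^3 lies in Span{u, v}, where u and v are
    linearly independent: solve x*u + y*v = c from two coordinate rows
    with a nonzero 2x2 determinant, then test the remaining row."""
    for i in range(3):
        for j in range(i + 1, 3):
            det = u[i] * v[j] - u[j] * v[i]
            if det != 0:
                x = (c[i] * v[j] - c[j] * v[i]) / det
                y = (u[i] * c[j] - u[j] * c[i]) / det
                return all(abs(x * u[k] + y * v[k] - c[k]) < 1e-9 for k in range(3))
    raise ValueError("u and v are linearly dependent")

# Each vector of one pair lies in the span of the other pair, as in
# Example 4.1.16, so the two spans are equal by Theorem 4.1.15:
pair1 = ([1, 2, 3], [1, -1, 2])
pair2 = ([-1, 7, 0], [3, 0, 7])
print(all(in_span(c, *pair1) for c in pair2) and
      all(in_span(c, *pair2) for c in pair1))   # True
```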
 

The following results are consequences of Theorem 4.1.15.

Theorem 4.1.17. Let u and v be vectors in R3 .

(a) For any real numbers s and t such that s ≠ 0 and t ≠ 0 we have

Span{u, v} = Span{su, t v};

(b) For any real numbers s and t we have

Span{u, v} = Span{u + sv, v} = Span{u, v + t u}.

Example 4.1.18. Let
$$u = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} \qquad\text{and}\qquad v = \begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix}.$$
Since
$$2u + 5v = \begin{bmatrix} -3 \\ -1 \\ 8 \end{bmatrix},$$
we have
$$\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}, \begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} 2 \\ 4 \\ -2 \end{bmatrix}, \begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} -3 \\ -1 \\ 8 \end{bmatrix}, \begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix} \right\}.$$
  

Bases of vector planes in R3


Vector planes in R3 have many properties that are similar to R2 . For example, every
vector plane in R3 has a basis consisting of two vectors.

Definition 4.1.19. Let u and v be linearly independent vectors. A pair of


vectors {a, b} in Span{u, v} is called a basis of the vector plane Span{u, v} if the
vectors satisfy the following two conditions:

1. a and b are linearly independent;

2. Span{a, b} = Span{u, v}.

An expression of the form xa + yb is called a linear combination of vectors a and


b. According to the above definition, linearly independent vectors a and b form a
basis of the vector plane Span{u, v}, if every vector in Span{u, v} can be written as a
linear combination of vectors a and b.
Clearly, {u, v} is a basis of the vector plane Span{u, v}.

Theorem 4.1.20. Let a and b be linearly independent vectors and let c be an


arbitrary vector in the vector plane Span{a, b}. Then the real numbers x and y
such that
c = xa + yb
are uniquely determined by the vector c.

Proof. Suppose that

c = x 1 a + y 1 b and c = x 2 a + y 2 b.

Then
(x 1 − x 2 )a + (y 1 − y 2 )b = 0.

Since vectors a and b are linearly independent, we must have x 1 − x 2 = 0 and


y 1 − y 2 = 0, by Theorem 4.1.12. Consequently, for every c in Span{a, b} there is only
one pair of numbers x and y such that c = xa + yb.

Definition 4.1.21. Let c be an arbitrary vector in Span{a, b}. The unique real
numbers x and y such that c = xa + yb are called the coordinates of c in the
basis {a, b}.
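By Theorem 4.1.20 these coordinates are unique, so they can be computed by solving two of the three scalar equations of c = xa + yb and verifying the third. The following sketch in plain Python (our own helper, not the book's notation) does exactly that:

```python
def coordinates(c, a, b):
    """Coordinates (x, y) of c in the basis {a, b} of the vector plane
    Span{a, b}, where a, b, c are vectors in R^3 given as lists.
    Solves x*a + y*b = c from two coordinate rows with a nonzero
    2x2 determinant, then verifies the remaining row."""
    for i in range(3):
        for j in range(i + 1, 3):
            det = a[i] * b[j] - a[j] * b[i]
            if det != 0:
                x = (c[i] * b[j] - c[j] * b[i]) / det
                y = (a[i] * c[j] - a[j] * c[i]) / det
                # c must satisfy the remaining equation as well,
                # otherwise c is not in Span{a, b}
                assert all(abs(x * a[k] + y * b[k] - c[k]) < 1e-9 for k in range(3))
                return x, y
    raise ValueError("a and b are linearly dependent")
```

For instance, `coordinates([3, 1, 2], [1, 1, 1], [1, -1, 0])` returns `(2.0, 1.0)`, the coordinates appearing later in Example 4.1.27.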

If {a, b} is a basis of the vector plane Span{u, v}, then Span{u, v} = Span{a, b}. In
the next theorem we show that any two linearly independent vectors from a vector
plane span that vector plane. It also shows that, if a vector subspace Span{u, v} con-
tains two linearly independent vectors, then that vector subspace must be a vector
plane.

Theorem 4.1.22. If a and b are two linearly independent elements of a vector


subspace Span{u, v}, then

Span{a, b} = Span{u, v}

and the vectors u and v are linearly independent.

Proof. First we prove that Span{a, b} = Span{u, v}. Let
$$a = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}, \qquad u = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}, \qquad v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}.$$
If a, b are in Span{u, v}, then
$$\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = q \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} + r \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \qquad\text{and}\qquad \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = s \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} + t \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix},$$
for some real numbers q, r, s, and t. These two equations can be written as a matrix
equation:
$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \\ u_3 & v_3 \end{bmatrix} \begin{bmatrix} q & s \\ r & t \end{bmatrix}.$$
Consequently, if $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ is an arbitrary vector, then
$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \\ u_3 & v_3 \end{bmatrix} \begin{bmatrix} q & s \\ r & t \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.$$
If $\begin{bmatrix} q & s \\ r & t \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, then we have $\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$, which can be written as
$$x_1 \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} + x_2 \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
Consequently, $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, because the vectors a and b are linearly
independent. Thus the matrix $\begin{bmatrix} q & s \\ r & t \end{bmatrix}$ is invertible and we have
$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix} \begin{bmatrix} q & s \\ r & t \end{bmatrix}^{-1} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \\ u_3 & v_3 \end{bmatrix} \begin{bmatrix} q & s \\ r & t \end{bmatrix} \begin{bmatrix} q & s \\ r & t \end{bmatrix}^{-1} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \\ u_3 & v_3 \end{bmatrix}.$$

If we let
$$\begin{bmatrix} q & s \\ r & t \end{bmatrix}^{-1} = \begin{bmatrix} e & g \\ f & h \end{bmatrix},$$
then we have
$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix} \begin{bmatrix} e & g \\ f & h \end{bmatrix} = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \\ u_3 & v_3 \end{bmatrix},$$
which means that
$$u = ea + f b \qquad\text{and}\qquad v = g a + hb.$$
Now the equality Span{a, b} = Span{u, v} follows by Theorem 4.1.15.
Next we prove that the vectors u and v are linearly independent. The equation
$$x_1 u + x_2 v = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
is equivalent to the equation
$$\begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} q & s \\ r & t \end{bmatrix}^{-1} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
Since the vectors a and b are linearly independent we get
$$\begin{bmatrix} q & s \\ r & t \end{bmatrix}^{-1} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
Now we multiply both sides by $\begin{bmatrix} q & s \\ r & t \end{bmatrix}$ and obtain
$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} q & s \\ r & t \end{bmatrix} \begin{bmatrix} q & s \\ r & t \end{bmatrix}^{-1} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} q & s \\ r & t \end{bmatrix} \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
This means that the vectors u and v are linearly independent.

Note that Theorem 4.1.22 implies that the two conditions in Definition 4.1.19
are equivalent.

Theorem 4.1.23. Let a and b be vectors in the vector plane Span{u, v}. The
following two conditions are equivalent:

(a) a and b are linearly independent;

(b) Span{a, b} = Span{u, v}.



Proof. If the vectors a and b are linearly independent in the vector plane Span{u, v}
then we have
Span{a, b} = Span{u, v},

by Theorem 4.1.22.
If Span{u, v} is a vector plane, then the vectors u and v are linearly independent.
Consequently, if Span{a, b} = Span{u, v}, then the vectors a and b are linearly inde-
pendent, again by Theorem 4.1.22.

Example 4.1.24. Let
$$u = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \qquad\text{and}\qquad v = \begin{bmatrix} -1 \\ 0 \\ -2 \end{bmatrix}.$$
Consider
$$a = 2u - v = 2 \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} - \begin{bmatrix} -1 \\ 0 \\ -2 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \\ 8 \end{bmatrix}$$
and
$$b = 3u + 5v = 3 \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + 5 \begin{bmatrix} -1 \\ 0 \\ -2 \end{bmatrix} = \begin{bmatrix} -2 \\ 6 \\ -1 \end{bmatrix}.$$
Show that {a, b} is a basis of Span{u, v}.

Solution. Clearly, a and b are vectors in Span{u, v}. Moreover, since
$$\det \begin{bmatrix} 3 & -2 \\ 4 & 6 \end{bmatrix} = 26 \neq 0,$$
a and b are linearly independent, by Theorem 4.1.12. Consequently,
$$\mathrm{Span}\{a, b\} = \mathrm{Span}\{u, v\},$$
by Theorem 4.1.22, and {a, b} is a basis of Span{u, v}.

It is often necessary to switch from one basis of a vector plane to another basis
of that vector plane. This transition between two bases is called a change of basis.
It can be conveniently described by a matrix. If {a, b} and {u, v} are two bases of a
vector plane, since a and b are elements of Span{u, v}, there are real numbers q, r , s,
and t such that
a = qu + r v and b = su + t v.

But this means that
$$\begin{bmatrix} a & b \end{bmatrix} = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} q & s \\ r & t \end{bmatrix}.$$

Definition 4.1.25. Let {a, b} and {u, v} be two bases of a vector plane. Then
the matrix $\begin{bmatrix} q & s \\ r & t \end{bmatrix}$ such that
$$\begin{bmatrix} a & b \end{bmatrix} = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} q & s \\ r & t \end{bmatrix}$$
is called the transition matrix from the basis {a, b} to the basis {u, v}.

The next theorem shows that the transition matrix allows us to easily calculate
the coordinates of any vector in the basis {u, v} if we know its coordinates in the basis
{a, b}.

Theorem 4.1.26. Let {a, b} and {u, v} be two bases of a vector plane and let
$\begin{bmatrix} q & s \\ r & t \end{bmatrix}$ be the transition matrix from the basis {a, b} to the basis {u, v}. If

c = xa + yb,

then

$c = x'u + y'v$

where the real numbers $x'$ and $y'$ are given by the equation
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} q & s \\ r & t \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.$$

Proof. Note that the equation c = xa + yb can be written as
$$c = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
and the equation $c = x'u + y'v$ can be written as
$$c = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix}.$$
Now to prove the desired property it suffices to note that
$$c = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} q & s \\ r & t \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix}.$$

       
 3 1   1 1 
Example 4.1.27. Show that 1 , 3 and 1 , −1 are bases of the same
2 2 1 0
   
   
 3 1 
vector plane. Determine the transition matrix from the basis 1 , 3 to the
2 2
 
   
 1 1 
basis 1 , −1 . Then use this matrix to determine the coordinates of the vec-
1 0
 
tor    
3 1
w = 3 1 − 2 3
2 2
   
 1 1 
in the basis 1 , −1 .
1 0
 

Solution. First we observe that


           
3 1 1 1 1 1
1 = 2 1 + −1 and 3 = 2 1 − −1 . (4.3)
2 1 0 2 1 0
   
3 1
Since the vectors 1 and 3 are linearly independent, we can conclude that
2 2
       
 3 1   1 1 
Span 1 , 3 = Span 1 , −1 ,
2 2 1 0
   

by Theorem 4.1.22.
Since the equalities in (4.3) can be written as the single matrix equality
   
3 1 1 1 · ¸
1 3 = 1 −1 2 2 ,
1 −1
2 2 1 0
       
 3 1   1 1 
the transition matrix from the basis 1 3 to the basis 1 −1 is the
2 2 1 0
   
· ¸
2 2
matrix .
1 −1

Finally, since
$$\begin{bmatrix} 2 & 2 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 3 \\ -2 \end{bmatrix} = \begin{bmatrix} 2 \\ 5 \end{bmatrix},$$
we have
$$w = 2 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + 5 \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}.$$
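The change of coordinates in Theorem 4.1.26 is a single 2×2 matrix–vector product. A short Python sketch (variable and function names are ours) reproducing the computation of Example 4.1.27:

```python
def change_coordinates(T, xy):
    """Apply the transition matrix T = [[q, s], [r, t]] to coordinates
    (x, y) in the basis {a, b}, returning the coordinates (x', y')
    in the basis {u, v}, as in Theorem 4.1.26."""
    (q, s), (r, t) = T
    x, y = xy
    return (q * x + s * y, r * x + t * y)

# Example 4.1.27: transition matrix [[2, 2], [1, -1]], and w has
# coordinates (3, -2) in the old basis
print(change_coordinates([[2, 2], [1, -1]], (3, -2)))  # (2, 5)
```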

4.1.1 Exercises

    
1  −3 
1. Show that  2 is on the vector line Span −6 .
−1 3
 

    
4  5 
2. Show that 4 is on the vector line Span  5 .
8 10
 

    
1  1 
3. Show that 2 is not on the vector line Span  0 .
3

−1

   
3  3 
4. Show that 7 is not on the vector line Span 7 .
1 2
 

Show that the given vectors u and v are linearly dependent.

5. $u = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}$ and $v = \begin{bmatrix} 12 \\ 4 \\ 8 \end{bmatrix}$

6. $u = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix}$ and $v = \begin{bmatrix} 1/2 \\ 1 \\ 2 \end{bmatrix}$

7. $u = \begin{bmatrix} 1 \\ -3 \\ 1 \end{bmatrix}$ and $v = \begin{bmatrix} -1 \\ 3 \\ -1 \end{bmatrix}$

8. $u = \begin{bmatrix} 2a \\ 4a \\ 2a \end{bmatrix}$ and $v = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$

Show that the given vectors u and v are linearly independent.

9. $u = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}$ and $v = \begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix}$

10. $u = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix}$ and $v = \begin{bmatrix} 2 \\ 4 \\ 5 \end{bmatrix}$

       
11. $u = \begin{bmatrix} 1 \\ -3 \\ 1 \end{bmatrix}$ and $v = \begin{bmatrix} -1 \\ 3 \\ 2 \end{bmatrix}$

12. $u = \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix}$ and $v = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$

Show the following equalities using Theorem 4.1.15.

13. $\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 3 \\ -1 \\ 3 \end{bmatrix} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$

14. $\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 4 \\ 3 \\ 4 \end{bmatrix} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$

15. $\mathrm{Span}\left\{ \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \right\}$

16. $\mathrm{Span}\left\{ \begin{bmatrix} 2 \\ 4 \\ 8 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix} \right\}$
   

Show that the vector x is in the vector plane Span{u, v} and find the coordinates of x
in the basis {u, v}.

17. $x = \begin{bmatrix} 4 \\ 5 \\ 7 \end{bmatrix}$, $u = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$, $v = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$

18. $x = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$, $u = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$, $v = \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}$

19. $x = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $u = \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}$, $v = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$

20. $x = \begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix}$, $u = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$, $v = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$

21. $x = \begin{bmatrix} 3 \\ -1 \\ 1 \end{bmatrix}$, $u = \begin{bmatrix} 1 \\ -2 \\ 2 \end{bmatrix}$, $v = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}$

22. $x = \begin{bmatrix} -1 \\ 11 \\ 7 \end{bmatrix}$, $u = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, $v = \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}$
       
 3 3   1 1 
23. Show that  5 , 4 is a basis in the vector plane Span 1 ,  2 .
1 1

−1
 
−1

       
 1 3   1 1 
24. Show that 7 ,  7 is a basis in the vector plane Span 1 ,  2 .
5 1

−5
 
−1

        
 5 2   1 1 
25. Show that 7 , 1 is a basis in the vector plane Span 1 ,  2 .
1 4 1
  
−1


       
 3 0   1 1 
26. Show that 1 , −1 is a basis in the vector plane Span 1 ,  2 .
7 2 1
  
−1

   
 3 3 
27. Find the transition matrix from the basis  5 , 4 to the basis
1

−1

   
 1 1 
1 ,  2 and use it to find the coordinates of the vector
1

−1

       
3 3  1 1 
w = 5  5 + 2 4 relative to the basis 1 ,  2 .
−1 1 1

−1

   
 1 3 
28. Find the transition matrix from the basis 7 ,  7 to the basis Span
5

−5

   
 1 1 
1 ,  2 and use it to find the coordinates of the vector
1

−1

       
1 3  1 1 
w = 3 7 − 2  7 relative to the basis 1 ,  2 .
5 −5 1

−1

   
 5 2 
29. Find the transition matrix from the basis 7 , 1 to the basis Span
1 4
 
   
 1 1 
1 ,  2 and use it to find the coordinates of the vector
1

−1

       
5 2  1 1 
w = a 7 + b 1 relative to the basis 1 ,  2 .
1 4 1

−1

   
 3 0 
30. Find the transition matrix from the basis 1 , −1 to the basis
7 2
 
   
 1 1 
1 ,  2 and use it to find the coordinates of the vector
1

−1

       
3 0  1 1 
w = a 1 + b −1 relative to the basis 1 ,  2 .
7 2 1

−1

31. Suppose that the vectors a and b in R3 are linearly independent and let u =
2a + b and v = 3a + 5b. Find the transition matrix from the basis {a, b} to the
basis {u, v} and from the basis {u, v} to the basis {a, b}.

32. Suppose that the vectors a and b in R3 are linearly independent and let u =
a + 3b and v = 3a + b. Find the transition matrix from the basis {a, b} to the
basis {u, v} and from the basis {u, v} to the basis {a, b}.

33. Suppose that the vectors a and b in R3 are linearly independent and let u =
3a + 2b and v = b. Find the transition matrix from the basis {a, b} to the basis
{u, v} and from the basis {u, v} to the basis {a, b}.

34. Suppose that the vectors a and b in R3 are linearly independent and let u =
5a + 2b and v = 7a + 3b. Find the transition matrix from the basis {a, b} to the
basis {u, v} and from the basis {u, v} to the basis {a, b}.

35. Suppose that the vectors a and b in R3 are linearly independent and let u =
2a − b and v = a + 2b. Find the transition matrix from the basis {a, b} to the
basis {u, v} and use it to find the coordinates of the vector w = a + b relative to
the basis {u, v}.

36. Suppose that the vectors a and b in R3 are linearly independent and let u =
3a + b and v = 7a + 4b. Find the transition matrix from the basis {a, b} to the
basis {u, v} and use it to find the coordinates of the vector w = 2a−b relative to
the basis {u, v}.

37. Suppose that the vectors a and b in R3 are linearly independent and let u =
3a + 2b and v = a + 5b. Find the transition matrix from the basis {a, b} to the
basis {u, v} and use it to find the coordinates of the vector w = 4a − 3b relative
to the basis {u, v}.

38. Suppose that the vectors a and b in R3 are linearly independent and let u =
a + b and v = 2a. Find the transition matrix from the basis {a, b} to the basis
{u, v} and use it to find the coordinates of the vector w = xa + yb relative to the
basis {u, v}.

39. Suppose that the vectors a and b in R3 are linearly independent and that the
vector u ≠ 0 is in the vector plane Span{a, b}. Show that one of the following
conditions holds:

(a) {a, u} is a basis of the vector plane Span{a, b};

(b) {b, u} is a basis of the vector plane Span{a, b}.

40. Let u and v be two linearly independent vectors in R3 . Show that {pu + sv, v}
and {u, qv + t u} are bases of the vector plane Span{u, v} for any real numbers
p, q, s, and t such that p and q are different from 0.

4.2 Projections in R3
The dot product in R3
The definition of the dot product in R2 can be modified in an obvious way to work
in R3 :

   
$$\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix} \bullet \begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix} = u_1v_1 + u_2v_2 + u_3v_3.$$

Example 4.2.1.
   
$$\begin{bmatrix}5\\3\\-2\end{bmatrix} \bullet \begin{bmatrix}4\\2\\6\end{bmatrix} = 5\cdot 4 + 3\cdot 2 + (-2)\cdot 6 = 14.$$

It is easy to verify that the dot product has the following properties

$$u\bullet v = v\bullet u, \qquad u\bullet(v+w) = u\bullet v + u\bullet w, \qquad t(u\bullet v) = (tu)\bullet v = u\bullet(tv)$$

for arbitrary u, v, w in R3 and t in R.
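These algebraic properties are easy to check numerically. The following sketch (our own illustration in NumPy, not part of the text) verifies them for the vectors of Example 4.2.1 together with an extra vector w chosen only for the check:

```python
import numpy as np

u = np.array([5, 3, -2])
v = np.array([4, 2, 6])
w = np.array([1, 0, 2])   # an extra vector, chosen only for the check
t = 3

# The dot product from Example 4.2.1: 5*4 + 3*2 + (-2)*6 = 14.
assert u @ v == 14

# The three algebraic properties of the dot product.
assert u @ v == v @ u                              # u • v = v • u
assert u @ (v + w) == u @ v + u @ w                # u • (v + w) = u • v + u • w
assert t * (u @ v) == (t * u) @ v == u @ (t * v)   # t(u • v) = (tu) • v = u • (tv)
```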

As in the case of R2 we use the notation

° °
° u1 ° q
2 2 2
° °
° u2 ° = u1 + u2 + u3
kuk = ° °
° u °
3

and call $\|u\|$ the norm of u. Geometrically, $\|u\|$ is the distance from the point u to the origin. If $\|u\| = 1$, then we say that u is a unit vector. If u is a non-zero vector, then the vector $\frac{1}{\|u\|}u$ is a unit vector. If we multiply a non-zero vector u by $\frac{1}{\|u\|}$, we say that we normalize the vector.
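As a quick numerical sketch (again our own NumPy illustration), the norm and the normalization can be computed directly from the dot product:

```python
import numpy as np

u = np.array([3.0, -2.0, 1.0])

norm = np.sqrt(u @ u)   # ‖u‖ = sqrt(u • u); here sqrt(14), as in Example 4.2.2
unit = u / norm         # normalizing u gives a unit vector

assert abs(norm**2 - 14.0) < 1e-9
assert abs(unit @ unit - 1.0) < 1e-9   # the normalized vector has norm 1
```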

Example 4.2.2.
$$\left\|\begin{bmatrix}3\\-2\\1\end{bmatrix}\right\| = \sqrt{3^2 + (-2)^2 + 1^2} = \sqrt{14}.$$

The number $\|u - v\|$ is the distance from u to v:

$$\|u - v\| = \left\|\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix} - \begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}\right\| = \left\|\begin{bmatrix}u_1 - v_1\\u_2 - v_2\\u_3 - v_3\end{bmatrix}\right\| = \sqrt{(u_1-v_1)^2 + (u_2-v_2)^2 + (u_3-v_3)^2}.$$

 
 
Example 4.2.3. The distance between $\begin{bmatrix}5\\1\\-2\end{bmatrix}$ and $\begin{bmatrix}3\\4\\0\end{bmatrix}$ is
$$\left\|\begin{bmatrix}5\\1\\-2\end{bmatrix} - \begin{bmatrix}3\\4\\0\end{bmatrix}\right\| = \left\|\begin{bmatrix}2\\-3\\-2\end{bmatrix}\right\| = \sqrt{2^2 + (-3)^2 + (-2)^2} = \sqrt{17}.$$

The useful connection between the dot product and the norm in R2 is also valid
in R3 :

$$u\bullet u = \|u\|^2 \quad\text{or}\quad \|u\| = \sqrt{u\bullet u}.$$

The same is true about the parallelogram law:

Theorem 4.2.4 (Parallelogram law). For arbitrary u and v in R3 we have


$$u\bullet v = \frac{\|u+v\|^2 - \|u-v\|^2}{4}.$$

Proof. The proof is identical with the proof of Theorem 3.2.8. Note that the proof
uses only the algebraic properties of the norm and the dot product that are the same
in R2 and in R3 .

The definition of orthogonal vectors is the same as in R2 :

Definition 4.2.5. Two vectors u and v in R3 are called orthogonal if

||u − v|| = ||u + v||.

From the Parallelogram law we obtain as in R2 the characterisation of orthogonal


vectors:

Theorem 4.2.6. Two vectors u and v in R3 are orthogonal if and only if u•v = 0.

Example 4.2.7. Since


   
$$\begin{bmatrix}1\\2\\3\end{bmatrix} \bullet \begin{bmatrix}1\\1\\-1\end{bmatrix} = 1 + 2 - 3 = 0,$$
the vectors $\begin{bmatrix}1\\2\\3\end{bmatrix}$ and $\begin{bmatrix}1\\1\\-1\end{bmatrix}$ are orthogonal.
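The two characterizations of orthogonality (Definition 4.2.5 and Theorem 4.2.6) can be compared numerically; this small check, our own NumPy sketch, uses the vectors of Example 4.2.7:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([1.0, 1.0, -1.0])

assert u @ v == 0   # orthogonal, as computed in Example 4.2.7
# Theorem 4.2.6: u • v = 0 exactly when ‖u - v‖ = ‖u + v‖.
assert np.isclose(np.linalg.norm(u - v), np.linalg.norm(u + v))
```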

The vector plane defined by an equation

Theorem 4.2.8. If n is a vector in R3 different from the origin, then the equa-
tion
n•x = 0
defines a vector plane, that is, there are two linearly independent vectors u
and v in R3 such that n • x = 0 if and only if x is in Span{u, v}.

   
Proof. Let $n = \begin{bmatrix}n_1\\n_2\\n_3\end{bmatrix} \neq 0$ and $x = \begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}$. Since n is different from the origin, at least one of the numbers $n_1$, $n_2$, $n_3$ must be different from 0. Suppose that $n_3 \neq 0$. Let
$$u = \begin{bmatrix}1\\0\\-\frac{n_1}{n_3}\end{bmatrix} \quad\text{and}\quad v = \begin{bmatrix}0\\1\\-\frac{n_2}{n_3}\end{bmatrix}.$$
Clearly u and v are linearly independent.



Figure 4.6: Illustration of Theorem 4.2.8.

If $n\bullet x = 0$, then
$$n_1x_1 + n_2x_2 + n_3x_3 = 0,$$
which gives us
$$x_3 = \left(-\frac{n_1}{n_3}\right)x_1 + \left(-\frac{n_2}{n_3}\right)x_2.$$
Since
$$\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = \begin{bmatrix}x_1\\x_2\\\left(-\frac{n_1}{n_3}\right)x_1 + \left(-\frac{n_2}{n_3}\right)x_2\end{bmatrix} = x_1\begin{bmatrix}1\\0\\-\frac{n_1}{n_3}\end{bmatrix} + x_2\begin{bmatrix}0\\1\\-\frac{n_2}{n_3}\end{bmatrix} = x_1u + x_2v,$$
x is in Span{u, v}.
Now assume that x = su + t v for some numbers s and t . Then

$$n\bullet x = \begin{bmatrix}n_1\\n_2\\n_3\end{bmatrix} \bullet \left(s\begin{bmatrix}1\\0\\-\frac{n_1}{n_3}\end{bmatrix} + t\begin{bmatrix}0\\1\\-\frac{n_2}{n_3}\end{bmatrix}\right) = s\begin{bmatrix}n_1\\n_2\\n_3\end{bmatrix}\bullet\begin{bmatrix}1\\0\\-\frac{n_1}{n_3}\end{bmatrix} + t\begin{bmatrix}n_1\\n_2\\n_3\end{bmatrix}\bullet\begin{bmatrix}0\\1\\-\frac{n_2}{n_3}\end{bmatrix} = 0.$$

This completes the proof in the case when $n_3 \neq 0$. The cases when $n_1 \neq 0$ or $n_2 \neq 0$ are treated similarly with appropriate changes.

Example 4.2.9. Let


 
$$n = \begin{bmatrix}1\\-1\\2\end{bmatrix}.$$
Describe the vector plane defined by n • x = 0 in the form Span{u, v}.

 
Solution. If we let $x = \begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}$, then
$$n\bullet x = \begin{bmatrix}1\\-1\\2\end{bmatrix} \bullet \begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = x_1 - x_2 + 2x_3.$$

Now $n\bullet x = 0$ is equivalent to $x_1 - x_2 + 2x_3 = 0$, or $x_1 = x_2 - 2x_3$.
This yields
       
$$\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = \begin{bmatrix}x_2 - 2x_3\\x_2\\x_3\end{bmatrix} = x_2\begin{bmatrix}1\\1\\0\end{bmatrix} + x_3\begin{bmatrix}-2\\0\\1\end{bmatrix}.$$
Note that the vectors $\begin{bmatrix}1\\1\\0\end{bmatrix}$ and $\begin{bmatrix}-2\\0\\1\end{bmatrix}$ are linearly independent. If we define
$$u = \begin{bmatrix}1\\1\\0\end{bmatrix} \quad\text{and}\quad v = \begin{bmatrix}-2\\0\\1\end{bmatrix},$$

then the above argument shows that n • x = 0 if and only if x = su + t v for some real
numbers s and t . In other words, the equation n • x = 0 defines the vector plane
Span {u, v}.
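The computation in Example 4.2.9 can be checked numerically; in this sketch (ours, using NumPy) every combination su + tv satisfies the equation n • x = 0:

```python
import numpy as np

n = np.array([1.0, -1.0, 2.0])
u = np.array([1.0, 1.0, 0.0])    # spanning vectors read off from x1 = x2 - 2*x3
v = np.array([-2.0, 0.0, 1.0])

assert n @ u == 0 and n @ v == 0
s, t = 2.5, -4.0                  # arbitrary coefficients
assert n @ (s * u + t * v) == 0   # every point of Span{u, v} satisfies n • x = 0
```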

The projection on a vector line in R3


We continue investigating how the ideas we considered in R2 extend to R3 . Now we
turn to the question of finding the best approximation to a point by elements of a
vector line. As we see in the next theorem, the answer in R3 is identical with the case
of R2 .

Theorem 4.2.10. Consider a nonzero vector u in R3 . For any point b in R3


there is a unique point p on the vector line Span{u} such that

$$\|b - p\| \le \|b - tu\| \tag{4.4}$$

for all t in R. That unique point is

$$p = \frac{b\bullet u}{\|u\|^2}\,u = \frac{b\bullet u}{u\bullet u}\,u.$$

Proof. Since the proof of Theorem 3.2.12 uses only properties of the norm and the
dot product, without referring to R2 , it can simply be copied here without any changes.

Figure 4.7: The point p minimizes the distance from b to the vector line Span{u}.

As in the case of R², the point $p = \frac{b\bullet u}{\|u\|^2}\,u$ is called the best approximation to the point b by elements of the vector line Span{u}.

 
 3 
Example 4.2.11. The point on the vector line Span 1 closest to the point
2
 
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 206

206 Chapter 4: The vector space R3

 
2
−1 is
1
   
2 3
−1 • 1     3
1 2 3 3 2
7   1
p =     1  = 1 = 2.
3 3 14
2 2 1
1  • 1 
2 2
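The projection formula of Theorem 4.2.10 is one line of code; this sketch (ours, in NumPy) reproduces Example 4.2.11:

```python
import numpy as np

def proj(u, b):
    """Best approximation to b on the vector line Span{u}: (b • u / u • u) u."""
    return (b @ u) / (u @ u) * u

u = np.array([3.0, 1.0, 2.0])
b = np.array([2.0, -1.0, 1.0])
p = proj(u, b)

assert np.allclose(p, [1.5, 0.5, 1.0])   # the point (3/2, 1/2, 1) of Example 4.2.11
assert abs((b - p) @ u) < 1e-12          # b - p is orthogonal to u (Theorem 4.2.12)
```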

Theorem 4.2.12. Consider a point b and a vector line Span{u} in R3 . The best
approximation p to the point b by elements of the vector line Span{u} can be
characterized as the point p in Span{u} satisfying the equation

(b − p) • u = 0.

In other words, the point p in Span{u} is the best approximation to b by ele-


ments of the vector line Span{u} if and only if (b − p) • u = 0.

Proof. Again, the proof is identical with the one given for Theorem 3.2.13.

In view of Theorem 4.2.12, the best approximation to the point b by elements of


the vector line Span{u} is also called the projection of b on the vector line Span{u}
and is denoted by projSpan{u} b or simply by proju b:

$$\operatorname{proj}_u b = \frac{b\bullet u}{\|u\|^2}\,u.$$

Finally, we show that Theorem 3.2.17 remains true in R3 .

Theorem 4.2.13. Let u be a nonzero vector in R3 and let


$$A = \frac{1}{\|u\|^2}\,uu^T.$$

Then for any b in R3 we have

proju b = Ab.

Moreover, the matrix A is the unique matrix with this property.



   
Proof. Let $u = \begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix}$. For any $b = \begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}$ we have
$$\operatorname{proj}_u b = \frac{b\bullet u}{\|u\|^2}\,u = \frac{1}{\|u\|^2}\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix}\left(\begin{bmatrix}u_1&u_2&u_3\end{bmatrix}\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\right) = \frac{1}{\|u\|^2}\left(\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix}\begin{bmatrix}u_1&u_2&u_3\end{bmatrix}\right)\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} = Ab.$$

The uniqueness part of the theorem is a consequence of the fact that the equality
Ax = B x for every x in R3 implies A = B , by Theorem 2.1.18.

Definition 4.2.14. Let u be a nonzero vector in R3 . The matrix


$$\frac{1}{\|u\|^2}\,uu^T \tag{4.5}$$

is called the projection matrix on the vector line Span{u}.

 
 1 
Example 4.2.15. Find the projection matrix on the vector line Span 3 .
1
 

Solution. Since
$$uu^T = \begin{bmatrix}1\\3\\1\end{bmatrix}\begin{bmatrix}1&3&1\end{bmatrix} = \begin{bmatrix}1&3&1\\3&9&3\\1&3&1\end{bmatrix}$$
and $\|u\|^2 = 11$, the projection matrix on the vector line Span{u} is
$$\frac{1}{\|u\|^2}\,uu^T = \frac{1}{11}\begin{bmatrix}1&3&1\\3&9&3\\1&3&1\end{bmatrix}.$$
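The projection matrix of Definition 4.2.14 can be formed with an outer product; the following sketch (ours, in NumPy) rebuilds the matrix of Example 4.2.15 and checks two standard properties of projection matrices:

```python
import numpy as np

u = np.array([1.0, 3.0, 1.0])

# (1/‖u‖²) u uᵀ, the projection matrix on Span{u}.
P = np.outer(u, u) / (u @ u)

assert np.allclose(11 * P, [[1, 3, 1], [3, 9, 3], [1, 3, 1]])  # Example 4.2.15
assert np.allclose(P @ P, P)   # projecting twice changes nothing
assert np.allclose(P, P.T)     # projection matrices are symmetric
```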

We have a lot of flexibility in choosing a basis for a vector plane. As we will see,
bases {a, b} such that a • b = 0 are often preferable.

Theorem 4.2.16. Let a and b be two nonzero orthogonal vectors in R³, that is, $a \neq 0$, $b \neq 0$, and $a\bullet b = 0$. Then the vectors a and b are linearly independent.

Proof. If
$$xa + yb = \begin{bmatrix}0\\0\\0\end{bmatrix},$$
then
$$xa\bullet a + yb\bullet a = \begin{bmatrix}0\\0\\0\end{bmatrix}\bullet a.$$
Since $a\bullet a = \|a\|^2$ and $b\bullet a = 0$, the above can be written as $x\|a\|^2 = 0$. This gives us $x = 0$, because $\|a\| \neq 0$. In the same way we can show that $y = 0$. The result is now a consequence of Theorems 4.2.12 and 4.1.23.

Definition 4.2.17. A basis {a, b} of a vector plane is called an orthogonal basis


if a • b = 0. A basis {a, b} of a vector plane is called an orthonormal basis if it is
an orthogonal basis and kak = kbk = 1.

If {a, b} is an orthogonal basis of a vector plane, it is easy to change it to an orthonormal basis. Clearly, we have
$$\operatorname{Span}\{a, b\} = \operatorname{Span}\left\{\frac{a}{\|a\|}, \frac{b}{\|b\|}\right\} \quad\text{and}\quad \left\|\frac{a}{\|a\|}\right\| = \left\|\frac{b}{\|b\|}\right\| = 1.$$
The next theorem shows that any basis of a vector plane can be transformed to an
orthogonal basis of that vector plane.

Theorem 4.2.18. Let u and v be two linearly independent vectors. There is


a vector u0 such that u0 • v = 0 and Span{u0 , v} = Span{u, v}, that is {u0 , v} is an
orthogonal basis of the vector plane Span{u, v}.
Similarly, there is a vector v0 such that u • v0 = 0 and Span{u, v0 } = Span{u, v},
that is {u, v0 } is an orthogonal basis of Span{u, v}.

Proof. We will show that the vector
$$u_0 = u - \frac{u\bullet v}{v\bullet v}\,v$$
has the desired properties. First note that $\{u_0, v\}$ is a basis of the vector plane Span{u, v}, by Theorem 4.1.17. Moreover,
$$u_0\bullet v = u\bullet v - \frac{u\bullet v}{v\bullet v}\,v\bullet v = 0,$$

completing the proof of the first part.


For the second part we use
$$v_0 = v - \frac{v\bullet u}{u\bullet u}\,u$$
and proceed as in the first part of the proof.
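The construction in the proof of Theorem 4.2.18 is a single subtraction; this sketch of ours (NumPy) applies it to the basis that appears in Example 4.2.19 below:

```python
import numpy as np

def orthogonalize(u, v):
    """Return u' = u - (u • v / v • v) v, so that {u', v} is an orthogonal basis."""
    return u - (u @ v) / (v @ v) * v

u = np.array([1.0, 0.0, 2.0])
v = np.array([1.0, -2.0, 2.0])
u0 = orthogonalize(u, v)

assert np.allclose(9 * u0, [4.0, 10.0, 8.0])  # u' = (4/9, 10/9, 8/9)
assert abs(u0 @ v) < 1e-12                    # u' • v = 0
```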

Example 4.2.19. Determine two orthogonal bases in the vector plane
$$\operatorname{Span}\left\{\begin{bmatrix}1\\0\\2\end{bmatrix}, \begin{bmatrix}1\\-2\\2\end{bmatrix}\right\}.$$

Solution. Since
$$\begin{bmatrix}1\\0\\2\end{bmatrix}\bullet\begin{bmatrix}1\\-2\\2\end{bmatrix} = 5,$$
the set $\left\{\begin{bmatrix}1\\0\\2\end{bmatrix}, \begin{bmatrix}1\\-2\\2\end{bmatrix}\right\}$ is not an orthogonal basis. We find that
$$\begin{bmatrix}1\\-2\\2\end{bmatrix}\bullet\begin{bmatrix}1\\-2\\2\end{bmatrix} = 9 \quad\text{and}\quad \begin{bmatrix}1\\0\\2\end{bmatrix} - \frac{5}{9}\begin{bmatrix}1\\-2\\2\end{bmatrix} = \begin{bmatrix}4/9\\10/9\\8/9\end{bmatrix},$$
so $\left\{\begin{bmatrix}4/9\\10/9\\8/9\end{bmatrix}, \begin{bmatrix}1\\-2\\2\end{bmatrix}\right\}$ is an orthogonal basis in the vector plane $\operatorname{Span}\left\{\begin{bmatrix}1\\0\\2\end{bmatrix}, \begin{bmatrix}1\\-2\\2\end{bmatrix}\right\}$.

Note that we can use $\begin{bmatrix}2\\5\\4\end{bmatrix}$ instead of $\begin{bmatrix}4/9\\10/9\\8/9\end{bmatrix}$, because $\begin{bmatrix}4/9\\10/9\\8/9\end{bmatrix} = \frac{2}{9}\begin{bmatrix}2\\5\\4\end{bmatrix}$ and
$$\operatorname{Span}\left\{\begin{bmatrix}2\\5\\4\end{bmatrix}, \begin{bmatrix}1\\-2\\2\end{bmatrix}\right\} = \operatorname{Span}\left\{\begin{bmatrix}4/9\\10/9\\8/9\end{bmatrix}, \begin{bmatrix}1\\-2\\2\end{bmatrix}\right\} = \operatorname{Span}\left\{\begin{bmatrix}1\\0\\2\end{bmatrix}, \begin{bmatrix}1\\-2\\2\end{bmatrix}\right\},$$
by Theorem 4.1.17. Thus $\left\{\begin{bmatrix}2\\5\\4\end{bmatrix}, \begin{bmatrix}1\\-2\\2\end{bmatrix}\right\}$ is an orthogonal basis in the vector plane $\operatorname{Span}\left\{\begin{bmatrix}1\\0\\2\end{bmatrix}, \begin{bmatrix}1\\-2\\2\end{bmatrix}\right\}$.

Another possibility is to use
$$\begin{bmatrix}1\\-2\\2\end{bmatrix} - \frac{\begin{bmatrix}1\\-2\\2\end{bmatrix}\bullet\begin{bmatrix}1\\0\\2\end{bmatrix}}{\begin{bmatrix}1\\0\\2\end{bmatrix}\bullet\begin{bmatrix}1\\0\\2\end{bmatrix}}\begin{bmatrix}1\\0\\2\end{bmatrix} = \begin{bmatrix}0\\-2\\0\end{bmatrix}$$
or just $\begin{bmatrix}0\\1\\0\end{bmatrix}$, because $\begin{bmatrix}0\\-2\\0\end{bmatrix} = -2\begin{bmatrix}0\\1\\0\end{bmatrix}$. Thus $\left\{\begin{bmatrix}1\\0\\2\end{bmatrix}, \begin{bmatrix}0\\1\\0\end{bmatrix}\right\}$ is another orthogonal basis in the vector plane $\operatorname{Span}\left\{\begin{bmatrix}1\\0\\2\end{bmatrix}, \begin{bmatrix}1\\-2\\2\end{bmatrix}\right\}$.

The projection on a vector plane in R3


We have considered the problem of finding a point on a line minimizing the distance
from a given point and observed that methods of linear algebra give us a simple and
elegant solution to the problem. Now we would like to find a point on a vector plane
minimizing the distance from a given point. Again we find that linear algebra can
handle this problem well.

Theorem 4.2.20. Let u and v be linearly independent vectors in R3 . For every


vector b in R3 there is a unique vector p in the vector plane Span{u, v} such that
the inequality
kb − pk < kb − xk
holds for every x in Span{u, v} different from p. If the vectors u and v are or-
thogonal, then that unique point is

$$p = \frac{b\bullet u}{\|u\|^2}\,u + \frac{b\bullet v}{\|v\|^2}\,v = \operatorname{proj}_u b + \operatorname{proj}_v b. \tag{4.6}$$

Proof. In view of Theorem 4.2.18, we can assume that the vectors u and v are orthog-
onal, that is u • v = 0, because if they are not we can modify the basis of the vector
plane so that it becomes an orthogonal basis. Now, using the fact that u • v = 0, we
rewrite the square of the distance from b to an arbitrary point on the vector plane
Span{u, v} in a way that may seem at first quite artificial:

$$\begin{aligned}
\|b - su - tv\|^2 &= (b - su - tv)\bullet(b - su - tv)\\
&= \|b\|^2 + s^2\|u\|^2 + t^2\|v\|^2 - 2s\,b\bullet u - 2t\,b\bullet v + 2st\,u\bullet v\\
&= \|b\|^2 - \frac{(b\bullet u)^2}{\|u\|^2} - \frac{(b\bullet v)^2}{\|v\|^2} + s^2\|u\|^2 - 2s\,b\bullet u + \frac{(b\bullet u)^2}{\|u\|^2} + t^2\|v\|^2 - 2t\,b\bullet v + \frac{(b\bullet v)^2}{\|v\|^2}\\
&= \|b\|^2 - \frac{(b\bullet u)^2}{\|u\|^2} - \frac{(b\bullet v)^2}{\|v\|^2} + \left(s\|u\| - \frac{b\bullet u}{\|u\|}\right)^2 + \left(t\|v\| - \frac{b\bullet v}{\|v\|}\right)^2.
\end{aligned}$$

Note that the part
$$\|b\|^2 - \frac{(b\bullet u)^2}{\|u\|^2} - \frac{(b\bullet v)^2}{\|v\|^2}$$
does not depend on t or s, so it is fixed for the point b and the vector plane. Consequently, to minimize $\|b - su - tv\|^2$, we need to minimize the remaining part, that is
$$\left(s\|u\| - \frac{b\bullet u}{\|u\|}\right)^2 + \left(t\|v\| - \frac{b\bullet v}{\|v\|}\right)^2.$$
Clearly, the minimum is attained when
$$s\|u\| - \frac{b\bullet u}{\|u\|} = 0 \quad\text{and}\quad t\|v\| - \frac{b\bullet v}{\|v\|} = 0.$$
Solving for s and t we get
$$s = \frac{b\bullet u}{\|u\|^2} \quad\text{and}\quad t = \frac{b\bullet v}{\|v\|^2}$$
and consequently
$$p = su + tv = \frac{b\bullet u}{\|u\|^2}\,u + \frac{b\bullet v}{\|v\|^2}\,v.$$

The point p in the above theorem minimizes the distance from the point b to
an arbitrary point on the vector plane Span{u, v}. For this reason p is called the best
approximation to b by elements of the vector plane Span{u, v}.

 
Example 4.2.21. Determine the best approximation to the point $b = \begin{bmatrix}3\\1\\1\end{bmatrix}$ by elements of Span{u, v} where $u = \begin{bmatrix}1\\-2\\1\end{bmatrix}$ and $v = \begin{bmatrix}1\\1\\1\end{bmatrix}$.

Solution. Since the vectors $u = \begin{bmatrix}1\\-2\\1\end{bmatrix}$ and $v = \begin{bmatrix}1\\1\\1\end{bmatrix}$ are orthogonal, we can use formula (4.6), which gives us
$$p = \frac{b\bullet u}{u\bullet u}\,u + \frac{b\bullet v}{v\bullet v}\,v = \frac{2}{6}\begin{bmatrix}1\\-2\\1\end{bmatrix} + \frac{5}{3}\begin{bmatrix}1\\1\\1\end{bmatrix} = \frac{1}{3}\begin{bmatrix}1\\-2\\1\end{bmatrix} + \frac{5}{3}\begin{bmatrix}1\\1\\1\end{bmatrix} = \begin{bmatrix}2\\1\\2\end{bmatrix}.$$

 
Example 4.2.22. Determine the best approximation to the point $b = \begin{bmatrix}1\\1\\1\end{bmatrix}$ by elements of Span{x, y} where $x = \begin{bmatrix}1\\0\\1\end{bmatrix}$ and $y = \begin{bmatrix}1\\1\\0\end{bmatrix}$.

Solution. Since $x\bullet y = 1$, the vectors x and y are not orthogonal. To find an orthogonal basis for Span{x, y} we use the method presented in the proof of Theorem 4.2.18:
$$\operatorname{Span}\{x, y\} = \operatorname{Span}\left\{x - \frac{x\bullet y}{y\bullet y}\,y,\; y\right\} = \operatorname{Span}\left\{\begin{bmatrix}1/2\\-1/2\\1\end{bmatrix}, \begin{bmatrix}1\\1\\0\end{bmatrix}\right\} = \operatorname{Span}\left\{\begin{bmatrix}1\\-1\\2\end{bmatrix}, \begin{bmatrix}1\\1\\0\end{bmatrix}\right\}.$$
Since the vectors $\begin{bmatrix}1\\-1\\2\end{bmatrix}$ and $\begin{bmatrix}1\\1\\0\end{bmatrix}$ are orthogonal, we can use formula (4.6) with $u = \begin{bmatrix}1\\-1\\2\end{bmatrix}$ and $v = \begin{bmatrix}1\\1\\0\end{bmatrix}$, which gives us
$$p = \frac{b\bullet u}{u\bullet u}\,u + \frac{b\bullet v}{v\bullet v}\,v = \frac{2}{6}\begin{bmatrix}1\\-1\\2\end{bmatrix} + \frac{2}{2}\begin{bmatrix}1\\1\\0\end{bmatrix} = \frac{1}{3}\begin{bmatrix}1\\-1\\2\end{bmatrix} + \begin{bmatrix}1\\1\\0\end{bmatrix} = \frac{1}{3}\begin{bmatrix}4\\2\\2\end{bmatrix}.$$
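Formula (4.6) for an orthogonal basis translates directly into code; this sketch (ours, in NumPy) reproduces Example 4.2.22:

```python
import numpy as np

def proj_plane(u, v, b):
    """Best approximation to b on Span{u, v}, formula (4.6); u, v must be orthogonal."""
    assert abs(u @ v) < 1e-12
    return (b @ u) / (u @ u) * u + (b @ v) / (v @ v) * v

u = np.array([1.0, -1.0, 2.0])   # the orthogonal basis found in Example 4.2.22
v = np.array([1.0, 1.0, 0.0])
b = np.array([1.0, 1.0, 1.0])
p = proj_plane(u, v, b)

assert np.allclose(3 * p, [4.0, 2.0, 2.0])   # p = (4/3, 2/3, 2/3)
# b - p is orthogonal to every vector of the plane (Theorem 4.2.23).
assert abs((b - p) @ u) < 1e-12 and abs((b - p) @ v) < 1e-12
```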

The following theorem is very similar to Theorem 4.2.12 that characterizes the
best approximation to the point by elements of a vector line.

Theorem 4.2.23. Consider a point b and a vector plane Span{u, v} in R3 . The


best approximation to b by elements of Span{u, v} can be characterized as the
point p in Span{u, v} such that

(b − p) • w = 0 for every w in Span{u, v}. (4.7)

In other words, the point p in Span{u, v} is the best approximation to b by ele-


ments of the vector plane Span{u, v} if and only if the vector b−p is orthogonal
to every vector in Span{u, v}.

Figure 4.8: Illustration of Theorem 4.2.23.

Proof. Let p be the best approximation to b by elements of the vector plane Span{u, v}. Without loss of generality we can assume that the vectors u and v are orthogonal. Then, by (4.6), we have
$$p = \frac{b\bullet u}{\|u\|^2}\,u + \frac{b\bullet v}{\|v\|^2}\,v.$$

If w is in Span{u, v}, then $w = su + tv$ for some real numbers s and t and we have
$$\begin{aligned}
(b-p)\bullet w &= \left(b - \frac{b\bullet u}{\|u\|^2}\,u - \frac{b\bullet v}{\|v\|^2}\,v\right)\bullet(su + tv)\\
&= s\,b\bullet u + t\,b\bullet v - s\frac{b\bullet u}{\|u\|^2}\,u\bullet u - t\frac{b\bullet v}{\|v\|^2}\,v\bullet v - t\frac{b\bullet u}{\|u\|^2}\,u\bullet v - s\frac{b\bullet v}{\|v\|^2}\,v\bullet u\\
&= 0,
\end{aligned}$$

since $u\bullet u = \|u\|^2$, $v\bullet v = \|v\|^2$, and $u\bullet v = v\bullet u = 0$.


Now assume that the point p in Span{u, v} satisfies the equality

(b − p) • w = 0

for every w in Span{u, v}. For any vector x in Span{u, v} we have

$$\begin{aligned}
\|b - x\|^2 &= (b-x)\bullet(b-x)\\
&= (b-p+p-x)\bullet(b-p+p-x)\\
&= \|b-p\|^2 + 2(b-p)\bullet(p-x) + \|p-x\|^2\\
&= \|b-p\|^2 + \|p-x\|^2,
\end{aligned}$$

because $w = p - x$ is in Span{u, v} and consequently $(b-p)\bullet(p-x) = 0$. Since

$$\|b-x\|^2 = \|b-p\|^2 + \|p-x\|^2,$$

we can conclude that

$$\|b-x\| > \|b-p\|$$

for every x in Span{u, v} different from p, which means that p is the best approximation to b by elements of the vector plane Span{u, v}.

Theorem 4.2.23 says that the point p of the vector plane Span{u, v} which is the best approximation to b by elements of Span{u, v} is exactly the point p for which the vector b − p is perpendicular to every vector of the vector plane Span{u, v}. For this reason the point p is called the projection of b on the vector plane Span{u, v} and is denoted by $\operatorname{proj}_{\operatorname{Span}\{u,v\}} b$:

Let u and v be nonzero vectors in R³ such that $u\bullet v = 0$. Then
$$\operatorname{proj}_{\operatorname{Span}\{u,v\}} b = \frac{b\bullet u}{\|u\|^2}\,u + \frac{b\bullet v}{\|v\|^2}\,v = \operatorname{proj}_u b + \operatorname{proj}_v b.$$

Now we prove a result similar to Theorem 4.2.13.



Theorem 4.2.24. Let u and v be nonzero orthogonal vectors in R3 . If

$$A = \frac{1}{\|u\|^2}(uu^T) + \frac{1}{\|v\|^2}(vv^T),$$

then the projection of any vector b in R3 on the vector plane Span{u, v} is Ab.
Moreover, the matrix A is the unique matrix with this property.

     
Proof. Let $u = \begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix}$ and $v = \begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}$. For any vector $b = \begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}$ in R³ we have
$$\begin{aligned}
\operatorname{proj}_{\operatorname{Span}\{u,v\}} b &= \frac{b\bullet u}{\|u\|^2}\,u + \frac{b\bullet v}{\|v\|^2}\,v\\
&= \frac{1}{\|u\|^2}\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix}\begin{bmatrix}u_1&u_2&u_3\end{bmatrix}\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} + \frac{1}{\|v\|^2}\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}\begin{bmatrix}v_1&v_2&v_3\end{bmatrix}\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\\
&= \left(\frac{1}{\|u\|^2}\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix}\begin{bmatrix}u_1&u_2&u_3\end{bmatrix} + \frac{1}{\|v\|^2}\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}\begin{bmatrix}v_1&v_2&v_3\end{bmatrix}\right)\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\\
&= \left(\frac{1}{\|u\|^2}(uu^T) + \frac{1}{\|v\|^2}(vv^T)\right)b\\
&= Ab.
\end{aligned}$$

The uniqueness part of the theorem is a consequence of the fact that the equality
Ax = B x for every x in R3 implies A = B , by Theorem 2.1.18.

Definition 4.2.25. Let u and v be nonzero vectors in R3 such that u • v = 0.


The matrix
$$\frac{1}{\|u\|^2}(uu^T) + \frac{1}{\|v\|^2}(vv^T) \tag{4.8}$$
is called the projection matrix on Span{u, v}.

 
Example 4.2.26. Determine the projection matrix on Span{x, y}, where $x = \begin{bmatrix}1\\0\\1\end{bmatrix}$ and $y = \begin{bmatrix}1\\1\\0\end{bmatrix}$.

Solution. From Example 4.2.22 we have
$$\operatorname{Span}\{x, y\} = \operatorname{Span}\left\{\begin{bmatrix}1\\-1\\2\end{bmatrix}, \begin{bmatrix}1\\1\\0\end{bmatrix}\right\}.$$
Since the vectors $\begin{bmatrix}1\\-1\\2\end{bmatrix}$ and $\begin{bmatrix}1\\1\\0\end{bmatrix}$ are orthogonal, we can use formula (4.8) with $u = \begin{bmatrix}1\\-1\\2\end{bmatrix}$ and $v = \begin{bmatrix}1\\1\\0\end{bmatrix}$, which gives us
$$\frac{1}{6}\begin{bmatrix}1\\-1\\2\end{bmatrix}\begin{bmatrix}1&-1&2\end{bmatrix} + \frac{1}{2}\begin{bmatrix}1\\1\\0\end{bmatrix}\begin{bmatrix}1&1&0\end{bmatrix} = \frac{1}{6}\begin{bmatrix}1&-1&2\\-1&1&-2\\2&-2&4\end{bmatrix} + \frac{1}{2}\begin{bmatrix}1&1&0\\1&1&0\\0&0&0\end{bmatrix} = \frac{1}{3}\begin{bmatrix}2&1&1\\1&2&-1\\1&-1&2\end{bmatrix}.$$
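Formula (4.8) is again an outer-product computation; this sketch of ours (NumPy) rebuilds the matrix of Example 4.2.26:

```python
import numpy as np

u = np.array([1.0, -1.0, 2.0])
v = np.array([1.0, 1.0, 0.0])

# (1/‖u‖²) u uᵀ + (1/‖v‖²) v vᵀ, the projection matrix on Span{u, v}.
P = np.outer(u, u) / (u @ u) + np.outer(v, v) / (v @ v)

assert np.allclose(3 * P, [[2, 1, 1], [1, 2, -1], [1, -1, 2]])  # Example 4.2.26
assert np.allclose(P @ P, P) and np.allclose(P, P.T)
```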

The least squares problem in R3


In this section we prove a theorem that shows how geometrical ideas can be used to
solve a problem which does not seem to have anything to do with geometry.

Theorem 4.2.27. The numbers x and y which minimize the sum
$$(b_1 - u_1x - v_1y)^2 + (b_2 - u_2x - v_2y)^2 + (b_3 - u_3x - v_3y)^2$$
are the solution of the equation
$$A^TA\begin{bmatrix}x\\y\end{bmatrix} = A^Tb \tag{4.9}$$
where
$$A = \begin{bmatrix}u_1&v_1\\u_2&v_2\\u_3&v_3\end{bmatrix} \quad\text{and}\quad b = \begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}.$$

Proof. With A and b defined above the equation $A^TA\begin{bmatrix}x\\y\end{bmatrix} = A^Tb$ is
$$\begin{bmatrix}u_1&u_2&u_3\\v_1&v_2&v_3\end{bmatrix}\begin{bmatrix}u_1&v_1\\u_2&v_2\\u_3&v_3\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}u_1&u_2&u_3\\v_1&v_2&v_3\end{bmatrix}\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} \tag{4.10}$$
and is equivalent to the equation
$$\begin{bmatrix}u_1&u_2&u_3\\v_1&v_2&v_3\end{bmatrix}\left(\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} - \begin{bmatrix}u_1&v_1\\u_2&v_2\\u_3&v_3\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}\right) = \begin{bmatrix}0\\0\end{bmatrix}.$$

This equation can be written as a system of equations:
$$\begin{cases}\begin{bmatrix}u_1&u_2&u_3\end{bmatrix}\left(\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} - \begin{bmatrix}u_1&v_1\\u_2&v_2\\u_3&v_3\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}\right) = 0\\[3ex]\begin{bmatrix}v_1&v_2&v_3\end{bmatrix}\left(\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} - \begin{bmatrix}u_1&v_1\\u_2&v_2\\u_3&v_3\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}\right) = 0\end{cases}$$
If we let $p = \begin{bmatrix}u_1&v_1\\u_2&v_2\\u_3&v_3\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}$, the system becomes
$$\begin{cases}\left(\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} - p\right)\bullet\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix} = 0\\[3ex]\left(\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} - p\right)\bullet\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix} = 0\end{cases} \tag{4.11}$$
If we denote $u = \begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix}$ and $v = \begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}$ and let $w = su + tv$, the system (4.11) yields
$$(b-p)\bullet w = (b-p)\bullet(su + tv) = s(b-p)\bullet u + t(b-p)\bullet v = 0.$$

If the vectors u and v are linearly independent, then, using Theorem 4.2.23, we can give a geometrical interpretation of the above system: the solution of the system (4.11) is the point $p = \begin{bmatrix}u_1&v_1\\u_2&v_2\\u_3&v_3\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = xu + yv$ that is the projection of b on the vector plane Span{u, v}. Consequently, if the vectors u and v are linearly independent, then the numbers x and y that satisfy the equation $A^TA\begin{bmatrix}x\\y\end{bmatrix} = A^Tb$ minimize
$$\|b - xu - yv\|^2 = (b_1 - u_1x - v_1y)^2 + (b_2 - u_2x - v_2y)^2 + (b_3 - u_3x - v_3y)^2.$$

If the vectors u and v are linearly dependent and $u \neq 0$ or $v \neq 0$, then Span{u, v} is actually a vector line and, according to Theorem 4.2.12, the solution of the system (4.11) is the point p that is the projection of b on the vector line Span{u, v}. Note that in this case the numbers x and y such that $p = xu + yv$ are not unique. Nevertheless, any numbers x and y that satisfy the equation $A^TA\begin{bmatrix}x\\y\end{bmatrix} = A^Tb$ minimize
$$\|b - xu - yv\|^2 = (b_1 - u_1x - v_1y)^2 + (b_2 - u_2x - v_2y)^2 + (b_3 - u_3x - v_3y)^2.$$
Finally, if $u = v = 0$, then the sum
$$(b_1 - u_1x - v_1y)^2 + (b_2 - u_2x - v_2y)^2 + (b_3 - u_3x - v_3y)^2 = b_1^2 + b_2^2 + b_3^2$$
is a constant and there is nothing to prove.

Example 4.2.28. Find numbers x and y which minimize the sum

$$(1 - 2x - y)^2 + (2 - x + y)^2 + (1 + x + y)^2.$$

Solution. If we let $A = \begin{bmatrix}2&1\\1&-1\\-1&-1\end{bmatrix}$ and $b = \begin{bmatrix}1\\2\\1\end{bmatrix}$, then the numbers x and y which minimize $(1-2x-y)^2 + (2-x+y)^2 + (1+x+y)^2$ are the solution of the equation
$$A^TA\begin{bmatrix}x\\y\end{bmatrix} = A^Tb,$$
that is,
$$\begin{bmatrix}2&1&-1\\1&-1&-1\end{bmatrix}\begin{bmatrix}2&1\\1&-1\\-1&-1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}2&1&-1\\1&-1&-1\end{bmatrix}\begin{bmatrix}1\\2\\1\end{bmatrix},$$
or
$$\begin{bmatrix}6&2\\2&3\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}3\\-2\end{bmatrix}.$$
This equation is equivalent to the system
$$\begin{cases}6x + 2y = 3\\2x + 3y = -2\end{cases}$$
which can be solved using Cramer's rule:
$$x = \frac{\det\begin{bmatrix}3&2\\-2&3\end{bmatrix}}{\det\begin{bmatrix}6&2\\2&3\end{bmatrix}} = \frac{13}{14} \quad\text{and}\quad y = \frac{\det\begin{bmatrix}6&3\\2&-2\end{bmatrix}}{\det\begin{bmatrix}6&2\\2&3\end{bmatrix}} = -\frac{18}{14}.$$
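Solving the normal equations (4.9) numerically takes two lines; this sketch (ours, in NumPy) checks Example 4.2.28:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])
b = np.array([1.0, 2.0, 1.0])

# Solve AᵀA [x, y]ᵀ = Aᵀb, the normal equations of equation (4.9).
x, y = np.linalg.solve(A.T @ A, A.T @ b)

assert np.isclose(x, 13 / 14) and np.isclose(y, -18 / 14)  # as in Example 4.2.28
```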

Note that the formula (4.9) in Theorem 4.2.27 gives us another way to calculate
the point on a vector plane that minimizes the distance from a given point to an
arbitrary point on that vector plane.
The system of equations (4.11) in the proof of Theorem 4.2.27 suggests a simple
way to calculate the numbers x and y which minimize the sum

$$(b_1 - u_1x - v_1y)^2 + (b_2 - u_2x - v_2y)^2 + (b_3 - u_3x - v_3y)^2.$$

We formulate that observation in the next theorem.

Theorem 4.2.29. The numbers x and y which minimize the sum

$$(b_1 - u_1x - v_1y)^2 + (b_2 - u_2x - v_2y)^2 + (b_3 - u_3x - v_3y)^2$$

are the solution of the system

$$\begin{cases}\left(\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} - x\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix} - y\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}\right)\bullet\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix} = 0\\[3ex]\left(\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix} - x\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix} - y\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}\right)\bullet\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix} = 0\end{cases} \tag{4.12}$$

Example 4.2.30. Find numbers x and y which minimize the sum

$$(1 - 2x - y)^2 + (2 - x + y)^2 + (1 + x + y)^2.$$

Solution. In this case the system (4.12) becomes
$$\begin{cases}\left(\begin{bmatrix}1\\2\\1\end{bmatrix} - x\begin{bmatrix}2\\1\\-1\end{bmatrix} - y\begin{bmatrix}1\\-1\\-1\end{bmatrix}\right)\bullet\begin{bmatrix}2\\1\\-1\end{bmatrix} = 0\\[3ex]\left(\begin{bmatrix}1\\2\\1\end{bmatrix} - x\begin{bmatrix}2\\1\\-1\end{bmatrix} - y\begin{bmatrix}1\\-1\\-1\end{bmatrix}\right)\bullet\begin{bmatrix}1\\-1\\-1\end{bmatrix} = 0\end{cases}$$
After calculating the dot products and simplifying we obtain the system
$$\begin{cases}6x + 2y = 3\\2x + 3y = -2\end{cases}$$
The solutions are as in Example 4.2.28:
$$x = \frac{13}{14} \quad\text{and}\quad y = -\frac{18}{14}.$$

     
Corollary 4.2.31. Let $\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix}$, $\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}$, and $\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}$ be vectors in R³. If numbers x and y satisfy the system of equations (4.12), then the vector
$$x\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix} + y\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}$$
is the projection of the vector $\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}$ on the vector subspace
$$\operatorname{Span}\left\{\begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix}, \begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}\right\}.$$

    
Example 4.2.32. Let $b = \begin{bmatrix}1\\1\\-1\end{bmatrix}$, $u = \begin{bmatrix}1\\2\\2\end{bmatrix}$, and $v = \begin{bmatrix}1\\1\\1\end{bmatrix}$. Find the projection of the vector b on the vector plane Span{u, v}.

Solution. In our case the system (4.12) becomes
$$\begin{cases}\left(\begin{bmatrix}1\\1\\-1\end{bmatrix} - x\begin{bmatrix}1\\2\\2\end{bmatrix} - y\begin{bmatrix}1\\1\\1\end{bmatrix}\right)\bullet\begin{bmatrix}1\\2\\2\end{bmatrix} = 0\\[3ex]\left(\begin{bmatrix}1\\1\\-1\end{bmatrix} - x\begin{bmatrix}1\\2\\2\end{bmatrix} - y\begin{bmatrix}1\\1\\1\end{bmatrix}\right)\bullet\begin{bmatrix}1\\1\\1\end{bmatrix} = 0\end{cases}$$
After calculating the dot products and simplifying we obtain the system
$$\begin{cases}9x + 5y = 1\\5x + 3y = 1\end{cases}$$
The solutions can be found using Cramer's rule:
$$x = \frac{\det\begin{bmatrix}1&5\\1&3\end{bmatrix}}{\det\begin{bmatrix}9&5\\5&3\end{bmatrix}} = -1 \quad\text{and}\quad y = \frac{\det\begin{bmatrix}9&1\\5&1\end{bmatrix}}{\det\begin{bmatrix}9&5\\5&3\end{bmatrix}} = 2.$$
This means that the projection of the vector b on the vector plane Span{u, v} is
$$-u + 2v = -\begin{bmatrix}1\\2\\2\end{bmatrix} + 2\begin{bmatrix}1\\1\\1\end{bmatrix} = \begin{bmatrix}1\\0\\0\end{bmatrix}.$$

The next result gives a formula to calculate the numbers x and y which minimize the sum
$$(b_1 - u_1x - v_1y)^2 + (b_2 - u_2x - v_2y)^2 + (b_3 - u_3x - v_3y)^2$$
and the projection matrix on Span{u, v} when the vectors u and v are linearly independent.

Theorem 4.2.33. Let $u = \begin{bmatrix}u_1\\u_2\\u_3\end{bmatrix}$ and $v = \begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix}$ be linearly independent vectors and let $A = \begin{bmatrix}u_1&v_1\\u_2&v_2\\u_3&v_3\end{bmatrix}$. Then

(a) The matrix $A^TA$ is invertible;

(b) If $b = \begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}$ is a vector in R³, the numbers x and y that minimize the sum
$$(b_1 - u_1x - v_1y)^2 + (b_2 - u_2x - v_2y)^2 + (b_3 - u_3x - v_3y)^2$$
are given by the equation
$$\begin{bmatrix}x\\y\end{bmatrix} = (A^TA)^{-1}A^Tb; \tag{4.13}$$

(c) The projection matrix on the vector plane Span{u, v} is
$$A(A^TA)^{-1}A^T. \tag{4.14}$$



Proof. First we observe that
$$A^TA = \begin{bmatrix}u_1&u_2&u_3\\v_1&v_2&v_3\end{bmatrix}\begin{bmatrix}u_1&v_1\\u_2&v_2\\u_3&v_3\end{bmatrix} = \begin{bmatrix}u_1^2+u_2^2+u_3^2 & u_1v_1+u_2v_2+u_3v_3\\ v_1u_1+v_2u_2+v_3u_3 & v_1^2+v_2^2+v_3^2\end{bmatrix}.$$
Consequently,
$$\begin{aligned}
\det A^TA &= \det\begin{bmatrix}u_1^2+u_2^2+u_3^2 & u_1v_1+u_2v_2+u_3v_3\\ u_1v_1+u_2v_2+u_3v_3 & v_1^2+v_2^2+v_3^2\end{bmatrix}\\
&= (u_1^2+u_2^2+u_3^2)(v_1^2+v_2^2+v_3^2) - (u_1v_1+u_2v_2+u_3v_3)^2\\
&= (u_1v_2-u_2v_1)^2 + (u_2v_3-u_3v_2)^2 + (u_1v_3-u_3v_1)^2\\
&= \left(\det\begin{bmatrix}u_1&v_1\\u_2&v_2\end{bmatrix}\right)^2 + \left(\det\begin{bmatrix}u_2&v_2\\u_3&v_3\end{bmatrix}\right)^2 + \left(\det\begin{bmatrix}u_1&v_1\\u_3&v_3\end{bmatrix}\right)^2.
\end{aligned}$$
Since the vectors u and v are linearly independent, we get $\det A^TA \neq 0$, by Theorem 4.1.12, and the matrix $A^TA$ is invertible, by Theorem 1.3.9.

Now, by the proof of Theorem 4.2.27, the projection of any vector b on the vector plane Span{u, v} is the point xu + yv where x and y are the solutions of the equation
$$A^TA\begin{bmatrix}x\\y\end{bmatrix} = A^Tb.$$
By multiplying this equation by the inverse of the matrix $A^TA$ we obtain
$$\begin{bmatrix}x\\y\end{bmatrix} = (A^TA)^{-1}A^Tb.$$
Hence the projection of the vector b on the vector plane Span{u, v} is
$$xu + yv = \begin{bmatrix}u_1&v_1\\u_2&v_2\\u_3&v_3\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = A\begin{bmatrix}x\\y\end{bmatrix} = A(A^TA)^{-1}A^Tb.$$
Because of the uniqueness of the projection matrix, this means that
$$A(A^TA)^{-1}A^T$$
is the projection matrix on the vector plane Span{u, v}.

   
Example 4.2.34. We consider the vectors $u = \begin{bmatrix}1\\0\\1\end{bmatrix}$ and $v = \begin{bmatrix}1\\1\\1\end{bmatrix}$. Determine the projection matrix on Span{u, v}.

Solution. Since the vectors are linearly independent, the projection matrix on the vector plane Span{u, v} is
$$\begin{bmatrix}1&1\\0&1\\1&1\end{bmatrix}\left(\begin{bmatrix}1&0&1\\1&1&1\end{bmatrix}\begin{bmatrix}1&1\\0&1\\1&1\end{bmatrix}\right)^{-1}\begin{bmatrix}1&0&1\\1&1&1\end{bmatrix} = \begin{bmatrix}1&1\\0&1\\1&1\end{bmatrix}\begin{bmatrix}2&2\\2&3\end{bmatrix}^{-1}\begin{bmatrix}1&0&1\\1&1&1\end{bmatrix}$$
$$= \begin{bmatrix}1&1\\0&1\\1&1\end{bmatrix}\left(\frac{1}{2}\begin{bmatrix}3&-2\\-2&2\end{bmatrix}\right)\begin{bmatrix}1&0&1\\1&1&1\end{bmatrix} = \frac{1}{2}\begin{bmatrix}1&0\\-2&2\\1&0\end{bmatrix}\begin{bmatrix}1&0&1\\1&1&1\end{bmatrix} = \frac{1}{2}\begin{bmatrix}1&0&1\\0&2&0\\1&0&1\end{bmatrix}.$$
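Formula (4.14) is easy to evaluate numerically; the sketch below (ours, in NumPy) rebuilds the projection matrix of Example 4.2.34:

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, 1.0]])

# A (AᵀA)⁻¹ Aᵀ, the projection matrix on the column space of A, formula (4.14).
P = A @ np.linalg.inv(A.T @ A) @ A.T

assert np.allclose(2 * P, [[1, 0, 1], [0, 2, 0], [1, 0, 1]])  # Example 4.2.34
assert np.allclose(P @ A, A)   # the columns of A are fixed by the projection
```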

The least-squares line


In this short section we present an example of application of Theorem 4.2.27 to
statistics.
Let $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$ be three points in R² such that $x_1 < x_2 < x_3$. These three pairs of numbers are interpreted as the result of an experiment. The numbers $x_1, x_2, x_3$ are called the parameter values, since they represent some parameter that we can control, and the numbers $y_1, y_2, y_3$ are called the observed values, since they represent the observed outcomes for different values of the parameter.

We want to determine a line $y = b_0 + b_1x$ which minimizes the sum
$$(y_1 - b_0 - b_1x_1)^2 + (y_2 - b_0 - b_1x_2)^2 + (y_3 - b_0 - b_1x_3)^2.$$

The line $y = b_0 + b_1x$ is called the least-squares line. It is the line that best fits the points $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$. In statistics it is usually called the regression line of $y_1, y_2, y_3$ on $x_1, x_2, x_3$. The numbers $b_0 + b_1x_1$, $b_0 + b_1x_2$, $b_0 + b_1x_3$ are called the predicted values and the numbers $y_1 - (b_0 + b_1x_1)$, $y_2 - (b_0 + b_1x_2)$, $y_3 - (b_0 + b_1x_3)$ are called the residuals.
To find the least-squares line we use Theorem 4.2.27. If we let
$$X = \begin{bmatrix}1&x_1\\1&x_2\\1&x_3\end{bmatrix},$$
then the numbers $b_0$ and $b_1$ can be found as the solution of the equation
$$X^TX\begin{bmatrix}b_0\\b_1\end{bmatrix} = X^T\begin{bmatrix}y_1\\y_2\\y_3\end{bmatrix}.$$

Example 4.2.35. Determine the least-squares line that best fits the points (1, 3), (3, 4), and (4, 7).

Solution. Let $X = \begin{bmatrix}1&1\\1&3\\1&4\end{bmatrix}$. We have to solve the equation
$$X^TX\begin{bmatrix}b_0\\b_1\end{bmatrix} = X^T\begin{bmatrix}3\\4\\7\end{bmatrix},$$
that is, the equation
$$\begin{bmatrix}3&8\\8&26\end{bmatrix}\begin{bmatrix}b_0\\b_1\end{bmatrix} = \begin{bmatrix}14\\43\end{bmatrix}.$$
The solution is $b_0 = \frac{10}{7}$ and $b_1 = \frac{17}{14}$. The least-squares line that best fits the points (1, 3), (3, 4), and (4, 7) is the line $y = \frac{10}{7} + \frac{17}{14}x$.
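The least-squares line of Example 4.2.35 can be recovered by the same normal-equations computation; this is our own NumPy sketch:

```python
import numpy as np

# Fit y = b0 + b1*x to the points (1, 3), (3, 4), (4, 7).
X = np.array([[1.0, 1.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([3.0, 4.0, 7.0])
b0, b1 = np.linalg.solve(X.T @ X, X.T @ y)

assert np.isclose(b0, 10 / 7) and np.isclose(b1, 17 / 14)  # as in Example 4.2.35
residuals = y - X @ np.array([b0, b1])
assert np.allclose(X.T @ residuals, 0)  # residuals are orthogonal to the columns of X
```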

Figure 4.9: The least-squares line for the points (1, 3), (3, 4), and (4, 7), showing the observed values and the predicted values at x = 1, 3, and 4.



The QR factorization of 3 × 2 matrices


The QR factorization of 2×2 matrices was introduced in Chapter 3. Here we present
an extension of that idea to 3 × 2 matrices.

Theorem 4.2.36. If the columns of a 3 × 2 matrix $A = \begin{bmatrix}c_1&c_2\end{bmatrix}$ are linearly independent, then A can be represented in the form
$$A = QR,$$
where $Q = \begin{bmatrix}u_1&u_2\end{bmatrix}$ is a 3 × 2 matrix such that the columns $u_1$ and $u_2$ are orthonormal vectors in R³, that is,
$$\|u_1\| = \|u_2\| = 1 \quad\text{and}\quad u_1\bullet u_2 = 0,$$
and $R = \begin{bmatrix}r_{11}&r_{12}\\0&r_{22}\end{bmatrix}$ is an upper triangular 2 × 2 matrix such that $r_{11} > 0$ and $r_{22} > 0$.

Moreover, we have
$$R = Q^TA.$$

Proof. Let $A = \begin{bmatrix}c_1&c_2\end{bmatrix}$ be a 3 × 2 matrix such that the vectors $c_1$ and $c_2$ are linearly independent. First we define
$$v_1 = c_1 \quad\text{and}\quad v_2 = c_2 - \operatorname{proj}_{v_1}c_2 = c_2 - \frac{c_2\bullet v_1}{v_1\bullet v_1}\,v_1.$$
The vectors $v_1$ and $v_2$ are orthogonal and the vector $v_2$ is nonzero, because the vectors $c_1$ and $c_2$ are linearly independent. Moreover, we have
$$c_2 = v_2 + \frac{c_2\bullet v_1}{v_1\bullet v_1}\,v_1.$$
Next we define
$$u_1 = \frac{1}{\|v_1\|}\,v_1 \quad\text{and}\quad u_2 = \frac{1}{\|v_2\|}\,v_2,$$
and
$$r_{11} = \|v_1\|, \quad r_{12} = \|v_1\|\,\frac{c_2\bullet v_1}{v_1\bullet v_1}, \quad\text{and}\quad r_{22} = \|v_2\|.$$
Note that the vectors $u_1$ and $u_2$ are orthonormal and we have $r_{11} > 0$ and $r_{22} > 0$. Since
$$c_1 = r_{11}u_1 \quad\text{and}\quad c_2 = r_{12}u_1 + r_{22}u_2,$$
we have
$$A = \begin{bmatrix}c_1&c_2\end{bmatrix} = \begin{bmatrix}u_1&u_2\end{bmatrix}\begin{bmatrix}r_{11}&r_{12}\\0&r_{22}\end{bmatrix},$$
which is the desired QR factorization of A.

Moreover, since

Q^T Q = [u1^T; u2^T] [u1 u2] = [u1 • u1  u1 • u2; u1 • u2  u2 • u2] = [1 0; 0 1],

from the equality A = QR we get

Q^T A = Q^T QR = [1 0; 0 1] R = R.

 
Example 4.2.37. Determine the QR factorization of the matrix A = [1 1; 1 1; 1 2].

Solution. Since

(1, 1, 2) − ((1, 1, 2) · (1, 1, 1))/((1, 1, 1) · (1, 1, 1)) (1, 1, 1) = (1, 1, 2) − (4/3)(1, 1, 1) = −(1/3)(1, 1, −2),

we have

(1, 1, 2) = (4/3)(1, 1, 1) − (1/3)(1, 1, −2) = (4/3)(1, 1, 1) + (1/3)(−1, −1, 2).  (4.15)
By a slight modification of the method from the proof of Theorem 4.2.36 we choose v1 = (1, 1, 1) and v2 = (−1, −1, 2). We have taken v2 = (−1, −1, 2) and not v2 = (1, 1, −2), because the last coefficient of the vector v2 in (4.15) must be positive. Now we calculate

‖(1, 1, 1)‖ = √3 and ‖(−1, −1, 2)‖ = √6

and let

u1 = (1/√3)(1, 1, 1) and u2 = (1/√6)(−1, −1, 2).

Consequently

(1, 1, 1) = √3 u1 and (1, 1, 2) = (4√3/3) u1 + (√6/3) u2.

Now we define

Q = [u1 u2] and R = Q^T A = [u1 u2]^T A = [√3  4√3/3; 0  √6/3].

Thus the QR factorization of the matrix A is

A = [u1 u2] [√3  4√3/3; 0  √6/3].
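A numerical check of this factorization (NumPy; not the book's code) that follows the Gram-Schmidt construction from the proof of Theorem 4.2.36. Note that a library routine such as numpy.linalg.qr may return Q and R with the opposite signs.

```python
import numpy as np

# Gram-Schmidt QR factorization of the 3x2 matrix from Example 4.2.37.
A = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [1.0, 2.0]])
c1, c2 = A[:, 0], A[:, 1]

v1 = c1
v2 = c2 - (c2 @ v1) / (v1 @ v1) * v1   # subtract the projection on v1
u1 = v1 / np.linalg.norm(v1)
u2 = v2 / np.linalg.norm(v2)

Q = np.column_stack([u1, u2])
R = Q.T @ A                            # R = Q^T A by Theorem 4.2.36
print(R)  # approximately [[sqrt(3), 4*sqrt(3)/3], [0, sqrt(6)/3]]
```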

4.2.1 Exercises

Find the following dot products.


       
1. (1, 5, 2) • (7, 2, 4)

2. (3, −4, 1) • (2, 1, 4)

Find the norms of the given vectors.


   
3. (2, 1, −5)

4. (−5, 3, 4)

Find the distance between the given points.


       
5. (1, 7, 3) and (−1, 9, 2)

6. (2, 1, 5) and (7, 2, 4)

Describe the vector plane defined by n • x = 0 in the form Span{a, b}.


   
7. n = (2, 1, −2)

8. n = (3, 4, −1)

Find the projection of the point b on the vector line Span{u} using the formula from
Theorem 4.2.10.

       
9. b = (0, 1, 0) and u = (5, 1, 1)

10. b = (2, 1, 1) and u = (2, −2, 1)

11. b = (3, 1, 1) and u = (1, 2, 2)

12. b = (1, a, b) and u = (1, 0, −1)

13. Find x which minimizes the sum (1 − 3x)² + (2 − x)² + (2 + x)² using the projection on the vector line Span{(3, 1, −1)}.

14. Find x which minimizes the sum (1 + x)² + (1 − x)² + (3 − 2x)² using the projection on the vector line Span{(−1, 1, 2)}.
 

Find the projection matrix on the given vector line.


   
 1   3 
15. Span 0 17. Span −2
2 5
   

   
 1   3 
16. Span 1 18. Span  2
1
  
−1

Find the projection of b on the vector line Span{u} using Theorem 4.2.13.
       
19. b = (1, 1, 1) and u = (1, 1, −2)

20. b = (1, 3, −2) and u = (1, 0, −1)

21. b = (x, y, z) and u = (1, 2, 2)

22. b = (x, y, z) and u = (2, 1, −2)

Find two different orthogonal bases in the vector plane Span {u, v} (See Example
4.2.19).
       
1 1 2 3
23. u =  1 and v = 0 25. u = 1 and v =  2
−2 2 5 −4
       
2 −1 3 1
24. u = 2 and v = −2 26. u = 1 and v = 2
1 1 1 1

Find an orthonormal basis in the given vector plane.

27. 2x + y − 2z = 0

28. 3x − 2y + z = 0

29. x − y + 2z = 0

30. 3x + y − z = 0

Find the projection of the point b on Span{u, v} where u · v = 0.


           
31. b = (1, 2, 2), u = (1, −1, −1), v = (1, −1, 2)

32. b = (1, 0, 0), u = (1, 0, −1), v = (1, −1, 1)

33. b = (0, 0, 1), u = (1, 1, −4), v = (5, −1, 1)

34. b = (1, 0, 0), u = (1, 1, 1), v = (1, −2, 1)

Find the projection of the point b on Span{u, v}.


           
35. b = (1, 0, 1), u = (1, 2, −1), v = (1, 1, 1)

36. b = (2, 1, −1), u = (1, 1, 2), v = (2, −1, 1)

37. b = (2, 3, 1), u = (1, 0, 1), v = (1, 1, 0)

38. b = (3, 2, 2), u = (1, −1, −1), v = (−1, 1, 1)

Find the distance of the point b to the vector plane Span{u, v}.
           
39. b = (2, 3, 1), u = (1, 2, 1), v = (2, 1, 2)

40. b = (1, 1, 2), u = (1, 0, 1), v = (2, 2, −1)

41. b = (1, 3, 5), u = (1, 1, 1), v = (1, −1, 1)

42. b = (1, 1, 1), u = (2, 1, −1), v = (1, 1, 3)

Find the projection matrix on Span{u, v}.


       
43. u = (−1, 1, 1) and v = (1, 2, −1)

44. u = (1, 2, 1) and v = (2, −1, 0)

45. u = (1, 2, −1) and v = (1, 0, 1)

46. u = (1, 1, 1) and v = (2, −1, −1)

       
47. u = (1, 1, 2) and v = (2, 1, 1)

48. u = (1, 1, 1) and v = (1, −1, 1)

49. u = (1, 2, 3) and v = (1, 2, 7)

50. u = (1, 1, 2) and v = (1, −2, −1)

Using Theorem 4.2.27 find numbers x and y which minimize the sum.

51. (2 − x + y)² + (1 − 2x + y)² + (1 + x − y)²

52. (2 + y)² + (1 − x + y)² + (1 + x − y)²

53. (1 + x − y)² + (1 + x + y)² + (1 − x − y)²

54. (1 + x − y)² + (1 + y)² + (1 − x − y)²

55. (2 + x − y)² + (1 + 2x − 2y)² + (1 − x + y)² (Explain why the solution is not unique.)

56. (1 − 2x + y)² + (1 + 2x − y)² + (1 − 4x + 2y)² (Explain why the solution is not unique.)

Using (4.12) find numbers x and y which minimize the sum.

57. (2 − y)² + (1 − x)² + (1 − 2x − y)²

58. (2 − x − y)² + (1 − x + 3y)² + (1 − x − y)²

59. (2 + x − y)² + (1 + x + y)² + (1 − x + y)²

60. (1 − x − y)² + (1 − 2x + y)² + (1 + x − 3y)²

Find the projection of the point b on Span{u, v}. (See Example 4.2.32).
           
61. b = (1, 2, 1), u = (1, 1, −1), v = (3, 2, 2)

62. b = (1, 0, 1), u = (1, 2, 0), v = (1, 3, 2)

63. b = (1, 1, 1), u = (2, 3, 5), v = (0, 1, 1)

64. b = (1, 1, 1), u = (2, 3, 5), v = (0, 1, 1)

Using (4.13) find numbers x and y which minimize the sum.

65. (2 − y)² + (1 − x)² + (1 − x − y)²

66. (1 − x − y)² + (1 − 2x + y)² + (1 − 2y)²

Find the projection matrix on Span{u, v} using formula (4.14).



       
67. u = (1, 1, 0), v = (1, 0, 2)

68. u = (0, 1, 0), v = (1, 5, 2)

69. Assume that u ≠ 0, v ≠ 0, and u • v = 0. Use Theorem 4.2.33 to show that the projection of the point b ∈ R3 on the vector plane Span{u, v} is

((b • u)/(u • u)) u + ((b • v)/(v • v)) v.

70. Show that the intersection of two different vector planes is a vector line.

Find the least squares line that best fits the given points.

71. (1, 7), (2, 4), and (4, 1) 73. (−1, 0), (2, 1), and (3, 4)

72. (0, 4), (2, 1), and (3, −1) 74. (1, 1), (2, 5), and (5, 3)

Determine the QR factorization of the given matrix.


   
75. [2 0; 1 −1; 0 1]

76. [1 2; 1 0; 1 1]

77. [1 1; 1 0; 0 3]

Chapter 5

Determinants and bases in R3

5.1 The cross product


The definition of the cross product
The numbers

det [u1 v1; u2 v2], det [u2 v2; u3 v3], det [u1 v1; u3 v3]
that appear in Theorems 4.1.8 and 4.1.12 play a central role in the next result.

   
Theorem 5.1.1. If the vectors u = (u1, u2, u3) and v = (v1, v2, v3) are linearly independent, then the solution of the system

x • u = 0
x • v = 0     (5.1)

is

x = t ( det [u2 v2; u3 v3], −det [u1 v1; u3 v3], det [u1 v1; u2 v2] ),
where t is an arbitrary real number.

   
Proof. Since u = (u1, u2, u3) and v = (v1, v2, v3) are linearly independent, at least one of the

numbers

det [u1 v1; u2 v2], det [u2 v2; u3 v3], det [u1 v1; u3 v3]

must be different from 0, by (c) in Theorem 4.1.12. Suppose that

det [u1 v1; u2 v2] = det [u1 u2; v1 v2] ≠ 0.

The system (5.1) can be written as

u1 x + u2 y + u3 z = 0
v1 x + v2 y + v3 z = 0

or

u1 x + u2 y = −u3 z
v1 x + v2 y = −v3 z.
From Cramer's rule (Theorem 1.3.10), for every value of z we get a unique solution for x and y:

x = det [−u3 z  u2; −v3 z  v2] / det [u1 u2; v1 v2] = z det [u2 v2; u3 v3] / det [u1 v1; u2 v2]

and

y = det [u1  −u3 z; v1  −v3 z] / det [u1 u2; v1 v2] = −z det [u1 v1; u3 v3] / det [u1 v1; u2 v2].
Note that we also have

z = z det [u1 v1; u2 v2] / det [u1 v1; u2 v2].

If we denote

t = z / det [u1 v1; u2 v2],

then we have

x = t det [u2 v2; u3 v3], y = −t det [u1 v1; u3 v3], and z = t det [u1 v1; u2 v2].

It is easy to verify that if t is an arbitrary real number, then

x = t det [u2 v2; u3 v3], y = −t det [u1 v1; u3 v3], and z = t det [u1 v1; u2 v2]

is a solution of the system (5.1).

Consequently, the solution of the system (5.1) is

x = (x, y, z) = t ( det [u2 v2; u3 v3], −det [u1 v1; u3 v3], det [u1 v1; u2 v2] ),

where t is an arbitrary real number.


If

det [u2 v2; u3 v3] ≠ 0 or det [u1 v1; u3 v3] ≠ 0,

the system is solved in a similar way, with obvious modifications.

The above theorem can be rephrased as follows: x, y, and z solve the system

(x, y, z) • (u1, u2, u3) = 0
(x, y, z) • (v1, v2, v3) = 0     (5.2)

if and only if (x, y, z) is a point on the vector line

Span{ ( det [u2 v2; u3 v3], −det [u1 v1; u3 v3], det [u1 v1; u2 v2] ) }.     (5.3)

   
Theorem 5.1.1 tells us that, if the vectors u = (u1, u2, u3) and v = (v1, v2, v3) are linearly independent, then the only vector line perpendicular to both Span{u} and Span{v} is the vector line (5.3). This geometric interpretation motivates the definition of the cross product, which is an important tool in R3.
Figure 5.1: The vector line perpendicular to both Span{u} and Span{v}.

   
Definition 5.1.2. By the cross product of the vectors u = (u1, u2, u3) and v = (v1, v2, v3) we mean the vector

u × v = ( det [u2 v2; u3 v3], −det [u1 v1; u3 v3], det [u1 v1; u2 v2] ).

Note that, unlike the dot product, the cross product of two elements from R3 is
an element from R3 .
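Definition 5.1.2 translates directly into code. The sketch below (not from the book; the helper names are ours) computes each component of u × v as a 2 × 2 determinant and agrees with NumPy's built-in numpy.cross.

```python
import numpy as np

def det2(a, b, c, d):
    # determinant of the 2x2 matrix [a b; c d]
    return a * d - b * c

def cross(u, v):
    # the cross product of Definition 5.1.2, component by component
    return np.array([
        det2(u[1], v[1], u[2], v[2]),    #  det [u2 v2; u3 v3]
        -det2(u[0], v[0], u[2], v[2]),   # -det [u1 v1; u3 v3]
        det2(u[0], v[0], u[1], v[1]),    #  det [u1 v1; u2 v2]
    ])

u = np.array([2, 1, 4])
v = np.array([1, 5, 3])
print(cross(u, v))  # [-17  -2   9], the same as np.cross(u, v)
```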
Using the cross product, we can state Theorem 5.1.1 in a form that is easier to
remember:

If the vectors u and v in R3 are linearly independent, then the solution of the system

x • u = 0
x • v = 0     (5.4)

is x = t (u × v), where t is an arbitrary real number.

The theorem can be used to solve systems of two equations with three unknowns,
as the next example illustrates.

Example 5.1.3. Solve the system

2x + y + z = 0
x − y + 2z = 0

Solution. First observe that the system can be written in the form

(x, y, z) • (2, 1, 1) = 0
(x, y, z) • (1, −1, 2) = 0.

Consequently, the general solution is

(x, y, z) = t (2, 1, 1) × (1, −1, 2) = t (3, −3, −3) = (3t, −3t, −3t),

where t is an arbitrary real number. Note that the same solution can be described in a simpler equivalent way as

(x, y, z) = (s, −s, −s),

where s is an arbitrary real number.
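A quick numerical check of Example 5.1.3 (NumPy; not part of the book): the solution line of the system is spanned by u × v.

```python
import numpy as np

# The system 2x + y + z = 0, x - y + 2z = 0 written with normal vectors:
u = np.array([2.0, 1.0, 1.0])
v = np.array([1.0, -1.0, 2.0])

x = np.cross(u, v)       # one nonzero solution; all others are multiples
print(x)                 # [ 3. -3. -3.]
print(x @ u, x @ v)      # 0.0 0.0 -- x satisfies both equations
```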

The cross product gives us an elegant and useful characterization of linear inde-
pendence of pairs of vectors in R3 .

Theorem 5.1.4. Vectors u and v in R3 are linearly independent if and only if


u × v ≠ 0.

Proof. The property is a direct consequence of Theorem 4.1.12.

Example 5.1.5. Since

(1, 0, 1) × (2, 1, 1) = (−1, 1, 1),

the vectors (1, 0, 1) and (2, 1, 1) are linearly independent.
On the other hand, since

(−4, −2, −2) × (2, 1, 1) = (0, 0, 0),

the vectors (−4, −2, −2) and (2, 1, 1) are linearly dependent.

In the next theorem we gather some algebraic properties of the cross product.

Theorem 5.1.6. Let u, v, and w be arbitrary vectors in R3 and let x and y be


arbitrary real numbers. Then

(a) u • (u × v) = 0 and v • (u × v) = 0;

(b) u × u = 0;

(c) u × v = −(v × u);

(d) (xu + yv) × w = x(u × w) + y(v × w);

(e) u • (v × w) = −v • (u × w);

(f) u • (v × w) = v • (w × u) = w • (u × v).


    
Proof. Let u = (u1, u2, u3), v = (v1, v2, v3), and w = (w1, w2, w3).
Part (a) is an immediate consequence of Theorem 5.1.1.
For (b) it suffices to note that

det [u2 u2; u3 u3] = det [u1 u1; u3 u3] = det [u1 u1; u2 u2] = 0.

From

v × u = ( det [v2 u2; v3 u3], −det [v1 u1; v3 u3], det [v1 u1; v2 u2] )
= ( −det [u2 v2; u3 v3], det [u1 v1; u3 v3], −det [u1 v1; u2 v2] )
= −( det [u2 v2; u3 v3], −det [u1 v1; u3 v3], det [u1 v1; u2 v2] ),

we get (c): v × u = −(u × v).


Since for any real numbers x and y we have

(xu + yv) × w = ( det [xu2 + yv2  w2; xu3 + yv3  w3], −det [xu1 + yv1  w1; xu3 + yv3  w3], det [xu1 + yv1  w1; xu2 + yv2  w2] )
= ( det [xu2 w2; xu3 w3], −det [xu1 w1; xu3 w3], det [xu1 w1; xu2 w2] ) + ( det [yv2 w2; yv3 w3], −det [yv1 w1; yv3 w3], det [yv1 w1; yv2 w2] )
= x(u × w) + y(v × w),

we obtain (d).
Part (e) can be obtained easily from the properties of the dot product and the
cross product already established. Indeed, since by (a),(b), and (c) we have

0 = (u + v) • ((u + v) × w)
= u • ((u + v) × w) + v • ((u + v) × w)
= u • (u × w) + u • (v × w) + v • (u × w) + v • (v × w)
= u • (v × w) + v • (u × w),

we have u • (v × w) = −v • (u × w).
To obtain (f) we use (c) and (e):

u • (v × w) = −v • (u × w) = v • (w × u).

The equality v • (w × u) = w • (u × v) is obtained in the same way.
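The identities of Theorem 5.1.6 are easy to spot-check numerically. The sketch below (not from the book) tests them on random integer vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
u, v, w = rng.integers(-5, 5, size=(3, 3)).astype(float)
x, y = 2.0, -3.0

assert np.isclose(u @ np.cross(u, v), 0)                      # (a)
assert np.allclose(np.cross(u, u), 0)                         # (b)
assert np.allclose(np.cross(u, v), -np.cross(v, u))           # (c)
assert np.allclose(np.cross(x * u + y * v, w),
                   x * np.cross(u, w) + y * np.cross(v, w))   # (d)
assert np.isclose(u @ np.cross(v, w), -(v @ np.cross(u, w)))  # (e)
assert np.isclose(u @ np.cross(v, w), w @ np.cross(u, v))     # (f)
print("all identities hold")
```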

The equation of a vector plane


The following theorem nicely complements Theorem 5.1.1. Both theorems provide
some insight into the geometric interpretation of the cross product.

Theorem 5.1.7. Let u and v be linearly independent vectors in R3 . Then a


vector x is in the vector plane Span{u, v} if and only if

x • (u × v) = 0 (5.5)

Proof. The vector x = su + tv satisfies (5.5) for any real numbers s and t, by Theorem 5.1.6.
Let

u = (u1, u2, u3) and v = (v1, v2, v3)

be linearly independent vectors and let x = (x1, x2, x3) be such that (5.5) holds. Since u
and v are linearly independent, at least one of the numbers

det [u1 v1; u2 v2], det [u2 v2; u3 v3], det [u1 v1; u3 v3]

must be different from 0, by (c) in Theorem 4.1.12. Suppose that

det [u1 v1; u2 v2] ≠ 0.

Then the vectors (u1, u2) and (v1, v2) are linearly independent and there exist real numbers s and t such that

(x1, x2) = s (u1, u2) + t (v1, v2),
by Theorem 3.1.22. Hence

x1 = su1 + tv1 and x2 = su2 + tv2.  (5.6)

Since x • (u × v) = 0, u • (u × v) = 0, and v • (u × v) = 0, we have

x • (u × v) − su • (u × v) − t v • (u × v) = (x − su − t v) • (u × v) = 0.

Consequently, using (5.6), we obtain

0 = (x1 − su1 − tv1, x2 − su2 − tv2, x3 − su3 − tv3) • ( det [u2 v2; u3 v3], −det [u1 v1; u3 v3], det [u1 v1; u2 v2] )
= (0, 0, x3 − su3 − tv3) • ( det [u2 v2; u3 v3], −det [u1 v1; u3 v3], det [u1 v1; u2 v2] )
= (x3 − su3 − tv3) det [u1 v1; u2 v2].

Since det [u1 v1; u2 v2] ≠ 0, we must have

x3 − su3 − tv3 = 0.
This, together with (5.6), gives us

x1 = su1 + tv1
x2 = su2 + tv2
x3 = su3 + tv3,

which is equivalent to

(x1, x2, x3) = s (u1, u2, u3) + t (v1, v2, v3).

This shows that, if det [u1 v1; u2 v2] ≠ 0, then there are numbers s and t such that x = su + tv. The other cases, that is, when det [u2 v2; u3 v3] ≠ 0 or det [u1 v1; u3 v3] ≠ 0, are treated in a similar way with appropriate modifications.


Figure 5.2: The vector plane Span{u, v} is the set of all points x such that the angle
∠n0x is a right angle.

The above theorem gives the following geometric interpretation of a vector plane:
If u and v are linearly independent, then the vector plane Span{u, v} consists of all
vectors perpendicular to the vector line Span{u×v}. In other words, the vector plane
Span{u, v} is the set of all points x such that the angle ∠n0x, where n = u×v, is a right
angle, see Fig. 5.2.

Example 5.1.8. Find an equation of the vector plane which contains the vectors u = (2, 1, 4) and v = (1, 5, 3).

Solution. Since u × v = (−17, −2, 9), the vector plane can be described by the equation

(x, y, z) • (−17, −2, 9) = 0

or −17x − 2y + 9z = 0.

   
Example 5.1.9. We consider the vectors u = (1, 2, 2) and v = (3, 1, 2). Find a real number a such that the vector (a, 2a + 1, 1) is in Span{u, v}.

Solution. First we find that u × v = (2, 4, −5). According to Theorem 5.1.7, the vector (a, 2a + 1, 1) is in Span{u, v} if and only if

(a, 2a + 1, 1) • (u × v) = (a, 2a + 1, 1) • (2, 4, −5) = 2a + 8a + 4 − 5 = 10a − 1 = 0.

This gives us a = 0.1 and thus

(a, 2a + 1, 1) = (0.1, 1.2, 1).

 
Note that the vector (1, 12, 10) must also be an element of the vector plane Span{u, v}. It is easy to check that

(1, 12, 10) = 7 (1, 2, 2) − 2 (3, 1, 2).
10 2 2

In the previous chapter we considered the question of when two pairs of vectors
span the same vector subspace. The following theorem complements that discus-
sion.

Theorem 5.1.10. Let a, b, u, and v be vectors in R3 such that the vectors a and
b are linearly independent and the vectors u and v are linearly independent.
Then the following conditions are equivalent

(a) {a, b} is a basis of the vector plane Span{u, v};

(b) There is a real number λ ≠ 0 such that a × b = λ(u × v).

Proof. If a, b are elements of Span{u, v}, then there are real numbers q, r, s, t such
that
a = qu + r v and b = su + t v
and, using Theorem 5.1.6, we obtain
a × b = (qu + r v) × (su + t v)
= q s(u × u) + q t (u × v) + r s(v × u) + r t (v × v)
= q t (u × v) + r s(v × u)
= (q t − r s)(u × v).
If we let λ = qt − rs, then we have a × b = λ(u × v). Moreover, if the vectors a and b are linearly independent, then we must have λ ≠ 0, by Theorem 5.1.4. Thus (a) implies (b).
If there is a real number λ ≠ 0 such that a × b = λ(u × v), then we have x • (a × b) = 0 if and only if x • (u × v) = 0. Consequently
Span{a, b} = Span{u, v},
by Theorem 5.1.7. Thus (b) implies (a).

   
 −1 3 
Example 5.1.11. Show that the set  7 , 0 is a basis of the vector plane
0 7
 
   
 1 1 
Span 2 , −1 .
3 2
 

Solution. Since

(1, 2, 3) × (1, −1, 2) = (7, 1, −3)

and

(−1, 7, 0) × (3, 0, 7) = (49, 7, −21) = 7 (7, 1, −3),

{(−1, 7, 0), (3, 0, 7)} is a basis of the vector plane Span{(1, 2, 3), (1, −1, 2)}, by Theorem 5.1.10.
   

Determinants of 3 × 3 matrices
From Theorem 5.1.6 it follows that

a • (b × c) = b • (c × a) = c • (a × b)

for arbitrary vectors a, b, c in R3 . At this point it is not obvious, but this mixed prod-
uct extends the notion of the determinant of a 2 × 2 matrix to 3 × 3 matrices.

Definition 5.1.12. Let a, b, and c be arbitrary column vectors in R3. The common value

a • (b × c) = b • (c × a) = c • (a × b)     (5.7)

is called the determinant of the 3 × 3 matrix [a b c] and is denoted by det [a b c].

According to the definition of the determinant we have

det [a b c] = (a1, a2, a3) • ((b1, b2, b3) × (c1, c2, c3))
= (a1, a2, a3) • ( det [b2 c2; b3 c3], −det [b1 c1; b3 c3], det [b1 c1; b2 c2] )
= a1 det [b2 c2; b3 c3] − a2 det [b1 c1; b3 c3] + a3 det [b1 c1; b2 c2].

The identity

det [a1 b1 c1; a2 b2 c2; a3 b3 c3] = a1 det [b2 c2; b3 c3] − a2 det [b1 c1; b3 c3] + a3 det [b1 c1; b2 c2]     (5.8)

connects determinants of 3 × 3 matrices with determinants of 2 × 2 matrices. It also gives us a practical way of calculating determinants of 3 × 3 matrices.

Example 5.1.13. For

a = (2, 1, 3), b = (1, −2, 2), and c = (−3, 1, −1),

calculate the determinant det [a b c].

Solution.

det [a b c] = det [2 1 −3; 1 −2 1; 3 2 −1]
= 2 det [−2 1; 2 −1] − det [1 −3; 2 −1] + 3 det [1 −3; −2 1]
= 2 · 0 − 5 + 3 · (−5)
= −20.
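A check of Example 5.1.13 (not from the book; the helper names are ours) that implements the cofactor expansion (5.8) along the first column:

```python
import numpy as np

def det2(a, b, c, d):
    # determinant of the 2x2 matrix [a b; c d]
    return a * d - b * c

def det3(M):
    # expansion (5.8) along the first column of M = [a b c]
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = M
    return (a1 * det2(b2, c2, b3, c3)
            - a2 * det2(b1, c1, b3, c3)
            + a3 * det2(b1, c1, b2, c2))

M = np.array([[2.0, 1.0, -3.0],
              [1.0, -2.0, 1.0],
              [3.0, 2.0, -1.0]])
print(det3(M))           # -20.0
print(np.linalg.det(M))  # -20.0 up to rounding
```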

In the following theorem we list some useful properties of determinants.



Theorem 5.1.14. For any a, b, c, d in R3 and s, t in R we have


(a) det [a a b] = det [a b a] = det [b a a] = 0;
(b) det [a b c] = −det [a c b] = −det [c b a] = −det [b a c];

(c) t det [a b c] = det [ta b c] = det [a tb c] = det [a b tc];

(d) det [a + d b c] = det [a b c] + det [d b c];

(e) det [a b + d c] = det [a b c] + det [a d c];

(f) det [a b c + d] = det [a b c] + det [a b d];

(g) det [a + sc b + tc c] = det [a b c];

(h) det [a + sb b c + tb] = det [a b c];

(i) det [a b + sa c + ta] = det [a b c].

Proof. These identities follow easily from the definition of the determinant. We prove (g) and leave the other proofs as exercises. In the proof we are using the definition of the determinant and Theorem 5.1.6:

det [a + sc b + tc c] = c • ((a + sc) × (b + tc))
= c • (a × b + t(a × c) + s(c × b) + st(c × c))
= c • (a × b + t(a × c) + s(c × b))
= c • (a × b) + t(c • (a × c)) + s(c • (c × b))
= c • (a × b) = det [a b c].

Here is another useful property of determinants.

Theorem 5.1.15. For any 3 × 3 matrix A we have

det A^T = det A,

that is,

det [a1 b1 c1; a2 b2 c2; a3 b3 c3] = det [a1 a2 a3; b1 b2 b3; c1 c2 c3].

Proof. The identity can be obtained directly from the definition of the determinant
by straightforward calculations.

Now we show that the determinant of the product of two 3 × 3 matrices is equal
to the product of determinants of those matrices. We proved the same property
for determinants of 2 × 2 matrices in Theorem 1.3.4. In both cases the proofs are
based on direct calculations. As expected, the proof for 3 × 3 matrices is much more
tedious.

Theorem 5.1.16. Let A and B be arbitrary 3 × 3 matrices. Then

det(AB) = det(A) det(B).

Proof. Let

A = [a1 b1 c1; a2 b2 c2; a3 b3 c3] and B = [s1 t1 u1; s2 t2 u2; s3 t3 u3]

and

a = (a1, a2, a3), b = (b1, b2, b3), and c = (c1, c2, c3).
Then

det(AB) = det ([a1 b1 c1; a2 b2 c2; a3 b3 c3] [s1 t1 u1; s2 t2 u2; s3 t3 u3])
= det [s1 a + s2 b + s3 c   t1 a + t2 b + t3 c   u1 a + u2 b + u3 c]
= det [s1 a   t1 a + t2 b + t3 c   u1 a + u2 b + u3 c]
+ det [s2 b   t1 a + t2 b + t3 c   u1 a + u2 b + u3 c]
+ det [s3 c   t1 a + t2 b + t3 c   u1 a + u2 b + u3 c]
= det [s1 a   t2 b + t3 c   u2 b + u3 c]
+ det [s2 b   t1 a + t3 c   u1 a + u3 c]
+ det [s3 c   t1 a + t2 b   u1 a + u2 b]
= det [s1 a   t2 b   u3 c] + det [s1 a   t3 c   u2 b]
+ det [s2 b   t1 a   u3 c] + det [s2 b   t3 c   u1 a]
+ det [s3 c   t1 a   u2 b] + det [s3 c   t2 b   u1 a]
= det [s1 a   t2 b   u3 c] − det [s1 a   u2 b   t3 c]
− det [t1 a   s2 b   u3 c] + det [u1 a   s2 b   t3 c]
+ det [t1 a   u2 b   s3 c] − det [u1 a   t2 b   s3 c]
= s1 t2 u3 det [a b c] − s1 t3 u2 det [a b c]
− s2 t1 u3 det [a b c] + s2 t3 u1 det [a b c]
+ s3 t1 u2 det [a b c] − s3 t2 u1 det [a b c]
= det [a b c] (s1 (t2 u3 − t3 u2) − s2 (t1 u3 − t3 u1) + s3 (t1 u2 − t2 u1))
= det [a b c] det [s1 t1 u1; s2 t2 u2; s3 t3 u3]
= det(A) det(B).
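Theorem 5.1.16 is easy to spot-check numerically (not from the book):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-3, 4, size=(3, 3)).astype(float)
B = rng.integers(-3, 4, size=(3, 3)).astype(float)

lhs = np.linalg.det(A @ B)
rhs = np.linalg.det(A) * np.linalg.det(B)
print(np.isclose(lhs, rhs))  # True
```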

In Section 5.3 we will see that the determinant det [a b c] tells us something important about the vectors a, b, c and has many useful properties. Later we will also show that the determinant of a 3 × 3 matrix can be interpreted as the volume of a tetrahedron.

5.1.1 Exercises

Calculate the given cross products.


       
1. (5, 2, 7) × (1, 4, −2)

2. (3, 2, 1) × (1, 2, 3)

3. (2, 1, 3) × (3, −2, 5)

4. (2, 0, 7) × (1, 1, 1)

Solve the given systems of equations.


5. x + 2y − 4z = 0, x − 2y + 5z = 0

6. 4x − y − z = 0, x + y − 2z = 0

7. 5x + 2y + 7z = 0, x + y + 2z = 0

8. y + z = 0, x + y + 5z = 0

Find an equation of the given vector planes.


       
 1 −1   2 3 
9. Span 1 ,  2 11. Span 1 ,  2
1 1 5
  
−4

       
 2 5   1 1 
10. Span 1 , 2 12. Span −2 ,  1
1 1 2
  
−1


Calculate the determinant det A.


   
13. A = [1 1 3; 2 5 −1; 1 4 2]

14. A = [3 1 2; 2 3 1; 1 2 3]

15. A = [4 2 −3; −2 1 2; 3 7 −1]

16. A = [2 1 1; 1 2 1; 1 1 2]

Show that the following identities hold for arbitrary vectors a, b, c and d, and for
arbitrary numbers s and t .
17. det [a a b] = det [a b a] = det [b a a] = 0

18. det [a b c] = −det [a c b] = −det [c b a] = −det [b a c]

19. det [a + d b c] = det [a b c] + det [d b c]

20. det [a b c + d] = det [a b c] + det [a b d]

21. det [a + sb b c + tb] = det [a b c]

22. det [a b + sa c + ta] = det [a b c]

23. det [sa tb uc] = stu det [a b c]

24. det [a a + b a + b + c] = det [a b c]

25. Show that

det [a1 b1 c1; a2 b2 c2; a3 b3 c3] = det [a1 + sa2  b1 + sb2  c1 + sc2; a2 b2 c2; a3 + ta2  b3 + tb2  c3 + tc2].

26. Show that

det [a1 b1 c1; a2 b2 c2; a3 b3 c3] = det [a1 b1 c1; a2 + sa1  b2 + sb1  c2 + sc1; a3 + ta1  b3 + tb1  c3 + tc1].
 
27. Show that det [2 1 2; 2 4 2; 7 5 7] = 0 without calculating the determinant.

28. Show that det [1 2 1; 2 4 1; 5 10 2] = 0 without calculating the determinant.
 
If det [a b c; p q r; x y z] = 33, find the following determinants.

   
29. det [a + 9b  b  c; p + 9q  q  r; x + 9y  y  z]

30. det [5a b c; 5p q r; 5x y z]

31. det [3a + 5b  b  c; 3p + 5q  q  r; 3x + 5y  y  z]

32. det [a + b + c  b  c; p + q + r  q  r; x + y + z  y  z]

33. Show that

det ([1 s 0; 0 1 0; 0 t 1] [a1 b1 c1; a2 b2 c2; a3 b3 c3]) = det [a1 b1 c1; a2 b2 c2; a3 b3 c3].

34. Show that

det ([a1 b1 c1; a2 b2 c2; a3 b3 c3] [1 0 s; 0 1 t; 0 0 1]) = det [a1 b1 c1; a2 b2 c2; a3 b3 c3].

5.2 Calculating inverses and determinants of 3 × 3 matrices

We have seen that solving problems often requires calculating determinants of ma-
trices or inverse matrices. In this section we present some practical methods for
calculating determinants and inverses of 3 × 3 matrices.
Recall that the product of two 3 × 3 matrices is defined as follows: the (i, j) entry of the product is the product of the i-th row of the first matrix with the j-th column of the second matrix,

or equivalently

[s1 t1 u1; s2 t2 u2; s3 t3 u3] [a1 b1 c1; a2 b2 c2; a3 b3 c3]
= [ (s1, t1, u1) • (a1, a2, a3)   (s1, t1, u1) • (b1, b2, b3)   (s1, t1, u1) • (c1, c2, c3);
    (s2, t2, u2) • (a1, a2, a3)   (s2, t2, u2) • (b1, b2, b3)   (s2, t2, u2) • (c1, c2, c3);
    (s3, t3, u3) • (a1, a2, a3)   (s3, t3, u3) • (b1, b2, b3)   (s3, t3, u3) • (c1, c2, c3) ].     (5.9)

Theorem 5.2.1. Let

A = [a11 a12 a13; a21 a22 a23; a31 a32 a33]

be an arbitrary 3 × 3 matrix and let

B = [ (a21, a22, a23) × (a31, a32, a33)   (a31, a32, a33) × (a11, a12, a13)   (a11, a12, a13) × (a21, a22, a23) ].

Then

AB = BA = [det A 0 0; 0 det A 0; 0 0 det A].

Proof. To simplify the calculations we denote

A1 = (a11, a12, a13), A2 = (a21, a22, a23), and A3 = (a31, a32, a33).

Now we can write

A = [A1^T; A2^T; A3^T],  A^T = [A1 A2 A3],  and B = [A2 × A3   A3 × A1   A1 × A2].

From the definition of the determinant of a 3 × 3 matrix we have

det [A1 A2 A3] = A1 • (A2 × A3) = A2 • (A3 × A1) = A3 • (A1 × A2)

and, by Theorem 5.1.15, we have

det [A1 A2 A3] = det A^T = det A.

Now, from (5.9) it follows that

AB = [ A1 • (A2 × A3)   A1 • (A3 × A1)   A1 • (A1 × A2);
       A2 • (A2 × A3)   A2 • (A3 × A1)   A2 • (A1 × A2);
       A3 • (A2 × A3)   A3 • (A3 × A1)   A3 • (A1 × A2) ]
= [ A1 • (A2 × A3)  0  0;  0  A2 • (A3 × A1)  0;  0  0  A3 • (A1 × A2) ]
= [ det A  0  0;  0  det A  0;  0  0  det A ].

Next we show that

BA = [det A 0 0; 0 det A 0; 0 0 det A].
For this part of the proof we define

C1 = (a11, a21, a31), C2 = (a12, a22, a32), C3 = (a13, a23, a33)

and note that

B = [ A2 × A3   A3 × A1   A1 × A2 ]
= [ det [a22 a23; a32 a33]   −det [a12 a13; a32 a33]   det [a12 a13; a22 a23];
    −det [a21 a23; a31 a33]   det [a11 a13; a31 a33]   −det [a11 a13; a21 a23];
    det [a21 a22; a31 a32]   −det [a11 a12; a31 a32]   det [a11 a12; a21 a22] ]
= [ C2 × C3   C3 × C1   C1 × C2 ]^T.

Since

A^T = [C1^T; C2^T; C3^T] and C1 • (C2 × C3) = C2 • (C3 × C1) = C3 • (C1 × C2) = det A,

we have

BA = [ C2 × C3   C3 × C1   C1 × C2 ]^T [C1 C2 C3]

= ( [C1^T; C2^T; C3^T] [ C2 × C3   C3 × C1   C1 × C2 ] )^T
= [ C1 • (C2 × C3)   C1 • (C3 × C1)   C1 • (C1 × C2);
    C2 • (C2 × C3)   C2 • (C3 × C1)   C2 • (C1 × C2);
    C3 • (C2 × C3)   C3 • (C3 × C1)   C3 • (C1 × C2) ]^T
= [ C1 • (C2 × C3)  0  0;  0  C2 • (C3 × C1)  0;  0  0  C3 • (C1 × C2) ]^T
= [ det A  0  0;  0  det A  0;  0  0  det A ]^T
= [ det A  0  0;  0  det A  0;  0  0  det A ].


Example 5.2.2. We consider the matrix $\begin{bmatrix} 3 & 1 & 2 \\ 1 & -1 & 4 \\ 1 & 0 & 3 \end{bmatrix}$. Since
$$\begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} \times \begin{bmatrix} 1 \\ -1 \\ 4 \end{bmatrix} = \begin{bmatrix} 6 \\ -10 \\ -4 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ -1 \\ 4 \end{bmatrix} \times \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix} = \begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix} \times \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} -3 \\ 7 \\ 1 \end{bmatrix},$$
and
$$\det \begin{bmatrix} 3 & 1 & 2 \\ 1 & -1 & 4 \\ 1 & 0 & 3 \end{bmatrix} = -6,$$

we have
$$\begin{bmatrix} 3 & 1 & 2 \\ 1 & -1 & 4 \\ 1 & 0 & 3 \end{bmatrix} \begin{bmatrix} -3 & -3 & 6 \\ 1 & 7 & -10 \\ 1 & 1 & -4 \end{bmatrix} = \begin{bmatrix} -3 & -3 & 6 \\ 1 & 7 & -10 \\ 1 & 1 & -4 \end{bmatrix} \begin{bmatrix} 3 & 1 & 2 \\ 1 & -1 & 4 \\ 1 & 0 & 3 \end{bmatrix} = \begin{bmatrix} -6 & 0 & 0 \\ 0 & -6 & 0 \\ 0 & 0 & -6 \end{bmatrix}.$$
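The identity AB = BA = (det A)I of Theorem 5.2.1 is easy to check numerically. The following pure-Python sketch (the helper names `cross`, `dot`, and `mat_mul` are our own, not from the text) rebuilds the matrix B of Example 5.2.2 from cross products of the rows of A:

```python
def cross(u, v):
    # cross product of two vectors in R^3
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

def dot(u, v):
    return sum(x*y for x, y in zip(u, v))

def mat_mul(A, B):
    # product of two 3x3 matrices, stored as lists of rows
    return [[dot(A[i], [B[0][j], B[1][j], B[2][j]]) for j in range(3)]
            for i in range(3)]

A = [[3, 1, 2],
     [1, -1, 4],
     [1, 0, 3]]
r1, r2, r3 = A

# the columns of B are r2 x r3, r3 x r1, and r1 x r2 (Theorem 5.2.1)
cols = [cross(r2, r3), cross(r3, r1), cross(r1, r2)]
B = [[cols[j][i] for j in range(3)] for i in range(3)]

det_A = dot(r1, cross(r2, r3))  # scalar triple product
print(det_A)                    # -6
print(mat_mul(A, B))            # [[-6, 0, 0], [0, -6, 0], [0, 0, -6]]
print(mat_mul(B, A))            # [[-6, 0, 0], [0, -6, 0], [0, 0, -6]]
```

Both products come out as the diagonal matrix with det A = −6 on the diagonal, matching the example.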

Calculating inverses
From Theorem 5.3.10 we obtain that a 3 × 3 matrix A is invertible if and only if det A ≠ 0. Now Theorem 5.2.1 gives us a practical method for calculating A⁻¹.

Theorem 5.2.3. Let
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}.$$
If the matrix A is invertible, we have
$$A^{-1} = \frac{1}{\det A} \begin{bmatrix} \begin{bmatrix} a_{21} \\ a_{22} \\ a_{23} \end{bmatrix} \times \begin{bmatrix} a_{31} \\ a_{32} \\ a_{33} \end{bmatrix} & \begin{bmatrix} a_{31} \\ a_{32} \\ a_{33} \end{bmatrix} \times \begin{bmatrix} a_{11} \\ a_{12} \\ a_{13} \end{bmatrix} & \begin{bmatrix} a_{11} \\ a_{12} \\ a_{13} \end{bmatrix} \times \begin{bmatrix} a_{21} \\ a_{22} \\ a_{23} \end{bmatrix} \end{bmatrix}.$$

Proof. The theorem is a direct consequence of Theorem 5.2.1.

Definition 5.2.4. For an arbitrary 3 × 3 matrix
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
the matrix
$$\operatorname{adj} A = \begin{bmatrix} \det\begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix} & -\det\begin{bmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{bmatrix} & \det\begin{bmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{bmatrix} \\ -\det\begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix} & \det\begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix} & -\det\begin{bmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{bmatrix} \\ \det\begin{bmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} & -\det\begin{bmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{bmatrix} & \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \end{bmatrix}$$
is called the adjoint of the matrix A.



We note that this is the same matrix that was used in Theorem 5.2.1, where it was written as
$$\begin{bmatrix} \begin{bmatrix} a_{21} \\ a_{22} \\ a_{23} \end{bmatrix} \times \begin{bmatrix} a_{31} \\ a_{32} \\ a_{33} \end{bmatrix} & \begin{bmatrix} a_{31} \\ a_{32} \\ a_{33} \end{bmatrix} \times \begin{bmatrix} a_{11} \\ a_{12} \\ a_{13} \end{bmatrix} & \begin{bmatrix} a_{11} \\ a_{12} \\ a_{13} \end{bmatrix} \times \begin{bmatrix} a_{21} \\ a_{22} \\ a_{23} \end{bmatrix} \end{bmatrix}.$$
We are going to examine the construction of this matrix more carefully. To this end we first denote by $A_{ij}$ the 2 × 2 matrix obtained from the matrix A by deleting the i-th row and the j-th column, that is,
$$A_{11} = \begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix}, \quad A_{21} = \begin{bmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{bmatrix}, \quad A_{31} = \begin{bmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{bmatrix},$$
$$A_{12} = \begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix}, \quad A_{22} = \begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix}, \quad A_{32} = \begin{bmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{bmatrix},$$
$$A_{13} = \begin{bmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}, \quad A_{23} = \begin{bmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{bmatrix}, \quad A_{33} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}.$$

Then we consider the matrix
$$\begin{bmatrix} \det A_{11} & \det A_{12} & \det A_{13} \\ \det A_{21} & \det A_{22} & \det A_{23} \\ \det A_{31} & \det A_{32} & \det A_{33} \end{bmatrix}$$
and change the sign of every other entry of this matrix according to the following pattern
$$\begin{bmatrix} + & - & + \\ - & + & - \\ + & - & + \end{bmatrix}$$
to obtain
$$\begin{bmatrix} \det A_{11} & -\det A_{12} & \det A_{13} \\ -\det A_{21} & \det A_{22} & -\det A_{23} \\ \det A_{31} & -\det A_{32} & \det A_{33} \end{bmatrix}.$$
Finally, we transpose the above matrix and obtain the adjoint of the matrix A:
$$\operatorname{adj} A = \begin{bmatrix} \det A_{11} & -\det A_{21} & \det A_{31} \\ -\det A_{12} & \det A_{22} & -\det A_{32} \\ \det A_{13} & -\det A_{23} & \det A_{33} \end{bmatrix}.$$

In Theorem 5.2.1 we show that
$$AB = BA = \begin{bmatrix} \det A & 0 & 0 \\ 0 & \det A & 0 \\ 0 & 0 & \det A \end{bmatrix},$$
where B = adj A. Consequently, for any matrix A such that det A ≠ 0, we have

$$A^{-1} = \frac{1}{\det A} \operatorname{adj} A. \tag{5.10}$$

Example 5.2.5. We consider the matrix
$$A = \begin{bmatrix} 2 & 4 & 3 \\ 5 & 1 & -1 \\ 7 & 3 & -2 \end{bmatrix}.$$
First we find
$$\det A_{11} = \det\begin{bmatrix} 1 & -1 \\ 3 & -2 \end{bmatrix} = 1, \quad \det A_{12} = \det\begin{bmatrix} 5 & -1 \\ 7 & -2 \end{bmatrix} = -3, \quad \det A_{13} = \det\begin{bmatrix} 5 & 1 \\ 7 & 3 \end{bmatrix} = 8,$$
$$\det A_{21} = \det\begin{bmatrix} 4 & 3 \\ 3 & -2 \end{bmatrix} = -17, \quad \det A_{22} = \det\begin{bmatrix} 2 & 3 \\ 7 & -2 \end{bmatrix} = -25, \quad \det A_{23} = \det\begin{bmatrix} 2 & 4 \\ 7 & 3 \end{bmatrix} = -22,$$
$$\det A_{31} = \det\begin{bmatrix} 4 & 3 \\ 1 & -1 \end{bmatrix} = -7, \quad \det A_{32} = \det\begin{bmatrix} 2 & 3 \\ 5 & -1 \end{bmatrix} = -17, \quad \det A_{33} = \det\begin{bmatrix} 2 & 4 \\ 5 & 1 \end{bmatrix} = -18.$$

This means that
$$\begin{bmatrix} \det A_{11} & \det A_{12} & \det A_{13} \\ \det A_{21} & \det A_{22} & \det A_{23} \\ \det A_{31} & \det A_{32} & \det A_{33} \end{bmatrix} = \begin{bmatrix} 1 & -3 & 8 \\ -17 & -25 & -22 \\ -7 & -17 & -18 \end{bmatrix}.$$
Now we change the sign of every other entry of this matrix and obtain
$$\begin{bmatrix} \det A_{11} & -\det A_{12} & \det A_{13} \\ -\det A_{21} & \det A_{22} & -\det A_{23} \\ \det A_{31} & -\det A_{32} & \det A_{33} \end{bmatrix} = \begin{bmatrix} 1 & 3 & 8 \\ 17 & -25 & 22 \\ -7 & 17 & -18 \end{bmatrix}.$$

The adjoint of the matrix A is the transpose of the above matrix:
$$\operatorname{adj} A = \begin{bmatrix} 1 & 3 & 8 \\ 17 & -25 & 22 \\ -7 & 17 & -18 \end{bmatrix}^T = \begin{bmatrix} 1 & 17 & -7 \\ 3 & -25 & 17 \\ 8 & 22 & -18 \end{bmatrix}.$$
Since
$$\begin{bmatrix} 2 & 4 & 3 \\ 5 & 1 & -1 \\ 7 & 3 & -2 \end{bmatrix} \begin{bmatrix} 1 & 17 & -7 \\ 3 & -25 & 17 \\ 8 & 22 & -18 \end{bmatrix} = \begin{bmatrix} 1 & 17 & -7 \\ 3 & -25 & 17 \\ 8 & 22 & -18 \end{bmatrix} \begin{bmatrix} 2 & 4 & 3 \\ 5 & 1 & -1 \\ 7 & 3 & -2 \end{bmatrix} = \begin{bmatrix} 38 & 0 & 0 \\ 0 & 38 & 0 \\ 0 & 0 & 38 \end{bmatrix},$$

we have
$$\begin{bmatrix} 2 & 4 & 3 \\ 5 & 1 & -1 \\ 7 & 3 & -2 \end{bmatrix}^{-1} = \frac{1}{38} \begin{bmatrix} 1 & 17 & -7 \\ 3 & -25 & 17 \\ 8 & 22 & -18 \end{bmatrix}.$$
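The sign pattern and final transpose in this construction are easy to mechanize. The following sketch (the helper names `cofactor`, `adj`, and `det3` are our own, not from the text) reproduces the adjoint and determinant found in Example 5.2.5:

```python
def cofactor(A, i, j):
    # (-1)^(i+j) times the determinant of A with row i and column j deleted
    m = [[A[r][c] for c in range(3) if c != j] for r in range(3) if r != i]
    return (-1)**(i + j) * (m[0][0]*m[1][1] - m[0][1]*m[1][0])

def adj(A):
    # the adjoint: the TRANSPOSED matrix of cofactors, so entry (i, j)
    # of adj A is the cofactor of entry (j, i) of A
    return [[cofactor(A, j, i) for j in range(3)] for i in range(3)]

def det3(A):
    # cofactor expansion across the first row
    return sum(A[0][j] * cofactor(A, 0, j) for j in range(3))

A = [[2, 4, 3],
     [5, 1, -1],
     [7, 3, -2]]

print(adj(A))   # [[1, 17, -7], [3, -25, 17], [8, 22, -18]]
print(det3(A))  # 38
A_inv = [[x / det3(A) for x in row] for row in adj(A)]  # formula (5.10)
```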

Example 5.2.6. The adjoint of the matrix
$$\begin{bmatrix} 3 & -1 & 2 \\ 1 & -2 & 1 \\ 2 & 1 & 6 \end{bmatrix}$$
is
$$\begin{bmatrix} -13 & 8 & 3 \\ -4 & 14 & -1 \\ 5 & -5 & -5 \end{bmatrix}$$
and
$$\det\begin{bmatrix} 3 & -1 & 2 \\ 1 & -2 & 1 \\ 2 & 1 & 6 \end{bmatrix} = -25.$$
Consequently,
$$\begin{bmatrix} 3 & -1 & 2 \\ 1 & -2 & 1 \\ 2 & 1 & 6 \end{bmatrix}^{-1} = -\frac{1}{25} \begin{bmatrix} -13 & 8 & 3 \\ -4 & 14 & -1 \\ 5 & -5 & -5 \end{bmatrix}.$$

Calculating determinants of 3 × 3 matrices

Let
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}.$$
The identity
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} \det A_{11} & -\det A_{21} & \det A_{31} \\ -\det A_{12} & \det A_{22} & -\det A_{32} \\ \det A_{13} & -\det A_{23} & \det A_{33} \end{bmatrix} = \begin{bmatrix} \det A & 0 & 0 \\ 0 & \det A & 0 \\ 0 & 0 & \det A \end{bmatrix}$$

gives us a convenient method for evaluating the determinant. Since the entry in the upper left corner of the matrix
$$\begin{bmatrix} \det A & 0 & 0 \\ 0 & \det A & 0 \\ 0 & 0 & \det A \end{bmatrix}$$
is the matrix product of the first row of A and the first column of adj A, we obtain the following formula for the

determinant of A:

$$\det A = a_{11}\det A_{11} - a_{12}\det A_{12} + a_{13}\det A_{13} = a_{11}\det\begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix} - a_{12}\det\begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix} + a_{13}\det\begin{bmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}$$
(the expansion across the first row).

If we use the second row of A and the second column of adj A we obtain
$$\det A = -a_{21}\det A_{21} + a_{22}\det A_{22} - a_{23}\det A_{23} = -a_{21}\det\begin{bmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{bmatrix} + a_{22}\det\begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix} - a_{23}\det\begin{bmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{bmatrix}$$
(the expansion across the second row),

and if we use the third row of A and the third column of adj A we get
$$\det A = a_{31}\det A_{31} - a_{32}\det A_{32} + a_{33}\det A_{33} = a_{31}\det\begin{bmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{bmatrix} - a_{32}\det\begin{bmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{bmatrix} + a_{33}\det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$
(the expansion across the third row).

Similarly, if we use
$$\begin{bmatrix} \det A_{11} & -\det A_{21} & \det A_{31} \\ -\det A_{12} & \det A_{22} & -\det A_{32} \\ \det A_{13} & -\det A_{23} & \det A_{33} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} \det A & 0 & 0 \\ 0 & \det A & 0 \\ 0 & 0 & \det A \end{bmatrix},$$
we get
$$\det A = a_{11}\det A_{11} - a_{21}\det A_{21} + a_{31}\det A_{31} = a_{11}\det\begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix} - a_{21}\det\begin{bmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{bmatrix} + a_{31}\det\begin{bmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{bmatrix}$$
(the expansion down the first column),
$$\det A = -a_{12}\det A_{12} + a_{22}\det A_{22} - a_{32}\det A_{32} = -a_{12}\det\begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix} + a_{22}\det\begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix} - a_{32}\det\begin{bmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{bmatrix}$$
(the expansion down the second column),
and
$$\det A = a_{13}\det A_{13} - a_{23}\det A_{23} + a_{33}\det A_{33} = a_{13}\det\begin{bmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} - a_{23}\det\begin{bmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{bmatrix} + a_{33}\det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$
(the expansion down the third column).

We observe that in all these expansions we multiply $a_{ij}$ by $(-1)^{i+j}\det A_{ij}$. This means that when $i + j$ is an even number we have $(-1)^{i+j}a_{ij}\det A_{ij} = a_{ij}\det A_{ij}$ and that when $i + j$ is an odd number we have $(-1)^{i+j}a_{ij}\det A_{ij} = -a_{ij}\det A_{ij}$.

Example 5.2.7. For the matrix
$$A = \begin{bmatrix} 2 & 4 & 3 \\ 5 & 1 & -1 \\ 7 & 3 & -2 \end{bmatrix}$$
we have
$$\det A = a_{11}\det A_{11} - a_{12}\det A_{12} + a_{13}\det A_{13} = 2 \cdot 1 - 4 \cdot (-3) + 3 \cdot 8 = 38$$
(the expansion across the first row),
$$\det A = -a_{21}\det A_{21} + a_{22}\det A_{22} - a_{23}\det A_{23} = (-5) \cdot (-17) + 1 \cdot (-25) - (-1) \cdot (-22) = 38$$
(the expansion across the second row),
$$\det A = a_{31}\det A_{31} - a_{32}\det A_{32} + a_{33}\det A_{33} = 7 \cdot (-7) - 3 \cdot (-17) + (-2) \cdot (-18) = 38$$
(the expansion across the third row),
$$\det A = a_{11}\det A_{11} - a_{21}\det A_{21} + a_{31}\det A_{31} = 2 \cdot 1 - 5 \cdot (-17) + 7 \cdot (-7) = 38$$
(the expansion down the first column),
$$\det A = -a_{12}\det A_{12} + a_{22}\det A_{22} - a_{32}\det A_{32} = -4 \cdot (-3) + 1 \cdot (-25) - 3 \cdot (-17) = 38$$
(the expansion down the second column),
$$\det A = a_{13}\det A_{13} - a_{23}\det A_{23} + a_{33}\det A_{33} = 3 \cdot 8 - (-1) \cdot (-22) + (-2) \cdot (-18) = 38$$
(the expansion down the third column).
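The six hand computations of Example 5.2.7 can be generated by one pair of functions. In the sketch below (the helper names are our own, not from the text), `expand_row(A, i)` and `expand_col(A, j)` implement the expansion across row i and down column j:

```python
def det2(m):
    # determinant of a 2x2 matrix
    return m[0][0]*m[1][1] - m[0][1]*m[1][0]

def submatrix(A, i, j):
    # delete row i and column j of a 3x3 matrix
    return [[A[r][c] for c in range(3) if c != j]
            for r in range(3) if r != i]

def expand_row(A, i):
    # sum over j of (-1)^(i+j) a_ij det A_ij
    return sum((-1)**(i + j) * A[i][j] * det2(submatrix(A, i, j))
               for j in range(3))

def expand_col(A, j):
    # sum over i of (-1)^(i+j) a_ij det A_ij
    return sum((-1)**(i + j) * A[i][j] * det2(submatrix(A, i, j))
               for i in range(3))

A = [[2, 4, 3],
     [5, 1, -1],
     [7, 3, -2]]

print([expand_row(A, i) for i in range(3)])  # [38, 38, 38]
print([expand_col(A, j) for j in range(3)])  # [38, 38, 38]
```

All six choices agree, as Theorem 5.2.1 guarantees.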



When it is necessary to find the determinant of a 3 × 3 matrix, we can use any one of the six formulas. Since they all give the same result, it does not matter which one we choose. In practice, we look for the one where the calculations are simplest. In the above example,
$$\det A = -a_{21}\det A_{21} + a_{22}\det A_{22} - a_{23}\det A_{23} = (-5) \cdot (-17) + 1 \cdot (-25) - (-1) \cdot (-22) = 38$$
(the expansion across the second row) is probably the best choice, since we have 1 and −1 in the second row of A and multiplication by 1 or −1 is easy. The situation is even simpler if the matrix A has some zero entries.

Example 5.2.8. For the matrix
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 3 & 5 & -2 \end{bmatrix}$$
we find
$$\det A = \det A_{22} - 2\det A_{23} = \det\begin{bmatrix} 1 & 3 \\ 3 & -2 \end{bmatrix} - 2\det\begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} = -11 - 2 \cdot (-1) = -9.$$
For the matrix
$$A = \begin{bmatrix} 1 & 5 & 7 \\ 0 & 3 & -2 \\ 0 & 1 & 2 \end{bmatrix}$$
we have
$$\det A = \det A_{11} = \det\begin{bmatrix} 3 & -2 \\ 1 & 2 \end{bmatrix} = 8.$$

Cramer’s Rule

We close this section with the statement of Cramer’s Rule for systems of three equa-
tions with three variables. In the proof we use the method for calculating the inverse
matrix introduced in this section.

Theorem 5.2.9 (Cramer's Rule). Let
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
be an arbitrary 3 × 3 matrix. If det A ≠ 0, then the system of equations
$$\begin{cases} a_{11}x + a_{12}y + a_{13}z = b_1 \\ a_{21}x + a_{22}y + a_{23}z = b_2 \\ a_{31}x + a_{32}y + a_{33}z = b_3 \end{cases}$$
has a unique solution for any real numbers $b_1$, $b_2$, and $b_3$. The solution is
$$x = \frac{\det\begin{bmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{bmatrix}}{\det A}, \quad y = \frac{\det\begin{bmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{bmatrix}}{\det A}, \quad z = \frac{\det\begin{bmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ a_{31} & a_{32} & b_3 \end{bmatrix}}{\det A}.$$

Proof. If det A ≠ 0, then the matrix A is invertible and the system has a unique solution, by Theorem 2.3.22. The unique solution is
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}^{-1} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \frac{1}{\det A} \begin{bmatrix} \det A_{11} & -\det A_{21} & \det A_{31} \\ -\det A_{12} & \det A_{22} & -\det A_{32} \\ \det A_{13} & -\det A_{23} & \det A_{33} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \frac{1}{\det A} \begin{bmatrix} b_1\det A_{11} - b_2\det A_{21} + b_3\det A_{31} \\ -b_1\det A_{12} + b_2\det A_{22} - b_3\det A_{32} \\ b_1\det A_{13} - b_2\det A_{23} + b_3\det A_{33} \end{bmatrix}.$$

Since
$$b_1\det A_{11} - b_2\det A_{21} + b_3\det A_{31} = \det\begin{bmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{bmatrix},$$
$$-b_1\det A_{12} + b_2\det A_{22} - b_3\det A_{32} = \det\begin{bmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{bmatrix},$$
and
$$b_1\det A_{13} - b_2\det A_{23} + b_3\det A_{33} = \det\begin{bmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ a_{31} & a_{32} & b_3 \end{bmatrix},$$

we obtain
$$x = \frac{\det\begin{bmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{bmatrix}}{\det A}, \quad y = \frac{\det\begin{bmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{bmatrix}}{\det A}, \quad\text{and}\quad z = \frac{\det\begin{bmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ a_{31} & a_{32} & b_3 \end{bmatrix}}{\det A}.$$

5.2.1 Exercises

For a given matrix A find a matrix B such that
$$AB = \begin{bmatrix} \det A & 0 & 0 \\ 0 & \det A & 0 \\ 0 & 0 & \det A \end{bmatrix},$$
using Theorem 5.2.1.

1. $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 3 & 2 \\ 1 & 2 & 1 \end{bmatrix}$

2. $A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 2 \\ 1 & 1 & 2 \end{bmatrix}$

3. $A = \begin{bmatrix} 3 & 4 & 2 \\ 2 & 3 & 1 \\ 4 & 1 & 2 \end{bmatrix}$

4. $A = \begin{bmatrix} 3 & 1 & 2 \\ 2 & 1 & 1 \\ 5 & 2 & 3 \end{bmatrix}$

Find the inverse of the given matrix A using Theorem 5.2.3.

5. $A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 5 \end{bmatrix}$

6. $A = \begin{bmatrix} 4 & 2 & 1 \\ 1 & 1 & 1 \\ 2 & 1 & 3 \end{bmatrix}$

Find the adjoint of the given matrix A using Definition 5.2.4.

7. $A = \begin{bmatrix} 4 & 1 & 1 \\ 2 & 2 & 1 \\ 3 & 1 & 5 \end{bmatrix}$

8. $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 2 & 0 \end{bmatrix}$

Find the inverse of the given matrix A using (5.10).

9. $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$

10. $A = \begin{bmatrix} 1 & 3 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{bmatrix}$
 
11. Calculate the determinant of the matrix $A = \begin{bmatrix} 4 & 3 & 2 \\ 2 & 1 & 5 \\ 1 & 1 & 3 \end{bmatrix}$ using the expansion across the first row.

 
12. Calculate the determinant of the matrix $A = \begin{bmatrix} 4 & 3 & 2 \\ 2 & 1 & 5 \\ 1 & 1 & 3 \end{bmatrix}$ using the expansion across the third row.

13. Calculate the determinant of the matrix $A = \begin{bmatrix} 4 & 3 & 2 \\ 2 & 1 & 5 \\ 1 & 1 & 3 \end{bmatrix}$ using the expansion down the second column.

14. Calculate the determinant of the matrix $A = \begin{bmatrix} 4 & 1 & 1 \\ 1 & 0 & 1 \\ 2 & 3 & 1 \end{bmatrix}$ using the expansion down the first column.

15. Calculate the determinant of the matrix $A = \begin{bmatrix} 4 & 1 & 1 \\ 1 & 0 & 1 \\ 2 & 3 & 1 \end{bmatrix}$ using the expansion across the second row.

16. Calculate the determinant of the matrix $A = \begin{bmatrix} 4 & 1 & 1 \\ 1 & 0 & 1 \\ 2 & 3 & 1 \end{bmatrix}$ using the expansion across the first row.

17. Calculate the determinant of the matrix $A = \begin{bmatrix} 2 & 1 & 1 \\ 3 & 5 & 1 \\ 0 & 3 & 0 \end{bmatrix}$ using the expansion across the third row.

18. Calculate the determinant of the matrix $A = \begin{bmatrix} 2 & 1 & 1 \\ 3 & 5 & 1 \\ 0 & 3 & 0 \end{bmatrix}$ using the expansion across the second row.

19. Calculate the determinant of the matrix $A = \begin{bmatrix} 4 & 0 & 1 \\ 3 & 7 & 1 \\ 1 & 0 & 2 \end{bmatrix}$ using the expansion down the second column.

20. Calculate the determinant of the matrix $A = \begin{bmatrix} 4 & 0 & 1 \\ 3 & 7 & 1 \\ 1 & 0 & 2 \end{bmatrix}$ using the expansion down the first column.

Solve the given system of linear equations using Theorem 5.2.9.

21. $\begin{cases} 2x + y + 2z = 1 \\ x + 3y + z = 0 \\ 3x + 2y + 4z = 0 \end{cases}$

22. $\begin{cases} x + 4y + z = 0 \\ 2x + 3y + z = 1 \\ x + 2y + 3z = 1 \end{cases}$

 
23. $\begin{cases} x + y + 2z = 0 \\ x + 2y + z = 1 \\ 3x + 2y + 2z = 0 \end{cases}$

24. $\begin{cases} x + z = 0 \\ y + 2z = 0 \\ x + 2y + 3z = 5 \end{cases}$
 

5.3 Linear dependence of three vectors in R3


Linear dependence and independence of vectors was first considered in Section 3.1
in the context of vectors in R2 . In Section 4.1 we observed that the definition used
in R2 makes perfect sense in R3 . However, in R3 there are possibilities that did not
exist in R2 . In particular, in R3 it makes sense to talk about linear dependence and
independence of three vectors.
If the vector u is in Span{v, w}, then there are real numbers s and t such that u = sv + tw, which makes the vector u dependent on the vectors v and w. In the same way, if the vector v is in Span{u, w}, then the vector v is dependent on the vectors u and w, and if the vector w is in Span{u, v}, then the vector w is dependent on the vectors u and v. This suggests the following extension of the definition of linear dependence of vectors to three vectors.

Definition 5.3.1. Vectors u, v, and w in R3 are linearly dependent if at least


one of the following conditions holds:

(a) the vector u is in Span{v, w};

(b) the vector v is in Span{u, w};

(c) the vector w is in Span{u, v}.

The definition says that the vectors u, v, and w in R3 are linearly dependent if
there are real numbers a and b such that u = av + bw or there are real numbers c
and d such that v = cu + d w or there are real numbers e and f such that w = eu + f v.

Example 5.3.2. Since
$$\begin{bmatrix} -4 \\ 2 \\ 3 \end{bmatrix} = 2\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} - 3\begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix},$$
the vector $\begin{bmatrix} -4 \\ 2 \\ 3 \end{bmatrix}$ is in $\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix} \right\}$ and consequently the vectors $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}$, and $\begin{bmatrix} -4 \\ 2 \\ 3 \end{bmatrix}$ are linearly dependent.

The following theorem is a version of Theorem 4.1.8 for three vectors. Note the

similarities and the differences between these two theorems.

Theorem 5.3.3. Let u, v, and w be vectors in R3. The following conditions are equivalent:

(a) The vectors u, v, and w are linearly dependent;

(b) The equation
$$xu + yv + zw = 0$$
has a nontrivial solution, that is, a solution different from the trivial solution x = y = z = 0;

(c) $\det\begin{bmatrix} u & v & w \end{bmatrix} = 0$.

Proof. First we prove that conditions (a) and (b) are equivalent. If the vectors u, v, and w are linearly dependent, then one of the vectors is in the span of the remaining two. If u is in Span{v, w}, then u = sv + tw for some real numbers s and t. This means that
$$-u + sv + tw = 0,$$
so the equation xu + yv + zw = 0 has a nontrivial solution, namely x = −1, y = s, and z = t. The cases when v is in Span{u, w} or w is in Span{u, v} are proved in the same way.
Now suppose that xu + yv + zw = 0 for some x, y, and z, not all equal to 0. If x ≠ 0, then
$$u = -\frac{y}{x}v - \frac{z}{x}w,$$
which means that u is in Span{v, w} and thus the vectors u, v, and w are linearly dependent. If y ≠ 0 or z ≠ 0, then we modify the argument in the obvious way.
Now we prove that conditions (a) and (c) are equivalent.
If u is in Span{v, w}, then u = sv + tw for some real numbers s and t and consequently
$$\det\begin{bmatrix} u & v & w \end{bmatrix} = u \bullet (v \times w) = (sv + tw) \bullet (v \times w) = s(v \bullet (v \times w)) + t(w \bullet (v \times w)) = 0.$$
If v is in Span{u, w}, then v = su + tw for some real numbers s and t and consequently
$$\det\begin{bmatrix} u & v & w \end{bmatrix} = u \bullet (v \times w) = u \bullet ((su + tw) \times w) = s(u \bullet (u \times w)) + t(u \bullet (w \times w)) = 0.$$
If w is in Span{u, v}, then w = su + tv for some real numbers s and t and consequently
$$\det\begin{bmatrix} u & v & w \end{bmatrix} = u \bullet (v \times w) = u \bullet (v \times (su + tv)) = s(u \bullet (v \times u)) + t(u \bullet (v \times v)) = 0.$$
Now suppose that $\det\begin{bmatrix} u & v & w \end{bmatrix} = u \bullet (v \times w) = 0$. If v and w are linearly independent, then u = sv + tw, according to Theorem 5.1.7. But this means that u is in Span{v, w}. If v and w are linearly dependent, then the vector v is in Span{w} and consequently in Span{u, w}, or the vector w is in Span{v} and consequently in Span{u, v}. In any case, the vectors u, v, and w are linearly dependent.

     
Example 5.3.4. Show that the vectors $\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}$, and $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ are linearly dependent.

Solution. Since
$$\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \times \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} = \begin{bmatrix} \det\begin{bmatrix} 0 & -1 \\ 1 & 3 \end{bmatrix} \\ -\det\begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix} \\ \det\begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix},$$
we have
$$\det\begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & -1 \\ 0 & 1 & 3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \bullet \left( \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \times \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \bullet \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} = 0$$
and thus the vectors $\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}$, and $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ are linearly dependent.
We can also show dependence of these vectors by observing that
$$3\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} - \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
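Condition (c) of Theorem 5.3.3 gives a one-line dependence test: three vectors are linearly dependent exactly when the scalar triple product u • (v × w) vanishes. A small sketch (the helper names are our own; exact integer arithmetic is assumed, and with floating-point entries one would compare against a small tolerance instead of 0):

```python
def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

def dot(u, v):
    return sum(x*y for x, y in zip(u, v))

def dependent(u, v, w):
    # Theorem 5.3.3(c): det [u v w] = u . (v x w) vanishes exactly
    # when u, v, and w are linearly dependent
    return dot(u, cross(v, w)) == 0

print(dependent([1, 0, 1], [2, -1, 3], [1, 1, 0]))  # True  (Example 5.3.4)
print(dependent([1, 0, 1], [2, -1, 3], [1, 1, 1]))  # False (Example 5.3.11)
```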

The next theorem offers a somewhat different perspective on linear dependence


of three vectors.

Theorem 5.3.5. Vectors u, v, and w in R3 are linearly dependent if and only if one of the following conditions holds:

(a) u = 0;

(b) u ≠ 0 and the equation
$$xu = v$$
has a solution;

(c) the vectors u and v are linearly independent and the equation
$$xu + yv = w$$
has a solution.

Proof. If one of the conditions (a), (b), or (c), holds, then it is clear that the vectors
u, v, and w are linearly dependent.
Now assume that the vectors u, v and w are linearly dependent.
If the vectors u and v are linearly dependent, then we have (a) or (b), by Theorem
5.3.5. If the vectors u and v are linearly independent, then w is in Span{u, v} or u is
in Span{v, w} or v is in Span{u, w}.
If w is in Span{u, v}, we have nothing to prove.
If u is in Span{v, w}, then there are real numbers b and c such that u = bv + cw. We must have c ≠ 0, because the vectors u and v are linearly independent. This means that we have $w = \frac{1}{c}u - \frac{b}{c}v$.
The case when v is in Span{u, w} is similar.

Sometimes it is beneficial to think of linear dependence of vectors u, v, and w in terms of properties of the matrix $\begin{bmatrix} u & v & w \end{bmatrix}$. We saw the first indication of that in part (c) of Theorem 5.3.3. The next theorem connects linear dependence of vectors u, v, and w with the shape of the reduced row echelon form of the matrix $\begin{bmatrix} u & v & w \end{bmatrix}$. The theorem is a direct consequence of Theorem 5.3.5.

Theorem 5.3.6. Vectors u, v, and w in R3 are linearly dependent if and only if one of the following conditions holds:

(a) The first column of the reduced row echelon form of the matrix $\begin{bmatrix} u & v & w \end{bmatrix}$ is $\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$;

(b) The first two columns of the reduced row echelon form of the matrix $\begin{bmatrix} u & v & w \end{bmatrix}$ are $\begin{bmatrix} 1 & x \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$;

(c) The reduced row echelon form of the matrix $\begin{bmatrix} u & v & w \end{bmatrix}$ is $\begin{bmatrix} 1 & 0 & x \\ 0 & 1 & y \\ 0 & 0 & 0 \end{bmatrix}$.

     
Example 5.3.7. Show that the vectors $\begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}$, and $\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}$ are linearly dependent and write $\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}$ as a linear combination of the vectors $\begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}$.

   
Solution. We note that the vectors $\begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}$ are linearly independent. We need to show that the equation
$$x\begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}$$
has a solution. Since
$$\begin{bmatrix} 3 & 2 & 1 \\ 1 & 2 & 3 \\ 1 & 1 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix},$$
the solution is x = −1 and y = 2.

     
Example 5.3.8. Consider the vectors $\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, and $\begin{bmatrix} 1 \\ 2a \\ a+2 \end{bmatrix}$. Find a number a such that these vectors are linearly dependent and then write the vector $\begin{bmatrix} 1 \\ 2a \\ a+2 \end{bmatrix}$ as a linear combination of the vectors $\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.

Solution. Since the vectors $\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ are linearly independent, we need to show that the equation
$$x\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2a \\ a+2 \end{bmatrix}$$
has a solution. We find that
$$\begin{bmatrix} 2 & 1 & 1 \\ -1 & 1 & 2a \\ 1 & 1 & a+2 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & -a-1 \\ 0 & 1 & 2a+3 \\ 0 & 0 & -a-4 \end{bmatrix}.$$
Consequently, the equation has a solution if and only if −a − 4 = 0, that is, a = −4. If a = −4, then
$$\begin{bmatrix} 2 & 1 & 1 \\ -1 & 1 & 2a \\ 1 & 1 & a+2 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & -5 \\ 0 & 0 & 0 \end{bmatrix},$$
which means that the solution is x = 3 and y = −5.
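Row reductions like the ones above can be automated. The following sketch (our own `rref` helper, using exact rational arithmetic from the standard library) reproduces the reduced row echelon form found in Example 5.3.8 for a = −4:

```python
from fractions import Fraction

def rref(M):
    # reduced row echelon form by Gauss-Jordan elimination,
    # in exact rational arithmetic
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    lead = 0
    for r in range(rows):
        if lead >= cols:
            break
        i = r
        while M[i][lead] == 0:          # find a row with a pivot
            i += 1
            if i == rows:
                i, lead = r, lead + 1
                if lead == cols:
                    return M
        M[i], M[r] = M[r], M[i]         # swap it into place
        M[r] = [x / M[r][lead] for x in M[r]]  # scale the pivot to 1
        for i in range(rows):           # clear the rest of the column
            if i != r:
                M[i] = [a - M[i][lead]*b for a, b in zip(M[i], M[r])]
        lead += 1
    return M

# the augmented matrix of Example 5.3.8 with a = -4
M = [[2, 1, 1],
     [-1, 1, -8],
     [1, 1, -2]]
print([[int(x) for x in row] for row in rref(M)])
# [[1, 0, 3], [0, 1, -5], [0, 0, 0]]
```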

Definition 5.3.9. If vectors u, v, and w are not linearly dependent, then we


say that they are linearly independent.

In other words, vectors u, v, and w are linearly independent, if the vector u is not
in Span{v, w}, the vector v is not in Span{u, w}, and the vector w is not in Span{u, v}.
The following theorem is a direct consequence of Theorems 5.3.3 and 2.3.17.

Theorem 5.3.10. Let u, v, and w be vectors from R3. The following conditions are equivalent:

(a) u, v, and w are linearly independent;

(b) The only solution of the equation
$$xu + yv + zw = 0$$
is the trivial solution x = y = z = 0;

(c) $\det\begin{bmatrix} u & v & w \end{bmatrix} \ne 0$;

(d) The matrix $\begin{bmatrix} u & v & w \end{bmatrix}$ is invertible;

(e) The reduced row echelon form of the matrix $\begin{bmatrix} u & v & w \end{bmatrix}$ is $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

Note that if three vectors in R3 are linearly independent, then any two of them are linearly independent. However, the converse is not true. In Example 5.3.4 we show that the vectors
$$\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}, \quad\text{and}\quad \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$$
are linearly dependent, but any two of them are linearly independent.

Example 5.3.11. Show that the vectors
$$\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}, \quad\text{and}\quad \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$
are linearly independent.



Solution. Since
$$\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \times \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \bullet \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} = -1 \ne 0,$$
we have
$$\det\begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 1 & 3 & 1 \end{bmatrix} = -1$$
and the vectors $\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}$, and $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ are linearly independent.
We can also show linear independence using
$$\begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 1 & 3 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$$
which means that the equation
$$x\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + y\begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} + z\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
has only the trivial solution, that is, x = y = z = 0.

Bases in R3
In Chapter 3 we introduced the notion of a basis in R2 as a pair of linearly inde-
pendent vectors {a, b} in R2 such that for any c in R2 we have c = xa + yb for some
numbers x and y. Now we are going to define a basis in R3 . As expected, this time
three linearly independent vectors are needed, but otherwise the definition is the
same.

Definition 5.3.12. A set of three vectors {a, b, c} in R3 is called a basis in R3 if


the vectors satisfy the following two conditions:

(i) a, b and c are linearly independent;

(ii) For every vector d in R3 there exist real numbers x, y, and z such that
d = xa + yb + zc.

An expression of the form xa + yb + zc is called a linear combination of vectors


a, b, and c. According to the above definition, linearly independent vectors a, b, and

c form a basis in R3 , if every vector in R3 can be written as a linear combination of


vectors a, b, and c.

Definition 5.3.13. Let a, b, and c be vectors in R3 . The set of all possible


linear combinations of the form xa + yb + zc is denoted by Span{a, b, c} and
is called the vector subspace spanned by the vectors a, b, and c. In symbols,

$$\mathrm{Span}\{a, b, c\} = \left\{ xa + yb + zc : x, y, z \text{ in } \mathbb{R} \right\}.$$

Note that instead of condition (ii) in Definition 5.3.12 we could simply say that Span{a, b, c} = R3.

Theorem 5.3.14. Let a, b, c, u, v, w be vectors in R3 . The following two condi-


tions are equivalent:

(a) Span{a, b, c} = Span{u, v, w};

(b) a, b, c are elements of Span{u, v, w} and u, v, w are elements of


Span{a, b, c}.

Proof. The proof is similar to the proof of Theorem 4.1.15 and we leave it as an exer-
cise.

As in the case of bases in R2 , the representation of any vector from R3 as a linear


combination of vectors from a basis is unique.

Theorem 5.3.15. Let {a, b, c} be a basis in R3 and let d be an arbitrary vector


in R3 . The real numbers x, y, and z such that

d = xa + yb + zc

are uniquely determined by the vector d.

Proof. If
$$d = xa + yb + zc$$
and
$$d = x'a + y'b + z'c,$$
then
$$0 = (x' - x)a + (y' - y)b + (z' - z)c,$$
which implies that x′ − x = y′ − y = z′ − z = 0, because the vectors a, b, and c are linearly independent.

Definition 5.3.16. Let {a, b, c} be a basis in R3 and let d be an arbitrary vector


in R3 . The unique real numbers x, y, and z such that

d = xa + yb + zc

are called the coordinates of d in the basis {a, b, c}.
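Finding the coordinates of d in a basis {a, b, c} amounts to solving the 3 × 3 linear system with coefficient matrix [a b c], for instance with Cramer's Rule from Section 5.2. In this sketch the helper names and the vector d are our own choices; the basis is the one from Example 5.3.11:

```python
def det3(A):
    # cofactor expansion across the first row
    return (A[0][0]*(A[1][1]*A[2][2] - A[1][2]*A[2][1])
          - A[0][1]*(A[1][0]*A[2][2] - A[1][2]*A[2][0])
          + A[0][2]*(A[1][0]*A[2][1] - A[1][1]*A[2][0]))

def coordinates(a, b, c, d):
    # coordinates of d in the basis {a, b, c}: solve [a b c] (x, y, z)^T = d
    A = [[a[i], b[i], c[i]] for i in range(3)]  # basis vectors as columns
    dA = det3(A)
    if dA == 0:
        raise ValueError("a, b, c are linearly dependent, hence not a basis")
    return [det3([[d[i] if j == k else A[i][j] for j in range(3)]
                  for i in range(3)]) / dA
            for k in range(3)]

# the basis of Example 5.3.11 and a hypothetical vector d
a, b, c = [1, 0, 1], [2, -1, 3], [1, 1, 1]
d = [4, 0, 5]
x, y, z = coordinates(a, b, c, d)
print(x, y, z)  # 1.0 1.0 1.0, since d = a + b + c
```

By Theorem 5.3.15 these coordinates are the only ones possible.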

In the definition of a basis of R3 we are assuming that the vectors are linearly
independent and that every vector in R3 is a linear combination of vectors from the
basis. It turns out that it is sufficient to assume only one of the conditions. Actually,
the conditions are equivalent, which is a consequence of the following important
theorem.

Theorem 5.3.17. Let c1 , c2 , c3 be linearly independent vectors in R3 and let


b1 , b2 , b3 be arbitrary vectors in R3 . If the vectors c1 , c2 , c3 are elements of the
vector subspace Span{b1 , b2 , b3 }, then

Span{c1 , c2 , c3 } = Span{b1 , b2 , b3 } = R3

and the vectors b1 , b2 , b3 are linearly independent.

Proof. Since the vectors $c_1$, $c_2$, $c_3$ are elements of Span{b₁, b₂, b₃}, for j = 1, 2, 3 there are real numbers $a_{1j}$, $a_{2j}$, $a_{3j}$ such that
$$c_j = a_{1j}b_1 + a_{2j}b_2 + a_{3j}b_3.$$
This can be written as a matrix product equation
$$\begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix} \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}, \tag{5.11}$$
where $a_j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ a_{3j} \end{bmatrix}$. Consequently, if $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ is an arbitrary vector, then
$$\begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix} \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.$$
If $\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$, then we have $\begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ and consequently $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$, because the vectors $c_1$, $c_2$, $c_3$ are linearly independent. This implies

that the 3 × 3 matrix $\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}$ is invertible, by Theorem 2.3.17. Moreover, from (5.11) we obtain
$$\begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix} \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}^{-1} = \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix},$$
which means that the vectors $b_1$, $b_2$, $b_3$ are elements of Span{c₁, c₂, c₃} and, consequently, we have
$$\mathrm{Span}\{c_1, c_2, c_3\} = \mathrm{Span}\{b_1, b_2, b_3\}$$
by Theorem 5.3.14.
The equation
$$\begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
is equivalent to the equation
$$\begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix} \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}^{-1} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
Because the vectors $c_1$, $c_2$, $c_3$ are linearly independent, we get
$$\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}^{-1} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
Now we multiply both sides by $\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}$ and obtain
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}^{-1} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
This means that the vectors $b_1$, $b_2$, $b_3$ are linearly independent.


Since all linearly independent vectors $c_1$, $c_2$, $c_3$ are in
$$\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\} = \mathbb{R}^3,$$
we conclude that
$$\mathrm{Span}\{c_1, c_2, c_3\} = \mathrm{Span}\{b_1, b_2, b_3\} = \mathbb{R}^3.$$

The property described in Theorem 5.3.17 is not a special property of R3 . A gen-


eral version of the theorem will be presented in the book Core Topics in Linear Alge-
bra.

Theorem 5.3.18. For arbitrary vectors a, b, c in R3 , the following two condi-


tions are equivalent:

(a) a, b, c are linearly independent;

(b) Span{a, b, c} = R3 .

Proof. If a, b, c are linearly independent vectors in R3, then we must have
$$\mathrm{Span}\{a, b, c\} = \mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\} = \mathbb{R}^3,$$
by Theorem 5.3.17.
If Span{a, b, c} = R3, then
$$\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\} = \mathrm{Span}\{a, b, c\}$$
and the vectors a, b, and c must be linearly independent, by Theorem 5.3.17.

Corollary 5.3.19. Any set of three linearly independent vectors in R3 is a basis


of R3 .

Corollary 5.3.20. Let $v_1$, $v_2$, and $v_3$ be three nonzero orthogonal vectors in R3, that is,
$$v_1 \bullet v_2 = v_1 \bullet v_3 = v_2 \bullet v_3 = 0.$$
Then {v₁, v₂, v₃} is a basis of R3.

Proof. It is enough to show that the vectors $v_1$, $v_2$, and $v_3$ are linearly independent. If
$$x_1v_1 + x_2v_2 + x_3v_3 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},$$
then
$$x_1v_1 \bullet v_1 + x_2v_2 \bullet v_1 + x_3v_3 \bullet v_1 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \bullet v_1.$$

Since $v_1 \bullet v_1 = \|v_1\|^2$, $v_2 \bullet v_1 = 0$, and $v_3 \bullet v_1 = 0$, the above can be written as $x_1\|v_1\|^2 = 0$. This gives us $x_1 = 0$, because $\|v_1\| \ne 0$. In the same way we can show that $x_2 = 0$ and $x_3 = 0$.

A basis {v1 , v2 , v3 } of R3 such that the vectors v1 , v2 and v3 are orthogonal is called
an orthogonal basis.

Corollary 5.3.21. For arbitrary vectors a, b, c, and d in R3 the equation

x1 a + x2 b + x3 c + x4 d = 0

has a nontrivial solution, that is, a solution such that at least one of the num-
bers x 1 , x 2 , x 3 , or x 4 is different from 0.

Proof. If the vectors a, b, and c are linearly dependent, then the result is a direct
consequence of the definition of linear dependence. If the vectors a, b, c are linearly
independent this is a consequence of Corollary 5.3.19.

The definition of linear dependence of three vectors in R3 (Definition 5.3.1) can


be extended to more than three vectors. For example, we can say that vectors a,
b, c, and d are linearly dependent, if one of these vectors is in the vector subspace
spanned by the remaining three vectors, that is, we have a is in Span{b, c, d} or b is
in Span{a, c, d} or c is in Span{a, b, d} or d is in Span{a, b, c}. Note that, using linear
dependence of four vectors, Corollary 5.3.21 can be equivalently stated as follows.

Corollary 5.3.22. Any four vectors in R3 are linearly dependent.

The following simple theorem is often useful in proofs.

Theorem 5.3.23. Let A and B be 3×3 matrices and let {v1 , v2 , v3 } be a basis in
R3 . If
Av1 = B v1 , Av2 = B v2 , and Av3 = B v3 ,
then
A = B.

Proof. An arbitrary vector in R3 is of the form xv1 + yv2 + zv3 . Since

A(xv1 + yv2 + zv3 ) = x Av1 + y Av2 + z Av3 = xB v1 + yB v2 + zB v3 = B (xv1 + yv2 + zv3 ),

the result is a consequence of Theorem 2.1.18.



276 Chapter 5: Determinants and bases in R3

Example 5.3.24. Let {v1 , v2 , v3 } be an orthogonal basis of R3 , that is, such that
v1 • v2 = v1 • v3 = v2 • v3 = 0. Show that
 
(1/‖v1‖²) v1v1ᵀ + (1/‖v2‖²) v2v2ᵀ + (1/‖v3‖²) v3v3ᵀ = [1 0 0; 0 1 0; 0 0 1].

Solution. Since {v1 , v2 , v3 } is an orthogonal basis of R3 , it is easy to verify that


((1/‖v1‖²) v1v1ᵀ + (1/‖v2‖²) v2v2ᵀ + (1/‖v3‖²) v3v3ᵀ) vk = vk for k = 1, 2, 3.
Now, because we also have
[1 0 0; 0 1 0; 0 0 1] vk = vk for k = 1, 2, 3,

the desired equality is a consequence of Theorem 5.3.23.
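The identity in Example 5.3.24 is easy to check numerically (a sketch assuming NumPy; the orthogonal basis below is an ad-hoc choice):

```python
import numpy as np

# Any orthogonal basis of R^3 (ad-hoc choice).
v1 = np.array([1.0, 2.0, 2.0])
v2 = np.array([2.0, 1.0, -2.0])
v3 = np.cross(v1, v2)

# Sum of the projection matrices v_k v_k^T / ||v_k||^2.
S = sum(np.outer(v, v) / (v @ v) for v in (v1, v2, v3))
print(np.allclose(S, np.eye(3)))  # True: the sum is the 3x3 identity
```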

Characterization of the vector subspaces of R3

Theorem 5.3.25. Let u, v, and w be arbitrary vectors in R3 and let V be


Span{u} or Span{u, v} or Span{u, v, w}. Then

(i) If a is a vector in V and c a real number, then the vector ca is in V , and

(ii) If a and b are vectors in V , then the vector a + b is in V .

Proof. We give the proof in the case when V is the vector subspace Span{u, v}. The
other cases are similar.
If a is a vector in V , then
a = xu + yv,

for some real numbers x and y. For any real number c we have

ca = c(xu + yv) = (c x)u + (c y)v,



which shows that the vector ca is in V .


If a and b are vectors in V , then
a = x1 u + y1 v and b = x2 u + y2 v,
for some real numbers x1, x2, y1, and y2. Since
a + b = x1 u + y1 v + x2 u + y2 v = (x1 + x2)u + (y1 + y2)v,
the vector a + b is in V .

It turns out that any subset of R3 that satisfies conditions (i) and (ii) in Theorem
5.3.25 must be one of the special sets listed in Theorem 5.3.25.

Theorem 5.3.26. Let V be a subset of R3 such that

(i) If a is a vector in V and c a real number, then the vector ca is in V , and

(ii) If a and b are vectors in V , then the vector a + b is in V .


 
 0 
Then V is the set 0 or a vector line in R3 or a vector plane in R3 or all of
0
 

R3 .

Proof. If V contains a vector u ≠ 0, then V must contain Span{u}. If V = Span{u},


then we are done.
If V ≠ Span{u}, then V contains a vector v which is not in Span{u} and the vectors
u and v are linearly independent, by part (b) in Theorem 4.1.5. The subset V must
contain Span{u, v}. Now there are two possibilities: either V = Span{u, v} or V ≠
Span{u, v}.
If V = Span{u, v}, then we are done.
If V ≠ Span{u, v}, then V contains a vector w which is not in Span{u, v} and
the vectors u, v, and w are linearly independent, by part (c) in Theorem 5.3.5.
The subset V must contain Span{u, v, w}. Consequently, by Theorem 5.3.18,
V = Span{u, v, w} = R3 .

From Theorems 5.3.25 and 5.3.26 we get the following corollary that character-
izes all vector subspaces of R3 .

Corollary 5.3.27. A subset V of R3 is a vector subspace spanned by one, two, or
three vectors if and only if V satisfies the following two conditions:

(i) If a is a vector in V and c a real number, then the vector ca is in V , and

(ii) If a and b are vectors in V , then the vector a + b is in V .



5.3.1 Exercises

Determine if the vectors a, b, c are linearly dependent or linearly independent.


           
1. a = (3, 3, 1), b = (1, 3, 3), c = (3, 1, 3)

2. a = (2, 2, 1), b = (1, 2, 2), c = (1, 1, 3)

3. a = (1, 3, 2), b = (3, 2, −1), c = (1, 2, 1)

4. a = (1, 2, 3), b = (4, 3, 2), c = (1, 1, 1)

5. a = (1, 1, 1), b = (1, 1, 0), c = (1, 0, 0)

6. a = (4, 1, 3), b = (1, 1, 1), c = (8, 5, 7)

7. a = (2, 1, 9), b = (1, 3, 7), c = (1, 4, 8)

8. a = (2, 3, 1), b = (3, 1, 1), c = (1, 4, 5)

9. Suppose that the vectors b and c are linearly independent and the vectors a,
b, c are linearly dependent. Find the reduced row echelon forms of the matrix
[a b c].

10. Suppose that the vectors a and c are linearly independent and the vectors a,
b, c are linearly dependent. Find the reduced row echelon forms of the matrix
[a b c].

Show that the given vectors a, b, c are in the same vector plane and determine the
equation of this plane. Are the vectors linearly dependent?
           
11. a = (1, 3, 2), b = (3, 2, −1), c = (1, 2, 1)

12. a = (1, 1, 2), b = (3, 1, 1), c = (7, 5, 9)

13. a = (2, 1, 9), b = (1, 3, 7), c = (1, 4, 8)

14. a = (1, 2, 2), b = (2, 1, 1), c = (1, 1, 1)
   
15. Let a = (1, 2, 3) and b = (1, 1, 1). Determine all vectors c such that the vectors a, b, c
are linearly dependent.

16. Let a = (3, 2, 3) and b = (5, 1, 4). Determine all vectors c such that the vectors a, b, c
are linearly dependent.

     
17. Find a number a such that the vectors (1, 1, 0), (1, 0, 1), and (a, 1, 1) are linearly
dependent and then with this value of a write the vector (a, 1, 1) as a linear
combination of the vectors (1, 1, 0) and (1, 0, 1).

18. Find a number a such that the vectors (5, 4, 3), (2, 3, 4), and (a, 1, a) are linearly
dependent and then with this value of a write the vector (a, 1, a) as a linear
combination of the vectors (5, 4, 3) and (2, 3, 4).

19. Find a number a such that the vectors (1, −1, 1), (2, 1, 1), and (a, a, 1) are linearly
dependent and then with this value of a write the vector (a, a, 1) as a linear
combination of the vectors (1, −1, 1) and (2, 1, 1).

20. Find a number a such that the vectors (1, 2, 1), (3, 5, 2), and (5, 9, a) are linearly
dependent and then with this value of a write the vector (5, 9, a) as a linear
combination of the vectors (1, 2, 1) and (3, 5, 2).

Find a number a such that the given vectors are linearly independent.
           
21. (2, 2, 1), (1, 0, 1), (a, 2, 1)

22. (3, 1, 1), (2, 1, 2), (1, a, 1)

           
23. (1, 2, 1), (2, 1, 1), (a, 2a, 1)

24. (2, 2, 1), (1, 3, 1), (5, 7, a)

For the given vectors a, b, c, and x show that {a, b, c} is a basis of R3 and find the
coordinates of x in the basis {a, b, c}.
       
25. a = (1, 1, 1), b = (3, 1, 0), c = (2, 0, 0), and x = (1, 2, 3)

26. a = (1, 0, 1), b = (1, 1, 0), c = (0, 1, 1), and x = (1, 1, 1)

27. a = (2, 1, 1), b = (1, 2, 1), c = (1, 1, 2), and x = (1, 0, 0)

28. a = (2, 1, 0), b = (1, 2, 0), c = (1, 3, 2), and x = (1, 1, 1)
 
29. Find a number a such that the columns of the matrix [1 1 2; 1 5−a 4; 2 3 7−a] are
linearly dependent.

30. Find a number a such that the columns of the matrix [4−a 1 2; 3 2−a 2; 2 3 1] are
linearly dependent.

31. Find a number a such that the system



 2x + 3y + z = 0
3x + y + 2z = 0
2x + a y + 3z = 0

has a nontrivial solution and then solve the system.

32. Find a number a such that the system



 3x + y + z = 0
2x + 3y + 2z = 0
x + 2y + az = 0

has a nontrivial solution and then solve the system.

33. Let a, b, c, u, v be vectors in R3 . Show that the following two conditions are
equivalent:

(a) Span{a, b, c} = Span{u, v};


(b) a, b, c are elements of Span{u, v} and u, v are elements of Span{a, b, c}.

34. Let a, b, c, u, v, w be vectors in R3 . Show that the following two conditions are
equivalent:

(a) Span{a, b, c} = Span{u, v, w};


(b) a, b, c are elements of Span{u, v, w} and u, v, w are elements of Span{a, b, c}.

35. Show that det [a b a × b] = ‖a × b‖² for all vectors a and b in R3.

36. Show that ‖a × b‖² = ‖a‖²‖b‖² − (a • b)² for all vectors a and b in R3.

37. Show that the set {a, b, a × b} is a basis in R3 for any two linearly independent
vectors a and b in R3 .
   
38. Find the coordinates of the vector (1, 2, 1) in the basis {a, b, a × b} if a = (1, 3, 0) and
b = (1, 0, 2).

39. Let a and b be two linearly independent vectors in R3 and let c be an arbitrary
vector in R3 . If
c = r a + sb + t (a × b),

for some numbers r , s, and t , show that r a + sb is the projection of c on the


vector plane Span{a, b} and t (a × b) is the projection of c on the vector line
Span{a × b}.

40. Let a and b be two linearly independent vectors in R3 and let c be an arbitrary
vector in R3 . Show that the projection p of the point c on the vector plane
Span{a, b} is
p = c − ((c • (a × b)) / ‖a × b‖²) (a × b).    (5.12)

41. Let a and b be two linearly independent vectors in R3 and let c be an arbitrary
vector in R3 . Show that the distance from c to the vector plane Span{a, b} is
|det [a b c]| / ‖a × b‖.

42. Show that the distance from a point a in R3 to a vector line Span{b} in R3 is

‖a × b‖ / ‖b‖.

Figure 5.3: The projection of a on the vector line Span{b} in R3.

43. Let a and b be nonzero vectors in R3. Using the formula for the area of a triangle

A = (1/2) · (the length of the base) · (the height),

show that the area of the triangle 0ab is (1/2)‖a × b‖.

Figure 5.4: The tetrahedron defined by vectors a, b, and c.

44. Let a, b, and c be linearly independent vectors in R3. Using the formula for the
volume of a tetrahedron

V = (1/3) · (the area of the base) · (the height),

show that the volume of the tetrahedron 0abc is (1/6) |det [a b c]|.

45. Suppose that the set {a, b, c} is a basis of R3 and that u is a nonzero vector in
R3. Show that one of the following is true:

(a) {a, b, u} is a basis in R3 ;



(b) {b, c, u} is a basis in R3 ;


(c) {a, c, u} is a basis in R3 .

46. Let {a, b, c} be a basis of R3 . If u and v are linearly independent vectors in R3 ,


show that one of the following is true:

(a) {a, u, v} is a basis in R3 ;


(b) {b, u, v} is a basis in R3 ;
(c) {c, u, v} is a basis in R3 .

5.4 The dimension of a vector subspace of R3


The dimension of a vector subspace is one of the fundamental ideas of linear alge-
bra. We motivate the definition with the following three simple theorems.

Theorem 5.4.1. Let V be a vector line.

(a) If a vector u ≠ 0 is in V , then {u} is a basis of V .

(b) If vectors u and v are in V , then they are linearly dependent.

Proof. Part (a) is a consequence of Theorem 4.1.6.


If vectors u and v are in the vector line V = Span{w}, then u = sw and v = t w. If
t = 0, then v = 0 and, if t ≠ 0, then t u − sv = 0. In both cases the vectors u and v are
linearly dependent. This proves part (b).

Theorem 5.4.2. Let V be a vector plane.

(a) If vectors u and v are in V and are linearly independent, then {u, v} is a
basis of V .

(b) If vectors u, v, and w are in V , then they are linearly dependent.

(c) If u is an arbitrary vector in R3, then V ≠ Span{u}.

Proof. Part (a) is a consequence of Theorem 4.1.22.


Let u, v, and w be vectors in V . If the vectors u and v are linearly dependent,
we are done. If the vectors u, v are linearly independent then V = Span{u, v}, by
Theorem 4.1.22, and w is in Span{u, v}. But then the vectors u, v and w are linearly
dependent, completing the proof of part (b).
Now suppose that V = Span{v, w} and V = Span{u}. Then the vectors v and w are
linearly dependent, by Theorem 5.4.1, which is not true. This proves part (c).

Theorem 5.4.3.

(a) If vectors u, v, and w are three linearly independent vectors in R3 , then


{u, v, w} is a basis of R3 ;

(b) Any four vectors in R3 are linearly dependent;

(c) If u and v are arbitrary vectors in R3, then Span{u, v} ≠ R3.

Proof. Part (a) is a consequence of Theorem 5.3.18.


Part (b) is a consequence of Corollary 5.3.21.
To prove part (c) we observe that R3 = Span{u, v} would imply that the vectors
(1, 0, 0), (0, 1, 0), (0, 0, 1) are linearly dependent, by Theorem 5.4.2 (b).

Note that the above three theorems show that the minimum number of nonzero
vectors that span a vector subspace V of R3 is the same as the maximum number
of linearly independent vectors in V and is the same as the number of vectors in an
arbitrary basis of V .

Definition 5.4.4. Let V be a nontrivial vector subspace of R3, that is, a vector
subspace V ≠ {(0, 0, 0)}. By the dimension of V , denoted by dim V , we mean
the minimum number of nonzero vectors that span V . Consequently,

(a) A vector line V has dimension 1 and we write dim V = 1;

(b) A vector plane V has dimension 2 and we write dim V = 2;

(c) The vector space R3 has dimension 3 and we write dim R3 = 3.

The trivial vector subspace Span{(0, 0, 0)} = {(0, 0, 0)} has dimension 0, that is,
dim{(0, 0, 0)} = 0.
 

     
 1 2 1 
Example 5.4.5. Determine the dimension of Span 1 , 1 , 2 .
3 5 4
 

   
1 2 1 1 0 3
Solution. Since the reduced row echelon form of the matrix 1 1 2 is 0 1 −1,
3 5 4 0 0 0
     
1  1 2 
the vector 2 is in Span 1 , 1 and we have
4 3 5
 

           
 1 2   1 2 1 
Span 1 , 1 = Span 1 , 1 , 2 .
3 5 3 5 4
   

Consequently,
     

 1 2 1 
dim Span 1 , 1 , 2  = 2.
3 5 4
 

The Rank Theorem for 3 × 3 matrices

 
Theorem 5.4.6. Let [a1 b1 c1; a2 b2 c2; a3 b3 c3] be an arbitrary 3 × 3 matrix. Then

dim Span{(a1, a2, a3), (b1, b2, b3), (c1, c2, c3)} = dim Span{(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)}.

Proof. Since

det [a1 b1 c1; a2 b2 c2; a3 b3 c3] = det [a1 a2 a3; b1 b2 b3; c1 c2 c3],

by Theorem 5.1.15, we have

dim Span{(a1, a2, a3), (b1, b2, b3), (c1, c2, c3)} = 3

if and only if

dim Span{(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)} = 3.

Now suppose that

dim Span{(a1, a2, a3), (b1, b2, b3), (c1, c2, c3)} = 2.

This means that the vectors (a1, a2, a3), (b1, b2, b3), and (c1, c2, c3) are linearly dependent
and two vectors from {(a1, a2, a3), (b1, b2, b3), (c1, c2, c3)} are linearly independent.
Suppose that the vectors (a1, a2, a3) and (b1, b2, b3) are linearly independent. Then
det [a1 b1; a2 b2] ≠ 0 or det [a1 b1; a3 b3] ≠ 0 or det [a2 b2; a3 b3] ≠ 0, by Theorem 4.1.12.
The proof is similar in all three cases. If we suppose that det [a1 b1; a2 b2] ≠ 0, then the
vectors (a1, b1, c1) and (a2, b2, c2) are linearly independent, because
det [a1 b1; a2 b2] = det [a1 a2; b1 b2].

By Theorem 5.1.15 we have

det [a1 a2 a3; b1 b2 b3; c1 c2 c3] = det [a1 b1 c1; a2 b2 c2; a3 b3 c3] = 0

and thus the vectors (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) are linearly dependent, by
Theorem 5.3.3, and we have

(a3, b3, c3) = α(a1, b1, c1) + β(a2, b2, c2),

for some real numbers α and β. Consequently

Span{(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)} = Span{(a1, b1, c1), (a2, b2, c2)},

which means that {(a1, b1, c1), (a2, b2, c2)} is a basis of
Span{(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)}, because the vectors (a1, b1, c1) and (a2, b2, c2)
are linearly independent. This gives us

dim Span{(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)} = 2.

The cases when the vectors (a1, a2, a3) and (c1, c2, c3) are linearly independent or when
the vectors (b1, b2, b3) and (c1, c2, c3) are linearly independent are treated in a similar way.

Finally we consider the case when

dim Span{(a1, a2, a3), (b1, b2, b3), (c1, c2, c3)} = 1.

Then at least one vector from {(a1, a2, a3), (b1, b2, b3), (c1, c2, c3)} is different from
(0, 0, 0). If (a1, a2, a3) ≠ (0, 0, 0), then we must have a1 ≠ 0 or a2 ≠ 0 or a3 ≠ 0. If a1 ≠ 0,
then we also have (a1, b1, c1) ≠ (0, 0, 0) and, because {(a1, a2, a3)} is a basis of
Span{(a1, a2, a3), (b1, b2, b3), (c1, c2, c3)}, there are real numbers s and t such that
s(a1, a2, a3) = (b1, b2, b3) and t(a1, a2, a3) = (c1, c2, c3). This gives us
(a2/a1)(a1, b1, c1) = (a2, b2, c2) and (a3/a1)(a1, b1, c1) = (a3, b3, c3) and, consequently,

Span{(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)} = Span{(a1, b1, c1)},

which means that

dim Span{(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)} = 1,

because (a1, b1, c1) ≠ (0, 0, 0). The proof when a2 ≠ 0 or a3 ≠ 0 is similar.

The cases when (b1, b2, b3) ≠ (0, 0, 0) and (c1, c2, c3) ≠ (0, 0, 0) are treated in a similar
way.

Note that from the above proof it follows that if
dim Span{(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)} = 2, then
dim Span{(a1, a2, a3), (b1, b2, b3), (c1, c2, c3)} = 2, and if
dim Span{(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)} = 1, then
dim Span{(a1, a2, a3), (b1, b2, b3), (c1, c2, c3)} = 1.

Since in the case when [a1 b1 c1; a2 b2 c2; a3 b3 c3] is the zero matrix there is nothing to
prove, the proof is now complete.

 
Definition 5.4.7. Let A be the 3 × 3 matrix [a1 b1 c1; a2 b2 c2; a3 b3 c3]. The vector
subspace Span{(a1, a2, a3), (b1, b2, b3), (c1, c2, c3)} is called the column space of A
and the vector subspace Span{(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)} is called the row
space of A. The dimension of these vector subspaces is called the rank of the
matrix A.


Example 5.4.8. Verify the Rank Theorem for the matrix [5 2 3; 1 1 2; 3 0 −1].

Solution. The result is a consequence of the fact that the reduced row echelon
form of the matrix [5 2 3; 1 1 2; 3 0 −1] is [1 0 −1/3; 0 1 7/3; 0 0 0] and the reduced
row echelon form of the matrix [5 1 3; 2 1 0; 3 2 −1] is [1 0 1; 0 1 −2; 0 0 0].
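The Rank Theorem can also be checked numerically (a sketch assuming NumPy):

```python
import numpy as np

A = np.array([[5, 2, 3],
              [1, 1, 2],
              [3, 0, -1]])
# Column rank equals row rank (the rank of the transpose).
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))  # True (both are 2)
```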

5.4.1 Exercises

Determine the dimension of the given subspace of R3 .


           
 2 1   3 1 2 
1. Span 5 , 2 6. Span 1 , 3 , 2
1 3 1 1 1
   

          
 1 1   10 4 2 
2. Span 1 , 1 7. Span  5 , 2 , 1
0 1 5 2 1
   

         
 2 −4   3 1 −4 
3. Span −1 ,  2 8. Span 9 , 3 , −12
4 3 1

−8
 
−4

          
 4 5   2 1 2 
4. Span −4 , −5 9. Span 1 , 2 , 2
8 10 2 2 1
   

             
 3 2 1   1 4 1 
5. Span 1 , 3 , 2 10. Span 1 , 1 , 0
7 7 4 3 2 5
   

Verify the Rank Theorem for the given matrix.


   
11. [1 7 4; 2 2 3; −1 5 1]

12. [1 2 3; 1 1 4; 2 5 5]

13. [1 1 0; 1 1 1; 1 3 1]

14. [3 2 1; 4 −3 5; 2 2 0]

Chapter 6

Singular value decomposition of


3 × 2 matrices

In Chapter 3 we discussed orthogonal diagonalization and spectral decomposition


of 2 × 2 matrices. We proved that a 2 × 2 matrix can be orthogonally diagonalized
and has a spectral decomposition if and only if it is symmetric. In this chapter we
obtain similar results for 3 × 2 matrices. The method described here generalizes to
matrices of any dimension and has numerous practical applications, including data
compression, noise reduction, and data analysis.
We start by observing that for an arbitrary 3 × 2 matrix A, the 2 × 2 matrix A T A is
symmetric, so it is easier to work with and it gives us useful information about the
original matrix A.

Theorem 6.1. Let A = [a1 b1; a2 b2; a3 b3] and let λ1 and λ2 be the eigenvalues of
the symmetric matrix AᵀA, not necessarily different. Let v1 and v2 be orthonormal
eigenvectors of the matrix AᵀA corresponding to the eigenvalues λ1 and λ2,
respectively. Then

(a) λ1 ≥ 0 and λ2 ≥ 0,

(b) ‖Av1‖ = √λ1 and ‖Av2‖ = √λ2,

(c) Av1 • Av2 = 0,

(d) Span{Av1, Av2} = Span{(a1, a2, a3), (b1, b2, b3)}.

Proof. Since

‖Av1‖² = (Av1) • (Av1) = (Av1)ᵀ(Av1) = v1ᵀAᵀ(Av1) = v1ᵀ(AᵀAv1) = v1ᵀ(λ1v1) = λ1v1ᵀv1 = λ1,

we obtain λ1 ≥ 0 and ‖Av1‖ = √λ1. In the same way we get λ2 ≥ 0 and ‖Av2‖ = √λ2.

Since

Av1 • Av2 = v1ᵀ(AᵀAv2) = λ2 v1 • v2 = 0,
we obtain (c).
Now let x1 and x2 be arbitrary real numbers. Since the set {v1, v2} is a basis in R2,
we have

(x1, x2) = sv1 + t v2,

for some real numbers s and t. Then

x1(a1, a2, a3) + x2(b1, b2, b3) = A(x1, x2) = A(sv1 + t v2) = s Av1 + t Av2,

which implies

Span{Av1, Av2} = Span{(a1, a2, a3), (b1, b2, b3)},

because it is obvious that Av1 and Av2 are in Span{(a1, a2, a3), (b1, b2, b3)}.

Definition 6.2. Let A = [a1 b1; a2 b2; a3 b3] and let λ1 and λ2 be the eigenvalues
of the matrix AᵀA. The numbers σ1 = √λ1 and σ2 = √λ2 are called the singular
values of the matrix A.

It is customary to label the eigenvalues λ1 and λ2 of the matrix A T A so that we


have λ1 ≥ λ2 and, consequently, σ1 ≥ σ2 .

 
Example 6.3. Find the singular values of the matrix A = [4 2; 1 −7; 4 2].

Solution. Since

AᵀA = [33 9; 9 57],

the eigenvalues of the matrix AᵀA are the roots of the equation

det [33−λ 9; 9 57−λ] = λ² − 90λ + 1800 = 0.

Consequently, the eigenvalues of the matrix AᵀA are 60 and 30 and the singular
values of the matrix A are √60 and √30.
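These values can be checked numerically (a sketch assuming NumPy; `np.linalg.svd` computes singular values directly):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, -7.0],
              [4.0, 2.0]])
# Eigenvalues of A^T A ...
eigvals = np.linalg.eigvalsh(A.T @ A)        # ascending order: approx. [30, 60]
# ... and the singular values of A are their square roots.
sigma = np.linalg.svd(A, compute_uv=False)   # descending: sqrt(60), sqrt(30)
print(np.allclose(np.sort(sigma**2), np.sort(eigvals)))  # True
```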

Theorem 6.4. Let A = [a1 b1; a2 b2; a3 b3]. Let v1 and v2 be orthonormal
eigenvectors of the matrix AᵀA and let σ1 and σ2 be the singular values of the
matrix A. The following conditions are equivalent:

(a) The vectors (a1, a2, a3) and (b1, b2, b3) are linearly independent;

(b) The vectors Av1 and Av2 are linearly independent;

(c) {Av1, Av2} is a basis for Span{(a1, a2, a3), (b1, b2, b3)};

(d) σ1 > 0 and σ2 > 0.

Proof. Equivalence of (a), (b), and (c) follows immediately from part (d) of Theorem
6.1 and Theorem 4.1.22.
If the vectors Av1 and Av2 are linearly independent, then Av1 ≠ 0 and Av2 ≠ 0
and consequently σ1 = ‖Av1‖ > 0 and σ2 = ‖Av2‖ > 0. On the other hand, if ‖Av1‖ =
σ1 > 0 and ‖Av2‖ = σ2 > 0, then the vectors Av1 and Av2 are linearly independent,
because they are orthogonal. Indeed, if

x Av1 + y Av2 = (0, 0, 0),

then

x Av1 • Av1 + y Av2 • Av1 = (0, 0, 0) • Av1.

Since Av2 • Av1 = 0, the above can be written as

x‖Av1‖² = 0.

This gives us x = 0, because ‖Av1‖ ≠ 0. In the same way we can show that y = 0.

   
 4 2 
Example 6.5. Find an orthogonal basis for Span 1 , −7 .
4 2
 

 
4 2 · ¸ · ¸
1 −3
Solution. Let A = 1 −7. First we find that the vectors and are a basis
3 1
4 2
   
· ¸ 10 · ¸ 10
T 1 −3
of eigenvectors of the matrix A A. Since A = −20 and A = 10, the
3 1
10 10
       
 1 1   4 2 
set −2 , 1 is an orthogonal basis for Span 1 , −7 .
1 1 4 2
   

Now we consider the case when the columns of A are linearly dependent.

Theorem 6.6. Let A = [a1 b1; a2 b2; a3 b3] be a nonzero matrix. Let v1 and v2 be
orthonormal eigenvectors of the matrix AᵀA and let σ1 and σ2 be the singular
values of the matrix A such that σ1 ≥ σ2. The following conditions are equivalent:

(a) The vectors (a1, a2, a3) and (b1, b2, b3) are linearly dependent;

(b) The vectors Av1 and Av2 are linearly dependent;

(c) {Av1} is a basis for Span{(a1, a2, a3), (b1, b2, b3)};

(d) σ1 > 0 and σ2 = 0.

Proof. Equivalence of (a) and (b) follows immediately from the equivalence of (a)
and (b) in Theorem 6.4.

Since the vectors Av1 and Av2 are orthogonal, they are linearly dependent if and
only if one of them is the zero vector. If one of the vectors is the zero vector, then it
must be Av2 because we have

‖Av1‖ = σ1 ≥ σ2 = ‖Av2‖.

This shows that the vectors Av1 and Av2 are linearly dependent if and only if Av2 = 0.
If Av2 = 0, then

Span{Av1} = Span{Av1, Av2} = Span{(a1, a2, a3), (b1, b2, b3)},

so {Av1} is a basis for Span{(a1, a2, a3), (b1, b2, b3)}.
Now, if {Av1} is a basis for Span{(a1, a2, a3), (b1, b2, b3)}, then we must have

σ1 = ‖Av1‖ > 0,

because the matrix A is a nonzero matrix, and

σ2 = ‖Av2‖ = 0,

because the vectors Av1 and Av2 are orthogonal and

Span{Av1} = Span{(a1, a2, a3), (b1, b2, b3)} = Span{Av1, Av2}.

Finally, if σ2 = 0, then ‖Av2‖ = σ2 = 0 and thus Av2 = 0, so the vectors Av1 and
Av2 are linearly dependent.

In the next lemma we prove a simple identity for orthonormal vectors in R2 that
will be used in the proof of Theorem 6.8.

Lemma 6.7. If the vectors v1 and v2 in R2 are orthonormal, then

[1 0; 0 1] = v1v1ᵀ + v2v2ᵀ.

Proof. If v1 = (a, b), then v2 = (−b, a) or v2 = (b, −a).

If v2 = (−b, a), then we have

v1v1ᵀ + v2v2ᵀ = [a² ab; ab b²] + [b² −ab; −ab a²] = [a²+b² 0; 0 a²+b²] = [1 0; 0 1],

because a² + b² = ‖v1‖² = 1.
In the case v2 = (b, −a) the above argument requires only some obvious
modifications.
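A quick numerical check of the lemma (a sketch assuming NumPy; the angle is an arbitrary choice producing an orthonormal pair):

```python
import numpy as np

# Any orthonormal pair in R^2, e.g. the columns of a rotation matrix.
t = 0.7
v1 = np.array([np.cos(t), np.sin(t)])
v2 = np.array([-np.sin(t), np.cos(t)])
S = np.outer(v1, v1) + np.outer(v2, v2)
print(np.allclose(S, np.eye(2)))  # True: the sum is the 2x2 identity
```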

Now we prove a result similar to the spectral decomposition for symmetric ma-
trices.


Theorem 6.8. Let A = [a1 b1; a2 b2; a3 b3] be a nonzero matrix.

(a) If the vectors (a1, a2, a3) and (b1, b2, b3) are linearly independent, then there are
two orthonormal vectors u1 and u2 in R3 and two orthonormal vectors
v1 and v2 in R2 such that

A = σ1u1v1ᵀ + σ2u2v2ᵀ,

where σ1 and σ2 are the singular values of the matrix A.

(b) If the vectors (a1, a2, a3) and (b1, b2, b3) are linearly dependent, then there are a
unit vector u1 in R3 and a unit vector v1 in R2 such that

A = σ1u1v1ᵀ,

where σ1 is the singular value of the matrix A such that σ1 > 0.

Proof. Let v1 and v2 be orthonormal eigenvectors of the matrix AᵀA. Since

[1 0; 0 1] = v1v1ᵀ + v2v2ᵀ,

we have

A = A [1 0; 0 1] = Av1v1ᵀ + Av2v2ᵀ. (6.1)

If the vectors (a1, a2, a3) and (b1, b2, b3) are linearly independent, then Av1 ≠ 0 and
Av2 ≠ 0, by Theorem 6.4, and (6.1) can be written as

A = ‖Av1‖ (Av1/‖Av1‖) v1ᵀ + ‖Av2‖ (Av2/‖Av2‖) v2ᵀ = σ1u1v1ᵀ + σ2u2v2ᵀ,

where σ1 = ‖Av1‖, u1 = Av1/‖Av1‖, σ2 = ‖Av2‖, and u2 = Av2/‖Av2‖.
If the vectors (a1, a2, a3) and (b1, b2, b3) are linearly dependent, then Av1 ≠ 0 and
Av2 = 0, by Theorem 6.6, and (6.1) becomes

A = Av1v1ᵀ = ‖Av1‖ (Av1/‖Av1‖) v1ᵀ = σ1u1v1ᵀ,

where σ1 = ‖Av1‖ and u1 = Av1/‖Av1‖.

Note that A = σ1 u1 vT1 can be viewed as a special case of A = σ1 u1 vT1 + σ2 u2 vT2


with σ2 = 0.

The representation
A = σ1 u1 vT1 + σ2 u2 vT2
is called the outer product expansion for the matrix A.

In the proof of Theorem 6.8 we assume from the beginning that the eigenvectors
v1 and v2 are unit vectors. When we find eigenvectors of a specific matrix, they are
usually not unit vectors. The “recipe” for finding the outer product expansion of a
3 × 2 matrix given below takes this fact into account.

Step 1 Calculate the matrix A T A.

Step 2 Find the eigenvalues λ1 ≥ λ2 and eigenvectors V1 , V2 of A T A.

Step 3 Write

[1 0; 0 1] = (1/‖V1‖²) V1V1ᵀ + (1/‖V2‖²) V2V2ᵀ.

Step 4 Multiply both sides of the equality by the matrix A to get

A = A [1 0; 0 1] = (1/‖V1‖²) (AV1)V1ᵀ + (1/‖V2‖²) (AV2)V2ᵀ.

Step 5 If the columns of A are linearly independent, then the outer product
expansion of the matrix A is

A = σ1 (AV1/(σ1‖V1‖)) (V1/‖V1‖)ᵀ + σ2 (AV2/(σ2‖V2‖)) (V2/‖V2‖)ᵀ

or

A = σ1u1v1ᵀ + σ2u2v2ᵀ,

where

v1 = V1/‖V1‖, v2 = V2/‖V2‖, u1 = AV1/(σ1‖V1‖), u2 = AV2/(σ2‖V2‖).

To get this expansion we used the fact that the singular values are

σ1 = √λ1 = ‖AV1‖/‖V1‖ = ‖A(V1/‖V1‖)‖ and σ2 = √λ2 = ‖AV2‖/‖V2‖ = ‖A(V2/‖V2‖)‖.

If the columns of A are linearly dependent, then the outer product expansion
of the matrix A is

A = σ1 (AV1/(σ1‖V1‖)) (V1/‖V1‖)ᵀ = σ1u1v1ᵀ,

where

v1 = V1/‖V1‖, u1 = AV1/(σ1‖V1‖).

To get this expansion we used the fact that

σ1 = √λ1 = ‖AV1‖/‖V1‖ = ‖A(V1/‖V1‖)‖.
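The five steps can be sketched numerically (a sketch assuming NumPy; `eigh` already returns unit eigenvectors, so the normalizations by ‖Vk‖ are built in, and the helper name is our own):

```python
import numpy as np

def outer_product_expansion(A, tol=1e-12):
    """Return [(sigma, u, v), ...] with A equal to the sum of sigma * outer(u, v).

    A sketch of the recipe for a 3x2 matrix A: the eigenvalues and
    eigenvectors of A^T A give the singular values and the v's; u = Av/sigma.
    """
    lam, V = np.linalg.eigh(A.T @ A)          # Steps 1-2 (ascending eigenvalues)
    terms = []
    for k in reversed(range(V.shape[1])):     # largest eigenvalue first
        sigma = np.sqrt(max(lam[k], 0.0))
        if sigma > tol:                       # skip sigma = 0 terms (Step 5)
            v = V[:, k]
            u = A @ v / sigma                 # Steps 3-4
            terms.append((sigma, u, v))
    return terms

A = np.array([[1.0, 4.0], [1.0, 1.0], [1.0, -1.0]])
terms = outer_product_expansion(A)
B = sum(s * np.outer(u, v) for s, u, v in terms)
print(np.allclose(A, B))  # True: the expansion reconstructs A
```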

 
Example 6.9. Find the outer product expansion for the matrix A = [1 4; 1 1; 1 −1].

Solution.
Step 1 We calculate the matrix AᵀA:

AᵀA = [1 1 1; 4 1 −1] [1 4; 1 1; 1 −1] = [3 4; 4 18].

Step 2 The eigenvalues of the matrix AᵀA are λ1 = 19 and λ2 = 2; V1 = (1, 4) is an
eigenvector corresponding to the eigenvalue 19, and V2 = (4, −1) an eigenvector
corresponding to the eigenvalue 2.

Step 3

[1 0; 0 1] = (1/17) V1V1ᵀ + (1/17) V2V2ᵀ.

Step 4

A = A [1 0; 0 1] = (1/17)(AV1)V1ᵀ + (1/17)(AV2)V2ᵀ,

where AV1 = (17, 5, −3) and AV2 = (0, 3, 5).

Step 5

A = √19 · (1/(√19 √17))(17, 5, −3) · ((1/√17)(1, 4))ᵀ + √2 · (1/(√2 √17))(0, 3, 5) · ((1/√17)(4, −1))ᵀ

or

A = σ1u1v1ᵀ + σ2u2v2ᵀ,

where

u1 = (1/√(19 · 17))(17, 5, −3), u2 = (1/√(2 · 17))(0, 3, 5), v1 = (1/√17)(1, 4), v2 = (1/√17)(4, −1),

and the singular values are σ1 = √19 and σ2 = √2.
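A numerical check that the expansion found in Example 6.9 reconstructs A (a sketch assuming NumPy):

```python
import numpy as np

s17, s19, s2 = np.sqrt(17), np.sqrt(19), np.sqrt(2)
u1 = np.array([17.0, 5.0, -3.0]) / (s19 * s17)
u2 = np.array([0.0, 3.0, 5.0]) / (s2 * s17)
v1 = np.array([1.0, 4.0]) / s17
v2 = np.array([4.0, -1.0]) / s17
# Sum of the two rank-one terms sigma_k * u_k v_k^T.
A = s19 * np.outer(u1, v1) + s2 * np.outer(u2, v2)
print(np.allclose(A, [[1, 4], [1, 1], [1, -1]]))  # True
```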


Example 6.10. Find the outer product expansion of the matrix $A = \begin{bmatrix} 1 & 2 \\ -1 & -2 \\ 1 & 2 \end{bmatrix}$.

Solution.
Step 1
\[
A^T A = \begin{bmatrix} 3 & 6 \\ 6 & 12 \end{bmatrix}.
\]
Step 2 The eigenvalues of the matrix $A^T A$ are $\lambda_1 = 15$ and $\lambda_2 = 0$; $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is an eigenvector corresponding to the eigenvalue 15, and $\begin{bmatrix} 2 \\ -1 \end{bmatrix}$ an eigenvector corresponding to the eigenvalue 0.
Step 3
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \frac{1}{5} \begin{bmatrix} 1 \\ 2 \end{bmatrix} \begin{bmatrix} 1 & 2 \end{bmatrix} + \frac{1}{5} \begin{bmatrix} 2 \\ -1 \end{bmatrix} \begin{bmatrix} 2 & -1 \end{bmatrix}.
\]
Step 4
\[
A = \begin{bmatrix} 1 & 2 \\ -1 & -2 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \frac{1}{5} \begin{bmatrix} 1 & 2 \\ -1 & -2 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} \begin{bmatrix} 1 & 2 \end{bmatrix} + \frac{1}{5} \begin{bmatrix} 1 & 2 \\ -1 & -2 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix} \begin{bmatrix} 2 & -1 \end{bmatrix}
\]
\[
= \frac{1}{5} \begin{bmatrix} 5 \\ -5 \\ 5 \end{bmatrix} \begin{bmatrix} 1 & 2 \end{bmatrix} + \frac{1}{5} \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 2 & -1 \end{bmatrix}
= \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \end{bmatrix}.
\]
Step 5
\[
A = \sqrt{15} \left( \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \right) \left( \frac{1}{\sqrt{5}} \begin{bmatrix} 1 & 2 \end{bmatrix} \right) = \sigma_1 u_1 v_1^T,
\]
where
\[
u_1 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \quad v_1 = \frac{1}{\sqrt{5}} \begin{bmatrix} 1 \\ 2 \end{bmatrix},
\]
and $\sigma_1 = \sqrt{15}$.

Definition 6.11. Let $A$ be a $3 \times 2$ matrix. By a singular value decomposition of $A$ we mean the representation of $A$ in the form
\[
A = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix},
\]
where $\{u_1, u_2, u_3\}$ is an orthonormal basis in $\mathbb{R}^3$, $\{v_1, v_2\}$ is an orthonormal basis in $\mathbb{R}^2$, and $\sigma_1$ and $\sigma_2$ are the singular values of the matrix $A$.

The singular value decomposition should be thought of as a version of diagonalization of $2 \times 2$ matrices for $3 \times 2$ matrices. The following theorem implies that every $3 \times 2$ matrix has a singular value decomposition.

Theorem 6.12. Let
\[
A = \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix}
\]
be a nonzero matrix.

(a) If the vectors $\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$ and $\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$ are linearly independent, then there is an orthonormal basis $\{u_1, u_2, u_3\}$ in $\mathbb{R}^3$ and an orthonormal basis $\{v_1, v_2\}$ in $\mathbb{R}^2$ such that
\[
A = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix},
\]
where $\sigma_1$ and $\sigma_2$ are the singular values of the matrix $A$.

(b) If the vectors $\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$ and $\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$ are linearly dependent, then there is an orthonormal basis $\{u_1, u_2, u_3\}$ in $\mathbb{R}^3$ and an orthonormal basis $\{v_1, v_2\}$ in $\mathbb{R}^2$ such that
\[
A = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix},
\]
where $\sigma_1$ is the singular value of the matrix $A$ such that $\sigma_1 > 0$.
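For comparison, NumPy's `np.linalg.svd` computes a decomposition of exactly this shape for a $3 \times 2$ matrix (a $3 \times 3$ orthogonal $U$, the singular values, and a $2 \times 2$ $V^T$). The check below is our own, not the book's:

```python
import numpy as np

A = np.array([[1., 4.], [1., 1.], [1., -1.]])
U, s, Vt = np.linalg.svd(A)      # s holds sigma1 >= sigma2 >= 0
Sigma = np.zeros((3, 2))         # the 3x2 middle factor of Theorem 6.12
Sigma[0, 0], Sigma[1, 1] = s
# the squared singular values are the eigenvalues of A^T A (here 19 and 2)
```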

   
Proof. If the vectors $\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$ and $\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$ are linearly independent, then there are two orthonormal vectors $u_1$, $u_2$ in $\mathbb{R}^3$ and two orthonormal vectors $v_1$ and $v_2$ in $\mathbb{R}^2$ such that
\[
A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T, \tag{6.2}
\]
where $\sigma_1$ and $\sigma_2$ are the singular values of the matrix $A$, by Theorem 6.8. The equality (6.2) can be written as
\[
A = \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix},
\]
because for every vector $x$ of $\mathbb{R}^2$ we have
\[
Ax = (\sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T) x
= \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} \sigma_1 v_1^T x \\ \sigma_2 v_2^T x \end{bmatrix}
= \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} \begin{bmatrix} v_1^T x \\ v_2^T x \end{bmatrix}
= \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix} x.
\]
Note that, if $u_3$ is an arbitrary vector in $\mathbb{R}^3$, then we have
\[
\begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \\ 0 & 0 \end{bmatrix}.
\]
Since we want $\{u_1, u_2, u_3\}$ to be an orthonormal basis in $\mathbb{R}^3$, we choose
\[
u_3 = \frac{1}{\|u_1 \times u_2\|} (u_1 \times u_2).
\]
We note that $u_1 \times u_2 \neq 0$, because the vectors $u_1$ and $u_2$ are linearly independent. The desired representation of the matrix $A$ is
\[
A = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix}.
\]
   
Now we assume that the vectors $\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$ and $\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$ are linearly dependent. By Theorem 6.8, there are a unit vector $u_1$ in $\mathbb{R}^3$ and a unit vector $v_1$ in $\mathbb{R}^2$ such that
\[
A = \sigma_1 u_1 v_1^T, \tag{6.3}
\]
where $\sigma_1$ is the singular value of the matrix $A$ such that $\sigma_1 > 0$. Note that, if $u_2$ and $u_3$ are arbitrary vectors in $\mathbb{R}^3$, then we have
\[
\sigma_1 u_1 = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} \begin{bmatrix} \sigma_1 \\ 0 \\ 0 \end{bmatrix}.
\]
We want $\{u_1, u_2, u_3\}$ to be an orthonormal basis in $\mathbb{R}^3$, so we choose $u_2$ and $u_3$ such that $\{u_2, u_3\}$ is an orthonormal basis of the vector plane $u_1 \bullet x = 0$. Since
\[
\begin{bmatrix} \sigma_1 \\ 0 \\ 0 \end{bmatrix} v_1^T = \begin{bmatrix} \sigma_1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix},
\]
we obtain the desired representation of the matrix $A$:
\[
A = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix}.
\]

Example 6.13. Find the singular value decomposition of the matrix
\[
A = \begin{bmatrix} 1 & 3 \\ \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix}.
\]

Solution. The outer product expansion of $A$ is
\[
A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T,
\]
where
\[
v_1 = \frac{1}{\sqrt{10}} \begin{bmatrix} 1 \\ 3 \end{bmatrix}, \quad
v_2 = \frac{1}{\sqrt{10}} \begin{bmatrix} 3 \\ -1 \end{bmatrix}, \quad
u_1 = \frac{1}{2\sqrt{30}} \begin{bmatrix} 10 \\ \sqrt{2} \\ 3\sqrt{2} \end{bmatrix}, \quad
u_2 = \frac{1}{2\sqrt{5}} \begin{bmatrix} 0 \\ 3\sqrt{2} \\ -\sqrt{2} \end{bmatrix},
\]
and the singular values are $\sigma_1 = 2\sqrt{3}$ and $\sigma_2 = \sqrt{2}$.
To complete $\{u_1, u_2\}$ to an orthonormal basis in $\mathbb{R}^3$ we calculate the cross product
\[
\begin{bmatrix} 0 \\ 3\sqrt{2} \\ -\sqrt{2} \end{bmatrix} \times \begin{bmatrix} 10 \\ \sqrt{2} \\ 3\sqrt{2} \end{bmatrix}
= \begin{bmatrix} 20 \\ -10\sqrt{2} \\ -30\sqrt{2} \end{bmatrix}
= 10\sqrt{2} \begin{bmatrix} \sqrt{2} \\ -1 \\ -3 \end{bmatrix}
\]
and the norm
\[
\left\| \begin{bmatrix} \sqrt{2} \\ -1 \\ -3 \end{bmatrix} \right\| = \sqrt{12} = 2\sqrt{3}
\]
and then define
\[
u_3 = \frac{1}{2\sqrt{3}} \begin{bmatrix} \sqrt{2} \\ -1 \\ -3 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{6}} \\ -\frac{1}{2\sqrt{3}} \\ -\frac{\sqrt{3}}{2} \end{bmatrix}.
\]
Now $\{u_1, u_2, u_3\}$ is an orthonormal basis in $\mathbb{R}^3$ and the singular value decomposition of the matrix $A$ is
\[
A = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} \begin{bmatrix} 2\sqrt{3} & 0 \\ 0 & \sqrt{2} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix},
\]
that is,
\[
\begin{bmatrix} 1 & 3 \\ \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix}
=
\begin{bmatrix}
\frac{\sqrt{10}}{2\sqrt{3}} & 0 & \frac{1}{\sqrt{6}} \\
\frac{1}{2\sqrt{15}} & \frac{3}{\sqrt{10}} & -\frac{1}{2\sqrt{3}} \\
\frac{3}{2\sqrt{15}} & -\frac{1}{\sqrt{10}} & -\frac{\sqrt{3}}{2}
\end{bmatrix}
\begin{bmatrix} 2\sqrt{3} & 0 \\ 0 & \sqrt{2} \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} \frac{1}{\sqrt{10}} & \frac{3}{\sqrt{10}} \\ \frac{3}{\sqrt{10}} & -\frac{1}{\sqrt{10}} \end{bmatrix}.
\]
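This factorization can be verified numerically. The check below is our own, not part of the book's solution:

```python
import numpy as np

# the U, Sigma, V^T computed by hand in Example 6.13
r2 = np.sqrt(2.0)
A = np.array([[1., 3.], [r2, 0.], [0., r2]])
U = np.column_stack([
    np.array([10., r2, 3 * r2]) / (2 * np.sqrt(30)),   # u1
    np.array([0., 3 * r2, -r2]) / (2 * np.sqrt(5)),    # u2
    np.array([r2, -1., -3.]) / (2 * np.sqrt(3)),       # u3
])
S = np.array([[2 * np.sqrt(3), 0.], [0., r2], [0., 0.]])
Vt = np.array([[1., 3.], [3., -1.]]) / np.sqrt(10)
```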

Example 6.14. Find the singular value decomposition of the matrix
\[
A = \begin{bmatrix} 1 & -1 \\ -3 & 3 \\ -1 & 1 \end{bmatrix}.
\]

Solution. The outer product expansion of $A$ is
\[
A = \sigma_1 u_1 v_1^T,
\]
where
\[
v_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix}, \quad
u_1 = \frac{1}{\sqrt{11}} \begin{bmatrix} 1 \\ -3 \\ -1 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{11}} \\ -\frac{3}{\sqrt{11}} \\ -\frac{1}{\sqrt{11}} \end{bmatrix},
\]
and $\sigma_1 = \sqrt{22}$.
Consequently,
\[
A = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} \begin{bmatrix} \sqrt{22} & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix},
\]
where the vectors $u_2$ and $u_3$ form an orthonormal basis of the vector plane $u_1 \bullet x = 0$, that is, the vector plane
\[
x - 3y - z = 0.
\]
Since the vector $\begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix}$ and the vector
\[
\begin{bmatrix} 1 \\ -3 \\ -1 \end{bmatrix} \times \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -3 \\ 10 \end{bmatrix}
\]
are orthogonal vectors in that plane, we can take
\[
u_2 = \begin{bmatrix} \frac{3}{\sqrt{10}} \\ \frac{1}{\sqrt{10}} \\ 0 \end{bmatrix}
\quad \text{and} \quad
u_3 = \begin{bmatrix} \frac{1}{\sqrt{110}} \\ -\frac{3}{\sqrt{110}} \\ \frac{10}{\sqrt{110}} \end{bmatrix}.
\]
Consequently, the singular value decomposition of the matrix $A$ is
\[
\begin{bmatrix} 1 & -1 \\ -3 & 3 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix}
\frac{1}{\sqrt{11}} & \frac{3}{\sqrt{10}} & \frac{1}{\sqrt{110}} \\
-\frac{3}{\sqrt{11}} & \frac{1}{\sqrt{10}} & -\frac{3}{\sqrt{110}} \\
-\frac{1}{\sqrt{11}} & 0 & \frac{10}{\sqrt{110}}
\end{bmatrix}
\begin{bmatrix} \sqrt{22} & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}.
\]
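The cross-product completion of the orthonormal basis used in this example can be done numerically as well. This is our own sketch:

```python
import numpy as np

# start from the unit vector u1 of Example 6.14
u1 = np.array([1., -3., -1.]) / np.sqrt(11)
# any vector in the plane x - 3y - z = 0 works as a second direction
u2 = np.array([3., 1., 0.]) / np.sqrt(10)
# the cross product is orthogonal to both; normalize it to get u3
u3 = np.cross(u1, u2)
u3 /= np.linalg.norm(u3)
U = np.column_stack([u1, u2, u3])
```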

6.1 Exercises

Find the singular values of the given matrix.

1. $\begin{bmatrix} 2 & 1 \\ -1 & 2 \\ 1 & 1 \end{bmatrix}$

2. $\begin{bmatrix} 1 & 3 \\ 3 & -1 \\ -1 & 1 \end{bmatrix}$

3. $\begin{bmatrix} 3 & 1 \\ -1 & 3 \\ -2 & 6 \end{bmatrix}$

4. $\begin{bmatrix} 2 & 0 \\ 3 & -1 \\ -2 & -4 \end{bmatrix}$

Use Theorem 6.1 to find an orthogonal basis in the given vector plane.

5. $\operatorname{Span}\left\{ \begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \\ -2 \end{bmatrix} \right\}$

6. $\operatorname{Span}\left\{ \begin{bmatrix} -3 \\ 1 \\ -5 \end{bmatrix}, \begin{bmatrix} 4 \\ 2 \\ 0 \end{bmatrix} \right\}$

7. $\operatorname{Span}\left\{ \begin{bmatrix} 2 \\ 1 \\ -2 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ -4 \end{bmatrix} \right\}$

8. $\operatorname{Span}\left\{ \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}, \begin{bmatrix} 3 \\ -1 \\ -1 \end{bmatrix} \right\}$

Find the outer product expansion of the given matrix.

9. $\begin{bmatrix} 3 & 1 \\ 1 & 2 \\ -1 & 3 \end{bmatrix}$

10. $\begin{bmatrix} 3 & 1 \\ -3 & 4 \\ 3 & 1 \end{bmatrix}$

11. $\begin{bmatrix} 1 & -2 \\ -2 & -1 \\ 3 & -1 \end{bmatrix}$

12. $\begin{bmatrix} -3 & 4 \\ 1 & 2 \\ -5 & 0 \end{bmatrix}$

Find the singular value decomposition of the given matrix.

13. $\begin{bmatrix} 3 & 1 \\ 1 & 2 \\ -1 & 3 \end{bmatrix}$

14. $\begin{bmatrix} 3 & 1 \\ -3 & 4 \\ 3 & 1 \end{bmatrix}$

15. $\begin{bmatrix} 1 & -2 \\ -2 & -1 \\ 3 & -1 \end{bmatrix}$

16. $\begin{bmatrix} 1 & 3 \\ 3 & -1 \\ -1 & 1 \end{bmatrix}$

17. Let $A$ be a $3 \times 2$ matrix. Suppose that $u_1$ and $u_2$ are two orthonormal vectors in $\mathbb{R}^3$ and $v_1$ and $v_2$ are two orthonormal vectors in $\mathbb{R}^2$. If $A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T$ is the outer product expansion of $A$, show that $A^T = \sigma_1 v_1 u_1^T + \sigma_2 v_2 u_2^T$.

18. Let $A$ be a $3 \times 2$ matrix. Suppose that $u_1$ and $u_2$ are two orthonormal vectors in $\mathbb{R}^3$ and $v_1$ and $v_2$ are two orthonormal vectors in $\mathbb{R}^2$. If $A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T$ is the outer product expansion of $A$, show that $A^T A = \sigma_1^2 v_1 v_1^T + \sigma_2^2 v_2 v_2^T$.

19. Let $A$ be a $3 \times 2$ matrix. Suppose that $u_1$ and $u_2$ are two orthonormal vectors in $\mathbb{R}^3$ and $v_1$ and $v_2$ are two orthonormal vectors in $\mathbb{R}^2$. If $A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T$ for some real numbers $\sigma_1 \ge \sigma_2 \ge 0$, show that the numbers $\sigma_1$ and $\sigma_2$ are the singular values of $A$.

20. Let $A$ be a $3 \times 2$ matrix. Suppose that $u_1$ and $u_2$ are two orthonormal vectors in $\mathbb{R}^3$ and $v_1$ and $v_2$ are two orthonormal vectors in $\mathbb{R}^2$. If $A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T$ is the outer product expansion of $A$, show that the vectors $u_1$ and $u_2$ are eigenvectors of the matrix $A A^T$ and determine the corresponding eigenvalues.
Chapter 7

Diagonalization of 3 × 3
matrices

In Chapters 1 and 3 we discussed diagonalization of 2 × 2 matrices, which is one of


the most important ideas in linear algebra with numerous applications. In this chap-
ter we consider diagonalization of 3 × 3 matrices. While there are many similarities,
things are more complicated in the case of 3 × 3 matrices.

7.1 Eigenvalues and eigenvectors of 3 × 3 matrices

We begin this section with some preliminary results.

The solution of the equation Ax = 0


    
Solving the equation
\[
\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\]
especially in the case when the matrix $\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}$ is not invertible, plays a crucial role in this chapter. In the first theorem we give a complete description of possible solutions of such equations. Moreover, the presented proof gives us a method of solving such equations. The method is used in many examples in this chapter.


 
Theorem 7.1.1. Let $\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}$ be an arbitrary $3 \times 3$ matrix. The general solution of the equation
\[
\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \tag{7.1}
\]
is one of the following:

(a) The vector $\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$;

(b) A vector line in $\mathbb{R}^3$;

(c) A vector plane in $\mathbb{R}^3$;

(d) All of $\mathbb{R}^3$.

 
Proof. Let $A = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}$.

If $\det A \neq 0$, then the unique solution of (7.1) is $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$, by Theorem 5.3.10.

Now suppose that $\det A = 0$. Since
\[
\det \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} = \det \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} = 0,
\]
by Theorem 5.1.15, the vectors $\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}$, $\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix}$, and $\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix}$ are linearly dependent.

Suppose that the vectors $\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}$ and $\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix}$ are linearly independent. The equation (7.1) is equivalent to the following three equations:
\[
\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix} \bullet \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0, \quad
\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix} \bullet \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0, \quad \text{and} \quad
\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix} \bullet \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0.
\]
Since the vectors $\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}$, $\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix}$, and $\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix}$ are linearly dependent, we have
\[
\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix} = \alpha \begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix} + \beta \begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix}
\]
for some real numbers $\alpha$ and $\beta$. Consequently, the equation $\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix} \bullet \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0$ is a consequence of the equations
\[
\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix} \bullet \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0 \quad \text{and} \quad
\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix} \bullet \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0.
\]
This means that the equation (7.1) is equivalent to these two equations, and the general solution of (7.1) is
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = t \begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix} \times \begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix},
\]
where $t$ is an arbitrary number, by Theorem 5.1.1. This means that $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ is a solution of (7.1) if it is in the vector line
\[
\operatorname{Span}\left\{ \begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix} \times \begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix} \right\}.
\]
The cases when the vectors $\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}$ and $\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix}$ are linearly independent or when the vectors $\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix}$ and $\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix}$ are linearly independent are treated in a similar way.

Suppose now that any two vectors from $\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}$, $\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix}$, and $\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix}$ are linearly dependent. If $\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$, then $\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix} = \alpha \begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}$ and $\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix} = \beta \begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}$ for some real numbers $\alpha$ and $\beta$. Consequently, the equations
\[
\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix} \bullet \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0 \quad \text{and} \quad
\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix} \bullet \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0
\]
follow from the equation
\[
\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix} \bullet \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0.
\]
In other words, the equation (7.1) is equivalent to the above equation, which is an equation of a vector plane, by Theorem 4.2.8. The cases when $\begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ or $\begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ are similar.

Finally we note that every vector $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ in $\mathbb{R}^3$ satisfies the equation
\[
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]

Example 7.1.2. We solve the equation
\[
\begin{bmatrix} 2 & 1 & 3 \\ 1 & 3 & 2 \\ 3 & -1 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\]
which is equivalent to the system
\[
\begin{cases} 2x + y + 3z = 0 \\ x + 3y + 2z = 0 \\ 3x - y + 4z = 0 \end{cases} \tag{7.2}
\]
Since
\[
\det \begin{bmatrix} 2 & 1 & 3 \\ 1 & 3 & 2 \\ 3 & -1 & 4 \end{bmatrix} = 0
\]
and the vectors $\begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}$ are linearly independent, the system (7.2) is equivalent to the system
\[
\begin{cases} 2x + y + 3z = 0 \\ x + 3y + 2z = 0 \end{cases}
\]
The general solution of this system is
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = t \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} \times \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} = t \begin{bmatrix} -7 \\ -1 \\ 5 \end{bmatrix},
\]
where $t$ is an arbitrary real number. This means that all solutions of the system are on the vector line
\[
\operatorname{Span}\left\{ \begin{bmatrix} -7 \\ -1 \\ 5 \end{bmatrix} \right\}.
\]
Note that the system (7.2) is also equivalent to the system
\[
\begin{cases} 2x + y + 3z = 0 \\ 3x - y + 4z = 0 \end{cases}
\]
The general solution of this system is
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = t \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} \times \begin{bmatrix} 3 \\ -1 \\ 4 \end{bmatrix} = t \begin{bmatrix} 7 \\ 1 \\ -5 \end{bmatrix},
\]
where $t$ is an arbitrary real number. This means that all solutions of the system are on the vector line
\[
\operatorname{Span}\left\{ \begin{bmatrix} 7 \\ 1 \\ -5 \end{bmatrix} \right\} = \operatorname{Span}\left\{ \begin{bmatrix} -7 \\ -1 \\ 5 \end{bmatrix} \right\}.
\]
Finally we note that the system (7.2) is equivalent to the system
\[
\begin{cases} x + 3y + 2z = 0 \\ 3x - y + 4z = 0 \end{cases}
\]
The general solution of this system is
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = t \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} \times \begin{bmatrix} 3 \\ -1 \\ 4 \end{bmatrix} = t \begin{bmatrix} 14 \\ 2 \\ -10 \end{bmatrix},
\]
where $t$ is an arbitrary real number. This means that all solutions of the system are on the vector line
\[
\operatorname{Span}\left\{ \begin{bmatrix} 14 \\ 2 \\ -10 \end{bmatrix} \right\} = \operatorname{Span}\left\{ \begin{bmatrix} -7 \\ -1 \\ 5 \end{bmatrix} \right\}.
\]
  
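The cross-product method of this example is easy to check numerically. This is our own check, not part of the book's solution:

```python
import numpy as np

# Example 7.1.2: det A = 0 and the first two rows are linearly independent,
# so the solution line of Ax = 0 is spanned by the cross product of those rows.
A = np.array([[2., 1., 3.], [1., 3., 2.], [3., -1., 4.]])
assert abs(np.linalg.det(A)) < 1e-12
direction = np.cross(A[0], A[1])   # the vector (-7, -1, 5)
```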

From Theorem 7.1.1 and its proof we can easily obtain an important result called the rank-nullity theorem. First we need a new definition.

Definition 7.1.3. Let $A = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}$ be an arbitrary $3 \times 3$ matrix. By the nullspace of $A$, denoted by $N(A)$, we mean the set of all vectors $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ such that
\[
\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]

From Theorem 7.1.1, or by verifying the conditions in Corollary 5.3.27, it follows that $N(A)$, the nullspace of $A$, is a vector subspace of $\mathbb{R}^3$. The dimension of the subspace $N(A)$ is called the nullity of $A$ and is denoted by $\operatorname{nullity}(A)$.

Theorem 7.1.4 (The rank-nullity theorem for $3 \times 3$ matrices). If $A$ is an arbitrary $3 \times 3$ matrix, then
\[
\operatorname{rank}(A) + \operatorname{nullity}(A) = 3.
\]
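A quick numeric illustration of the theorem (our own sketch; counting near-zero singular values is one convenient way to get the nullity):

```python
import numpy as np

# the singular matrix of Example 7.1.2: rank 2, so N(A) is a vector line
A = np.array([[2., 1., 3.], [1., 3., 2.], [3., -1., 4.]])
rank = np.linalg.matrix_rank(A)
# nullity = number of (numerically) zero singular values
nullity = int(np.sum(np.linalg.svd(A, compute_uv=False) < 1e-10))
```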

Determinants of 3 × 3 matrices revisited


In this chapter we often have to calculate determinants in order to find eigenvalues. The next result lists properties of determinants which facilitate such calculations. Most of these results are in the exercises in Chapter 5.

 
Theorem 7.1.5. For an arbitrary $3 \times 3$ matrix $A = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}$ we have:

(a)
\[
\det \begin{bmatrix} t a_1 & b_1 & c_1 \\ t a_2 & b_2 & c_2 \\ t a_3 & b_3 & c_3 \end{bmatrix}
= \det \begin{bmatrix} a_1 & t b_1 & c_1 \\ a_2 & t b_2 & c_2 \\ a_3 & t b_3 & c_3 \end{bmatrix}
= \det \begin{bmatrix} a_1 & b_1 & t c_1 \\ a_2 & b_2 & t c_2 \\ a_3 & b_3 & t c_3 \end{bmatrix}
= t \det A
\]

(b)
\[
\det \begin{bmatrix} t a_1 & t b_1 & t c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}
= \det \begin{bmatrix} a_1 & b_1 & c_1 \\ t a_2 & t b_2 & t c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}
= \det \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ t a_3 & t b_3 & t c_3 \end{bmatrix}
= t \det A
\]

(c)
\[
\det \begin{bmatrix} a_1 & b_1 + s a_1 & c_1 + t a_1 \\ a_2 & b_2 + s a_2 & c_2 + t a_2 \\ a_3 & b_3 + s a_3 & c_3 + t a_3 \end{bmatrix}
= \det \begin{bmatrix} a_1 + s b_1 & b_1 & c_1 + t b_1 \\ a_2 + s b_2 & b_2 & c_2 + t b_2 \\ a_3 + s b_3 & b_3 & c_3 + t b_3 \end{bmatrix}
= \det \begin{bmatrix} a_1 + s c_1 & b_1 + t c_1 & c_1 \\ a_2 + s c_2 & b_2 + t c_2 & c_2 \\ a_3 + s c_3 & b_3 + t c_3 & c_3 \end{bmatrix}
= \det A
\]

(d)
\[
\det \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 + s a_1 & b_2 + s b_1 & c_2 + s c_1 \\ a_3 + t a_1 & b_3 + t b_1 & c_3 + t c_1 \end{bmatrix}
= \det \begin{bmatrix} a_1 + s a_2 & b_1 + s b_2 & c_1 + s c_2 \\ a_2 & b_2 & c_2 \\ a_3 + t a_2 & b_3 + t b_2 & c_3 + t c_2 \end{bmatrix}
= \det \begin{bmatrix} a_1 + s a_3 & b_1 + s b_3 & c_1 + s c_3 \\ a_2 + t a_3 & b_2 + t b_3 & c_2 + t c_3 \\ a_3 & b_3 & c_3 \end{bmatrix}
= \det A
\]

Proof. We can verify these equalities by direct calculations or using the cross product. To illustrate this method we show that
\[
\det \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 + s a_1 & b_2 + s b_1 & c_2 + s c_1 \\ a_3 + t a_1 & b_3 + t b_1 & c_3 + t c_1 \end{bmatrix} = \det \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}.
\]
First we note that
\[
\det \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 + s a_1 & b_2 + s b_1 & c_2 + s c_1 \\ a_3 + t a_1 & b_3 + t b_1 & c_3 + t c_1 \end{bmatrix}
= \det \begin{bmatrix} a_1 & a_2 + s a_1 & a_3 + t a_1 \\ b_1 & b_2 + s b_1 & b_3 + t b_1 \\ c_1 & c_2 + s c_1 & c_3 + t c_1 \end{bmatrix},
\]
because the determinant of a matrix is equal to the determinant of its transpose, by Theorem 5.1.15. If we let $e = \begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}$, $f = \begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix}$, and $g = \begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix}$, then we have
\[
\det \begin{bmatrix} a_1 & a_2 + s a_1 & a_3 + t a_1 \\ b_1 & b_2 + s b_1 & b_3 + t b_1 \\ c_1 & c_2 + s c_1 & c_3 + t c_1 \end{bmatrix}
= \det \begin{bmatrix} e & f + s e & g + t e \end{bmatrix}
= e \bullet \big( (f + s e) \times (g + t e) \big)
\]
\[
= e \bullet (f \times g) + s \big( e \bullet (e \times g) \big) + t \big( e \bullet (f \times e) \big) + s t \big( e \bullet (e \times e) \big)
= e \bullet (f \times g)
= \det \begin{bmatrix} e & f & g \end{bmatrix}.
\]
To complete the proof we use again the fact that the determinant of a matrix is equal to the determinant of its transpose and obtain
\[
\det \begin{bmatrix} e & f & g \end{bmatrix} = \det \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} = \det \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} = \det A.
\]

Eigenvalues of 3 × 3 matrices
In this chapter we generalize the ideas introduced in the previous chapters to 3 × 3
matrices. In particular, we are interested in representing a 3 × 3 matrix A in the form
$A = P D P^{-1}$, where $P$ is an invertible matrix and $D$ is a diagonal matrix. As in the case
of 2 × 2 matrices, eigenvalues play a fundamental role.
The definition of eigenvalues for 3 × 3 matrices is similar to the definition of
eigenvalues for 2 × 2 matrices (Definition 1.4.2).

 
Definition 7.1.6. A real number $\lambda$ is an eigenvalue of the matrix $\begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix}$ if the equation
\[
\begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \lambda \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]
has a solution $\begin{bmatrix} x \\ y \\ z \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$.

 
7 3 3
Example 7.1.7. The number 4 is an eigenvalue of the matrix 1 5 1 because we
5 5 9
have     
7 3 3 2 2
1 5 1 −5 = 4 −5 .
5 5 9 3 3
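A numeric check of this example (our own, not the book's):

```python
import numpy as np

A = np.array([[7., 3., 3.], [1., 5., 1.], [5., 5., 9.]])
x = np.array([2., -5., 3.])
# A x should equal 4 x, and det(A - 4I) should vanish
```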

When finding eigenvalues of a $3 \times 3$ matrix we use the following theorem, which is an immediate consequence of Theorem 7.1.1.

Theorem 7.1.8. A real number $\lambda$ is an eigenvalue of the matrix $\begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix}$ if and only if
\[
\det \begin{bmatrix} a - \lambda & b & c \\ b & d - \lambda & e \\ c & e & f - \lambda \end{bmatrix} = 0.
\]

Definition 7.1.9. The polynomial
\[
P(\lambda) = \det \begin{bmatrix} a - \lambda & b & c \\ b & d - \lambda & e \\ c & e & f - \lambda \end{bmatrix}
\]
is called the characteristic polynomial of the matrix $\begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix}$.

 
Example 7.1.10. Determine the eigenvalues of the matrix $\begin{bmatrix} 7 & 2 & 2 \\ 4 & 5 & 2 \\ 2 & 1 & 4 \end{bmatrix}$.

Solution. The eigenvalues are the roots of the equation
\[
\det \begin{bmatrix} 7 - \lambda & 2 & 2 \\ 4 & 5 - \lambda & 2 \\ 2 & 1 & 4 - \lambda \end{bmatrix} = 0.
\]
To make solving this equation easier we first multiply the third row by $-2$ and add it to the second row, and multiply the third row by $\frac{\lambda - 7}{2}$ and add it to the first row, obtaining
\[
\det \begin{bmatrix} 7 - \lambda + 2 \cdot \frac{\lambda - 7}{2} & 2 + \frac{\lambda - 7}{2} & 2 + \frac{\lambda - 7}{2}(4 - \lambda) \\ 4 - 4 & 5 - \lambda - 2 & 2 - 2(4 - \lambda) \\ 2 & 1 & 4 - \lambda \end{bmatrix} = 0.
\]
Since
\[
\det \begin{bmatrix} 0 & \frac{\lambda - 3}{2} & \frac{-\lambda^2 + 11\lambda - 24}{2} \\ 0 & 3 - \lambda & 2\lambda - 6 \\ 2 & 1 & 4 - \lambda \end{bmatrix}
= 2 \det \begin{bmatrix} \frac{\lambda - 3}{2} & \frac{-\lambda^2 + 11\lambda - 24}{2} \\ 3 - \lambda & 2\lambda - 6 \end{bmatrix} = 0,
\]
it suffices to solve the equation
\[
\det \begin{bmatrix} \lambda - 3 & -\lambda^2 + 11\lambda - 24 \\ 3 - \lambda & 2\lambda - 6 \end{bmatrix} = 0.
\]
The above equation is
\[
(\lambda - 3)(2\lambda - 6) - (-\lambda^2 + 11\lambda - 24)(3 - \lambda) = 0,
\]
which can be simplified to
\[
(3 - \lambda)(-2\lambda + 6 + \lambda^2 - 11\lambda + 24) = 0
\]
or
\[
(3 - \lambda)(\lambda^2 - 13\lambda + 30) = 0.
\]
Because the roots of the quadratic equation $\lambda^2 - 13\lambda + 30 = 0$ are 3 and 10, the eigenvalues are 3 and 10, with 3 being a double eigenvalue.

We note that we can calculate the determinant
\[
\det \begin{bmatrix} 7 - \lambda & 2 & 2 \\ 4 & 5 - \lambda & 2 \\ 2 & 1 & 4 - \lambda \end{bmatrix}
\]
in many different ways. For example, we can subtract the second column from the third one and get
\[
\det \begin{bmatrix} 7 - \lambda & 2 & 0 \\ 4 & 5 - \lambda & -3 + \lambda \\ 2 & 1 & 3 - \lambda \end{bmatrix} = (3 - \lambda) \det \begin{bmatrix} 7 - \lambda & 2 & 0 \\ 4 & 5 - \lambda & -1 \\ 2 & 1 & 1 \end{bmatrix}.
\]
Now we add in the new matrix the third row to the second one and get
\[
(3 - \lambda) \det \begin{bmatrix} 7 - \lambda & 2 & 0 \\ 6 & 6 - \lambda & 0 \\ 2 & 1 & 1 \end{bmatrix} = (3 - \lambda)(\lambda^2 - 13\lambda + 30).
\]

The first method for calculating the above determinant is not the shortest but does not demand tricks. Generally, we have to use properties of determinants in order to calculate the characteristic polynomial. Then we need to find the roots of this polynomial. Since the characteristic polynomial of a 3×3 matrix is a polynomial of degree 3, it is often not obvious how to find the roots.
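The eigenvalues found by hand can be recovered numerically. This check is ours, not the book's (`np.linalg.eigvals` is NumPy's eigenvalue routine):

```python
import numpy as np

A = np.array([[7., 2., 2.], [4., 5., 2.], [2., 1., 4.]])
eigenvalues = np.sort(np.linalg.eigvals(A).real)
# the characteristic polynomial is -(lambda - 3)^2 (lambda - 10)
```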

Eigenvectors of 3 × 3 matrices

Definition 7.1.11. Let $\lambda$ be an eigenvalue of the matrix $\begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix}$. A vector $\begin{bmatrix} x \\ y \\ z \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ such that
\[
\begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \lambda \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]
is called an eigenvector corresponding to the eigenvalue $\lambda$.

Example 7.1.12. The vector $\begin{bmatrix} 2 \\ -5 \\ 3 \end{bmatrix}$ is an eigenvector of the matrix $\begin{bmatrix} 7 & 3 & 3 \\ 1 & 5 & 1 \\ 5 & 5 & 9 \end{bmatrix}$ corresponding to the eigenvalue 4 because $\begin{bmatrix} 7 & 3 & 3 \\ 1 & 5 & 1 \\ 5 & 5 & 9 \end{bmatrix} \begin{bmatrix} 2 \\ -5 \\ 3 \end{bmatrix} = 4 \begin{bmatrix} 2 \\ -5 \\ 3 \end{bmatrix}$.

Note that eigenvectors corresponding to two different eigenvalues of a matrix have to be different. Indeed, if $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ were an eigenvector corresponding to the eigenvalue $\alpha$ and, at the same time, to the eigenvalue $\beta$, then we would have
\[
\alpha \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \beta \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]
and consequently
\[
(\alpha - \beta) \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
But this implies $\alpha = \beta$, since $\begin{bmatrix} x \\ y \\ z \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$.

Definition 7.1.13. If $\lambda$ is an eigenvalue of the matrix $\begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix}$, then the set of all vectors $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ which satisfy the equation
\[
\begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \lambda \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]
is called the eigenspace corresponding to the eigenvalue $\lambda$ and is denoted by $E_\lambda$.

The eigenspace $E_\lambda$ consists of all eigenvectors corresponding to $\lambda$ and the vector $\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$, which is not an eigenvector. Note that any eigenspace of a $3 \times 3$ matrix is the solution set of an equation considered in Theorem 7.1.1 and, consequently, is a vector subspace of $\mathbb{R}^3$.

 
Example 7.1.14. Show that 8 is an eigenvalue of the matrix $A = \begin{bmatrix} 3 & 2 & 2 \\ 3 & 4 & 3 \\ 2 & 2 & 3 \end{bmatrix}$ and then find the eigenspace corresponding to the eigenvalue 8.

Solution. It is easy to verify that 8 is an eigenvalue of the matrix because
\[
\det \begin{bmatrix} 3 - 8 & 2 & 2 \\ 3 & 4 - 8 & 3 \\ 2 & 2 & 3 - 8 \end{bmatrix} = \det \begin{bmatrix} -5 & 2 & 2 \\ 3 & -4 & 3 \\ 2 & 2 & -5 \end{bmatrix} = 0.
\]
To find the eigenvectors corresponding to the eigenvalue 8 we solve the equation
\[
\begin{bmatrix} 3 & 2 & 2 \\ 3 & 4 & 3 \\ 2 & 2 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 8 \begin{bmatrix} x \\ y \\ z \end{bmatrix},
\]
which is equivalent to the system of equations
\[
\begin{cases} -5x + 2y + 2z = 0 \\ 3x - 4y + 3z = 0 \\ 2x + 2y - 5z = 0 \end{cases}
\]
This system can be solved with Gauss elimination or using the cross product as follows. By Theorem 7.1.1, the system reduces to
\[
\begin{cases} 3x - 4y + 3z = 0 \\ 2x + 2y - 5z = 0 \end{cases}
\]
The general solution of this system is
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = t \begin{bmatrix} 3 \\ -4 \\ 3 \end{bmatrix} \times \begin{bmatrix} 2 \\ 2 \\ -5 \end{bmatrix} = t \begin{bmatrix} 14 \\ 21 \\ 14 \end{bmatrix} = 7t \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix},
\]
where $t$ is an arbitrary real number.
This means that the eigenspace corresponding to the eigenvalue $\lambda = 8$ is the vector line $\operatorname{Span}\left\{ \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix} \right\}$.
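The cross-product computation of this eigenspace can be checked numerically. This check is our own, not the book's:

```python
import numpy as np

A = np.array([[3., 2., 2.], [3., 4., 3.], [2., 2., 3.]])
B = A - 8 * np.eye(3)
# the cross product of two independent rows of A - 8I spans the eigenspace
direction = np.cross(B[1], B[2])   # (14, 21, 14) = 7 * (2, 3, 2)
```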

Definition 7.1.15. A $3 \times 3$ matrix $D$ is called diagonal if there are real numbers $\alpha$, $\beta$, and $\gamma$ such that
\[
D = \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{bmatrix}.
\]

Definition 7.1.16. A $3 \times 3$ matrix $A$ is called diagonalizable if there is a diagonal matrix $D$ and an invertible matrix $P$ such that
\[
A = P D P^{-1}.
\]

This means that the matrix
\[
\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}
\]
is diagonalizable if there are real numbers $\alpha$, $\beta$, and $\gamma$ and an invertible matrix
\[
\begin{bmatrix} q_1 & r_1 & s_1 \\ q_2 & r_2 & s_2 \\ q_3 & r_3 & s_3 \end{bmatrix}
\]
such that
\[
\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}
= \begin{bmatrix} q_1 & r_1 & s_1 \\ q_2 & r_2 & s_2 \\ q_3 & r_3 & s_3 \end{bmatrix}
\begin{bmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{bmatrix}
\begin{bmatrix} q_1 & r_1 & s_1 \\ q_2 & r_2 & s_2 \\ q_3 & r_3 & s_3 \end{bmatrix}^{-1}.
\]

Theorem 7.1.17. A $3 \times 3$ matrix is diagonalizable if and only if the matrix has 3 linearly independent eigenvectors.

Proof. Let $A = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}$. We need to show that $A$ is diagonalizable if and only if there exist real numbers $\alpha$, $\beta$, and $\gamma$, not necessarily different, and linearly independent vectors $\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$, $\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$, and $\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}$ such that
\[
A \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \alpha \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}, \quad
A \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \beta \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}, \quad \text{and} \quad
A \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \gamma \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}.
\]
We first note that the above three equations can be written as a single equation
\[
A \begin{bmatrix} u_1 & v_1 & w_1 \\ u_2 & v_2 & w_2 \\ u_3 & v_3 & w_3 \end{bmatrix}
= \begin{bmatrix} u_1 & v_1 & w_1 \\ u_2 & v_2 & w_2 \\ u_3 & v_3 & w_3 \end{bmatrix}
\begin{bmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{bmatrix}.
\]
By Theorem 5.3.10, the vectors $\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$, $\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$, and $\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}$ are linearly independent if and only if the matrix $P = \begin{bmatrix} u_1 & v_1 & w_1 \\ u_2 & v_2 & w_2 \\ u_3 & v_3 & w_3 \end{bmatrix}$ is invertible. Consequently, to prove the theorem it suffices to show that a matrix $A$ is diagonalizable if and only if there exists an invertible matrix $P$ and a diagonal matrix $D$ such that $AP = PD$.

If $A$ is diagonalizable, then there exists an invertible matrix $P$ and a diagonal matrix $D$ such that $A = P D P^{-1}$. Consequently,
\[
A P = P D P^{-1} P = P D.
\]
Now, if there exists an invertible matrix $P$ and a diagonal matrix $D$ such that $AP = PD$, then
\[
P D P^{-1} = A P P^{-1} = A,
\]
so $A$ is diagonalizable.

Note that the above proof explains the relationship between the linearly independent eigenvectors of a matrix $A = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}$ and the invertible matrix $P$ in the equation $A = P D P^{-1}$: the eigenvectors are the column vectors of $P$. More precisely, if
\[
A \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \alpha \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}, \quad
A \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \beta \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}, \quad \text{and} \quad
A \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \gamma \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix},
\]
then for
\[
P = \begin{bmatrix} u_1 & v_1 & w_1 \\ u_2 & v_2 & w_2 \\ u_3 & v_3 & w_3 \end{bmatrix} \quad \text{and} \quad D = \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{bmatrix}
\]
we have $A = P D P^{-1}$.
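This relationship can be illustrated numerically. The sketch below is ours, not the book's; `np.linalg.eig` returns the eigenvalues together with a matrix whose columns are corresponding eigenvectors, so $P$ and $D$ can be read off directly:

```python
import numpy as np

A = np.array([[7., 3., 3.], [1., 5., 1.], [5., 5., 9.]])
eigvals, P = np.linalg.eig(A)   # columns of P are eigenvectors
D = np.diag(eigvals)            # eigenvalues in the matching order
```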

Theorem 7.1.18. If a $3 \times 3$ matrix $A$ has 3 different eigenvalues, then $A$ is diagonalizable.

Proof. Let
\[
A = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}
\]
and let
\[
\det \begin{bmatrix} a_1 - \lambda & b_1 & c_1 \\ a_2 & b_2 - \lambda & c_2 \\ a_3 & b_3 & c_3 - \lambda \end{bmatrix} = -(\lambda - \alpha)(\lambda - \beta)(\lambda - \gamma),
\]
where $\alpha$, $\beta$, and $\gamma$ are three different real numbers.
The equation
\[
\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \alpha \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]
has a nontrivial solution $u = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$ because
\[
\det \begin{bmatrix} a_1 - \alpha & b_1 & c_1 \\ a_2 & b_2 - \alpha & c_2 \\ a_3 & b_3 & c_3 - \alpha \end{bmatrix} = 0.
\]
In the same way the equation
\[
\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \beta \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]
has a nontrivial solution $v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ and the equation
\[
\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \gamma \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]
has a nontrivial solution $w = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}$. We have to prove that the matrix $\begin{bmatrix} u_1 & v_1 & w_1 \\ u_2 & v_2 & w_2 \\ u_3 & v_3 & w_3 \end{bmatrix}$ is invertible, which is equivalent to the fact that the vectors $u$, $v$, and $w$ are linearly independent.

We show first that the vectors $u$ and $v$ are linearly independent. We have to show that
\[
p u + q v = 0
\]
implies $p = q = 0$. If $p u + q v = 0$, then
\[
p \alpha u + q \beta v = p A u + q A v = A(p u + q v) = A 0 = 0.
\]
Since
\[
(p \alpha u + q \beta v) - \beta (p u + q v) = p(\alpha - \beta) u + q(\beta - \beta) v = p(\alpha - \beta) u = 0,
\]
$\alpha - \beta \neq 0$, and $u \neq 0$, we conclude that $p = 0$. Similarly, since
\[
(p \alpha u + q \beta v) - \alpha (p u + q v) = p(\alpha - \alpha) u + q(\beta - \alpha) v = q(\beta - \alpha) v = 0,
\]
$\beta - \alpha \neq 0$, and $v \neq 0$, we conclude that $q = 0$.

Now we prove that the vectors $u$, $v$, and $w$ are linearly independent. To this end we will prove that $r u + s v + t w = 0$ implies that $r = s = t = 0$. If $r u + s v + t w = 0$, then
\[
r \alpha u + s \beta v + t \gamma w = r A u + s A v + t A w = A(r u + s v + t w) = 0
\]
and hence
\[
(r \alpha u + s \beta v + t \gamma w) - \gamma (r u + s v + t w) = r(\alpha - \gamma) u + s(\beta - \gamma) v = 0.
\]
Since $\alpha - \gamma \neq 0$ and $\beta - \gamma \neq 0$ and because the vectors $u$ and $v$ are linearly independent, we obtain that $r = s = 0$. Now, since $w \neq 0$, we also obtain that $t = 0$, which completes the proof.

 
Example 7.1.19. If possible, diagonalize the matrix $\begin{bmatrix} 2 & 1 & -1 \\ 2 & 3 & 1 \\ -2 & -1 & 1 \end{bmatrix}$.

Solution. The eigenvalues of the matrix are the roots of the equation
\[
\det \begin{bmatrix} 2 - \lambda & 1 & -1 \\ 2 & 3 - \lambda & 1 \\ -2 & -1 & 1 - \lambda \end{bmatrix} = 0.
\]
We add the second row to the third one and then we add the second row multiplied by $-\frac{2 - \lambda}{2} = \frac{\lambda - 2}{2}$ to the first row and get
\[
\det \begin{bmatrix} 0 & 1 + \frac{\lambda - 2}{2}(3 - \lambda) & -1 + \frac{\lambda - 2}{2} \\ 2 & 3 - \lambda & 1 \\ 0 & 2 - \lambda & 2 - \lambda \end{bmatrix} = \lambda (2 - \lambda)(\lambda - 4) = 0.
\]
Consequently, the eigenvalues are 0, 2, and 4.

We can also calculate the determinant in the following simpler way. If we add the first row to the last one, the determinant becomes
\[
\det \begin{bmatrix} 2 - \lambda & 1 & -1 \\ 2 & 3 - \lambda & 1 \\ -\lambda & 0 & -\lambda \end{bmatrix} = -\lambda \det \begin{bmatrix} 2 - \lambda & 1 & -1 \\ 2 & 3 - \lambda & 1 \\ 1 & 0 & 1 \end{bmatrix}.
\]
Next we subtract the first column from the third column and the determinant becomes
\[
-\lambda \det \begin{bmatrix} 2 - \lambda & 1 & -3 + \lambda \\ 2 & 3 - \lambda & -1 \\ 1 & 0 & 0 \end{bmatrix} = -\lambda (\lambda^2 - 6\lambda + 8) = 0.
\]
Now we find a basis of eigenvectors for the matrix. First we determine the eigenvectors which correspond to the eigenvalue 0. We have to solve the equation
\[
\begin{bmatrix} 2 & 1 & -1 \\ 2 & 3 & 1 \\ -2 & -1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0 \begin{bmatrix} x \\ y \\ z \end{bmatrix}.
\]
This equation is equivalent to the system
\[
\begin{cases} 2x + y - z = 0 \\ 2x + 3y + z = 0 \\ -2x - y + z = 0 \end{cases}
\]
which is equivalent to
\[
\begin{cases} 2x + y - z = 0 \\ 2x + 3y + z = 0 \end{cases}
\]
The general solution of this system is
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = t \begin{bmatrix} 4 \\ -4 \\ 4 \end{bmatrix} = 4t \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix},
\]
where $t$ is an arbitrary real number. This means that the eigenspace corresponding to the eigenvalue 0 is the vector line $\operatorname{Span}\left\{ \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \right\}$.
Now we determine the eigenvectors which correspond to the eigenvalue 2. We have to solve the equation
\[
\begin{bmatrix} 2 & 1 & -1 \\ 2 & 3 & 1 \\ -2 & -1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 2 \begin{bmatrix} x \\ y \\ z \end{bmatrix}.
\]
This equation is equivalent to the system
\[
\begin{cases} 2x + y - z = 2x \\ 2x + 3y + z = 2y \\ -2x - y + z = 2z \end{cases}
\]
or
\[
\begin{cases} y - z = 0 \\ 2x + y + z = 0 \end{cases}
\]
The general solution of this system is
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = t \begin{bmatrix} 2 \\ -2 \\ -2 \end{bmatrix} = -2t \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix},
\]
where $t$ is an arbitrary real number. This means that the eigenspace corresponding to the eigenvalue 2 is the vector line $\operatorname{Span}\left\{ \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} \right\}$.
 
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 325

7.1. EIGENVALUES AND EIGENVECTORS OF 3 × 3 MATRICES 325

Finally we determine the eigenvectors which correspond to the eigenvalue 4.


We have to solve the equation
    
2 1 −1 x x
 2 3 1  y  = 4  y 
−2 −1 1 z z

or the system

 2x + y − z = 4x
2x + 3y + z = 4y
−2x − y + z = 4z

which is equivalent to

−2x + y − z = 0
2x − y + z = 0
−2x − y − 3z = 0

or

2x − y + z = 0
−2x − y − 3z = 0.
The general solution of this system is
     
x 4 1
 y  = t  4 = 4t  1 ,
z −4 −1

where t is an arbitrary real number. This means that the eigenspace corresponding
  
 1 
to the eigenvalue 4 is the vector line Span  1 .

−1

From our calculations we can conclude that
     −1
2 1 −1 −1 1 1 2 0 0 −1 1 1
 2 3 1 =  1 −1 1 0 0 0  1 −1 1 .
−2 −1 1 1 1 −1 0 0 4 1 1 −1

Note that the order of the eigenvalues in the matrix


 
2 0 0
0 0 0 
0 0 4

corresponds to the order of the corresponding eigenvectors in the matrix


 
−1 1 1
 1 −1 1 .
1 1 −1

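Readers who want to double-check such a factorization can do so mechanically: since P is invertible, A = P DP −1 is equivalent to AP = P D, which avoids computing an inverse. A minimal Python sketch (the helper `matmul` is ours, not from the book):

```python
# Check A P = P D for Example 7.1.19; since P is invertible this is
# equivalent to A = P D P^{-1}.  (Illustrative sketch, not from the book.)

def matmul(X, Y):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

A = [[2, 1, -1], [2, 3, 1], [-2, -1, 1]]
P = [[-1, 1, 1], [1, -1, 1], [1, 1, -1]]   # columns are the eigenvectors
D = [[2, 0, 0], [0, 0, 0], [0, 0, 4]]      # eigenvalues in the same order

print(matmul(A, P) == matmul(P, D))  # True
```

Checking AP = P D column by column is exactly the statement that each column of P is an eigenvector for the matching diagonal entry of D.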
The matrix in the example above had three different eigenvalues. Now we are
going to investigate what happens when a matrix has only two different eigenvalues.
From Theorem 7.1.17 we know that a 3 × 3 matrix is diagonalizable if and only if the
matrix has 3 linearly independent eigenvectors. Therefore, in our case we expect to
find two linearly independent eigenvectors for one of the two eigenvalues.

 
4 1 2
Example 7.1.20. If possible, diagonalize the matrix A = 2 3 2.
2 1 4

Solution. The eigenvalues are given by the equation

\[
\det\begin{bmatrix} 4-\lambda & 1 & 2 \\ 2 & 3-\lambda & 2 \\ 2 & 1 & 4-\lambda \end{bmatrix} = 0.
\]

In order to calculate this determinant we subtract the second row from the third
one and then we add the second row multiplied by (λ − 4)/2 to the first one. We get
\[
\det\begin{bmatrix} 0 & \frac{-\lambda^2+7\lambda-10}{2} & \lambda-2 \\ 2 & 3-\lambda & 2 \\ 0 & \lambda-2 & 2-\lambda \end{bmatrix}
= (\lambda-2)^2(7-\lambda) = 0.
\]

The same result can be obtained if we subtract the second row from the first
one and get
\[
\det\begin{bmatrix} 2-\lambda & \lambda-2 & 0 \\ 2 & 3-\lambda & 2 \\ 2 & 1 & 4-\lambda \end{bmatrix}
= (\lambda-2)\det\begin{bmatrix} -1 & 1 & 0 \\ 2 & 3-\lambda & 2 \\ 2 & 1 & 4-\lambda \end{bmatrix},
\]
and then we add the first column to the second one and get
\[
(\lambda-2)\det\begin{bmatrix} -1 & 0 & 0 \\ 2 & 5-\lambda & 2 \\ 2 & 3 & 4-\lambda \end{bmatrix}
= (\lambda-2)^2(7-\lambda).
\]

Consequently, the matrix A has two eigenvalues: 2 and 7.



First we consider the eigenvalue 2. The equation


    
4 1 2 x x
2 3 2   y  = 2  y 
2 1 4 z z

can be written as     
2 1 2 x 0
2 1 2  y  = 0
2 1 2 z 0
and it thus reduces to a single equation

2x + y + 2z = 0

or
y = −2x − 2z.
The general solution of this equation is
       
x x 1 0
 y  = −2x − 2z  = x −2 + z −2
z z 0 1

  
1 0
for arbitrary real numbers x and z. This means that both vectors −2 and −2
0 1
are eigenvectors of A corresponding to the eigenvalue 2. In this case the eigenspace
   
 1 0 
is the vector plane Span  −2  ,  −2  .
0 1
 
Now we consider the eigenvalue 7. The equation
    
4 1 2 x x
2 3 2   y  = 7  y 
2 1 4 z z

can be written as     
−3 1 2 x 0
 2 −4 2  y  = 0 .
2 1 −3 z 0
The equation is equivalent to the system

 −3x + y + 2z = 0
2x − 4y + 2z = 0 ,
2x + y − 3z = 0


which can be solved using Gaussian elimination or Theorem 7.1.1. The system
reduces to a system of two equations

−3x + y + 2z = 0
x − 2y + z = 0.

The general solution of this system is


         
x −3 1 5 1
 y  = t  1 × −2 = t 5 = 5t 1
z 2 1 5 1

where t is an arbitrary real number. This means that the eigenspace corresponding
 
 1 
to the eigenvalue 7 is the vector line Span 1 . Since
1
 

 
1 0 1
det −2 −2 1 = −5,
0 1 1
     
1 0 1
the vectors −2, −2, and 1 are linearly independent and thus the matrix A
0 1 1
is diagonalizable, by Theorem 7.1.17:
     −1
4 1 2 1 0 1 2 0 0 1 0 1
2 3 2 = −2 −2 1 0 2 0 −2 −2 1 .
2 1 4 0 1 1 0 0 7 0 1 1

Note that we have other choices for diagonalization of A, for example,


     −1
4 1 2 1 1 0 2 0 0 1 1 0
2 3 2 = −2 1 −2 0 7 0 −2 1 −2
2 1 4 0 1 1 0 0 2 0 1 1

or
     −1
4 1 2 1 1 0 7 0 0 1 1 0
2 3 2 = 1 −2 −2 0 2 0 1 −2 −2 .
2 1 4 1 0 1 0 0 2 1 0 1

The matrix in the above example has two eigenvalues: 2 and 7. We found two
   
1 0
linearly independent eigenvectors corresponding to 2, namely −2 and −2, and
0 1

 
1
one eigenvector corresponding to 7, namely 1. We used the determinant to check
1
that these three vectors are linearly independent. It turns out that it was not neces-
sary to check that.

Theorem 7.1.21. Let A be a 3 × 3 matrix with two different eigenvalues α


and β. If u and v are linearly independent eigenvectors corresponding to the
eigenvalue α and w is an eigenvector corresponding to the eigenvalue β, then
the vectors u, v, and w are linearly independent.

Proof. To show that the vectors u, v, and w are linearly independent assume that

r u + sv + t w = 0.

We need to show that r = s = t = 0.


If t ≠ 0, then
\[
w = -\frac{r}{t} u - \frac{s}{t} v,
\]
which means, because an eigenspace is a vector subspace, that w is in the eigenspace
corresponding to the eigenvalue α. But this is not possible, since w is in the
eigenspace corresponding to the eigenvalue β and α ≠ β. Thus t = 0 and the
equation r u + sv + t w = 0 reduces to the equation

r u + sv = 0.

Since the vectors u and v are linearly independent, we must have r = s = 0.


4 1 −2
Example 7.1.22. If possible, diagonalize the matrix A = 3 2 −2.
4 2 −2

Solution. The eigenvalues are 1 and 2. The eigenspace corresponding to the eigen-
 
 1 
value 1 is Span 1 and the eigenspace corresponding to the eigenvalue 2 is
2
 
 
 2 
Span 2 . Since the matrix does not have three linearly independent eigenvec-
3
 
tors, it is not diagonalizable, by Theorem 7.1.17.
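This conclusion can also be checked mechanically: for each eigenvalue λ, the dimension of the eigenspace equals 3 minus the rank of A − λI. A small sketch with exact rational arithmetic (the `rank` routine is ours, not the book's):

```python
# For Example 7.1.22, compute dim(eigenspace) = 3 - rank(A - lambda*I)
# for each eigenvalue; both dimensions come out 1, so A has at most two
# linearly independent eigenvectors.  (Illustrative sketch, not from the book.)
from fractions import Fraction

def rank(M):
    """Rank of a 3x3 matrix via Gauss-Jordan elimination on a copy."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(3):
        piv = next((i for i in range(r, 3) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(3):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[4, 1, -2], [3, 2, -2], [4, 2, -2]]
for lam in (1, 2):
    AmI = [[A[i][j] - lam * (i == j) for j in range(3)] for i in range(3)]
    print(3 - rank(AmI))  # prints 1 for each eigenvalue
```

Since the two eigenspaces together contribute only 1 + 1 = 2 linearly independent eigenvectors, the matrix cannot be diagonalized.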

7.1.1 Exercises

Find an eigenvalue of the matrix A without calculating the characteristic polynomial


of A.
   
7 4 9 3 4 1
1. A = 1 5 1 5. A = 5 7 9
1 2 4 2 4 2
   
5 4 2 5 4 1
2. A = 1 2 0 6. A = 1 7 4
1 1 1 1 2 9
   
2 1 1 7 8 4
3. A = 1 2 1 7. A = 2 9 3
8 3 3 2 4 5
   
4 2 1 11 5 15
4. A = 1 5 1 8. A =  2 2 3
1 2 4 8 9 7

Find two eigenvalues of the given matrix without calculating its characteristic poly-
nomial.
   
5 4 4 3 3 1
9. 4 5 4 10. 1 5 1
4 1 8 1 2 4

Verify that the given λ is an eigenvalue of the given matrix A and calculate the
eigenspace corresponding to the eigenvalue λ.
   
3 3 1 4 2 5
11. λ = 7 and A = 1 5 1 15. λ = 3 and A = 1 5 5
2 2 3 1 2 8
   
3 3 1 2 4 1
12. λ = 1 and A = 2 4 1 16. λ = 1 and A = 1 5 1
2 3 2 1 4 2
   
4 2 5 3 3 1
13. λ = 11 and A = 1 5 5 17. λ = 2 and A = 1 5 1
1 2 8 2 2 3
   
5 1 1 8 1 1
14. λ = 4 and A = 3 7 3 18. λ = 7 and A = 3 10 3
1 5 2 1 1 8

Find the eigenvalues of the given matrix.


   
1 2 3 2 1 1
19. −1 2 1 21. 1 2 1
1 1 2 1 0 2
   
4 1 1 3 1 2
20. 1 4 1 22. 2 4 2
1 3 2 2 1 3

Write, if possible, the given matrix A in the form A = P DP −1 where P is an invertible


matrix and D is a diagonal matrix.
   
1 0 0 4 2 2
23. A = 1 2 0 27. A = 2 7 4
2 1 0 2 4 7
   
3 0 2 3 1 5
24. A = 1 2 2 28. A = 1 3 5
4 4 9 1 1 7
   
3 4 2 1 2 0
25. A = 1 3 1 29. A = −1 4 0
1 2 2 −1 3 2
   
2 1 3 5 2 −3
26. A = 1 2 3 30. A = 2 5 −3
3 3 10 5 3 −3

7.2 Symmetric 3 × 3 matrices


Recall that a matrix A is called symmetric if A T = A.

Definition 7.2.1. A 3 × 3 matrix P is called an orthogonal matrix if it is invertible
and we have

P T = P −1 .

Note that we can also say that a 3 × 3 matrix P is orthogonal if
\[
P^T P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Example 7.2.2. Since
\[
\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & \frac{1}{3} \end{bmatrix}^{T}
\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & \frac{1}{3} \end{bmatrix}
= \begin{bmatrix} \frac{1}{3} & \frac{2}{3} & \frac{2}{3} \\ -\frac{2}{3} & -\frac{1}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{2}{3} & \frac{1}{3} \end{bmatrix}
\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & \frac{1}{3} \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\]
the matrix
\[
\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & \frac{1}{3} \end{bmatrix}
\]
is orthogonal.
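The defining product P T P = I can be verified exactly with rational arithmetic. A minimal sketch in Python (the helpers are ours, not from the book):

```python
# Verify exactly that P^T P = I for the matrix of Example 7.2.2.
# (Illustrative sketch, not from the book.)
from fractions import Fraction as F

def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

P = [[F(1, 3), F(-2, 3), F(2, 3)],
     [F(2, 3), F(-1, 3), F(-2, 3)],
     [F(2, 3), F(2, 3), F(1, 3)]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(transpose(P), P) == I)  # True
```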

The following theorem gives an indication why orthogonal matrices are called
orthogonal.

Theorem 7.2.3. If the columns of a 3 × 3 matrix P are pairwise orthogonal


unit vectors, then P is an orthogonal matrix.

Proof. Let P = [p1 p2 p3], where p1 , p2 , and p3 are vectors such that

p1 • p2 = p2 • p3 = p3 • p1 = 0 and ‖p1‖ = ‖p2‖ = ‖p3‖ = 1.

Then
\[
P^T P = \begin{bmatrix} p_1^T \\ p_2^T \\ p_3^T \end{bmatrix}
\begin{bmatrix} p_1 & p_2 & p_3 \end{bmatrix}
= \begin{bmatrix} p_1^T p_1 & p_1^T p_2 & p_1^T p_3 \\ p_2^T p_1 & p_2^T p_2 & p_2^T p_3 \\ p_3^T p_1 & p_3^T p_2 & p_3^T p_3 \end{bmatrix}
= \begin{bmatrix} p_1 \bullet p_1 & p_1 \bullet p_2 & p_1 \bullet p_3 \\ p_2 \bullet p_1 & p_2 \bullet p_2 & p_2 \bullet p_3 \\ p_3 \bullet p_1 & p_3 \bullet p_2 & p_3 \bullet p_3 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\]
which means that P T = P −1 , that is, P is an orthogonal matrix.

Note that the columns of the matrix in Example 7.2.2, that is, the matrix
\[
\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & \frac{1}{3} \end{bmatrix},
\]
are pairwise orthogonal unit vectors.

Corollary 7.2.4. Let p1 , p2 , and p3 be three vectors in R3 such that the matrix
[p1 p2 p3] is an orthogonal matrix. Then {p1 , p2 , p3 } is a basis of R3 .

Proof. The set {p1 , p2 , p3 } is a basis of R3 because the matrix [p1 p2 p3] is
invertible.

A basis {p1 , p2 , p3 } of R3 such that the vectors p1 , p2 , and p3 are orthonormal is


called an orthonormal basis.

Definition 7.2.5. We say that a matrix A is orthogonally diagonalizable if


there are an orthogonal matrix P and a diagonal matrix D such that

A = P DP −1 = P DP T .

Example 7.2.6. In the previous example we showed that the matrix
\[
\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & \frac{1}{3} \end{bmatrix}
\]
is orthogonal. Consequently, the matrix
\[
\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & \frac{1}{3} \end{bmatrix}
\begin{bmatrix} 2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \frac{1}{3} & \frac{2}{3} & \frac{2}{3} \\ -\frac{2}{3} & -\frac{1}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{2}{3} & \frac{1}{3} \end{bmatrix}
= \begin{bmatrix} \frac{2}{9} & -\frac{2}{9} & \frac{10}{9} \\ -\frac{2}{9} & \frac{11}{9} & \frac{8}{9} \\ \frac{10}{9} & \frac{8}{9} & \frac{5}{9} \end{bmatrix}
\]
is orthogonally diagonalizable.

We know that the matrix
\[
\begin{bmatrix} \frac{2}{9} & -\frac{2}{9} & \frac{10}{9} \\ -\frac{2}{9} & \frac{11}{9} & \frac{8}{9} \\ \frac{10}{9} & \frac{8}{9} & \frac{5}{9} \end{bmatrix}
\]
in the example above is orthogonally diagonalizable because it is constructed as a
product P DP T where P is an orthogonal matrix and D is a diagonal matrix, but
how could we check if a matrix is orthogonally diagonalizable? The following
theorem can help with this question.

Theorem 7.2.7. If a 3 × 3 matrix A has 3 orthogonal eigenvectors, then A is


symmetric and orthogonally diagonalizable.

Proof. Let v1 ,v2 , and v3 be the orthogonal eigenvectors of the matrix A correspond-
ing to the eigenvalues α1 , α2 , and α3 , respectively. This means that

Av1 = α1 v1 , Av2 = α2 v2 , Av3 = α3 v3 , and v1 • v2 = v2 • v3 = v1 • v3 = 0.



If we let
1 1 1
p1 = v1 , p2 = v2 , and p3 = v3 ,
kv1 k kv2 k kv3 k

then we have

p1 • p2 = p2 • p3 = p1 • p3 = 0 and kp1 k = kp2 k = kp3 k = 1.

Let P be the matrix whose columns are the vectors p1 , p2 , and p3 , that is,
£ ¤
P = p1 p2 p3 .

Then P is an orthogonal matrix, by Theorem 7.2.3.


The vectors p1 , p2 , and p3 satisfy the equations

Ap1 = α1 p1 , Ap2 = α2 p2 , Ap3 = α3 p3 ,

that can be written as a single equation
\[
A \begin{bmatrix} p_1 & p_2 & p_3 \end{bmatrix}
= \begin{bmatrix} p_1 & p_2 & p_3 \end{bmatrix}
\begin{bmatrix} \alpha_1 & 0 & 0 \\ 0 & \alpha_2 & 0 \\ 0 & 0 & \alpha_3 \end{bmatrix}
\]
or
\[
AP = P \begin{bmatrix} \alpha_1 & 0 & 0 \\ 0 & \alpha_2 & 0 \\ 0 & 0 & \alpha_3 \end{bmatrix}.
\]
Hence
\[
A = A \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= AP P^{-1}
= P \begin{bmatrix} \alpha_1 & 0 & 0 \\ 0 & \alpha_2 & 0 \\ 0 & 0 & \alpha_3 \end{bmatrix} P^{-1}
= P \begin{bmatrix} \alpha_1 & 0 & 0 \\ 0 & \alpha_2 & 0 \\ 0 & 0 & \alpha_3 \end{bmatrix} P^{T}.
\]

Since P is an orthogonal matrix, the matrix A is orthogonally diagonalizable.
Moreover, since
\[
A^T = \left( P \begin{bmatrix} \alpha_1 & 0 & 0 \\ 0 & \alpha_2 & 0 \\ 0 & 0 & \alpha_3 \end{bmatrix} P^T \right)^{T}
= (P^T)^T \begin{bmatrix} \alpha_1 & 0 & 0 \\ 0 & \alpha_2 & 0 \\ 0 & 0 & \alpha_3 \end{bmatrix}^{T} P^T
= P \begin{bmatrix} \alpha_1 & 0 & 0 \\ 0 & \alpha_2 & 0 \\ 0 & 0 & \alpha_3 \end{bmatrix} P^T = A,
\]
the matrix A is symmetric.

From Theorem 7.2.7 and its proof we obtain the following useful corollary.

Corollary 7.2.8. If A is a 3 × 3 matrix with 3 orthogonal eigenvectors v1 , v2 ,


v3 corresponding to the real eigenvalues α1 , α2 , α3 , that is

Av1 = α1 v1 , Av2 = α2 v2 , Av3 = α3 v3 ,

then
A = P DP T ,
where P = [p1 p2 p3] is the orthogonal matrix with columns
\[
p_1 = \frac{1}{\|v_1\|} v_1, \quad p_2 = \frac{1}{\|v_2\|} v_2, \quad p_3 = \frac{1}{\|v_3\|} v_3
\]

and
\[
D = \begin{bmatrix} \alpha_1 & 0 & 0 \\ 0 & \alpha_2 & 0 \\ 0 & 0 & \alpha_3 \end{bmatrix}.
\]

Calculating the diagonal form of a 3 × 3 symmetric matrix


We first state and prove the following useful property of the dot product.

Lemma 7.2.9. If A is an arbitrary 3 × 3 matrix, then

Au • v = u • A T v

for any vectors u and v in R3 .

Proof. The proof uses the connection between the dot product and the product of
matrices, namely
x • y = xT y,
associativity of matrix multiplication, and a property of the transpose operation:

Au • v = (Au)T v = (uT A T )v = uT (A T v) = u • A T v.

Note that for symmetric matrices we have

Au • v = u • Av.
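The identity of Lemma 7.2.9 holds for every 3 × 3 matrix and every pair of vectors, so it is easy to spot-check numerically. A minimal sketch (the matrix and vectors here are arbitrary sample values, and the helpers are ours):

```python
# Spot-check the identity Au . v = u . (A^T v) on a sample matrix and
# pair of vectors.  (Illustrative sketch, not from the book.)

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]   # any 3x3 matrix works
u, v = [1, -2, 3], [2, 0, -1]
print(dot(matvec(A, u), v) == dot(u, matvec(transpose(A), v)))  # True
```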

In the next two theorems we formulate some properties of symmetric matrices


that will help us diagonalize such matrices. These facts are important in their own
right and belong to the fundamental properties of symmetric matrices.

Theorem 7.2.10. Eigenvectors corresponding to different eigenvalues of a


symmetric matrix are orthogonal.

Proof. If Au = αu and Av = βv, then

(α − β)(u • v) = α(u • v) − β(u • v) = (αu) • v − u • (βv) = Au • v − u • Av.

For a symmetric matrix A, we have

Au • v − u • Av = 0

and consequently
(α − β)(u • v) = 0.
If α ≠ β, we must have u • v = 0.

Theorem 7.2.11. If A is a symmetric 3 × 3 matrix with three different eigen-


values, then there is an orthonormal basis of R3 consisting of eigenvectors of
A.

Proof. Let
 
a b c
A = b d e 
c e f
and let
\[
\det \begin{bmatrix} a-\lambda & b & c \\ b & d-\lambda & e \\ c & e & f-\lambda \end{bmatrix}
= -(\lambda - \alpha)(\lambda - \beta)(\lambda - \gamma),
\]
where α, β, and γ are three different real numbers. Since A has three different eigen-
values, it is diagonalizable, by Theorem 7.1.18, and thus it has three linearly inde-
pendent eigenvectors, by Theorem 7.1.17. Let u be an eigenvector corresponding
to the eigenvalue α, v an eigenvector corresponding to the eigenvalue β, and w an
eigenvector corresponding to the eigenvalue γ. By Corollary 5.3.19, {u, v, w} is a ba-
sis in R3 . Since the vectors u, v, w are orthogonal, by Theorem 7.2.10, {u, v, w} is an
orthogonal basis in R3 . Consequently,
½ ¾
u v w
, ,
kuk kvk kwk

is an orthonormal basis in R3 consisting of eigenvectors of the matrix A.



 
4 1 2
Example 7.2.12. Orthogonally diagonalize the matrix A = 1 5 1.
2 1 4

Solution. The eigenvalues are the roots of the equation

\[
\det \begin{bmatrix} 4-\lambda & 1 & 2 \\ 1 & 5-\lambda & 1 \\ 2 & 1 & 4-\lambda \end{bmatrix} = 0.
\]

We multiply the second row by λ − 4 and add to the first and then multiply the
second row by −2 and add to the third row and get
\[
\det \begin{bmatrix} 0 & (5-\lambda)(\lambda-4)+1 & \lambda-2 \\ 1 & 5-\lambda & 1 \\ 0 & -9+2\lambda & 2-\lambda \end{bmatrix}
= (2-\lambda)(\lambda^2 - 11\lambda + 28) = 0.
\]

Instead, we can subtract the third row from the first one and get
\[
\det \begin{bmatrix} 4-\lambda & 1 & 2 \\ 1 & 5-\lambda & 1 \\ 2 & 1 & 4-\lambda \end{bmatrix}
= \det \begin{bmatrix} 2-\lambda & 0 & \lambda-2 \\ 1 & 5-\lambda & 1 \\ 2 & 1 & 4-\lambda \end{bmatrix}
= (2-\lambda) \det \begin{bmatrix} 1 & 0 & -1 \\ 1 & 5-\lambda & 1 \\ 2 & 1 & 4-\lambda \end{bmatrix}.
\]
In the last determinant we add the first column to the third one and get
\[
(2-\lambda) \det \begin{bmatrix} 1 & 0 & 0 \\ 1 & 5-\lambda & 2 \\ 2 & 1 & 6-\lambda \end{bmatrix}
= (2-\lambda)(\lambda^2 - 11\lambda + 28).
\]

Consequently, the eigenvalues are 2, 4, and 7. Now we calculate the eigenspaces.


The eigenvectors corresponding to the eigenvalue 2 are given by the equation
    
4 1 2 x x
1 5 1   y  = 2  y 
2 1 4 z z

which can be written as 


 4x + y + 2z = 2x
x + 5y + z = 2y
2x + y + 4z = 2z


or 
 2x + y + 2z = 0
x + 3y + z = 0 .
2x + y + 2z = 0

This system is equivalent to the system


2x + y + 2z = 0
x + 3y + z = 0

and its general solution is


     
x −5 1
 y  = t  0 = −5t  0 ,
z 5 −1

where t is an arbitrary real number. This means that the eigenspace corresponding
  
 1 
to the eigenvalue 2 is Span  0 .

−1

The eigenvectors corresponding to the eigenvalue 4 are given by the equation
    
4 1 2 x x
1 5 1  y  = 4  y  .
2 1 4 z z

This equation is equivalent to the system



 4x + y + 2z = 4x
x + 5y + z = 4y
2x + y + 4z = 4z

or 
 y + 2z = 0
x +y + z =0 .
2x + y =0

The above system reduces to the system


y + 2z = 0
2x + y =0

and its general solution is


     
x −2 1
 y  = t  4 = −2t −2 ,
z −2 1

where t is an arbitrary real number. Consequently, the eigenspace corresponding


 
 1 
to the eigenvalue 4 is Span −2 .
1
 
The eigenvectors corresponding to the eigenvalue 7 are given by the equation
    
4 1 2 x x
1 5 1   y  = 7  y 
2 1 4 z z

This equation is equivalent to the system



 4x + y + 2z = 7x
x + 5y + z = 7y
2x + y + 4z = 7z

or 
 −3x + y + 2z = 0
x − 2y + z = 0 .
2x + y − 3z = 0

The above system reduces to the system


x − 2y + z = 0
2x + y − 3z = 0

and its general solution is


     
x 5 1
 y  = t 5 = 5t 1 ,
z 5 1
where t is an arbitrary real number. Thus the eigenspace corresponding to the
 
 1 
eigenvalue 7 is Span 1 .
1
 
Note that we can obtain the eigenspace corresponding to the eigenvalue 7
as a consequence of the fact that the eigenvectors corresponding to the eigenvalue
7 have to be orthogonal to the eigenvectors corresponding to the eigenvalues 2
and 4. Since        
1 1 −2 1
 0 × −2 = −2 = −2 1 ,
−1 1 −2 1
 
 1 
Span 1 must be the eigenspace corresponding to the eigenvalue 7.
1
 


    
1 1 1
Since the vectors  0, −2, and 1 are orthogonal eigenvectors corre-
−1 1 1
sponding to eigenvalues 2, 4 and 7, respectively, they can be used to orthogonally
diagonalize the matrix A. The last necessary step is normalization of these
eigenvectors. Since
\[
\left\| \begin{bmatrix} 1\\0\\-1 \end{bmatrix} \right\| = \sqrt{2}, \quad
\left\| \begin{bmatrix} 1\\-2\\1 \end{bmatrix} \right\| = \sqrt{6}, \quad
\left\| \begin{bmatrix} 1\\1\\1 \end{bmatrix} \right\| = \sqrt{3},
\]
we conclude that
\[
\begin{bmatrix} 4 & 1 & 2 \\ 1 & 5 & 1 \\ 2 & 1 & 4 \end{bmatrix}
= \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ 0 & -\frac{2}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \end{bmatrix}
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 7 \end{bmatrix}
\begin{bmatrix} \frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{6}} & -\frac{2}{\sqrt{6}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{bmatrix}
\]
is an orthogonal diagonalization of A.
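With the square roots rendered as floating-point numbers, the product P DP T can be confirmed numerically up to rounding. A minimal Python sketch (the helpers are ours, not from the book):

```python
# Numerically confirm the orthogonal diagonalization of Example 7.2.12:
# P D P^T should reproduce A up to rounding.
# (Illustrative sketch, not from the book.)
from math import sqrt, isclose

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]

s2, s3, s6 = sqrt(2), sqrt(3), sqrt(6)
P = [[1/s2, 1/s6, 1/s3],
     [0.0, -2/s6, 1/s3],
     [-1/s2, 1/s6, 1/s3]]
D = [[2, 0, 0], [0, 4, 0], [0, 0, 7]]
A = [[4, 1, 2], [1, 5, 1], [2, 1, 4]]
B = matmul(matmul(P, D), transpose(P))
print(all(isclose(B[i][j], A[i][j], abs_tol=1e-12)
          for i in range(3) for j in range(3)))  # True
```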

In the next theorem we discuss the general case of 3×3 symmetric matrices with
two different eigenvalues.

Theorem 7.2.13. Let A be a symmetric 3×3 matrix. If the characteristic poly-


nomial of A has the form

P (λ) = −(λ − α)^2 (λ − β),

where α and β are two different real numbers, then the matrix A has an or-
thogonal basis of eigenvectors consisting of two eigenvectors corresponding to
the eigenvalue α and one eigenvector corresponding to the eigenvalue β.

 
a b c
Proof. Let A = b d e . The equation
c e f

a −α
    
b c x 0
 b d −α e   y  = 0
c e f −α z 0
 
u1
has a nontrivial solution u 2  because
u3

a −α
 
b c
det  b d −α e  = 0.
c e f −α

Similarly, the equation

a −β
    
b c x 0
 b d −β e   y  = 0
c e f −β z 0
     
v1 u1 v1
has a nontrivial solution v 2 . By Theorem 7.2.10, we have u 2  • v 2  = 0. Let
v3 u3 v3
     
w1 u1 v1
w 2  = u 2  × v 2  .
w3 u3 v3
       
w1 0 u1 v1
We have w 2  6= 0 because the vectors u 2  and v 2  are linearly independent
w3 0 u3 v3
being nonzero and orthogonal. By Lemma 7.2.9, we have
           
a b c w1 u1 w1 a b c u1
b d e  w 2  • u 2  = w 2  • b d e  u 2 
c e f w3 u3 w3 c e f u3
       
w1 u1 w1 u1
= w 2  • α u 2  = α w 2  • u 2  = 0
w3 u3 w3 u3

and
            
a b c w1 v1 w1 a b c v1
  b d e   w 2  •  v 2  =  w 2  •   b d e   v 2 
c e f w3 v3 w3 c e f v3
        
w1 v1 w1 v1
= w 2  • β v 2  = β w 2  • v 2  = 0.
w3 v3 w3 v3

Hence     
a b c w1 w1
b d e  w 2  = γ w 2 
c e f w3 w3
for some real number γ by Theorem 5.1.1. So γ is an eigenvalue of A and we must
have
−(λ − α)^2 (λ − β) = −(λ − α)(λ − β)(λ − γ),
 
w1
so γ = α and thus w 2  is another eigenvector corresponding to α.
w3

 
2 1 2
Example 7.2.14. Orthogonally diagonalize the matrix A = 1 2 2.
2 2 5

Solution. First we need to find the eigenvalues of A, that is, the values of λ satisfy-
ing the equation
\[
\det \begin{bmatrix} 2-\lambda & 1 & 2 \\ 1 & 2-\lambda & 2 \\ 2 & 2 & 5-\lambda \end{bmatrix} = 0.
\]
We add the second row multiplied by −2 to the third one and then we multiply the
second row by λ − 2 and add to the first row. We get
\[
\det \begin{bmatrix} 0 & 1+(\lambda-2)(2-\lambda) & -2+2\lambda \\ 1 & 2-\lambda & 2 \\ 0 & -2+2\lambda & 1-\lambda \end{bmatrix} = 0
\]
or
\[
\det \begin{bmatrix} 0 & -\lambda^2+4\lambda-3 & -2+2\lambda \\ 1 & 2-\lambda & 2 \\ 0 & -2+2\lambda & 1-\lambda \end{bmatrix} = 0.
\]
Since
\[
\det \begin{bmatrix} 0 & -\lambda^2+4\lambda-3 & -2+2\lambda \\ 1 & 2-\lambda & 2 \\ 0 & -2+2\lambda & 1-\lambda \end{bmatrix}
= (1-\lambda)(\lambda^2-8\lambda+7),
\]
the eigenvalues are 1, 1, and 7.
Alternatively, we can subtract the second row from the first one and get
\[
\det \begin{bmatrix} 1-\lambda & \lambda-1 & 0 \\ 1 & 2-\lambda & 2 \\ 2 & 2 & 5-\lambda \end{bmatrix}
= (1-\lambda) \det \begin{bmatrix} 1 & -1 & 0 \\ 1 & 2-\lambda & 2 \\ 2 & 2 & 5-\lambda \end{bmatrix}
= (1-\lambda)(\lambda^2-8\lambda+7)
= (1-\lambda)^2(7-\lambda).
\]

The eigenvectors corresponding to the eigenvalue 1 are given by the equation


    
2 1 2 x x
1 2 2   y  =  y  .
2 2 5 z z

This equation is equivalent to the system



 2x + y + 2z = x
x + 2y + 2z = y
2x + 2y + 5z = z


which reduces to the equation

x + y + 2z = 0

or
x = −y − 2z.
Consequently, the general solution of the system is
       
x −y − 2z −1 −2
y  =  y  = y  1 + z  0
z z 0 1
   
 −1 −2 
and the eigenspace corresponding to the eigenvalue 1 is Span  1 ,  0 .
0 1
 
Since the vector
\[
\begin{bmatrix} -2\\0\\1 \end{bmatrix}
- \frac{\begin{bmatrix} -2\\0\\1 \end{bmatrix} \bullet \begin{bmatrix} -1\\1\\0 \end{bmatrix}}
       {\begin{bmatrix} -1\\1\\0 \end{bmatrix} \bullet \begin{bmatrix} -1\\1\\0 \end{bmatrix}}
\begin{bmatrix} -1\\1\\0 \end{bmatrix}
= \begin{bmatrix} -1\\-1\\1 \end{bmatrix}
\]
is an eigenvector corresponding to the eigenvalue 1 which is orthogonal to
\(\begin{bmatrix} -1\\1\\0 \end{bmatrix}\), the vectors
\(\begin{bmatrix} -1\\1\\0 \end{bmatrix}\) and \(\begin{bmatrix} -1\\-1\\1 \end{bmatrix}\)
are orthogonal and
\[
\operatorname{Span}\left\{ \begin{bmatrix} -1\\1\\0 \end{bmatrix}, \begin{bmatrix} -2\\0\\1 \end{bmatrix} \right\}
= \operatorname{Span}\left\{ \begin{bmatrix} -1\\1\\0 \end{bmatrix}, \begin{bmatrix} -1\\-1\\1 \end{bmatrix} \right\}.
\]
   

Now we find the eigenspace corresponding to the eigenvalue 7 by noting that


an eigenvector corresponding to the eigenvalue 7 is orthogonal to all the eigenvec-
tors corresponding to the eigenvalue 1. We have
     
−1 −2 1
 1 ×  0 = 1 ,
0 1 2

 
 1 
so Span 1 must be the eigenspace corresponding to the eigenvalue 7. Since
2
 

° °2 ° °2 ° °2


° 1 ° ° −1 ° ° −1 °
° ° ° ° ° °
°1° = 6, ° 1° = 2, and °−1° = 3,
° ° ° ° ° °
° 2 ° ° 0 ° ° 1 °

normalizing the eigenvectors we obtain a basis of orthonormal eigenvectors

p1
   1   1 
−p −p
 6 2  3
 p1   1  − p1 
 6,  p ,
 2  3,
   
p2 0 p1
6 3

corresponding to the eigenvalues λ = 7, λ = 1, λ = 1. Now we are ready to present


an orthogonal diagonalization of A:

p1 − p1 − p1
 1
p1 p2
  
p
 6 2 3 7 0 0  6 6 6
 
 1 p1 − p1   1 p1 0
A =  p6 2 3
 0 1 0 − p2 2
.
0 0 1
  
p2 0 p1 − p1 − p1 p1
6 3 3 3 3

The spectral decomposition of a symmetric matrix


The representation of a matrix presented in the next theorem is a consequence of
the representation A = P DP −1 , but is expressed in a very different form. This form
explains the geometric meaning of orthogonal diagonalization and is useful in ap-
plications.

Theorem 7.2.15. If {v1 , v2 , v3 } is a basis of orthogonal eigenvectors of a 3 × 3
symmetric matrix A, then
\[
A = \alpha_1 \frac{1}{\|v_1\|^2} v_1 v_1^T + \alpha_2 \frac{1}{\|v_2\|^2} v_2 v_2^T + \alpha_3 \frac{1}{\|v_3\|^2} v_3 v_3^T,
\]

where α1 , α2 , and α3 are the eigenvalues of A corresponding to the eigenvec-


tors v1 , v2 , and v3 , respectively, that is,

Av1 = α1 v1 , Av2 = α2 v2 , and Av3 = α3 v3 .

Proof. We give two proofs.



Proof 1. Let
\[
p_1 = \frac{1}{\|v_1\|} v_1, \quad p_2 = \frac{1}{\|v_2\|} v_2, \quad p_3 = \frac{1}{\|v_3\|} v_3.
\]
First we note that
\[
A = \begin{bmatrix} p_1 & p_2 & p_3 \end{bmatrix}
\begin{bmatrix} \alpha_1 & 0 & 0\\ 0 & \alpha_2 & 0\\ 0 & 0 & \alpha_3 \end{bmatrix}
\begin{bmatrix} p_1^T\\ p_2^T\\ p_3^T \end{bmatrix}.
\]
Now, if x is an arbitrary vector from R3 , then we have
\[
Ax = \begin{bmatrix} p_1 & p_2 & p_3 \end{bmatrix}
\begin{bmatrix} \alpha_1 & 0 & 0\\ 0 & \alpha_2 & 0\\ 0 & 0 & \alpha_3 \end{bmatrix}
\begin{bmatrix} p_1^T\\ p_2^T\\ p_3^T \end{bmatrix} x
= \begin{bmatrix} p_1 & p_2 & p_3 \end{bmatrix}
\begin{bmatrix} \alpha_1 & 0 & 0\\ 0 & \alpha_2 & 0\\ 0 & 0 & \alpha_3 \end{bmatrix}
\begin{bmatrix} p_1^T x\\ p_2^T x\\ p_3^T x \end{bmatrix}
= \begin{bmatrix} p_1 & p_2 & p_3 \end{bmatrix}
\begin{bmatrix} \alpha_1 p_1^T x\\ \alpha_2 p_2^T x\\ \alpha_3 p_3^T x \end{bmatrix}
= \alpha_1 p_1 \left( p_1^T x \right) + \alpha_2 p_2 \left( p_2^T x \right) + \alpha_3 p_3 \left( p_3^T x \right)
= \left( \alpha_1 p_1 p_1^T + \alpha_2 p_2 p_2^T + \alpha_3 p_3 p_3^T \right) x.
\]
Consequently, by Theorem 2.1.18, we have
\[
A = \alpha_1 p_1 p_1^T + \alpha_2 p_2 p_2^T + \alpha_3 p_3 p_3^T,
\]
which can be written as
\[
A = \alpha_1 \frac{1}{\|v_1\|^2} v_1 v_1^T + \alpha_2 \frac{1}{\|v_2\|^2} v_2 v_2^T + \alpha_3 \frac{1}{\|v_3\|^2} v_3 v_3^T.
\]
Proof 2. We have shown in Example 5.3.24 that
\[
\frac{1}{\|v_1\|^2} v_1 v_1^T + \frac{1}{\|v_2\|^2} v_2 v_2^T + \frac{1}{\|v_3\|^2} v_3 v_3^T
= \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}
\]
for any orthogonal basis {v1 , v2 , v3 } in R3 . Hence
\[
A \left( \frac{1}{\|v_1\|^2} v_1 v_1^T + \frac{1}{\|v_2\|^2} v_2 v_2^T + \frac{1}{\|v_3\|^2} v_3 v_3^T \right)
= A \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix} = A,
\]
which can be written as
\[
\frac{1}{\|v_1\|^2} A v_1 v_1^T + \frac{1}{\|v_2\|^2} A v_2 v_2^T + \frac{1}{\|v_3\|^2} A v_3 v_3^T = A.
\]

Now, because
Av1 = α1 v1 , Av2 = α2 v2 , Av3 = α3 v3 ,
we obtain the desired spectral decomposition

\[
A = \alpha_1 \frac{1}{\|v_1\|^2} v_1 v_1^T + \alpha_2 \frac{1}{\|v_2\|^2} v_2 v_2^T + \alpha_3 \frac{1}{\|v_3\|^2} v_3 v_3^T.
\]

Definition 7.2.16. Let A be a 3 × 3 matrix. By a spectral decomposition of A


we mean a representation of A in the form
\[
A = \frac{\alpha_1}{\|v_1\|^2} v_1 v_1^T + \frac{\alpha_2}{\|v_2\|^2} v_2 v_2^T + \frac{\alpha_3}{\|v_3\|^2} v_3 v_3^T,
\]

where v1 , v2 , and v3 are nonzero orthogonal vectors and α1 , α2 , and α3 are


real numbers.

Recall that, for any nonzero vector u in R3 , the matrix \(\frac{1}{\|u\|^2} u u^T\) is the projection
matrix on the vector line Span{u} (see Theorem 4.2.13). The spectral decomposition
matrix on the vector line Span{u}, (see Theorem 4.2.13). The spectral decomposition
of a symmetric matrix is thus a representation of the matrix in terms of projection
matrices on vector lines spanned by the eigenvectors of that matrix.
The following theorem is a converse of Theorem 7.2.15.

Theorem 7.2.17. If {v1 , v2 , v3 } is a basis of orthogonal vectors in R3 , then the


matrix
\[
A = \alpha_1 \frac{1}{\|v_1\|^2} v_1 v_1^T + \alpha_2 \frac{1}{\|v_2\|^2} v_2 v_2^T + \alpha_3 \frac{1}{\|v_3\|^2} v_3 v_3^T
\]
is a symmetric matrix such that
kv1 k kv2 k kv3 k2
is a symmetric matrix such that

Av1 = α1 v1 , Av2 = α2 v2 , and Av3 = α3 v3 .

Proof. The proof is an easy consequence of the fact that the matrix uuT is symmetric
for any u in R3 .

 
2 1 2
Example 7.2.18. Find the spectral decomposition of the matrix A = 1 2 2.
2 2 5

Solution. The matrix A is the matrix considered in Example 7.2.14 where we found
that its eigenvalues are 1, 1, and 7, and corresponding orthogonal eigenvectors are

\(\begin{bmatrix} -1\\1\\0 \end{bmatrix}\), \(\begin{bmatrix} -1\\-1\\1 \end{bmatrix}\), and \(\begin{bmatrix} 1\\1\\2 \end{bmatrix}\). Since
\[
\frac{1}{\left\| \begin{bmatrix} -1\\1\\0 \end{bmatrix} \right\|^2} = \frac{1}{2}, \quad
\frac{1}{\left\| \begin{bmatrix} -1\\-1\\1 \end{bmatrix} \right\|^2} = \frac{1}{3}, \quad
\frac{7}{\left\| \begin{bmatrix} 1\\1\\2 \end{bmatrix} \right\|^2} = \frac{7}{6},
\]
the spectral decomposition of A is
\[
\begin{bmatrix} 2 & 1 & 2\\ 1 & 2 & 2\\ 2 & 2 & 5 \end{bmatrix}
= \frac{1}{2} \begin{bmatrix} -1\\1\\0 \end{bmatrix} \begin{bmatrix} -1 & 1 & 0 \end{bmatrix}
+ \frac{1}{3} \begin{bmatrix} -1\\-1\\1 \end{bmatrix} \begin{bmatrix} -1 & -1 & 1 \end{bmatrix}
+ \frac{7}{6} \begin{bmatrix} 1\\1\\2 \end{bmatrix} \begin{bmatrix} 1 & 1 & 2 \end{bmatrix}.
\]
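A spectral decomposition can be confirmed exactly by summing the scaled projection matrices with rational arithmetic. A minimal Python sketch (the helper `outer` is ours, not from the book):

```python
# Verify the spectral decomposition of Example 7.2.18 with exact rational
# arithmetic.  (Illustrative sketch, not from the book.)
from fractions import Fraction as F

def outer(v):
    """The 3x3 matrix v v^T."""
    return [[v[i] * v[j] for j in range(3)] for i in range(3)]

terms = [(F(1, 2), outer([-1, 1, 0])),
         (F(1, 3), outer([-1, -1, 1])),
         (F(7, 6), outer([1, 1, 2]))]
S = [[sum(c * M[i][j] for c, M in terms) for j in range(3)]
     for i in range(3)]
print(S == [[2, 1, 2], [1, 2, 2], [2, 2, 5]])  # True
```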

Example 7.2.19. Find a 3 × 3 symmetric matrix A with the eigenvalues α, β and γ


such that
 
1
1 is an eigenvector of A corresponding to the eigenvalue α,
1
 
1
−2 is an eigenvector of A corresponding to the eigenvalue β.
1

Solution. Since
\[
\begin{bmatrix} 1\\1\\1 \end{bmatrix} \times \begin{bmatrix} 1\\-2\\1 \end{bmatrix}
= \begin{bmatrix} 3\\0\\-3 \end{bmatrix}
= 3 \begin{bmatrix} 1\\0\\-1 \end{bmatrix}
\]
and
\[
\left\| \begin{bmatrix} 1\\1\\1 \end{bmatrix} \right\|^2 = 3, \quad
\left\| \begin{bmatrix} 1\\-2\\1 \end{bmatrix} \right\|^2 = 6, \quad
\left\| \begin{bmatrix} 1\\0\\-1 \end{bmatrix} \right\|^2 = 2,
\]
we have
\[
A = \frac{\alpha}{3} \begin{bmatrix} 1\\1\\1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}
+ \frac{\beta}{6} \begin{bmatrix} 1\\-2\\1 \end{bmatrix} \begin{bmatrix} 1 & -2 & 1 \end{bmatrix}
+ \frac{\gamma}{2} \begin{bmatrix} 1\\0\\-1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -1 \end{bmatrix}
\]
\[
= \frac{\alpha}{3} \begin{bmatrix} 1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1 \end{bmatrix}
+ \frac{\beta}{6} \begin{bmatrix} 1 & -2 & 1\\ -2 & 4 & -2\\ 1 & -2 & 1 \end{bmatrix}
+ \frac{\gamma}{2} \begin{bmatrix} 1 & 0 & -1\\ 0 & 0 & 0\\ -1 & 0 & 1 \end{bmatrix}
\]
\[
= \begin{bmatrix}
\frac{\alpha}{3}+\frac{\beta}{6}+\frac{\gamma}{2} & \frac{\alpha}{3}-\frac{\beta}{3} & \frac{\alpha}{3}+\frac{\beta}{6}-\frac{\gamma}{2} \\[4pt]
\frac{\alpha}{3}-\frac{\beta}{3} & \frac{\alpha}{3}+\frac{2\beta}{3} & \frac{\alpha}{3}-\frac{\beta}{3} \\[4pt]
\frac{\alpha}{3}+\frac{\beta}{6}-\frac{\gamma}{2} & \frac{\alpha}{3}-\frac{\beta}{3} & \frac{\alpha}{3}+\frac{\beta}{6}+\frac{\gamma}{2}
\end{bmatrix}.
\]
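The resulting formula can be sanity-checked by plugging in concrete eigenvalues and verifying Av = λv for each eigenvector. In the sketch below the values α = 3, β = 6, γ = 2 are sample values chosen by us so that all entries are integers (the helpers are ours, not from the book):

```python
# Sanity-check the formula of Example 7.2.19 with sample eigenvalues,
# verifying A v = lambda v for each eigenvector.
# (Illustrative sketch, not from the book.)
from fractions import Fraction as F

def outer(v):
    return [[v[i] * v[j] for j in range(3)] for i in range(3)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

alpha, beta, gamma = 3, 6, 2                  # assumed sample eigenvalues
data = [([1, 1, 1], alpha, 3),                # (eigenvector, eigenvalue, ||v||^2)
        ([1, -2, 1], beta, 6),
        ([1, 0, -1], gamma, 2)]
A = [[sum(F(lam, n2) * outer(v)[i][j] for v, lam, n2 in data)
      for j in range(3)] for i in range(3)]
print(all(matvec(A, v) == [lam * x for x in v] for v, lam, _ in data))  # True
```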

The QR factorization of a 3 × 3 matrix


In Chapter 3 we introduced the QR factorization for 2 × 2 matrices and in Chapter 4
for 3 × 2 matrices. Here we present an extension of the idea to 3 × 3 matrices.

Theorem 7.2.20. If a 3 × 3 matrix A = [c1 c2 c3] has linearly independent
columns, then A can be represented in the form

A = QR,

where Q is a 3 × 3 orthogonal matrix and R is an upper triangular 3 × 3 matrix,
that is,
\[
R = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ 0 & r_{22} & r_{23} \\ 0 & 0 & r_{33} \end{bmatrix}
\]
with r11 > 0, r22 > 0, and r33 > 0.

Proof. First we define
\[
v_1 = c_1, \qquad
v_2 = c_2 - \operatorname{proj}_{v_1} c_2 = c_2 - \frac{c_2 \cdot v_1}{v_1 \cdot v_1} v_1,
\]
\[
v_3 = c_3 - \operatorname{proj}_{\operatorname{Span}\{v_1, v_2\}} c_3
= c_3 - \operatorname{proj}_{v_1} c_3 - \operatorname{proj}_{v_2} c_3
= c_3 - \frac{c_3 \cdot v_1}{v_1 \cdot v_1} v_1 - \frac{c_3 \cdot v_2}{v_2 \cdot v_2} v_2.
\]

Note that the vectors v1 , v2 , and v3 are nonzero vectors (v2 is nonzero because the
vectors c1 , c2 are linearly independent and v3 is nonzero because the vectors c1 , c2 , c3
are linearly independent),

v1 • v2 = v1 • v3 = v2 • v3 = 0,

and
\[
c_2 = \frac{c_2 \cdot v_1}{v_1 \cdot v_1} v_1 + v_2
\qquad \text{and} \qquad
c_3 = \frac{c_3 \cdot v_1}{v_1 \cdot v_1} v_1 + \frac{c_3 \cdot v_2}{v_2 \cdot v_2} v_2 + v_3.
\]
Now we normalize the vectors v1 , v2 , v3 :
1 1 1
u1 = v1 , u2 = v2 , and u3 = v3
kv1 k kv2 k kv3 k

and denote
\[
r_{11} = \|v_1\|, \quad
r_{12} = \|v_1\| \frac{c_2 \cdot v_1}{v_1 \cdot v_1}, \quad
r_{13} = \|v_1\| \frac{c_3 \cdot v_1}{v_1 \cdot v_1}, \quad
r_{22} = \|v_2\|, \quad
r_{23} = \|v_2\| \frac{c_3 \cdot v_2}{v_2 \cdot v_2}, \quad
r_{33} = \|v_3\|.
\]

Then we have
\[
c_1 = r_{11} u_1, \quad
c_2 = r_{12} u_1 + r_{22} u_2, \quad
c_3 = r_{13} u_1 + r_{23} u_2 + r_{33} u_3,
\]
and consequently
\[
\begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix}
= \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} \\ 0 & r_{22} & r_{23} \\ 0 & 0 & r_{33} \end{bmatrix}.
\]
Note that r11 > 0, r22 > 0, and r33 > 0.

The method used in the proof of the above result to construct orthogonal vectors
is called the Gram–Schmidt process.

 
1 0 1
Example 7.2.21. Determine the QR factorization of the matrix A = 1 1 0.
1 1 1

Solution. From the equality
\[
\begin{bmatrix} 0\\1\\1 \end{bmatrix}
- \frac{\begin{bmatrix} 0\\1\\1 \end{bmatrix} \cdot \begin{bmatrix} 1\\1\\1 \end{bmatrix}}
       {\begin{bmatrix} 1\\1\\1 \end{bmatrix} \cdot \begin{bmatrix} 1\\1\\1 \end{bmatrix}}
\begin{bmatrix} 1\\1\\1 \end{bmatrix}
= \begin{bmatrix} 0\\1\\1 \end{bmatrix} - \frac{2}{3} \begin{bmatrix} 1\\1\\1 \end{bmatrix}
= \frac{1}{3} \begin{bmatrix} -2\\1\\1 \end{bmatrix}
\]
we get
\[
\begin{bmatrix} 0\\1\\1 \end{bmatrix}
= \frac{2}{3} \begin{bmatrix} 1\\1\\1 \end{bmatrix} + \frac{1}{3} \begin{bmatrix} -2\\1\\1 \end{bmatrix}
\tag{7.3}
\]
and from the equality
\[
\begin{bmatrix} 1\\0\\1 \end{bmatrix}
- \left( \frac{2}{3} \begin{bmatrix} 1\\1\\1 \end{bmatrix} - \frac{1}{6} \begin{bmatrix} -2\\1\\1 \end{bmatrix} \right)
= -\frac{1}{2} \begin{bmatrix} 0\\1\\-1 \end{bmatrix}
\]
we get
\[
\begin{bmatrix} 1\\0\\1 \end{bmatrix}
= \frac{2}{3} \begin{bmatrix} 1\\1\\1 \end{bmatrix} - \frac{1}{6} \begin{bmatrix} -2\\1\\1 \end{bmatrix} - \frac{1}{2} \begin{bmatrix} 0\\1\\-1 \end{bmatrix}
= \frac{2}{3} \begin{bmatrix} 1\\1\\1 \end{bmatrix} - \frac{1}{6} \begin{bmatrix} -2\\1\\1 \end{bmatrix} + \frac{1}{2} \begin{bmatrix} 0\\-1\\1 \end{bmatrix}.
\tag{7.4}
\]

With a slight modification of the method in the proof of Theorem 7.2.20 we let
\[
v_1 = \begin{bmatrix} 1\\1\\1 \end{bmatrix}, \quad
v_2 = \begin{bmatrix} -2\\1\\1 \end{bmatrix}, \quad
v_3 = \begin{bmatrix} 0\\-1\\1 \end{bmatrix}.
\]
We have taken \(v_3 = \begin{bmatrix} 0\\-1\\1 \end{bmatrix}\) and not \(v_3 = \begin{bmatrix} 0\\1\\-1 \end{bmatrix}\) because the coefficient of
the vector v3 in (7.4) must be positive. Similarly, if the coefficient of the
vector v2 in (7.3) were negative, we would have to replace v2 by −v2 .
Since ° ° ° ° ° °
° 1 ° ° −2 ° ° 0 °
° ° p ° ° p ° ° p
°1° = 3, ° 1° = 6, and °−1° = 2,
° ° ° ° ° °
° 1 ° ° 1 ° ° 1 °

we let      
1 −2 0
1   1   1  
u1 = p 1 , u2 = p 1 , and u3 = p −1 .
3 1 6 1 2 1
Consequently,
\[
\begin{bmatrix} 1\\1\\1 \end{bmatrix} = \sqrt{3}\, u_1, \qquad
\begin{bmatrix} 0\\1\\1 \end{bmatrix} = \frac{2\sqrt{3}}{3} u_1 + \frac{\sqrt{6}}{3} u_2, \qquad
\begin{bmatrix} 1\\0\\1 \end{bmatrix} = \frac{2\sqrt{3}}{3} u_1 - \frac{\sqrt{6}}{6} u_2 + \frac{\sqrt{2}}{2} u_3.
\]

Now we define
$$Q = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}.$$
Since the matrix $Q$ is an orthogonal matrix, from the equality $A = QR$ we get
$$Q^T A = Q^T Q R = R.$$

Hence p p p 
2 3 2 3
3 3
p 3
p
¤T
R = Q T A = u1 u2 6
− p66  .
£
u3 A =  0
 
3
2
0 0 2

This means that the QR factorization of the matrix A is


p p p 
2 3 2 3
3 3
p 3
p
6
− p66  .
£ ¤
A = u1 u2 u3  0

3
2
0 0 2
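The factorization can also be checked numerically. The following pure-Python sketch (the helper names are ours, not the book's) runs the Gram-Schmidt process from the proof of Theorem 7.2.20 on the columns of the matrix $A$ of Example 7.2.21 and rebuilds the factors $Q$ and $R$:

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt_qr(cols):
    """QR of a matrix given by its columns, via classical Gram-Schmidt.

    Mirrors the construction in the proof of Theorem 7.2.20: the v_j are
    the orthogonal vectors, the u_j their normalizations, and the entry
    R[i][j] is the coefficient of u_i in the expansion of column c_j."""
    n = len(cols)
    v = []  # orthogonal, not yet normalized, vectors
    for c in cols:
        w = list(c)
        for vi in v:
            coef = dot(c, vi) / dot(vi, vi)
            w = [wk - coef * vik for wk, vik in zip(w, vi)]
        v.append(w)
    u = [[x / math.sqrt(dot(vj, vj)) for x in vj] for vj in v]
    R = [[dot(cols[j], u[i]) if i <= j else 0.0 for j in range(n)]
         for i in range(n)]
    return u, R

# Columns of the matrix A from Example 7.2.21.
A_cols = [[1, 1, 1], [0, 1, 1], [1, 0, 1]]
Q_cols, R = gram_schmidt_qr(A_cols)
```

Because each $r_{i,i}$ is a norm, the diagonal of $R$ comes out positive automatically, and the computed entries agree with the matrix $R$ found above up to rounding.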

7.2.1 Exercises

Orthogonally diagonalize the given matrix.


   
1. $\begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & 2 \\ 2 & 2 & 5 \end{bmatrix}$

2. $\begin{bmatrix} 1 & 4 & -2 \\ 4 & 1 & -2 \\ -2 & -2 & 7 \end{bmatrix}$

3. $\begin{bmatrix} 13 & -4 & 1 \\ -4 & 10 & -4 \\ 1 & -4 & 13 \end{bmatrix}$

4. $\begin{bmatrix} 1 & 3 & 4 \\ 3 & 1 & 4 \\ 4 & 4 & 8 \end{bmatrix}$

5. $\begin{bmatrix} 3 & 2 & 4 \\ 2 & 3 & 4 \\ 4 & 4 & 9 \end{bmatrix}$

6. $\begin{bmatrix} 2 & 4 & 2 \\ 4 & 17 & 8 \\ 2 & 8 & 5 \end{bmatrix}$

7. $\begin{bmatrix} 27 & 9 & 9 \\ 9 & 3 & 3 \\ 9 & 3 & 3 \end{bmatrix}$

8. $\begin{bmatrix} -2 & 6 & 3 \\ 6 & 3 & -2 \\ 3 & -2 & 6 \end{bmatrix}$

9. $\begin{bmatrix} 6 & -2 & 10 \\ -2 & 9 & 5 \\ 10 & 5 & -15 \end{bmatrix}$

10. $\begin{bmatrix} 9 & 0 & -3 \\ 0 & 6 & 0 \\ -3 & 0 & 9 \end{bmatrix}$

Determine the spectral decomposition of the given matrix.


   
11. $\begin{bmatrix} 13 & -4 & 1 \\ -4 & 10 & -4 \\ 1 & -4 & 13 \end{bmatrix}$

12. $\begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & 2 \\ 2 & 2 & 5 \end{bmatrix}$

13. $\begin{bmatrix} 6 & -2 & 10 \\ -2 & 9 & 5 \\ 10 & 5 & -15 \end{bmatrix}$

14. $\begin{bmatrix} 27 & 9 & 9 \\ 9 & 3 & 3 \\ 9 & 3 & 3 \end{bmatrix}$
 
15. Find a $3 \times 3$ matrix $A$ such that $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ is an eigenvector of the matrix $A$ corresponding to the eigenvalue 9, $\begin{bmatrix} 4 \\ -5 \\ 1 \end{bmatrix}$ is an eigenvector of the matrix $A$ corresponding to the eigenvalue 30, and $\begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix}$ is an eigenvector of the matrix $A$ corresponding to the eigenvalue 28.
 
16. Find a $3 \times 3$ matrix $A$ such that $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ is an eigenvector of the matrix $A$ corresponding to the eigenvalue 8, $\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$ is an eigenvector of the matrix $A$ corresponding to the eigenvalue 3, and $\begin{bmatrix} 1 \\ -1 \\ -2 \end{bmatrix}$ is an eigenvector of the matrix $A$ corresponding to the eigenvalue 24.
 
17. Find a symmetric $3 \times 3$ matrix $A$ with eigenvalues 4 and 33 such that $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix}$ are eigenvectors of the matrix $A$ corresponding to the eigenvalue 33.

 
18. Find a symmetric $3 \times 3$ matrix $A$ with eigenvalues 4 and 9 such that $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$ are eigenvectors of the matrix $A$ corresponding to the eigenvalue 4.

Determine the QR-factorization of the given matrix.

   
19. $\begin{bmatrix} 1 & 2 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$

20. $\begin{bmatrix} 1 & 1 & 1 \\ -1 & 0 & 1 \\ 0 & 1 & -1 \end{bmatrix}$

21. Let $A$ be a symmetric $3 \times 3$ matrix. Show that if $\det(A - \lambda I) = -(\lambda - \alpha)^3$ for some real number $\alpha$, then $A = \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & \alpha \end{bmatrix}$.

Chapter 8

Applications to geometry

In Chapters 3 and 4 we discussed vector lines in R2 and vector lines and vector planes
in R3 . Such lines and planes always contain the origin. In this chapter we generalize
those considerations to general lines and planes. The proofs in this chapter have a
more geometrical flavor and remind us of the classical presentations of analytical
geometry, but are compatible with the proofs given so far in this book. This chapter gives us more opportunities to use the concepts of linear algebra and to understand the connections between geometry and linear algebra.
When discussing geometry we often call elements of R2 or R3 points rather than
vectors. There is no mathematical difference between points and vectors, but in the
context of geometry it is often more intuitive to talk about points.

8.1 Lines in R2

Definition 8.1.1. Let u and a be vectors in R2 . If u is different from the origin, then the set
of all points of the form a + t u, where t is an arbitrary real number, will be
called a line and denoted by a + Span{u}, that is,

a + Span{u} = {a + t u, t in R} .

We say that a + Span{u} is a line that contains the point a and is parallel to the
vector line Span{u}. Note that a line a + Span{u} is a vector line if and only if a = 0. In
the definition of lines we have to assume that u is different from the origin, because
otherwise Span{u} would not be a line, but a point.
Now let a and u be two points in R2 such that u is different from the origin. Con-
sider points of the form a + t u for different values of t , see Figure 8.1. Observe that
the points a + t u lie on the line through a that is parallel to the line which contains
u and the origin, that is, the vector line Span{u}. If t takes all real values, then we
obtain the entire line through a that is parallel to the vector line Span{u}.

355

Figure 8.1: The line a + Span{u}, showing the points a + tu for several values of t.

Example 8.1.2. The line through $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$ and parallel to the vector line $\operatorname{Span}\left\{\begin{bmatrix} 8 \\ -2 \end{bmatrix}\right\}$ is
$$\begin{bmatrix} -1 \\ 1 \end{bmatrix} + \operatorname{Span}\left\{\begin{bmatrix} 8 \\ -2 \end{bmatrix}\right\} = \left\{\begin{bmatrix} -1 + 8t \\ 1 - 2t \end{bmatrix} : t \text{ in } \mathbb{R}\right\}.$$

As in the case of vector lines, we adopt the convention that when we say “a line
a + Span{u},” we always implicitly assume that u 6= 0.

Figure 8.2: An illustration of Theorem 8.1.3.

Theorem 8.1.3. Let u and a be vectors of R2 such that u is different from the
origin. Then, for every x in R2 ,

x is on the line a + Span{u} if and only if (x − a) • ux = 0.

Proof. The proof is similar to the proof of Theorem 4.2.8.



If $x = a + tu$ for some $t$ in $\mathbb{R}$, then
$$(x - a) \cdot u^x = ((a + tu) - a) \cdot u^x = t\,u \cdot u^x = 0.$$
Conversely, if $(x - a) \cdot u^x = 0$, then $x - a = s(u^x)^x = -su$ for some $s$ in $\mathbb{R}$, by Theorem 3.2.22. Hence $x = a + (-s)u$, which means that $x$ is on the line $a + \operatorname{Span}\{u\}$.

Corollary 8.1.4. Let u and a be vectors of R2 such that u is different from the origin. Then, for every x in R2,

x is on the line a + Span{u} if and only if $\det \begin{bmatrix} x - a & u \end{bmatrix} = 0$.

Proof. Since $(x - a) \cdot u^x = \det \begin{bmatrix} u & x - a \end{bmatrix} = -\det \begin{bmatrix} x - a & u \end{bmatrix}$, Theorem 8.1.3 implies that $x$ is on the line $a + \operatorname{Span}\{u\}$ if and only if $\det \begin{bmatrix} x - a & u \end{bmatrix} = 0$.

Corollary 8.1.5. Let n and a be vectors of R2 such that n is different from the
origin. Then, for every x in R2 ,

(x − a) • n = 0 if and only if x is on the line a + Span{nx }.

Proof. By Theorem 8.1.3, the vector $x$ is on the line $a + \operatorname{Span}\{n^x\}$ if and only if it satisfies the equation $(x - a) \cdot (n^x)^x = 0$, which is equivalent to the equation $(x - a) \cdot n = 0$, since $(n^x)^x = -n$.

Figure 8.3: Line through the point a and orthogonal to the vector line Span{n}.

Definition 8.1.6. Let n be a nonzero element in R2 and a a point in R2 . The


set of all x such that
(x − a) · n = 0
is called the line through the point a and orthogonal to the vector line
Span{n}.

Example 8.1.7. Write the equation 2x + 5y = 3 in the form (x − a) • n = 0.

Solution. Since the equation 2x + 5y = 3 can be written as


$$\begin{bmatrix} x \\ y \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix} = 3,$$

we can take $n = \begin{bmatrix} 2 \\ 5 \end{bmatrix}$. Next we notice that

$$3 = \begin{bmatrix} -1 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix},$$

so the equation $2x + 5y = 3$ can be written as

$$\begin{bmatrix} x \\ y \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix}$$

or

$$\left(\begin{bmatrix} x \\ y \end{bmatrix} - \begin{bmatrix} -1 \\ 1 \end{bmatrix}\right) \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix} = 0.$$

Consequently, the line defined by the equation $2x + 5y = 3$ can be described as the line through the point $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$ and orthogonal to the vector line $\operatorname{Span}\left\{\begin{bmatrix} 2 \\ 5 \end{bmatrix}\right\}$.
Note that the presented solution is not unique. For example, instead of $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$ we could use $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Since $2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix}$, the equation $2x + 5y = 3$ can be written as

$$\left(\begin{bmatrix} x \\ y \end{bmatrix} - \frac{3}{2}\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix} = 0.$$
A somewhat different solution is based on the observation that we have

$$3 = \frac{3}{\begin{bmatrix} 2 \\ 5 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix}}\begin{bmatrix} 2 \\ 5 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix} = \frac{3}{29}\begin{bmatrix} 2 \\ 5 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix}$$

and consequently the equation $2x + 5y = 3$ can be written as

$$\begin{bmatrix} x \\ y \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix} = \frac{3}{29}\begin{bmatrix} 2 \\ 5 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix}$$

or

$$\left(\begin{bmatrix} x \\ y \end{bmatrix} - \frac{3}{29}\begin{bmatrix} 2 \\ 5 \end{bmatrix}\right) \cdot \begin{bmatrix} 2 \\ 5 \end{bmatrix} = 0.$$

This approach may seem artificial, but this solution is special because, unlike in other solutions, the point $a = \frac{3}{29}\begin{bmatrix} 2 \\ 5 \end{bmatrix}$ is on the vector line $\operatorname{Span}\left\{\begin{bmatrix} 2 \\ 5 \end{bmatrix}\right\}$, see Fig. 8.4. From this point of view, the last approach is quite natural. It is often used in arguments where n is not known, for example, in the proof of the next theorem.

Figure 8.4: Different solutions in Example 8.1.7.

Theorem 8.1.8. Let n be a nonzero vector in R2 . The equation

x•n = d

defines a line in R2 for any real number d .

Proof. If n is a nonzero vector, then the equation

x•n = d

can be written as
$$\left(x - \frac{d}{n \cdot n}\,n\right) \cdot n = 0,$$
which is the equation of a line in $\mathbb{R}^2$ which contains the point $\frac{d}{n \cdot n}\,n$ and is orthogonal to the vector line $\operatorname{Span}\{n\}$.

Theorem 8.1.9. Let u and v be linearly independent vectors in R2 . For ar-


bitrary a and b in R2 the lines a + Span{u} and b + Span{v} have a unique
common point.

Figure 8.5: The unique common point of the lines a + Span{u} and b + Span{v}.

Proof. The common point of the lines a + Span{u} and b + Span{v} is given by the
equation
a + su = b + t v,
which is equivalent to
su − t v = b − a.
The existence and uniqueness of the numbers s and t satisfying this equation are a consequence of the fact that the set {u, v} is a basis of R2.

Example 8.1.10. We want to find the intersection point of the lines
$$\begin{bmatrix} 1 \\ 3 \end{bmatrix} + \operatorname{Span}\left\{\begin{bmatrix} -1 \\ 2 \end{bmatrix}\right\} \quad \text{and} \quad \begin{bmatrix} 2 \\ 0 \end{bmatrix} + \operatorname{Span}\left\{\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right\}.$$

Since $\det \begin{bmatrix} -1 & 1 \\ 2 & 1 \end{bmatrix} = -3 \ne 0$, the lines have a unique common point. To find this point we have to solve the equation
$$\begin{bmatrix} 1 \\ 3 \end{bmatrix} + s\begin{bmatrix} -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} + t\begin{bmatrix} 1 \\ 1 \end{bmatrix}. \tag{8.1}$$

We could use Theorem 1.3.10 or proceed as follows. We apply the dot product with $\begin{bmatrix} 1 \\ 1 \end{bmatrix}^x = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$ to both sides of the equation (8.1) and get
$$2 + 3s = -2.$$
Consequently, $s = -\frac{4}{3}$ and the point of intersection of the lines is
$$\begin{bmatrix} 1 \\ 3 \end{bmatrix} - \frac{4}{3}\begin{bmatrix} -1 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{7}{3} \\[0.5ex] \frac{1}{3} \end{bmatrix}.$$

Instead, we could apply the dot product with $-\begin{bmatrix} -1 \\ 2 \end{bmatrix}^x = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$ to both sides of the equation (8.1) and get
$$5 = 4 + 3t.$$
Hence $t = \frac{1}{3}$ and the point of intersection of the lines is
$$\begin{bmatrix} 2 \\ 0 \end{bmatrix} + \frac{1}{3}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{7}{3} \\[0.5ex] \frac{1}{3} \end{bmatrix}.$$

In both cases we got the same point of intersection, as expected.
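The computation in Example 8.1.10 amounts to solving the 2 × 2 system $su - tv = b - a$ from the proof of Theorem 8.1.9. A small sketch (the helper name is ours; Cramer's rule replaces the dot-product trick used above):

```python
def intersect_lines(a, u, b, v):
    """Unique common point of a + Span{u} and b + Span{v} in R^2
    (Theorem 8.1.9): solve s*u - t*v = b - a by Cramer's rule."""
    det = u[0] * v[1] - u[1] * v[0]  # det[u v], nonzero by assumption
    rx, ry = b[0] - a[0], b[1] - a[1]
    s = (rx * v[1] - ry * v[0]) / det
    return (a[0] + s * u[0], a[1] + s * u[1])

# The two lines of Example 8.1.10.
p = intersect_lines((1, 3), (-1, 2), (2, 0), (1, 1))
```

The result is the point (7/3, 1/3) found twice above.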

The uniqueness property in the following theorem plays an important role in


geometry.

Theorem 8.1.11. If a, b are in R2 and a 6= b, then there is a unique line which


contains both points a and b. That unique line can be described in any of the
following ways:

a + Span{b − a} = a + Span{a − b} = b + Span{b − a} = b + Span{a − b}.

Proof. If a line c + Span{u} contains the points a and b, then

a = c + su and b = c + t u,

for some real numbers s and t . Hence

b − a = (t − s)u

with s ≠ t. Consequently,

Span{b − a} = Span{u}

and, since a = c + su and su + Span{u} = Span{u},

a + Span{b − a} = c + su + Span{u} = c + Span{u}.

The equalities

a + Span{b − a} = a + Span{a − b} = b + Span{b − a} = b + Span{a − b}

follow from the fact that all the above lines contain a and b.

Corollary 8.1.12. If a, b are in R2 and a 6= b, then the vector x is on the line


which contains both points a and b if and only if
$$\det \begin{bmatrix} x - a & b - a \end{bmatrix} = 0.$$

Projections on lines in R2
Projections on vector lines in R2 were discussed in Chapter 3. It turns out that the
situation does not change much when we consider projections on arbitrary lines in
R2 . For example, the next theorem is almost identical to Theorem 3.2.13 in R2 .

Figure 8.6: The point minimizing the distance from q to the line a + Span{u}.

Theorem 8.1.13. Let a, q, and u be elements of R2 , where u is different from


the origin. There is a unique point p on the line a + Span{u} such that the
distance from q to p is the shortest distance from q to any point on the line
a + Span{u}. This point p is characterized as the point on the line a + Span{u}
satisfying the equation
(q − p) • u = 0.

Proof. Since

$$\|q - a - tu\|^2 = \|q - a\|^2 - 2t\,(q - a) \cdot u + t^2 \|u\|^2$$
$$= \|q - a\|^2 - \left(\frac{(q - a) \cdot u}{\|u\|}\right)^2 + \left(\frac{(q - a) \cdot u}{\|u\|}\right)^2 - 2t\,(q - a) \cdot u + t^2 \|u\|^2$$
$$= \|q - a\|^2 - \left(\frac{(q - a) \cdot u}{\|u\|}\right)^2 + \left(\frac{(q - a) \cdot u}{\|u\|} - t\|u\|\right)^2,$$

the distance $\|q - a - tu\|$ is minimized for the value of $t$ for which $\left(\frac{(q - a) \cdot u}{\|u\|} - t\|u\|\right)^2 = 0$. Solving for $t$ we get $t = \frac{(q - a) \cdot u}{\|u\|^2}$ and consequently
$$p = a + \frac{(q - a) \cdot u}{\|u\|^2}\,u.$$

It is easy to verify that for this p we have (q − p) • u = 0.


Now, if
(q − a − t u) • u = 0,

then we must have $t = \frac{(q - a) \cdot u}{\|u\|^2}$, which means that the point $p$ is characterized by the equation $(q - p) \cdot u = 0$.

Definition 8.1.14. Let a, q and u be elements of R2 , where u is different from


the origin.

(a) The unique point p on the line a+Span{u} that minimizes the distance
from q to the line a + Span{u} is called the projection of q on the line
a + Span{u}.

(b) By the distance from the point q to the line a + Span{u} we mean the
distance kq − pk between the point q and its projection p on the line
a + Span{u}.

Example 8.1.15. Calculate the projection of the point $q = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ on the line
$$\begin{bmatrix} 3 \\ -1 \end{bmatrix} + \operatorname{Span}\left\{\begin{bmatrix} 2 \\ 3 \end{bmatrix}\right\}.$$

Solution. The point
$$\begin{bmatrix} 3 \\ -1 \end{bmatrix} + t\begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 3 + 2t \\ -1 + 3t \end{bmatrix}$$
must satisfy the equation
$$\left(\begin{bmatrix} 3 + 2t \\ -1 + 3t \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \end{bmatrix}\right) \cdot \begin{bmatrix} 2 \\ 3 \end{bmatrix} = 0,$$
which is equivalent to the equation
$$2(2 + 2t) + 3(-3 + 3t) = 0.$$
Solving for $t$ we get $t = \frac{5}{13}$. Thus the projection is
$$\begin{bmatrix} 3 \\ -1 \end{bmatrix} + \frac{5}{13}\begin{bmatrix} 2 \\ 3 \end{bmatrix} = \frac{1}{13}\begin{bmatrix} 49 \\ 2 \end{bmatrix}.$$
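The proof of Theorem 8.1.13 gives the closed formula $p = a + \frac{(q-a) \cdot u}{u \cdot u}\,u$ for the projection. A brief sketch checking Example 8.1.15 with it (the helper name is ours):

```python
def project_on_line(q, a, u):
    """Projection of q on the line a + Span{u} (proof of Theorem 8.1.13):
    p = a + ((q - a) . u / (u . u)) u.  Works verbatim in R^2 and R^3."""
    t = sum((qi - ai) * ui for qi, ai, ui in zip(q, a, u)) \
        / sum(ui * ui for ui in u)
    return tuple(ai + t * ui for ai, ui in zip(a, u))

# Example 8.1.15: q = (1, 2), line (3, -1) + Span{(2, 3)}.
p = project_on_line((1, 2), (3, -1), (2, 3))
```

The same function applies unchanged to lines in R3 (Theorem 8.2.5), since the formula involves only dot products.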

Recall that the equation (x − q) • u = 0 describes a line through the point q. Using
this interpretation we can rephrase Theorem 8.1.13 in terms of intersecting lines.

Corollary 8.1.16. Let u be a nonzero vector in R2 and let q be an arbitrary point in R2. The projection of q on the line a + Span{u} is the intersection of the line a + Span{u} and the line (x − q) • u = 0.

Figure 8.7: The projection of q on the line a + Span{u}.

The next theorem gives us a practical method for calculating the projection of a
point on a line given by the equation x • n = d .

Theorem 8.1.17. Let n be a nonzero vector in R2 and let q be an arbitrary


point in R2 . The lines x • n = d and q + Span{n} have a unique common point
p. This point p is the projection of the point q on the line x • n = d .

Proof. Every point on the line q+Span{n} is of the form q+ t n for some real number
t . A point q + t n is on the line x • n = d if

$$(q + tn) \cdot n = d.$$

Solving for $t$ gives us
$$t = \frac{d - q \cdot n}{n \cdot n}.$$
This means that the line $q + \operatorname{Span}\{n\}$ and the line $x \cdot n = d$ have a unique common point and that point is
$$p = q + \frac{d - q \cdot n}{n \cdot n}\,n.$$
If $x$ is an arbitrary point on the line $x \cdot n = d$, then we have
$$(q - p) \cdot (x - p) = \frac{-d + q \cdot n}{n \cdot n}\,n \cdot (x - p) = \frac{-d + q \cdot n}{n \cdot n}\,(n \cdot x - n \cdot p) = \frac{-d + q \cdot n}{n \cdot n}\,(d - d) = 0.$$

Hence
$$\|q - x\|^2 = \|q - p\|^2 - 2(q - p) \cdot (x - p) + \|x - p\|^2 = \|q - p\|^2 + \|x - p\|^2.$$
This means that
$$\|q - x\|^2 \ge \|q - p\|^2$$
for every $x$ on the line $x \cdot n = d$. Moreover,
$$\|q - x\|^2 > \|q - p\|^2$$
for every $x$ on the line $x \cdot n = d$ that is different from $p$. In other words, the distance from $q$ to $p$ is the shortest distance from $q$ to any point on the line $x \cdot n = d$.

Example 8.1.18. Find the projection of the point $q = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ on the line $x + 2y = 5$.

Solution. The line $x + 2y = 5$ can be written as $x \cdot \begin{bmatrix} 1 \\ 2 \end{bmatrix} = 5$ where $x = \begin{bmatrix} x \\ y \end{bmatrix}$. The projection is
$$p = q + \frac{d - q \cdot n}{n \cdot n}\,n = \begin{bmatrix} 2 \\ 3 \end{bmatrix} + \frac{5 - \begin{bmatrix} 2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \end{bmatrix}}{\begin{bmatrix} 1 \\ 2 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \end{bmatrix}}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix} - \frac{3}{5}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{7}{5} \\[0.5ex] \frac{9}{5} \end{bmatrix}.$$
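The formula of Theorem 8.1.17, $p = q + \frac{d - q \cdot n}{n \cdot n}\,n$, can be coded directly; this sketch (our naming) reproduces Example 8.1.18:

```python
def project_on_line_eq(q, n, d):
    """Projection of q on the line x . n = d (Theorem 8.1.17):
    p = q + ((d - q . n) / (n . n)) n."""
    t = (d - sum(qi * ni for qi, ni in zip(q, n))) \
        / sum(ni * ni for ni in n)
    return tuple(qi + t * ni for qi, ni in zip(q, n))

# Example 8.1.18: q = (2, 3), line x + 2y = 5, i.e. n = (1, 2), d = 5.
p = project_on_line_eq((2, 3), (1, 2), 5)
```

With 3-component tuples the same function computes the projection on a plane x • n = d in R3 (Theorem 8.2.14), since the formula is dimension-independent.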

The following theorem gives us a useful formula for calculating the distance from
a point to a line.

Theorem 8.1.19. Let n be a nonzero vector in R2 and let q be an arbitrary point in R2. The distance from the point q to the line x • n = d is
$$\frac{|d - q \cdot n|}{\|n\|}.$$

Proof. Since, as shown in the proof of Theorem 8.1.17, the projection of the point q on the line x • n = d is
$$p = q + \frac{d - q \cdot n}{n \cdot n}\,n,$$
we have
$$\|q - p\| = \left\|\frac{d - q \cdot n}{n \cdot n}\,n\right\| = \frac{|d - q \cdot n|}{\|n\|^2}\,\|n\| = \frac{|d - q \cdot n|}{\|n\|}.$$

Example 8.1.20. Find the distance from the point $\begin{bmatrix} 2 \\ 5 \end{bmatrix}$ to the line $x \cdot \begin{bmatrix} 1 \\ -1 \end{bmatrix} = 3$.

Solution. In our case $q = \begin{bmatrix} 2 \\ 5 \end{bmatrix}$, $n = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, and $d = 3$. Consequently, the distance is
$$\frac{|d - q \cdot n|}{\|n\|} = \frac{\left|3 - \begin{bmatrix} 2 \\ 5 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ -1 \end{bmatrix}\right|}{\left\|\begin{bmatrix} 1 \\ -1 \end{bmatrix}\right\|} = \frac{6}{\sqrt{2}}.$$
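Theorem 8.1.19 turns the distance computation into a single line of arithmetic. A minimal sketch (hypothetical helper) applied to Example 8.1.20:

```python
import math

def dist_to_line_eq(q, n, d):
    """Distance from q to the line x . n = d (Theorem 8.1.19):
    |d - q . n| / ||n||."""
    return abs(d - sum(qi * ni for qi, ni in zip(q, n))) \
        / math.sqrt(sum(ni * ni for ni in n))

# Example 8.1.20: q = (2, 5), n = (1, -1), d = 3.
dist = dist_to_line_eq((2, 5), (1, -1), 3)
```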

Theorem 8.1.21. The distance from a point q to the line a + Span{u} in R2 is
$$\frac{\left|\det \begin{bmatrix} q - a & u \end{bmatrix}\right|}{\|u\|}.$$

Proof. First we observe that for any vectors v and w in R2 we have
$$\left|\det \begin{bmatrix} v & w \end{bmatrix}\right|^2 = \|v\|^2 \|w\|^2 - (v \cdot w)^2,$$
which can be verified by direct calculations. Using the above identity and Theorem 8.1.13 we obtain
$$\|q - p\| = \sqrt{\left\|q - a - \frac{(q - a) \cdot u}{u \cdot u}\,u\right\|^2} = \sqrt{\frac{\|q - a\|^2 \|u\|^2 - ((q - a) \cdot u)^2}{\|u\|^2}} = \frac{\sqrt{\|q - a\|^2 \|u\|^2 - ((q - a) \cdot u)^2}}{\|u\|} = \frac{\left|\det \begin{bmatrix} q - a & u \end{bmatrix}\right|}{\|u\|}.$$

Example 8.1.22. Find the distance from the point $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ to the line $\begin{bmatrix} 2 \\ 3 \end{bmatrix} + \operatorname{Span}\left\{\begin{bmatrix} 1 \\ -1 \end{bmatrix}\right\}$.

Solution. Since in this case $q = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, $a = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, and $u = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, the distance is
$$\frac{\left|\det \begin{bmatrix} q - a & u \end{bmatrix}\right|}{\|u\|} = \frac{\left|\det \begin{bmatrix} -1 & 1 \\ -1 & -1 \end{bmatrix}\right|}{\left\|\begin{bmatrix} 1 \\ -1 \end{bmatrix}\right\|} = \frac{2}{\sqrt{2}} = \sqrt{2}.$$
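The determinant formula of Theorem 8.1.21 is equally mechanical; this sketch (the helper name is ours) checks Example 8.1.22:

```python
import math

def dist_to_line_span(q, a, u):
    """Distance from q to the line a + Span{u} in R^2 (Theorem 8.1.21):
    |det[q - a  u]| / ||u||."""
    wx, wy = q[0] - a[0], q[1] - a[1]
    det = wx * u[1] - wy * u[0]
    return abs(det) / math.hypot(u[0], u[1])

# Example 8.1.22: q = (1, 2), a = (2, 3), u = (1, -1).
dist = dist_to_line_span((1, 2), (2, 3), (1, -1))
```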

Theorem 8.1.23. The distance from a point q to the line through two distinct points a and b in R2 is
$$\frac{\left|\det \begin{bmatrix} q - a & q - b \end{bmatrix}\right|}{\|b - a\|}.$$

Proof. The line through two distinct points a and b is a + Span{b − a}. From Theorem 8.1.21, the distance from q to the line a + Span{b − a} is
$$\frac{\left|\det \begin{bmatrix} q - a & b - a \end{bmatrix}\right|}{\|b - a\|}.$$

Since
$$\det \begin{bmatrix} q - a & b - a \end{bmatrix} = \det \begin{bmatrix} q - a & b - q + q - a \end{bmatrix} = \det \begin{bmatrix} q - a & b - q \end{bmatrix} + \det \begin{bmatrix} q - a & q - a \end{bmatrix} = \det \begin{bmatrix} q - a & b - q \end{bmatrix} = -\det \begin{bmatrix} q - a & q - b \end{bmatrix},$$

the distance can also be written as
$$\frac{\left|\det \begin{bmatrix} q - a & q - b \end{bmatrix}\right|}{\|b - a\|}.$$

Example 8.1.24. Find the distance from the point $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ to the line through the points $\begin{bmatrix} -1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$.

Solution. Since in this case we have $q = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $a = \begin{bmatrix} -1 \\ 2 \end{bmatrix}$, and $b = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, the distance is
$$\frac{\left|\det \begin{bmatrix} q - a & q - b \end{bmatrix}\right|}{\|b - a\|} = \frac{\left|\det \begin{bmatrix} 2 & -1 \\ -2 & -3 \end{bmatrix}\right|}{\left\|\begin{bmatrix} 3 \\ 1 \end{bmatrix}\right\|} = \frac{8}{\sqrt{10}}.$$

Corollary 8.1.25. Let a, b and q be vectors in R2 such that the vectors q − a and q − b are linearly independent. The area of the triangle qab is
$$\frac{1}{2}\left|\det \begin{bmatrix} q - a & q - b \end{bmatrix}\right|.$$

Figure 8.8: The triangle qab; its height is the distance from q to the line through a and b.

Proof. Since the area of a triangle is
$$A = \frac{1}{2} \cdot (\text{the length of the base}) \cdot (\text{the height}),$$
we have
$$A = \frac{1}{2} \cdot \|b - a\| \cdot \frac{\left|\det \begin{bmatrix} q - a & q - b \end{bmatrix}\right|}{\|b - a\|} = \frac{1}{2}\left|\det \begin{bmatrix} q - a & q - b \end{bmatrix}\right|.$$
2 kb − ak 2

Example 8.1.26. Find the area of the triangle with vertices $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$, $\begin{bmatrix} 2 \\ 2 \end{bmatrix}$, and $\begin{bmatrix} -2 \\ 3 \end{bmatrix}$.

Solution. We can set $q = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, $a = \begin{bmatrix} 2 \\ 2 \end{bmatrix}$, and $b = \begin{bmatrix} -2 \\ 3 \end{bmatrix}$. Then the area of the triangle qab can be calculated as
$$\frac{1}{2}\left|\det \begin{bmatrix} q - a & q - b \end{bmatrix}\right| = \frac{1}{2}\left|\det \begin{bmatrix} -1 & 3 \\ -3 & -4 \end{bmatrix}\right| = \frac{13}{2}.$$

If we choose $q = \begin{bmatrix} 2 \\ 2 \end{bmatrix}$, $a = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, and $b = \begin{bmatrix} -2 \\ 3 \end{bmatrix}$, then for the area of the triangle qab we get
$$\frac{1}{2}\left|\det \begin{bmatrix} q - a & q - b \end{bmatrix}\right| = \frac{1}{2}\left|\det \begin{bmatrix} 1 & 4 \\ 3 & -1 \end{bmatrix}\right| = \frac{13}{2}.$$

As expected, the answers are the same.
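Corollary 8.1.25 gives the familiar "half the absolute determinant" rule for triangle areas. A short sketch (hypothetical helper) confirming Example 8.1.26, including the independence of the choice of q:

```python
def triangle_area(q, a, b):
    """Area of the triangle qab (Corollary 8.1.25):
    (1/2) |det[q - a  q - b]|."""
    w1 = (q[0] - a[0], q[1] - a[1])
    w2 = (q[0] - b[0], q[1] - b[1])
    return abs(w1[0] * w2[1] - w1[1] * w2[0]) / 2

# Example 8.1.26, with two different choices of q.
area1 = triangle_area((1, -1), (2, 2), (-2, 3))
area2 = triangle_area((2, 2), (1, -1), (-2, 3))
```

Both calls return 13/2, matching the computation above.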

8.1.1 Exercises

1. Find an equation of the line which contains the point $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ and is orthogonal to the vector line $\operatorname{Span}\left\{\begin{bmatrix} 2 \\ 1 \end{bmatrix}\right\}$.

2. Find an equation of the line which contains the point $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and is orthogonal to the vector line $\operatorname{Span}\left\{\begin{bmatrix} 2 \\ 3 \end{bmatrix}\right\}$.

3. Write the equation of the line $\begin{bmatrix} 3 \\ 1 \end{bmatrix} + \operatorname{Span}\left\{\begin{bmatrix} -2 \\ 3 \end{bmatrix}\right\}$ in the form $ax + by = c$.

4. Write the equation of the line $\begin{bmatrix} 2 \\ -1 \end{bmatrix} + \operatorname{Span}\left\{\begin{bmatrix} 1 \\ 3 \end{bmatrix}\right\}$ in the form $ax + by = c$.

5. Write the equation 3x + y = 5 in the form (x − a) • n = 0.

6. Write the equation x − 2y = 1 in the form (x − a) • n = 0.


7. Find the projection of $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ on the line $a + \operatorname{Span}\{u\}$ where $a = \begin{bmatrix} 0 \\ 2 \end{bmatrix}$ and $u = \begin{bmatrix} -3 \\ 1 \end{bmatrix}$.

8. Find the projection of $\begin{bmatrix} 2 \\ -1 \end{bmatrix}$ on the line $a + \operatorname{Span}\{u\}$ where $a = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $u = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$.

9. Find the projection of $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$ on the line $x + y = -1$.

10. Find the projection of $\begin{bmatrix} 1 \\ 4 \end{bmatrix}$ on the line $x - y = 2$.

11. Find the area of the triangle with the vertices $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 2 \\ -1 \end{bmatrix}$, and $\begin{bmatrix} 4 \\ 0 \end{bmatrix}$.

12. Find the area of the triangle with the vertices $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 2 \end{bmatrix}$, and $\begin{bmatrix} 3 \\ 5 \end{bmatrix}$.

13. Let $a = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$ and $b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$ be points in R2 such that $a \ne b$. Show that the point $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ is on the line containing the points a and b if and only if
$$\det \begin{bmatrix} 1 & 1 & 1 \\ x_1 & a_1 & b_1 \\ x_2 & a_2 & b_2 \end{bmatrix} = 0.$$

8.2 Lines and planes in R3

Lines in R3
Here we gather some definitions and theorems about lines that are the same in R2 and in R3. Even the proofs presented for lines in R2 are valid in R3.

Definition 8.2.1. Let u, a be vectors in R3 . If u is different from the origin,


then the set of all points of the form a + t u, where t is an arbitrary real num-
ber, will be called a line and denoted by a + Span{u}, that is,

a + Span{u} = {a + t u, t in R} .

Theorem 8.2.2. If a, b are in R3 and a 6= b, then there is a unique line which


contains both points a and b. That unique line can be described in any of the
following ways:

a + Span{b − a} = a + Span{a − b} = b + Span{b − a} = b + Span{a − b}.



Theorem 8.2.3. Let a, q and u be elements of R3 , where u is different from the


origin. There is a unique point p on the line a+Span{u} such that the distance
from q to p is the shortest distance from q to any point on the line a+Span{u}.
This point p is characterized as the point on the line a + Span{u} satisfying the
equation
(q − p) • u = 0.

Definition 8.2.4. Let a, q and u be elements of R3 , where u is different from


the origin.

(a) The unique point p on the line a+Span{u} that minimizes the distance
from q to the line a + Span{u} is called the projection of q on the line
a + Span{u}.

(b) By the distance from the point q to the line a + Span{u} we mean the
distance kq − pk between the point q and its projection p on the line
a + Span{u}.

Theorem 8.2.5. Let a, q and u be elements of R3 , where u is different from the


origin. The projection of the point q on the line a + Span{u} is

$$p = a + \frac{(q - a) \cdot u}{u \cdot u}\,u.$$

Some theorems from Section 8.1 are not mentioned above, because of differences between R2 and R3. For example, the equation (x − a) • n = 0 does not define a line in R3, and the lines a + Span{u} and b + Span{v} need not intersect in R3 even when u and v are linearly independent.

Planes in R3

In this section we generalize some results on projections on vector planes in R3 pre-


sented in Chapter 4. These more general results can be easily obtained from similar
results in Chapter 4 or can be proved by obvious modification of proofs presented
there.

Definition 8.2.6. Let u, v, a be elements of R3 . If u and v are linearly inde-


pendent, then the set of all points of the form a + su + t v, where s and t are
arbitrary real numbers, will be called a plane and denoted by a + Span{u, v},
that is,
a + Span{u, v} = {a + su + t v, s, t in R} .

Theorem 8.2.7. Let u and v be linearly independent vectors in R3 . Then a


vector x is in the vector plane a + Span{u, v} if and only if

(x − a) • (u × v) = 0. (8.2)

Proof. The vector x = a + su + t v satisfies (8.2) for any real numbers s and t , by The-
orem 5.1.6.
If x satisfies (8.2), then there are real numbers s and t such that

x − a = su + t v,

by Theorem 5.1.7.

Theorem 8.2.8. If n is a vector in R3 different from the origin, then the equa-
tion
(x − a) • n = 0
defines a plane, that is, there are two linearly independent vectors u and v in
R3 such that (x − a) • n = 0 if and only if x is in a + Span{u, v}.

Proof. The proof is similar to the proof of Theorem 4.2.8.

The equation (x − a) • n = 0 defines a line in R2 . The same equation (x − a) • n = 0


defines a plane in R3 . From the point of view of geometry these are very different
objects. On the other hand, lines in R2 and planes in R3 share many algebraic prop-
erties.

Definition 8.2.9. Let n be a nonzero element in R3 and a a point in R3 . The


set of all x such that
(x − a) · n = 0
is called the plane through the point a and orthogonal to the vector line
Span{n}.

The equation of the plane that contains a point a and is orthogonal to the vector line Span{n} is the same as the equation of the plane which contains the projection of the point a on the vector line Span{n} and is orthogonal to the vector line Span{n}, see Fig. 8.9. In
other words, if p is the projection of the point a on the vector line Span{n}, then the
equations (x−p) • n = 0 and (x−a) • n = 0 are equivalent. This is a direct consequence
of the fact that (a − p) • n = 0 and x − a = p − a + x − p. This observation makes the
above definition more intuitive.

Figure 8.9: The plane that contains a and is orthogonal to the vector line Span{n}.

 
Example 8.2.10. The equation of the plane which contains the point $\begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix}$ and is orthogonal to the vector line $\operatorname{Span}\left\{\begin{bmatrix} 5 \\ 8 \\ 3 \end{bmatrix}\right\}$ is
$$\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix} - \begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix}\right) \cdot \begin{bmatrix} 5 \\ 8 \\ 3 \end{bmatrix} = 0.$$

This vector equation can be written in the form
$$5(x - 2) + 8(y - 1) + 3(z - 4) = 0$$
or
$$5x + 8y + 3z = 30.$$

Theorem 8.2.11. Let n be a nonzero vector in R3 . The equation

x•n = d

defines a plane in R3 .

Proof. If n is a nonzero vector, then the equation x • n = d can be written as
$$\left(x - \frac{d}{n \cdot n}\,n\right) \cdot n = 0,$$
which is the equation of the plane in $\mathbb{R}^3$ which contains the point $\frac{d}{n \cdot n}\,n$ and is orthogonal to the vector line $\operatorname{Span}\{n\}$.

Example 8.2.12. Write the equation 2x + 5y + 3z = 4 in the form (x − a) • n = 0.

Solution. This equation can be written as
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \\ 3 \end{bmatrix} = 4$$
or, equivalently, as
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \\ 3 \end{bmatrix} = \frac{4}{2^2 + 5^2 + 3^2}\begin{bmatrix} 2 \\ 5 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \\ 3 \end{bmatrix}$$
and finally as
$$\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix} - \frac{2}{19}\begin{bmatrix} 2 \\ 5 \\ 3 \end{bmatrix}\right) \cdot \begin{bmatrix} 2 \\ 5 \\ 3 \end{bmatrix} = 0.$$

The presented solution produces the only point a that is on the vector line Span{n}. If we are not interested in this particular solution, we can find a simpler solution. For example, we note that
$$\begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 5 \\ 3 \end{bmatrix} = 4,$$

so the equation $2x + 5y + 3z = 4$ can be written as
$$\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix} - \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix}\right) \cdot \begin{bmatrix} 2 \\ 5 \\ 3 \end{bmatrix} = 0.$$

 
Example 8.2.13. If $\begin{bmatrix} a \\ b \\ c \end{bmatrix} \ne 0$, write the equation $ax + by + cz = d$ in the form $(x - a) \cdot n = 0$.

Solution. The equation $ax + by + cz = d$ can be written as
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} \cdot \begin{bmatrix} a \\ b \\ c \end{bmatrix} = d$$
or, equivalently, as
$$\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix} - \frac{d}{a^2 + b^2 + c^2}\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) \cdot \begin{bmatrix} a \\ b \\ c \end{bmatrix} = 0.$$

The above example shows that the equation $ax + by + cz = d$ describes all points $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ which are in the plane that is orthogonal to the vector line $\operatorname{Span}\left\{\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right\}$ and intersects this line at the point $\frac{d}{a^2 + b^2 + c^2}\begin{bmatrix} a \\ b \\ c \end{bmatrix}$.

Projections on planes in R3

Now we turn our attention to the problem of finding the point on a plane that mini-
mizes the distance from a given point.

Theorem 8.2.14. Let n be a nonzero vector in R3 and let q be an arbitrary point in R3. The plane x • n = d and the line q + Span{n} have a unique common point
$$p = q + \frac{d - q \cdot n}{n \cdot n}\,n.$$
The distance from q to p is the shortest distance from q to any point in the plane x • n = d, and p is the unique point with this property. The point p is characterized as the point in the plane x • n = d satisfying the equation
$$(q - p) \cdot (x - p) = 0$$
for any point x in the plane x • n = d.

Proof. Every point on the line $q + \operatorname{Span}\{n\}$ is of the form $q + tn$ for some real number $t$. A point $q + tn$ is in the plane $x \cdot n = d$ if
$$(q + tn) \cdot n = d.$$

Solving for $t$ gives us
$$t = \frac{d - q \cdot n}{n \cdot n}.$$
This means that the line $q + \operatorname{Span}\{n\}$ and the plane $x \cdot n = d$ have a unique common point and that point is
$$p = q + \frac{d - q \cdot n}{n \cdot n}\,n.$$
If $x$ is an arbitrary point in the plane $x \cdot n = d$, then we have
$$(q - p) \cdot (x - p) = \frac{-d + q \cdot n}{n \cdot n}\,n \cdot (x - p) = \frac{-d + q \cdot n}{n \cdot n}\,(n \cdot x - n \cdot p) = \frac{-d + q \cdot n}{n \cdot n}\,(d - d) = 0.$$
Hence
$$\|q - x\|^2 = \|q - p\|^2 - 2(q - p) \cdot (x - p) + \|x - p\|^2 = \|q - p\|^2 + \|x - p\|^2.$$
This means that
$$\|q - x\|^2 \ge \|q - p\|^2$$
for every $x$ in the plane $x \cdot n = d$. Moreover,
$$\|q - x\|^2 = \|q - p\|^2$$
if and only if $x = p$. In other words, the distance from $q$ to $p$ is the shortest distance from $q$ to any point in the plane $x \cdot n = d$.

It follows from the above proof that the point $p$ is characterized as the point in the plane $x \cdot n = d$ satisfying the equation
$$(q - p) \cdot (x - p) = 0.$$

Definition 8.2.15. Let n be a nonzero vector in R3 and let q be an arbitrary


point in R3 . The unique common point of the plane x • n = d and the line
q + Span{n} is called the projection of q on the plane x • n = d .

 
Example 8.2.16. Find the projection of the point $q = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$ on the plane $x + y - z = 1$.

Solution. The plane $x + y - z = 1$ can be described by the equation
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} = 1,$$
so in this case we have $n = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}$. We need to find the point common to the plane and the line
$$\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + \operatorname{Span}\left\{\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}\right\}.$$

The point $\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + t\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 + t \\ 2 + t \\ 1 - t \end{bmatrix}$ is on the plane $x + y - z = 1$ if
$$1 + t + 2 + t - (1 - t) = 1.$$
Solving for $t$ we obtain $t = -\frac{1}{3}$. Consequently, the projection of the point $q = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$ on the plane $x + y - z = 1$ is
$$\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} - \frac{1}{3}\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} \frac{2}{3} \\[0.5ex] \frac{5}{3} \\[0.5ex] \frac{4}{3} \end{bmatrix}.$$

The problem can also be solved by an application of the formula given in Theorem 8.2.14:
$$p = q + \frac{d - q \cdot n}{n \cdot n}\,n = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + \frac{1 - \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}}{\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}}\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} - \frac{1}{3}\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} \frac{2}{3} \\[0.5ex] \frac{5}{3} \\[0.5ex] \frac{4}{3} \end{bmatrix}.$$
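The projection formula of Theorem 8.2.14 is the same expression as in Theorem 8.1.17, now used in R3. A sketch (our naming) reproducing Example 8.2.16:

```python
def project_on_plane(q, n, d):
    """Projection of q on the plane x . n = d in R^3 (Theorem 8.2.14):
    p = q + ((d - q . n) / (n . n)) n."""
    t = (d - sum(qi * ni for qi, ni in zip(q, n))) \
        / sum(ni * ni for ni in n)
    return tuple(qi + t * ni for qi, ni in zip(q, n))

# Example 8.2.16: q = (1, 2, 1), plane x + y - z = 1, so n = (1, 1, -1), d = 1.
p = project_on_plane((1, 2, 1), (1, 1, -1), 1)
```

The result is (2/3, 5/3, 4/3), the point computed twice above.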

Projections on lines in R3 revisited


In R2 the projection of a point q on the line a+Span{u} is characterized by the equa-
tion (q − p) • u = 0 (see Theorem 8.1.13). It turns out that the same characterization
works in R3 , in spite of the fact that the geometric interpretation of the equation is
different in R2 and in R3 .

Theorem 8.2.17. Let a, q, and u be elements of R3 , where u is different from


the origin. The projection of q on the line a + Span{u} is the intersection of the
line and the plane (x − q) • u = 0.

Proof. The intersection of the line a + Span{u} and the plane (x − q) • u = 0 is a point
p = a + t u such that (a + t u − q) • u = 0. Solving for t we obtain

t = ((q − a) • u)/(u • u)

and thus

p = a + (((q − a) • u)/(u • u)) u,
which is the projection of q on the line a + Span{u}.

     
Example 8.2.18. Find the projection of (1, 2, 1) on the line (1, 0, 1) + Span{(1, 1, 1)}.

   
Solution. We solve the equation (q − p) • u = 0 for p with q = (1, 2, 1), u = (1, 1, 1), and
p = (1, 0, 1) + t(1, 1, 1) = (1 + t, t, 1 + t). From the equation

((1, 2, 1) − (1 + t, t, 1 + t)) • (1, 1, 1) = (−t, 2 − t, −t) • (1, 1, 1) = 0

we find that t = 2/3. Hence the projection is

(1 + 2/3, 2/3, 1 + 2/3) = (5/3, 2/3, 5/3).

The distance from a point to a plane

Definition 8.2.19. Let n be a nonzero vector in R3 and let q be an arbitrary


point in R3 . By the distance from the point q to the plane x • n = d we mean
the distance kq − pk between the point q and its projection p on the plane
x • n = d.

The following theorem gives us a useful formula for calculating the distance from
a point to a plane.

Theorem 8.2.20. Let n be a nonzero vector in R3 and let q be an arbitrary
point in R3 . The distance from the point q to the plane (x − a) • n = 0 is

|(q − a) • n| / ‖n‖.

Proof. Since the projection of the point q on the plane (x − a) • n = 0 is

p = q + ((d − q • n)/(n • n)) n,

where d = a • n, we have

‖q − p‖ = ‖((d − q • n)/(n • n)) n‖ = (|a • n − q • n| / ‖n‖²) ‖n‖ = |(q − a) • n| / ‖n‖.

 
Example 8.2.21. Find the distance from the point (1, 2, 1) to the plane x + y − z = 1.

Solution. The plane x + y − z = 1 can be described by the equation

(x, y, z) • (1, 1, −1) = 1,

so in this case we have n = (1, 1, −1). We need to find the point common to the plane
and the line

(1, 2, 1) + Span{(1, 1, −1)}.

The point (1, 2, 1) + t(1, 1, −1) = (1 + t, 2 + t, 1 − t) is on the plane x + y − z = 1 if

1 + t + 2 + t − (1 − t) = 1.

Solving for t we obtain t = −1/3. Consequently, the projection of the point q = (1, 2, 1)
on the plane x + y − z = 1 is

(1, 2, 1) − (1/3)(1, 1, −1) = (2/3, 5/3, 4/3).

Hence the distance from q to the plane x + y − z = 1 is

‖(1, 2, 1) − (2/3, 5/3, 4/3)‖ = ‖(1/3, 1/3, −1/3)‖ = 1/√3.

Since the plane x + y − z = 1 is given by the equation

((x, y, z) − (1/3)(1, 1, −1)) • (1, 1, −1) = 0,

using the formula in Theorem 8.2.20 we get

|(q − a) • n| / ‖n‖ = |((1, 2, 1) − (1/3)(1, 1, −1)) • (1, 1, −1)| / ‖(1, 1, −1)‖ = |(2/3, 5/3, 4/3) • (1, 1, −1)| / √3 = 1/√3.
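A numerical sketch of Theorem 8.2.20 (with d = a • n, so the distance is |q • n − d| / ‖n‖; the function name is ours):

```python
import math

def dot(u, v):
    # Dot product of two vectors given as lists of coordinates.
    return sum(a * b for a, b in zip(u, v))

def distance_to_plane(q, n, d):
    # Distance from q to the plane x . n = d (Theorem 8.2.20 with d = a . n).
    return abs(dot(q, n) - d) / math.sqrt(dot(n, n))

# Example 8.2.21: distance from (1, 2, 1) to x + y - z = 1 is 1/sqrt(3).
print(distance_to_plane([1, 2, 1], [1, 1, -1], 1))  # approximately 0.577
```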

The equation of a plane through three points


We close this section with some formulas specific to R3 . First we derive a formula for
the plane through three points in R3 .

Theorem 8.2.22. Let a, b, and c be points in R3 such that b − a and c − a are


linearly independent. There is a unique plane containing the points a, b, and
c. The plane is defined by the equation
(x − a) • ((b − a) × (c − a)) = det[x − a  b − a  c − a] = 0.

Proof. First we note that the plane

(x − a) • ((b − a) × (c − a)) = 0

contains a, b, and c because we have

(a − a) • ((b − a) × (c − a)) = 0,
(b − a) • ((b − a) × (c − a)) = 0,
(c − a) • ((b − a) × (c − a)) = 0.

Now assume that the plane x • n = d contains a, b, and c. Then

a • n = b • n = c • n = d.

This implies
(b − a) • n = 0 and (c − a) • n = 0,
which means that n is of the form

t((b − a) × (c − a)), t ≠ 0,

by Theorem 5.1.1 and the remark after the proof of Theorem 5.1.6.

We can take n = (b − a) × (c − a).


The equation of the plane can be written as

x • ((b − a) × (c − a)) = d .

Since
d = a • n = a • ((b − a) × (c − a)),
we obtain the equation
(x − a) • ((b − a) × (c − a)) = 0.
Note that we have

n = (b − a) × (c − a) = −((a − b) × (c − b)) = (a − c) × (b − c).

Example 8.2.23. Find the equation of the plane which contains the points

(2, 1, 3), (3, 1, 4), and (5, 2, 7).

Solution. Since

((x, y, z) − (2, 1, 3)) • (((3, 1, 4) − (2, 1, 3)) × ((5, 2, 7) − (2, 1, 3)))
  = (x − 2, y − 1, z − 3) • ((1, 0, 1) × (3, 1, 4))
  = (x − 2, y − 1, z − 3) • (−1, −1, 1)
  = (−1)(x − 2) + (−1)(y − 1) + 1 · (z − 3)
  = −x − y + z,

the equation is
−x − y + z = 0.
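The construction of Theorem 8.2.22 can be sketched in Python (the helper names are ours); the normal vector is n = (b − a) × (c − a) and the plane is x • n = a • n:

```python
def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def cross(u, v):
    # Cross product of two vectors in R^3.
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

def plane_through(a, b, c):
    # Returns (n, d) so that the plane through a, b, c is x . n = d,
    # with n = (b - a) x (c - a) and d = a . n (Theorem 8.2.22).
    n = cross(sub(b, a), sub(c, a))
    return n, sum(x * y for x, y in zip(a, n))

n, d = plane_through([2, 1, 3], [3, 1, 4], [5, 2, 7])
print(n, d)  # [-1, -1, 1] 0, i.e. the plane -x - y + z = 0
```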

Example 8.2.24. We consider the points

a = (1, 1, 2), b = (3, 2, 3), and c = (7, 4, 5).

Since

((x, y, z) − (1, 1, 2)) • (((3, 2, 3) − (1, 1, 2)) × ((7, 4, 5) − (1, 1, 2)))
  = (x − 1, y − 1, z − 2) • ((2, 1, 1) × (6, 3, 3))
  = (x − 1, y − 1, z − 2) • (0, 0, 0) = 0,

the points (1, 1, 2), (3, 2, 3), and (7, 4, 5) do not determine a unique plane, because the equa-
tion is satisfied by all points (x, y, z) of R3 . This happens because the vectors b − a and
c − a are not linearly independent. Both vectors are on the line Span{(2, 1, 1)}.
 

Area of a triangle
The following two theorems give us formulas for calculating the distance from a
point to a line in R3 . These theorems are limited to R3 because they use the cross
product, which is not available outside of R3 .

Theorem 8.2.25. The distance from a point q to a line a + Span{u} in R3 is

‖(q − a) × u‖ / ‖u‖.

Proof. We have

‖q − p‖ = ‖q − a − (((q − a) • u)/(u • u)) u‖
        = √((‖q − a‖² ‖u‖² − ((q − a) • u)²) / ‖u‖²)
        = √(‖q − a‖² ‖u‖² − ((q − a) • u)²) / ‖u‖
        = ‖(q − a) × u‖ / ‖u‖.

The last equality is justified by the identity

‖v × w‖² = ‖v‖² ‖w‖² − (v • w)².
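The distance formula of Theorem 8.2.25 can be sketched numerically (our helper names):

```python
import math

def cross(u, v):
    # Cross product of two vectors in R^3.
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

def norm(u):
    return math.sqrt(sum(x * x for x in u))

def distance_to_line(q, a, u):
    # ||(q - a) x u|| / ||u|| from Theorem 8.2.25.
    qa = [x - y for x, y in zip(q, a)]
    return norm(cross(qa, u)) / norm(u)

# Distance from (1, 2, 1) to the line (1, 0, 1) + Span{(1, 1, 1)}:
print(distance_to_line([1, 2, 1], [1, 0, 1], [1, 1, 1]))
```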

Theorem 8.2.26. The distance from a point q to the line through two distinct
points a and b in R3 is

‖(q − a) × (q − b)‖ / ‖b − a‖.

Proof. The line through two distinct points a and b is a+Span{b−a}. From Theorem
8.2.25, the distance from q to the line a + Span{b − a} is
‖(q − a) × (b − a)‖ / ‖b − a‖.
Since

(q − a) × (b − a) = (q − a) × (b − q + q − a)
= (q − a) × (b − q) + (q − a) × (q − a)
= (q − a) × (b − q)
= −(q − a) × (q − b),

we have

‖(q − a) × (b − a)‖ / ‖b − a‖ = ‖(q − a) × (q − b)‖ / ‖b − a‖.

Figure 8.10: The triangle qab.

In Chapter 1 we obtained a formula for the area of a triangle with one vertex at the origin.
From Theorem 8.2.26 we obtain a simple formula for the area of an arbitrary triangle
in R3 :

Corollary 8.2.27. Let a, b and q be vectors in R3 such that the vectors q−a and
q − b are linearly independent. The area of the triangle qab is

(1/2) ‖(q − a) × (q − b)‖.

Proof.

The area of the triangle qab = (1/2)(the length of the base) · (the height)
                             = (1/2) ‖b − a‖ · (‖(q − a) × (q − b)‖ / ‖b − a‖)
                             = (1/2) ‖(q − a) × (q − b)‖.
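Corollary 8.2.27 translates directly into a short computation (helper names ours):

```python
import math

def cross(u, v):
    # Cross product of two vectors in R^3.
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

def triangle_area(q, a, b):
    # (1/2) ||(q - a) x (q - b)|| from Corollary 8.2.27.
    qa = [x - y for x, y in zip(q, a)]
    qb = [x - y for x, y in zip(q, b)]
    return 0.5 * math.sqrt(sum(c * c for c in cross(qa, qb)))

# Triangle with vertices (1, 0, 0), (0, 1, 0), (0, 0, 1): area sqrt(3)/2.
print(triangle_area([1, 0, 0], [0, 1, 0], [0, 0, 1]))
```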

The volume of the tetrahedron


The following theorem gives us a direct formula for the distance of an arbitrary point
q in R3 from a plane through points a, b, and c.

Theorem 8.2.28. Let a, b, and c be points in R3 such that b − a and c − a are


linearly independent. The distance from a point q to the plane through points
a, b, and c is

|det[q − a  q − b  q − c]| / ‖(b − a) × (c − a)‖.

Proof. Since, by Theorem 8.2.22, the plane through the points a, b, and c is the plane
given by the equation
(x − a) • ((b − a) × (c − a)) = 0,

by Theorem 8.2.20 the distance is

|(q − a) • n| / ‖n‖ = |(q − a) • ((b − a) × (c − a))| / ‖(b − a) × (c − a)‖.
Now because

(q − a) • ((b − a) × (c − a)) = (q − a) • ((b − q + q − a) × (c − q + q − a))

= (q−a)•((b−q)×(c−q)) + (q−a)•((b−q)×(q−a)) + (q−a)•((q−a)×(c−q)) + (q−a)•((q−a)×(q−a))

= (q − a) • ((b − q) × (c − q))

= (q − a) • ((q − b) × (q − c))

= det[q − a  q − b  q − c],

the distance is

|det[q − a  q − b  q − c]| / ‖(b − a) × (c − a)‖.

Figure 8.11: The tetrahedron qabc.

Theorem 8.2.29. Let a, b, c, and q be points in R3 such that q − a, q − b, and
q − c are linearly independent. Then the volume of the tetrahedron qabc is

(1/6) |det[q − a  q − b  q − c]|.

Proof. We note that if the vectors q − a, q − b, and q − c are linearly independent, then
the vectors b − a and c − a are linearly independent.
Now we have

the volume of qabc = (1/3)(the area of the base) · (the height)
                   = (1/3) · (1/2) ‖(b − a) × (c − a)‖ · (|det[q − a  q − b  q − c]| / ‖(b − a) × (c − a)‖)
                   = (1/6) |det[q − a  q − b  q − c]|.

The results obtained in this section give us a simple formula for the volume of a
tetrahedron defined by three linearly independent vectors in R3 .

Figure 8.12: The tetrahedron defined by vectors a, b, and c.

Theorem 8.2.30. Let a, b, and c be linearly independent vectors in R3 . The
volume of the tetrahedron 0abc is

(1/6) |det[a  b  c]|.
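Theorems 8.2.29 and 8.2.30 can be sketched with a determinant computed as the scalar triple product a • (b × c) (helper names ours):

```python
def cross(u, v):
    # Cross product of two vectors in R^3.
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

def det3(a, b, c):
    # det[a b c] as the scalar triple product a . (b x c).
    bc = cross(b, c)
    return sum(x * y for x, y in zip(a, bc))

def tetra_volume(q, a, b, c):
    # (1/6)|det[q - a, q - b, q - c]| from Theorem 8.2.29.
    qa = [x - y for x, y in zip(q, a)]
    qb = [x - y for x, y in zip(q, b)]
    qc = [x - y for x, y in zip(q, c)]
    return abs(det3(qa, qb, qc)) / 6

# The tetrahedron 0abc with a, b, c the standard basis vectors has volume 1/6:
print(tetra_volume([0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]))
```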

8.2.1 Exercises

     
1. Find the projection of the point (1, 1, 0) on the line (0, 0, 1) + Span{(1, 1, 1)}.

     
2. Find the projection of the point (0, 1, 0) on the line (1, 0, 1) + Span{(1, 1, 0)}.

Write the given equation of the plane in the form ax + by + cz = d.


           
3. (2, 0, −1) + Span{(1, 3, 1), (2, 2, 1)}

4. (0, 0, 1) + Span{(1, 1, 1), (2, 1, 1)}

5. (1, 1, −1) + Span{(2, 1, 1), (2, −1, 3)}

6. (1, 2, 3) + Span{(1, 0, 1), (0, 1, 1)}
 
7. Find an equation of the plane which contains the point (1, 1, 2) and is orthogonal
to the vector line Span{(2, 3, 2)}.

 
8. Write the equation of the plane which contains the point (2, 0, −1) and is orthogonal
to the vector line Span{(1, 1, 1)}.

Write the given equation of a plane in the form (x − a) • n = 0.

9. 3x − y + z = 2

10. x − 2y + 5z = 1

11. 3x − y + z = 2

12. x − 2y + 5z = 1
 
13. Find the projection of the point (1, 1, 0) on the plane 3x + y − 4z = 1.

14. Find the projection of the point (2, 1, 1) on the plane x − y + z = 1.

15. Calculate the distance from the point (1, 1, 0) to the plane 3x + y − 4z = 1.

16. Calculate the distance from the point (2, 1, 1) to the plane x − y + z = 1.

17. Find the projection of the point (1, 0, 0) on the line (0, 2, 1) + Span{(1, −1, 1)}, using
Theorem 8.2.17.

18. Find the projection of the point (0, 0, 1) on the line (1, 1, 1) + Span{(1, 1, 1)}, using
Theorem 8.2.17.

Find an equation of the plane which contains the given points.


           
19. (1, 1, 1), (2, 3, 0), (1, 2, 2)

20. (2, 2, 1), (1, 1, 1), (1, 2, 1)

           
21. (1, 0, 0), (0, 1, 0), (0, 0, 1)

22. (1, 0, 1), (1, 1, 1), (1, 0, 0)

Find the area of the triangle with the given vertices.


           
23. (1, 1, 0), (2, 1, 1), (3, 2, 2)

24. (2, 2, 1), (1, 1, 1), (1, 2, 1)

25. (1, 0, 0), (0, 1, 0), (0, 0, 1)

26. (1, 0, 1), (0, 1, 1), (1, 1, 1)

Find the volume of the tetrahedron with the given vertices.


               
27. (1, 0, 0), (2, 1, 1), (3, −1, 1), (4, 1, 1)

28. (1, 2, 0), (2, 1, 1), (1, 0, 1), (1, 1, 1)

29. (1, 1, 1), (1, 2, 1), (3, 2, 2), (3, 2, 8)

30. (1, 0, 1), (1, 1, 1), (2, 0, 1), (1, 0, 2)

31. Show that p is the projection of the point q on the plane a + Span{u, v} if and
only if
(p − a) · (u × v) = 0, (q − p) · u = 0 and (q − p) · v = 0.

Chapter 9

Rotations

9.1 Rotations in R2

Consider two vectors a and b in R2 such that ‖a‖ = ‖b‖ = 1. We can think of b as a,
rotated about the origin to a new position. This point of view turns out to be important
in mathematics and many applications. In this chapter we describe the operation of
rotating vectors about the origin in the language of linear algebra. As we will see, lin-
ear algebra provides an elegant description of rotations. Moreover, it will lead us in
a natural way to trigonometric functions and allow us to give simple proofs of some
basic formulas from trigonometry.

Figure 9.1: We can think of b as a, rotated about the origin to a new position.

We start with a theorem that is the theoretical basis for this chapter.


Theorem 9.1.1. Let a and b be vectors in R2 . If kak = kbk = 1, then there are
unique real numbers p and q such that

b = pa + qax . (9.1)

Moreover,

(a) p 2 + q 2 = 1,

(b) p = a·b,
(c) q = det[a b].

Proof. Existence and uniqueness of p and q follow from the fact that {a, ax } is a basis
in R2 , by Theorem 3.2.25.
From the Pythagorean theorem we get

kbk2 = p 2 kak2 + q 2 kax k2 = (p 2 + q 2 )kak2 .

Since kak = kbk = 1, we have p 2 + q 2 = 1.


Finally, from (9.1) we obtain

a·b = pa·a + qa·ax = pkak2 = p

and
det[a b] = b·ax = pa·ax + qax ·ax = q‖ax ‖² = q‖a‖² = q.

Example 9.1.2. Let

a = (1/√2, 1/√2) and b = (1/√5, 2/√5).

Note that ‖a‖ = ‖b‖ = 1. Since

a·b = (1/√2)(1/√5) + (1/√2)(2/√5) = 3/√10

and

det[a b] = (1/√2)(2/√5) − (1/√2)(1/√5) = 1/√10,

we have

b = pa + qax = (3/√10) a + (1/√10) ax = (3/√10)(1/√2, 1/√2) + (1/√10)(−1/√2, 1/√2).

We can easily verify that this is correct.
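This verification can be sketched in Python; with a = (a1, a2) we take ax = (−a2, a1), and p = a·b, q = det[a b] as in Theorem 9.1.1 (the function name is ours):

```python
import math

def decompose(a, b):
    # For unit vectors a and b, returns (p, q) with b = p a + q ax,
    # where ax = (-a[1], a[0]); p = a . b and q = det[a b] (Theorem 9.1.1).
    p = a[0]*b[0] + a[1]*b[1]
    q = a[0]*b[1] - a[1]*b[0]
    return p, q

a = [1/math.sqrt(2), 1/math.sqrt(2)]
b = [1/math.sqrt(5), 2/math.sqrt(5)]
p, q = decompose(a, b)
# p = 3/sqrt(10) and q = 1/sqrt(10); p*a + q*ax reproduces b:
ax = [-a[1], a[0]]
print([p*a[i] + q*ax[i] for i in range(2)])  # approximately b
```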

Every point w on the unit circle centered at the origin can be identified with a vector
w = (p, q) such that p² + q² = 1.

Definition 9.1.3. For any point w = (p, q) such that p² + q² = 1 we define a
transformation R w of R2 , that is, a function from R2 to R2 :

R w (a) = pa + qax . (9.2)

Figure 9.2: Rotations

The transformation defined by (9.2) is a rotation. This means that the result of
its application to any vector a is counterclockwise rotation of a about the origin by
an angle that is determined by p and q and does not depend on a, as illustrated on
Figure 9.2.
Since
R w ((1, 0)) = p (1, 0) + q (0, 1) = (p, q),

we can see that R w defined by (9.2) is determined by the image of (1, 0). Actually, R w is
completely determined by the image of any nonzero a. Indeed, from (9.2) we obtain

a·R w (a) = pa·a + qa·ax and ax ·R w (a) = pax ·a + qax ·ax .



Hence
a·R w (a) = p‖a‖² and ax ·R w (a) = q‖ax ‖² = q‖a‖².
Solving for p and q we obtain:

p = a·R w (a)/‖a‖²

and

q = ax ·R w (a)/‖a‖² = (1/‖a‖²) det[a R w (a)].
Since p and q do not depend on a, for any other nonzero point b we must have

a·R w (a)/‖a‖² = b·R w (b)/‖b‖² and ax ·R w (a)/‖a‖² = bx ·R w (b)/‖b‖².
For an arbitrary rotation R w we define

C (R w ) = a·R w (a)/‖a‖² and S(R w ) = ax ·R w (a)/‖a‖²,
where a is an arbitrary nonzero vector in R2 . These definitions are consistent, since
the defined values do not depend on a, but only on R w . We can say that C and S are
real-valued functions defined on the set of all rotations.

Theorem 9.1.4. If

b = pa + qax and c = sb + t bx ,

then
c = (ps − q t )a + (pt + q s)ax .

Proof.

c = sb + t bx
  = s(pa + qax ) + t (pa + qax )x
  = psa + q sax + pt ax − q t a
  = (ps − q t )a + (pt + q s)ax .

The above theorem says that if R 1 is the rotation defined by R 1 (x) = px+ qxx and
R 2 is the rotation defined by R 2 (x) = sx + t xx then we have

R 2 ◦ R 1 (a) = R 2 (R 1 (a)) = (ps − q t )a + (pt + q s)ax

Composition of functions is not a commutative operation. On the other hand, our


intuition tells us that, when following one rotation about the origin by another one,
the order should not matter. It is easy to verify that

R 1 ◦ R 2 (a) = (ps − q t )a + (pt + q s)ax = R 2 ◦ R 1 (a) (9.3)



which shows that our intuition is correct.
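A quick numerical sanity check, with each rotation written as x ↦ px + qxx where xx = (−x2, x1) (the helper `rot` is ours):

```python
def rot(p, q):
    # The rotation x -> p x + q xx, where xx = (-x[1], x[0]).
    def R(x):
        return [p*x[0] - q*x[1], q*x[0] + p*x[1]]
    return R

R1 = rot(0.6, 0.8)     # 0.6^2 + 0.8^2 = 1
R2 = rot(5/13, 12/13)  # (5/13)^2 + (12/13)^2 = 1

a = [2.0, -1.0]
# Composition in either order gives the same vector:
print(R1(R2(a)), R2(R1(a)))
```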


Since p = C (R 1 ), q = S(R 1 ), s = C (R 2 ), and t = S(R 2 ), from (9.3) we obtain the
following identities:

C (R 1 ◦ R 2 ) = C (R 1 )C (R 2 ) − S(R 1 )S(R 2 ) (9.4)

and
S(R 1 ◦ R 2 ) = C (R 1 )S(R 2 ) + S(R 1 )C (R 2 ). (9.5)

You may recognize that these formulas are similar to the formulas for cosine and
sine of the sum of two angles. As we will see in the next section this is not a coinci-
dence. In fact, the functions C and S can be interpreted as the familiar cosine and
sine functions.

Theorem 9.1.5. If kak = kbk = 1 and b = pa + qax , then a = pb − qbx .

Proof. If b = pa + qax , then bx = pax − qa and thus

pb − qbx = p(pa + qax ) − q(pax − qa) = (p 2 + q 2 )a = a.

The above theorem says that the rotation that reverses the effect of the rotation
R(a) = pa+qax is the rotation defined by pa−qax . In other words, if R(a) = pa+qax ,
then R −1 (a) = pa − qax .

Trigonometry
Interpretation of the results in this chapter in the language of the familiar trigono-
metric functions requires assigning numerical values to angles so that every real
number α corresponds to a rotation R α in such a way that the following properties
hold:

R α ◦ R β = R α+β ,
R0 = I ,
R α−1 = R −α .

The number α is interpreted as a measure of the angle of rotation R α . While this


association can be done in different ways, for example degrees or radians, it is im-
portant to note that trigonometric identities do not depend on how the measure is
assigned to angles, as long as the above properties hold.
When a measure of angles is chosen, we can define

cos α = C (R α ) and sin α = S(R α ).



Figure 9.3: Vector rotated about the origin counterclockwise by an angle α.

If we rotate the vector (1, 0) about the origin counterclockwise by an angle α, then
the obtained vector will be

cos α (1, 0) + sin α (1, 0)x = cos α (1, 0) + sin α (0, 1) = (cos α, sin α).

More generally, if R α is the counterclockwise rotation about the origin by an angle
α, then
R α (a) = cos α a + sin α ax .

This interpretation of the numbers p and q in the expression pa + qax allows us to


obtain some properties of the sine and cosine functions from properties of rotations
in R2 .

Theorem 9.1.6. For any α and β we have

sin(α + β) = sin α cos β + cos α sin β

and
cos(α + β) = cos α cos β − sin α sin β.

Proof. Since
R α ◦ R β = R α+β ,

the identities follow immediately from (9.4) and (9.5).

Note that our algebraic proof of the above trigonometric identities is much sim-
pler than the standard geometric proof. Moreover, a single proof gives us a pair of
identities: one for sine and one for cosine. The same is true for the next theorem.

Theorem 9.1.7. For any α we have

cos(−α) = cos α and sin(−α) = − sin α.

Proof. This is a direct consequence of Theorem 9.1.5.

In this book we choose to use radians as the measure of an angle, that is, the arc
length in the unit circle. Consequently, the measure of the angle associated with the
rotation
R(a) = 0 · a + 1 · ax = ax
is π/2. In other words,
R π/2 (a) = ax .

Note that this gives us

cos(π/2) = 0 and sin(π/2) = 1.

If 0 < α < π/2 we have cos α > 0 and sin α > 0, and for every pair of numbers p and q
such that p² + q² = 1 there is a unique number α such that 0 ≤ α < 2π and

cos α = p and sin α = q.

We do not prove these results in this book.

9.1.1 Exercises

1. Let kak = kbk = 1. Show that if b = pa + qax and ax = sb + t bx , then s = q and


t = p.

2. Let a and b be vectors in R2 such that ‖a‖ = ‖b‖. Show that there are unique
real numbers p and q such that b = pa + qax . Moreover,

(a) p² + q² = 1,

(b) p = a·b/(‖a‖‖b‖),

(c) q = det[a b]/(‖a‖‖b‖).

3. Let a and b be vectors in R2 such that kak = kbk = 1. Show that if b = pa + qax
and −a = sb + t bx , then p = −s and q = t .
4. Let a = (2, 3) and b = (3, −2). Find numbers p and q such that b = pa + qax .

5. Let kak = kbk = kck = 1. If b = pa + qax and c = pb + qbx , show that c = (2p 2 −
1)a + 2pqax .

6. Let R(a) = pa+qax for some real numbers p and q such that p 2 +q 2 = 1. Show
that

(a) R(λa) = λR(a),


(b) R(a + b) = R(a) + R(b),
(c) R(a)·R(b) = a·b,
(d) kR(a)k = kak.
7. Show that b = pa + qax if and only if b = [[p, −q], [q, p]] a.

8. Consider the functions L : R2 → R2 and M : R2 → R2 defined by

L(x) = (1/2) x + (√3/2) xx and M (x) = (√3/2) x + (1/2) xx .

Find L(M (x)).

9. Consider the functions L : R2 → R2 and M : R2 → R2 defined by L(x) = px+qxx ,


for some real numbers p and q such that p 2 + q 2 = 1, and M (x) = xx . Find
L(M (x)).

10. Let kak = kbk = kck = kdk = 1. Show that, if

b = pa + qax , c = pb + qbx and d = pc + qcx ,

then d = (4p 3 − 3p)a + (3q − 4q 3 )ax .

11. Let ‖a‖ = ‖b‖ = ‖c‖ = ‖d‖ = 1. If, for some real numbers p and q such that
p > 0, q > 0 and p² + q² = 1,

b = pa + qax , c = pb + qbx and d = pc + qcx = −a,

show that b = (1/2) a + (√3/2) ax .

12. Let ‖a‖ = ‖b‖ = ‖c‖ = ‖d‖ = 1. If, for some real numbers p and q such that
p > 0, q > 0 and p² + q² = 1,

b = pa + qax , c = pb + qbx and d = pc + qcx = ax ,

show that b = (√3/2) a + (1/2) ax .

13. Consider the function L : R2 → R2 defined by L(x) = px + qxx for some real
numbers p and q such that p > 0, q > 0 and p² + q² = 1. Find p and q such
that L(L(a)) = (1/2) a + (√3/2) ax .

14. Consider the function L : R2 → R2 defined by L(x) = (√2/2) x + (√2/2) xx . Show that L
is a rotation about the origin and that L(L(x)) = xx .

15. Consider the function L : R2 → R2 defined by L(x) = (1/2) x + (√3/2) xx . Show that L is
a rotation about the origin and that L(L(L(x))) = −x.

16. Consider the function L : R2 → R2 defined by L(x) = (1/2) x + (√3/2) xx . Prove that
L(L(L(L(L(L(x)))))) = x.

17. Consider the function L : R2 → R2 defined by L(x) = px + qxx for some real
numbers p and q such that p > 0, q > 0 and p 2 + q 2 = 1. Find p and q such
that L(L(x)) = −x.

18. Let ‖a‖ = ‖b‖ = ‖c‖ = 1. If b = pa + qax for some real numbers p and q such
that p > 0, q > 0 and p² + q² = 1 and c = pb + qbx = ax , show that b = (1/√2) a + (1/√2) ax .

19. Let a be an arbitrary vector in R2 and let b = (1/2) a + (√3/2) ax . Find ‖b − a‖.

20. Consider the functions L : R2 → R2 and M : R2 → R2 defined by

L(x) = (1/2) x + (√3/2) xx and M (x) = (1/2) x − (√3/2) xx .

Find L(M (x)).

Calculate the following values using the results from this section.

21. cos(π/3)

22. sin(π/3)

23. cos(π/6)

24. sin(π/6)

25. cos(7π/12)

26. sin(7π/12)

27. cos(π/12)

28. sin(π/12)

29. cos(5π/12)

30. sin(5π/12)

Prove the following trigonometric identities.

31. cos(θ + π) = − cos θ

32. sin(θ + π) = − sin θ

33. cos(3α) = 4 cos³(α) − 3 cos(α)

34. sin(3α) = 3 sin(α) − 4 sin³(α)

35. cos(π − α) = − cos α

36. sin(π − α) = sin α

37. cos(π/2 − α) = sin α

38. sin(π/2 − α) = cos α

39. sin(2α) = 2 sin α cos α

40. cos(2α) = 2 cos² α − 1



9.2 Quadratic forms

Definition 9.2.1. By a quadratic form we mean a function which associates
to every vector x from R2 the number xT Ax, where A is a 2 × 2 symmetric
matrix.

Example 9.2.2. Find the quadratic form associated with the matrix

A = [[7, 5], [5, 2]].

Solution.

[x y] [[7, 5], [5, 2]] [x; y] = [7x + 5y  5x + 2y] [x; y] = 7x² + 2y² + 10x y.
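Evaluating a quadratic form numerically is a one-liner; the sketch below (our helper name) expands xT Ax = a11 x² + (a12 + a21) x y + a22 y² for a 2 × 2 matrix A:

```python
def quad_form(A, x):
    # x^T A x for a 2x2 matrix A given as a list of rows.
    return (A[0][0]*x[0]*x[0] + (A[0][1] + A[1][0])*x[0]*x[1]
            + A[1][1]*x[1]*x[1])

A = [[7, 5], [5, 2]]
# Matches 7x^2 + 2y^2 + 10xy from Example 9.2.2 at (x, y) = (1, 1):
print(quad_form(A, [1, 1]))  # 19
```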

Definition 9.2.3. Let A be a symmetric 2 × 2 matrix.

(a) The quadratic form xT Ax is called positive definite if xT Ax > 0 for all
vectors x from R2 different from 0.

(b) The quadratic form xT Ax is called positive semidefinite if xT Ax ≥ 0 for


all vectors x from R2 .

(c) The quadratic form xT Ax is called negative definite if xT Ax < 0 for all
vectors x from R2 different from 0.

(d) The quadratic form xT Ax is called negative semidefinite if xT Ax ≤ 0 for


all vectors x from R2 .

(e) The quadratic form xT Ax is called indefinite if xT Ax > 0 for a vector x


from R2 and yT Ay < 0 for a vector y from R2 .

Example 9.2.4. If A = [[1, −1], [−1, 2]], then for any x = (x1, x2) we have

xT Ax = [x1 x2] [[1, −1], [−1, 2]] [x1; x2]
      = [x1 x2] [x1 − x2; −x1 + 2x2]
      = x1² − 2x1 x2 + 2x2²
      = (x1 − x2)² + x2².

Since
(x1 − x2)² + x2² > 0
whenever at least one of the numbers x1 and x2 is different from 0, the quadratic
form xT Ax is positive definite.

To classify quadratic forms we will use the following result.

Theorem 9.2.5. If A is a 2 × 2 symmetric matrix with eigenvalues λ1 , λ2 and
P is an orthogonal 2 × 2 matrix such that

A = P [[λ1, 0], [0, λ2]] P T ,

then
xT Ax = λ1 y1² + λ2 y2²,
where
(y1, y2) = P T x.

Proof. We have to show that, if A is a 2 × 2 symmetric matrix with eigenvalues λ1 , λ2
and P is an orthogonal 2 × 2 matrix such that A = P [[λ1, 0], [0, λ2]] P T , then

[x1 x2] A [x1; x2] = λ1 y1² + λ2 y2²,

where (y1, y2) = P T (x1, x2). First we note that

[x1 x2] A [x1; x2] = [x1 x2] P [[λ1, 0], [0, λ2]] P T [x1; x2].

If
(y1, y2) = P T (x1, x2),
then
[y1 y2] = [x1 x2] P

and consequently

[x1 x2] P [[λ1, 0], [0, λ2]] P T [x1; x2] = [y1 y2] [[λ1, 0], [0, λ2]] [y1; y2] = λ1 y1² + λ2 y2².

Theorem 9.2.6. Let A be a symmetric 2 × 2 matrix with eigenvalues λ1 , λ2 .

(a) The quadratic form xT Ax is positive definite if and only if λ1 > 0 and
λ2 > 0.

(b) The quadratic form xT Ax is positive semidefinite if and only if λ1 ≥ 0


and λ2 ≥ 0.

(c) The quadratic form xT Ax is negative definite if and only if λ1 < 0 and
λ2 < 0.

(d) The quadratic form xT Ax is negative semidefinite if and only if λ1 ≤ 0


and λ2 ≤ 0.

(e) The quadratic form xT Ax is indefinite if and only if one eigenvalue of A


is strictly positive and one eigenvalue of A is strictly negative.

Proof. We only prove that a quadratic form xT Ax is positive definite if and only if the
eigenvalues of the matrix A are strictly positive. The other proofs are similar.
We have to prove that, if A is a 2 × 2 symmetric matrix with eigenvalues λ1 and
λ2 , then λ1 > 0 and λ2 > 0 if and only if

[x1 x2] A [x1; x2] > 0

for all (x1, x2) ≠ (0, 0). According to Theorem 9.2.5 there is an orthogonal 2 × 2 matrix P
such that if
(y1, y2) = P T (x1, x2),
then
[x1 x2] A [x1; x2] = λ1 y1² + λ2 y2².

If λ1 > 0, λ2 > 0, and (x1, x2) ≠ (0, 0), then

(y1, y2) = P T (x1, x2) ≠ (0, 0)

and consequently

[x1 x2] A [x1; x2] = λ1 y1² + λ2 y2² > 0.

Now assume that

[x1 x2] A [x1; x2] > 0

for all (x1, x2) ≠ (0, 0). If we take (x1, x2) = P (1, 0), then we have (x1, x2) ≠ (0, 0) and
consequently

[x1 x2] A [x1; x2] = [1 0] P T P [[λ1, 0], [0, λ2]] P T P [1; 0]
                   = [1 0] [[λ1, 0], [0, λ2]] [1; 0] = λ1 > 0.

Using a similar argument we can show that λ2 > 0.
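Theorem 9.2.6 turns classification into an eigenvalue computation; for a 2 × 2 symmetric matrix the eigenvalues are the roots of t² − tr(A) t + det(A) = 0, so a short sketch (helper names ours) is:

```python
import math

def eigenvalues_sym2(A):
    # Real eigenvalues of a symmetric 2x2 matrix (roots of the
    # characteristic polynomial t^2 - tr(A) t + det(A)).
    tr = A[0][0] + A[1][1]
    det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
    r = math.sqrt(tr*tr - 4*det)  # nonnegative for symmetric A
    return (tr - r) / 2, (tr + r) / 2

def classify(A):
    # Classification of x^T A x following Theorem 9.2.6.
    l1, l2 = eigenvalues_sym2(A)
    if l1 > 0 and l2 > 0:
        return "positive definite"
    if l1 < 0 and l2 < 0:
        return "negative definite"
    if l1 * l2 < 0:
        return "indefinite"
    return "semidefinite"

print(classify([[1, -1], [-1, 2]]))  # positive definite, as in Example 9.2.4
```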

The general form of a quadratic form on R2 is

[x y] A [x; y] = ax² + b y² + c x y,

where a, b, c are arbitrary real numbers. The terms ax² and b y² are called quadratic
terms and cx y is called the cross-product term.
Note that, if A is a diagonal matrix, then the quadratic form xT Ax has no cross-
product term. Since an orthogonal 2 × 2 matrix P corresponds to a change of basis
in R2 , the representation

A = P [[λ1, 0], [0, λ2]] P T

in Theorem 9.2.5 can be used to find new variables in R2 for which the quadratic
form has no cross-product term.

Example 9.2.7. Classify the quadratic form 2x² + 17y² + 8x y and find a change of
variables
(x, y) = P (x′, y′)
such that the quadratic form expressed in these new variables has no cross-product
term.

Solution. We have

2x² + 17y² + 8x y = [x y] [[2, 4], [4, 17]] [x; y]

and
[[2, 4], [4, 17]] = P [[1, 0], [0, 18]] P T ,

where
P = [[4/√17, 1/√17], [−1/√17, 4/√17]].

Since both eigenvalues are positive, the quadratic form is positive definite. In
terms of the new variables x′ and y′ the quadratic form becomes

[x′ y′] [[1, 0], [0, 18]] [x′; y′] = (x′)² + 18(y′)².
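The factorization A = P D P T used above can be verified entrywise; the eigenvalues 1 and 18 and the columns of P are as computed in the example (matrix names are ours):

```python
import math

def matmul(X, Y):
    # Product of two 2x2 matrices given as lists of rows.
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

s = math.sqrt(17)
P = [[4/s, 1/s], [-1/s, 4/s]]
D = [[1, 0], [0, 18]]
PT = [[P[0][0], P[1][0]], [P[0][1], P[1][1]]]

A = matmul(matmul(P, D), PT)
print(A)  # up to rounding, [[2, 4], [4, 17]]
```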

The Principal Axes Theorem

Theorem 9.2.8. For every quadratic form xT Ax on R2 there is an orthonormal
basis {p, px } of R2 such that the equation

xT Ax = c

is equivalent to the equation

λ1 y1² + λ2 y2² = c,

where c is a real number, y1 , y2 are the coordinates of x in the basis {p, px },
that is
x = y1 p + y2 px ,
and λ1 and λ2 are the eigenvalues of the symmetric matrix A.

Proof. By Theorem 3.3.12, the symmetric matrix A can be diagonalized, that is, there
is an orthogonal matrix P with columns p1 and p2 and a diagonal matrix

D = [[λ1, 0], [0, λ2]]

such that A = P DP −1 . Then

xT Ax = xT P DP −1 x = (P −1 x)T DP −1 x = yT Dy = λ1 y1² + λ2 y2²,

where x = P y. Note that we can always take

[p1 p2 ] = [p px ].

We note that, if p = (q, r ), then the function

R((a, b)) = q (a, b) + r (a, b)x = q (a, b) + r (−b, a) = [p px ] (a, b)

defines a rotation such that R((1, 0)) = p and R((0, 1)) = px .

Figure 9.4: The graphs of x² − 2x y + 2y² = 1 and x² − 4x y + 2y² = 1.

The equation
ax² + b y² + c x y = d ,
where a, b, c, d are real numbers, with a, b, d different from 0, describes a curve in R2 .
For example, the graph of the equation x² − 2x y + 2y² = 1 is an ellipse and the graph
of the equation x² − 4x y + 2y² = 1 is a hyperbola. It is not obvious why such similar
equations produce curves that are very different. How can we tell without graphing
these equations? The answer is easy if the equation does not have a cross-product
term. The graph of any equation that can be written in the form

x²/a² + y²/b² = 1

for some a > 0 and b > 0 is always an ellipse, and the graph of any equation that can
be written in the form

x²/a² − y²/b² = 1

for some a > 0 and b > 0 is always a hyperbola. Since the shape of a curve in the
plane does not depend on the choice of the coordinates we use, if we eliminate the
cross-product term in the original equation, the form of the equation in the new
variables will immediately tell us whether the graph is an ellipse or a hyperbola.

Example 9.2.9. Apply The Principal Axes Theorem to the equation

−13x² + 14√3 x y + y² = 40

and classify the curve.

Solution. Since

−13x² + 14√3 x y + y² = [x y] [[−13, 7√3], [7√3, 1]] [x; y]
                      = [x y] [[1/2, −√3/2], [√3/2, 1/2]] [[8, 0], [0, −20]] [[1/2, √3/2], [−√3/2, 1/2]] [x; y],

the desired change of variables is

(x, y) = [[1/2, √3/2], [−√3/2, 1/2]]⁻¹ (x′, y′) = [[1/2, −√3/2], [√3/2, 1/2]] (x′, y′) = x′ (1/2, √3/2) + y′ (−√3/2, 1/2).

We note that

(1/2, √3/2) = (cos(π/3), sin(π/3)) = cos(π/3) (1, 0) + sin(π/3) (0, 1),

and

(−√3/2, 1/2) = (− sin(π/3), cos(π/3)) = − sin(π/3) (1, 0) + cos(π/3) (0, 1) = cos(π/3) (0, 1) + sin(π/3) (−1, 0),

so the new coordinate system is obtained by rotating the axes by π/3.
The equation in terms of the new variables x′ and y′ is

[x′ y′] [[8, 0], [0, −20]] [x′; y′] = 8(x′)² − 20(y′)² = 40,

which can be written as

(x′)²/5 − (y′)²/2 = 1,

so the graph of the equation is a hyperbola.

Example 9.2.10. Apply the Principal Axes Theorem to the equation
\[
15x^2 + 2\sqrt{3}\,xy + 13y^2 = 48
\]
and classify the curve.


Solution. Since
\[
15x^2 + 2\sqrt{3}\,xy + 13y^2
= \begin{bmatrix} x & y \end{bmatrix}
\begin{bmatrix} 15 & \sqrt{3} \\ \sqrt{3} & 13 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} x & y \end{bmatrix}
\begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix}
\begin{bmatrix} 16 & 0 \\ 0 & 12 \end{bmatrix}
\begin{bmatrix} \frac{\sqrt{3}}{2} & \frac{1}{2} \\ -\frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix},
\]
the desired change of variables is
\[
\begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix}
\begin{bmatrix} x' \\ y' \end{bmatrix}
= x' \begin{bmatrix} \frac{\sqrt{3}}{2} \\ \frac{1}{2} \end{bmatrix}
+ y' \begin{bmatrix} -\frac{1}{2} \\ \frac{\sqrt{3}}{2} \end{bmatrix}.
\]
We note that
\[
\begin{bmatrix} \frac{\sqrt{3}}{2} \\ \frac{1}{2} \end{bmatrix}
= \begin{bmatrix} \cos\frac{\pi}{6} \\ \sin\frac{\pi}{6} \end{bmatrix}
= \cos\frac{\pi}{6} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
+ \sin\frac{\pi}{6} \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\]
and
\[
\begin{bmatrix} -\frac{1}{2} \\ \frac{\sqrt{3}}{2} \end{bmatrix}
= \begin{bmatrix} -\sin\frac{\pi}{6} \\ \cos\frac{\pi}{6} \end{bmatrix}
= -\sin\frac{\pi}{6} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
+ \cos\frac{\pi}{6} \begin{bmatrix} 0 \\ 1 \end{bmatrix}
= \cos\frac{\pi}{6} \begin{bmatrix} 0 \\ 1 \end{bmatrix}
+ \sin\frac{\pi}{6} \begin{bmatrix} -1 \\ 0 \end{bmatrix},
\]
so the new coordinate system is obtained by rotating the axes by $\frac{\pi}{6}$.

The equation in terms of the new variables $x'$ and $y'$ is
\[
\begin{bmatrix} x' & y' \end{bmatrix}
\begin{bmatrix} 16 & 0 \\ 0 & 12 \end{bmatrix}
\begin{bmatrix} x' \\ y' \end{bmatrix}
= 16(x')^2 + 12(y')^2 = 48
\]
or, equivalently,
\[
\frac{(x')^2}{3} + \frac{(y')^2}{4} = 1,
\]
so the graph of the equation is an ellipse.
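The classification in Examples 9.2.9 and 9.2.10 can be checked numerically. The sketch below (not from the book; it assumes Python with NumPy, and the function name `classify_conic` is ours) classifies the curve $\mathbf{x}^T A \mathbf{x} = d$, for $d > 0$, by the signs of the eigenvalues of $A$:

```python
import numpy as np

def classify_conic(A):
    """Classify the curve x^T A x = d (with d > 0) by the signs of the
    eigenvalues of the symmetric matrix A of the quadratic form."""
    lam = np.linalg.eigvalsh(A)   # eigenvalues in ascending order
    if lam[0] > 0:
        return "ellipse"
    if lam[0] < 0 < lam[1]:
        return "hyperbola"
    return "degenerate"

# Example 9.2.9: -13x^2 + 14*sqrt(3)*xy + y^2 = 40
A1 = np.array([[-13.0, 7 * np.sqrt(3)], [7 * np.sqrt(3), 1.0]])
# Example 9.2.10: 15x^2 + 2*sqrt(3)*xy + 13y^2 = 48
A2 = np.array([[15.0, np.sqrt(3)], [np.sqrt(3), 13.0]])

print(classify_conic(A1))  # hyperbola (eigenvalues -20 and 8)
print(classify_conic(A2))  # ellipse (eigenvalues 12 and 16)
```

Here `np.linalg.eigvalsh` exploits the symmetry of $A$, so the eigenvalues match the diagonal entries of $D$ found by hand above.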

The Cholesky decomposition

Definition 9.2.11. A $2 \times 2$ symmetric matrix $A$ is called positive definite if
\[
\mathbf{x}^T A \mathbf{x} > 0
\]
for all vectors $\mathbf{x}$ in $\mathbb{R}^2$ different from $\mathbf{0}$.

Note that the quadratic form $\mathbf{x}^T A \mathbf{x}$ is positive definite if and only if the matrix
$A$ is positive definite. The following theorem gives us a characterization of positive
definite matrices. It can be used to easily produce examples of such matrices.

Theorem 9.2.12 (Cholesky decomposition). Let $A$ be a symmetric $2 \times 2$ matrix. The following conditions are equivalent:

(a) $A$ is positive definite;

(b) There is a lower triangular $2 \times 2$ matrix $M$ with strictly positive elements
on the main diagonal such that
\[
A = M M^T.
\]

Proof. First we show that (a) implies (b). Since
\[
a_{11} = \begin{bmatrix} 1 & 0 \end{bmatrix}
\begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \end{bmatrix} > 0
\]
and
\[
\begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}
- \begin{bmatrix} \sqrt{a_{11}} \\ \frac{a_{21}}{\sqrt{a_{11}}} \end{bmatrix}
\begin{bmatrix} \sqrt{a_{11}} & \frac{a_{21}}{\sqrt{a_{11}}} \end{bmatrix}
= \begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}
- \begin{bmatrix} a_{11} & a_{21} \\ a_{21} & \frac{a_{21}^2}{a_{11}} \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & a_{22} - \frac{a_{21}^2}{a_{11}} \end{bmatrix},
\]
we have
\[
\begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
- \begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} \sqrt{a_{11}} \\ \frac{a_{21}}{\sqrt{a_{11}}} \end{bmatrix}
\begin{bmatrix} \sqrt{a_{11}} & \frac{a_{21}}{\sqrt{a_{11}}} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} 0 & 0 \\ 0 & a_{22} - \frac{a_{21}^2}{a_{11}} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix},
\]
which can be written as
\[
\begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
- \left( \sqrt{a_{11}}\,x_1 + \frac{a_{21}}{\sqrt{a_{11}}}\,x_2 \right)^2
= \left( a_{22} - \frac{a_{21}^2}{a_{11}} \right) x_2^2.
\]
This shows that the matrix $\begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}$ is positive definite if and only if $a_{11} > 0$ and
$a_{22} - \frac{a_{21}^2}{a_{11}} > 0$.
Now it is easy to verify that
\[
\begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}
= \begin{bmatrix} \sqrt{a_{11}} & 0 \\ \frac{a_{21}}{\sqrt{a_{11}}} & \sqrt{a_{22} - \frac{a_{21}^2}{a_{11}}} \end{bmatrix}
\begin{bmatrix} \sqrt{a_{11}} & \frac{a_{21}}{\sqrt{a_{11}}} \\ 0 & \sqrt{a_{22} - \frac{a_{21}^2}{a_{11}}} \end{bmatrix}.
\]

To prove that (b) implies (a) it suffices to observe that
\[
\mathbf{x}^T A \mathbf{x} = \mathbf{x}^T M M^T \mathbf{x} = \| M^T \mathbf{x} \|^2
\]
and that $\| M^T \mathbf{x} \| > 0$ if $\mathbf{x} \neq \mathbf{0}$, which is a consequence of the fact that the columns of
the matrix $M^T$ are linearly independent.
Definition 9.2.13. The decomposition of a $2 \times 2$ matrix $A$ in the form
\[
A = \begin{bmatrix} m_{11} & 0 \\ m_{21} & m_{22} \end{bmatrix}
\begin{bmatrix} m_{11} & m_{21} \\ 0 & m_{22} \end{bmatrix},
\]
where $m_{11} > 0$ and $m_{22} > 0$, is called the Cholesky decomposition of $A$.

Theorem 9.2.12 says that a symmetric $2 \times 2$ matrix is positive definite if and only if it has a
Cholesky decomposition.

Example 9.2.14. Find the Cholesky decomposition of the matrix $A = \begin{bmatrix} 3 & 1 \\ 1 & 5 \end{bmatrix}$.

Solution. We follow the method of the proof of Theorem 9.2.12. If we take $\mathbf{v} = \begin{bmatrix} \sqrt{3} \\ \frac{1}{\sqrt{3}} \end{bmatrix}$,
then
\[
\mathbf{v}\mathbf{v}^T = \begin{bmatrix} 3 & 1 \\ 1 & \frac{1}{3} \end{bmatrix}
\quad \text{and} \quad
A - \mathbf{v}\mathbf{v}^T = \begin{bmatrix} 0 & 0 \\ 0 & \frac{14}{3} \end{bmatrix}.
\]
Consequently, the Cholesky decomposition of $A$ is
\[
A = \begin{bmatrix} \sqrt{3} & 0 \\ \frac{\sqrt{3}}{3} & \frac{\sqrt{42}}{3} \end{bmatrix}
\begin{bmatrix} \sqrt{3} & \frac{\sqrt{3}}{3} \\ 0 & \frac{\sqrt{42}}{3} \end{bmatrix}.
\]

Example 9.2.15. Show that the matrix $A = \begin{bmatrix} 2 & 3 \\ 3 & 4 \end{bmatrix}$ is not positive definite.

Solution. We follow the method of the proof of Theorem 9.2.12. If we take $\mathbf{v} = \begin{bmatrix} \sqrt{2} \\ \frac{3}{\sqrt{2}} \end{bmatrix}$,
then
\[
\mathbf{v}\mathbf{v}^T = \begin{bmatrix} 2 & 3 \\ 3 & \frac{9}{2} \end{bmatrix}
\quad \text{and} \quad
A - \mathbf{v}\mathbf{v}^T = \begin{bmatrix} 0 & 0 \\ 0 & -\frac{1}{2} \end{bmatrix}.
\]
Consequently, the matrix $A$ is not positive definite because $-\frac{1}{2} < 0$.
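The two examples above follow the same recipe, which is easy to automate. The sketch below (not from the book; plain Python, with `cholesky_2x2` a name chosen here) implements the method of the proof of Theorem 9.2.12 for a symmetric matrix given by its entries:

```python
import math

def cholesky_2x2(a11, a21, a22):
    """Cholesky factor of the symmetric matrix [[a11, a21], [a21, a22]],
    following the proof of Theorem 9.2.12: A is positive definite iff
    a11 > 0 and a22 - a21**2 / a11 > 0, in which case A = M M^T with
    M = [[sqrt(a11), 0], [a21/sqrt(a11), sqrt(a22 - a21**2/a11)]]."""
    if a11 <= 0 or a22 - a21**2 / a11 <= 0:
        return None  # A is not positive definite
    m11 = math.sqrt(a11)
    m21 = a21 / m11
    m22 = math.sqrt(a22 - a21**2 / a11)
    return [[m11, 0.0], [m21, m22]]

# Example 9.2.14: A = [[3, 1], [1, 5]] is positive definite
print(cholesky_2x2(3, 1, 5))
# Example 9.2.15: A = [[2, 3], [3, 4]] is not
print(cholesky_2x2(2, 3, 4))  # None
```

Multiplying the returned factor by its transpose recovers the original matrix, which is a convenient way to check the computation.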

From the proof of Theorem 9.2.12 we get the following easy-to-use characterization
of $2 \times 2$ positive definite matrices.
Theorem 9.2.16. A matrix $\begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}$ is positive definite if and only if
\[
a_{11} > 0 \quad \text{and} \quad \det \begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix} > 0.
\]

Proof. In the proof of Theorem 9.2.12 we show that the matrix $\begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}$ is positive
definite if and only if $a_{11} > 0$ and $a_{22} - \frac{a_{21}^2}{a_{11}} > 0$. Our result is a consequence of the
equality
\[
a_{22} - \frac{a_{21}^2}{a_{11}} = \frac{1}{a_{11}} \det \begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}.
\]

Example 9.2.17. Show that the matrix $A = \begin{bmatrix} 3 & -1 \\ -1 & 5 \end{bmatrix}$ is positive definite.

Solution. Since $a_{11} = 3$ and $\det \begin{bmatrix} 3 & -1 \\ -1 & 5 \end{bmatrix} = 14$, the result follows from Theorem 9.2.16.

Example 9.2.18. Show that the matrix $A = \begin{bmatrix} 4 & 7 \\ 7 & 5 \end{bmatrix}$ is not positive definite.

Solution. It suffices to note that $\det \begin{bmatrix} 4 & 7 \\ 7 & 5 \end{bmatrix} = -8$.
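The test of Theorem 9.2.16 is a one-liner. A minimal sketch (not from the book; the function name is ours):

```python
def is_positive_definite_2x2(a11, a21, a22):
    """Test of Theorem 9.2.16 for the symmetric matrix [[a11, a21], [a21, a22]]:
    positive definite iff a11 > 0 and det A = a11*a22 - a21**2 > 0."""
    return a11 > 0 and a11 * a22 - a21**2 > 0

print(is_positive_definite_2x2(3, -1, 5))  # True  (Example 9.2.17)
print(is_positive_definite_2x2(4, 7, 5))   # False (Example 9.2.18)
```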

The LU-decomposition of a positive definite matrix

Theorem 9.2.19. A $2 \times 2$ symmetric matrix $A$ is positive definite if and only
if $A$ has the LU-decomposition
\[
A = \begin{bmatrix} 1 & 0 \\ l_{21} & 1 \end{bmatrix}
\begin{bmatrix} d_1 & l_{21} d_1 \\ 0 & d_2 \end{bmatrix}
\]
such that $d_1 > 0$ and $d_2 > 0$.


Proof. Suppose that the matrix $A$ is positive definite and has the Cholesky decomposition
\[
A = \begin{bmatrix} m_{11} & 0 \\ m_{21} & m_{22} \end{bmatrix}
\begin{bmatrix} m_{11} & m_{21} \\ 0 & m_{22} \end{bmatrix}.
\]
Then the LU-decomposition of $A$ is
\[
A = \begin{bmatrix} 1 & 0 \\ m_{21} m_{11}^{-1} & 1 \end{bmatrix}
\begin{bmatrix} m_{11}^2 & m_{21} m_{11} \\ 0 & m_{22}^2 \end{bmatrix}.
\]
Now suppose that we have
\[
A = \begin{bmatrix} 1 & 0 \\ l_{21} & 1 \end{bmatrix}
\begin{bmatrix} d_1 & l_{21} d_1 \\ 0 & d_2 \end{bmatrix},
\]
where $d_1 > 0$ and $d_2 > 0$. Then the Cholesky decomposition of $A$ is
\[
A = \begin{bmatrix} \sqrt{d_1} & 0 \\ l_{21} \sqrt{d_1} & \sqrt{d_2} \end{bmatrix}
\begin{bmatrix} \sqrt{d_1} & l_{21} \sqrt{d_1} \\ 0 & \sqrt{d_2} \end{bmatrix},
\]
which means that the matrix $A$ is positive definite.

The above result gives us a new method for calculating the Cholesky decomposition of a matrix.

Example 9.2.20. Using Theorem 9.2.19, find the Cholesky decomposition of the
matrix
\[
A = \begin{bmatrix} 3 & 2 \\ 2 & 7 \end{bmatrix}.
\]

Solution. Since the LU-decomposition of the matrix $A$ is
\[
A = \begin{bmatrix} 1 & 0 \\ \frac{2}{3} & 1 \end{bmatrix}
\begin{bmatrix} 3 & 2 \\ 0 & \frac{17}{3} \end{bmatrix},
\]
the Cholesky decomposition of $A$ is $A = M M^T$ where
\[
M = \begin{bmatrix} \sqrt{3} & 0 \\ \frac{2}{\sqrt{3}} & \sqrt{\frac{17}{3}} \end{bmatrix}.
\]
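The elimination route of Example 9.2.20 can be sketched in code (not from the book; `lu_to_cholesky` is a name chosen here). One elimination step produces $d_1$, $l_{21}$, and $d_2$, and the Cholesky factor is assembled from their square roots, as in Theorem 9.2.19:

```python
import math

def lu_to_cholesky(a11, a21, a22):
    """Cholesky factor of [[a11, a21], [a21, a22]] via the LU route of
    Theorem 9.2.19: one elimination step gives d1 = a11, l21 = a21/a11,
    d2 = a22 - l21*a21, and then M = [[sqrt(d1), 0],
    [l21*sqrt(d1), sqrt(d2)]]."""
    d1 = a11
    l21 = a21 / d1
    d2 = a22 - l21 * a21      # entry left after the elimination step
    if d1 <= 0 or d2 <= 0:
        return None           # A is not positive definite
    return [[math.sqrt(d1), 0.0],
            [l21 * math.sqrt(d1), math.sqrt(d2)]]

M = lu_to_cholesky(3, 2, 7)   # the matrix of Example 9.2.20
print(M)
```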

9.2.1 Exercises

Find the quadratic form associated with the given matrix.


1. $\begin{bmatrix} 2 & 1 \\ 1 & 7 \end{bmatrix}$

2. $\begin{bmatrix} 7 & 9 \\ 9 & 3 \end{bmatrix}$

3. $\begin{bmatrix} -2 & -4 \\ -4 & 1 \end{bmatrix}$

4. $\begin{bmatrix} -2 & -2 \\ -2 & 3 \end{bmatrix}$

Find the matrix associated with the given quadratic form.

5. $3x^2 + 14xy + 2y^2$

6. $9x^2 + 8xy + y^2$

7. $x^2 + xy + 4y^2$

8. $-x^2 - 3xy + y^2$

Classify the given quadratic form.

9. $\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} -2 & -2 \\ -2 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$

10. $\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$

11. $\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} -2 & 1 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$

12. $\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 0 & 4 \\ 4 & 15 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$

Find the Cholesky decomposition of the given matrix using the method from the
proof of Theorem 9.2.12.

13. $A = \begin{bmatrix} 2 & -3 \\ -3 & 7 \end{bmatrix}$

14. $A = \begin{bmatrix} 1 & 1 \\ 1 & 5 \end{bmatrix}$

Determine if the given matrix is positive definite using Theorem 9.2.16.

15. $\begin{bmatrix} 3 & -3 \\ -3 & 5 \end{bmatrix}$

16. $\begin{bmatrix} -2 & 4 \\ 4 & 9 \end{bmatrix}$

17. $\begin{bmatrix} 3 & 5 \\ 5 & 7 \end{bmatrix}$

18. $\begin{bmatrix} 5 & -7 \\ -7 & 10 \end{bmatrix}$

Determine the Cholesky decomposition using the method from Example 9.2.20.

19. $A = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}$

20. $A = \begin{bmatrix} 4 & -5 \\ -5 & 7 \end{bmatrix}$

For the given matrix $A$ complete the following:

(a) Find an orthogonal matrix $P = \begin{bmatrix} p_{11} & -p_{21} \\ p_{21} & p_{11} \end{bmatrix}$ satisfying the given conditions
and such that the quadratic form $\begin{bmatrix} x' & y' \end{bmatrix} P^T A P \begin{bmatrix} x' \\ y' \end{bmatrix}$ has no cross-product term.

(b) Calculate $\begin{bmatrix} x' & y' \end{bmatrix} P^T A P \begin{bmatrix} x' \\ y' \end{bmatrix}$.

(c) Determine the rotation which rotates $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ to $\begin{bmatrix} p_{11} \\ p_{21} \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ to $\begin{bmatrix} -p_{21} \\ p_{11} \end{bmatrix}$ and the
angle of that rotation.
21. $A = \begin{bmatrix} -3 & 3 \\ 3 & 5 \end{bmatrix}$, $p_{11} > 0$, $p_{21} > 0$.

22. $A = \begin{bmatrix} 8 & 2 \\ 2 & 11 \end{bmatrix}$, $p_{11} > 0$, $p_{21} > 0$.

23. $A = \begin{bmatrix} -3 & 3 \\ 3 & 5 \end{bmatrix}$, $p_{11} < 0$, $p_{21} > 0$.

24. $A = \begin{bmatrix} 8 & 2 \\ 2 & 11 \end{bmatrix}$, $p_{11} > 0$, $p_{21} < 0$.

25. $A = \begin{bmatrix} -3 & 3 \\ 3 & 5 \end{bmatrix}$, $p_{11} > 0$, $p_{21} < 0$.

26. $A = \begin{bmatrix} 8 & 2 \\ 2 & 11 \end{bmatrix}$, $p_{11} < 0$, $p_{21} > 0$.

27. $A = \begin{bmatrix} -3 & 3 \\ 3 & 5 \end{bmatrix}$, $p_{11} < 0$, $p_{21} < 0$.

28. $A = \begin{bmatrix} 8 & 2 \\ 2 & 11 \end{bmatrix}$, $p_{11} < 0$, $p_{21} < 0$.

29. Consider the equation $3x^2 - 10xy + 27y^2 = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 3 & -5 \\ -5 & 27 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 14$.

(a) Find an orthogonal matrix $P = \begin{bmatrix} p_{11} & -p_{21} \\ p_{21} & p_{11} \end{bmatrix}$ such that $p_{11} > 0$, $p_{21} > 0$,
and such that the quadratic form $\begin{bmatrix} x' & y' \end{bmatrix} P^T \begin{bmatrix} 3 & -5 \\ -5 & 27 \end{bmatrix} P \begin{bmatrix} x' \\ y' \end{bmatrix}$ has no cross-product
term, where $P \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix}$.

(b) Express the equation $\begin{bmatrix} x' & y' \end{bmatrix} P^T \begin{bmatrix} 3 & -5 \\ -5 & 27 \end{bmatrix} P \begin{bmatrix} x' \\ y' \end{bmatrix} = 14$ in the standard form,
that is, $a(x')^2 + b(y')^2 = 1$.

(c) Determine the rotation which rotates $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ to $\begin{bmatrix} p_{11} \\ p_{21} \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ to $\begin{bmatrix} -p_{21} \\ p_{11} \end{bmatrix}$ and
the angle of that rotation.

30. Consider the equation $3x^2 + 16xy + 33y^2 = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 3 & 8 \\ 8 & 33 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 22$.

(a) Find an orthogonal matrix $P = \begin{bmatrix} p_{11} & -p_{21} \\ p_{21} & p_{11} \end{bmatrix}$ such that $p_{11} > 0$, $p_{21} > 0$,
and such that the quadratic form $\begin{bmatrix} x' & y' \end{bmatrix} P^T \begin{bmatrix} 3 & 8 \\ 8 & 33 \end{bmatrix} P \begin{bmatrix} x' \\ y' \end{bmatrix}$ has no cross-product
term, where $P \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix}$.

(b) Express the equation $\begin{bmatrix} x' & y' \end{bmatrix} P^T \begin{bmatrix} 3 & 8 \\ 8 & 33 \end{bmatrix} P \begin{bmatrix} x' \\ y' \end{bmatrix} = 22$ in the standard form,
that is, $a(x')^2 + b(y')^2 = 1$.

(c) Determine the rotation which rotates $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ to $\begin{bmatrix} p_{11} \\ p_{21} \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ to $\begin{bmatrix} -p_{21} \\ p_{11} \end{bmatrix}$ and
the angle of that rotation.

9.3 Rotations in $\mathbb{R}^3$

In Section 9.1 we were interested in rotations of vectors in $\mathbb{R}^2$ about the origin. The
original vector $\mathbf{a}$ and the rotated vector $\mathbf{b}$ were elements of $\mathbb{R}^2$, which can be thought
of as a vector plane in $\mathbb{R}^3$, namely the plane $\mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right\}$. Now we would like
to consider rotations about the origin in an arbitrary plane in $\mathbb{R}^3$.

When we described rotations in $\mathbb{R}^2$, it was clear what was meant by "counterclockwise
rotation" and "clockwise rotation". When an arbitrary plane in $\mathbb{R}^3$ is considered,
we use the "right-hand rule" to specify the direction of a rotation.

Figure 9.5: The right-hand rule.

It turns out that the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{a} \times \mathbf{b}$ are like the fingers of your right hand
with the index finger pointing in the direction of $\mathbf{a}$, the middle finger pointing in the
direction of $\mathbf{b}$, and your thumb pointing in the direction of $\mathbf{a} \times \mathbf{b}$ (see Fig. 9.5). We
will accept this fact for now and come back to it at the end of this chapter.

Rotations in a vector plane in $\mathbb{R}^3$

If a vector $\mathbf{a}$ in $\mathbb{R}^2$ is rotated counterclockwise about the origin by an angle $\alpha$, the
result is the vector $\mathbf{b} = \cos\alpha\,\mathbf{a} + \sin\alpha\,\mathbf{a}^\perp$. The fact that the vector $\mathbf{a}^\perp$ is orthogonal
to $\mathbf{a}$ is important here, but note that there is another vector orthogonal to $\mathbf{a}$, namely
$-\mathbf{a}^\perp$. From these two vectors we identify $\mathbf{a}^\perp$ as the vector obtained when $\mathbf{a}$ is rotated
counterclockwise about the origin by the angle $\frac{\pi}{2}$. In this section we generalize this
setup to an arbitrary vector plane in $\mathbb{R}^3$.

Recall that an arbitrary vector plane in $\mathbb{R}^3$ can be described by the equation
$\mathbf{n} \cdot \mathbf{x} = 0$, where $\mathbf{n}$ is a vector different from the origin. Now consider a vector $\mathbf{a} \neq \mathbf{0}$
in that plane, that is, $\mathbf{n} \cdot \mathbf{a} = 0$. We want to rotate $\mathbf{a}$ about the vector line $\mathrm{Span}\{\mathbf{n}\}$
oriented by the vector $\mathbf{n}$ in the plane $\mathbf{n} \cdot \mathbf{x} = 0$ and describe the resulting vector by
a formula analogous to the one in $\mathbb{R}^2$, that is, $\mathbf{b} = \cos\alpha\,\mathbf{a} + \sin\alpha\,\mathbf{a}^\perp$. Since the perp
operation is not available here, we need to find a replacement.
First suppose that $\mathbf{a} = s\mathbf{e} + t\mathbf{f}$, where $\{\mathbf{e}, \mathbf{f}\}$ is an orthonormal basis in the plane
$\mathbf{n} \cdot \mathbf{x} = 0$, where $\mathbf{n} = \mathbf{e} \times \mathbf{f}$. It will be natural to define
\[
\mathbf{a}^{\perp_{\mathbf{n}}} = -t\mathbf{e} + s\mathbf{f}.
\]

Figure 9.6: The vector $\mathbf{a}^{\perp_{\mathbf{n}}} = -t\mathbf{e} + s\mathbf{f}$.

The defined vector $\mathbf{a}^{\perp_{\mathbf{n}}}$ should be independent of the choice of an orthonormal
basis in the plane $\mathbf{n} \cdot \mathbf{x} = 0$. Indeed, we have
\begin{align*}
-t\mathbf{e} + s\mathbf{f} &= s\mathbf{f} - t\mathbf{e} \\
&= s(\mathbf{e} \times \mathbf{f}) \times \mathbf{e} + t(\mathbf{e} \times \mathbf{f}) \times \mathbf{f} \\
&= (\mathbf{e} \times \mathbf{f}) \times (s\mathbf{e} + t\mathbf{f}) \\
&= \mathbf{n} \times (s\mathbf{e} + t\mathbf{f}) \\
&= \mathbf{n} \times \mathbf{a},
\end{align*}
where we use the identity
\[
(\mathbf{x} \times \mathbf{y}) \times \mathbf{z} = (\mathbf{x} \cdot \mathbf{z})\mathbf{y} - (\mathbf{y} \cdot \mathbf{z})\mathbf{x}.
\]
The vector $\mathbf{n} \times \mathbf{a}$ is clearly independent of the choice of an orthonormal basis in the
plane $\mathbf{n} \cdot \mathbf{x} = 0$. Moreover, since $\mathbf{n} \times \mathbf{a} = -t\mathbf{e} + s\mathbf{f}$, it agrees with our intuition of what
the vector $\mathbf{a}^{\perp_{\mathbf{n}}}$ should be.

Definition 9.3.1. For any vector $\mathbf{a}$ in $\mathbb{R}^3$ and any unit vector $\mathbf{n}$ in $\mathbb{R}^3$ we define
\[
\mathbf{a}^{\perp_{\mathbf{n}}} = \mathbf{n} \times \mathbf{a}.
\]

Note that, if $\mathbf{n} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ and $\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \\ 0 \end{bmatrix}$, then $\mathbf{a}^{\perp_{\mathbf{n}}} = \begin{bmatrix} -a_2 \\ a_1 \\ 0 \end{bmatrix}$, which gives us the familiar
perp operation in $\mathbb{R}^2$.
We are now ready to generalize the formula $\mathbf{b} = \cos\alpha\,\mathbf{a} + \sin\alpha\,\mathbf{a}^\perp$ to $\mathbb{R}^3$.
Figure 9.7: Vectors $\mathbf{n}$, $\mathbf{a}$, and $\mathbf{a}^{\perp_{\mathbf{n}}} = \mathbf{n} \times \mathbf{a}$.

Definition 9.3.2. Let $\mathbf{a}$ be a vector in the plane $\mathbf{n} \cdot \mathbf{x} = 0$, where $\|\mathbf{n}\| = 1$. By
the vector obtained by counterclockwise rotation by an angle $\alpha$ of $\mathbf{a}$ about the
vector line $\mathrm{Span}\{\mathbf{n}\}$ oriented by the vector $\mathbf{n}$ we mean the vector
\[
\cos\alpha\,\mathbf{a} + \sin\alpha\,\mathbf{a}^{\perp_{\mathbf{n}}} = \cos\alpha\,\mathbf{a} + \sin\alpha\,(\mathbf{n} \times \mathbf{a}).
\]

Figure 9.8: The vector $\cos\alpha\,\mathbf{a} + \sin\alpha\,\mathbf{a}^{\perp_{\mathbf{n}}}$.

Note that for $\alpha = \frac{\pi}{2}$ we have
\[
\cos\frac{\pi}{2}\,\mathbf{a} + \sin\frac{\pi}{2}\,\mathbf{a}^{\perp_{\mathbf{n}}} = \mathbf{a}^{\perp_{\mathbf{n}}},
\]
so $\mathbf{a}^{\perp_{\mathbf{n}}}$ is the vector obtained by rotating $\mathbf{a}$ in the plane $\mathbf{n} \cdot \mathbf{x} = 0$ about the vector line
$\mathrm{Span}\{\mathbf{n}\}$ oriented by the vector $\mathbf{n}$ by the angle $\frac{\pi}{2}$.
If $\mathbf{n}$ is an arbitrary nonzero vector in $\mathbb{R}^3$, then the vector obtained by rotating $\mathbf{a}$
about the vector line $\mathrm{Span}\{\mathbf{n}\}$ oriented by the vector $\mathbf{n}$ by an angle $\alpha$ is
\[
\cos\alpha\,\mathbf{a} + \sin\alpha\,\frac{1}{\|\mathbf{n}\|}(\mathbf{n} \times \mathbf{a}).
\]

If $p$ and $q$ is any pair of numbers such that $p^2 + q^2 = 1$ and $\mathbf{n}$ a unit vector in $\mathbb{R}^3$,
then for any vector $\mathbf{a}$ in the plane $\mathbf{x} \cdot \mathbf{n} = 0$ the vector
\[
p\mathbf{a} + q(\mathbf{n} \times \mathbf{a})
\]
is the counterclockwise rotation of $\mathbf{a}$ about the vector line $\mathrm{Span}\{\mathbf{n}\}$ oriented by the
vector $\mathbf{n}$ by an angle $\alpha$, where $\alpha$ is defined by $p = \cos\alpha$ and $q = \sin\alpha$.

   
Example 9.3.3. Let $\mathbf{n} = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$ and $\mathbf{a} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$. Find the vector $\mathbf{b}$ obtained by rotating
$\mathbf{a}$ counterclockwise by the angle $\alpha = \frac{\pi}{4}$ in the plane $\mathbf{n} \cdot \mathbf{x} = 0$ about the vector line
$\mathrm{Span}\{\mathbf{n}\}$ oriented by the vector $\mathbf{n}$.

Solution. We know that
\[
\mathbf{b} = \cos\frac{\pi}{4}\,\mathbf{a} + \sin\frac{\pi}{4}\,\mathbf{a}^{\perp_{\mathbf{n}}} = \frac{1}{\sqrt{2}}\,\mathbf{a} + \frac{1}{\sqrt{2}}\,\mathbf{a}^{\perp_{\mathbf{n}}},
\]
so we only need to find
\[
\mathbf{a}^{\perp_{\mathbf{n}}} = \frac{1}{\|\mathbf{n}\|}(\mathbf{n} \times \mathbf{a}).
\]
Since
\[
\|\mathbf{n}\| = \sqrt{1^2 + (-1)^2 + 1^2} = \sqrt{3}
\]
and
\[
\mathbf{n} \times \mathbf{a} = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \times \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} -4 \\ -1 \\ 3 \end{bmatrix},
\]
we have
\[
\mathbf{a}^{\perp_{\mathbf{n}}} = \frac{1}{\sqrt{3}} \begin{bmatrix} -4 \\ -1 \\ 3 \end{bmatrix}.
\]
Consequently,
\[
\mathbf{b} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} + \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{3}} \begin{bmatrix} -4 \\ -1 \\ 3 \end{bmatrix}.
\]

Rotations in an arbitrary plane in $\mathbb{R}^3$

In the previous section we considered a vector plane $\mathbf{n} \cdot \mathbf{x} = 0$ and a vector $\mathbf{a}$ in that
plane that is different from the origin. Now we would like to investigate the case
when $\mathbf{a}$ is not in the plane $\mathbf{n} \cdot \mathbf{x} = 0$. In this case we will consider rotating $\mathbf{a}$ about the
vector line $\mathrm{Span}\{\mathbf{n}\}$ oriented by the vector $\mathbf{n}$.

Let $\mathbf{a}$ and $\mathbf{n}$ be vectors in $\mathbb{R}^3$ different from the origin. To rotate $\mathbf{a}$ about the axis
$\mathrm{Span}\{\mathbf{n}\}$ oriented by the vector $\mathbf{n}$ we first decompose $\mathbf{a}$ into the component of $\mathbf{a}$ that
is on the vector line $\mathrm{Span}\{\mathbf{n}\}$ and the component of $\mathbf{a}$ that is perpendicular to that
Figure 9.9: Rotating $\mathbf{a}$ about the vector line $\mathrm{Span}\{\mathbf{n}\}$.

line. Note that the component of a on Span{n} does not change when rotated about
that line.
a·n
Recall that the projection of a onto the line Span{n} is n (Theorem 4.2.10)
knk2
a·n
and the projection of a onto the plane x·n = 0 is a − n (from the proof of Theo-
knk2
a·n
rem 8.1.17). The rotation of a − n in the plane x·n = 0 by the angle α about the
knk2
vector line Span{n} oriented by the vector n is

a·n a·n
µ ¶ µ µ ¶¶
1
cos α a − n + sin α n × a − n
knk2 knk knk2

which reduces to
a·n
µ ¶
1
cos α a − n + sin α (n × a) ,
knk2 knk
because n × n = 0. Consequently, the rotation of a about the axis Span{n} oriented
by the vector n by angle α is

a·n a·n
µ ¶
1
· n + cos α a − ·n + sin α (n × a) .
knk2 knk2 knk

Example 9.3.4. Let
\[
\mathbf{n} = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}
\quad \text{and} \quad
\mathbf{a} = \begin{bmatrix} 2 \\ -3 \\ 5 \end{bmatrix}.
\]
We would like to find the vector $\mathbf{b}$ obtained by rotating $\mathbf{a}$ about the vector line
$\mathrm{Span}\{\mathbf{n}\}$ oriented by the vector $\mathbf{n}$ by the angle $\alpha = \frac{\pi}{6}$.
Since
\[
\|\mathbf{n}\| = \sqrt{3}
\quad \text{and} \quad
\mathbf{a} \cdot \mathbf{n} = \begin{bmatrix} 2 \\ -3 \\ 5 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} = -6,
\]
the projection of $\mathbf{a}$ onto the vector line $\mathrm{Span}\{\mathbf{n}\}$ is
\[
\frac{\mathbf{a} \cdot \mathbf{n}}{\|\mathbf{n}\|^2}\,\mathbf{n}
= \frac{-6}{3} \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}
= \begin{bmatrix} -2 \\ -2 \\ 2 \end{bmatrix}
\]
and the projection of $\mathbf{a}$ onto the plane $\mathbf{x} \cdot \mathbf{n} = 0$ is
\[
\mathbf{a} - \frac{\mathbf{a} \cdot \mathbf{n}}{\|\mathbf{n}\|^2}\,\mathbf{n}
= \begin{bmatrix} 2 \\ -3 \\ 5 \end{bmatrix} - \begin{bmatrix} -2 \\ -2 \\ 2 \end{bmatrix}
= \begin{bmatrix} 4 \\ -1 \\ 3 \end{bmatrix}.
\]
Since
\[
\mathbf{n} \times \mathbf{a}
= \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \times \begin{bmatrix} 2 \\ -3 \\ 5 \end{bmatrix}
= \begin{bmatrix} 2 \\ -7 \\ -5 \end{bmatrix},
\]
we have
\begin{align*}
\mathbf{b} &= \frac{\mathbf{a} \cdot \mathbf{n}}{\|\mathbf{n}\|^2}\,\mathbf{n}
+ \cos\alpha \left( \mathbf{a} - \frac{\mathbf{a} \cdot \mathbf{n}}{\|\mathbf{n}\|^2}\,\mathbf{n} \right)
+ \sin\alpha\,\frac{1}{\|\mathbf{n}\|} (\mathbf{n} \times \mathbf{a}) \\
&= \begin{bmatrix} -2 \\ -2 \\ 2 \end{bmatrix}
+ \cos\frac{\pi}{6} \begin{bmatrix} 4 \\ -1 \\ 3 \end{bmatrix}
+ \frac{1}{\sqrt{3}} \sin\frac{\pi}{6} \begin{bmatrix} 2 \\ -7 \\ -5 \end{bmatrix} \\
&= \begin{bmatrix} -2 \\ -2 \\ 2 \end{bmatrix}
+ \frac{\sqrt{3}}{2} \begin{bmatrix} 4 \\ -1 \\ 3 \end{bmatrix}
+ \frac{1}{2\sqrt{3}} \begin{bmatrix} 2 \\ -7 \\ -5 \end{bmatrix} \\
&= \begin{bmatrix} -2 + \frac{7\sqrt{3}}{3} \\[4pt] -2 - \frac{5\sqrt{3}}{3} \\[4pt] 2 + \frac{2\sqrt{3}}{3} \end{bmatrix}.
\end{align*}
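The decomposition used in this section translates directly into code. The sketch below (not from the book; it assumes NumPy, and `rotate_about_axis` is a name chosen here) keeps the component of $\mathbf{a}$ along $\mathbf{n}$, rotates the rest, and reproduces the vector $\mathbf{b}$ of Example 9.3.4:

```python
import numpy as np

def rotate_about_axis(a, n, alpha):
    """Rotate a about the line Span{n} oriented by n by angle alpha:
    keep the component of a along n, rotate the component in the
    plane x . n = 0 (the formula derived in this section)."""
    a = np.asarray(a, dtype=float)
    n = np.asarray(n, dtype=float)
    proj = (a @ n) / (n @ n) * n              # component along Span{n}
    return proj + np.cos(alpha) * (a - proj) \
                + np.sin(alpha) * np.cross(n, a) / np.linalg.norm(n)

# Example 9.3.4
b = rotate_about_axis([2, -3, 5], [1, 1, -1], np.pi / 6)
expected = np.array([-2 + 7 * np.sqrt(3) / 3,
                     -2 - 5 * np.sqrt(3) / 3,
                      2 + 2 * np.sqrt(3) / 3])
print(np.allclose(b, expected))  # True
```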

9.3.1 Exercises

Find the rotation of the vector $\mathbf{a}$ about the vector line through $\mathbf{n}$ oriented by $\mathbf{n}$.

1. $\mathbf{a} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{n} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$

2. $\mathbf{a} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{n} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$

3. $\mathbf{a} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, $\mathbf{n} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$

4. $\mathbf{a} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, $\mathbf{n} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$

5. $\mathbf{a} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, $\mathbf{n} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$

6. $\mathbf{a} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, $\mathbf{n} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$

7. Show that the rotation of the vector $\mathbf{a}$ about the vector line through $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ oriented
by $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ is $\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix} \mathbf{a}$.

8. Show that the rotation of the vector $\mathbf{a}$ about the vector line through $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ oriented
by $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ is $\begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \mathbf{a}$.

9. Show that the rotation of the vector $\mathbf{a}$ about the vector line through $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$ oriented
by $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$ is $\begin{bmatrix} \cos\alpha & 0 & \sin\alpha \\ 0 & 1 & 0 \\ -\sin\alpha & 0 & \cos\alpha \end{bmatrix} \mathbf{a}$.

9.4 Cross product and the right-hand rule

Now we would like to offer a different intuitive interpretation of the cross product.
The idea is to start with any two orthogonal vectors $\mathbf{n}$ and $\mathbf{a}$ of norm one and, by
using only rotations, transform the space so that $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ becomes $\mathbf{n}$ and $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ becomes
$\mathbf{a}$. Since $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \times \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, the cross product $\mathbf{n} \times \mathbf{a}$ will be the new position of $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$.
This shows that the relative position of the vectors $\mathbf{a}$, $\mathbf{n} \times \mathbf{a}$, and $\mathbf{n}$ is the same as the
relative position of the vectors $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, and $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, which justifies the right-hand
rule. It is important to note that rotations do not change the relative position of $\mathbf{n}$
and $\mathbf{a}$.
In order to obtain the intuitive interpretation of the cross product we proceed in
the following way:
Since
\[
\frac{n_2}{\sqrt{n_1^2+n_2^2}} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
- \frac{n_1}{\sqrt{n_1^2+n_2^2}} \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix}
\]
and
\[
\frac{n_1}{\sqrt{n_1^2+n_2^2}} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
+ \frac{n_2}{\sqrt{n_1^2+n_2^2}} \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} \frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix},
\]
when we rotate the vector $\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[2pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[2pt] 0 \end{bmatrix}$ counterclockwise by $\frac{\pi}{2}$ about the $z$-axis
oriented by the vector $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, we obtain the vector $\begin{bmatrix} \frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[2pt] \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[2pt] 0 \end{bmatrix}$.

Consequently, if we define the angle $\alpha$ by
\[
\frac{n_2}{\sqrt{n_1^2+n_2^2}} = \cos\alpha
\quad \text{and} \quad
-\frac{n_1}{\sqrt{n_1^2+n_2^2}} = \sin\alpha,
\]
then when we rotate the vector $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ counterclockwise by $\alpha$ about the $z$-axis oriented by
the vector $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, we obtain the vector $\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[2pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[2pt] 0 \end{bmatrix}$, and when we rotate the vector $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$
counterclockwise by $\alpha$ about the $z$-axis oriented by the vector $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, we obtain the
vector $\begin{bmatrix} \frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[2pt] \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[2pt] 0 \end{bmatrix}$.

We summarize the effect of the rotation by the angle $\alpha$ in the following table.
\[
\begin{array}{c|c}
\text{Original position} & \text{After the first rotation} \\[4pt]
\hline \\[-6pt]
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} &
\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix} \\[24pt]
\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} &
\begin{bmatrix} \frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix} \\[24pt]
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} &
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\end{array}
\]

Step 2

The second step is the counterclockwise rotation about the vector line through
$\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[2pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[2pt] 0 \end{bmatrix}$
until $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ becomes $\mathbf{n} = \begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix}$. Note that this is possible since both $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ and
$\mathbf{n} = \begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix}$ are orthogonal to $\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[2pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[2pt] 0 \end{bmatrix}$. Since
\[
\mathbf{n} = \begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix}
= n_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
+ \sqrt{n_1^2+n_2^2} \begin{bmatrix} \frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix}
\]
and
\[
\begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix}
= -\sqrt{n_1^2+n_2^2} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
+ n_3 \begin{bmatrix} \frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix},
\]
when we rotate the vector $\mathbf{n} = \begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix}$ counterclockwise by $\frac{\pi}{2}$ about the vector line
through $\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[2pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[2pt] 0 \end{bmatrix}$ oriented by the vector $\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[2pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[2pt] 0 \end{bmatrix}$, we obtain
the vector $\begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[2pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[2pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix}$.

Consequently, if we define the angle $\beta$ by
\[
n_3 = \cos\beta \quad \text{and} \quad \sqrt{n_1^2+n_2^2} = \sin\beta,
\]
then when we rotate the vector $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ counterclockwise by $\beta$ about that oriented vector
line, we obtain the vector $\begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix}$, and when we rotate the vector $\begin{bmatrix} \frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[2pt] \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[2pt] 0 \end{bmatrix}$
counterclockwise by $\beta$ about the same oriented vector line, we obtain the vector
$\begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[2pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[2pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix}$.

We summarize the cumulative effect of both rotations by $\alpha$ and $\beta$ in the following
table.
\[
\begin{array}{c|c|c}
\text{Original position} & \text{After the first rotation} & \text{After the first two rotations} \\[4pt]
\hline \\[-6pt]
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} &
\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix} &
\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix} \\[24pt]
\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} &
\begin{bmatrix} \frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix} &
\begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix} \\[24pt]
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} &
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} &
\begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix}
\end{array}
\]

Step 3

In this step the vector line through $\mathbf{n}$ is the axis of rotation. Since $\mathbf{a}$ is in the vector
plane $\mathbf{x} \cdot \mathbf{n} = 0$ and $\|\mathbf{a}\| = 1$, there are real numbers $p$ and $q$ such that $p^2 + q^2 = 1$ and
\[
\mathbf{a} = p \begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix}
+ q \begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix}.
\]
Now, since
\[
\mathbf{n} \times \begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix}
= \begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix} \times \begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix}
= \begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix} \tag{9.6}
\]
and
\[
\mathbf{n} \times \begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix}
= \begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix} \times \begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix}
= \begin{bmatrix} -\frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix}, \tag{9.7}
\]
we get
\begin{align*}
\mathbf{n} \times \mathbf{a}
&= \mathbf{n} \times \left( p \begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix}
+ q \begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix} \right) \\
&= p \left( \mathbf{n} \times \begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix} \right)
+ q \left( \mathbf{n} \times \begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix} \right) \\
&= -q \begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix}
+ p \begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix}.
\end{align*}
Note that $\mathbf{a}$ and $\mathbf{n} \times \mathbf{a}$ are both in the plane orthogonal to $\mathbf{n}$, and $\mathbf{n} \times \mathbf{a}$ can be
obtained from $\mathbf{a}$ by counterclockwise rotation in that plane by $\frac{\pi}{2}$ about the vector
line $\mathrm{Span}\{\mathbf{n}\}$ oriented by the vector $\mathbf{n}$.

Consequently, if we define the angle $\gamma$ by
\[
p = \cos\gamma \quad \text{and} \quad q = \sin\gamma,
\]
then when we rotate the vector $\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[2pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[2pt] 0 \end{bmatrix}$ counterclockwise by $\gamma$ about the vector
line $\mathrm{Span}\{\mathbf{n}\}$ oriented by the vector $\mathbf{n}$, we obtain the vector $\mathbf{a}$, and when we rotate the
vector $\begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[2pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[2pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix}$ counterclockwise by $\gamma$ about the vector line $\mathrm{Span}\{\mathbf{n}\}$ oriented
by the vector $\mathbf{n}$, we obtain the vector $\mathbf{n} \times \mathbf{a}$.
We summarize the cumulative effect of these three rotations (by $\alpha$, $\beta$, and $\gamma$) in
the following table.

\[
\begin{array}{c|c|c|c}
\text{Original position} & \text{After the first rotation} & \text{After the first two rotations} & \text{After three rotations} \\[4pt]
\hline \\[-6pt]
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} &
\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix} &
\begin{bmatrix} \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix} &
\mathbf{a} \\[24pt]
\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} &
\begin{bmatrix} \frac{n_1}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2}{\sqrt{n_1^2+n_2^2}} \\[4pt] 0 \end{bmatrix} &
\begin{bmatrix} \frac{n_1 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] \frac{n_2 n_3}{\sqrt{n_1^2+n_2^2}} \\[4pt] -\sqrt{n_1^2+n_2^2} \end{bmatrix} &
\mathbf{n} \times \mathbf{a} \\[24pt]
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} &
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} &
\begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix} &
\mathbf{n}
\end{array}
\]
Chapter 10

Problems in plane geometry

Classical plane geometry gives us an attractive opportunity to apply and understand
linear algebra. In this chapter we use tools provided by linear algebra to solve
nontrivial problems in plane geometry. The main purpose of the chapter is to give
students a chance to practice newly acquired skills in the familiar context of geometry.
We also hope that students will appreciate the elegance of algebraic solutions.
The applications are presented in the form of problems with complete solutions.
With few exceptions, the tools used here were introduced in Chapters 3. The solutions
are presented with fewer details than in the rest of the book. The reader is
expected to work through the arguments and fill in the finer details.
In this chapter, following the standard notation in plane geometry, points in R2
will be denoted with capital letters A, B,C , . . . , X , Y , Z instead of a, b, c, . . . , x, y, z. As
before, we will identify points in R2 with vectors in R2 . This will allow us to translate
a purely geometric problem to a problem in linear algebra and then use the power
of linear algebra to solve the problem in an elegant way.

10.1 Lines and circles


Let A and B be two distinct points in R2 . The line segment connecting A and B will
be denoted by AB .

Problem 10.1.1. Let $A$, $B$, and $C$ be distinct points in $\mathbb{R}^2$ such that the angle
$\angle ACB$ is a right angle. Show that $C$ is on the circle with diameter $AB$.

Solution 1. The center of the circle is at $\frac{A+B}{2}$. It suffices to show that
\[
\left\| A - \frac{A+B}{2} \right\| = \left\| C - \frac{A+B}{2} \right\|.
\]


Figure 10.1: Problem 10.1.1.

Since the angle $\angle ACB$ is a right angle, we have $0 = (C - A) \cdot (C - B)$. Hence,
\begin{align*}
0 &= (C - A) \cdot (C - B) \\
&= \left( C - \frac{A+B}{2} + \frac{A+B}{2} - A \right) \cdot \left( C - \frac{A+B}{2} + \frac{A+B}{2} - B \right) \\
&= \left( C - \frac{A+B}{2} + \frac{B-A}{2} \right) \cdot \left( C - \frac{A+B}{2} - \frac{B-A}{2} \right) \\
&= \left\| C - \frac{A+B}{2} \right\|^2 - \left\| \frac{A-B}{2} \right\|^2 \\
&= \left\| C - \frac{A+B}{2} \right\|^2 - \left\| A - \frac{A+B}{2} \right\|^2,
\end{align*}
which gives us the desired equality.

It turns out that the algebraic part of this argument can be significantly simpli-
fied. Since the described property does not depend on the position of the triangle
relative to the origin, we can choose its position to simplify calculations. In this case
it is most convenient to assume that the middle of the line segment AB is at the
origin. Then the solution becomes significantly simpler.

Solution 2. If we take the middle of the segment $AB$ as the origin, then we have
$B = -A$ and hence
\[
0 = (C - A) \cdot (C - B) = (C - A) \cdot (C + A) = C \cdot C - A \cdot A = \|C\|^2 - \|A\|^2,
\]
which gives us the desired equality.

As we can see, the algebraic part of the proof has been reduced to a single line.
In the remaining examples in this chapter we will always try to find a position that
gives us the simplest calculations. Sometimes it is not entirely obvious what that
position is. In such a case we might have to try a couple of different positions before
we discover the best one. On the other hand, in some examples it seems that there
is no advantage in choosing a special position.
Figure 10.2: Problem 10.1.2.

Problem 10.1.2. Let $P$ be a point on a circle with the center at $C$. Show that
the tangent to the circle at $P$ is orthogonal to the line through the points $P$ and
$C$.

Solution. Let $A$ be a point such that $\|A\| = 1$. The common points of the circle and
the line $P + \mathrm{Span}\{A\}$, if such points exist, are given by the equation
\[
\|P + tA - C\| = \|P - C\|,
\]
or, equivalently,
\[
(P + tA - C) \cdot (P + tA - C) = (P - C) \cdot (P - C).
\]
This reduces to
\[
t^2 + 2((P - C) \cdot A)t = 0,
\]
since $\|A\| = 1$. This quadratic equation has exactly one solution if and only if
\[
(P - C) \cdot A = 0.
\]

Problem 10.1.3 (Chord Theorem). If two chords in a circle intersect, then the
product of the lengths of the two segments on one chord is equal to the product
of the lengths of the two segments on the other chord.

Figure 10.3: Problem 10.1.3.

Solution. Consider a circle with the center at $C$ and the radius $r$. Let $P$ be a point
inside the circle, that is, $\|P - C\| < r$. Any line through $P$ can be described as
$P + \mathrm{Span}\{A\}$ with $\|A\| = 1$. Note that the distance from $P$ to a point $P + tA$ is $|t|$.
The intersection points of the line $P + \mathrm{Span}\{A\}$ and the circle are given by the
equation
\[
\|P + tA - C\| = r,
\]
which is equivalent to
\[
t^2 + 2((P - C) \cdot A)t + \|P - C\|^2 - r^2 = 0.
\]
Since
\[
((P - C) \cdot A)^2 - \|P - C\|^2 + r^2 > 0,
\]
the equation has two distinct roots $t_1$ and $t_2$, as expected. Moreover, the product of
these roots is $\|P - C\|^2 - r^2$, so the product of the two distances $|t_1|\,|t_2|$ equals
\[
r^2 - \|P - C\|^2,
\]
which is independent of $A$.
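The independence of this product from the direction of the chord can be observed numerically (a check added here, not from the book; it assumes NumPy):

```python
import numpy as np

# Chord Theorem check: for chords through P in unit direction A, the
# roots of t^2 + 2((P-C).A)t + |P-C|^2 - r^2 = 0 have a product that
# does not depend on the direction A.
C = np.array([1.0, 2.0])
r = 3.0
P = np.array([2.0, 1.5])          # a point inside the circle

for theta in [0.3, 1.1, 2.5]:
    A = np.array([np.cos(theta), np.sin(theta)])   # unit direction
    b = 2 * (P - C) @ A
    c = (P - C) @ (P - C) - r**2
    t1, t2 = np.roots([1.0, b, c]).real
    print(round(t1 * t2, 9))      # always |P-C|^2 - r^2 = -7.75
```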

Figure 10.4: Problem 10.1.4.


Problem 10.1.4 (Tangent-Secant Theorem). If a secant segment and a tangent
segment are drawn to a circle from the same external point, the product of
the length of the secant segment and its external part equals the square of the
length of the tangent segment.

Solution. This theorem can be interpreted as a case of the Chord Theorem when
the point $P$ is outside the circle. It can be proved by a modification of the argument
presented above. We leave the proof as an exercise.

10.2 Triangles

In many solutions in this section we consider the triangle with vertices $C - C$, $A - C$,
and $B - C$ instead of the triangle with vertices $C$, $A$, and $B$. This is done to simplify
the calculations. Subtracting $C$ from all points of the triangle results in translating
the whole triangle without changing its size or shape. This allows us to solve the
general problem by solving an "easier" problem.

Problem 10.2.1. Show that the three altitudes in a triangle intersect at a single
point.

Figure 10.5: Problem 10.2.1.

Solution. Consider a triangle with vertices A, B , and C . To simplify the calculations,


we assume that C is at the origin. The altitude from A is on the line A + Span{B^x} and the altitude from B is on the line B + Span{A^x} (where V^x denotes the vector V rotated by 90°). The intersection of these two lines can be found as the solution of the equation

A + tB^x = B + sA^x.

Note that this equation has a unique solution since A^x and B^x are linearly independent. Let's denote the intersection point by H. It suffices to prove that H • (A − B) = 0. Indeed,

H • (A − B) = H • A − H • B = (B + sA^x) • A − (A + tB^x) • B = B • A − A • B = 0,

because A^x • A = B^x • B = 0.
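The argument can be checked numerically. In the sketch below (hypothetical coordinates, not from the text; perp plays the role of the operation V ↦ V^x), the intersection H of two altitudes is computed and shown to lie on the third:

```python
import numpy as np

def perp(v):
    """Rotate v by 90 degrees; this plays the role of the operation V -> V^x."""
    return np.array([-v[1], v[0]])

# Hypothetical triangle with C at the origin, as in the solution above.
A = np.array([4.0, 1.0])
B = np.array([1.0, 3.0])

# Solve A + t*perp(B) = B + s*perp(A), i.e. [perp(B) | -perp(A)] [t, s]^T = B - A.
t, s = np.linalg.solve(np.column_stack([perp(B), -perp(A)]), B - A)
H = A + t * perp(B)        # intersection of the altitudes from A and from B

# H is on the altitude from C = 0: H - C is perpendicular to A - B.
assert abs(np.dot(H, A - B)) < 1e-9
```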

Problem 10.2.2. Show that the three medians in a triangle intersect at a sin-
gle point.

Figure 10.6: Medians in a triangle intersect at a single point.

Solution. The median from A is

A + t((B + C)/2 − A).

If we take t = 2/3, we get the point (A + B + C)/3. This point is also on the medians from B and from C, since

(A + B + C)/3 = B + (2/3)((A + C)/2 − B) = C + (2/3)((A + B)/2 − C).
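A numeric illustration of these three identities (with hypothetical coordinates, not from the text):

```python
import numpy as np

# Any triangle; the median from A is A + t*((B + C)/2 - A).
A, B, C = np.array([0.0, 0.0]), np.array([6.0, 1.0]), np.array([2.0, 5.0])
G = (A + B + C) / 3          # the claimed common point

t = 2.0 / 3.0
assert np.allclose(G, A + t * ((B + C) / 2 - A))   # G is on the median from A
assert np.allclose(G, B + t * ((A + C) / 2 - B))   # ... from B
assert np.allclose(G, C + t * ((A + B) / 2 - C))   # ... from C
```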

Problem 10.2.3. Consider a triangle with vertices A, B, and C. Show that the bisectors intersect at a single point. The point of intersection is

I = (a/(a + b + c))A + (b/(a + b + c))B + (c/(a + b + c))C,

where a = ‖B − C‖, b = ‖A − C‖, and c = ‖A − B‖.

Figure 10.7: Bisectors in a triangle intersect at a single point.

Solution. We will show that the point is on the bisector from C. Observe that

I = C + (a/(a + b + c))(A − C) + (b/(a + b + c))(B − C)
= C + (ab/(a + b + c)) · (A − C)/b + (ab/(a + b + c)) · (B − C)/a
= C + (ab/(a + b + c))((A − C)/‖A − C‖ + (B − C)/‖B − C‖),

from which it is clear that the point I is on the bisector from C . The same method
can be used to check that I is on the other two bisectors.
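Since a point on a bisector is equidistant from the two adjacent sides, the formula for I can also be cross-checked numerically (hypothetical coordinates, not from the text):

```python
import numpy as np

A, B, C = np.array([0.0, 0.0]), np.array([7.0, 0.0]), np.array([3.0, 5.0])
a, b, c = np.linalg.norm(B - C), np.linalg.norm(A - C), np.linalg.norm(A - B)
I = (a * A + b * B + c * C) / (a + b + c)   # the claimed intersection point

def dist_to_line(X, P, Q):
    """Distance from X to the line through P and Q."""
    d = Q - P
    return abs(d[0] * (X - P)[1] - d[1] * (X - P)[0]) / np.linalg.norm(d)

# I is equidistant from the three sides, so it lies on all three bisectors.
r1 = dist_to_line(I, A, B)
r2 = dist_to_line(I, B, C)
r3 = dist_to_line(I, A, C)
assert abs(r1 - r2) < 1e-9 and abs(r2 - r3) < 1e-9
```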

Problem 10.2.4. Consider a triangle with vertices A, B , and C . Show that

(A − C) • (B − C) = (a² + b² − c²)/2,

where a = ‖B − C‖, b = ‖A − C‖, and c = ‖A − B‖.

Solution. We have

‖A − B‖² = ‖A − C + C − B‖²
= ‖A − C‖² + ‖C − B‖² + 2(A − C) • (C − B)
= ‖A − C‖² + ‖B − C‖² − 2(A − C) • (B − C),

which gives us the desired equality.
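A quick numeric check of the identity (hypothetical coordinates, not from the text):

```python
import numpy as np

A, B, C = np.array([1.0, 2.0]), np.array([5.0, 3.0]), np.array([2.0, 7.0])
a = np.linalg.norm(B - C)
b = np.linalg.norm(A - C)
c = np.linalg.norm(A - B)

# (A - C).(B - C) = (a^2 + b^2 - c^2)/2, the dot-product form of the Law of Cosines
assert abs(np.dot(A - C, B - C) - (a**2 + b**2 - c**2) / 2) < 1e-9
```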



Figure 10.8: Problem 10.2.5.

Problem 10.2.5. Consider a triangle with vertices A, B , and C . Denote the


intersection of the bisector from the vertex C with the opposite side by D. Then

‖A − D‖/‖A − C‖ = ‖B − D‖/‖B − C‖.

Solution. We place the vertex C at the origin. Now the bisector from the vertex C is
on the line
Span{A/‖A‖ + B/‖B‖}
and a point on the line through the points A and B is of the form

A + s(B − A).

Thus the intersection is given by the equation

t(A/‖A‖ + B/‖B‖) = (1 − s)A + sB,

where t and s are real numbers to be determined. Since A and B are linearly inde-
pendent, we must have
t/‖A‖ = 1 − s and t/‖B‖ = s.
Now, with a = kB k and b = kAk, we obtain

s = b/(a + b),

and thus the intersection point is

(a/(a + b))A + (b/(a + b))B = D.

Now

D − A = (a/(a + b))A + (b/(a + b))B − A = (b/(a + b))B − (b/(a + b))A = (b/(a + b))(B − A)

and

D − B = (a/(a + b))A + (b/(a + b))B − B = (a/(a + b))A − (a/(a + b))B = (a/(a + b))(A − B).

Hence

‖D − A‖/b = ‖A − B‖/(a + b) = ‖D − B‖/a

or, since ‖A‖ = b and ‖B‖ = a,

‖D − A‖/‖A‖ = ‖D − B‖/‖B‖.

Finally, going back to the triangle with vertices A, B, and C, we get

‖A − D‖/‖A − C‖ = ‖B − D‖/‖B − C‖.

Problem 10.2.6. Consider a triangle with vertices A, B , and C . The radius r


of the circle inscribed in the triangle is

r = |det[A − C  B − C]|/(a + b + c),

where a = ‖B − C‖, b = ‖A − C‖, and c = ‖A − B‖.

Figure 10.9: Inscribed circle.



Solution. We assume that C is at the origin. From Problem 10.2.3 we know that the center of the circle is

I = (a/(a + b + c))A + (b/(a + b + c))B + (c/(a + b + c))C = (a/(a + b + c))A + (b/(a + b + c))B.

The intersection point of the line through I perpendicular to the line through B and C with that line is given by the equation

(a/(a + b + c))A + (b/(a + b + c))B + tB^x = sB.

Hence

(a/(a + b + c))A • B^x + (b/(a + b + c))B • B^x + tB^x • B^x = sB • B^x,

which simplifies to

(a/(a + b + c))A • B^x + t‖B‖² = 0.

Since A • B^x = det[B A] and ‖B‖² = a², we get

(a/(a + b + c)) det[B A] + ta² = 0.

Hence

t = −det[B A]/(a(a + b + c)) = det[A B]/(a(a + b + c)).

Now observe that the radius must be equal to ‖tB^x‖ and thus

r = ‖tB^x‖ = (|det[B A]|/(a(a + b + c)))‖B^x‖ = |det[A B]|/(a + b + c),

because ‖B^x‖ = ‖B‖ = a. Switching back to A, B, and C we obtain

r = |det[A − C  B − C]|/(a + b + c).
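The formula can be cross-checked numerically against the defining property of the inscribed circle, namely that its center I is at distance r from each side (hypothetical coordinates, not from the text):

```python
import numpy as np

A, B, C = np.array([0.0, 0.0]), np.array([8.0, 0.0]), np.array([3.0, 6.0])
a = np.linalg.norm(B - C)
b = np.linalg.norm(A - C)
c = np.linalg.norm(A - B)

# The formula from Problem 10.2.6.
r = abs(np.linalg.det(np.column_stack([A - C, B - C]))) / (a + b + c)

# Cross-check: the inscribed circle touches side AB, so r equals the
# distance from the incenter I (Problem 10.2.3) to the line through A and B.
I = (a * A + b * B + c * C) / (a + b + c)
d = B - A
dist = abs(d[0] * (I - A)[1] - d[1] * (I - A)[0]) / np.linalg.norm(d)
assert abs(r - dist) < 1e-9
```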

Problem 10.2.7. Consider a triangle with vertices A, B , and C . Show that the
radius R of the circumcircle is given by

R = abc/(2|det[A − C  B − C]|),

where a = ‖B − C‖, b = ‖A − C‖, and c = ‖A − B‖.

Figure 10.10: Problem 10.2.7.

Solution. We place the origin at C. The center of the circumcircle can be found from the equation

A/2 + tA^x = B/2 + sB^x,

where t and s are real numbers to be determined. We multiply both sides by B and obtain

(A • B)/2 + tA^x • B = ‖B‖²/2 = a²/2.

By solving for t we obtain

t = (a² − A • B)/(2A^x • B).   (10.1)

Now observe that

R² = ‖A/2 + tA^x − A‖² = ‖tA^x − A/2‖² = t²‖A^x‖² + ‖A‖²/4 = t²‖A‖² + ‖A‖²/4 = b²(t² + 1/4).

From (10.1) we obtain

t² + 1/4 = ((a² − A • B)/(2A^x • B))² + 1/4
= (a⁴ − 2a²(A • B) + (A • B)² + (A^x • B)²)/(4(A^x • B)²)
= (a⁴ − a²(a² + b² − c²) + a²b²)/(4(A^x • B)²)
= a²c²/(4(A^x • B)²)
= a²c²/(4(det[A B])²),

since

A • B = (a² + b² − c²)/2,

as shown in Problem 10.2.4, and

(A • B)² + (A^x • B)² = ‖A‖²‖B‖².

Consequently

R² = b²(t² + 1/4) = a²b²c²/(4(det[A B])²)

and thus

R = abc/(2|det[A B]|).

Recalling that A = A − C and B = B − C we obtain the desired equality

R = abc/(2|det[A − C  B − C]|).
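A numeric check of the formula (hypothetical coordinates, not from the text; perp is the 90° rotation used for the perpendicular bisectors):

```python
import numpy as np

def perp(v):
    return np.array([-v[1], v[0]])

A, B, C = np.array([1.0, 1.0]), np.array([6.0, 2.0]), np.array([3.0, 7.0])
a = np.linalg.norm(B - C)
b = np.linalg.norm(A - C)
c = np.linalg.norm(A - B)

# The formula from Problem 10.2.7.
R = a * b * c / (2 * abs(np.linalg.det(np.column_stack([A - C, B - C]))))

# Circumcenter from two perpendicular bisectors: midpoint + t*(rotated side).
t, s = np.linalg.solve(np.column_stack([perp(A - C), -perp(B - C)]), (B - A) / 2)
S = (A + C) / 2 + t * perp(A - C)
assert abs(np.linalg.norm(S - A) - R) < 1e-9
assert abs(np.linalg.norm(S - B) - R) < 1e-9
assert abs(np.linalg.norm(S - C) - R) < 1e-9
```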

Problem 10.2.8 (Nine-point circle). Let H be the orthocenter of the triangle


with vertices A, B , and C . Prove that the three points where the altitudes meet
the opposite sides of the triangle and the midpoints of the line segments H A,
H B , HC , AB , BC and C A are all on the same circle.

Solution. We place the origin at H .


We have

‖C/2 − (A + B + C)/4‖² = ‖(A + B)/2 − (A + B + C)/4‖² = ‖(A + B − C)/4‖²,

‖B/2 − (A + B + C)/4‖² = ‖(A + C)/2 − (A + B + C)/4‖² = ‖(A + C − B)/4‖²

and

‖A/2 − (A + B + C)/4‖² = ‖(B + C)/2 − (A + B + C)/4‖² = ‖(B + C − A)/4‖².

Since

A • (B − C) = B • (A − C) = C • (A − B) = 0,

by the Pythagorean Theorem, we have

‖(A + B − C)/4‖² = (1/16)(‖A‖² + ‖B − C‖²) = (1/16)(‖A‖² + ‖C − B‖²)

Figure 10.11: Problem 10.2.8: The nine-point circle.

= ‖(A + C − B)/4‖²

and

‖(A + B − C)/4‖² = ‖(B + A − C)/4‖² = (1/16)(‖B‖² + ‖A − C‖²) = (1/16)(‖B‖² + ‖C − A‖²) = ‖(B + C − A)/4‖².

Consequently,

‖(A + B − C)/4‖ = ‖(A + C − B)/4‖ = ‖(B + C − A)/4‖.

Therefore, the points

A/2, B/2, C/2, (A + B)/2, (B + C)/2, and (A + C)/2

are on the circle centered at (A + B + C)/4 of radius r = ‖A + B − C‖/4. Since

‖(B + C)/2 − A/2‖ = 2r,

the points (B + C)/2 and A/2 are on a diameter of that circle. If D is the point where the altitude from A meets the opposite side of the triangle, then we have

((B + C)/2 − D) • (A/2 − D) = 0.

But this means that D is on the circle, by Problem 10.1.1. By a similar argument we
can show that the points where the altitudes from B and C meet the opposite sides
of the triangle are on the circle.
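The nine points can be checked numerically; the sketch below builds a hypothetical triangle (not from the text), moves the origin to the orthocenter as in the solution, and measures the distance of all nine points from (A + B + C)/4:

```python
import numpy as np

def perp(v):
    return np.array([-v[1], v[0]])

def foot(X, P, Q):
    """Foot of the perpendicular from X onto the line through P and Q."""
    d = (Q - P) / np.linalg.norm(Q - P)
    return P + np.dot(X - P, d) * d

# Hypothetical triangle; first find the orthocenter and move the origin there.
A, B, C = np.array([0.0, 0.0]), np.array([7.0, 1.0]), np.array([2.0, 6.0])
t, s = np.linalg.solve(np.column_stack([perp(B - C), -perp(A - C)]), B - A)
H = A + t * perp(B - C)              # orthocenter
A, B, C = A - H, B - H, C - H        # now H is the origin, so A.(B - C) = 0, etc.

N = (A + B + C) / 4                  # claimed nine-point center (H at the origin)
points = [A / 2, B / 2, C / 2,                        # midpoints of HA, HB, HC
          (A + B) / 2, (B + C) / 2, (A + C) / 2,      # midpoints of the sides
          foot(A, B, C), foot(B, A, C), foot(C, A, B)]  # feet of the altitudes
radii = [np.linalg.norm(P - N) for P in points]
assert max(radii) - min(radii) < 1e-9
```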

Problem 10.2.9. Prove that the orthocenter H , the center N of the nine-point
circle (see Problem 10.2.8), and the circumcenter Z are on the same line. More-
over, show that N is the midpoint of the segment with endpoints H and Z and
the radius of the circumcircle is twice the radius of the nine-point circle.

Figure 10.12: Problem 10.2.9.

Solution. We place H at the origin. Since

N = (1/4)(A + B + C),
the line which passes through H and N is

Span{A + B +C }.

By the hypothesis, we have A • (B − C) = 0 and thus

(2N − (B + C)/2) • (B − C) = 0.

Similarly,

(2N − (A + C)/2) • (A − C) = 0

and

(2N − (A + B)/2) • (A − B) = 0,

which means that 2N = Z is the circumcenter. Now to finish the proof it suffices to observe that the square of the radius of the circumcircle is

‖Z − A‖² = ‖(1/2)(A + B + C) − A‖² = 4‖(1/4)(B + C − A)‖².
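A numeric check of all three claims (hypothetical coordinates, not from the text):

```python
import numpy as np

def perp(v):
    return np.array([-v[1], v[0]])

A, B, C = np.array([0.0, 0.0]), np.array([7.0, 1.0]), np.array([2.0, 6.0])
t, _ = np.linalg.solve(np.column_stack([perp(B - C), -perp(A - C)]), B - A)
H = A + t * perp(B - C)              # orthocenter
A, B, C = A - H, B - H, C - H        # place H at the origin

N = (A + B + C) / 4                  # nine-point center
Z = 2 * N                            # claimed circumcenter

# Z is equidistant from the vertices, so it really is the circumcenter.
assert np.allclose(np.linalg.norm(Z - A), np.linalg.norm(Z - B))
assert np.allclose(np.linalg.norm(Z - B), np.linalg.norm(Z - C))
# N is the midpoint of H (= origin) and Z, and the circumradius is twice
# the nine-point radius ||A/2 - N||.
assert np.allclose(N, Z / 2)
assert np.allclose(np.linalg.norm(Z - A), 2 * np.linalg.norm(A / 2 - N))
```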

10.3 Geometry and trigonometry


In this section we assume basic knowledge of trigonometry. Properties of trigono-
metric functions used here were derived in Section 9.1.

Figure 10.13: Problem 10.3.1.

Problem 10.3.1. Show that the inscribed angle equals half of the central an-
gle.

Solution. If we take the origin in the center of the circle, then

B = cos α A + sin α A^x and C = cos(α + β)A + sin(α + β)A^x,

where α > 0, β > 0, and α + β < 2π. Then

B − A = (cos α − 1)A + sin α A^x
= −2 sin²(α/2) A + 2 sin(α/2) cos(α/2) A^x
= 2 sin(α/2)(cos(π/2 + α/2) A + sin(π/2 + α/2) A^x)

and

C − A = (cos(α + β) − 1)A + sin(α + β)A^x
= 2 sin((α + β)/2)(−sin((α + β)/2) A + cos((α + β)/2) A^x)
= 2 sin((α + β)/2)(cos(π/2 + α/2 + β/2) A + sin(π/2 + α/2 + β/2) A^x).

Note that 0 < α/2 < π and 0 < (α + β)/2 < π. Hence sin(α/2) > 0 and sin((α + β)/2) > 0 and

∠(B − A, C − A) = β/2.

Problem 10.3.2. Show that in a triangle with vertices A, B , and C we have

a = 2R sin α,

where a = ‖B − C‖, R is the radius of the circumcircle, and 0 < α = ∠(B − A, C − A) < π.

Solution. If we take the origin in the center of the circle, then

C = cos 2α B + sin 2α B^x,

by Problem 10.3.1. This yields

C • B = cos 2α B • B = R² cos 2α.

Figure 10.14: Problem 10.3.2.

Now, since

a² = ‖B − C‖² = ‖B‖² + ‖C‖² − 2B • C = 2R² − 2B • C,

we have
a² = 2R² − 2R² cos 2α = 2R²(1 − cos 2α) = 4R²(sin α)².
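A numeric check of the Law of Sines in this form, using the circumradius formula from Problem 10.2.7 (hypothetical coordinates, not from the text):

```python
import numpy as np

A, B, C = np.array([1.0, 0.0]), np.array([5.0, 2.0]), np.array([2.0, 5.0])
a = np.linalg.norm(B - C)
b = np.linalg.norm(A - C)
c = np.linalg.norm(A - B)

# Circumradius via Problem 10.2.7: R = abc / (2 |det[A - C  B - C]|).
R = a * b * c / (2 * abs(np.linalg.det(np.column_stack([A - C, B - C]))))

# alpha = angle at A, between B - A and C - A.
alpha = np.arccos(np.dot(B - A, C - A) / (c * b))
assert abs(a - 2 * R * np.sin(alpha)) < 1e-9
```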

Problem 10.3.3. Show that, if

∠(B − A,C − A) = ∠(B − D,C − D),

then the points A, B , C , and D are on the same circle.

Figure 10.15: Problem 10.3.3.



Solution. Suppose that

∠(B − A, C − A) = ∠(B − D, C − D) = γ.

Then

(C − D)/‖C − D‖ = cos γ (B − D)/‖B − D‖ + sin γ (B − D)^x/‖B − D‖.   (10.2)

Placing the origin at the center of the circumcircle of the triangle with vertices A, B, and C, we obtain

C = cos 2γ B + sin 2γ B^x.   (10.3)

Let ‖C − D‖ = c and ‖B − D‖ = b. Using (10.2) and (10.3), we get

cos 2γ B/c + sin 2γ B^x/c − cos γ B/b − sin γ B^x/b = D/c − cos γ D/b − sin γ D^x/b

and, by calculating the square of the norm of both sides,

‖B‖²((cos 2γ/c − cos γ/b)² + (sin 2γ/c − sin γ/b)²) = ‖D‖²((1/c − cos γ/b)² + (sin γ/b)²).

Since

(cos 2γ/c − cos γ/b)² + (sin 2γ/c − sin γ/b)² = 1/c² − 2 cos γ/(bc) + 1/b² = (1/c − cos γ/b)² + (sin γ/b)²,

we conclude

‖D‖ = ‖B‖.

Since the origin is the circumcenter of the triangle, this means that D lies on the circle through A, B, and C.

10.4 Geometry problems from the International


Mathematical Olympiads
In this section we present solutions to geometry problems from the International Mathematical Olympiads that were held in 2007, 2008, 2009, and 2010. We would like to point out that, while the solutions are not trivial, they are straightforward. They do not require any special tricks.

Problem 10.4.1 (IMO 2007 Vietnam). In triangle ABC the bisector of angle
BC A intersects the circumcircle again at R, the perpendicular bisector of BC
at P , and the perpendicular bisector of AC at Q. The midpoint of BC is K and
the midpoint of AC is L. Prove that the triangles RP K and RQL have the same
area.

Figure 10.16: IMO 2007 Vietnam.

Solution. If we place the origin at C, then the circumcenter is

S = (1/2)R + sR^x,

for some real number s.
Since CR is the bisector of the angle BCA, we have

B = t(cos α R + sin α R^x)

and

A = u(cos(−α)R + sin(−α)R^x),

for some t, u, and α. From

B − S = (t cos α − 1/2)R + (t sin α − s)R^x,

C − S = O − S = −(1/2)R − sR^x,

and

‖B − S‖² = ‖C − S‖²,

we obtain

(t cos α − 1/2)² + (t sin α − s)² = (1/2)² + s²

and then

t² − t cos α − 2ts sin α = 0.

Since t ≠ 0, we must have t = cos α + 2s sin α.
Now we note that point P is given by the equation

K + x(cos α R + sin α R^x)^x = yR.

Hence

(K + x(cos α R + sin α R^x)^x) • (cos α R + sin α R^x) = yR • (cos α R + sin α R^x),

which, in view of the equalities

(cos α R + sin α R^x)^x • (cos α R + sin α R^x) = 0

and

(cos α R + sin α R^x) • (cos α R + sin α R^x) = ‖R‖²,

simplifies to

(1/2)(cos α + 2s sin α) = y cos α

or

t/2 = y cos α.

The area of the triangle RPK is

(1/2)|(K − R) • (P − R)^x| = (1/2)|((1/2)B − R) • ((t/(2 cos α))R − R)^x|
= (1/2)|((t/2)(cos α R + sin α R^x) − R) • ((t/(2 cos α) − 1)R^x)|
= (1/4)|t sin α (t/(2 cos α) − 1)| ‖R‖²
= (1/8)|(sin α/cos α)(t² − 2t cos α)| ‖R‖²
= (1/8)|(sin α/cos α)(4s² sin²α − cos²α)| ‖R‖².

Now we note that the value of this expression depends only on α, s, and R and does not change if we replace α with −α. Thus we can conclude that the triangles RPK and RQL have the same area, because the area of the triangle RQL is obtained by replacing α with −α in the expression for the area of the triangle RPK.

Problem 10.4.2 (IMO 2008 Spain). An acute-angled triangle ABC has ortho-
center H . The circle passing through H with the center at the midpoint of BC
intersects the line BC at A 1 and A 2 . Similarly, the circle passing through H
with the center at the midpoint of C A intersects the line C A at B 1 and B 2 , and
the circle passing through H with the center at the midpoint of AB intersects
the line AB at C 1 and C 2 . Show that A 1 , A 2 , B 1 , B 2 , C 1 , C 2 lie on a circle.

Solution. We place the origin at H . Since

A • (B −C ) = B • (A −C ) = C • (A − B ) = 0,

Figure 10.17: IMO 2008 Spain.

and, consequently,

A • B = B • C = A • C,   (10.4)

it is easy to verify that the center of the circumcircle is S = (1/2)(A + B + C).
Points A₁ and A₂ are on the line B + Span{C − B} and satisfy the equation

‖B + t(C − B) − (B + C)/2‖² = ‖(B + C)/2‖²,

which is equivalent to

t²‖B − C‖² − t‖B − C‖² − B • C = 0.

Hence

t = (1/2)(1 ± ‖B + C‖/‖B − C‖).

Thus points A₁ and A₂ are

(B + C)/2 ± (1/2)(‖B + C‖/‖B − C‖)(C − B).

The square of the distance between the circumcenter S and the points A₁ and A₂ is

‖(A + B + C)/2 − (B + C)/2 ∓ (1/2)(‖B + C‖/‖B − C‖)(C − B)‖² = ‖A/2 ± (1/2)(‖B + C‖/‖B − C‖)(B − C)‖²
= (1/4)(‖A‖² + ‖B + C‖²),

since A • (B − C) = 0.

Similarly, the square of the distance between the circumcenter and the points B₁ and B₂ is (1/4)(‖B‖² + ‖A + C‖²), and the square of the distance between the circumcenter and the points C₁ and C₂ is (1/4)(‖C‖² + ‖A + B‖²).
Now, from (10.4), we get

‖A‖² + ‖B + C‖² = ‖B‖² + ‖A + C‖² = ‖C‖² + ‖A + B‖²,

which proves that the points A₁, A₂, B₁, B₂, C₁, and C₂ lie on a circle centered at S.
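The conclusion can be verified numerically for a hypothetical triangle (coordinates not from the text):

```python
import numpy as np

def perp(v):
    return np.array([-v[1], v[0]])

# Hypothetical acute triangle; move the origin to the orthocenter H as in the solution.
A, B, C = np.array([0.0, 0.0]), np.array([6.0, 1.0]), np.array([2.5, 5.0])
t, _ = np.linalg.solve(np.column_stack([perp(B - C), -perp(A - C)]), B - A)
H = A + t * perp(B - C)
A, B, C = A - H, B - H, C - H        # now H = 0

S = (A + B + C) / 2                  # circumcenter when H is the origin

def circle_line_points(P, Q):
    """The circle through H = 0 centered at (P + Q)/2, intersected with line PQ."""
    M = (P + Q) / 2
    u = np.linalg.norm(P + Q) / (2 * np.linalg.norm(P - Q))
    return M + u * (Q - P), M - u * (Q - P)

pts = [*circle_line_points(B, C), *circle_line_points(C, A), *circle_line_points(A, B)]
dists = [np.linalg.norm(X - S) for X in pts]
assert max(dists) - min(dists) < 1e-9   # all six points lie on one circle around S
```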

Problem 10.4.3 (IMO 2009 Germany). Let ABC be a triangle with circum-
center S. The points P and Q are interior points on the sides C A and AB ,
respectively. Let K , L and M be the midpoints of the segments B P , CQ and
PQ, respectively, and let Γ be the circle passing through K , L, and M . Suppose
that the line PQ is tangent to the circle Γ. Prove that the line segments SP and
SQ have the same length.

Figure 10.18: IMO 2009 Germany.

Solution. If we place the origin at A, we can express C as

C = (c/b)(uB + vB^x),   (10.5)

where b = ‖B‖, c = ‖C‖, u = cos α, v = sin α, and α is the angle between AB and AC. Moreover,

Q = sB, for some 0 < s < 1,




P = t(c/b)(uB + vB^x), for some 0 < t < 1,

K = (B + P)/2,

L = (C + Q)/2,

M = (P + Q)/2.

Let D be a point on the perpendicular bisector of the segment PQ, that is,

D = M + x(P − Q)^x,

for some real number x. Note that D is the center of Γ and PQ is tangent to Γ if and only if

‖D − K‖ = ‖D − L‖ = ‖D − M‖.

The equality

‖D − K‖ = ‖D − M‖

is equivalent to

‖(Q − B)/2 + x(P − Q)^x‖² = ‖x(P − Q)^x‖².

When expressed in terms of B and B^x, the equality becomes

‖((s − 1)/2)B + x((tcu/b − s)B^x − (tcv/b)B)‖² = ‖x((tcu/b − s)B^x − (tcv/b)B)‖².

Hence

((s − 1)²/4)b² + 2 · ((s − 1)/2) · B • (x((tcu/b − s)B^x − (tcv/b)B)) = 0,

which simplifies to

((s − 1)²/4)b − xtc(s − 1)v = 0.

Solving for x we obtain

x = (s − 1)b/(4tcv).   (10.6)
The equality

‖D − L‖ = ‖D − M‖

is equivalent to

‖(P − C)/2 + x(P − Q)^x‖² = ‖x(P − Q)^x‖²,

which, when expressed in terms of B and B^x, becomes

‖((t − 1)/2)(c/b)(uB + vB^x) + x((tcu/b − s)B^x − (tcv/b)B)‖² = ‖x((tcu/b − s)B^x − (tcv/b)B)‖².

Consequently,

((t − 1)²/4)c² − 2x · ((t − 1)/2) · scbv = 0,

which gives us

x = (t − 1)c/(4sbv).   (10.7)

From (10.6) and (10.7) we obtain

t(t − 1)c² = s(s − 1)b².   (10.8)

Figure 10.19: IMO 2009 Germany.

Now we show that (10.8) is equivalent to ‖S − P‖ = ‖S − Q‖. Since the circumcenter S is at the intersection of the perpendicular bisectors of the segments AB and CA, we have

S = B/2 + yB^x = C/2 + zC^x,

for some real numbers y and z. Hence

(B/2 + yB^x) • C = (C/2 + zC^x) • C,

which, using (10.5), yields

(1/2)(c/b)ub² + y(c/b)vb² = c²/2.

Solving for y we obtain

y = (c − bu)/(2bv).

Consequently,

S = (1/2)B + ((c − bu)/(2bv))B^x,

which gives us

‖S − P‖² = ((1/2 − t(c/b)u)² + ((c − bu)/(2bv) − t(c/b)v)²)b²

and

‖S − Q‖² = ((1/2 − s)² + ((c − bu)/(2bv))²)b².

The equality

‖S − P‖² = ‖S − Q‖²

is thus equivalent to

1/4 − s + s² + ((c − bu)/(2bv))² = 1/4 − t(c/b)u + (t(c/b)u)² + ((c − bu)/(2bv))² − 2t(c/b)v · ((c − bu)/(2bv)) + (t(c/b)v)²

or, after simplifying and using the fact that u² + v² = 1,

−s + s² = (−t + t²)(c²/b²).
But this is equivalent to (10.8), which completes the proof.

Problem 10.4.4 (IMO 2010 Kazakhstan). Let P be a point inside the triangle
ABC . The lines AP , B P , and C P intersect the circumcircle Γ of triangle ABC
again at the points K , L and M , respectively. The tangent to Γ at C intersects
the line AB at S. Suppose that line segments SC and SP have the same length.
Prove that the line segments M K and M L have the same length.

Solution. If we place the origin at S, then

P = (cos β)C + (sin β)C^x,

A = a((cos α)C + (sin α)C^x),

and

B = b((cos α)C + (sin α)C^x),

for some a, b > 0, where α is the angle between SC and SA and β is the angle between SC and SP.
From the Tangent-Secant Theorem (Problem 10.1.4) we have

‖C‖² = ‖A‖ ‖B‖ = a‖C‖ · b‖C‖.

Hence ab = 1 and

B = (1/a)((cos α)C + (sin α)C^x).   (10.9)

Figure 10.20: IMO 2010 Kazakhstan.

Note that

K = P + k(A − P), L = P + l(B − P), M = P + m(C − P),

for some k, l, m < 0. From the Chord Theorem (Problem 10.1.3) we have

‖C − P‖ ‖M − P‖ = ‖A − P‖ ‖K − P‖ = ‖B − P‖ ‖L − P‖.

Hence

m‖C − P‖² = k‖A − P‖² = l‖B − P‖²

and

k = m‖C − P‖²/‖A − P‖² and l = m‖C − P‖²/‖B − P‖².

Now

‖M − K‖² = ‖m(C − P) − k(A − P)‖²
= ‖m(C − P) − m(‖C − P‖²/‖A − P‖²)(A − P)‖²
= m²‖(C − P) − (‖C − P‖²/‖A − P‖²)(A − P)‖²
= m²(‖C − P‖² + ‖C − P‖⁴/‖A − P‖² − 2(‖C − P‖²/‖A − P‖²)(A − P) • (C − P))
= m²‖C − P‖²(1 + ‖C − P‖²/‖A − P‖² − (2/‖A − P‖²)(A − P) • (C − P)).

Similarly

‖M − L‖² = m²‖C − P‖²(1 + ‖C − P‖²/‖B − P‖² − (2/‖B − P‖²)(B − P) • (C − P)).
Therefore, to prove the equality

‖M − K‖ = ‖M − L‖,

it suffices to show that the value of

‖C − P‖²/‖A − P‖² − (2/‖A − P‖²)(A − P) • (C − P) = (‖C − P‖² − 2(A − P) • (C − P))/‖A − P‖²

does not change when we replace A by B.
We have

(A − P) • (C − P) = ((a cos α − cos β)(1 − cos β) − (a sin α − sin β) sin β)‖C‖²
= (−a cos(α − β) + a cos α + (cos β)² − cos β + (sin β)²)‖C‖²
= (−a cos(α − β) + a cos α + 1 − cos β)‖C‖²,

‖A − P‖² = ((a cos α − cos β)² + (a sin α − sin β)²)‖C‖²
= (a² + 1 − 2a cos(α − β))‖C‖²,

‖C − P‖² = ((1 − cos β)² + (sin β)²)‖C‖²
= (2 − 2 cos β)‖C‖².

The above equalities yield

‖C − P‖² − 2(A − P) • (C − P) = (2a cos(α − β) − 2a cos α)‖C‖².

Similarly,

‖C − P‖² − 2(B − P) • (C − P) = ((2/a) cos(α − β) − (2/a) cos α)‖C‖².

Consequently,

(‖C − P‖² − 2(A − P) • (C − P))/‖A − P‖² = (2a cos(α − β) − 2a cos α)/(a² + 1 − 2a cos(α − β))
= ((2/a) cos(α − β) − (2/a) cos α)/(1 + 1/a² − (2/a) cos(α − β))
= (‖C − P‖² − 2(B − P) • (C − P))/‖B − P‖²,

which completes the proof.

Chapter 11

Problems for a computer algebra system

The following problems are intended to be solved with a computer algebra system
like Maple, Mathematica, or Matlab. The purpose of these exercises is to give you an
opportunity to gain some basic knowledge and some practice in using a computer
algebra system to solve problems in linear algebra.

1. Determine the reduced row echelon form of the matrix

   A = [ 2  3  5  1
         4  7  9  2
         8 15 17  4 ].

2. Verify, using the reduced row echelon form, that the matrix

   B = [ 3 1 2
         8 3 4
         5 7 2 ]

   is invertible and then solve the equation BX = C, where

   C = [ 4 7
         3 0
         7 9 ].

3. Calculate the inverse of the matrix

   E = [ 2 1 5 2
         5 8 7 4
         1 5 7 3
         4 3 3 2 ].

4. Calculate the product

   [ 1  4 0 ] [ 1 0 5 ] [ 2 1 3 ]
   [ 0  1 0 ] [ 0 1 2 ] [ 1 1 4 ].
   [ 0 −3 1 ] [ 0 0 1 ] [ 7 2 5 ]


5. Determine the projection of the vector b = [7, 21, 14]ᵀ on the plane Span{u, v}, where u = [2, 3, 1]ᵀ and v = [4, 1, 3]ᵀ.

6. Find numbers x and y that minimize

   (2 − 2x − y)² + (1 − x − 3y)² + (1 − 5x − 2y)² = ‖c − F[x, y]ᵀ‖²,

   where c = [2, 1, 1]ᵀ and

   F = [ 2 1
         1 3
         5 2 ].

7. Calculate the determinant of the matrix

   G = [ 7  2 x
         2  5 y
         4 −3 8 ].

8. Find the eigenvalues and eigenvectors of the matrix

   H = [ 2 1  2
         2 3  4
         4 2 10 ].

9. Find a symmetric matrix which has eigenvalues x, y, and z such that the vector p = [3, 1, 4]ᵀ is an eigenvector corresponding to the eigenvalue x and the vector q = [1, −11, 2]ᵀ is an eigenvector corresponding to the eigenvalue y.

10. Find the QR-decomposition of the matrix

   K = [ 1  1  1
         2 −1  5
         1  2 −3 ].
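If none of the named systems is at hand, the same computations can be done in Python. As a sketch (assuming SymPy is installed), here is problem 1 solved with SymPy's rref:

```python
from sympy import Matrix, Rational

# Problem 1: reduced row echelon form of A.
A = Matrix([[2, 3, 5, 1],
            [4, 7, 9, 2],
            [8, 15, 17, 4]])
R, pivots = A.rref()   # returns (RREF matrix, tuple of pivot column indices)
assert R == Matrix([[1, 0, 4, Rational(1, 2)],
                    [0, 1, -1, 0],
                    [0, 0, 0, 0]])
assert pivots == (0, 1)
```

The other problems follow the same pattern: `Matrix.inv()` for inverses, `Matrix.det()` for determinants, `Matrix.eigenvects()` for eigenvalues and eigenvectors, and `Matrix.QRdecomposition()` for the QR-decomposition.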

Chapter 12

Answers to selected exercises

Section 1.1

· ¸
1 2 13 1
9
£ ¤ 65 −4
3 1 16
· ¸
·
12
¸ 7p − 2r 7q − 2s
5 11
−18 5p + 3r 5q + 3s
· ¸ · ¸
13 13 7p + 5q −2p + 3q
7 13
5 −4 7r + 5s −2r + 3s

Section 1.2

· ¸ · 1 1¸
4 5 3 6
1
3 8 −7 −3
· ¸ · ¸
3 8 1 −4
3 17
13 29 0 1
1 − 25
· ¸ " #
19 −35
5 19
5 −9 0 1
· ¸ 5
4 11 · ¸·
1 0 1 −2
¸·
1 0
¸ ·
11 −2
¸
7 21 =
2 4 0 12 0 1 −5 1 − 52 1
· ¸ 2
8w + y 8x + z ·
1 4 31
¸· ¸·
1 0
¸·
0 1
¸ "1 1#
9 0
w x 23 = 21 3
0 1 0 1 0 18 1 0
· ¸ 8 0
−25w + 15y −25x + 15z · ¸· ¸
11 1 5 1 0
−2w + y −2x + z 25
0 1 7 1
1 −8 91 0
· ¸· ¸ ·1 ¸
−8 · ¸· ¸· ¸· ¸
13 = 9 1 0 1 0 1 −2 3 0
0 1 0 1 0 1 27
5 1 0 2 0 1 0 1
· 1 ¸· ¸· ¸· ¸
3 0 1 0 1 1 1 0 · ¸·
1 0 1
¸· ¸·
1 2 0 1 0
¸
15 =
0 1 −7 1 0 1 0 12 29
1 1 0 1 0 1 0 4


31 [a b; 0 0][w x; y z] = [aw + by  ax + bz; 0 0] ≠ [1 0; 0 1]
33 If the matrix AB is invertible, then the matrix B is invertible since ((AB)⁻¹A)B = [1 0; 0 1].
35 If the matrix A is not invertible and a ≠ 0 or c ≠ 0, then there is an invertible matrix [α β; γ δ] (a product of elementary matrices) such that [α β; γ δ]A = [1 k; 0 0], where k is a real number. Hence A = [α β; γ δ]⁻¹[1 k; 0 0].
¸ · 1 ¸
1 0 21 0
· ¸·
37 a = 20, = 2 0
−5 1 0 1 − 25 1
· ¸· ¸· ¸ · 1¸
0 1 1 −2 1 0 0
39 a = 14 , = 3
3 1 0 0 1 0 31 1 − 23

41 x = 17 15
· ¸· ¸
1 0 3 2
7 ,y= 7 47 5 1
3 1 0 −3
43 x = − u2 + 3v2 , y = u − 2v " #·
1 0 p
¸
p
· ¸· ¸
1 0 5 7
45 3 5 49 q
5 1 0 19 p 1 0 p −q
51 D⁻¹C⁻¹B⁻¹A⁻¹ABCD = D⁻¹C⁻¹B⁻¹BCD = D⁻¹C⁻¹CD = D⁻¹D = [1 0; 0 1]

Section 1.3

· ¸
1 7 1 5 −2
9 13
−1 3
3 −11
5 −bc · ¸
−4 2
7 0 11 21
3 −1
13 If a ≠ b and a ≠ −b, then the inverse is (1/(a² − b²))[a −b; −b a]. If a = b or a = −b, then the matrix is not invertible.
· ¸
5 −a
15 2 1
a +10 a 2
b a
" #
3b−2a
− 3b−2a
17 If 3b − 2a 6= 0, then the inverse is 2 3
. If 3b − 2a = 0, then the matrix is
− 3b−2a 3b−2a
not invertible.

19 x = −3, y = 52 8
21 x = 31 , y = 28
31
23 det[a₁ + ta₂  b₁ + tb₂; a₂  b₂] = (a₁ + ta₂)b₂ − (b₁ + tb₂)a₂ = a₁b₂ − b₁a₂ = det[a₁ b₁; a₂ b₂]

25 x = y = 1/5
27 x = 2/47, y = −7/47
29 x = (a + 5)/(a² − 4a + 10), y = (a − 6)/(a² − 4a + 10)

31 If 2a − 15 ≠ 0, then x = (−5s + at)/(2a − 15) and y = (2s − 3t)/(2a − 15). If 2a − 15 = 0, that is, a = 15/2, then the system becomes

   2x + 5y = 2s/3
   2x + 5y = t.

If 2s/3 ≠ t, then there is no solution. If 2s/3 = t, then x is arbitrary and y = (t − 2x)/5.

33 If ab − 4 ≠ 0, then x = (−2s + at)/(ab − 4) and y = (bs − 2t)/(ab − 4). If ab − 4 = 0, that is, b = 4/a, then the system becomes

   2x + ay = s
   2x + ay = ta/2.

If s ≠ ta/2, there is no solution. If s = ta/2, then x is arbitrary and y = (s − 2x)/a.

Section 1.4

· ¸
2 11 10 and −6
1
−1
· ¸ 13 1 and a − b
1
3 15 3 and 15a + 3
4 · ¸
4 4
5 6 and 2 17
−1 −1
7 13 and 2 · ¸
8s − t −2s + 2t
9 7 and 0 19 17
4s − 4t −s + 8t
· ¸ · ¸
a −1 b a −2 b
21 From det = 0 and det = 0we get a = 0 and b = − 52 .
5 2 5 1
" #
12 0
· ¸
−1 1 3
23 A = P DP , where P = and D = ,
1 −5 0 4
· ¸ · ¸
1 4 14 0
25 A = P DP −1 where, P = and D = .
3 −1 0 1
· ¸ · ¸
1 1 1 0
27 A = P 1 DP 1−1 , where P 1 = and D = .
−1 8 0 10
· ¸ · ¸
−1 2 1 0
A = P 2 DP 2−1 , where P 2 = and D = .
1 16 0 10
" #
2 1
· ¸
−1 2 0
29 Since A = P DP , where P = and D = , we have
−1 4 0 11

· n
8 · 2n + 11n −2 · 2n + 2 · 11n
¸ · ¸
2 0 −1 1
A n = P D n P −1 = P P = .
0 11n 9 −4 · 2n + 4 · 11n 2n + 8 · 11n

31 Since [xₙ; yₙ] = Aⁿ[1; 1], where A = [3 7; 1 9], and A = PDP⁻¹, where P = [1 7; 1 −1] and D = [10 0; 0 2], we have

   Aⁿ = PDⁿP⁻¹ = P[10ⁿ 0; 0 2ⁿ]P⁻¹ = (1/8)[10ⁿ + 7·2ⁿ  7·10ⁿ − 7·2ⁿ; 10ⁿ − 2ⁿ  7·10ⁿ + 2ⁿ].

   Consequently [x₃₃; y₃₃] = A³³[1; 1] = [10³³; 10³³].
33 Since [xₙ; yₙ] = Aⁿ[2; 3], where A = [3/5 1/4; 2/5 3/4], and A = PDP⁻¹, where P = [5 1; 8 −1] and D = [1 0; 0 7/20], we have Aⁿ = PDⁿP⁻¹ = P[1 0; 0 (7/20)ⁿ]P⁻¹. Consequently,

   lim_{n→∞} [xₙ; yₙ] = lim_{n→∞} Aⁿ[2; 3] = P[1 0; 0 0]P⁻¹[2; 3] = [25/13; 40/13].

   Since [2; 3] = (5/13)[5; 8] + (1/13)[1; −1], the solution is (5/13)[5; 8] = [25/13; 40/13].

Section 2.1

8 28 4
£ ¤  
1 10 0 1 13
· ¸  4 14 2
8 3 −1 13 
 2 7 1

3
12 2 −4
· ¸ 10 35 5
15 10 5
5 ·
2 1 2
¸
5 20 −5 15
 1 1 2
22 −1

· ¸
7  8 38 19 7
17
4 −9 17 9
· ¸
2 5 
15 20 −5 −5

9
4 20 19  3 4 −1 −1
11 21 6 8 −2 −2

21 The first matrix has 4 columns and the second matrix has 3 rows.
23 From the assumptions we get [a₁₁ a₁₂; a₂₁ a₂₂][1 0; 0 1] = [b₁₁ b₁₂; b₂₁ b₂₂][1 0; 0 1] and thus

   [a₁₁ a₁₂; a₂₁ a₂₂] = [a₁₁ a₁₂; a₂₁ a₂₂][1 0; 0 1] = [b₁₁ b₁₂; b₂₁ b₂₂][1 0; 0 1] = [b₁₁ b₁₂; b₂₁ b₂₂].

25 From the assumptions we get

   [a₁₁ a₁₂ a₁₃ a₁₄]I = [b₁₁ b₁₂ b₁₃ b₁₄]I,

   where I is the 4 × 4 identity matrix. Consequently,

   [a₁₁ a₁₂ a₁₃ a₁₄] = [a₁₁ a₁₂ a₁₃ a₁₄]I = [b₁₁ b₁₂ b₁₃ b₁₄]I = [b₁₁ b₁₂ b₁₃ b₁₄].
 
27 Aᵀ = [1; 2; −3]
29 Aᵀ = [1 3; 2 4; 5 2]

31 ([a₁ a₂; a₃ a₄][b₁ b₂; b₃ b₄])ᵀ = [[a₁ a₂][b₁; b₃]  [a₁ a₂][b₂; b₄]; [a₃ a₄][b₁; b₃]  [a₃ a₄][b₂; b₄]]ᵀ
   = [[a₁ a₂][b₁; b₃]  [a₃ a₄][b₁; b₃]; [a₁ a₂][b₂; b₄]  [a₃ a₄][b₂; b₄]]
   = [[b₁ b₃][a₁; a₂]  [b₁ b₃][a₃; a₄]; [b₂ b₄][a₁; a₂]  [b₂ b₄][a₃; a₄]]
   = [b₁ b₃; b₂ b₄][a₁ a₃; a₂ a₄]
   = [b₁ b₂; b₃ b₄]ᵀ[a₁ a₂; a₃ a₄]ᵀ
33 det A = det[a₁₁ a₁₂; a₂₁ a₂₂] = a₁₁a₂₂ − a₁₂a₂₁ = det[a₁₁ a₂₁; a₁₂ a₂₂] = det Aᵀ
35 (AAᵀ)ᵀ = (Aᵀ)ᵀAᵀ = AAᵀ, because (Aᵀ)ᵀ = A.

Section 2.2

½ · ¸
2x + y = 3 1 0 0
1 5
3x + 2y = 4 0 1 0

 1
1 0 2
· 7¸ 7 0 1 − 21 
1 2

3
0 0 0 0 0

1 0 2

1 0 1 0 −p − 3q
  
0 1 −1 17 0 1 −2 0 3p 
9 
0

0 0 0 0 0 1 2q
0 0 0
 3p q 5r


1 0 0 − 13
 1 0 0 2 +2− 2
11 0 2  
1 0 3

p q

19 0 1 0 − 2 + 2 + r2 
 
1
0 0 1 3  
 
1 0 0 −2 5
 
p q
0 0 1 − 2 − 2 + 3r
2
13 0 1 0 −3 3
0 0 1 1 32
1 a 0 1
   

1 0 −2 −p + 2q
 0 0, 0 0 
 
21 

15 0 1 3 p − q 0 0  0 0 
0 0 0 0 0 0 0 0

0 a 1 a

1
 
0
 
0 1 0

23 0 1 b , 0 0 1 , 0 0 1 
0 0 0 0 0 0 0 0 0
0 0 a 0 a 0 1 a 0 0

1
 
1
   
0 1 0 0

25 0 1 0 b , 0 1 b 0 , 0 0 1 0 , 0 0 1 0
   
0 0 1 c 0 0 0 1 0 0 0 1 0 0 0 1

27 x = −1 and y = 2
29 x = −2 + 3z and y = 5 − 5z
31 x = 4/9, y = 4/5, z = −7/4
33 No solutions.

35 If r = p/4 + q/4, then x = 3p/8 − q/8 − z/2 and y = −p/8 + 3q/8 − z/2. If r ≠ p/4 + q/4, then there are no solutions.
37 If r = p + 2q, then x = p/2 + q/2 − 2z, y = p/2 − q/2 − z, and z is arbitrary. If r ≠ p + 2q, there are no solutions.
39 If r = p + q and s = 2p + q, then x = p − q + y, z = −p + 2q − 3y, and y is arbitrary. If r ≠ p + q or s ≠ 2p + q, then there are no solutions.
41 If r = p − 2q, then x = 2p − 5q + 4z − 3w, y = −p + 3q − 3z + w, and z and w are arbitrary. If r ≠ p − 2q, then there are no solutions.

Section 2.3

 
a1 b1 c1 d1
 
0
1 a 2 + 4a 3 b 2 + 4b 3 c 2 + 4c 3 d 2 + 4d 3  1
 
a3 b3 c3 d3 5 0
 
0
 

a 1 + 3a 2 b 1 + 3b 2
 
a1 b1 c1 d1 a2 b2 
  
7  
3  a2 − a1 b2 − b1 c2 − c1 d2 − d1   a3 b3 
a 3 + 7a 1 b 3 + 7b 1 c 3 + 7c 1 d 3 + 7d 1 a 4 + 8a 2 b 4 + 8b 2
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 465

Chapter 12: Answers to selected exercises 465


10 2 −3
 
0 1 0

9 A =  5 1 −2 15 P = 0 0 1
24 5 −9 1 0 0
0 j 1 0
 

5 0 0
 0 1 0 0
17 P = 
 
11 P = 0 4 0 0 k 0 0
0 0 2 0 m 0 1
 
1 0 0 3 0
0 0 1 0 0 1 0 1 0
 
 
0 1 0 0 19 P = 0

0 1 5 0

13 P = 
 
1 0 0 0 0 0 0 1 0
 

0 0 0 1 0 0 0 7 1

1 1 0 1 0 0
  
1 0 0 1

0 0 −4 4
  
1 0

21 Since 0 1 0 0 1  −11 1 0 0 0 1  11 −3 = 0 1,


8 0
0 0 1 0 0 1 4 0 1 0 1 0 1 −1 0 0
0 1 0 0

1 1
 
1 0 0 1 0 0
  
0 1 −3

1 1
we have P = 0 1 0 0 8 0 −11 1
  0 0 0 1 = 8 0 1 −11
   
0 0 1 0 0 1 4 0 1 0 1 0 8 0 32

1 −1 0 1 0 0
  
1 0 0

1 1 3
 
1 0 1

23 Since 0 1 0 0 1   1 1 0 −1 1 1 = 0 1 2,


2 0
0 2 1 0 0 1 −1 0 1 1 −1 −1 0 0 0
0 1 0 0

1 −1
 
1 0 0
 
1 −1 0

1 1
we have P = 0 1 0  02 0  1 1 0 = 1 1 0
2
0 2 1 0 0 1 −1 0 1 0 2 2

0 0 1 1 0 0 0 0 1

0 0 1
     

25 A = 0 2 0, A −1 = 0 1 0 0 1 1
2 0 = 0 2 0 ,
1 0 0 1 0 0 0 0 1 1 0 0
0 0 1

0 0 1
  
1 0 0

0 2 0 0 1
2 0 = 0 1 0
  
1 0 0 1 0 0 0 0 1

1 0 0 1 0 0 1 0 −4 1 0 −4 1 4 0 1 0 −4

1 4 0
         
−1 1
27 A = 0 7 2 , A = 0 0 1 0
  
2 0 0 1 −7 = 0
  0 1 , 0 7 2 0
  0 1 =
0 1 0 0 1 0 0 1 7 1 7
0 1 0 0 1 0 2 −2 0 1 0 0 2 −2

1 0 0

0 1 0
0 0 1

0 0 1 0 0 0 1 0 1 0 0 0 0 0 1 0
      
0 3 0 0 −1 0 1 0 0 0 1 1
3 0 0 0
3 0 0
29 A = 
1 0 0 0, A = 1 0 0 0 0
   = ,
0 1 0  1 0 0 0
0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1
0 0 1 0 0 0 1 0 1 0 0 0
    
0 3 0 0 0 1
  3 0 0 0 1 0 0
  
1 0 0 0 1 0 0 0 = 0 0 1 0

0 0 0 1 0 0 0 1 0 0 0 1
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 466

466 Chapter 12: Answers to selected exercises

1 0 0 3 1 0 0 −3 1 0 0 0 1 0 −3 0
      
0 1 0 5 −1 0 1 0
 0 1 0 0
−5  0 1 −5 0
31 A = 0 0 0 1, A = 0 0 1
  = ,
−2 0 0 0 1 0 0 −2 1
0 0 1 2 0 0 0 1 0 0 1 0 0 0 1 0
1 0 0 3 1 0 −3 0 1 0 0 0
    
0 1 0 5 0 1 −5 0 0 1 0 0
0 0 0 1 0 0 −2 1 = 0
    
0 1 0
0 0 1 2 0 0 1 0 0 0 0 1
· ¸
5 3 2
 
−1 −2
33 51
0 −1 1 35 13  1 2
2 1
1
− 13
 
2

1 1 1 0 0
 1 0 0 3 0
1 1
− 16 
 
37 Since  1 3 1 0 1 0 ∼  0 1 0 −3 2 ,
−1 1 1 0 0 1 0 0 1 2
− 12 5
 1 1
 3 6
 −1 0 −3
2 1 1  31 1
− 16 

we have  1 3 1  = −3

2 .
−1 1 1 2
−1 5
3 2 6

3 −1 −1

−1 −2 3 2
 
39 A −1 = 12 −2 0 2 −1 1 2
 4 2 −12 
41 A = 8  
−2 2 0 3 −2 −1 2 
−2 4 −2 4

1 0 0 1
 
0 1 0 1
43 The matrix A is not invertible because A ∼ 
0 0
.
1 −1
0 0 0 0

−1 0 1
 
1 0 1 1 0 0 1
 
0 0

1 0 0 −1 0 0
 

45 A −1 =  3 −1 −1, A = 0 1 1 0 1 0 0 0 1 −3 1 0  0 1 0, because


−1 1 0 0 0 1 0 1 1 0 1 0 1 0 1 0 0 1

1 0 1 1 0 0 1 0 0
  
1 0 0 −1 0

0
 
1 0 0

0 1 1 0 1 0 0 0 1 −3 1 0  0 1 −1
0 A = 0
 1 0. Note that we can
0 0 1 0 1 1 0 1 0 1 0 1 0 0 1 0 0 1
solve the exercise as in Example 2.3.18
47

1 0 6 1 0 0 1 −2 0 1 0 0 1 0 0
     
1 0 0 0 1 0
 
−1
A = 0 1 −4 0 1 0 0 1 0 0 −1 0 0 0 1 −2 1 0 1 0 0
0 0 1 0 0 − 13 0 0 1 0 0 1 0 1 0 −3 0 1 0 0 1
−2 −1 2
 

=  43 1
3 −1

− 13 2
3 0
   
1 0 0 1 0 1
 
1 0 0 2 1 1
49 A = 1 1 0 0 1 0  1 1 0 0 3 1
51 A =  2  2 2
1 1 1 0 0 −1 1 1
1 0 0 4
2 3 3
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 467

Chapter 12: Answers to selected exercises 467

Section 3.1

· ¸
1
1 d = b−a+c = .
2

b
d = b−a+c

0 c

3 Here is an example of two vectors a and b and the vector 2a + 13 b.

2a + 13 b
b

2a
1
a 3b
0

Here is another example of two vectors a and b and the vector 2a + 13 b.

1
3b

0 2a + 13 b
a
2a

5 Here is an example of two vectors a and b and the vector − 34 a + 2b.

− 43 a + 2b

2b

b
− 34 a
0
a

Here is another example of two vectors a and b and the vector − 43 a + 2b.
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 468

468 Chapter 12: Answers to selected exercises

− 43 a + 2b
2b

− 43 a
0
a

7 Here is an example of two vectors a and b and the vector −0.5a − 0.75b.

−0.75b

0
−0.5a − 0.75b

b −0.5a

Here is another example of two vectors a and b and the vector −0.5a − 0.75b.

−0.5a − 0.75b
−0.5a

−0.75b
0
b

9 Here is an example of three vectors a, b, c, and the vector 12 a + b − 53 c.


January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 469

Chapter 12: Answers to selected exercises 469

− 53 c 1
2a

1 5
2a+b+−3c 0
c
1
2a+b

Here is another example of three vectors a, b, c, and the vector 12 a + b − 53 c.

1 5
2a+b+−3c

1
2a+b

a b

1
− 35 c 2a

0
c

11 Here is an example of three vectors a, b, c, and the vector − 34 a + 13 b + 3c.

3c

c − 34 a + 31 b + 3c

1
3b 0

b
− 43 a

− 34 a + 13 b

Here is another example of three vectors a, b, c, and the vector − 34 a + 13 b + 3c.


January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 470

470 Chapter 12: Answers to selected exercises

− 34 a + 13 b + 3c
3c

− 34 a c
− 43 a + 13 b

0
1
3b
b
a

13 We have u = − 57 v.
· ¸ · ¸
5 7
15 The equality =c is not possible.
−5 7
· ¸
7
7

· ¸
5
−5

v = 0 which gives us a = − 21
£ ¤
17 We must have det u 2 .
u

19 a = 2 or a = 5.

u u=v

a =2 v a =5

0 0

21 a = 1 or a = −9.
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 471

Chapter 12: Answers to selected exercises 471

v u

a =1 a = −9 0 v
u

23 a = 2 or a = 7.

u=v u

0 a =2 a =7 0

· ¸ · ¸
5 −1
25 The vectors and are linearly independent.
2 4
· ¸ · ¸ ·· ¸ · ¸¸
a c a c
27 The vectors and are linearly independent because det = a 2 − bc >
b a b a
0.
· ¸ · ¸ · ¸
1 1 1
29 Since =2 − , the coordinates are 2 and −1.
2 1 0
· ¸ · ¸ · ¸
1 3 0
31 Since = 31 − 73 , the coordinates are 13 and − 37 .
−2 1 1
· ¸ ½· ¸ · ¸¾
1 1 0
33 The coordinates of with respect to the basis , are 1 and −1. The coordi-
0 1 1
· ¸ ½· ¸ · ¸¾
1 1 1
nates of with respect to the basis , are 23 and − 12 .
0 1 3

Section 3.2

1 5 13 0
p
3 34 15 a = −2, see Figure 12.1
p
5 p2 17 a = −1 or a=1, see Figure 12.2
a
p 19 a = 1 or a = 4, see Figure 12.3
7 13
p 21 See Figure 12.4
9 2|a|
11 11
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 472

472 Chapter 12: Answers to selected exercises

· ¸
1
2

· ¸
4
−2

Figure 12.1: Solution to Exercise 15.

· ¸ · ¸
1 −1
1 1

0 0
· ¸ · ¸
1 −1
a =1 a = −1
−1 −1

Figure 12.2: Solutions to Exercise 17.

c b c
b−a d

b
a a
d
0 0
b−a d−c

a =1 a =4
d−c

Figure 12.3: Solutions to Exercise 19.

1
2
· ¸
b •u 5 1
23 p = u = 10 = 
kuk2 3 3
2
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 473

Chapter 12: Answers to selected exercises 473

a
the line (x − a) • n = 0

Figure 12.4: Solution to Exercise 21.

 x+y 
 2 
· ¸
x+y 1
25 p = b u2 u = 2

=
kuk 1

x+y
2
· ¸ · ¸
1 1 1
27 The projection matrix on Span{u} is 12 and the projection of on Span{u} is
1 1 0
"1#
2
1
.
2
· ¸ · ¸
4 −2 x
29 The projection matrix on Span{u} is 15 and the projection of on Span{u}
−2 1 y
" 4x−2y #
5
is −2x+y .
5
· ¸ · ¸ ¸ ·
−1 y 4 −3
31 33 35 A = 51
2 x −3 −4
· ¸ · ¸
x x
37 The vector satisfies the equation 3x − 4y = 0 if and only if the vector is in
y y
½· ¸x ¾ ½· ¸¾
3 4
Span = Span .
−4 3
39 x − y = 0 43 p1
2

41 x + 3y = 0 45 13
2
¸ ·
· ¸ · ¸ · ¸
a1 b b −a 2
and b = 1 . Then b • ax = 1 •
£ ¤
47 Let a = = −b 1 a 2 + b 2 a 1 = det a b .
a2 b2 b2 a1
49 det ax bx = bx • (ax )x = −a • bx = − det b a = det a b
£ ¤ £ ¤ £ ¤

51 There are three possibilities:


· ¸ · ¸
a a
(a) The vectors 1 and 2 are linearly independent,
b1 b2
· ¸ · ¸
a1 a
(b) The vectors and 2 are linearly dependent and at least one of them is
b1 b2
· ¸
0
different from the zero vector ,
0
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 474

474 Chapter 12: Answers to selected exercises

· ¸ · ¸ · ¸
a1 a 0
(c) = 2 = .
b1 b2 0
· ¸ · ¸· ¸ · ¸
a b a b x 0
In the first case we have det 1 1 6= 0 and thus the equation 1 1 = has
a2 b2 a2 b2 y 0
· ¸ · ¸
x 0
a unique solution = , by Cramer’s Rule (Theorem 1.3.10). So in this case we
y 0
½· ¸¾
0
have S = .
0
· ¸ · ¸
a a
Now we assume that the vectors 1 and 2 are linearly dependent and at least one
b1 b2
· ¸ · ¸
a 0
of them is different from the zero vector. Suppose that 1 6= . Then there is a
b1 0
number c such that · ¸ · ¸
a2 a
=c 1 . (12.1)
b2 b1
· ¸· ¸ · ¸
a b x 0
The equation 1 1 = is equivalent to the system
a2 b2 y 0
½
a1 x + b1 y = 0
,
a2 x + b2 y = 0

which can be written as ½


a1 x + b1 y = 0
,
c a1 x + c a2 y = 0
· ¸· ¸ · ¸
a b x 0
by (12.1). But this means that the equation 1 1 = is equivalent to
a2 b2 y 0

a 1 x + b 1 y = 0.

From Theorem 3.2.22 the general solution of this equation is


· ¸ · ¸
x −b 1
=t ,
y a1
· ¸
x
where t is an arbitrary real number. In other words, is a solution of the equation
y
· ¸ ½· ¸¾
x −b 1
(51) if an only if is in Span , so S is a vector line in this case. The case when
y a1
· ¸ · ¸
a 0
the vector 2 6= is handled in a similar way.
b2 0
· ¸ · ¸ · ¸ · ¸
a a 0 x
Finally we note that, if 1 = 2 = , then every vector in R2 satisfies the
b1 b2 0 y
equation
· ¸· ¸ · ¸· ¸ · ¸
a1 b1 x 0 0 x 0
= = ,
a2 b2 y 0 0 y 0

so S = R2 in this case.
a•b
k bk
kpk kbk2 |a·b|
53 kak = kak = kakkbk
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 475

Chapter 12: Answers to selected exercises 475

55 Note that, if the vectors a and b are linearly independent, then b • ax 6= 0 and a • bx 6= 0,
c • bx c • ax
by Exercise 48, so the expressions in x = and y = make sense.
a • bx b • ax
The equation has a solution because the set {a, b} is a basis of R2 .
Now, if numbers x and y satisfy the equation xa + yb = c, then xa • bx + yb • bx = c • bx
and xa • ax + yb • ax = c • ax . Since a • ax = 0, b • bx = 0, b • ax 6= 0, and a • bx 6= 0, we must
c • bx c • ax
have x = and y = .
a • bx b • ax
· ¸
1 0
57 A = 2 1 2 (uuT ) − and we have A T = A.
kuk 0 1

Section 3.3

· ¸ · ¸ · ¸ · ¸
3 −2 1 £ 1 £
11 A = 72 1 1 + 23
¤ ¤
1 We have · = 0. 1 −1
2 3 1 −1
· ¸ · ¸
1 2 4 0
p1
· ¸ · ¸
3 P= ,D= 1 £ 2 £
13 A = 45 1 −2 + 59
¤ ¤
5 −2 1 0 9 2 1
· ¸ · ¸ −2 1
−1 2 −7 0
5 P= p1 ,D= "1 3#
5 2 1 0 13 2 2
· ¸ · ¸ 15 A =
5 1 28 0 3 1
7 P= p1 ,D= 2 2
26 1 −5 0 2
4α + β 2α − 2β
· ¸ · ¸ " #
−1 1 2 0
9 P= p1 ,D= 17 A = 15
2 1 1 0 2a + 2 2α − 2β α + 4β
· ¸ · ¸
1 1 1 0
19 A = P DP T where P = p1 and D = .
2 1 −1 0 0
· ¸ · ¸
1 3 1 0
21 A = P DP T where P = p1 and D = .
10 3 −1 0 0
" 15 8
# · ¸ · ¸
4 −1 1 0
23 A = 17 8
17
15
= P DP T where P = p1 and D = .
17 1 4 0 −1
17 − 17
· ¸ · ¸
a −b
25 The vector is an eigenvector corresponding to the eigenvalue 1 and the vector
b a
is an eigenvector corresponding to the eigenvalue 0.
· ¸ · ¸ · ¸
a b u v
27 Let A = , u = 1 and v = 1 . The desired equality is a consequence of the
b c u2 v2
calculations:
µ· ¸ · ¸¶ · ¸
a b u1 v
(Au) • v = · 1 = (au 1 + bu 2 )v 1 + (bu 1 + cu 2 )v 2
b c u2 v2
· ¸ µ· ¸ · ¸¶
u1 a b v1
u • (Av) = · = u 1 (av 1 + bv 2 ) + u 2 (bv 1 + c v 2 )
u2 b c v2
" 1 # "p
− p1 2 p3
#
p
29 A = 2 2 2
p1 p1 0 p1
2 2 2
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 476

476 Chapter 12: Answers to selected exercises

· ¸ · ¸ · ¸ · ¸
1 a £ −b £ a 0
a b + 21 2
¤ ¤
31 A = −b a where 6= .
a 2 +b 2 b a +b a b 0

Section 4.1


1
   
1
  
−3 −1
1
1  2 = − 3 −6 11 There is no c such that −3 = c  3.
−1 3 1 2
 
1
 
1  
1
   
1 3
3 There is no c such that 2 = c  0.
13 1 = −2 −1 + −1
3 −1
  1 1 3
3
 
12
5 2 = 14  4      
1 2 1
1 8 15 2 = 2 − 0
1 3 2
 
1
 
−1
7 −3 = −  3
17 x = u + 2v
1 −1
 
3
 
3 19 x is not in Span {u, v}.
9 There is no c such that 1 = c 1.
 
2 3 21 x = 34 u + 53 v

3
  
1
 
1
 
3
   
1 1
 
3
 
3
23 We have  5 = 1 +2  2 and 4 = 2 1 +  2 and the vectors  5 and 4
−1 1 −1 1 1 −1 −1 1
are linearly independent.
 
5
 
1
 
1
 
2
   
1 1
 
5
 
2
25 We have 7 = 3 1 + 2 2 and 1 = 3 1 −
           2 and the vectors 7 and 1
  
1 1 −1 4 1 −1 1 4
are linearly independent.
 
· ¸ · ¸· ¸ · ¸ 1
1 2 1 2 5 9
27 The transition matrix is and, because = , we have w = 9 1 +
2 1 2 1 2 12
1
 
1
12  2.
−1
· ¸ · ¸· ¸ · ¸
3 3 3 3 a 3a + 3b
29 The transition matrix is and, because = , we have w =
2 −1 2 −1 b 2a − b
 
1
 
1
(3a + 3b) 1 + (2a − b)  2.
1 −1
¤ 2 3 −1 £
· ¸ · ¸
£ ¤ £ ¤ 2 3 £ ¤
31 We have u v = a b and u v = a b . This means that the tran-
1 5 1 5
· ¸
2 3
sition matrix from the basis {u, v} to the basis {a, b} is and the transition matrix
1 5
· ¸−1
2 3
from the basis {a, b} to the basis {u, v} is .
1 5
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 477

Chapter 12: Answers to selected exercises 477

¤ 3 0 −1 £
· ¸ · ¸
£ ¤ £ ¤ 3 0 £ ¤
33 We have u v = a b and u v = a b . This means that the tran-
2 1 2 1
· ¸
3 0
sition matrix from the basis {u, v} to the basis {a, b} is and the transition matrix
2 1
· ¸−1
3 0
from the basis {a, b} to the basis {u, v} is .
2 1
¤ 2 1 −1 £
· ¸ µ · ¸¶
2 −1
= a b or u v 13
£ ¤ £ ¤ £ ¤
35 We have u v = a b . This means that the
1 2 −1 2
· ¸
2 −1
transition matrix from the basis {a, b} to the basis {u, v} is 31 and we have w =
−1 2
· ¸ µ · ¸¶ · ¸
¤ 1 2 −1 1
= u v 31 = 31 u + 13 v.
£ £ ¤
a b
1 −1 2 1
¤ 3 1 −1 £
· ¸ µ · ¸¶
£ ¤ £ ¤ 1 5 −1 £ ¤
37 We have u v = a b or u v 13 = a b . This means that the
2 5 −2 3
· ¸
1 5 −1
transition matrix from the basis {a, b} to the basis {u, v} is 13 and we have
−2 3
· ¸ µ · ¸¶ · ¸
¤ 4 ¤ 1 5 −1 4
= 17 1
£ £
w= a b = u v 13 13 u + 13 v.
3 −2 3 3
39 Let u = pa+qb. One of the numbers p or q must be different from 0. If p 6= 0, then {b, u}
is a basis of the vector plane Span{a, b}. Indeed, if xb + yu = 0, then xb + y(pa + qb) = 0
or y pa + (x + y q)b = 0. Thus x = y = 0, because p 6= 0. Consequently, the vectors b and
u are linearly independent. This means that the set {b, u} is a basis of the vector plane
Span{a, b}.
If q 6= 0 , then {a, u} is a basis of the vector plane Span{a, b}.

Section 4.2


9 −6 15

1 25
p 1 −6 4 −10
17 38
3 30
15 −10 25
5 3
 
0
   
 1 0 
7 Span −2 , 2 19 0

0 1
 0
 
5
x + 2y + 2z
 
1 1
9 27
21 19 2x + 4y + 4z 
1
2x + 4y + 4z
 
1
11 97 2
       
 1 3   1 8 
2 23  1 , 1
   and 0 ,  5
2 2
   
3 −2 −4
13 x = 11
        
1 0 2

 2 19   3 94 
1 0 0 0
15 5 25 1 ,  12 and  2 , 53
2 0 4 5 97
   
−10 −4
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 478

478 Chapter 12: Answers to selected exercises

    p
 1 4  39 2
2
27 Span −2 , 2 p
0 5 41 2
 
   1 
p1

1 0 −1

−p 
3 

 12   43 21  0 2

0
 
29 Span  p  ,  p1 
2  3  −1 0 1
p1 

 

 0 3 
2 1 1

 
−1 1 1
45 3 2 −1
31 12  1
1 −1 2
2

10 3 −1
  
−1
1  3 2
33 1 −7
27
47 11 3
25 −1 3 10
 
2

1 2 0

35 27 1 1 2 4 0 
49 5
4 0 0 5
 
8 51 x = 5 and y = 8
37 13 7
1 53 x = − 21 and y = 12
  
1

−1
55 x = − 21 + t , y = t . The vectors −2 and  2 are linearly dependent.
1 −1

57 x = 0 and y = 23
 
1
63 23 1
2
59 x = − 34 and y = − 41
65 x = 13 , y = 34

2
 
5 4 2

61 12  1 1 4
67 9 5 −2
−1 2 −2 8

69
· ¸−1 · ¸ · ¸−1 · ¸
¤ u·u u·v b·u ¤ u·u 0 b·u
p = A(A T A)−1 A T b = u
£ £
v = u v
u·v v·v b·v 0 v·v b·v
" b·u #
¤ 1
· ¸· ¸
0 b·u b·u b·v
v u·u v u·u =
£ £ ¤
= u 1 = u u+ v
0 b · v b·v u·u v·v
v·v v·v

 1
71 17 27 p1

p
2 − 14 x 38  p
 2 2 p1
 
73 13 − 11
7
13 x
 p1 − p1  
77  2 p
2
38 

 2 2  0 238

p p p 6
0 p
5 3 5 5 − p1

 38
 p1 − p 4  5
75  5 3 5 p3

 p 0
5 5
0 3
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 479

Chapter 12: Answers to selected exercises 479

Section 5.1

     
−32  −3   1 
1  17 7 Span −3 = Span  1
18 3
   
−1
 
11 9 −x − 2y + 3z = 0
3 −1
−7 11 −14x + 23y + z = 0
 
 2  13 det A = 18
5 Span −9

−4
 15 det A = −1
£ ¤ £ ¤ £ ¤
17 det a a b = det a b a = 0, because a • (b × a) = a • (a × b) = 0, and det b a a = 0,
because a × a = 0.
£ ¤ £ ¤ £ ¤
19 det a + d b c = (a + d) • (b × c) = a • (b × c) + d • (b × c) = det a b c + det d b c
£ ¤
21 det a + sb b £c + t b = ¤ (a+sb) (b×(c+t b)) = a (b×c)+sb (b×c)+t a (b×b)+st b (b×b) =
• • • • •

a • (b × c) = det a b c
25

a 1 + sa 2 b 1 + sb 2 c 1 + sc 2 a 1 + sa 2 a2 a3 + t a2
   

det  a2 b2 c 2  = det  b 1 + sb 2 b2 b3 + t b2 
a3 + t a2 b3 + t b2 c3 + t c2 c 1 + sc 2 c2 c3 + t c2
a1 a2 a3 a1 b1 c1
   

= det  b 1 b 2 b 3  = det  a 2 b2 c2 
c1 c2 c3 a3 b3 c3


2 1 2

27 The first and the third column of the matrix 2 4 2 are equal.
7 5 7
£ ¤ £ ¤
29 33 (The result is a consequence of the equality det a + 9b b c = det a b c .)
£ ¤ £ ¤
31 99 (The result is a consequence of the equality det 3a + 5b b c = 3 det a b c .)
1 s 0
 

33 Use Theorem 5.1.16 and det 0 1 0 = 1.


0 t 1

Section 5.2


−1 1 −1
 
4 −4 0

1  1 1 −3 5 − 41 −9 4 1
−1 −3 5 1 0 −1


5 −6 −2
 
9 −4 −1

3  0 −2 1 7 −7 17 −2
−10 13 1 −4 −1 6
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 480

480 Chapter 12: Answers to selected exercises


−1 1 1

9  1 −1 0
1 0 −1
· ¸ · ¸ · ¸
1 5 2 5 2 1
11 4 det − 3 det + 2 det = −9
1 3 1 3 1 1
· ¸ · ¸ · ¸
2 5 4 2 4 2
13 −3 det + 1 det − 1 det = −9
1 3 1 3 2 5
· ¸ · ¸ · ¸
1 1 4 1 4 1
15 − det − 1 det = −8 19 7 det = 49
3 1 2 3 1 2

· ¸ 21 x = 2, y = − 51 , z = − 75
2 1
17 −3 det =3 23 x = − 25 , y = 45 , z = − 15
3 1

Section 5.3


3 1 3
 
1 0 0

1 Linearly independent because 3 3 1 ∼ 0 1 0.


1 3 3 0 0 1
1 0 74
 

1 3 1
  
 
3 Linearly dependent because 3 2 2 ∼ 0 1 7 .
   1 
2 −1 1
 
 
0 0 0

1 1 1
 
1 0 0

5 Linearly independent because 1 1 0 ∼ 0 1 0.


1 0 0 0 0 1
1 0 − 51
 

2 1 1
  
 
7 Linearly dependent because 1 3 4 ∼ 0 1
 7.
5
9 7 8
 
 
0 0 0
1 0 p
  
0 1 0

£ ¤ £ ¤
9 a b c ∼ 0 1 q , p 6= 0, or a b c ∼ 0 0 1
  
0 0 0 0 0 0
11 x − y + z = 0. The vectors are linearly dependent.
13 4x + y − z = 0. The vectors are linearly dependent.

x+y
   
3
 
1
 
2
15 c = xa + yb = 2x + y 
 19 a = 3, 3 = − −1 + 2 1
   
3x + y 1 1 1

21 Any a 6= 2.
     
2 1 1 23 Any a 6= 1.
17 a = 2, 1 = 1 + 0
1 0 1 25 x = 3a − b + 12 c
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 481

Chapter 12: Answers to selected exercises 481

27 x = 34 a − 14 b − 41 c x
   
5
31 a = −11 and  y  = t −1.
29 a = 2 or a = 5. z −7

33 Because the implication (a) implies (b) is immediate, we only have to show the other
implication. If αu + βv is an arbitrary vector in Span{u, v}, then we have

αu + βv = α(pa + qb + r c) + β(xa + yb + zc) = (pα + xβ)a + (qα + yβ)b + (r α + zβ)c,

which means that αu+βv is in Span{a, b, c}. We show in the same way that an arbitrary
element in Span{a, b, c} is in Span{u, v} which completes the proof.
35 det a b a × b = (a × b) • (a × b) = ka × bk2
£ ¤

37 det a b a × b = ka × bk2 > 0


£ ¤

39 We have (c − (r a + sb)) • a = 0 and (c − (r a + sb)) • b = 0. This shows that r a + sb is the


projection of c on the vector plane Span{a, b}. We also have c − t (a × b) • (a × b) = 0. This
shows that t (a × b) is the projection of c on the vector line Span{a × b}.
¯ h i¯
° |c•(a×b)| ¯det a b c ¯
° ³ ´° ¯ ¯
c•(a×b)
41 kc − pk = °c − c − 2 (a × b) ° = ka×bk =
°
ka×bk ka×bk

43 A = 12 kbk ka×bk
kbk
= 12 ka × bk
45 Let u = pa + qb + r c. One of the numbers p, q, r must be different from 0. Suppose
p 6= 0. We show now that {b, c, u} is a basis in R3 . If xb + yc + zu = 0, then xb + yc +
z(pa + qb + r c) = 0 or zpa + (x + zq)b + (y + zr )c = 0. Hence x = y = z = 0, because
p 6= 0. Consequently the vectors b, c, u are linearly independent, which means that the
set {b, c, u} is a basis in R3 .
If q 6= 0 we obtain that {a, c, u} is a basis in R3 and if r 6= 0 we obtain that {a, b, u} is a
basis in R3 .

Section 5.4

1 2 9 3
3 1
11 rankA = 2
5 2
7 1 13 rankA = 3

Chapter 6

p p    
½ · ¸ · ¸¾  4
1 7 and 5 3 
p p 1 2
7 A ,A =  5 , 0
3 50 and 10 2 −1
4
 
    −3
½ · ¸ · ¸¾  1 1 
1 1
5 A ,A =  1 , 2
1 −1
1
 
−3
 
1
 
1 · ¸ · ¸
1 2
9 A = σ1 u1 vT T p1   p1  0, v1 = p1 , v2 = p1
1 + σ2 u2 v2 , where u1 = 3 1 , u2 = 2 5 2 5 −1
,
1 −1
p p
and the singular values are 15 and σ2 = 10.
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 482

482 Chapter 12: Answers to selected exercises


1
  
−1 · ¸
3
11 A = σ1 u1 vT T p1   p1   p1
1 +σ2 u2 v2 , where u1 = 6 −1 , u2 = 2 −1 , v1 = 10 −1 and v2 =
2 0
p p
· ¸
1 1
p , and the singular values are σ1 = 15 and σ2 = 5.
10 3
p
15 0 "vT #
  
1
£ ¤ p 1 1
13 A = u1 u2 u3  0 10 T , u3 = p −2
6
0 0 v2 1
p
15 0 "vT #
  
1
£ ¤ p 1 1
15 A = u1 u2 u3  0 10 
T
, u 3 = p −1
3
0 0 v2 −1

17 If A = σ1 u1 vT T T T T T T T T T T
1 + σ2 u2 v2 , then A = σ1 (v1 ) u1 + σ2 (v2 ) u2 = σ1 v1 u1 + σ2 v2 u2 .
19 If A = σ1 u1 vT T T 2 T 2 T T
1 +σ2 u2 v2 , then A A = σ1 v1 v1 +σ2 v2 v2 , by exercise 18. Hence A A(v1 ) =
(σ21 v1 vT 2 T 2 T 2 T 2 T 2
1 + σ2 v2 v2 )(v1 ) = σ1 v1 and A A(v2 ) = (σ1 v1 v1 + σ2 v2 v2 )(v2 ) = σ2 v2 . There-
fore σ1 and σ2 are the singular values of the matrix A.

Section 7.1


7−3 4 9
 
4 4 9

1 3 is an eigenvalue because det  1 5−3 1 = det 1 2 1 = 0.


1 2 4−3 1 2 1

2−1 1 1
 
1 1 1

3 1 is an eigenvalue because det  1 2−1 1 = det 1 1 1 = 0.


 
8 3 3−1 8 3 2

3−1 4 1
 
2 4 1

5 1 is an eigenvalue because det  5 7−1 9 = det 5 6 9 = 0.


2 4 2−1 2 4 1

7−3 8 4
 
4 8 4
 
2 4 2

7 3 is an eigenvalue because det  2 9−3 3 = det 2 6 3 = 2 det 2 6 3 =


   
2 4 5−3 2 4 2 2 4 2
0.

5−1 4 4
 
4 4 4

9 1 is an eigenvalue because det  4 5−1 4 = det 4 4 4 = 0 and 4 is an eigen-


 
4 1 8−1 4 1 7

5−4 4 4
 
1 4 4

value because det  4 5−4 4 = det 4 1 4 = 0


4 1 8−4 4 1 4
     
 1   −5 −2 
11 Span 1 15 Span  0 ,  1
1 1 0
   
   
 1   1 
13 Span 1 17 Span  1
1
   
−4
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 483

Chapter 12: Answers to selected exercises 483


−1 −2 2
 
1 0 0

19 0, 2, 3
p p 25 P =  0 1 1, D = 0 1 0
21 1, 5−2 5 , 5+2 5
1 0 1 0 0 6

0 1 0
 
2 0 0
 
−2 −2 1
 
3 0 0

23 P = 2 −1 0, D = 0 1 0 27 P =  0 1 2, D = 0 3 0
1 1 1 0 0 0 1 0 2 0 0 12

29 The eigenvalues are 2 (double) and 3. The eigenspace corresponding to the eigenvalue
   
 0   1 
2 is Span 0 and the eigenspace corresponding to the eigenvalue 3 is Span 1 .
1 2
   
Consequently it is not possible to diagonalize the matrix.

Section 7.2

p1 p1 p1
 
2 3 6

−1 0 0


 1 1 1
1 P = − p2 p p , D =  0 1 0 

3 6
0 0 7

0 − p p2
1
3 6
 1
p1 p1

p
2 3 6

12 0 0

 
1 2
3 P =
 0 − p p , D =  0 18 0

3 6
0 0 6

1 p1 p1
−p
2 3 6
 2 1 p1 
−p 30
5 6

1 0 0


5 p1 
5 P =
 0 − 30 , D = 0 1 0

6
0 0 13

p1 2 p2
5 30 6
 3
p1 p3

p
11 10 110

33 0 0

 
 p1 − p3 p 1
7 P =  11 , D =  0 0 0

10 110 
0 0 0

1 10
p 0 −p
11 110
 2
p1 p2

p
6 5 30

10 0 0

 
 p1 − p2 1
p , D =  0 10
9 P = 6 0

5 30 
0 0 −20

p1 0 − p5
6 30

1
 
1
  
1
11 A = 12  0 1 0 −1 + 18 −1 1 −1 1 + 6 2 1 2 1
£ ¤ £ ¤ £ ¤
2 3 6
−1 1 1
 
2
 
1
 
2
13 A = 10
£ ¤ 10 £ ¤ 20 £ ¤
6 1 2 1 1 + 5 −2 1 −2 0 − 30
     1 2 1 −5
1 0 −5
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 484

484 Chapter 12: Answers to selected exercises

  41 25
27 −13 −5

0

2 2
15 A = −13 30 −8 17 A =  25
 41
0

2 2
−5 −8 22
0 0 33
 1  p p
 p p1 − p1 2 2 p1

1 2 0

 12 3 6 p 2
p1 p1  

 2 − 3
19 1 0 1 =  p 0 3 0

6 
0 1 1 1 2 3
0 p p 0 0 p
3 6 6

a b c
 

21 Let A = b d e . Then
c e f

a −λ b c
 

det  b d −λ e  = (a −λ)(d −λ)( f −λ)+2bce −(a −λ)e 2 −(d −λ)c 2 −( f −λ)b 2 .


c e f −λ

Let
Φ(λ) = (a − λ)(d − λ)( f − λ) + 2bce − (a − λ)e 2 − (d − λ)c 2 − ( f − λ)b 2 .
If Φ(λ) = −(λ − α)3 , then Φ0 (λ) = −3(λ − α)2 and Φ00 (λ) = −6(λ − α). Since Φ00 (λ) =
2(a + d + f ) − 6λ, we have

2(a + d + f ) − 6λ = −6(λ − α)

and consequently α = 31 (a + d + f ). Now we have

Φ0 (λ) = −(a − λ)(d − λ) − (a − λ)( f − λ) − (a − λ)(d − λ) + e 2 + c 2 + b 2


= −3λ2 + 2(a + d + f )λ − d f − a f − ad + e 2 + c 2 + b 2

and, since α = 13 (a + d + f ),

1 2
Φ0 (α) = − (a + d + f )2 + (a + d + f )2 − d f − a f − ad + e 2 + c 2 + b 2
3 3
1
= (a + d + f ) − d f − a f − ad + e 2 + c 2 + b 2
2
3
1
= (a − d )2 + (d − f )2 + ( f − a)2 ) + e 2 + c 2 + b 2 .
6
But Φ0 (α) = 0, so we must have a = d = f = α and e = c = b = 0.

Section 8.1

· ¸
1 2x + y = 1 3
7 25
4
3 3x + 2y = 11 · ¸
−3
9 12
µ· ¸ · ¸¶ · ¸ 1
x 3 3
5 − 21 • =0 11 52
y 1 1

1 1 1

· ¸
x − a1 b1 − a1
13 det x 1 a 1 b 1  = − det 1 =0
x2 − a2 b2 − a2
x2 a2 b2
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 485

Chapter 12: Answers to selected exercises 485

Section 8.2

  q
1 15 117
1 1 13
1 3
4  
2
3 x + y − 4z = 6 17 13 4
5 x −y −z =1 5
7 2x + 3y + 2z = 9 19 3x − y + z = 3
x
     
3 3
2
y − 11 −1 • −1 = 0 21 x + y + z = 1
9   
z 1 1 p
2
      23 2
0  1 0 
11 0 + Span  0 , 1 p
25 23
2 1
 
−3
27 23
 
17
1 23
13 26
12 29 2

31 The conditions (p−a) • (u×v) = 0, (q−p) • u = 0, and (q−p) • v = 0 are equivalent to the fact
that p is in the plane a+Span{u, v} and there is a real number t such that p = q+t (u×v).

Section 9.1

1 By Theorem 9.1.4 we have ps − q t = 0 and pt + q s = 1. Hence

p = p(pt + q s) = psq + t p 2 = t q 2 + t p 2 = t (q 2 + p 2 ) = t

and
q = q(pt + q s) = sq 2 + t pq = sq 2 + sp 2 = s(q 2 + p 2 ) = s.

3 By Theorem 9.1.4 we have ps − q t = −1 and pt + q s = 0. Hence

−p = p(ps − q t ) = sp 2 − pt q = t p 2 + sq 2 = s(q 2 + p 2 ) = s

and
−q = q sp − t q 2 = −t p 2 − t q 2 = −t (p 2 + q 2 ) = −t .

5 Using Theorem 9.1.4 we obtain

c = (p 2 − q 2 )a + 2pqax = (p 2 + p 2 − p 2 − q 2 )a + 2pqax
= (2p 2 − (p 2 + q 2 ))a + 2pqax = (2p 2 − 1)a + 2pqax .

· ¸ · ¸ · ¸ · ¸ · ¸· ¸
a1 a −a 2 pa 1 − q a 2 p −q a 1
7 If a = , then b = pa + qax = p 1 + q = = .
a2 a2 a1 pa 2 + q a 1 q p a2
9 L(M (x)) = pxx + q(xx )x = −qx + pxx
11 Since (4p 3 − 3p)a + (3q − 4q 3 )ax = −a, we have 4p 3 − 3p = −1 and 3q − 4q 3 = 0.
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 486

486 Chapter 12: Answers to selected exercises

p p
13 Since (2p 2 − 1)a + 2pqax = 21 a + 23 ax , we have 2p 2 − 1 = 12 and 2pq = 23 . Conse-
p
quently, p = 23 and q = 12 .
15 This is an immediate consequence of Exercise 11.
17 Since (2p 2 − 1)x + 2pqxx = −x, we have 2p 2 − 1 = −1 and 2pq = 0. Consequently, p = 0
and q = 1.
19 kb − ak = kak
21 This is a direct consequence of Theorem 9.1.6.
p p p
π π 1 2 3 2
25 cos 7π
12 = cos( 3 + 4 ) = 2 2 − 2 2
p p p
π
27 cos 12 = cos( π3 − π4 ) = 12 22 + 23 22
p p p
π π 3 2 1 2
29 cos 5π
12 = cos( 4 + 6 ) = 2 2 − 2 2
31 cos(θ + π) = cos θ cos π − sin θ sin π = − cos θ
33 This is a direct consequence of Theorem 9.1.6.
35 This is a direct consequence of Theorem 9.1.6.
37 This is a direct consequence of Theorem 9.1.6.
39 This is a direct consequence of Theorem 9.1.6.

Section 9.2

p # p p 
1 2x 2 + 2x y + 5y 2 2 −322
"
p
2 p
0
13 p 
3 −2x 2 − 8x y + y 2

10
−322 2 0 10
· ¸ 2
3 7
5 15 The matrix is positive definite.
7 2
1 1
· ¸
7 1 2 17 The matrix is not positive definite.
2 4 " p # p 3p2 
9 Positive semidefinite. 2 0 2 2
19 3p2 p2  p 
11 Negative definite. 2
2 2 0 2

¸ p1 − p3 T
 1
− p3 ·
  
·¸ · ¸· ¸· ¸−1 p
−3 3 1 −3 6 0 1 −3 10 10 6 0 10 10
21 (a) Since = = 3    ,
3 5 3 1 0 −4 3 1 p p1 0 −4 p3 p1
 1 10 10 10 10
− p3

p
10 10
we can take P =  .
p3 p1
10 10
· ¸ · 0¸
−3 3 x
(b) x 0 y 0 P T P 0 = 6(x 0 )2 − 4(y 0 )2
£ ¤
3 5 y
 1   3 
µ· ¸¶ · ¸ · ¸ p · ¸ · ¸ −p
a 1 a 3 −b 10 1 1 3 0 10
(c) Since R =p +p , we have  3  = p +p and  1  =
b 10 b 10 a p 10 0 10 1 p
· ¸ · ¸ 10 10
1 0 3 −1 −1 1 ◦
p +p . Now, because cos p ≈ 71 , the angle of the rotation is approx-
10 1 10 0 10
imately 71 .◦
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 487

Chapter 12: Answers to selected exercises 487

¸ − p3 − p1 T
 3
− p1 ·
  
· ¸ · ¸· ¸· ¸−1 −p
−3 3 1 −3 6 0 1 −3 10 10 −4 0 10 10
23 (a) Since = = 1    ,
3 5 3 1 0 −4 3 1 p − p3 0 6 p1 − p3
 3 10 10 10 10
− p1

−p
10 10
we can take P =  1 .
p − p3
10 10
· ¸ · 0¸
−3 3 x
(b) x 0 y 0 P T P 0 = −4(x 0 )2 + 6(y 0 )2
£ ¤
3 5 y
µ· ¸¶ · ¸ · ¸
a a −b
(c) The rotation is R = − p3 + p1 and, because cos−1 p3 ≈ 19◦ , the
b 10 b 10 a 10
angle of the rotation is approximately 180◦ − 19◦ = 161◦ .
 3 T
p1
 3
p1

· ¸ · ¸· ¸· ¸−1 p · ¸ p
−3 3 1 −3 6 0 1 −3 10 10 −4 0 10 10
25 (a) Since = = 1    ,
3 5 3 1 0 −4 3 1 −p p3 0 6 − p1 p3
 3 10 10 10 10
1

p p
10 10
we can take P =  1 .
−p p3
10 10
· ¸ · 0¸
−3 3 x
(b) x 0 y 0 P T P 0 = −4(x 0 )2 + 6(y 0 )2
£ ¤
3 5 y
µ· ¸¶ · ¸ · ¸
a a −b
(c) The rotation is R = p3 − p1 and, because cos−1 p3 ≈ 19◦ , the
b 10 b 10 a 10
angle of the rotation is approximately 360◦ − 19◦ = 341◦ .
 1 T
p3 ¸ − p1 p3
 
· ¸ · ¸· ¸· ¸−1 −p ·
−3 3 1 −3 6 0 1 −3 10 10 6 0 10 10
27 (a) Since = = 3    ,
3 5 3 1 0 −4 3 1 −p − p1 0 −4 − p3 − p1
 1 10 10 10 10
p3

−p
10 10
we can take P =  3 .
−p − p1
10 10
· ¸ · 0¸
−3 3 x
(b) x 0 y 0 P T P 0 = 6(x 0 )2 − 4(y 0 )2
£ ¤
3 5 y
µ· ¸¶ · ¸ · ¸
a a −b
(c) The rotation is R = − p1 − p3 and, because cos−1 p1 ≈ 71◦ , the
b 10 b 10 a 10
angle of the rotation is approximately 180◦ + 71◦ = 251◦ .
29 (a) Since

¤ 5 −1 2 0 5 −1 −1 x
· ¸· ¸ · ¸· ¸· ¸ · ¸
£ ¤ 3 −5 x £
x y = x y
−5 27 y 1 5 0 28 1 5 y
 5 1
T
¸ 5 1
 
¤ p26 − p26 2 0 p26 − p26
· · ¸
 x ,
£
= x y  1  
p p5 0 28 p1 p5 y
26 26 26 26
 5
− p1

p
26 26 
we can take P = 
1 5
.
p p
26 26
 5
− p1 · 0 ¸

· ¸ p · ¸ · 0¸
x 26 26  x 0 , the equation x 0 y 0 P T
£ ¤ 3 −5 x
(b) Since = 1 P 0 = 14 be-
y p p5 y −5 27 y
26 26
(x 0 )2 (y 0 )2
comes 7 + 1 = 1.
2
January 24, 2019 13:55 book-961x669 A Bridge to Linear Algebra-11276 LA_Master page 488

488 Chapter 12: Answers to selected exercises

µ· ¸¶ · ¸ · ¸
a a −b
(c) The rotation is R = p5 + p1 and the angle of rotation is cos−1 p5 ≈
b 26 b 26 a 26
11◦ .

Section 9.3

cos α
 
1
    
0 1

1 cos α 0 + sin α
   0 × 0 = sin α
    
0 1 0 0
 
0
    
1 0 0

3 cos α 0 + sin α 0 × 0 = − sin α


1 0 1 cos α
 
0
    
1 0 0

5 cos α 1 + sin α
   0 × 1 = cos α
    
0 0 0 sin α
a1
   
1
7 If a = a 2  and n = 0, then
a3 0

a1
 
a·n a·n
µ ¶
1
• n + cos α a − •n + sin α (n × a) = a 2 cos α − a 3 sin α .

knk2 knk2 knk
a 2 sin α + a 3 cos α

a1
   
0
9 If a = a 2  and n = 1, then
a3 0

a 1 cos α + a 3 sin α
 
a·n a·n
µ ¶
1
• n + cos α a − •n + sin α (n × a) =  a2  .
knk2 knk2 knk
−a 1 sin α + a 3 cos α

Chapter 11

The solutions for the problems for a computer algebra system presented here are written
in the Maple code. It is necessary to include
> with(LinearAlgebra) :
at the beginning of your Maple document.

1 > A := Matrix([[2, 3, 5, 1], [4, 7, 9, 2], [8, 15, 17, 4]]);


> ReducedRowEchelonForm(A);
2 > B := Matrix([[3, 2, 1], [8, 3, 4], [5, 7, 2]]);
> ReducedRowEchelonForm(B );
> C := Matrix([[4, 7], [3, 0], [7, 9]]);
> LinearSolve(B,C );
3 > E := Matrix([[2, 1, 5, 2], [3, 8, 7, 4], [1, 5, 7, 3], [4, 3, 3, 2]]);
> MatrixInverse(E );
4 > Matrix([[1, 4, 0], [0, 1, 0], [0, −3, 1]]).Matrix([[1, 0, 5], [0, 1, 2], [0, 0, 1]]).
Matrix([[2, 1, 3], [1, 1, 4], [7, 2, 5]]);
5 > u := Vector([2, 3, 1]);
> v := Vector([4, 1, 3]);
> P := ProjectionMatrix({u, v});
> b := Vector([7, 21, 14]);
> P.b;
6 > F := Matrix([[2, 1], [1, 3], [5, 2]]);
> c := Vector([2, 1, 1]);
> LeastSquares(F, c);
7 > G := Matrix([[7, 2, x], [y, 2, 5], [4, −3, 8]]);
> Determinant(G);
8 > H := Matrix([[2, 1, 2], [2, 3, 4], [4, 2, 10]]);
> Eigenvectors(H );
9 > p := Vector([3, 1, 4]);
> q := Vector([1, −11, 2]);
> r := CrossProduct(p, q);
> A1 := OuterProductMatrix(p, p)/Norm(p, 2)^2;
> A2 := OuterProductMatrix(q, q)/Norm(q, 2)^2;
> A3 := OuterProductMatrix(r, r)/Norm(r, 2)^2;
> A1 + A2 + A3;
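This Maple session builds the projection matrices onto $p$, $q$, and $r = p \times q$, and their sum is the identity because the three vectors are mutually orthogonal. A pure-Python check of the same fact (not Maple; the helper names are ad hoc):

```python
# Verify that the three projection matrices of problem 9 sum to the identity.
def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def proj_matrix(v):
    n2 = sum(x * x for x in v)  # |v|^2
    return [[v[i] * v[j] / n2 for j in range(3)] for i in range(3)]

p = [3.0, 1.0, 4.0]
q = [1.0, -11.0, 2.0]
r = cross(p, q)

P1, P2, P3 = proj_matrix(p), proj_matrix(q), proj_matrix(r)
S = [[P1[i][j] + P2[i][j] + P3[i][j] for j in range(3)] for i in range(3)]
print(S)  # approximately the 3x3 identity matrix
```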
10 > K := Matrix([[1, 1, 1], [2, −1, 5], [1, 2, −3]]);
> QRDecomposition(K );
Bibliography

[1] D. Atanasiu, Linjär Algebra och Geometri, Göteborgs Universitet, 1994.


[2] J.S.R. Chisholm, Vectors in three-dimensional space, Cambridge University Press, 1978.
[3] J. Dieudonné, Algèbre linéaire et géométrie élémentaire, Hermann, 1964.
[4] T.W. Körner, Vectors Pure and Applied, Cambridge University Press, 2013.
[5] S. Lang, Introduction to Linear Algebra, Springer, 2nd edition, 1997.
[6] R. Larson, Elementary Linear Algebra, Cengage, 8th edition, 2017.
[7] D. Lay, S. Lay, and J. McDonald, Linear Algebra and Its Applications, Pearson, 5th edition, 2016.
[8] L. Spence, A. Insel, and S. Friedberg, Elementary Linear Algebra, Pearson, 2nd edition,
2007.

Index

m × n matrix, 55

addition of matrices, 57
adjoint matrix, 254
area of a triangle, 158
associativity of matrix multiplication, 61
augmented matrix, 76
back substitution, 25, 94
basic variable, 93
basis, 135, 140, 181, 190, 270
best approximation, 149, 205, 211
Cartesian coordinates, 132, 179
change of basis, 193
characteristic polynomial, 315
Cholesky decomposition, 409
Chord Theorem, 431
column of a matrix, 55
column space of a matrix, 288
consumption matrix, 37
coordinates of a vector, 141, 190, 272
Cramer's Rule, 32, 261
cross product, 236
cross-product term of a quadratic form, 403
demand vector, 37
determinant, 28, 244
diagonal matrix, 39, 319
diagonalizable matrix, 46, 319
dimension of a vector subspace, 284
distance between two points, 144, 201
distance from a point to a line, 156, 363, 371
distance from a point to a plane, 379
dot product, 145, 200
eigenspace, 318
eigenvalue, 44, 314
eigenvector, 45, 317
elementary matrix, 14, 101
elementary operations, 71
elementary row operations, 76
entry of a matrix, 55
equality of matrices, 56
equation of a plane, 374
equation of a vector plane, 240
equivalent matrices, 98
forward substitution, 25
free variable, 93
Gauss-Jordan form, 79
Gaussian elimination, 78
Gram-Schmidt process, 349
identity matrix, 62
indefinite quadratic form, 400
inverse matrix, 10, 106
invertible matrix, 10, 106
leading one, 79
leading term, 79
least squares, 216
least-squares line, 223
Leontief model, 35
line, 355, 370
linear combination, 141, 190, 270
linear dependence, 275
linearly dependent vectors, 135, 182, 264
linearly independent vectors, 139, 185, 269
lower triangular matrix, 23
LU-decomposition, 23, 120
LU-factorization, 23, 120
matrix, 55
multiplication of matrices by numbers, 58
negative definite quadratic form, 400
negative semidefinite quadratic form, 400
nine-point circle, 440
norm of a vector, 143, 200
nullity of a matrix, 312
nullspace of a matrix, 312
observed values, 223
origin, 132, 179
orthogonal basis, 156, 208, 275
orthogonal diagonalization, 167
orthogonal matrix, 165, 331
orthogonal vectors, 202
orthogonally diagonalizable matrix, 333
orthonormal basis, 208, 333
outer product expansion, 297
output vector, 37
Parallelogram law, 201
parameter values, 223
perp operation, 153
pivot column, 80
pivot position, 80
pivot variable, 93
plane through a point orthogonal to a line, 372
plane through three points, 381
positive definite matrix, 407
positive definite quadratic form, 400
positive semidefinite quadratic form, 400
predicted values, 223
Principal Axes Theorem, 404
product of matrices, 2, 59, 60
production vector, 37
projection, 150, 206, 214
projection matrix, 153, 207, 215
projection of a point on a line, 363, 371
projection of a point on a plane, 377
QR factorization, 173, 225, 348
quadratic form, 400
quadratic terms of a quadratic form, 403
rank of a matrix, 288
rank theorem for 3 × 3 matrices, 285
rank-nullity theorem, 312
reduced row echelon form, 79
reflection, 152
regression line, 223
residuals, 223
right-hand rule, 414
rotation, 391, 414
row interchange, 76
row of a matrix, 55
row replacement, 76
row scaling, 76
row space of a matrix, 288
scalar multiplication, 58
simple matrix, 104
singular value decomposition, 301
singular values, 292
span, 134, 186
spectral decomposition, 169, 346
square matrix, 55
sum of matrices, 57
symmetric matrix, 68
Tangent-Secant Theorem, 433
transition matrix, 194
transpose of a matrix, 66
unit matrix, 62
unit vector, 144, 200
upper triangular matrix, 23
vector line, 134, 181
vector plane, 186
vector subspace, 134, 181, 186, 271
volume of a tetrahedron, 282
zero matrix, 57
