T. Shaska
Contents
Preface 2
1 Analytic geometry 3
1.1 Cartesian system of the plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.1 Relations and their graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Polar coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.1 Complex numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.2 Geometric interpretation of complex numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.3 Roots of unity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Algebraic equations, planar algebraic curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.1 Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.2 Circle and Ellipse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.3 Conics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.4 Conics with mixed terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5 Vectors in Physics and Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.5.1 The plane R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.5.2 The space R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3 Vector Spaces 47
3.1 Definition of vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.1.1 Linear independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.2 Bases and dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2.1 A basis for Matn×n (R) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.2.2 Finding a basis of a subspace in kn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.3 Nullspace and rank of a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.3.1 Finding a basis for the row-space, column-space, and nullspace of a matrix. . . . . . . . . . . . 58
3.4 Sums, direct sums, and direct products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.4.1 Direct sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.4.2 Direct products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4 Linear Transformations 65
4.1 Linear maps between vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.2 Composition of linear maps, inverse maps, isomorphisms . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.3 Matrices associated to linear maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.4 Change of basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.5 Linear transformation in geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.5.1 Scalings: scalar matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.5.2 Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.5.3 Shears . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.5.4 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.5.5 Reflections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.6 Review exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
In most universities linear algebra is taught at the sophomore level and is the first course that introduces students to concepts such as vector spaces, homomorphisms, and matrices. These are important concepts in mathematics and must be developed with care; they play a crucial role for mathematics and computer science majors. Moreover, linear algebra is also a very important tool for students who major in engineering, the sciences, finance, etc., and it is important that these students learn linear algebra from a more practical viewpoint. Responding to these needs, there are two kinds of textbooks on linear algebra.
In the first kind the emphasis is put mainly on applications and the mathematical content is hidden or lost. The
students feel overwhelmed with many applications and techniques and finish the course without the mathematical
culture that such a course should provide.
Textbooks of the second kind are at a more advanced level, focusing on the mathematical content and skipping most of the computational aspects of the subject. Most texts of this kind simply lack the examples and exercises needed to illustrate the material. Furthermore, they avoid most of the computational exercises in topics such as canonical forms, diagonalization, etc.
Almost all linear algebra books lack the historical perspective of the subject, the motivation for why the subject was developed. Students who take a course in linear algebra usually do not learn about the rich geometry of the plane R2, or the space R3, the transformations, etc. This is mainly because most of it is ignored by recent books and teachers of the subject.
In writing this book I simply wanted to present a more balanced approach to the mathematical and the computational content of the subject, while reminding the student throughout the book of the motivations and the historical perspectives of the subject. My goal was that students learn the computational tools of linear algebra and at the same time gain some level of maturity about the subject. Preparing the student for more advanced topics of linear algebra was the main concern. Therefore, we treat some topics that probably would not be covered in a typical sophomore-level course, such as irreducibility of polynomials over the rational numbers, complex numbers, etc.
What you will find special about this textbook is that geometry is incorporated throughout the book. The first chapter is a quick review of a few facts from analytic geometry that are covered in high school. However, it is no coincidence that the chapter closes with an example of a conic with mixed terms. The process of transforming this conic into standard form touches on some of the most important parts of linear algebra. First is the question of what type of transformations preserve the shape of the conic. Second, how is it done? Throughout the book the student will realize that diagonalizing a matrix corresponds to changing the basis of a vector space, which in itself corresponds to certain algebraic substitutions. Moreover, the idea that many things change when we change the basis, but some things stay the same, leads to the very important concept of an invariant. The inertia of a binary form, the determinant of a matrix, etc., are some of the concepts where this will be explored.
We treat canonical forms in detail, in contrast to most other textbooks at this level. It is commonly believed that this part of linear algebra is too difficult for an introductory course on the subject. I first experimented with teaching canonical forms at the undergraduate level at the University of California at Irvine, during the academic year 2002-2003. To my surprise, students responded very well. Since then I have continuously included canonical forms in my linear algebra courses. It is true that I spend one or two lectures on polynomials and complex numbers, but that is not lost time, since most undergraduate students lack a good understanding of these topics. While computing canonical forms of matrices can at times be painful, it can also be an excellent opportunity to introduce some computer algebra packages in the course. I have used Maple to give examples and exercises that illustrate the benefits of canonical forms. One can also assign programming exercises on computing canonical forms. The success of these programming exercises will depend on the exposure that students have had to programming. Some of them will respond very well, while others will have a difficult time with such exercises. Having
some knowledge of canonical forms is very helpful to engineering students who often have to normalize operators
and also to mathematics students who will use these canonical forms in later classes such as abstract algebra,
representation theory, etc.
We expect that the student taking the class already has knowledge of basic calculus or, as it is known in most US universities, Calculus I and II. For students of mathematics, a chart of classes and their logical hierarchy is suggested in the diagram.
Calculus I, II
  → Linear Algebra → Calculus III → Real Analysis I → Real Analysis II
  → Introduction to Proofs → Elementary Number Theory → Abstract Algebra I → Abstract Algebra II
We have tried not to overflow the book with meaningless examples and exercises; however, we give enough exercises to entertain even the most ambitious students. There are a few exercises in the text that need some knowledge of other areas of mathematics. We expect students who are taking the course to have taken the calculus sequence of classes and some basic course on discrete mathematics or logic. The level of the exercises varies. Most of the exercises are at a very basic level, where we simply check the level of understanding of the subject. However, some of the exercises are challenging even for the most ambitious student.
Acknowledgments
This book grew out of lectures on the subject at the University of California at Irvine, the University of Idaho, the University of Vlora, and Oakland University. I would like to thank especially the University of California at Irvine, for giving me the opportunity to teach the subject several times in a row during the time that the first draft of this book was written, and the enthusiastic students at all the above schools who had to put up with rough drafts of the manuscript.
Tanush Shaska
Rochester, 2018
Chapter 1
Analytic geometry
Analytic geometry is the study of geometric objects such as lines, circles, ellipses, parabolas, and hyperbolas through the use of algebra. In this chapter we briefly review some of the major problems of analytic geometry and show that the study of such problems was the main motivator for the development of what we now call linear algebra.
Any two distinct points determine a unique line. Let L be a line in the plane. Denote by θ the angle between the x-axis and the line L, measured counterclockwise starting from the x-axis. The quantity tan θ is called the slope of L.

(Figure 1.1: Coordinate plane)

Lemma 1.1. For any two points P1(x1, y1) and P2(x2, y2) in the plane, the slope of the line P1P2 is

    m = (y2 − y1) / (x2 − x1).
We leave it to the reader to prove the following lemma as a trigonometry problem.
Lemma 1.2. For any two lines L1 and L2 with slopes m1 and m2 respectively, the angle φ between them, measured in the counterclockwise direction, is

    tan φ = (m2 − m1) / (1 + m1 m2),

where m2 is the slope of the ending line and m1 the slope of the initial line.
Corollary 1.1. Two lines with slopes m1 and m2 are perpendicular if and only if

    m2 = −1/m1.
F(x, y) = 0.
A particular class of relations consists of those for which to every x in the domain there corresponds a unique value y in the range. Such relations are called functions.
When the domain and the set of values of a relation ∼ are the same, say X, we say that we have a relation ∼ on
X. A relation ∼ on X is called an equivalence relation if the following properties hold:
• reflexive: ∀x ∈ X, x ∼ x
• symmetric: ∀x, y ∈ X, x ∼ y =⇒ y ∼ x
• transitive: ∀x, y, z ∈ X, x ∼ y ∧ y ∼ z =⇒ x ∼ z.
The study of geometric objects using algebraic methods is the focus of algebraic geometry. It is exactly the
correspondence between geometric objects and algebraic equations that was a breakthrough in mathematics.
Example 1.1. Convert the point (2, π/3) from polar coordinates to Cartesian coordinates.

    x = r cos θ = 2 cos(π/3) = 2 · (1/2) = 1
    y = r sin θ = 2 sin(π/3) = 2 · (√3/2) = √3

Thus the point in Cartesian coordinates is (1, √3).
Example 1.2. Find the polar coordinates of the point P(1, −1), given in Cartesian coordinates. We have

    r = √(x² + y²) = √(1 + 1) = √2
    tan θ = y/x = −1,

and since P lies in the fourth quadrant, θ = −π/4. Thus, in polar coordinates, P = (√2, −π/4).
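Both conversions are easy to automate. The following Python sketch mirrors Examples 1.1 and 1.2; it is the editor's illustration, and the function names are assumptions.

```python
import math

def polar_to_cartesian(r, theta):
    """(r, theta) -> (x, y) via x = r cos(theta), y = r sin(theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def cartesian_to_polar(x, y):
    """(x, y) -> (r, theta) with r >= 0 and theta in (-pi, pi]."""
    return (math.hypot(x, y), math.atan2(y, x))

# Example 1.1: (r, theta) = (2, pi/3) gives (1, sqrt(3)).
x, y = polar_to_cartesian(2, math.pi / 3)
assert abs(x - 1) < 1e-12 and abs(y - math.sqrt(3)) < 1e-12

# Example 1.2: P(1, -1) has r = sqrt(2) and theta = -pi/4.
r, t = cartesian_to_polar(1, -1)
assert abs(r - math.sqrt(2)) < 1e-12 and abs(t + math.pi / 4) < 1e-12
```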
Exercise 1.1. Prove that the multiplication in C is simply the addition and multiplication of R when restricted to R.
For the complex number z = a + bi we call the real number a the real part of z and the real number b the imaginary part of z. We denote them as follows:

    ℜ(z) = a,   ℑ(z) = b.

The map

    C → C
    z → z̄                                                        (1.3)

is surjective and is called the complex conjugation map. If z ∈ C, then z is real if and only if z = z̄. The absolute value or modulus of z = a + bi, denoted by |z|, is defined to be

    |z| = √(z z̄) = √(a² + b²).
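These definitions can be checked directly with Python's built-in complex numbers; this is the editor's illustration, not part of the text.

```python
import math

z = 3 + 4j

# Real and imaginary parts, and the conjugate z-bar = a - bi:
assert z.real == 3 and z.imag == 4
assert z.conjugate() == 3 - 4j

# |z| = sqrt(z * z-bar) = sqrt(a^2 + b^2); z * z.conjugate() is real.
modulus = math.sqrt((z * z.conjugate()).real)
assert modulus == abs(z) == 5.0
```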
z = ρ (cos θ + i sin θ) .
This is called the polar representation of z. We can use this polar representation to get a geometric interpretation of the multiplication of complex numbers. Let z = r1 (cos α + i sin α) and w = r2 (cos β + i sin β) be any two complex numbers. Then,

    z · w = r1 r2 (cos(α + β) + i sin(α + β))
    z / w = (r1 / r2) (cos(α − β) + i sin(α − β))                  (1.4)
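Formulas (1.4) say that under multiplication the moduli multiply and the arguments add. A quick numerical check, offered as the editor's illustration and not part of the text:

```python
import cmath
import math

z = cmath.rect(2.0, math.pi / 6)   # r1 = 2, alpha = pi/6
w = cmath.rect(3.0, math.pi / 4)   # r2 = 3, beta  = pi/4

# z*w has modulus r1*r2 and argument alpha + beta:
assert abs(abs(z * w) - 6.0) < 1e-12
assert abs(cmath.phase(z * w) - (math.pi / 6 + math.pi / 4)) < 1e-12

# z/w has modulus r1/r2 and argument alpha - beta:
assert abs(abs(z / w) - 2.0 / 3.0) < 1e-12
assert abs(cmath.phase(z / w) - (math.pi / 6 - math.pi / 4)) < 1e-12
```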
Exercises:

1.4. Let z = r(cos α + i sin α). Show that z^n = r^n (cos nα + i sin nα) for any integer n ≥ 1.

1.5. Prove that for all u, v ∈ C, |u · v| = |u| · |v|.

1.6. Compute the modulus of the following complex numbers:

    ( (√2/2) i − √2/2 )^10,   ( 1/2 + i (√3/2) )^12

1.9. Solve the following equation: z^n − 1 = 0.

1.10. Let z = 1/2 + i (√3/2). Compute z², z³, z⁴, z⁵, z⁷, z⁸, z⁹.

1.11. Let z = r1(cos α + i sin α) and w = r2(cos β + i sin β). Prove the following:

    z · w = r1 r2 (cos(α + β) + i sin(α + β)).

1.12. Express the numbers 3 e^(−4πi/6), −3 e^(2πi/n) in standard form.

1.13. Solve the following: (z + 3/z²)(3z² − 2z + 5) = 0.

1.14. Factor completely the polynomial p(z) = z⁷ − 1.

1.15. Factor over Q the polynomial p(z) = z⁵ − 1.

1.16. Does the equation z⁴ + z³ + z² + z + 1 = 0 have any rational solutions?

1.17. Can f(x) = (x^n − 1)/(x − 1) be expressed as a polynomial? What is that expression?

1.18. Compute the modulus of the following complex numbers:

    (i + 1)^10,   ( 1/2 + i (√3/2) )^12

1.19. Prove that for any rational number r ∈ Q,

    (cos θ + i sin θ)^r = cos(rθ) + i sin(rθ).

1.20. Let z = 1/2 + i (√3/2). Compute z², z³, z⁴, z⁵, z⁷, z⁸, z⁹.

1.21. Express the numbers 3 e^(−4πi/6), −3 e^(2πi/n) in standard form.

1.22. Solve the following: (z + 3/z²)(3z² − 2z + 5) = 0.

1.23. Factor completely the polynomial p(z) = z⁷ − 1.

1.24. Factor over Q the polynomial p(z) = z⁵ − 1.

1.25. A function f(z) is given by

    f : C → C,   z → (az + b)/(cz + d),

where ad − bc = 1. Given that

    f(1) = −1,   f(i) = i,   f(2) = −1/2,

a) Find f(z) and then f(2), f(2i), f(1/2).
b) Let C = {z ∈ C s.t. |z| = 1, Re(z) ≥ 0} (half of the unit circle). Find f(C).

1.26. A function f(z) is given by

    f : C → C,   z → (az + b)/(cz + d),

where ad − bc = 1. Given that

    f(i) = −i,   f(3) = 1/3,   f(−1) = −1,

a) Find f(z) and then f(2), f(2i), f(1/2).
b) Let C = {z ∈ C s.t. |z| = 2} be the circle with center at the origin and radius 2. Find f(C).

1.27. Let ε5 denote a primitive 5-th root of unity. Find the Mobius transformation f(x) such that

    f(0) = 0,   f(1) = ε5,   f(ε5) = ε5².

Prove your answer. Find f(ε5²), f(ε5³), f(ε5⁴).

1.28. Prove De Moivre's formula: for any integer n ≥ 1,

    (cos θ + i sin θ)^n = cos(nθ) + i sin(nθ).
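Exercises 1.9 and 1.28 can also be explored computationally. The sketch below is the editor's illustration, not part of the text; the helper name roots_of_unity is an assumption.

```python
import cmath
import math

def roots_of_unity(n):
    """The n solutions of z^n - 1 = 0: e^(2*pi*i*k/n) for k = 0..n-1."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

# Each root really satisfies z^n = 1 (Exercise 1.9):
for n in (2, 3, 5, 7):
    for z in roots_of_unity(n):
        assert abs(z**n - 1) < 1e-9

# De Moivre's formula (Exercise 1.28): (cos t + i sin t)^n = cos(nt) + i sin(nt)
t, n = 0.7, 5
lhs = complex(math.cos(t), math.sin(t)) ** n
rhs = complex(math.cos(n * t), math.sin(n * t))
assert abs(lhs - rhs) < 1e-9
```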
f (x, y) = 1 + x + y + xy,
1.3.1 Lines
Let’s start with degree one equations. Then we have the following:
Lemma 1.4. i) The graph of every algebraic equation of degree one is a line.
ii) Every non-vertical line has equation

    y = mx + b,

where m is the slope of the line.

The proof is done in high school algebra. More generally, we say that the equation of a line is given by

    ax + by = c.
Consider now the problem of finding the intersection of two lines L1 and L2 with equations

    L1 : a1 x + b1 y = c1
    L2 : a2 x + b2 y = c2

Hence we are looking for points which are in both L1 and L2, or in other words which satisfy the equations of both L1 and L2. The set of such points P(x, y) we denote by

    { a1 x + b1 y = c1
    { a2 x + b2 y = c2

and call a system of two equations in two unknowns. We will see later that systems of equations are always intersection sets of several geometric objects.
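Such a 2 × 2 system can be solved directly with the determinant a1 b2 − a2 b1. The sketch below is the editor's illustration, not part of the text, and the function name is an assumption.

```python
def intersect_lines(a1, b1, c1, a2, b2, c2):
    """Intersection point of a1 x + b1 y = c1 and a2 x + b2 y = c2.

    Returns None when the lines are parallel (or identical), i.e.
    when the determinant a1*b2 - a2*b1 vanishes.
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return (x, y)

# x + y = 3 and x - y = 1 meet at (2, 1):
assert intersect_lines(1, 1, 3, 1, -1, 1) == (2.0, 1.0)

# Parallel lines have no intersection point:
assert intersect_lines(1, 1, 3, 2, 2, 7) is None
```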
1.3.2 Circle and Ellipse

A circle with center (a, b) and radius r is the set of points P(x, y) at distance r from the center, so its equation is

    (x − a)² + (y − b)² = r².

Thus, we have:
Lemma 1.5. Every circle on the plane is given by an algebraic equation of degree 2.
Naturally, we would like to know whether every degree 2 algebraic equation represents a circle. The answer is
negative as we will see below.
Corollary 1.2. A circle with center at the origin and radius r has equation
x2 + y2 = r2 .
Let P1(a, b) and P2(c, d) be two points in the plane. The set of points P(x, y) of the plane such that the sum of distances PP1 + PP2 is constant, say

    PP1 + PP2 = 2s,

is called an ellipse. The points P1 and P2 are called the foci of the ellipse. The equation of the ellipse is

    √((x − a)² + (y − b)²) + √((x − c)² + (y − d)²) = 2s.
Lemma 1.6. Every ellipse on the plane can be written as a degree 2 algebraic equation.
Thus, a degree 2 polynomial equation can give us a circle or an ellipse and who knows what else?
Corollary 1.3. An ellipse with foci (c, 0) and (−c, 0) has equation

    x²/a² + y²/b² = 1,

where b² + c² = a².
Hence, a degree two equation can give different geometric shapes. Or are the circle and the ellipse really that different? For example, are the circle

    x² + y² = 1

and the ellipse

    x² + y²/4 = 1

really different? After all, the above ellipse can be transformed into the unit circle by shrinking the y-axis by a factor of two. In other words, if we make the substitution Y = y/2 in the equation of the ellipse, then we get

    x² + Y² = 1,
Ellipse
First let’s study the ellipse in more detail.
Definition 1.2. An ellipse is the set of all points of the plane the sum of whose distances from two fixed points F1 and F2 is constant. These two points are called the foci of the ellipse. The midpoint of the segment F1F2 is called the center of the ellipse.
Kepler's first law says that the planets orbit the Sun in trajectories that are ellipses, with the Sun at one of the foci of the ellipse.
To simplify the algebra we can assume that the foci of the ellipse are on the x-axis and that the origin is at the center of the ellipse. Say the foci have coordinates (−c, 0) and (c, 0). Denote the sum of the distances of a point P(x, y) from the foci by 2a > 0.
Then P(x, y) is on the ellipse when

    |PF1| + |PF2| = 2a.

Hence,

    √((x + c)² + y²) + √((x − c)² + y²) = 2a

or

    √((x − c)² + y²) = 2a − √((x + c)² + y²).
The ellipse

    x²/a² + y²/b² = 1,   a ≥ b > 0,

has foci (±c, 0), where c² = a² − b², and vertices (±a, 0). Let's see some examples.
Example 1.5. For the ellipse x²/6 + y²/4 = 1, is the major axis horizontal or vertical?

Solution: Since a² > b² and 6 > 4, a² is the denominator of x². Hence, the major axis is horizontal.
Example 1.6. Find a, b, c for the ellipse x²/25 + y²/16 = 1.

Solution: From the equation we have a = 5, b = 4. Thus, c = √(a² − b²) = √(25 − 16) = √9 = 3.
Example 1.7. Find the equation of the ellipse with foci (0, ±2) and vertices (0, ±3).

Solution: Using Eq. (1.5), we have c = 2 and a = 3. Then, b² = a² − c² = 5. Thus, the equation of the ellipse is

    x²/5 + y²/9 = 1.
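The defining property of the ellipse can be verified numerically: for every point on the curve, the sum of the distances to the foci equals 2a. The sketch below, the editor's illustration and not part of the text, checks this for the ellipse of Example 1.6.

```python
import math

# For x^2/a^2 + y^2/b^2 = 1 with a >= b, the foci are (+-c, 0) with
# c^2 = a^2 - b^2, and |PF1| + |PF2| = 2a on the whole curve.
a, b = 5.0, 4.0                       # the ellipse of Example 1.6
c = math.sqrt(a**2 - b**2)
assert c == 3.0

# Sample points P = (a cos t, b sin t) lie on the ellipse:
for t in (0.0, 0.5, 1.3, 2.0, 3.1):
    x, y = a * math.cos(t), b * math.sin(t)
    d = math.hypot(x - c, y) + math.hypot(x + c, y)
    assert abs(d - 2 * a) < 1e-9
```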
1.3.3 Conics
Consider the following problem. We are given the equation z² = x² + y² in space. As we will see later, the graph of this equation is the set of points in R³ which satisfy it. Since for every value of z the cross-section is a circle of radius |z|, this graph is what we call a double cone; see Fig. 1.7.
Exercise 1.2. Let the double cone be given as above and let

    ax + by + cz = d

be the equation of a plane in R³. Find the intersection of the double cone and the plane.
Notice that in R³ every polynomial equation of degree one represents a plane. To solve the problem we need to solve the system

    { z² = x² + y²
    { ax + by + cz = d

Geometrically the set of solutions is shown in Fig. 1.7. Next, we will try to determine this solution set algebraically.
Parabola

Next we see another shape of graph coming from a degree two algebraic equation which is neither an ellipse nor a circle.

Definition 1.3. A parabola is the set of all points of the plane which are equidistant from a fixed line L and a fixed point F not on the line. The line L is called the directrix of the parabola and the point F the focus.

Notice that the midpoint of the perpendicular from the focus to the directrix belongs to the parabola. This point is called the vertex of the parabola. The perpendicular line from the focus F to the directrix L is called the axis of the parabola.
To obtain a simple equation of the parabola we put its vertex at the origin O and its directrix parallel to the x-axis. If the focus of the parabola is the point (0, p), then the directrix has equation y = −p. If P(x, y) is a point on the parabola, then the distance from P to the focus is

    |PF| = √(x² + (y − p)²)

and the distance of P from the directrix is |y + p|. From the definition of the parabola these distances are equal, hence

    √(x² + (y − p)²) = |y + p|.

Squaring both sides and simplifying, we obtain

    x² = 4py.                                                     (1.6)
If we substitute a = 1/(4p), then the standard equation of the parabola, Eq. (1.6), takes the form y = ax². The parabola opens upward if p > 0 and downward if p < 0. The graph is symmetric with respect to the y-axis, since Eq. (1.6) does not change when we replace x with −x.

If we interchange x and y in Eq. (1.6) we have

    y² = 4px,                                                     (1.7)

which gives the equation of the parabola with focus (p, 0) and directrix x = −p. The parabola opens to the right if p > 0 and to the left if p < 0. In both cases the graph is symmetric with respect to the x-axis, which is the axis of the parabola.
If the parabola has its vertex at (h, k) and a vertical axis, then the standard form is

    (x − h)² = 4p(y − k).                                         (1.8)

Example 1.8. Find the focus, directrix, vertex, and axis of the parabola x² − 4y = 0.

Solution: We write the equation as x² = 4y and, comparing it with Eq. (1.6), we have 4p = 4. Thus, p = 1. Hence, the focus is (0, p) = (0, 1), the directrix is y = −p = −1, the vertex is (0, 0), and the axis is x = 0.
Example 1.9. Find the focus, directrix, vertex, and axis of the parabola y² + 10x = 0.

Solution: We have y² = −10x and, comparing it with Eq. (1.7), we have 4p = −10. Thus, p = −5/2. Hence, the focus is (p, 0) = (−5/2, 0), the directrix is x = −p = 5/2, the vertex is (0, 0), and the axis is y = 0.
Example 1.10. Find the focus, directrix, vertex, and axis of the parabola y² + 4x − 2y − 3 = 0.

Solution: Completing the square in y gives y² − 2y + 1 = −4x + 4, that is,

    (y − 1)² = −4(x − 1).

Comparing with the form (y − k)² = 4p(x − h), we have 4p = −4, so p = −1. Thus, the vertex is (h, k) = (1, 1), the focus is (h + p, k) = (0, 1), the directrix is x = h − p = 2, and the axis is y = 1.

Example 1.11. Find the focus, directrix, vertex, and axis of the parabola x² − 4x − 8y + 28 = 0.

Solution: Let us see if we can transform the equation into the standard form:

    x² − 4x − 8y + 28 = (x − 2)² − 8(y − 3) = 0
    (x − 2)² = 8(y − 3)

The equation resembles Eq. (1.8) with 4p = 8, so p = 2. Thus, the vertex is (h, k) = (2, 3), the focus is (h, k + p) = (2, 5), the directrix is y = k − p = 3 − 2 = 1, and the axis is x = h = 2.
Example 1.12. Find the focus, directrix, vertex, and axis of the parabola x² − 10x − 2y + 29 = 0.

Solution: Let us see if we can transform the equation into the standard form:

    x² − 10x − 2y + 29 = (x − 5)² − 2(y − 2) = 0
    (x − 5)² = 2(y − 2)

The equation resembles Eq. (1.8) with 4p = 2, so p = 1/2. Thus, the vertex is (h, k) = (5, 2), the focus is (h, k + p) = (5, 2.5), the directrix is y = k − p = 2 − 0.5 = 1.5, and the axis is the line with equation x = h = 5.
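The completing-the-square recipe used in these examples can be packaged for all parabolas with a vertical axis. The sketch below is the editor's illustration, not part of the text; the function name is an assumption.

```python
def vertical_parabola_data(A, B, C):
    """For x^2 + A*x + B*y + C = 0 with B != 0, complete the square to
    (x - h)^2 = 4p(y - k) and return the vertex, focus, and directrix.

    Completing the square: (x + A/2)^2 = -B*(y + (C - A^2/4)/B),
    so h = -A/2, 4p = -B, and k = (A^2/4 - C)/B.
    """
    h = -A / 2
    p = -B / 4
    k = (A * A / 4 - C) / B
    return {"vertex": (h, k), "focus": (h, k + p), "directrix_y": k - p}

# Example 1.12: x^2 - 10x - 2y + 29 = 0
data = vertical_parabola_data(-10, -2, 29)
assert data["vertex"] == (5.0, 2.0)
assert data["focus"] == (5.0, 2.5)
assert data["directrix_y"] == 1.5
```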
We leave the reader with the following problem.
Exercise 1.3. Given a degree 2 equation, how can we determine if this equation represents a parabola? For example, can you
prove that the graph of the equation
4x2 + 12xy + 9y2 − 4x + 4y = 0
is a parabola?
Hyperbola

Next we see another shape also obtained from a degree two equation.

Definition 1.4. A hyperbola is the set of all points P(x, y) of the plane the difference of whose distances from two fixed points F1 and F2 is constant. In other words, all points P(x, y) such that

    | |PF1| − |PF2| | = 2a

for some constant a > 0. The points F1 and F2 are called the foci of the hyperbola. The midpoint of the segment that connects the two foci is called the center of the hyperbola. The line that connects the two foci is called the axis of the hyperbola.
Notice that the definition of the hyperbola is similar to the definition of the ellipse; the only difference is that the sum of the distances in the definition of the ellipse is replaced by the difference of the distances. Determining the equation of the hyperbola is similar to that of the ellipse.

(Figure 1.9: Hyperbola)
If the foci are on the x-axis, precisely at (±c, 0), and the difference of the distances is 2a, then the hyperbola

    x²/a² − y²/b² = 1                                             (1.10)

has foci (±c, 0), where c² = a² + b², a horizontal axis, vertices (±a, 0), and asymptotes y = ±(b/a) x.
If the foci are on the y-axis, then interchanging x and y in Eq. (1.10) we have that the hyperbola

    y²/a² − x²/b² = 1                                             (1.11)

has foci (0, ±c), where c² = a² + b², a vertical axis, vertices (0, ±a), and asymptotes y = ±(a/b) x.
b
The standard equation of the hyperbola with center in (h, k) and horizontal axis is
(x − h)2 (y − k)2
− =1 (1.12)
a2 b2
The distance between two vertices is 2a. Vertices are (h ± a, k), the distance between to foci is 2c. The foci are (h ± c, k),
where c2 = a2 + b2 . The equations of the asymptotes are
b
y = k ± (x − h)
a
The standard equation of the hyperbola with center (h, k) and vertical axis is

    (y − k)²/a² − (x − h)²/b² = 1                                 (1.13)

The distance between the vertices is 2a. The vertices are (h, k ± a), and the distance between the foci is 2c. The foci are (h, k ± c), where c² = a² + b². The equations of the asymptotes are

    y = k ± (a/b)(x − h).
Let us illustrate with a few examples.
Example 1.13. If the equation of the hyperbola is

    y²/4 − x²/36 = 1,

then the axis of the hyperbola is vertical, a = 2, b = 6, and each focus is at distance c = √(a² + b²) = 2√10 from the center of the hyperbola. The foci are (0, −2√10) and (0, 2√10). The asymptotes have equations y = x/3 and y = −x/3, since a/b = 1/3.
Example 1.14. Suppose that the center of the hyperbola is at (2, −1). Then, the equation of the hyperbola is

    (x − 2)²/9 − (y + 1)²/16 = 1,

from which we have a = 3 and b = 4. The slopes of the asymptotes are ±b/a = ±4/3. The axis of the hyperbola is horizontal. Since c = √(a² + b²) = 5, the foci are (7, −1) and (−3, −1). The vertices of the hyperbola are (−1, −1) and (5, −1).
Example 1.15. Find the equation of the hyperbola with center at (2, −1) when a = 3, b = 4, and the axis is horizontal.

Solution: The equation is

    (x − 2)²/3² − (y + 1)²/4² = 1

and c = √(a² + b²) = 5. The distance between the two vertices is 2a = 6. The vertices are (h − a, k) = (−1, −1) and (h + a, k) = (5, −1). The distance between the two foci is 2c = 10. The foci are (h − c, k) = (−3, −1) and (h + c, k) = (7, −1). The equations of the asymptotes are

    y = k − (b/a)(x − h) = −1 − (4/3)(x − 2)   and   y = k + (b/a)(x − h) = −1 + (4/3)(x − 2).
Example 1.16. Find the foci and the equation of the hyperbola with vertices (0, ±1) and asymptote y = 2x.

Solution: From Eq. (1.11) we have a = 1 and a/b = 2. Hence, b = a/2 = 1/2 and c² = a² + b² = 5/4. Then the foci are (0, ±√5/2) and the equation of the hyperbola is y² − 4x² = 1.
Example 1.17. Determine the conic and find its foci:

    4y² − 9x² − 8y + 72x − 176 = 0.

Solution: We first complete the squares to get the equation in standard form. This gives

    4(y − 1)² − 9(x − 4)² = 36,

or in other words

    (y − 1)²/9 − (x − 4)²/4 = 1.

Thus the equation is of the form of Eq. (1.13). Therefore, a² = 9, b² = 4, c² = 13. The foci are (4, 1 + √13) and (4, 1 − √13), while the vertices are (4, 4) and (4, −2). The equations of the asymptotes are y = 1 ± (3/2)(x − 4).
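The defining property of the hyperbola (a constant difference of distances to the foci) can be checked numerically for Example 1.17. This sketch is the editor's illustration, not part of the text.

```python
import math

# Example 1.17: (y - 1)^2/9 - (x - 4)^2/4 = 1 has a = 3, b = 2,
# c = sqrt(13), and foci (4, 1 +- sqrt(13)). On the curve,
# | |PF1| - |PF2| | = 2a.
a, b = 3.0, 2.0
c = math.sqrt(a * a + b * b)
f1, f2 = (4.0, 1.0 + c), (4.0, 1.0 - c)

# Points (4 + b sinh t, 1 + a cosh t) parametrize the upper branch:
for t in (-1.5, -0.3, 0.0, 0.8, 2.0):
    x, y = 4.0 + b * math.sinh(t), 1.0 + a * math.cosh(t)
    d1 = math.hypot(x - f1[0], y - f1[1])
    d2 = math.hypot(x - f2[0], y - f2[1])
    assert abs(abs(d1 - d2) - 2 * a) < 1e-9
```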
Exercises:

Find the vertices, foci, and asymptotes of the hyperbola and sketch its graph.

1.46. x²/36 − y²/81 = 1
1.47. y²/4 − x²/36 = 1
1.48. 75x² − 27y² = 675
1.49. x² − 4y² + 10x − 48y − 135 = 0
1.50. (y + 3)²/25 − (x + 7)²/64 = 1
1.51. 5x²/4 − 5y² = 80

Identify the conic and find its vertices and foci.

1.52. 4x² + y² + 24x − 6y + 9 = 0
1.53. y² − 2y + 40x + 281 = 0
1.54. 4x² + y² + 32x + 6y + 57 = 0
1.55. x² + y² − 6x + 4y + 12 = 0
1.56. y² − x² + 2y − 14x − 57 = 0
1.57. 4x² − y² − 56x − 4y + 176 = 0
1.58. x² = y² + 1
1.59. y² + 2y = 4x² + 3
1.60. 4x² + 4x + y² = 0
1.61. y² − 8y = 6x − 16
1.62. (1/12) x² + 62 = y + 62
1.4 Conics with mixed terms

Can it be written in the form

    X²/A² + Y²/B² = 1

for some real numbers A and B? It can easily be verified that Equation 1.14 can be written as

    9(x − y)² + 4(x − 5)² = 36.

(Figure 1.10: The graph of Equation 1.14)

Thus, by letting

    X = x − 5   and   Y = x − y

we get

    X²/3² + Y²/2² = 1                                             (1.15)

which definitely seems nicer than Equation 1.14. This "new" ellipse has axes of length 6 and 4, and they seem, at least visually, to be close to those of the original ellipse.
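The substitution can be sanity-checked numerically: points satisfying Eq. (1.15) should map back, through X = x − 5 and Y = x − y, to points satisfying 9(x − y)² + 4(x − 5)² = 36. This sketch is the editor's illustration, not part of the text.

```python
import math

# Parametrize X^2/9 + Y^2/4 = 1 by (3 cos t, 2 sin t), invert the
# substitution X = x - 5, Y = x - y, and check the original equation.
for t in (0.0, 0.7, 1.9, 3.0, 5.5):
    X, Y = 3 * math.cos(t), 2 * math.sin(t)
    x = X + 5          # from X = x - 5
    y = x - Y          # from Y = x - y
    assert abs(9 * (x - y) ** 2 + 4 * (x - 5) ** 2 - 36) < 1e-9
```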
Question 1.1. Can we find a methodological approach to solve this problem or similar problems like this one? Stated a bit differently, given a degree 2 equation, can we determine a methodological approach for making the right substitutions so that the equation is transformed into a nicer one?

Question 1.2. Can this be done for quadratic surfaces (degree 2 equations in space)? What about higher degree equations?

In the process, we will have to understand and answer three related questions. A student who learns to answer all three parts, even for just degree 2 equations, will have learned linear algebra. In the process we will learn a beautiful theory and about the people who developed it.
You should prove that this is indeed an equivalence relation. A vector is then defined as an equivalence class of the above relation. Denote the set of all such equivalence classes by S/∼; this is the set of all vectors. Moreover, the above three conditions are geometrically equivalent to moving the vector u parallel to itself onto v. So we can assume that all vectors of S/∼ start at the origin O of the space.

Thus, there is a one-to-one correspondence between the elements of S/∼ and the points of the space, namely

    u = OP  ←→  P = (x, y) or P = (x, y, z).

Hence, a vector u = OP is an ordered tuple (u1, . . . , un) for n = 2, 3 and will be denoted by the column

    u = [u1, . . . , un]ᵀ,

in order to distinguish it from the point P(u1, . . . , un). In the next section we will generalize this concept to Rⁿ.
    u + v := [u1 + v1, u2 + v2]ᵀ,   and   r · u := [ru1, ru2]ᵀ,   (1.16)
(Figure 1.11: Scalar multiplication. Figure 1.12: Vector addition.)
where r ∈ R.
Geometrically, scalar multiplication r u is described in Fig. 1.11: r u is a new vector with the same direction as u (the opposite direction if r < 0) and length |r| times the length of u. The addition of two vectors u and v is described geometrically in Fig. 1.12.
Example 1.18. Prove that these geometric descriptions agree with the addition and scalar multiplication defined in Eq. (1.16).
Similarly, a vector in R³ is written as a column

    v := [v1, v2, v3]ᵀ,

where v1, v2, v3 ∈ R are called the coordinates of v. Notice that we will denote a point by an ordered triple P(x, y, z) and will always distinguish it from the vector OP with coordinates x, y, z.

(Figure 1.13: Coordinates of P(x, y, z).)
For any two vectors u = [u1, u2, u3]ᵀ and v = [v1, v2, v3]ᵀ we define addition and scalar multiplication as in R², namely

    u + v := [u1 + v1, u2 + v2, u3 + v3]ᵀ,   r · u := [ru1, ru2, ru3]ᵀ,

where r ∈ R.
Since any two generic lines determine a plane, the geometric interpretation of addition and scalar multiplication in R² is still valid in R³.
Let two points P1(x1, y1, z1) and P2(x2, y2, z2) in R³ be given. We will show that the distance |P1P2| between the two points is

    |P1P2| = √((x2 − x1)² + (y2 − y1)² + (z2 − z1)²).

To verify this formula we construct a parallelepiped in which the points P1 and P2 are opposite vertices, as in Fig. 1.14. If A(x2, y1, z1) and B(x2, y2, z1) are the other vertices as in Fig. 1.14, then

    |P1A| = |x2 − x1|,   |AB| = |y2 − y1|,   |BP2| = |z2 − z1|.

Since the triangles △P1AB and △P1BP2 are right triangles, from the Pythagorean theorem we have

    |P1B|² = |P1A|² + |AB|²
    |P1P2|² = |P1B|² + |BP2|²

Combining the two equations, we obtain the distance formula above.
The equation of the sphere with center at the point with coordinates (x0 , y0 , z0 ) and radius r is
(x − x0 )2 + (y − y0 )2 + (z − z0 )2 = r2
To prove this we just need the definition of the sphere, which is the set of all points P(x, y, z) at the fixed distance r from the fixed point Q(x0, y0, z0). Thus |PQ| = r. Squaring both sides we have |PQ|² = r², or

(x − x0)² + (y − y0)² + (z − z0)² = r².

In particular, the sphere of radius r centered at the origin has equation x² + y² + z² = r². However, not every equation of a sphere is presented in this standard form. Consider the following example:
Figure: (a) a sphere of radius r with center (0, 0, 0); (b) a sphere of radius r with center (x0, y0, z0).
Example 1.19. Prove that the following equation represents a sphere and find its radius and its center:

(x − 1)² + (y + 2)² + z² = 21/4.

The equation is in the standard form above; thus it represents a sphere with center (1, −2, 0) and radius √(21/4) = √21 / 2.
In Chapter ?? we will show how to use methods of linear algebra to do this systematically for every quadratic surface.
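A sphere equation given in expanded form can be brought into the standard form above by completing the square. The sketch below does this for Exercise 1.72 (x² + y² + z² = 4x − 2y); the coefficient names D, E, F, G are our own:

```python
# Completing the square for x^2 + y^2 + z^2 + Dx + Ey + Fz = G,
# here D = -4, E = 2, F = 0, G = 0 (Exercise 1.72 rewritten).
D, E, F, G = -4.0, 2.0, 0.0, 0.0

# (x + D/2)^2 + (y + E/2)^2 + (z + F/2)^2 = G + (D/2)^2 + (E/2)^2 + (F/2)^2
center = (-D / 2, -E / 2, -F / 2)
r_squared = G + (D / 2) ** 2 + (E / 2) ** 2 + (F / 2) ** 2

print(center, r_squared)  # center (2, -1, 0) and r^2 = 5, so radius sqrt(5)
```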
These operations satisfy the following properties, for all vectors u, v, w and all scalars c, d ∈ R:

1. u + v = v + u
2. u + (v + w) = (u + v) + w
3. u + 0 = u
4. u + (−u) = 0
5. c(u + v) = cu + cv
6. (c + d)u = cu + du
7. (cd)u = c(du)
8. 1u = u
The proof is left as an exercise to the reader.
Let’s denote by V3 the set of all vectors in the 3-dimensional space R3. Three vectors which play a special role in V3 are

i = (1, 0, 0)ᵗ, j = (0, 1, 0)ᵗ, k = (0, 0, 1)ᵗ.

These vectors are called the vectors of the standard basis. We will explain this terminology in more detail in the coming sections.
If u = (a, b, c)ᵗ is a nonzero vector, then the unit vector with the same direction as u is

(1/‖u‖) u = u / ‖u‖.
In the next section we will formalize such definitions to the case of Rn . The reader should make sure to fully
understand the concepts from R2 and R3 before proceeding to Rn .
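As a small check of the normalization formula, the following sketch (NumPy; the vector is our own choice) computes u/‖u‖ and confirms it has length 1:

```python
import numpy as np

u = np.array([3.0, 0.0, 4.0])          # an example vector in R^3
norm_u = np.sqrt(np.sum(u * u))        # ||u|| = sqrt(9 + 0 + 16) = 5
unit = u / norm_u                      # u / ||u|| has the same direction, length 1

print(unit)  # [0.6 0.  0.8]
```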
Exercises:
1.63. Find the lengths of the sides of the triangle with vertices A(3, −2, 1), B(1, 2, −3), C(3, 4, −2). Determine if this triangle is regular.

1.64. Find the distance of the point (−5, 3, 4) from each coordinate plane.

1.65. Find the magnitude of the force which has projections on the coordinate axes x = −6, y = −2, and z = 9.

1.66. Prove that the triangle with vertices A(1, −2, 1), B(3, −3, 1) and C(4, 0, 3) is a right triangle.

1.67. Find the equation of the sphere with center at the point (4, −2, 3) and radius r = √3.

1.68. Find the equation of the sphere with center at the point (−1, 3, 2) and radius r = √3.

1.69. Find the equation of the sphere with center at the point (2, 3, 4) and radius 5. Where does the sphere intersect the coordinate planes?

1.70. Find the equation of the sphere which passes through the point (4, 3, −1) and has its center at (3, 8, 1).

Prove that the following equations represent a sphere; find its center and its radius.

1.71. x² + y² + z² − 6x + 4y + 2z = −17

1.72. x² + y² + z² = 4x − 2y

1.73. x² + y² + z² = x + y + z

1.74. x² + y² + z² + 2x + 8y − 4z = 28

1.75. 16x² + 16y² + 16z² − 96x + 32y = 5

1.76. (a) Prove that the midpoint of the segment which is determined by the points A(a1, b1, c1) and B(a2, b2, c2) is the point with coordinates

( (a1 + a2)/2, (b1 + b2)/2, (c1 + c2)/2 ).

(b) Find the lengths of the three medians of the triangle with vertices A(4, 1, 5), B(1, 2, 3), C(−2, 0, 5).
Determine the inequalities which determine the following regions.

1.77. The region between the xy-plane and the plane z = 5.

1.78. The region which consists of all points between the spheres of radii r and R with center at the origin, where r < R.

1.79. Find the equation of the sphere which has the same center as x² + y² + z² − 6x + 4z − 36 = 0 and passes through the point (2, 5, −7).

1.80. Prove that the set of all points whose distance from A(−1, 5, 3) is twice the distance from B(6, 2, −2) is a sphere.

1.81. Determine an equation for the set of points equidistant from A(−1, 5, 3) and B(6, 2, −2).

1.82. Draw the vector AB when A and B are given as below, and find its equivalent with the initial point at the origin.

i) A = (0, 3, 1), B = (2, 3, −1)
ii) A = (4, 0, −2), B = (4, 2, 1)
iii) A = (2, 0, 3), B = (3, 4, 5)
iv) A = (0, 3, −2), B = (2, 4, −1)

1.83. Find the sum of the vectors and illustrate geometrically.

i) (1, −4)ᵗ, (4, 3)ᵗ
ii) (2, −3, 5)ᵗ, (1, 0, 2)ᵗ
iii) (2, 1)ᵗ, (3, −1)ᵗ
iv) (3, 1, 2)ᵗ, (1, 2, 1)ᵗ

1.84. Find a + b, 2a − 3b, ‖a‖ and |a − b|, if a = (5, −12)ᵗ and b = (3, 6)ᵗ.

1.85. Find a − b, a + 2b, ‖a‖ and |a − b|, if a = (1, 2, −3)ᵗ and b = (−2, −1, 5)ᵗ.

1.86. Find a + b, 3a − 2b, ‖a‖ and |a − b|, if a = (2, −4, 4)ᵗ and b = (0, 2, −1)ᵗ.

1.87. Find v + w, v − w, ‖v‖ and |v − w|, |v + w|, and −2v, if v = (1, 3)ᵗ and w = (−1, −5)ᵗ.

1.88. Find v + w, v − w, ‖v‖ and |v − w|, |v + w|, and −2v, if v = (1, 2, 3)ᵗ and w = (−1, 2, −3)ᵗ.

1.89. Find v + w, v − w, ‖v‖ and |v − w|, |v + w|, and −2v, if v = (1, 0, 1)ᵗ and w = (−1, −2, 2)ᵗ.

1.90. Find the unit vector which has the same direction as the vector −3i + 7j.

1.91. Find the unit vector which has the same direction as the vector 2i − j + 3k.

1.92. Find the unit vector which has the same direction as the vector (2, 3, −2)ᵗ.

1.93. Find a vector which has the same direction as the vector (3, 2, 1)ᵗ, but has length 3.

1.94. Find a vector which has the same direction as the vector (−2, 4, 1)ᵗ, but has length 6.

1.95. Let be given the vectors v = (3, 5, −2)ᵗ and w = (−1, 1, 1)ᵗ.

1. Find the vector u such that u + v + w = i.
2. Find the vector u such that u + v + w = 2j + k.

1.96. If A, B, C are vertices of a triangle, find AB + BC + CA.

1.97. Draw the vectors a = (3, 2)ᵗ, b = (2, −1)ᵗ and c = (7, 1)ᵗ. Determine graphically if there exist scalars s and t such that c = sa + tb. Find the values for s and t.

1.98. Let x and y be two nonzero vectors in R², not parallel to each other. Prove that if z is any vector in R², then there exist two scalars s and t such that z = sx + ty.

1.99. Is the property from the previous problem true for R³? Explain.
1.100. Let u = (x, y, z)ᵗ and v = (x0, y0, z0)ᵗ be given vectors in R³. Describe the set of all points (x, y, z) which satisfy |u − v| = 1.

1.101. Let a = (x, y)ᵗ, a1 = (x1, y1)ᵗ, and a2 = (x2, y2)ᵗ. Describe the set of all points (x, y) which satisfy

|a − a1| + |a − a2| = k, where k > |a1 − a2|.
Chapter 2
Rn := {(x1 , . . . , xn ) | xi ∈ R}
A vector u in Rn is an ordered tuple (u1, . . . , un), ui ∈ R for i = 1, . . . , n, and will be denoted by the column u = (u1, . . . , un)ᵗ. For any u = (u1, . . . , un)ᵗ and v = (v1, . . . , vn)ᵗ in Rn we define the vector addition and scalar multiplication as follows:

u + v := (u1 + v1, . . . , un + vn)ᵗ, r · u := (ru1, . . . , run)ᵗ.

A Euclidean n-space is the set of vectors together with vector addition and scalar multiplication defined as above. Elements of Rn are called vectors and all r ∈ R are called scalars. The vector 0 = (0, . . . , 0)ᵗ is called the zero vector.
By a vector u we usually mean a column vector unless otherwise stated. The row vector [u1 , . . . , un ] is called the
transpose of u and denoted by
ut = [u1 , . . . , un ]
For the addition and scalar multiplication we have the following properties.
Theorem 2.1. Let u, v, w be vectors in Rn and r, s scalars in R. The following are satisfied:
1) (u + v) + w = u + (v + w),
2) u + v = v + u,
3) 0 + u = u + 0 = u,
4) u + (−u) = 0,
5) r (u + v) = ru + rv,
6) (r + s) u = r u + s u,
7) (rs) u = r (s u),
8) 1 u = u.
Proof. Exercise.
Two vectors v and u are called parallel if there exists an r ∈ R such that v = r u. Given vectors v1 , . . . , vs ∈ Rn and
r1 , . . . , rs ∈ R, the vector
r1 v1 + · · · + rs vs
is called a linear combination of vectors v1 , . . . , vs .
Definition 2.1. Let v1 , . . . , vs be vectors in Rn . The span of these vectors, denoted by Span (v1 , . . . , vs ), is the set in Rn of all
linear combinations of v1 , . . . , vs .
Span (v1, . . . , vs) = { r1 v1 + · · · + rs vs | ri ∈ R, i = 1, . . . , s }.
Definition 2.2. The vectors u1, . . . , un ∈ Rn are called linearly independent if

r1 u1 + · · · + rn un = 0

implies that

r1 = · · · = rn = 0;

otherwise, we say that u1, . . . , un are linearly dependent.
In the coming sections we will see that the concept of linear independence is one of the most important concepts of linear algebra. Our strategy will be to try to generalize all concepts of R2 and R3 to Rn. Of course the geometric interpretation in Rn doesn't make sense, but this will not deter us from assigning the same names to abstract concepts in Rn as we had for R2 and R3.
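Whether a given family of vectors in Rn is linearly independent can be tested numerically: the vectors are independent exactly when the matrix having them as columns has rank equal to the number of vectors. A sketch (NumPy; the example vectors are our own):

```python
import numpy as np

# Put the vectors to test as the columns of a matrix
M = np.column_stack([
    np.array([1.0, 0.0, 0.0]),   # v1
    np.array([0.0, 1.0, 0.0]),   # v2
    np.array([1.0, 1.0, 0.0]),   # v3 = v1 + v2, so the set is dependent
])

rank = np.linalg.matrix_rank(M)
independent = rank == M.shape[1]   # independent iff rank = number of vectors

print(rank, independent)
```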
Exercises:

2.1. Show that the formal definitions of the addition and scalar multiplication in R² agree with the geometric interpretations.

2.2. Let u, v, w be given as below:

v = (0, 5, −1)ᵗ, u = (1, 1, 7)ᵗ, w = (11, 3, 4)ᵗ.

Compute 2u + 3v − w.

2.3. Let v = (3, 2, −1)ᵗ and u = (1, 6, −6)ᵗ. Compute 2u + 3v.

2.4. Let v = (3, 5)ᵗ and u = (5, 6)ᵗ. Find scalars r, s such that

r v + s u = (5, 3)ᵗ.

2.5. What does it mean for two vectors u, v ∈ R² to be linearly dependent?

2.6. What is the span of (0, 1)ᵗ and (1, 0)ᵗ in R²?
2.7. Let u, v, and w be given vectors as below:

u = (1, 2, 0)ᵗ, v = (3, 4, 0)ᵗ, w = (1, 1, 1)ᵗ.

2.9. Use vectors to decide whether the triangle with vertices A = (1, −3, −2), B = (2, 0, −4), and C = (6, −2, −5) is right angled.

2.10. Let c be a positive real number and O1, O2 points in the xy-plane with coordinates (c, 0) and (−c, 0) respectively. Find an equation that describes all points P(x, y) of the xy-plane such that

||PO1|| + ||PO2|| = 2a.
Definition 2.3. Let u := (u1, . . . , un)ᵗ ∈ Rn. The norm of u, denoted by ‖u‖, is defined as

‖u‖ = √( u1² + · · · + un² ).

The dot product of two vectors u and v in Rn is defined as

u · v := u1 v1 + · · · + un vn,

and the identity

‖v‖² = v · v

is very useful.
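Both quantities are immediate to compute numerically (a NumPy sketch with example vectors of our own):

```python
import numpy as np

u = np.array([1.0, 2.0, 2.0])
v = np.array([2.0, -2.0, 1.0])

norm_u = np.sqrt(np.dot(u, u))   # ||u|| = sqrt(1 + 4 + 4) = 3
dot_uv = np.dot(u, v)            # 2 - 4 + 2 = 0

print(norm_u, dot_uv)  # u and v are perpendicular since u . v = 0
```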
Lemma 2.1. For any vectors u, v, w ∈ Rn and any r ∈ R the following hold:

i) u · v = v · u
ii) u · (v + w) = u · v + u · w
iii) r (u · v) = (r u) · v = u · (r v)

Proof. Use the definition of the dot product to check each part.
Two vectors u, v ∈ Rn are called perpendicular if u · v = 0.

Lemma 2.2 (Cauchy–Schwarz inequality). For any two vectors u, v in Rn,

|u · v| ≤ ‖u‖ · ‖v‖.

Proof. If one of the vectors is the zero vector, then the inequality is obvious. So we assume that u, v are nonzero. For any r, s ∈ R we have ‖rv + su‖ ≥ 0. Then,

0 ≤ ‖rv + su‖² = r²‖v‖² + 2rs (u · v) + s²‖u‖².

Taking r = ‖u‖ and s = ±‖v‖ gives

2 ‖u‖ ‖v‖ ( ‖u‖ ‖v‖ ± (u · v) ) ≥ 0,

so ±(u · v) ≤ ‖u‖ · ‖v‖ and therefore

|u · v| ≤ ‖u‖ · ‖v‖.
Lemma 2.3 (Triangle inequality). For any two vectors u, v in Rn the following hold
ku + vk ≤ kuk + kvk
Proof. We have
ku + vk2 = (u + v) · (u + v)
= (u · u) + 2(u · v) + (v · v) = kuk2 + 2(u · v) + kvk2 ≤ kuk2 + 2 |u · v| + kvk2
≤ kuk2 + 2 · kuk · kvk + kvk2 = (kuk + kvk)2
Hence,
kv + uk ≤ kvk + kuk.
Example 2.1. Let u and v be two given vectors and θ the angle between them. Prove that

u · v = ‖u‖ ‖v‖ cos θ.
Hence, we have the following definition. The angle between two nonzero vectors u and v is defined to be

θ := cos⁻¹ ( (u · v) / (‖u‖ · ‖v‖) ).

From Lemma 2.2 we have that

−1 ≤ (u · v) / (‖u‖ · ‖v‖) ≤ 1.

Hence, the angle between two vectors is well defined.
Example 2.2. Find the angle between

u = (2, −1, 2)ᵗ and v = (−1, −1, 1)ᵗ.

Solution: We have

θ = cos⁻¹ ( ((2, −1, 2) · (−1, −1, 1)) / (√9 · √3) ) = cos⁻¹ ( √3 / 9 ).
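The same computation can be checked numerically (a NumPy sketch):

```python
import numpy as np

u = np.array([2.0, -1.0, 2.0])
v = np.array([-1.0, -1.0, 1.0])

cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
theta = np.arccos(cos_theta)     # the angle, in radians

print(cos_theta, theta)
```

Here `np.arccos` returns the angle in radians; `cos_theta` agrees with √3/9.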
w = v − proj_u (v) = v − ( (u · v) / ‖u‖² ) u.    (2.3)
We will see later in the course how this idea is generalized in Rn to the process of orthogonalization and is used in
the method of least squares.
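The decomposition in Eq. (2.3) is easy to verify numerically (a NumPy sketch; the vectors are our own choice):

```python
import numpy as np

u = np.array([1.0, 1.0, 0.0])
v = np.array([2.0, 0.0, 3.0])

# Projection of v onto u, and the perpendicular component w, as in Eq. (2.3)
proj = (np.dot(u, v) / np.dot(u, u)) * u
w = v - proj

print(w, np.dot(u, w))  # u . w = 0, so w is perpendicular to u
```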
Exercise 2.1. The above discussion provides a method by which, for any two given vectors u and v, we can determine a vector w which is perpendicular to u. Can you devise a similar argument for three vectors u1, u2, u3? In other words, determine vectors v and w from u1, u2, u3 such that the set of vectors {u1, v, w} is pairwise perpendicular.
Exercises:
2.11. Prove that the triangle with vertices A(−2, 4, 0), B(1, 2, −1) and C(−1, 1, 2) is regular.

2.13. Describe the region that the following equations or inequalities determine in R³.

3. x < 5
4. z = −4
5. y > −3
6. x² = 4
7. x = y
8. x² + y² + z² ≤ 3

2.14. Let △ABC be any given triangle and θ the angle between AB and AC. Prove the law of cosines in a triangle:

BC² = AB² + AC² − 2 AB · AC · cos θ.

2.15. Show that for any two vectors v and w the following is true:

(v − w) · (v + w) = 0 ⟺ ‖v‖ = ‖w‖.

2.16. Let a and b be the sides of a parallelogram and d1, d2 its diagonals. Show that

d1² + d2² = 2(a² + b²).

2.17. Prove that the two diagonals of a parallelogram are perpendicular if and only if all its sides are equal.

2.18. Find the angle between the vectors u = (1, 2, 2)ᵗ and v = (2, 2, −3)ᵗ and the area of the triangle determined by them.

2.19. Let u be the unit vector tangent to the graph of y = x² + 1 at the point (2, 5). Find a vector v perpendicular to u.

2.20. For what values of t are the vectors u = (t, 0, t)ᵗ and v = (1, −t, 2)ᵗ perpendicular?

2.21. Show that the distance d from a point P = (x0, y0) to a line

ax + by + c = 0

is given by

d = |a x0 + b y0 + c| / √(a² + b²).

2.22. Let the vectors u, v, w have the same origin in R³ and coordinates

u = (1, 2, 2)ᵗ, v = (2, 2, −3)ᵗ, w = (−1, −1, −1)ᵗ.

Compute the volume of the parallelepiped determined by u, v, w.

2.23. Let the vectors u, v ∈ R³ be given as below:

u = (1, 2, 2)ᵗ and v = (1, 2, −3)ᵗ.

Find the projection of u on v.

2.24. Let u, v, w be vectors in R³ as follows:

u = (1, 2, 2)ᵗ, v = (2, 2, −3)ᵗ, w = (2, −1, −1)ᵗ.

Find the projection of u onto the vw-plane.
We can write a matrix in terms of its columns as

A = [u1, . . . , un],

where each ui is the column vector

ui = (a_{1,i}, a_{2,i}, . . . , a_{m,i})ᵗ.
Entrywise, A is written as follows:

A = [a_{i,j}] =
[ a_{1,1} a_{1,2} a_{1,3} . . . a_{1,n} ]
[ a_{2,1} a_{2,2} a_{2,3} . . . a_{2,n} ]
[ a_{3,1} a_{3,2} a_{3,3} . . . a_{3,n} ]
[ . . . ]
[ a_{m,1} a_{m,2} a_{m,3} . . . a_{m,n} ]    (2.4)
The i-th row of A is the vector

Ri := (a_{i,1}, . . . , a_{i,n})

and the j-th column is the vector

Cj := (a_{1,j}, . . . , a_{m,j})ᵗ.
Let A = [a_{i,j}] be an m × n matrix and B = [b_{i,j}] be an n × s matrix. The matrix product AB is the m × s matrix C = [c_{i,j}] such that c_{i,j} is the dot product of the i-th row vector of A and the j-th column vector of B.
The matrix addition is defined as
A + B = [ai,j + bi, j ],
and the multiplication by a scalar r ∈ R is defined to be the matrix the matrix
rA := [rai,j ].
The m × n zero matrix, denoted by 0, is the m × n matrix which has zeroes in all its entries. An m by n matrix A
is called a square matrix if m = n. If A = [ai,j ] is a square matrix then all entries ai,i form the main diagonal of A.
The n by n identity matrix, denoted by In , is the matrix which has 1’s in the main diagonal and zeroes elsewhere.
A matrix that can be written as rI is called a scalar matrix. Two matrices are called equal if their corresponding entries are equal. Notice that the arithmetic of matrices is not the same as the arithmetic of numbers. For example, in general AB ≠ BA, and AB = 0 does not imply that A = 0 or B = 0. We will study some of these properties in detail
in the next few sections. Next we state the main properties of the algebra of matrices.
Theorem 2.3. Let A, B, C be matrices of sizes such that the operations below are defined. Let r, s be scalars. Then the following
hold:
i) A + B = B + A
ii) (A + B) + C = A + (B + C)
iii) A + 0 = 0 + A = A
iv) r(A + B) = rA + rB
v) (r + s)A = rA + sA
vi) (rs)A = r(sA)
vii) (rA)B = A(rB) = r(AB)
viii) A(BC) = (AB)C
ix) IA = A = AI
x) A(B + C) = AB + AC
xi) (A + B)C = AC + BC
Proof. Most of the proofs are elementary and we will leave them as exercises for the reader.
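Two of the warnings above — that in general AB differs from BA, and that a product can vanish without either factor vanishing — can already be seen for 2 × 2 matrices (a NumPy sketch):

```python
import numpy as np

A = np.array([[0, 1],
              [0, 0]])
B = np.array([[0, 0],
              [1, 0]])

AB = A @ B   # [[1, 0], [0, 0]]
BA = B @ A   # [[0, 0], [0, 1]]

print(AB, BA)   # AB != BA
print(A @ A)    # the zero matrix, although A != 0
```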
The trace of a square matrix A = [a_{i,j}] is the sum of its diagonal entries:

tr (A) = a_{1,1} + a_{2,2} + · · · + a_{n,n}.
Lemma 2.4. For any n × n matrices A and B, i) tr (A + B) = tr (A) + tr (B), and ii) tr (AB) = tr (BA).

Proof. The first part is obvious. We prove only part ii). Let A = [a_{i,j}] and B = [b_{i,j}] be n × n matrices. Denote AB = C = [c_{i,j}] and BA = D = [d_{i,j}]. Then

tr (AB) = Σᵢ c_{i,i} = Σᵢ Ri(A) · Ci(B) = Σᵢ Σₖ a_{i,k} b_{k,i} = Σₖ Σᵢ b_{k,i} a_{i,k} = Σₖ d_{k,k} = tr (BA),

where Ri(A) is the i-th row of A and Ci(B) is the i-th column of B. This completes the proof.
Example 2.3. For matrices A and B given below

A =
[ 4 2 2 ]
[ 0 3 1 ]
[ 21 10 -2 ]

B =
[ 1 2 61 ]
[ 3 -3 1 ]
[ 31 2 1 ]

the product is

AB =
[ 72 6 248 ]
[ 40 -7 4 ]
[ -11 8 1289 ]
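Products like this are easy to check entry by entry with a short script (NumPy):

```python
import numpy as np

A = np.array([[4, 2, 2],
              [0, 3, 1],
              [21, 10, -2]])
B = np.array([[1, 2, 61],
              [3, -3, 1],
              [31, 2, 1]])

AB = A @ B   # each entry is (row of A) . (column of B)
print(AB)
```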
The transpose of a matrix A = [a_{i,j}], denoted by Aᵗ, is the matrix

Aᵗ := [a_{j,i}].

A is called symmetric if A = Aᵗ. Note that for a square matrix A its transpose is obtained by simply reflecting the matrix along its main diagonal.
Lemma 2.5. For any matrix A the following hold
i) (At )t = A,
ii) (A + B)t = At + Bt ,
iii) (AB)t = Bt At .
Proof. Parts i) and ii) are easy. We prove only part iii). Let A = [ai,j ] and B = [bi,j ]. Denote AB = [ci, j ]. Then,
(AB)t = [c j,i ] where
c j,i = R j (A) · Ci (B) = C j (At ) · Ri (Bt ) = Ri (Bt ) · C j (At ).
This completes the proof.
Example 2.4. For matrices A and B given below

A =
[ 4 2 2 ]
[ 0 3 1 ]
[ 21 10 -2 ]

B =
[ 1 2 61 ]
[ 3 -3 1 ]
[ 31 2 1 ]

find Aᵗ and Bᵗ.

Solution: We have

Aᵗ =
[ 4 0 21 ]
[ 2 3 10 ]
[ 2 1 -2 ]

Bᵗ =
[ 1 3 31 ]
[ 2 -3 2 ]
[ 61 1 1 ]
Computing (A + B)t , (AB)t , and (BA)t is left as an exercise for the reader.
Let A be a square matrix. If there is a positive integer n such that Aⁿ = I, then we say that A has finite order; otherwise A has infinite order. The smallest positive integer n such that Aⁿ = I is called the order of A.
Exercises:
2.25. Find the trace of the matrices A, B, A + B, and A − B, where A and B are

A =
[ 4 2 2 ]
[ 0 3 1 ]
[ 21 10 -1 ]

B =
[ 1 2 6 ]
[ 3 -3 1 ]
[ 31 0 13 ]

2.31. Let A be a square matrix. Show that (Aⁿ)ᵗ = (Aᵗ)ⁿ.

2.32. Prove or disprove the identity

(A + B)² = A² + 2AB + B².
A linear system of m equations in n unknowns x1, . . . , xn has the form

a_{1,1} x1 + · · · + a_{1,n} xn = b1
a_{2,1} x1 + · · · + a_{2,n} xn = b2
. . . . . . . . . . . . . . . . . . . . . . . .
a_{m,1} x1 + · · · + a_{m,n} xn = bm
In matrix form the system is written as Ax = b, where

A = [a_{i,j}] =
[ a_{1,1} a_{1,2} a_{1,3} . . . a_{1,n} ]
[ a_{2,1} a_{2,2} a_{2,3} . . . a_{2,n} ]
[ a_{3,1} a_{3,2} a_{3,3} . . . a_{3,n} ]
[ . . . ]
[ a_{m,1} a_{m,2} a_{m,3} . . . a_{m,n} ]

x = (x1, x2, x3, . . . , xn)ᵗ, b = (b1, b2, b3, . . . , bm)ᵗ.
We would like to use matrices to design an algorithm which can determine whether such a system has a solution and, in case it does, find that solution. The matrix [A | b] denotes the matrix A augmented with the column b. Recall the elementary row operations on a matrix: interchange two rows, multiply a row by a nonzero scalar, and add a multiple of one row to another row. It is obvious that such operations on the augmented matrix do not change the solution set of the system. If the matrix B is obtained by performing row operations on A, then the matrices A and B are called row equivalent.
A matrix is said to be in row-echelon form if:

1) All rows consisting entirely of zeroes are below the rows with nonzero entries.
2) The first nonzero entry in a row appears in a column to the right of the first nonzero entry in any preceding row.

For a matrix in row-echelon form, the first nonzero entry in a row is called the pivot for that row.
Example 2.5. Using row operations find a row-echelon form of the matrix

A =
[ 1 2 3 ]
[ 2 0 1 ]
[ 3 2 2 ]

Solution: Performing R2 → R2 − 2R1 and R3 → R3 − 3R1 gives

[ 1 2 3 ]
[ 0 -4 -5 ]
[ 0 -4 -7 ]

and then R3 → R3 − R2 gives the row-echelon form

[ 1 2 3 ]
[ 0 -4 -5 ]
[ 0 0 -2 ]
Row operations are fast and inexpensive operations. Below we give the algorithm of how to transform a matrix in
row-echelon form.
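A minimal implementation of such an algorithm — forward elimination by row operations, a sketch of our own rather than the book's pseudocode — is:

```python
def row_echelon(M):
    """Transform a copy of matrix M (a list of lists of numbers) to row-echelon form."""
    A = [row[:] for row in M]
    m, n = len(A), len(A[0])
    pivot_row = 0
    for col in range(n):
        # find a row at or below pivot_row with a nonzero entry in this column
        pivot = next((r for r in range(pivot_row, m) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]   # row interchange
        for r in range(pivot_row + 1, m):                 # eliminate below the pivot
            factor = A[r][col] / A[pivot_row][col]
            A[r] = [a - factor * b for a, b in zip(A[r], A[pivot_row])]
        pivot_row += 1
    return A

H = row_echelon([[1, 2, 3], [2, 0, 1], [3, 2, 2]])
print(H)
```

Running it on the matrix of Example 2.5 gives one row-echelon form of A (row-echelon forms are not unique).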
The row-echelon form of matrices is used to solve linear systems of equations. Let A x = b, be a linear equation. We
create the augmented matrix [A | b] and find its row-echelon form, say [H | c]. Using back substitution we solve the
system
Hx = c.
We illustrate with an example.
Example 2.6. Solve the linear system
x2 − 3x3 = −5
2x1 + 3x2 − x3 = 7
4x1 + 5x2 − 2x3 = 10
Solution: We have

[A | b] =
[ 0 1 -3 | -5 ]
[ 2 3 -1 | 7 ]
[ 4 5 -2 | 10 ]

[H | c] =
[ 2 3 -1 | 7 ]
[ 0 1 -3 | -5 ]
[ 0 0 -3 | -9 ]

by performing the operations R1 ←→ R2, R3 → R3 − 2R1, R3 → R3 + R2. Thus the linear system is equivalent with the following
system
2x1 + 3x2 − x3 = 7
x2 − 3x3 = −5
−3x3 = −9

Back substitution now gives x3 = 3, then x2 = −5 + 3x3 = 4, and finally x1 = (7 − 3x2 + x3)/2 = −1. This method is known as the Gauss method.
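The solution can be double-checked with a linear-algebra library (a NumPy sketch):

```python
import numpy as np

# Coefficients of the system in Example 2.6
A = np.array([[0.0, 1.0, -3.0],
              [2.0, 3.0, -1.0],
              [4.0, 5.0, -2.0]])
b = np.array([-5.0, 7.0, 10.0])

x = np.linalg.solve(A, b)
print(x)
```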
Theorem 2.4. Let

Ax = b

be a linear system and [A | b] → [H | c], where [H | c] is in row-echelon form. Then one of the following holds:
1. Ax = b has no solution if and only if H has a row of all zeroes and in the same row c has a nonzero entry.
2. If Ax = b has solutions, then one of the following holds:
   i) it has a unique solution if every column of H contains a pivot;
   ii) it has infinitely many solutions if some column of H contains no pivot.
Proof. Consider first a single equation in one unknown,

ax = b,

which has no solution if and only if a = 0 and b ≠ 0. It has a unique solution if and only if a ≠ 0, and infinitely many solutions if and only if a = b = 0.
If H has a row of all zeroes and in the same row c has a nonzero entry cn ≠ 0, then the equation

0 · x1 + · · · + 0 · xn = cn

has no solution, and therefore the linear system Ax = b has no solution. The converse also holds, from the definition of the row-echelon form. Parts 2 i) and 2 ii) follow similarly.
Example 2.7. Find how many solutions the following system has:

2x + 5y = 3
6x + 15y = 9

Solution: We have

[A | b] =
[ 2 5 | 3 ]
[ 6 15 | 9 ]

[H | c] =
[ 2 5 | 3 ]
[ 0 0 | 0 ]

From the above theorem the system has infinitely many solutions. Of course, this is easy to see, since the second equation of the system is obtained by multiplying the first equation by 3.
The above theorem can be interpreted geometrically in the case of a 2 by 2 or a 3 by 3 coefficient matrix. For
example in the case of a linear system of 2 equations and 2 variables we have the well known situation of two lines
on the plane. It is known from geometry that two lines intersect in one point, no points, or infinitely many points.
Exercises:
Solve the linear systems using the Gauss method with back substitution.

2.36.
x + 5y = 2
3x + 2y = 9

2.37.
2x + y − 3z = 0
6x + y − 8z = 0
2x − y + 5z = −4

Find the row-echelon form of the following matrices.

2.39.
[ 0 1 -3 -5 ]
[ 0 3 0 1 ]
[ 4 5 -2 10 ]

2.40.
[ 0 0 0 0 ]
[ 1 1 -3 -3 ]
[ 1 3 0 0 ]
[ 2 5 -2 1 ]
2.42. Determine all values of b1, b2 such that the following system has no solutions:

x1 + 2x2 = b1
−2x1 − 4x2 = b2

2.43. Find a, b, and c such that the parabola

y = ax² + bx + c

passes through the points (1, −4), (−1, 0), and (2, 3).

2.44. Find a, b, c and d such that the quartic

y = ax⁴ + bx³ + cx² + d

passes through the points (3, 2), (−1, 6), (−2, 38), and (2, 6).

2.45. Find a polynomial function going through the points (3, 1, −2), (1, 4, 5), and (2, 1, −4).
Example 2.8. Find the reduced row-echelon form of the matrix [H | c] from Example 2.6,

[H | c] =
[ 2 3 -1 | 7 ]
[ 0 1 -3 | -5 ]
[ 0 0 -3 | -9 ]

Solution: To find the reduced row-echelon form we perform the following row operations. First R1 → (1/2)R1 and R3 → −(1/3)R3 give

[ 1 3/2 -1/2 | 7/2 ]
[ 0 1 -3 | -5 ]
[ 0 0 1 | 3 ]

Then R1 → R1 − (3/2)R2 gives

[ 1 0 4 | 11 ]
[ 0 1 -3 | -5 ]
[ 0 0 1 | 3 ]

then R2 → R2 + 3R3 gives

[ 1 0 4 | 11 ]
[ 0 1 0 | 4 ]
[ 0 0 1 | 3 ]

and finally R1 → R1 − 4R3 gives

[ 1 0 0 | -1 ]
[ 0 1 0 | 4 ]
[ 0 0 1 | 3 ]

Hence, we can directly conclude that the solution of the corresponding system is

x = (−1, 4, 3)ᵗ,

as concluded previously.
Remark 2.1. Notice that the reduced row-echelon form of a matrix A, in contrast to the row-echelon form, is unique.
The method that transforms the augmented matrix to the reduced row-echelon form is called the Gauss-Jordan
method.
Remark 2.2. Even though the Gauss-Jordan method gives the solution in a "nicer" form, it is not necessarily better than the
Gauss method. For large linear systems the number of operations performed becomes significant. Using the Gauss-Jordan
method, it takes roughly 50% more arithmetic operations than using the Gauss method.
Example 2.9. Find the reduced row-echelon form of the matrix

[A | b] =
[ 2 1 -2 | 1 ]
[ -2 1 1 | 2 ]
[ -2 -1 2 | 2 ]

Show all the row operations. What are the solutions of the corresponding system Ax = b?

Solution: The reduced row-echelon form is

[H | c] =
[ 1 0 -3/4 | 0 ]
[ 0 1 -1/2 | 0 ]
[ 0 0 0 | 1 ]

The last row corresponds to the equation 0 = 1, so the system Ax = b has no solutions.
A linear system of the form

Ax = 0

is called homogeneous. Clearly x = 0 is a solution of such systems and is called the trivial solution. The augmented matrix for such systems is [A | 0] and its row-echelon form will be [H | 0]. The system has nontrivial solutions if and only if some column of H contains no pivot. We will see in Chapter 3 that for a square matrix A this is equivalent with the determinant of A being zero.
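The link between nontrivial solutions and a vanishing determinant can be illustrated numerically (a NumPy sketch with a matrix of our own):

```python
import numpy as np

# The third row is the sum of the first two, so A is singular
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [5.0, 7.0, 9.0]])

det = np.linalg.det(A)               # (numerically) zero
x = np.array([1.0, -2.0, 1.0])       # a nontrivial solution of A x = 0

print(det, A @ x)
```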
Exercises:
2.46. Find the reduced row-echelon form of

A =
[ 1 2 3 ]
[ 2 0 1 ]
[ 3 2 2 ]

and solve the linear system Ax = 0.

2.47. Find the reduced row-echelon form of

A =
[ 0 1 -3 -5 ]
[ 0 3 0 1 ]
[ 4 5 -2 10 ]

and solve the linear system Ax = 0.

2.48. Find the reduced row-echelon form of

A =
[ 0 0 0 0 ]
[ 1 1 -3 -3 ]
[ 1 3 0 0 ]
[ 2 5 -2 1 ]

and solve the linear system Ax = 0.

2.50. Solve the following system using the Gauss method:

5x1 + 3x2 − x3 = −2
2x1 + 2x2 + 2x3 = 3
−x1 − x2 + x3 = 6

2.51. Solve the following system using the Gauss-Jordan method:

11x1 + 12x2 − 3x3 = 2
−x1 + 3x2 + 2x3 = 3
2x1 + 3x2 + x3 = −2

2.52. Prove that the reduced row-echelon form of a matrix is unique.

2.53. Let Ax = 0 be a homogeneous system which has no nontrivial solutions. What is the reduced row-echelon form of A?

2.54. Find a, b, and c such that the parabola

y = ax² + bx + c

passes through the points (1, 2), (−1, 1), and (2, 3).
Definition 2.7. Let A = [a_{i,j}] be an n × n square matrix. A is called invertible if there exists an n × n matrix A⁻¹ such that

AA⁻¹ = A⁻¹A = In.

The matrix A⁻¹ is called the inverse of A. If A is not invertible, then it is called singular.
Theorem 2.5 (Uniqueness of the inverse). Let A be an invertible matrix. Then its inverse is unique.

Proof. Suppose that C and D are both inverses of A, so that

AC = I = AD and CA = I = DA.

Then we have

D(AC) = DI = D
D(AC) = (DA)C = IC = C    (2.5)

Hence, C = D.
We also have the following useful result.

Lemma. If A and B are invertible n × n matrices, then AB is invertible and (AB)⁻¹ = B⁻¹A⁻¹.

Proof. Exercise.
Definition 2.8. Any matrix that can be obtained from the identity matrix In by one row operation is called an elementary
matrix .
Theorem 2.6. Let A be an m × n matrix and E an m × m elementary matrix. Then EA effects the same row operation on A as the one performed on Im to obtain E.
Proof. We check the case of a row interchange,

Im → E via Ri ←→ Rj.

Then the new Ri(E) = (0, . . . , 0, 1, 0, . . . , 0), where the 1 is in the j-th position. Hence the entries of Ri(EA) are Ri(E) · Ck(A) = a_{j,k} for k = 1, . . . , n, that is, Ri(EA) = Rj(A), which is exactly the effect of the interchange on A. The other row operations are checked similarly.
Lemma. Every elementary matrix is invertible, and its inverse is an elementary matrix.

Proof. Let E1 be an elementary matrix. Then E1 is obtained by some row operation on the identity I. Since every row operation can be undone, we can perform a new row operation on E1 to obtain I. The second row operation corresponds to another elementary matrix E2 such that E2E1 = I; see the previous theorem. Thus, E1 has an inverse.
Example. Let E be the matrix obtained from the identity matrix by interchanging rows R2 ←→ R4. Since E is obtained from the identity by one row operation, it is an elementary matrix and therefore invertible. Its inverse is E itself, since E² = I.
Theorem. Let A and B be n × n matrices such that AB = In. Then BA = In; in other words, B = A⁻¹.

Proof. It is enough to show that if AB = In then BA = In; the other direction goes by symmetry of A and B. Hence, we assume that AB = In. Let b be any vector in Rn. Then ABb = b, thus the system Ax = b always has a solution (namely x = Bb). By Theorem 2.4 the reduced row-echelon form of A is In. Hence, there are elementary matrices E1, . . . , Ek such that

Ek · · · E1 A = In.    (2.6)

Multiplying both sides on the right by B and using AB = In gives

Ek · · · E1 = Ek · · · E1 (AB) = B.

Therefore BA = (Ek · · · E1) A = In.
Example 2.12. Find the inverse of the matrix

A =
[ -1 1 0 2 ]
[ 0 2 1 0 ]
[ 0 1 -2 1 ]
[ 0 -1 -1 0 ]
Solution: Create the matrix [A | I]. Then its reduced row-echelon form is

[I | C] =
[ 1 0 0 0 | -1 -5 2 -9 ]
[ 0 1 0 0 | 0 1 0 1 ]
[ 0 0 1 0 | 0 -1 0 -2 ]
[ 0 0 0 1 | 0 -3 1 -5 ]

Hence,

A⁻¹ = C =
[ -1 -5 2 -9 ]
[ 0 1 0 1 ]
[ 0 -1 0 -2 ]
[ 0 -3 1 -5 ]
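A computed inverse is easy to verify: multiplying back must give the identity. For the matrices above (a NumPy check):

```python
import numpy as np

A = np.array([[-1, 1, 0, 2],
              [0, 2, 1, 0],
              [0, 1, -2, 1],
              [0, -1, -1, 0]])
C = np.array([[-1, -5, 2, -9],
              [0, 1, 0, 1],
              [0, -1, 0, -2],
              [0, -3, 1, -5]])

print(A @ C)   # the 4 x 4 identity, so C = A^{-1}
```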
Example 2.13. Let A be given by

A =
[ 1 0 0 -1 ]
[ 1 1 1 0 ]
[ -1 1 1 0 ]
[ 0 0 -1 -1 ]

Find its inverse. We start with

[A | I] =
[ 1 0 0 -1 | 1 0 0 0 ]
[ 1 1 1 0 | 0 1 0 0 ]
[ -1 1 1 0 | 0 0 1 0 ]
[ 0 0 -1 -1 | 0 0 0 1 ]
Remark 2.3. We have illustrated above how to find the inverse of a matrix. However, such an inverse does not always exist.
In the next chapter we will study some necessary and sufficient conditions such that the inverse of a matrix exists.
Exercises:
2.56. a) Let A be a square matrix such that A2 = 0. Find the 2.60. Show that if B is invertible, then tr (A) = tr (BAB−1 ).
inverse of I − A.
b) Let A be a square matrix such that A2 + 2A + I = 0. Find
2.61. Let
the inverse of A.
1 2 -1
c) Let A be a square matrix such that A − A + I = 0. Find
3
A = 0 3
1
the inverse of A.
2 0 1
d) Let A be a square matrix such that A = 0. Find the
n
1 2 3 3 0 1
2.58. For what values of a, b, c, d does the inverse of
A = -2 1 2 , B = 2 0 2 ,
" #
3 2 1 0 2 1
a b
A=
c d
be given. Find the following: tr(A), tr(B), At , AB, Bt At ,
tr(BAB−1 ).
exist? Find the inverse for such values of a, b, c, d.
2.59. Solve the linear system 2.64. Show that if A is invertible then so is At .
Ax = b
2.65. Let r be a positive integer and A an invertible matrix.
when A is invertible. Is Ar necessarily invertible? Justify your answer.
Review exercises
2.66. Find the reduced row-echelon form of the matrix. Show all the row operations.

[ 4 2 3 ]
[ -2 0 2 ]
[ 1 1 2 ]
[ 3 -1 2 ]

2.67. Find the angle between the vectors u = (1, 2, 3)ᵗ and v = (5, 1, 8)ᵗ.

2.68. Determine all values of b1, b2 such that the following system has no solutions:

x1 + 2x2 − x3 = b1
−2x1 − 4x2 + 2x3 = b2
x1 − x2 + x3 = 2

2.69. Find the area of the triangle between the three points (1, 2), (3, 4), (5, 6).

2.70. Let the matrices

A =
[ 3 2 3 ]
[ -2 1 2 ]
[ 0 1 1 ]

B =
[ 2 -2 1 ]
[ 2 0 2 ]
[ 0 2 2 ]

be given. Find the following: tr(A), tr(B), Aᵗ, AB, BᵗAᵗ, tr(BAB⁻¹).

2.71. Show that if AB is invertible then so are A and B.

2.72. A square matrix is called upper triangular if all entries below the main diagonal are zero. What is the sum and product of upper triangular matrices? Justify your answer.

2.73. A square matrix is called lower triangular if all entries above the main diagonal are zero. What is the sum and product of lower triangular matrices? Let V := Mat_{n×n}(R) be the set of all n × n matrices with entries in R, W1 the set of all upper triangular matrices of V, and W2 the set of all lower triangular matrices of V. What is the intersection W1 ∩ W2?

2.74. Let A be a 3 by 2 matrix. Show that there is a vector b such that the linear system

Ax = b

is unsolvable.

2.75. Let A be an m × n matrix with m > n. Show that there exists a b such that the linear system Ax = b is unsolvable.

2.76. Let A be an m × n matrix and B an n × m matrix, where m > n. Use the above result to show that the row-echelon form of the matrix AB has at least one row of all zeroes.

2.77. Find all matrices B such that

i)
[ 0 1 ]     [ 0 0 ]
[ 0 2 ] B = [ 0 0 ]

ii)
B [ 0 1 ]   [ 0 0 ]
  [ 0 2 ] = [ 0 0 ]

2.78. Find all matrices which commute with

[ 0 1 ]
[ 3 1 ]

2.79. Show that if A commutes with B then Aᵗ commutes with Bᵗ.

2.80. Let V be the set of all n by n matrices with entries in R. Show that scalar matrices commute with all matrices from V. Are there any other matrices which commute with all matrices of V?

2.81. Let a, b, c, d be real numbers, not all zero. Show that the following system has exactly one solution:

ax1 + bx2 + cx3 + dx4 = 0
bx1 − ax2 + dx3 − cx4 = 0
cx1 − dx2 − ax3 + bx4 = 0
dx1 + cx2 − bx3 − ax4 = 0

2.82. For what value of λ does the following system have a solution:

2x1 − x2 + x3 + x4 = 1
x1 + 2x2 − x3 + 4x4 = 2
x1 + 7x2 − 4x3 + 11x4 = λ

2.83. The following system has a unique solution:

ay + bx = c
cx + az = b
bz + cy = a

Show that abc ≠ 0. Find the solution of the system.

2.84. Find the following:

[ 1 1 ]ⁿ   [ 1 0 ]ⁿ   [ 1 1 ]ⁿ
[ 0 1 ]    [ 1 1 ]    [ 1 1 ]

2.85. Let

A =
[ a b ]
[ c d ]

such that A² = I. Show that the following relation is satisfied when x is substituted by A:

x² − (a + d)x + (ad − bc) = 0.

2.86. Let A be a 3 by 3 matrix. Can you generalize the above problem in that case? What about if A is an n by n matrix?

2.87. Find the order of the following matrices:

[ 1 1 ]   [ -1 1 ]   [ -1 -1 ]   [ 1 1 ]
[ 2 1 ]   [ 0 1 ]    [ 1 0 ]     [ -1 0 ]
Chapter 3
Vector Spaces
In this chapter we formally define vector spaces. Having discussed Euclidean spaces in the previous chapter, the reader should find the concept of a vector space more intuitive. Throughout this chapter k denotes a field. For our purposes k is always one of the following: Q, R, C.
Recall that the addition of integers is a binary operation

”+” : Z × Z → Z, (a, b) → a + b.    (3.2)

Similarly, on a set V we consider an addition

”+” : V × V → V, (u, v) → u + v,    (3.3)

and a scalar multiplication by elements of k,

”⋆” : k × V → V, (r, u) → r ⋆ u.    (3.4)
Definition 3.1. The set V together with the binary operations above, denoted by (V, +, ?), is a vector space over k if the
following are satisfied:
1. (u + v) + w = u + (v + w), ∀ u, v, w ∈ V

2. u + v = v + u, ∀ u, v ∈ V

3. ∃ 0 ∈ V such that 0 + u = u + 0 = u, ∀ u ∈ V

4. for every u ∈ V there exists −u ∈ V such that u + (−u) = 0

5. r ⋆ (u + v) = r ⋆ u + r ⋆ v, ∀ r ∈ k, u, v ∈ V

6. (r + s) ⋆ u = r ⋆ u + s ⋆ u, ∀ r, s ∈ k, u ∈ V

7. (rs) ⋆ u = r ⋆ (s ⋆ u), ∀ r, s ∈ k, u ∈ V

8. 1 ⋆ u = u, ∀ u ∈ V
A polynomial with coefficients in k is an expression f(x) = a0 + a1 x + · · · + an xⁿ, where a0, . . . , an ∈ k. We define the sum and the scalar product of two polynomials f(x) = Σ ai xⁱ and g(x) = Σ bi xⁱ to be

(f + g)(x) := Σ (ai + bi) xⁱ and (r · f)(x) := Σ (r ai) xⁱ,

for any r ∈ k. Then, k[x] is a vector space over k. k[x] is also called the polynomial ring of univariate polynomials; see Chapter 4 for more details.
Example 3.4 (The space of n × n matrices). The set of n × n matrices with entries in a field k, together with matrix addition
and scalar multiplication forms a vector space. We denote this space by Matn×n (k).
Example 3.5 (The space of functions from R to R). Let L(R) denote the set of all functions

f : R → R.

More generally, let S be any set and consider the k-valued functions

f : S → k.

Let V denote the set of all k-valued functions. We define the sum and the scalar product of two functions in V to be

(f + g)(s) := f(s) + g(s) and (r · f)(s) := r f(s),

for s ∈ S and r ∈ k.
There are some subsets of a vector space V which are of special importance. A subset W ⊂ V is called a subspace (or a linear subspace) of V if it is a vector space by itself under the operations inherited from V. Next we see some examples of subspaces of a vector space.
Example 3.7. Let V = R³. Then every v ∈ V is a triple (x, y, z), which we have denoted by

v = [ x ]
    [ y ]
    [ z ]

In other words, a subset S ⊂ V is closed under the two operations when for all u, v ∈ S and r ∈ k we have

u + v ∈ S  and  ru ∈ S.
Lemma 3.1. Any subset W ⊂ V is a vector space if and only if it is closed under addition, scalar multiplication, and contains
0.
Proof. Exercise
Example 3.8. Let V = R3 and P be the plane determined by the vectors u and v going through the origin. This plane is a
vector space because: it contains the zero vector, every sum of two vectors in P is again in P, and every vector in P multiplied
by a scalar is again in P.
Example 3.9 (The nullspace of a matrix). Let A be a given m × n matrix. Consider the set of all vectors x ∈ Rⁿ which satisfy the
equation
Ax = 0.
We call this set the nullspace of A. It is a subspace of Rn . The proof is easy and is left as an exercise.
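The nullspace can be computed mechanically. The sketch below uses SymPy (an editor's choice of tool, not part of the text) on a made-up matrix to find a nullspace basis and to confirm the closure property that makes the nullspace a subspace.

```python
# A minimal sketch, assuming SymPy: compute a nullspace basis and check
# that linear combinations of basis vectors are still sent to zero by A.
import sympy as sp

A = sp.Matrix([[1, 2, -1],
               [2, 4, -2]])      # rank 1, 3 columns -> nullspace dimension 2

basis = A.nullspace()            # list of column vectors spanning null(A)

# Any linear combination of nullspace vectors lies in the nullspace.
v = 3 * basis[0] - 5 * basis[1]
print(A * v)                     # the zero vector
```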
Definition 3.2. Let V be a vector space over k and v1, . . . , vn ∈ V. Then, v ∈ V is a linear combination of v1, . . . , vn if it can be
written as
v = r1 v1 + · · · + rn vn
where r1 , . . . , rn ∈ k.
Lemma 3.2. Let V be a vector space and v1 , . . . , vn ∈ V. The set W of all linear combinations of v1 , . . . , vn , is a subspace of V.
Proof. Exercise.
We say that vectors u1, . . . , un ∈ V are linearly independent if

r1 u1 + · · · + rn un = 0

implies that

r1 = · · · = rn = 0.

Otherwise, we say that u1, . . . , un are linearly dependent. Hence, vectors u1, . . . , un are linearly dependent if and only if one of them can be expressed as a linear combination of the others.
Example 3.10. Show that

u1 = (2, 3, 1)^t,  u2 = (1, 2, 1)^t,  u3 = (1, 1, 1)^t

are linearly independent in R³.
Solution: We want to determine whether there exist r1, r2, r3, not all zero, such that

r1 u1 + r2 u2 + r3 u3 = 0.

We have

[ 2r1 + r2 + r3 ]   [ 0 ]
[ 3r1 + 2r2 + r3 ] = [ 0 ]
[ r1 + r2 + r3 ]    [ 0 ]
The augmented matrix and its reduced row-echelon form are

[A | 0] = [ 2 1 1 | 0 ]      [H | c] = [ 1 0 0 | 0 ]
          [ 3 2 1 | 0 ]                [ 0 1 0 | 0 ]
          [ 1 1 1 | 0 ]                [ 0 0 1 | 0 ]
Since every column of H has a pivot, the system has the unique solution (r1, r2, r3) = (0, 0, 0). Hence u1, u2, u3 are
linearly independent.
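The row reduction above is easy to cross-check numerically. The sketch below (using NumPy, an editor's tooling choice) confirms that the matrix with u1, u2, u3 as columns has full rank, which is equivalent to the unique zero solution found above.

```python
# Numerical cross-check of Example 3.10, assuming NumPy.
import numpy as np

u1, u2, u3 = [2, 3, 1], [1, 2, 1], [1, 1, 1]
A = np.column_stack([u1, u2, u3])

# Full column rank <=> the only solution of A r = 0 is r = 0.
print(np.linalg.matrix_rank(A))  # 3, so u1, u2, u3 are independent
```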
The following example should be familiar to students who have had a course in differential equations:
Example 3.11. Let L(R) be the vector space of all real-valued functions in t. Show that the following pair of functions
sin t, cos t are linearly independent.
Exercises:
3.1. Let U, W be subspaces of V. Define the sum of the subspaces U and W by

U + W := {u + w | u ∈ U, w ∈ W}.

Show that U ∩ W and U + W are subspaces of V.

3.2. Let u ∈ V = Rⁿ and

Wu := {v ∈ V | u · v = 0}.

Show that Wu is a subspace of V.

3.3. Let S be a set and k a field. Show that the set of functions

f : S → k,

under function addition and multiplication by a constant, is a vector space over k.
3.4. Let L(R) be the vector space of all real-valued functions in t. Show that the following pairs are linearly independent.
i) t, e^t
ii) sin t, cos 2t
iii) t e^t, e^{2t}
iv) t, sin t.

3.5. An upper triangular matrix is a matrix A = [a_{i,j}] such that a_{i,j} = 0 for all i > j. Show that the space of upper triangular matrices is a subspace of Matn×n (R).

3.6. Prove that k[x] is a vector space over the field k.

3.7. Let k be a field and A := k[x] the polynomial ring. Denote by An the set of polynomials in A of degree n. Is An a subspace of A? Justify your answer.

3.8. Let k be a field and A := k[x] the polynomial ring. Denote by Pn the set of polynomials in A of degree ≤ n. Is Pn a subspace of A? Justify your answer.

3.9. Let Q be the set of rational numbers and

Q(√2) := {a + b√2 | a, b ∈ Q}.

Prove that Q(√2) is a vector space over Q with the usual addition and scalar multiplication.

3.10. We know that the set of complex numbers C is given by

C := {a + bi | a, b ∈ R},

where i = √−1. Is C a vector space over R with the usual addition and scalar multiplication?

3.11. Let V be the set of 2 by 2 matrices of the form

[ 0 x ]
[ y 0 ]

where x, y are any scalars in R. Is V a vector space over R?

3.12. Is R a vector space over Q?
Example 3.12. Show that B = {i, j} is a basis of R².

Solution: Indeed, we know from calculus that every vector v ∈ R² can be written as a linear combination of i and j as follows:

v = r1 i + r2 j
If {v1, . . . , vn} is a basis of V and

x1 v1 + · · · + xn vn = y1 v1 + · · · + yn vn,

then

xi = yi,  for i = 1, . . . , n.
Proof. From
x1 v1 + · · · + xn vn = y1 v1 + · · · + yn vn
we get that
(x1 − y1 )v1 + · · · + (xn − yn )vn = 0.
Since v1, . . . , vn are linearly independent, xi − yi = 0, i.e., xi = yi for i = 1, . . . , n.
The theorem motivates the following definition:
Definition 3.5. Let V be a vector space, B := {v1, . . . , vn} a basis of V, and u ∈ V given by

u := x1 v1 + · · · + xn vn.

Then, x1, . . . , xn are called the coordinates of u with respect to the basis B.
w1 = x1 v1 + · · · + xm vm .
We know that w1 ≠ 0 since B2 is a basis, thus at least one of x1, . . . , xm is ≠ 0. Without loss of generality we may
assume that x1 ≠ 0. Then we have
x1 v1 = w1 − x2 v2 − · · · − xm vm
Hence,

v1 = (1/x1) w1 − (x2/x1) v2 − · · · − (xm/x1) vm.
The subspace W generated by {w1, v2, . . . , vm} contains v1. Hence, W = V. We continue this procedure until we
replace all of v2, . . . , vm by w2, . . . , wm. Thus, the set

{w1, . . . , wm}

generates V. Then for each i > m, wi is a linear combination of w1, . . . , wm. This is a contradiction, because
w1, . . . , wn are linearly independent since B2 is a basis. Hence, m ≥ n. Interchanging the roles of B1 and B2 we get
m = n.
Hence, we have the following definition.
Definition 3.6. Let V be a vector space and B a basis of V. Then,
dim(V) := |B|
Proof. Exercise.
Corollary 3.1. Let V be a vector space and W a subspace of V. If dim(W) = dim(V) then W = V.
Proof. Take a basis B = {w1, . . . , wn} of W. Then w1, . . . , wn are linearly independent, and by the above theorem
they generate V.

Corollary 3.2. Let V be a finite dimensional vector space and W ⊂ V a subspace. Then

dim(W) ≤ dim(V).

Proof. Exercise.

Example 3.13. Let u = (1, 3)^t and v = (2, 7)^t be vectors in V = R². What is the space W = Span (u, v)?
Solution: From the previous examples we know that dim(V) = 2. Then from the previous corollary
dim(W) ≤ 2.
Since u and v are not multiples of each other then they are independent. Hence, dim(W) = 2. From Corollary 3.1 we have that
W = R2 .
Example 3.14. Let V = Mat2×2 (R). Find a basis for V and its dimension.

Solution: Let M1, M2, M3, M4 denote the matrices with a single entry 1 (in positions (1, 1), (1, 2), (2, 1), (2, 2) respectively) and all other entries 0. They span V, and

r1 M1 + r2 M2 + r3 M3 + r4 M4 = 0

gives

r1 = r2 = r3 = r4 = 0.

Hence dim V = 4.
Remark 3.1. In general, one can find a basis of Matn×n (k) as above and show that the dimension is n2 .
Consider W = Span (w1, w2, w3, w4) ⊂ R⁴, where the wi are the columns of the matrix

A = [ 1 −1 2 3 ]
    [ 2  3 4 3 ]
    [ 3  1 2 1 ]
    [ 1  5 6 5 ]

Its reduced row-echelon form is

[ 1 0 0 −2/5 ]
[ 0 1 0 −3/5 ]
[ 0 0 1  7/5 ]
[ 0 0 0   0  ]

so the first three columns are pivot columns.
Thus, the basis of W is B = {w1 , w2 , w3 }.
Consider the set B = {e1, . . . , en} of elementary vectors in Rⁿ. Obviously this set generates Rⁿ, since every vector in Rⁿ can be written as a linear combination of elements of B.
Form the matrix A = [e1, . . . , en]. Then A = I, which is already in reduced row-echelon form. Since every column has a pivot, the elements of B are linearly independent.
The basis B is called the standard basis of Rn .
Example 3.16. Let P4 be the vector space of polynomials with real coefficients and degree ≤ 4. Determine whether
{ f1, f2, f3, f4, f5 }, given below, form a basis for P4.
f1 = 2x4 − x3 + 2x2 − 1
f2 = x4 − x
f3 = x4 + x3 + x2 + x + 1
f4 = x2 − 1
f5 = x − 1
Solution: We take the basis B = {x4 , x3 , x2 , x, 1} for P4 . The reader should verify that this is a basis for P4 . Then the coordinates
of f1 , f2 , f3 , f4 , f5 with respect to the basis B are
f1 = (2, −1, 2, 0, −1)^t,  f2 = (1, 0, 0, −1, 0)^t,  f3 = (1, 1, 1, 1, 1)^t,  f4 = (0, 0, 1, 0, −1)^t,  f5 = (0, 0, 0, 1, −1)^t.
We can determine whether the polynomials are independent by determining whether the corresponding coordinate vectors in
R5 are independent. The corresponding matrix is
[  2  1 1  0  0 ]
[ −1  0 1  0  0 ]
[  2  0 1  1  0 ]
[  0 −1 1  0  1 ]
[ −1  0 1 −1 −1 ]
and its reduced row-echelon form is the identity matrix I5. Since every column has a pivot, the vectors are independent in
R⁵, and therefore f1, . . . , f5 are independent in P4. The dimension of P4 is dim P4 = 5. Hence { f1, f2, f3, f4, f5 } forms a basis for
P4.
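The coordinate-vector argument can be replayed numerically. Assuming NumPy (an editor's choice, not part of the text), the rank of the coordinate matrix decides independence:

```python
# Checking Example 3.16: the f_i are independent in P4 iff their
# coordinate vectors w.r.t. {x^4, x^3, x^2, x, 1} are independent in R^5.
import numpy as np

coords = np.column_stack([
    [2, -1, 2, 0, -1],   # f1 = 2x^4 - x^3 + 2x^2 - 1
    [1, 0, 0, -1, 0],    # f2 = x^4 - x
    [1, 1, 1, 1, 1],     # f3 = x^4 + x^3 + x^2 + x + 1
    [0, 0, 1, 0, -1],    # f4 = x^2 - 1
    [0, 0, 0, 1, -1],    # f5 = x - 1
])
print(np.linalg.matrix_rank(coords))  # 5: the five polynomials are a basis
```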
Exercises:
3.13. Let V be a vector space over k. If a set of vectors is linearly independent in V, prove that the set does not contain the zero vector.

3.14. Let W = Span (w1, w2, w3) ⊂ R⁴ such that

w1 = (1, 2, 3, 1)^t,  w2 = (−1, 3, 1, 5)^t,  w3 = (1, 4, 0, 6)^t.   (3.10)

Find a basis for W.

3.15. Let W = Span (w1, w2) ⊂ R⁶ such that

w1 = (1, 2, 3, 1, 9, 5)^t  and  w2 = (2, 4, 6, 2, 18, 10)^t.   (3.11)

Find a basis for W.

3.16. Prove that any set B ⊂ Rⁿ of n non-zero vectors which are mutually perpendicular forms a basis for Rⁿ.

3.17. Let V = Mat3×3 (R). Find a basis for V and its dimension.

3.18. Let V = k[x]. Show that

f1 = x^6 + x^4  and  f2 = x^6 + 3x^4 − x

are linearly independent.

3.19. Let k be a field and V := k[x] the vector space of polynomials in x. Denote by Pn the space of polynomials in V of degree ≤ n. Find a basis for Pn.

3.20. Let V be the vector space of functions f : R → R. Let W be the subspace of V such that

W := Span (sin² x, cos² x).

Show that W contains all constant functions.

3.21. Let V be the vector space of functions f : R → R. Show that the set

{1, sin x, sin 2x, . . . , sin nx}

is an independent set of vectors in V.

3.22. Let V be the vector space of functions f : R → R. Find a basis of the subspace

W = Span (3 − sin x, 2 sin 2x − sin 3x, 3 sin 2x − sin 4x, sin 5x − sin 2x).

Hint: Use the previous problem.
Theorem 3.6 (Rank–Nullity Theorem). Let A be an m × n matrix and H its row-echelon form. Then:
i) rank (A) = the number of pivots of H;
ii) null (A) = the number of columns of H without a pivot.
Moreover,

rank (A) + null (A) = n.
Proof. All that is left to show is that null (A) equals the number of columns without pivots in the row-echelon form. This is clear, since we have as many free variables in the corresponding linear system as there are columns without pivots in the row-echelon form of A.
Example 3.17. Find the rank, nullity, a basis for the row space, a basis for the column space, and a basis for the nullspace of
the matrix
A = [ 2 1 1 ]
    [ 3 2 2 ]
    [ 1 1 1 ]

Solution: The reduced row-echelon form of A is

H = [ 1 0 0 ]
    [ 0 1 1 ]
    [ 0 0 0 ]
Then
rank (A) = 2, and null (A) = 1.
A basis for the column space is given by the pivot columns of A:

B1 = { (2, 3, 1)^t, (1, 2, 1)^t }.

To find a basis of the row space we use the rows of H which contain pivots. So we have

B2 = { (1, 0, 0), (0, 1, 1) }.
To find a basis for the nullspace we solve Hx = 0. Here x1 = 0 and x2 = −x3 with x3 free, so

x = [  0  ]        [  0 ]
    [ −x3 ] = x3 · [ −1 ]
    [  x3 ]        [  1 ]
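The whole computation can be cross-checked with SymPy (an editor's tooling choice, not part of the text): rank, nullity, and the nullspace come out as above, with rank(A) + null(A) = 3.

```python
# Cross-checking Example 3.17 with SymPy (assumed tooling).
import sympy as sp

A = sp.Matrix([[2, 1, 1],
               [3, 2, 2],
               [1, 1, 1]])

rank = A.rank()
null = len(A.nullspace())      # dimension of the nullspace
print(rank, null)              # 2 1
print(rank + null == A.cols)   # True: the Rank-Nullity Theorem
```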
3.3.1 Finding a basis for the row-space, column-space, and nullspace of a matrix.
Given an m × n matrix A, we would like to find bases for the spaces associated with it. We have the following algorithm: bring A to its reduced row-echelon form H; the nonzero rows of H give a basis for the row space, the columns of A corresponding to the pivot columns of H give a basis for the column space, and the solutions of Hx = 0 give the nullspace. Consider, for example, the matrix

A = [ 1  2 −1 3 ]
    [ 1  1  2 1 ]
    [ 2 −1  1 2 ]

with reduced row-echelon form

H = [ 1 0 0  3/2 ]
    [ 0 1 0  1/2 ]
    [ 0 0 1 −1/2 ]
The rank of A is rank (A) = 3 and null (A) = 1. Thus, there is one free variable, which we denote by x4. Solving Hx = 0 we have

x = [ −(3/2) x4 ]        [ −3/2 ]
    [ −(1/2) x4 ] = x4 · [ −1/2 ]
    [  (1/2) x4 ]        [  1/2 ]
    [     x4    ]        [   1  ]
Example 3.19. Find the rank and nullity, and a basis for the column space, the row space, and the nullspace of the matrix

A = [  4  2 3 3 ]
    [ −2  1 1 2 ]
    [  3 −1 2 1 ]

Solution: The reduced row-echelon form of A is

H = [ 1 0 0  −6/23 ]
    [ 0 1 0   9/23 ]
    [ 0 0 1  25/23 ]
Then,

rank (A) = 3,  null (A) = 1.

For the basis of the column space we take the pivot columns of A:

{ (4, −2, 3)^t, (2, 1, −1)^t, (3, 1, 2)^t }.
For the basis of the row space we take all three rows of A, since each one of them contains a pivot. Next we find a basis for the nullspace. Hence, we have to solve the system

Hx = 0.

Here x4 is free and x1 = (6/23) x4, x2 = −(9/23) x4, x3 = −(25/23) x4, so

x = x4 · ( 6/23, −9/23, −25/23, 1 )^t = t · ( 6, −9, −25, 23 )^t

for a free parameter t. Hence, a basis is

B = { (6, −9, −25, 23)^t }.
The next theorem relates some of the previous topic to this section.
Theorem 3.7. Let A be an n × n matrix. The following are equivalent:
i) Ax = b has a unique solution for every b ∈ Rn .
ii) A is row equivalent to In
iii) A is invertible
iv) The column vectors of A form a basis for Rn
Proof. Exercise
The following result is quite useful when checking for inverses.
Corollary 3.3. Let A be an n × n matrix. Then A is invertible if and only if
rank (A) = n.
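A quick numerical illustration of Corollary 3.3, using NumPy and two made-up matrices (both the tool and the matrices are the editor's choice):

```python
# A square matrix is invertible iff its rank equals its size (sketch).
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 7.0]])   # rank 2 -> invertible
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # rank 1 -> singular, inv(B) would fail

print(np.linalg.matrix_rank(A))  # 2
print(np.linalg.matrix_rank(B))  # 1
print(np.allclose(np.linalg.inv(A) @ A, np.eye(2)))  # True
```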
Exercises:
3.23. Find the rank, a basis for the row space, a basis for the column space, and a basis for the nullspace for the following matrices:

[ 2 3 ]   [ 2 1 ]   [ 1 1 1 ]   [ 1 2 3 ]   [ 2  3 ]   [ 3 4 5 ]
[ 1 1 ] , [ 0 1 ] , [ 1 2 3 ] , [ 4 5 6 ] , [ 1 −1 ] , [ 7 8 9 ]

3.24. Let A be a square matrix. Show that

null (A) = null (A^t).

3.25. Let A, B be matrices such that the product AB is defined. Show that

rank (AB) ≤ rank (A).

3.26. Give an example of two matrices A, B such that

rank (AB) < rank (A).

3.27. Let A be an m × n matrix. Prove that

rank (AA^t) = rank (A).

3.28. Let u and v be linearly independent column vectors in R³ and A an invertible 3 × 3 matrix. Prove that the vectors Au and Av are linearly independent.

3.29. Generalize the above problem to Rⁿ. Let u1, . . . , un be linearly independent column vectors in Rⁿ and A an invertible n × n matrix. Prove that the vectors Au1, . . . , Aun are linearly independent.

3.30. Let u and v be column vectors in R³ and A an invertible 3 × 3 matrix. Prove that if the vectors Au and Av are linearly independent, then u and v are linearly independent.

3.31. Generalize the above problem to Rⁿ. Let u1, . . . , un be column vectors in Rⁿ and A an invertible n × n matrix. Prove that if the vectors Au1, . . . , Aun are linearly independent, then u1, . . . , un are linearly independent.

3.32. Let

A = [ cos θ  − sin θ ]
    [ sin θ    cos θ ]

for some angle θ. Take any vector u ∈ R² and compare it with the vector Au. What happens geometrically?

3.33. Let A be as in the previous exercise and {u, v} a basis in R². Show that {Au, Av} is a basis for R². You might want to look at the nullspace of A.
Lemma 3.3. Let V be a finite dimensional vector space and U, W its subspaces. Then

dim(U + W) = dim U + dim W − dim(U ∩ W).
Proof. Exercise.
Theorem 3.8. Let U, W be subspaces of the vector space V. If V = U + W and U ∩ W = {0}, then
V = U ⊕ W.
Proof. Let v ∈ V with v = u + w for some u ∈ U and w ∈ W. To prove that V is a direct sum we must show that u and
w are uniquely determined. Assume that there exist u0 ∈ U and w0 ∈ W such that v = u0 + w0. Then,

v − v = (u − u0) + (w − w0) = 0,

hence
(u − u0 ) = (w0 − w) ∈ U ∩ W = {0}
Therefore,
u = u0 and w = w0
This completes the proof.
Theorem 3.9. Let V be a finite dimensional vector space over k and W a subspace of V. Then, there is a subspace U ⊂ V such
that
V = U⊕W
Proof. Let dim V = n and dim W = r where r < n. Let
B = {b1 , . . . , bn }
be a basis for V. Then we can pick r elements of B which form a basis for W, say b1 , . . . , br . Let
U := Span (br+1, . . . , bn).
u = x1 u1 + · · · + xr ur
w = y1 w1 + · · · + ys ws
v = x1 u1 + · · · + xr ur + y1 w1 + · · · + ys ws
v = v1 + · · · + vn , with vi ∈ Vi .
Exercise: Show that U × W with this addition and scalar multiplication is a vector space over k.
Definition 3.7. The vector space U × W is called the direct product of U and W.
More generally, the direct product V1 × · · · × Vn is the set of n-tuples where addition and scalar multiplication are defined coordinate-wise.
Exercises:

3.34. Let V = R² and W be the subspace generated by w = (1, 3)^t. Let U be the subspace generated by u = (1, 1)^t. Show that V is the direct sum of W and U. Can you generalize this to any two vectors u and w?

3.35. Let V = R³. Let W be the space generated by w = (1, 0, 0)^t and let U be the subspace generated by u1 = (2, 1, 0)^t and u2 = (0, 1, 1)^t. Show that V is the direct sum of W and U.

3.36. Let u and v be two nonzero vectors in R². If there is no c ∈ R such that u = cv, show that {u, v} is a basis of R² and that R² is a direct sum of the subspaces U = Span (u) and V = Span (v).
3.37. Let U and W be subspaces of V. What are U + U, U + V? Is U + W = W + U?

3.38. Let U, W be subspaces of a vector space V. Show that

dim U + dim W = dim(U + W) + dim(U ∩ W)

3.39. Let k be a field, V = Mat2×2 (k),

U := { [ a  b ]  | a, b ∈ k }
       [ −b a ]

and

W := { [ a  b ]  | a, b ∈ k }.
       [ b −a ]

Show that:
i) U and W are subspaces of V.
ii) V = U ⊕ W

3.40. Let U and W be subspaces of a vector space V.
i) Show that U ∩ W ⊂ U ∪ W ⊂ U + W.
ii) When is U ∪ W a subspace of V?
iii) What is the smallest subspace of V containing U ∪ W?

3.41. Let V be a vector space over k and S the set of all subspaces of V. Consider the operation of subspace addition in S. Show that there is a zero in S for this operation and that the operation is associative.

3.42. Let V be a vector space over k and S the set of all subspaces of V. Consider the operation of intersection in S. Show that this operation is associative. Is there an identity for this operation (i.e., is there an E ∈ S such that A ∩ E = A for all A in S)?
Review exercises
3.43. Define the following: vector space over a field k, subspace, nullspace, direct sum, direct product.

3.44. A square matrix is called upper triangular if all entries below the main diagonal are zero. Let V = Matn×n (R) and W the set of all upper triangular matrices of V. Is W a subspace of V? Justify your answer.

3.45. A square matrix is called lower triangular if all entries above the main diagonal are zero. Let V = Matn×n (R), W1 be the set of all upper triangular matrices of V, and W2 be the set of all lower triangular matrices of V. What is the intersection W1 ∩ W2?

3.46. Let A be an n × n invertible matrix. What is rank (A), null (A)? What is the reduced row-echelon form of A?

3.47. Let A be an invertible 3 by 3 matrix. Prove that B := {u, v, w} is a basis for R³ if and only if

3.49. Find a basis for the subspace W = Span (w1, w2, w3, w4) in R⁴ where w1, . . . , w4 are given as below:

w1 = (1, 0, 3, 1)^t,  w2 = (−1, 3, 1, 5)^t,  w3 = (1, 4, 2, 1)^t,  w4 = (3, 0, 1, 5)^t.   (3.12)

3.50. Let V = R², W = Span ( (2, 3)^t ), and U = Span ( (−1, 1)^t ). Show that V is the direct sum of W and U.

3.51. Let B := {u, v, w} such that

u = (1, 2, 3)^t,  v = (1, −1, 1)^t,  w = (1, 3, 1)^t.
Chapter 4
Linear Transformations
Let us consider again one of the questions raised in Chapter 1, more specifically in Section 1.4: what kind of transformations of Rⁿ preserve most (or all) of the geometric properties of objects and at the same time preserve the algebraic structure of Rⁿ?

There are two algebraic operations in Rⁿ, namely vector addition and scalar multiplication. What should a map that preserves both of these operations look like? Do such maps preserve the geometric properties of objects?
For any linear map T : V → W we have:
i) T(0_V) = 0_W.
ii) For every v ∈ V, T(−v) = −T(v).
Let T : V → W be a linear map. Then:
i) ker(T) is a subspace of V.
ii) Img(T) is a subspace of W.
Proof. Exercise
The following lemma is helpful in checking whether a linear map is injective or not.
Lemma 4.3. Let
T:V→W
be a linear map. Then ker(T) = {0V } if and only if T is injective.
Proof. Assume that ker(T) = {0V }. Then, for every v1 , v2 ∈ V such that T(v1 ) = T(v2 ) we have
T(v1 ) − T(v2 ) = 0 =⇒ T(v1 − v2 ) = 0 =⇒ (v1 − v2 ) ∈ ker(T)
which means that
v1 − v2 = 0 =⇒ v1 = v2
Conversely, assume that T is injective and let v ∈ ker(T). Then T(v) = T(0_V) = 0_W implies that v = 0_V.
Example 4.3. Let C(R) denote the vector space of all differentiable functions f : R → R. Consider the map

D : C(R) → C(R),  f (x) → D( f (x)) = f ′(x),

where f ′(x) is the derivative of f (x). Show that D is a linear map.
Theorem 4.1. Let T : V → W be an injective linear map. If v1 , . . . vn are linearly independent elements in V, then
T(v1 ), . . . , T(vn ) are linearly independent elements in W.
Proof. Let
y1 T(v1 ) + · · · + yn T(vn ) = 0W
for y1 , . . . , yn scalars. Then
T(y1 v1 ) + · · · + T(yn vn ) = 0W
which implies that
T(y1 v1 + · · · + yn vn ) = 0W
Since T is injective then ker(T) = {0} and
y1 v1 + · · · + yn vn = 0V
This implies that
y1 = · · · = yn = 0
since v1 , . . . , vn are linearly independent. Thus T(v1 ), . . . , T(vn ) are linearly independent elements in W.
We will accept the following theorems without proof.
Theorem 4.3. Let T : V → W be a linear map and dim V = dim W. If ker(T) = {0} or Img(T) = W, then T is bijective.
LA : R3 −→ R3
(4.2)
x −→ A · x
Solution: We first determine ker(LA). More precisely, we want to find all x ∈ R³ such that

LA(x) = Ax = 0.

Hence ker(LA) is the same as the nullspace of A. To find the nullspace we proceed as before. The reduced row-echelon form is
H = [ 1 0 0 ]
    [ 0 1 0 ]
    [ 0 0 1 ]
Hence rank (A) = 3, null (A) = 0 and nullspace of A is {0}. Thus, ker(LA ) = {0} and LA is injective. From the previous theorem
we conclude that LA is bijective.
Remark 4.1 (Notation). It is important to emphasize the notation used in the literature for linear maps. If L : Rⁿ → Rᵐ is a linear map, then implicitly Rⁿ and Rᵐ are vector spaces, and their elements are vectors. Therefore the notation

L (x1, . . . , xn)^t = (y1, y2, . . . , ym)^t

with column vectors is natural.
If L : Rn 7→ Rm is considered as a map among sets Rn , Rm , then the notation
L(x1 , . . . , xn ) = (y1 , y2 , . . . , ym )
must be used. Both notations are used in the literature. We will stick to the column vectors notation whenever possible.
Consider the composition of linear maps

U —f→ V —g→ W.

Also,

(g ◦ f )(r · u) = g( f (r · u)) = g(r · f (u)) = r · (g ◦ f )(u).
Example 4.5. Let A and B be matrices of dimensions n × m and s × n respectively, and let LA, LB be the linear maps

Rᵐ —LA→ Rⁿ —LB→ Rˢ.

Then

LB ◦ LA : Rᵐ −→ Rˢ,  x −→ (BA) x.
Exercises:
4.1. Let T : R → R such that T(x) = sin x. Is T an isomor- the interval [0, 1]. Check whether the map
phism? Explain.
φ : L([0, 1], R) −→ L(R)
Z 1
4.2. Let A = [ai j ] be an n × n matrix and tr(A) denote its trace. f (x) −→ f (x) dx
Show that the map 0
is a linear map.
4.7. Can you find two vector spaces of the same finite dimension which are not isomorphic? Explain.

4.8. We know that C is a vector space over R. Define the map T : C → C such that T(z) = z̄, where z̄ is the complex conjugate of z. Is T a linear map?

4.10. Let T : C → C such that

T(z) = 1/z for z ≠ 0,  and  T(0) = 0.

Is T a linear map?
which is called the matrix associated with the linear map L with respect to the bases B1 and B2, and is denoted by M_{B1}^{B2}(L).
Indeed, every vector x ∈ U is written as
x = x1 u1 + · · · + xn un
where x1 , . . . , xn are scalars in k. Hence,
L(x) = x1 L(u1) + · · · + xn L(un)
     = x1 (a1,1 v1 + · · · + a1,m vm)
     + x2 (a2,1 v1 + · · · + a2,m vm)
     + · · ·
     + xn (an,1 v1 + · · · + an,m vm)
Thus, the coordinates of the vector L(x) with respect to the basis B2 of V are

[ a1,1 a2,1 · · · an,1 ]   [ x1 ]
[ a1,2 a2,2 · · · an,2 ] · [ x2 ]
[  ···  ···       ···  ]   [ ·  ]
[ a1,m a2,m · · · an,m ]   [ xn ]

 = A^t x = M_{B1}^{B2}(L) x.

Thus for any linear map L : U → V there is a matrix M_{B1}^{B2}(L) with respect to the bases B1 and B2 such that

L(x) = M_{B1}^{B2}(L) x.

We normally write M_{B1}^{B2}(L) in the following way:

M_{B1}^{B2}(L) = [ L(u1)_{B2} | · · · | L(un)_{B2} ]

where each L(ui)_{B2} is the column vector L(ui) with respect to the basis B2 of V.

Example 4.6. Let L : R² → R³ be given by

L (x, y)^t = ( x − y, 2x − 3y, x − 3y )^t.

Find the matrix associated with L with respect to the standard bases.
Then

L(i) = (1, 2, 1)^t,  L( j) = (−1, −3, −3)^t,

and the associated matrix is

[ 1 −1 ]
[ 2 −3 ]
[ 1 −3 ]
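The column-by-column recipe is easy to automate. This NumPy sketch (the function name and tool are the editor's, not the text's) rebuilds the matrix of L from the images of the standard basis vectors:

```python
# Building the matrix of L(x, y) = (x - y, 2x - 3y, x - 3y):
# column j of the matrix is the image of the j-th standard basis vector.
import numpy as np

def L(v):
    x, y = v
    return np.array([x - y, 2 * x - 3 * y, x - 3 * y])

M = np.column_stack([L(e) for e in np.eye(2)])
print(M)                         # rows: (1, -1), (2, -3), (1, -3)

v = np.array([5.0, -2.0])
print(np.allclose(M @ v, L(v)))  # True: M reproduces L on any vector
```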
Example 4.7. Find the matrix associated with the rotation by an angle θ,

f : R² → R²,  (x, y)^t → ( x cos θ − y sin θ, x sin θ + y cos θ )^t.
Solution: We have

f (1, 0)^t = (cos θ, sin θ)^t  and  f (0, 1)^t = (− sin θ, cos θ)^t.

Then, the associated matrix is:

A := M( f ) = [ cos θ  − sin θ ]
              [ sin θ    cos θ ]
We have seen in the exercises of Chapter 1 that

Aⁿ = [ cos nθ  − sin nθ ]
     [ sin nθ    cos nθ ]
Indeed this is to be expected since rotating n-times by θ is the same as rotating by the angle nθ.
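This identity is easy to test numerically. The sketch below (using NumPy, with sample values chosen by the editor) compares the n-th power of the rotation matrix with the rotation by nθ:

```python
# Checking A^n = rotation by n*theta for sample values (assumed tooling).
import numpy as np

def rot(t):
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

theta, n = 0.3, 5
print(np.allclose(np.linalg.matrix_power(rot(theta), n), rot(n * theta)))  # True
```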
We now see an example when neither of the bases B1 , B2 is a standard basis.
Example 4.8. Let

T : R³ −→ R⁴,  (x, y, z)^t −→ ( x + y, y + z, x − y, y − z )^t.
Now we need to express the vectors w1 , w2 , w3 with respect to the basis B2 . Each one of them must be expressed as
Theorem 4.6. Let U and V be vector spaces over k and B1, B2 their respective bases. For any f, g ∈ L(U, V) and r ∈ k the following hold:
i) M( f + g) = M( f ) + M(g)
ii) M(r f ) = r · M( f )
iii) M( f ◦ g) = M( f ) · M(g)

The map

Φ : L(U, V) → Matm×n (k),  f → M( f ),

is an isomorphism.
Proof. The previous theorem shows that Φ is a linear map. First we show that Φ is injective. Let f, g ∈ L(U, V) such
that Φ( f ) = Φ(g). Thus, M( f ) = M(g). Hence, for every x ∈ U we have
M( f ) x = M(g) x
LA : U −→ V
(4.4)
x −→ A x
Exercises:
4.11. Check whether the map T : R³ → R³ such that

T(x, y, z) = (x + 2, y − x, x + y)

is linear. If it is linear, then find its associated matrix.

4.12. Find the matrix associated with respect to the standard bases to the map T : R³ → R³ such that

T(x, y, z) = (x, y, x + y + z)

4.13. Find the matrix associated with respect to the standard bases to the map T : R² → R³ such that

T(x, y) = (x + y, 3y, 7x + 2y)

4.14. Find the matrix associated with respect to the standard bases to the map T : R⁵ → R⁵ such that

T(x1, . . . , x5) = (x1, x2, x3, x4, x5)

4.15. Let L1(R) be the vector space of differentiable functions from R to R. Let

V := Span (sin x, cos x)

and D : L1(R) → L1(R) the differentiation map. The restriction of this map to V gives a linear map DV : V → V. Find the matrix representation of DV for B1 = B2 = {sin x, cos x}.

4.16. Let Pn denote the vector space over R of polynomials with coefficients in R and degree ≤ n. Differentiation of polynomials is a linear map on this space. Find its matrix representation for

B1 = B2 = {1, x, . . . , x^n}.

4.17. Let u = (1, 2)^t ∈ R² and T : R² → R² such that T(x) = u + x. Find the matrix representation of T with respect to the standard basis of R².

4.18. Let T : R² → R² be the transformation which rotates every point counterclockwise by the angle θ. Find its matrix representation with respect to the standard basis.

4.19. Let T : R² → R² be the transformation of the plane which sends every point to its symmetric point with respect to the x-axis (i.e., T(x, y) = (x, −y)). Find the matrix representation of T with respect to the standard basis.

4.20. Find the standard matrix representation for the reflection of the xy-plane with respect to the line y = x + 2.
Algorithm 5.
Input: A vector space V and two bases B1 = {u1, . . . , un} and B2 = {v1, . . . , vn} of V.
Output: The transformation matrix M_{B1}^{B2} such that

M_{B1}^{B2} · v_{B1} = v_{B2}
u = (3, 4)^t  and  v = (−2, 3)^t.

Then

M_{B1}^{B2} = (1/3) · [  2  1 ]
                      [ −1 −2 ]

and

u_{B2} = M_{B1}^{B2} · (3, 4)^t = (1/3) (10, −11)^t,   v_{B2} = M_{B1}^{B2} · (−2, 3)^t = −(1/3) (1, 4)^t.
Example 4.10. Let u ∈ R³ with coordinates in the standard basis u = (1, 2, 3)^t. Find the coordinates of u with respect to the basis

B′ = { (1, 1, 1)^t, (2, 0, 1)^t, (3, 1, 1)^t }.

Solution: We row-reduce the augmented matrix

A = [ 1 2 3 | 1 0 0 ]
    [ 1 0 1 | 0 1 0 ]
    [ 1 1 1 | 0 0 1 ]
and

M · u = ( 7/2, 1, −3/2 )^t.
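The same coordinates can be obtained by solving B′c = u directly. This SymPy sketch (an editor's tooling choice, not part of the text) reproduces the answer exactly:

```python
# Example 4.10: the B'-coordinates c of u solve B' c = u.
import sympy as sp

Bp = sp.Matrix([[1, 2, 3],
                [1, 0, 1],
                [1, 1, 1]])   # columns are the basis vectors of B'
u = sp.Matrix([1, 2, 3])

c = Bp.LUsolve(u)
print(c.T)                    # coordinates 7/2, 1, -3/2
```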
Let V be a finite dimensional vector space and B and B′ be bases of V. Let L : V → V be a linear transformation
and M_B and M_{B′} be the associated matrices for L with respect to the bases B and B′ respectively. Then we have the
following theorem.

Theorem 4.8. Let M := M_B^{B′} be the transformation matrix from B to B′. Then,

M_{B′} = M · M_B · M⁻¹.
Proof. Exercise.
Example 4.11. Find the associated matrix for the linear map
T : R3 −→ R4
such that
T(x, y, z) = (x − y + 2z, y + z, 3x − 2y − z, 7y + z)
and find a basis for ker(T).
Solution: We have

T (1, 0, 0)^t = (1, 0, 3, 0)^t,  T (0, 1, 0)^t = (−1, 1, −2, 7)^t,  T (0, 0, 1)^t = (2, 1, −1, 1)^t,   (4.6)

so the associated matrix is

[ 1 −1  2 ]
[ 0  1  1 ]
[ 3 −2 −1 ]
[ 0  7  1 ]
The system

H(T) x = 0

has only the trivial solution x = 0; therefore ker(T) = {0}.
Exercises:
4.21. Let B1 = {1, x, x², x³} be a basis for P3. Show that

B2 = {2x − 1, x² − x + 1, x³ − x, −2}

is also a basis. Find the transformation matrix from B1 to B2.

4.22. Let V := Span (e^x, e^{−x}). Find the coordinates of

f (x) = sinh x,  g(x) = cosh x

with respect to B = {e^x, e^{−x}}.

4.23. Let V := Span (e^x, x e^x). Find the transformation matrix
4.5.2 Rotations
We have already seen what a rotation by an angle θ counterclockwise around the origin does: it is given by

(x, y)^t → [ cos θ  − sin θ ] (x, y)^t,
           [ sin θ    cos θ ]

which equivalently says that it is given by a matrix

[ a  −b ]
[ b   a ]

with a² + b² = 1.

A rotation combined with a scaling has the matrix form

(x, y)^t → r [ cos θ  − sin θ ] (x, y)^t.
             [ sin θ    cos θ ]
Hence we have:

Lemma 4.4. A matrix of the form

[ a  −b ]
[ b   a ]

represents a rotation by θ combined with a scaling by r > 0, where (r, θ) are the polar coordinates of the vector (a, b)^t.
4.5.3 Shears

A horizontal shear is given by the matrix

[ 1 r ]
[ 0 1 ]

and a vertical shear by the matrix

[ 1 0 ]
[ k 1 ]
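A small NumPy check (the shear factor r = 2 is chosen by the editor for illustration) of what a horizontal shear does to the standard basis vectors:

```python
# A horizontal shear fixes the x-axis and slides points horizontally by r*y.
import numpy as np

r = 2.0
S = np.array([[1.0, r],
              [0.0, 1.0]])

print(S @ np.array([1.0, 0.0]))  # the x-axis vector is fixed
print(S @ np.array([0.0, 1.0]))  # (0, 1) slides right to (r, 1)
```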
4.5.4 Projections
Let us now consider a problem we have already seen in Fig. 4.1: finding the projection of a vector v onto a vector u. We already know the formula for proj_u (v). Is this a linear map? If so, can we find its matrix?
In the above discussion it was not necessary to assume that the vector u is a unit vector. The student should prove the
following lemma.

Lemma 4.5. For any given vector w = (w1, w2)^t the projection map

x → proj_w (x)

is linear.
4.5.5 Reflections
We continue our discussion of the previous section but now with the goal of finding the symmetric point of B with
respect to the line AC. First we consider the case when the point A is the point (0, 0) in R2 . So the problem is the
same as before but now we want to find the vector refu v as in Fig. 4.2.
Consider vectors u and v in R² as in Fig. 4.1. The reflection vector of v with respect to u, denoted by ref_u v, is the vector obtained by reflecting v with respect to the line determined by u. Writing w = v − proj_u (v), we have

ref_u v = proj_u (v) − w = proj_u (v) − ( v − proj_u (v) )
        = 2 proj_u (v) − v = 2Pv − v = (2P − I2) v.

Let us express this in terms of the coordinates of x = (x1, x2)^t when the unit vector u = (u1, u2)^t is given. The matrix of the reflection is

S = 2P − I2 = [ 2u1² − 1   2u1 u2  ]
              [ 2u1 u2    2u2² − 1 ]

[Figure 4.2: Reflection of v with respect to u.]

Consider now, as in the case of projections, the line L with equation y = ax. Then the unit vector u is given by Eq. (4.7). Thus the matrix S becomes

S = 2P − I2 = 1/(a² + 1) [ 1 − a²   2a     ]
                         [ 2a       a² − 1 ]
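Two sanity checks one can run on S (here with NumPy and a sample slope a = 2, both the editor's choices): S fixes a direction vector of the line, and reflecting twice gives the identity.

```python
# The reflection matrix across L: y = ax should fix (1, a) and square to I.
import numpy as np

a = 2.0
S = np.array([[1 - a**2, 2 * a],
              [2 * a, a**2 - 1]]) / (a**2 + 1)

print(np.allclose(S @ np.array([1.0, a]), [1.0, a]))  # True: L is fixed
print(np.allclose(S @ S, np.eye(2)))                  # True: S is an involution
```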
Lemma 4.6. The reflection with respect to a line L through the origin with equation L : y = ax is a linear map with matrix S as above.

Consider now a general line

y = ax + b.   (4.8)

Consider the map T : R² → R² which takes every point P(x, y) to its reflection P′. Determine explicit formulas
for this map and check whether it is linear. The following is a high school problem in analytic geometry.
Lemma 4.7. The reflection map ref_L x with respect to a general line L : y = ax + b is given by the formula

(x, y)^t → 1/(a² + 1) ( (1 − a²) x + 2ay − 2ab,  2ax + (a² − 1) y + 2b )^t.
Proof. Let P(x1, y1) be a given point. The line L′ going through P and perpendicular to L has equation

y = −(1/a) x + y1 + x1/a.   (4.9)

The x-coordinate of the point of intersection of L and L′ satisfies

ax + b = −(1/a) x + y1 + x1/a.

So we have

x = ( x1 + a y1 − ab ) / ( a² + 1 ).
If we denote by Q(x2, y2) the reflection point, then x = (x1 + x2)/2. Therefore,
Consider a plane P through the origin with equation

ax + by + cz = 0.

Find the formulas for the reflection map ref_P x with respect to the plane P. Show that this is a linear map. Find its matrix.

Proof. Exercise.
Now that we know that the reflection with respect to a plane through the origin is a linear map, let us take another
look at the case of a general line in the plane. Can we somehow consider Eq. (4.8) as a plane in R³? Or can we do
even better and find some kind of space in which all lines pass through the origin?
Exercises:
4.25. Let L be a line in R³ containing the unit vector u = (u1, u2, u3)^t. Find the matrix of the linear transformation T(x) = proj_L (x). What is the trace of this matrix?

4.26. Let L be a line in R³ containing the unit vector u = (u1, u2, u3)^t. Find the matrix of the linear transformation T(x) = ref_L x.

Find the associated matrix for the linear map

T(x, y, z) = (x − 2y, y − x, x + y)
4.32. Find the associated matrix for the linear map T : R⁴ → R⁴ such that

T(x, y, z, w) = (x − y + z, 2x − 2y + 2z, x + y − z − w, 2x − w)
Chapter 5

Determinants
The theory of determinants was developed in the 17th and 18th centuries. It started mainly with Cramer and
continued further with Bezout, Vandermonde, Laplace, Cauchy, et al. With the development of modern algebra
and the new concepts that came with it, such as multilinear forms, permutation groups, etc., the concept of the
determinant was put on a firm foundation.
5.1 Determinants
In this section we define the determinant of a matrix. The proper way to do that would be via alternating forms and
permutations, but that might be a bit ambitious for this course. Instead we proceed with the more computational
approach. For the interested reader we provide a complete treatment of determinants in the appendix.
Definition 5.1. Let A = [aij] be an n × n matrix. For each (i, j) let Aij be the (n − 1) × (n − 1) matrix obtained by deleting the i-th row and j-th column of A. Then Aij is called a minor of A, and

    āi,j := (−1)^(i+j) · det(Aij)

is called a cofactor of A.
Definition 5.2. Let A = [aij] be an n × n matrix. Then, for a fixed i = 1, . . . , n, the determinant of A is defined to be:

    det(A) := Σ_{j=1}^{n} (−1)^(i+j) · ai,j · det(Aij) = Σ_{j=1}^{n} ai,j · āi,j
and is denoted by

    det(A) = | a1,1  a1,2  a1,3  . . .  a1,n |
             | a2,1  a2,2  a2,3  . . .  a2,n |
             | a3,1  a3,2  a3,3  . . .  a3,n |
             |  .     .     .            .   |
             | an,1  an,2  an,3  . . .  an,n |
Example 5.1. Let A be a 2 × 2 matrix

    A = [ a  b ]
        [ c  d ]

Then its determinant is

    det(A) = | a  b | = ad − bc.
             | c  d |
For a 3 × 3 matrix A = [ai,j], expansion gives

    det(A) = a1,1 a2,2 a3,3 + a1,2 a2,3 a3,1 + a2,1 a3,2 a1,3 − a3,1 a2,2 a1,3 − a3,2 a2,3 a1,1 − a2,1 a1,2 a3,3

In many textbooks of elementary linear algebra the following technique is given for remembering the determinant of a 3 by 3 matrix: copy the first two columns to the right of the matrix. The three downward diagonals represent products with coefficient +1 and the three upward diagonals represent products with coefficient −1:

    a1,1  a1,2  a1,3 | a1,1  a1,2
    a2,1  a2,2  a2,3 | a2,1  a2,2
    a3,1  a3,2  a3,3 | a3,1  a3,2
Definition 5.3. The definition of the determinant as above is called the expansion by minors along the i-th row.
First we have to show what is already claimed in the definition: that the choice of the row does not change the determinant. We skip the proof of the theorem; for a complete proof a more precise definition of the determinant is needed, as given in Appendix B.
Theorem 5.1. Expansion along any row or column does not change the determinant.
The above theorem allows us to pick the row or column with the most zeroes when we compute the determinant of a matrix.
Example 5.3. Find the determinant of the matrix

    A = [ 1  2  0  4  0 ]
        [ 0  2  0  0  1 ]
        [ 2  1  2  1  2 ]
        [ 1  1  2  4  5 ]
        [ 0  2  1  2  0 ]

Solution: Since the second row has three zeroes we expand along that row. So we have

    det(A) = 2 · | 1  0  4  0 |  −  1 · | 1  2  0  4 |
                 | 2  2  1  2 |         | 2  1  2  1 |
                 | 1  2  4  5 |         | 1  1  2  4 |
                 | 0  1  2  0 |         | 0  2  1  2 |

We let

    A1 := [ 1  0  4  0 ]     A2 := [ 1  2  0  4 ]
          [ 2  2  1  2 ]           [ 2  1  2  1 ]
          [ 1  2  4  5 ]           [ 1  1  2  4 ]
          [ 0  1  2  0 ]           [ 0  2  1  2 ]

Then

    det(A1) = 1 · | 2  1  2 |  +  4 · | 2  2  2 |
                  | 2  4  5 |         | 1  2  5 |
                  | 1  2  0 |         | 0  1  0 |
            = (5 + 8 − 8 − 20) + 4 (2 − 2 · 5) = −15 − 32 = −47        (5.1)

    det(A2) = 1 · | 1  2  1 |  −  2 · | 2  2  1 |  −  4 · | 2  1  2 |
                  | 1  2  4 |         | 1  2  4 |         | 1  1  2 |
                  | 2  1  2 |         | 0  1  2 |         | 0  2  1 |
            = (4 + 16 + 1 − 4 − 4 − 4) − 2 (8 + 1 − 4 − 8) − 4 (2 + 4 − 8 − 1)
            = 9 − 2 · (−3) − 4 · (−3) = 27

Hence,

    det(A) = 2 · (−47) − 27 = −121
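The expansion by minors translates directly into a short recursive program. The following is an illustrative sketch (not the text's code) that reproduces the worked answer above.

```python
# Determinant by cofactor expansion along the first row,
# exactly as in Definition 5.2 (a sketch for checking hand computations).

def det(M):
    """Determinant of a square matrix given as a list of lists."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

A = [[1, 2, 0, 4, 0],
     [0, 2, 0, 0, 1],
     [2, 1, 2, 1, 2],
     [1, 1, 2, 4, 5],
     [0, 2, 1, 2, 0]]
print(det(A))  # -121, matching the expansion above
```

Note that this is O(n!) and only practical for small matrices; row reduction (discussed below) is far faster.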
Lemma 5.1. det(A) = det(A^T).

Proof. Let A = [aij] be given. We prove the lemma by induction on the size n. For n = 1 the proof is trivial. Assume that the lemma holds for all matrices of size n < r. We want to show that it holds for n = r. Expanding along the first row, the determinant of A is

    det(A) = a11 |A11| − a12 |A12| + · · · + (−1)^(1+r) a1r |A1r|.

Denote B := A^T. Expanding along the first column,

    det(B) = b11 |B11| − b21 |B21| + · · · + (−1)^(r+1) br1 |Br1|.

However, a1j = bj1 and Bj1 = (A1j)^T. By the induction hypothesis we have |A1j| = |Bj1|. Hence det(A) = det(B) = det(A^T).
Remark 5.1. The determinant of a triangular matrix is the product of its diagonal entries. We illustrate with an upper triangular matrix.

Solution: We find the determinant by expanding along the first column. It is obvious that

    det(A) = Π_{i=1}^{n} ai,i.
We now see some properties of determinants.
Lemma 5.2. Let A be an n × n matrix. The row operations have the following effect on the determinant:

i) Interchanging two rows (Ri ←→ Rj):

    det(A′) = − det(A)

ii) Multiplying a row by a constant r (Ri −→ r · Ri):

    det(A′) = r · det(A)

iii) Adding a multiple of one row to another (Ri −→ Ri + r · Rj):

    det(A′) = det(A)

Proof. i) We proceed by induction. The proof for n = 2 is trivial. Assume that the property holds for all matrices of size smaller than n. Let B denote the matrix obtained after performing the operation Ri ←→ Rj on A. Compute the determinant by expansion along the s-th row, where s ≠ i and s ≠ j; each cofactor changes sign by the induction hypothesis, so det(B) = − det(A).

iii) By linearity in the i-th row, det(B) = det(A) + r · det(C), where C is obtained by replacing the i-th row of A with its j-th row. Since C has two equal rows, det(C) = 0 and det(B) = det(A).
Let H be a row-echelon form of A. By Lemma 5.2,

    det(A) = r · det(H)

for some constant r ≠ 0. The matrix A is invertible if and only if H has pivots in every row. Since H is triangular, its determinant is the product of these pivots. Hence, A is invertible if and only if det(H) ≠ 0. Therefore, A is invertible if and only if det(A) ≠ 0.
Proof. Exercise.
Proof. First we assume that A is diagonal. Then, to obtain the matrix AB, the i-th row of B is multiplied by Ai,i. Hence, det(AB) = det(A) · det(B) in this case.

Without loss of generality assume that A is invertible (otherwise the theorem follows from the above lemma). Then A can be converted to a diagonal matrix D by row operations, where no multiplying by constants is allowed. Thus, D = EA, where E is a product of elementary matrices corresponding to row interchanges and row additions. Hence, det(A) = (−1)^r · det(D) for some r. Then,

    E(AB) = (EA)B = DB.

Hence, we have

    det(AB) = (−1)^r · det(DB) = (−1)^r · det(D) · det(B) = det(A) · det(B).

This completes the proof.
    A := [ 1   0   0  0 ]       B := [ 3   0   0   0 ]
         [ 2   2   0  0 ]            [ 2   1   0   0 ]
         [ 9   2   4  0 ]            [ 21  -7  2   0 ]
         [ 12  10  2  5 ]            [ 13  2   31  2 ]

Solution: Since both are triangular matrices and det(AB) = det(A) · det(B), we have

    det(AB) = (1 · 2 · 4 · 5) · (3 · 1 · 2 · 2) = 480
1) Reduce A to row-echelon form using only row additions and row interchanges.
2) If during the procedure one of the rows becomes all zeroes, then

    det(A) = 0,

otherwise

    det(A) = (−1)^r · Π_{i=1}^{n} pi

where the pi's are the pivots and r is the number of row interchanges performed.
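The two-step recipe above can be written out as a program. The sketch below (our own, not the text's) uses exact fractions so that the pivots are computed without rounding error.

```python
# Determinant via the row-reduction recipe: only row interchanges and row
# additions are used, then the pivots are multiplied and the sign adjusted
# by the number of swaps. Illustrative sketch with exact arithmetic.

from fractions import Fraction

def det_by_elimination(M):
    A = [[Fraction(x) for x in row] for row in M]
    n, swaps, prod = len(A), 0, Fraction(1)
    for col in range(n):
        # Find a pivot row at or below the diagonal.
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # a row of zeroes appears: det(A) = 0
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            swaps += 1                  # each interchange flips the sign
        for r in range(col + 1, n):     # row additions leave det unchanged
            factor = A[r][col] / A[col][col]
            A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
        prod *= A[col][col]
    return (-1) ** swaps * prod

print(det_by_elimination([[1, 2], [5, 4]]))   # -6
```

Unlike cofactor expansion, this runs in O(n^3) time.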
    φ(v1, . . . , vi, vi+1, . . . , vn) = 0

whenever vi = vi+1, and is called symmetric if interchanging any two coordinates does not change the value of the function.
Exercise: Show that a 2-multi-linear map is a bilinear map.
Proposition 5.1. Let φ be an n-multilinear alternating function on V. Then,
1) the value of φ on an n-tuple is negated if two adjacent components are interchanged;
2) for each σ ∈ Sn,

    φ(vσ(1), . . . , vσ(n)) = ε(σ) φ(v1, . . . , vn);

3) if vi = vj for any i ≠ j, then φ(v1, . . . , vn) = 0;
4) if vi is replaced by vi + α vj in (v1, . . . , vn), for any i ≠ j and α ∈ k, then the value of φ on this tuple is not changed.
Proposition 5.2. Assume that φ is an n-multilinear alternating function on V and that for some v1, . . . , vn ∈ V, w1, . . . , wn ∈ V we have

    w1 = a11 v1 + · · · + an1 vn
    . . .
    wn = a1n v1 + · · · + ann vn

Then,

    φ(w1, . . . , wn) = Σ_{σ∈Sn} ε(σ) aσ(1)1 · · · aσ(n)n · φ(v1, . . . , vn)
A determinant function on k is a function det : Matn×n(k) → k that satisfies:
1) it is an n-multilinear alternating form on k^n, where the n-tuples are the n columns (A1, . . . , An) of a matrix A;
2) det(I) = 1.

Theorem 5.4. There is a unique n × n determinant function on k, and it can be computed for any n × n matrix A = [aij] by

    det(A) = Σ_{σ∈Sn} ε(σ) aσ(1)1 · · · aσ(n)n.
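The permutation-sum formula of Theorem 5.4 can also be evaluated directly, summing over all of Sn. The sketch below (ours, purely illustrative since it is O(n!)) computes ε(σ) from the cycle decomposition.

```python
# Leibniz formula: det(A) = sum over permutations of sign * product of entries
# a_{sigma(i), i}, as in Theorem 5.4. Illustrative only (O(n!)).

from itertools import permutations

def sign(perm):
    """Sign of a permutation of 0..n-1: each even-length cycle flips the sign."""
    s, seen = 1, [False] * len(perm)
    for i in range(len(perm)):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            if length % 2 == 0:
                s = -s
    return s

def prod_entries(A, p):
    out = 1
    for i, pi in enumerate(p):
        out *= A[pi][i]          # a_{sigma(i), i}, matching the formula above
    return out

def det_leibniz(A):
    n = len(A)
    return sum(sign(p) * prod_entries(A, p) for p in permutations(range(n)))

print(det_leibniz([[1, 2], [5, 4]]))   # -6
```

For n = 2 this reduces exactly to ad − bc, and for n = 3 to the six-term formula given earlier.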
Corollary 5.1. The determinant is an n-multi-linear function on the rows of Matn×n (k) and for any matrix A, det(A) = det(AT ).
Theorem 5.5. (Cramer's rule) If A1, . . . , An are the columns of a matrix A and

    b = b1 A1 + · · · + bn An

for bi ∈ k, then

    bj · det(A) = det(A1, . . . , Aj−1, b, Aj+1, . . . , An).
The theorem assures that the expansion by any row or column for a matrix A gives the same determinant. Thus,
our definition of the determinant in Chapter 3 is justified.
Exercises:
5.1. Let A be an (n × n) invertible matrix. Show that

    det(A−1) = 1 / det(A)

5.2. Find the determinants of

    A = [ 1  1  1 ]       B = [ 2  1   3 ]
        [ 1  1  1 ]           [ 2  -1  0 ]
        [ 2  0  1 ]           [ 4  0   3 ]

5.3. Find the determinants of

    A = [ 1  0  1 ]       B = [ 2   1   3 ]
        [ 0  1  0 ]           [ 2   -1  0 ]
        [ 2  0  1 ]           [ -1  0   5 ]

5.4. Find the determinants of

    A = [ 5  -1  0   2 ]      B = [ 5  2  0   2 ]
        [ 1  2   1   0 ]          [ 3  2  1   0 ]
        [ 3  1   -2  4 ]          [ 3  1  -2  4 ]
        [ 0  4   -1  2 ]          [ 2  4  -1  2 ]

and use the result to find det(A−1) and det(B−1).

5.5. Let A be a matrix such that det(A) ≠ 0. Does the system Ax = b have any solutions?

5.6. Let A be given as

    A = [ a  b ]
        [ c  d ]

What is the condition on a, b, c, d such that A has an inverse? Find the inverse.
5.7. Let C be an invertible matrix. Prove that

    det(A) = det(C−1 A C).

5.8. The determinant of an n × n matrix A is det(A) = 3. Find det(2A), det(−A), and det(A^3).

5.9. Let A be an n × n matrix. If every row of A adds to 0, prove that det(A) = 0.

5.10. Let A be an n × n matrix. If every row of A adds to 1, prove that det(A − I) = 0. Does this imply that det(A) = 0?
Definition 5.9. Let A be an n × n matrix. A scalar λ is called an eigenvalue of A if there exists a nonzero vector v such that

    Av = λv.

The vector v is called an eigenvector corresponding to λ.
    Av = λv  =⇒  (A − λI)v = 0

Thus, λ is an eigenvalue exactly when the homogeneous system

    (A − λI)x = 0

has a nontrivial solution. We know that this system has a nontrivial solution if and only if the determinant of the coefficient matrix is zero. Thus, we want to find λ such that

    det(A − λI) = 0.

Let A = [ai,j] be a given matrix. Then the above equation can be written as

    | a1,1 − λ  a1,2      . . .  a1,n     |
    | a2,1      a2,2 − λ  . . .  a2,n     |  =  0
    |  .         .                .       |
    | an,1      an,2      . . .  an,n − λ |
Remark 5.2. Recall from algebra that a polynomial of degree n can have at most n roots. Hence an n × n matrix can have at
most n eigenvalues. See Appendix B for more details on polynomials.
The multiplicity of an eigenvalue as a root of the characteristic polynomial is called the algebraic multiplicity
of the eigenvalue. For a fixed eigenvalue λ the corresponding eigenvectors are given by the solutions of the system
(A − λI)x = 0
Equivalently, this is the space we have called the nullspace of the coefficient matrix (A − λI).

Definition 5.10. If λ is an eigenvalue of A, the set

    Eλ := {v ∈ V | A v = λ v}

is called the eigenspace of A corresponding to λ. The dimension of the eigenspace is called the geometric multiplicity of the eigenvalue λ.
Remark 5.3. It can be shown that the geometric multiplicity is always ≤ to the algebraic multiplicity.
Finding the eigenvalues requires solving a polynomial equation which can be difficult for high degree polyno-
mials. Once the eigenvalues are found then we use the linear system
(A − λI)x = 0
to find the eigenvectors. We illustrate below.
Example 5.6. Find the characteristic polynomial and the eigenvalues of the matrix

    A = [ 1  2 ]
        [ 5  4 ]

Solution: We have

    char(A, λ) = det(A − λI) = | 1 − λ    2    |
                               |   5    4 − λ  |
               = (1 − λ)(4 − λ) − 5 · 2 = λ^2 − 5λ − 6 = (λ + 1)(λ − 6)

The eigenvalues are λ1 = −1 and λ2 = 6. Both of them have algebraic multiplicity 1.

If λ1 = −1 the system becomes:

    [ 2  2 ] x = 0
    [ 5  5 ]

and its solution is

    v1 = [ -1 ]
         [ 1  ]

Its eigenspace is

    Eλ1 = ⟨v1⟩.
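Example 5.6 can be cross-checked numerically; the following NumPy snippet is an illustration (not part of the text).

```python
# Numerical cross-check of Example 5.6 with NumPy.
import numpy as np

A = np.array([[1.0, 2.0],
              [5.0, 4.0]])
evals, evecs = np.linalg.eig(A)
print(np.sort(evals))            # eigenvalues -1 and 6

# Each column of evecs is a (normalized) eigenvector: check A v = lambda v.
for lam, v in zip(evals, evecs.T):
    assert np.allclose(A @ v, lam * v)
```

Note that `eig` returns normalized eigenvectors, so they are scalar multiples of the hand-computed v1 = (−1, 1)^T.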
Example 5.7. Find the eigenvalues and their multiplicities for the matrix

    A := [ 1  0  2  1  ]
         [ 2  1  0  -1 ]
         [ 0  0  2  0  ]
         [ 0  0  1  -2 ]

Solution: The characteristic polynomial factors as char(A, λ) = (1 − λ)^2 (2 − λ)(−2 − λ). Hence there are three eigenvalues, namely λ1 = 1, λ2 = −2, λ3 = 2. The eigenvalue λ1 = 1 has algebraic multiplicity 2 and the others have algebraic multiplicity 1.

To find the geometric multiplicities of λ1, λ2, λ3 we have to find their corresponding eigenvectors. By solving the corresponding systems we have

    v1 = [ 0 ]      v2 = [ 1    ]      v3 = [ 9  ]
         [ 1 ]           [ -5/3 ]           [ 17 ]
         [ 0 ]           [ 0    ]           [ 4  ]
         [ 0 ]           [ -3   ]           [ 1  ]

Thus the geometric multiplicities of λ1, λ2, λ3 are respectively 1, 1, 1.
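The multiplicity bookkeeping of Example 5.7 can be verified exactly with SymPy: the geometric multiplicity of λ is the dimension of the nullspace of A − λI. A sketch (ours, not the text's):

```python
# Algebraic vs. geometric multiplicity for the matrix of Example 5.7,
# computed exactly with SymPy.
from sympy import Matrix, symbols

x = symbols('x')
A = Matrix([[1, 0, 2, 1],
            [2, 1, 0, -1],
            [0, 0, 2, 0],
            [0, 0, 1, -2]])

# Characteristic polynomial: (x - 1)^2 (x - 2)(x + 2)
print(A.charpoly(x).as_expr().factor())

# Geometric multiplicity = dim of the eigenspace = number of nullspace vectors.
for lam in [1, -2, 2]:
    basis = (A - lam * Matrix.eye(4)).nullspace()
    print(lam, len(basis))       # each geometric multiplicity is 1
```

Since the eigenvalue 1 has algebraic multiplicity 2 but geometric multiplicity 1, this matrix is not diagonalizable, foreshadowing the remark below.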
Next we will see an example where the algebraic and geometric multiplicities are the same for each eigenvalue.

Example 5.8. Find the eigenvalues and their multiplicities for the matrix

    A := [ 1  0   0  1  ]
         [ 0  1   0  2  ]
         [ 1  -1  2  3  ]
         [ 0  0   0  -2 ]

Solution: Again there are three eigenvalues, namely λ1 = 1, λ2 = −2, λ3 = 2. The eigenvalue λ1 = 1 has algebraic multiplicity 2 and the others have algebraic multiplicity 1.

To find the geometric multiplicities of λ1, λ2, λ3 we have to find their corresponding eigenvectors. By solving the corresponding systems we have:

For λ1 = 1 the eigenvectors are

    u1 = [ 1 ]      u2 = [ -1 ]
         [ 1 ]           [ 0  ]
         [ 0 ]           [ 1  ]
         [ 0 ]           [ 0  ]

For λ2 = −2 and λ3 = 2 respectively,

    v2 = [ 1   ]      v3 = [ 0 ]
         [ 2   ]           [ 0 ]
         [ 5/2 ]           [ 1 ]
         [ -3  ]           [ 0 ]

Thus the geometric multiplicity of each eigenvalue equals its algebraic multiplicity.
Remark 5.4. We will see in the next chapter that the above two examples illustrate two classes of matrices. We will learn how
to deal with each of these classes separately.
Exercises:
5.11. Find the eigenvalues and their algebraic and geometric multiplicities for each of the matrices

    A = [ 5  -1  0   2 ]      B = [ 5  2  0   2 ]
        [ 1  2   1   0 ]          [ 3  2  1   0 ]
        [ 3  1   -2  4 ]          [ 3  1  -2  4 ]
        [ 0  4   -1  2 ]          [ 2  4  -1  2 ]

5.12. Let A be a diagonal n × n matrix given by

    A = [ 1  0  0  0 ]
        [ 0  2  0  0 ]
        [ 0  0  3  0 ]
        [ 0  0  0  4 ]

What are its eigenvalues and their multiplicities?

5.13. Compute the eigenvalues and their multiplicities of the matrix A^3, where A is as in the previous exercise.

5.14. Let

    A = [ -1  -1  0  ]
        [ 1   1   1  ]
        [ 3   1   -2 ]

and A = C−1 B C.

5.15. Let A be a 2 by 2 matrix with trace T and determinant D. Find a formula that gives the eigenvalues of A in terms of T and D.

5.16. Let A and B be given as below:

    A = [ 5  -1  0   2 ]      B = [ 5  2  0   2 ]
        [ 1  2   1   0 ]          [ 3  2  1   0 ]
        [ 3  1   -2  4 ]          [ 3  1  -2  4 ]
        [ 0  4   -1  2 ]          [ 2  4  -1  2 ]

Find their eigenvalues. In each case compute the sum and product of the eigenvalues and compare them with the trace and determinant of the matrix.

5.17. Prove that a square matrix is invertible if and only if no eigenvalue is zero.

5.18. Let A be a 3 by 3 matrix. Can you find a formula which determines the eigenvalues of A if you know the trace and determinant of A?
then

    A = C D C−1

where D is the diagonal matrix

    D = diag( λ1, . . . , λ1, λ2, . . . , λ2, . . . , λs, . . . , λs )

with each eigenvalue λi repeated ei times, and

    C = [ v1,1, . . . , v1,e1, v2,1, . . . , v2,e2, . . . , vs,1, . . . , vs,es ]

is the matrix whose columns are the corresponding eigenvectors.

We call the matrix C in the above theorem the transitional matrix of A associated with D. We illustrate the above theorem with the following two examples.
If λ3 = 1/2 + √5/2, λ4 = 1/2 − √5/2, then the corresponding eigenvectors are

    v3 = [ −13/2 + (5/2)√5 ]      v4 = [ −13/2 − (5/2)√5 ]
         [ 1                ]           [ 1                ]
         [ 6 − 3√5          ]           [ 6 + 3√5          ]
         [ 15/2 − (7/2)√5   ]           [ 15/2 + (7/2)√5   ]

Hence, since the algebraic multiplicity of each eigenvalue is the same as the geometric multiplicity, A is similar to

    D = [ 1 + i   0       0            0           ]
        [ 0       1 − i   0            0           ]
        [ 0       0       1/2 + √5/2   0           ]
        [ 0       0       0            1/2 − √5/2  ]
    A := [ 9   0  0   0  ]
         [ -2  1  -3  -4 ]
         [ -6  0  6   0  ]
         [ 4   4  3   11 ]

Find out if this matrix is diagonalizable and in that case find a diagonal matrix D similar to A and the transitional matrix C associated to D.

Solution: The geometric multiplicities are respectively 1, 1, and 2. Therefore the matrix A is diagonalizable, and C and D are

    D = [ 3  0  0  0 ]        C := [ 0   0   2   1  ]
        [ 0  6  0  0 ]             [ -2  1   1   0  ]
        [ 0  0  9  0 ]             [ 0   -3  -4  -2 ]
        [ 0  0  0  9 ]             [ 1   1   0   1  ]
Example 5.11. Let A be a 3 by 3 matrix as below

    A = [ 2  1  0 ]
        [ 0  2  0 ]
        [ 0  0  3 ]

Solution: Then char(A, λ) = (λ − 2)^2 (λ − 3). For the eigenvalue λ = 2, the algebraic multiplicity is 2 and the eigenspace is given by

    E2 = { t (1, 0, 0)^T | t ∈ Q }

The geometric multiplicity is 1, hence A is not similar to the diagonal matrix of its eigenvalues.
Lemma 5.7. Let A be similar to a diagonal matrix D such that A = C−1 D C. Then

    A^n = C−1 D^n C
Proof. Exercise.
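Lemma 5.7 makes large powers cheap: diagonalize once, then raise only the diagonal. A SymPy sketch (ours; note SymPy's `diagonalize` uses the convention A = C D C⁻¹, the transpose of the lemma's):

```python
# Matrix powers via diagonalization (Lemma 5.7 in SymPy's convention
# A = C * D * C**-1, so A**n = C * D**n * C**-1). Illustrative sketch.
from sympy import Matrix

A = Matrix([[1, 2],
            [5, 4]])
C, D = A.diagonalize()          # A == C * D * C.inv()
n = 6
power = C * D**n * C.inv()      # only the diagonal of D is raised to n
assert power == A**6            # agrees with repeated multiplication
print(D)                        # diagonal matrix of the eigenvalues -1 and 6
```

Raising D to the n-th power only requires raising its diagonal entries, which is why this beats n − 1 full matrix multiplications.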
Exercises:
5.22. Let

    A = [ 2   1  3   2 ]       B = [ 3   1  4   2 ]
        [ -1  0  -1  0 ]           [ -1  0  -1  0 ]
        [ 5   1  0   1 ]           [ 2   1  0   1 ]
        [ 1   0  -1  3 ]           [ 1   0  -1  1 ]

Determine if A and B are similar.

5.23. Let

    A = [ 2   1  3   2 ]       B = [ -10  -2  2  · ]
        [ -1  0  -1  0 ]           [ 7    -5  1  · ]
        [ 5   1  0   1 ]           [ -15  -2  5  4 ]
        [ 1   0  -1  3 ]           [ -15  -4  5  3 ]

Determine if A and B are similar.

5.26. Let A be the 4 by 4 matrix

    A = [ -2    -5    -2    -1 ]
        [ 3/2   7/2   3/2   0  ]
        [ 1/2   -1/2  -3/2  -1 ]
        [ -5/2  -7/2  -1/2  1  ]

Show that A = C−1 D C, where

    C := [ 1   2  1   1  ]        D := [ -1  0   0  0 ]
         [ 1   1  -1  0  ]             [ 0   -1  0  0 ]
         [ -1  1  1   2  ]             [ 0   0   1  0 ]
         [ 1   1  0   -1 ]             [ 0   0   0  2 ]

Compute A^6.
Theorem (Cramer's rule). Let A be an invertible n × n matrix and consider the system

    Ax = b.

Let Bk denote the matrix obtained from A by replacing its k-th column by b. Then the system has the unique solution

    xk = det(Bk) / det(A),   for k = 1, . . . , n.

Proof. The solution is x = A−1 b. Expanding det(Bk) in cofactors along the k-th column gives the formula.

Example 5.12. Solve the system

    2x + 3y = 5
    5x − y = 7

Solution: Then

    A = [ 2  3  ]      B1 = [ 5  3  ]      B2 = [ 2  5 ]
        [ 5  -1 ]           [ 7  -1 ]           [ 5  7 ]

and

    det(A) = −17,   det(B1) = −26,   det(B2) = −11.

Hence,

    x1 = 26/17,   x2 = 11/17.
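Cramer's rule is easy to automate for small systems. A sketch with exact rational arithmetic (ours, not the text's), reproducing the 2 × 2 system solved above:

```python
# Cramer's rule for a 2x2 system A x = b, with exact fractions.
from fractions import Fraction

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def cramer2(A, b):
    """Solve a 2x2 system by Cramer's rule; requires det(A) != 0."""
    d = det2(A)
    if d == 0:
        raise ValueError("det(A) = 0: Cramer's rule does not apply")
    B1 = [[b[0], A[0][1]], [b[1], A[1][1]]]   # replace column 1 by b
    B2 = [[A[0][0], b[0]], [A[1][0], b[1]]]   # replace column 2 by b
    return Fraction(det2(B1), d), Fraction(det2(B2), d)

x1, x2 = cramer2([[2, 3], [5, -1]], [5, 7])
print(x1, x2)    # 26/17 11/17
```

For large systems Cramer's rule is mainly of theoretical interest; Gaussian elimination is the practical method.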
We now illustrate with a linear system of five equations and five unknowns.

Example 5.13. Solve the linear system Ax = b, where A is as in Example 5.3, and

    b = [ 1  ]
        [ 0  ]
        [ 0  ]
        [ -1 ]
        [ 0  ]

Solution: As shown in Example 5.3, the determinant of A is det(A) = −121. Further, we compute
Proof. Exercise.

Example 5.14. Find the adjoint of

    A = [ i+1  2   i-1 ]
        [ 0    2i  0   ]
        [ i    1   -1  ]

Solution: Then

    C = [ -2i   0  2     ]
        [ -1-i  0  1-i   ]
        [ 4     0  -2+2i ]

Hence,

    C̄ = [ 2i    0  2     ]
        [ -1+i  0  1+i   ]
        [ 4     0  -2-2i ]

and

    adj(A) = [ -2i  -1-i  4     ]
             [ 0    0     0     ]
             [ 2    1-i   -2+2i ]
Remark 5.5. Notice that if the matrix has entries in R, then it is not necessary to take the conjugates of the ci,j, since the conjugate of a real number is the number itself. That is why, in most textbooks which treat only matrices with entries from R, the definition of the adjoint does not contain taking conjugates.
Example 5.15. Let A be the following matrix.

    A := [ 1  2  0   -1 ]
         [ 0  2  0   0  ]
         [ 2  1  -1  1  ]
         [ 1  1  2   -1 ]

Then

    adj(A) = [ -2  5   -4  -2 ]
             [ 0   -6  0   0  ]
             [ 6   -3  0   -6 ]
             [ 10  -7  -4  -2 ]
Theorem 5.11. Let A be an invertible matrix and adj(A) its adjoint. Then

    A · adj(A) = adj(A) · A = det(A) · I.
Proof. Exercise.
From the above theorem we conclude that for a given matrix A such that det(A) ≠ 0 we have

    A−1 = (1 / det(A)) · adj(A)
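Both identities can be checked symbolically; SymPy's `adjugate` is the classical adjoint (transposed cofactor matrix, no conjugation — the real-entry case of Remark 5.5). A sketch using the matrix of Example 5.15:

```python
# Checking A * adj(A) = det(A) * I and the inverse formula with SymPy.
from sympy import Matrix, eye

A = Matrix([[1, 2, 0, -1],
            [0, 2, 0, 0],
            [2, 1, -1, 1],
            [1, 1, 2, -1]])
adjA = A.adjugate()          # transposed cofactor matrix
d = A.det()
assert A * adjA == d * eye(4)
assert A.inv() == adjA / d   # the inverse formula above
print(d)                     # -12, so A is invertible
```

This makes the theoretical role of the adjoint concrete, though in practice inverses are computed by elimination, not by cofactors.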
Exercises:
5.28. Let the curve be given by

    A + By + Cx + Dy^2 + Exy + x^2 = 0

5.29. Using Cramer's rule solve the system Ax = b where

    A = [ 5  -1  0   2 ]        b = [ 5 ]
        [ 1  2   1   0 ]            [ 3 ]
        [ 3  1   -2  4 ]            [ 3 ]
        [ 0  4   -1  2 ]            [ 2 ]
and use the result to find A−1 and B−1.

5.31. Find the adjoint of

    A = [ 5  -1  0   2 ]       B = [ 5  2  0   2 ]
        [ 1  2   1   0 ]           [ 3  2  1   0 ]
        [ 3  1   -2  4 ]           [ 3  1  -2  4 ]
        [ 0  4   -1  2 ]           [ 2  4  -1  2 ]

and use the result to find A−1 and B−1.

5.32. Determine if the matrix

    A := [ 1  0  0   -1 ]
         [ 0  1  0   0  ]
         [ 2  1  -1  1  ]
         [ 1  0  2   -1 ]

is invertible.

5.33. Let f, g be as follows:

    f(x) = al x^l + al−1 x^(l−1) + · · · + a1 x + a0
    g(x) = bm x^m + bm−1 x^(m−1) + · · · + b1 x + b0        (5.2)

The (l + m) × (l + m) matrix

    Syl(f, g, x) := [ al  al−1  . . .  a1  a0                     ]
                    [     al   al−1  . . .   a1  a0               ]
                    [           . . .               . . .         ]
                    [               al  al−1  . . .   a1  a0      ]
                    [ bm  bm−1  . . .  b1  b0                     ]
                    [     bm   bm−1  . . .   b1  b0               ]
                    [           . . .               . . .         ]        (5.3)
                    [               bm  bm−1  . . .   b1  b0      ]

with m rows of coefficients of f(x) and l rows of coefficients of g(x), is called the Sylvester matrix of f(x) and g(x). The resultant of f(x) and g(x), denoted by Res(f, g, x), is

    Res(f, g, x) := det(Syl(f, g, x)).

The following is a basic fact in the algebra of polynomials: the polynomials f(x) and g(x) have a common factor in k[x] if and only if Res(f, g, x) = 0.

Let

    F(t) = u(1 + t^2) − t^2
    G(t) = v(1 + t^2) − t^3        (5.4)

Find Res(F, G, t).

5.34. Let

    f(x) = x^5 − 3x^4 − 2x^3 + 3x^2 + 7x + 6
    g(x) = x^4 + x^2 + 1        (5.5)

Find Res(f, g, x).

5.35. Let

    f(x) = an x^n + · · · + a1 x + a0

and f′(x) its derivative. Define the discriminant ∆f of f(x) with respect to x as below:

    ∆f := ( (−1)^(n(n−1)/2) / an ) · Res(f, f′, x).

The following is a basic fact in the algebra of polynomials: the polynomial f(x) has double roots if and only if ∆f = 0. Does

    f(x) = 6x^4 − 23x^3 − 19x + 4

have any multiple roots in C?

5.36. Find b such that

    f(x) = x^4 − bx + 1

has a double root in C.

5.37. Find p such that

    f(x) = x^3 − px + 1

has a double root in C.
5.39. Find the eigenvalues and their algebraic and geometric multiplicities for the matrix

    B = [ 2  2  0   1 ]
        [ 1  1  1   0 ]
        [ 1  1  -2  1 ]
        [ 1  4  -1  2 ]

5.40. Find the eigenvalues and their algebraic and geometric multiplicities for the matrix.

5.41. Prove that if A is similar to a diagonal matrix, then A is similar to A^T.
Chapter 6
Canonical Forms
The main purpose of this chapter is to classify the distinct linear transformations of a vector space, or the similarity classes of matrices.

Let V be an n-dimensional vector space over the field k and B a basis of V. Further, let T : V → V be a linear map and A = M_B^B(T) its associated matrix. Choosing a different basis B′ for V gives a new matrix B = M_{B′}^{B′}(T) associated with T, namely

    B = P−1 A P,

where P = M_{B′}^B(id); see Chapter 4. Can we find B′ such that the matrix associated with T is as simple as possible? The strategy is to pick B′ such that B is as close to a diagonal matrix as possible. We distinguish two cases:

i) k does not contain all the eigenvalues of A;
ii) k contains all eigenvalues.

These cases lead respectively to the rational canonical form and the Jordan canonical form and will be studied in Sections 2 and 3.
and it is a field. The next theorem shows that the well-known Euclidean algorithm applies to polynomials as well.

Theorem 6.1. (Euclidean algorithm) Let f, g ∈ k[x] and assume that g ≠ 0. Then there exist unique q, r ∈ k[x] such that

    f = q · g + r,

where r = 0 or deg r < deg g.
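SymPy implements this polynomial division directly; the sketch below (ours) illustrates Theorem 6.1 and also answers Exercise 6.1 for n = 3.

```python
# Polynomial division with remainder in Q[x], as in Theorem 6.1.
from sympy import symbols, div, expand

x = symbols('x')
f = x**3 - 1
g = x - 1
q, r = div(f, g, x)            # q = x**2 + x + 1, r = 0
assert expand(q * g + r) == expand(f)   # f = q*g + r
print(q, r)
```

Since the remainder is 0, x − 1 divides x^3 − 1 exactly, which is the pattern behind (x^n − 1)/(x − 1) = x^(n−1) + · · · + x + 1.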
Corollary 6.1. Let f ∈ k[x] and α ∈ k such that f (α) = 0. Then f (x) = (x − α) · g(x).
Solution: Assume that f(x) factors in Q[x]. Then one of the factors is linear. Hence, f(x) has a rational root a = b/d. From the previous theorem,

    b | 2  and  d | 1.

Hence, b = ±1, ±2 and d = ±1. Then we have a = ±1, ±2. It can easily be checked that none of these values is a root of f(x). Hence f(x) is irreducible.
Theorem 6.4. (Eisenstein's criterion) Let f(x) be a polynomial with integer coefficients given by

    f(x) = an x^n + an−1 x^(n−1) + · · · + a1 x + a0

and p a prime such that:
i) p | ai for all i ≤ n − 1;
ii) p^2 ∤ a0;
iii) p ∤ an.
Then f(x) is irreducible over Q.
is irreducible over Q.

Solution: Notice that p = 3 divides all coefficients other than the leading coefficient. Further, p^2 = 9 does not divide a0 = −3. Hence f(x) is irreducible by Eisenstein's criterion.

Theorem 6.5. (Extension of the Eisenstein's criterion) Let f(x) be a polynomial with integer coefficients given by

is irreducible in Q[x].

Solution: We use the previous theorem. Since 3 divides a0, . . . , a3 but does not divide a4, then r = 4. Hence, if f(x) is reducible then it is a product of polynomials of degrees 4 and 1. Thus f(x) has a rational root. By the integral root test we show that this cannot happen.
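Irreducibility over Q can also be checked mechanically, which makes a handy sanity check for Eisenstein-type examples. A SymPy sketch (ours, not from the text):

```python
# Checking irreducibility over Q with SymPy.
from sympy import symbols, Poly, factor_list

x = symbols('x')

# Eisenstein with p = 5 applies to x^4 + 10x + 5, so this should hold:
assert Poly(x**4 + 10*x + 5, x).is_irreducible

# A reducible contrast: x^3 - 7x^2 + 16x - 12 = (x - 2)^2 (x - 3).
print(factor_list(x**3 - 7*x**2 + 16*x - 12))
```

Of course the computer only confirms individual cases; Eisenstein's criterion proves irreducibility for whole families at once.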
Exercises:
xn −1
6.1. Use the Euclidean algorithm to write the x−1 as a poly-
nomial. 1) x4 + 10x + 5
3) x4 − 4x3 + 6
4) x6 + 30x5 − 15x3 + 6x − 120
6.2. Prove that f (x) = x3 − 3x − 1 is irreducible in Q.
6.3. For any prime p show that x2 −p and x3 −p are irreducible 6.7. Factor over Q the polynomial
in Q.
f (x) = x3 − 7x2 + 16x − 12.
6.4. Let α ∈ Z such that α is divisible by some prime p but 6.8. Factor over Q the polynomial
p2 - α. Prove that xn − α is irreducible.
f (x) = x3 + x2 + x − 14.
6.5. Prove that f (x) = x4 + 1 is irreducible over Q.
6.9. We solve a quadratic equation by the well-known
quadratic formula. Do you know any formulas that one can
6.6. Prove that the following polynomials are irreducible over use to solve a cubic polynomial? What about polynomials of
Q. degree 4, 5?
    I, A, A^2, . . . , A^s

are linearly dependent for s > n^2. Thus, there exist a0, . . . , as, not all zero, such that

    as A^s + · · · + a1 A + a0 I = 0.

Take f(x) = as x^s + · · · + a1 x + a0.
Definition 6.2. Let f(x) be a monic polynomial in k[x] given by

    f(x) = x^n + an−1 x^(n−1) + · · · + a1 x + a0.

The matrix

    [ 0  0  . . .  0  −a0   ]
    [ 1  0  . . .  0  −a1   ]
    [ 0  1  . . .  0  −a2   ]
    [ .  .         .   .    ]
    [ 0  0  . . .  1  −an−1 ]

is called the companion matrix of f(x), and we denote it by Cf.
Lemma 6.1. Let f (x) ∈ k[x] and C f its companion matrix. The characteristic polynomial of C f is
char (C f , x) = f (x).
Proof. Exercise.
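Lemma 6.1 is easy to test computationally. The sketch below (ours) builds a companion matrix in the column convention used for the rational-form blocks later in this chapter and checks its characteristic polynomial with SymPy.

```python
# Companion matrix of a monic polynomial, and a check of Lemma 6.1:
# char(C_f, x) = f(x). Illustrative sketch.
from sympy import Matrix, symbols, expand

x = symbols('x')

def companion(coeffs):
    """Companion matrix of x^n + a_{n-1} x^{n-1} + ... + a_0,
    where coeffs = [a_0, ..., a_{n-1}]."""
    n = len(coeffs)
    C = Matrix.zeros(n, n)
    for i in range(1, n):
        C[i, i - 1] = 1              # subdiagonal of 1's
    for i in range(n):
        C[i, n - 1] = -coeffs[i]     # last column: -a_0, ..., -a_{n-1}
    return C

# f(x) = x^2 - 5x + 6 = (x - 2)(x - 3): the block [[0, -6], [1, 5]].
C = companion([6, -5])
print(C)
assert expand(C.charpoly(x).as_expr()) == x**2 - 5*x + 6
```

This is exactly the 2 × 2 block that appears in the rational canonical form of the matrix of Example 6.5 below.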
For a given matrix A the characteristic polynomial is char(A, x) = det(xI − A). The matrix (xI − A) can be considered as a matrix over the field k(x). Moreover, A is also in Matn×n(k(x)). In the next theorem we show how every matrix in Matn×n(k[x]) can be transformed into a diagonal matrix by elementary operations. These elementary operations consist of:

i) Interchanging two rows or columns (Ri ←→ Rj);
ii) Adding a multiple (in k[x]) of one row or column to another (Ri −→ q(x) · Ri + Rj);
iii) Multiplying a row or column by a unit u ∈ k (Ri −→ u · Ri).

Two matrices A and B, one of which can be obtained by a sequence of elementary operations on the other, are called Gaussian equivalent. For matrices whose entries are polynomials we have the following:
Theorem 6.7. Let M ∈ Matn×n(k[x]). Then, using elementary operations, the matrix M can be put in the diagonal form

    diag( 1, . . . , 1, e1(x), . . . , es(x) )
Proof. We will use the elementary operations to transform M into a diagonal matrix. Among all matrices which are Gaussian equivalent to M, pick one which has a nonzero entry of smallest degree. Let such a matrix be A = [aij(x)] and let the entry with lowest degree be m(x).

By an interchange of rows and columns bring this entry to the (1, 1)-position. All entries of the first column can be written as (Euclidean algorithm)

    aj1 = m(x) qj(x) + rj(x),

where deg rj(x) < deg m(x). By performing Rj − qj(x) R1 → Rj for j = 2, . . . , n, the first column of the matrix becomes

    ( m(x), r2(x), . . . , rn(x) )^T.

Choose the entry m′(x) with the smallest degree from the first column and by a row interchange move it to the (1, 1)-position. Perform the same process as above. Then the degrees of the remainders decrease by at least one at each step. Since k[x] is a Euclidean domain, this process will end after finitely many steps and the first column will look like

    ( m1(x), 0, . . . , 0 )^T.

Indeed, the maximum number of steps can be no bigger than deg m(x). Next we perform the same procedure for the first row to get

    [ m2(x)    0        . . .  0       ]
    [ a′2,1(x)  a′2,2(x)  . . .  a′2,n(x) ]
    [ a′3,1(x)  a′3,2(x)  . . .  a′3,n(x) ]
    [   .        .               .      ]
    [ a′n,1(x)  a′n,2(x)  . . .  a′n,n(x) ]

Continuing again with the first column and so on, we get a sequence of operations

    A → A(1) → A(2) → . . .

Let mi(x) denote the entry in the (1, 1)-position after the i-th step. Then
where e1(x) has the smallest degree and divides all the entries a′′i,j(x).
Now we perform the same procedure focusing on the next row and column. Finally we will have
Remark 6.1. If any of ei (x) = 0 then it will occur in the last position since all other e j (x), j , i must divide ei (x).
Definition 6.3. Let A ∈ Matn×n(k). Then by the above theorem the matrix xI − A can be put into the diagonal form

    diag( 1, . . . , 1, e1(x), . . . , es(x) )

such that the ei(x) are monic and ei(x) | ei+1(x), for i = 1, . . . , s − 1. This is called the Smith normal form for A, and the elements ei(x) of nonzero degree are called invariant factors of A.
Lemma 6.2. The characteristic polynomial of A is the product of its invariant factors up to multiplication by a constant.
Proof. We have
char (A, x) = det(xI − A).
Since (xI − A) ∼ Smith (A) then
det(xI − A) = c · det(Sm(A)),
for some c ∈ k.
Lemma 6.3. Let e1(x), . . . , es(x) be the invariant factors of A, with ei(x) | ei+1(x). The minimal polynomial mA(x) is the largest invariant factor of A. In other words,

    es(x) = mA(x).

Proof. Exercise.
Example 6.5. Find the Smith normal form of the matrix A given as follows:

    A := [ 2  -2  14 ]
         [ 0  3   -7 ]
         [ 0  0   2  ]

Solution: We have

    xI − A = [ x−2  2    −14 ]
             [ 0    x−3  7   ]
             [ 0    0    x−2 ]

We reduce by elementary operations:

    [ x−2  2    −14 ]   C1 ←→ C2   [ 2    x−2  −14 ]
    [ 0    x−3  7   ]   −→         [ x−3  0    7   ]
    [ 0    0    x−2 ]              [ 0    0    x−2 ]

    R2 → (x−3)R1 − 2R2   [ 2  x−2         −14      ]
    −→                   [ 0  (x−2)(x−3)  −14(x−2) ]
                         [ 0  0           x−2      ]

    C2 → (x−2)C1 − 2C2   [ 2  0            −14      ]
    −→                   [ 0  −2(x−2)(x−3) −14(x−2) ]
                         [ 0  0            x−2      ]

    R1 → (1/2)R1, R2 → −(1/2)R2   [ 1  0           −7     ]
    −→                            [ 0  (x−2)(x−3)  7(x−2) ]
                                  [ 0  0           x−2    ]

    C3 → 7C1 + C3   [ 1  0           0      ]
    −→              [ 0  (x−2)(x−3)  7(x−2) ]
                    [ 0  0           x−2    ]

    C2 ←→ C3   [ 1  0       0          ]
    −→         [ 0  7(x−2)  (x−2)(x−3) ]
               [ 0  x−2     0          ]

    R3 → R2 − 7R3   [ 1  0       0          ]
    −→              [ 0  7(x−2)  (x−2)(x−3) ]
                    [ 0  0       (x−2)(x−3) ]

    C3 → (x−3)C2 − 7C3   [ 1  0       0            ]
    −→                   [ 0  7(x−2)  0            ]
                         [ 0  0       −7(x−2)(x−3) ]

    R2 → (1/7)R2, R3 → −(1/7)R3   [ 1  0    0          ]
    −→                            [ 0  x−2  0          ]
                                  [ 0  0    (x−2)(x−3) ]

which is the Smith normal form Sm(A). The reader can check that the characteristic polynomials of Smith(A) and A are the same.
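That final check is quick to do with SymPy (our sketch, not the text's): the characteristic polynomial of A equals the product of the invariant factors (x − 2) and (x − 2)(x − 3), in accordance with Lemma 6.2.

```python
# Cross-check of Example 6.5: char(A, x) equals the product of the
# invariant factors (x - 2) * (x - 2)(x - 3).
from sympy import Matrix, symbols, expand

x = symbols('x')
A = Matrix([[2, -2, 14],
            [0, 3, -7],
            [0, 0, 2]])
charA = A.charpoly(x).as_expr()
assert expand(charA - (x - 2)**2 * (x - 3)) == 0
print(charA.factor())
```

Since A is triangular, its characteristic polynomial can also be read off the diagonal directly: (x − 2)(x − 3)(x − 2).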
Exercises:
6.10. Find the companion matrix of

    f(x) = x^3 − x − 1.

6.11. Find the companion matrix of

    f(x) = (x − 2)^2 (x − 3).

6.12. Let A be a 2 by 2 matrix with entries in Q such that char(A, x) = x^2 + 1. Find the minimal polynomial of A.

6.13. Let f(x) be an irreducible cubic polynomial in Q[x], for example

    f(x) = ax^3 + bx^2 + cx + d.

Let A be a 3 by 3 matrix with entries in Q such that char(A, x) = f(x). Find the minimal polynomial mA(x) of A. Can you generalize to a degree n polynomial?

6.14. Find the Smith normal forms of the matrices in the previous two exercises.

6.15. Determine all possible minimal polynomials of a matrix A with characteristic polynomial

    char(A, x) = (x − 2)^2 (x − 3)

6.16. Determine all possible Smith normal forms of a matrix A with characteristic polynomial

    char(A, x) = (x − 2)^2 (x − 3)

6.17. Find all possible Smith normal forms of a matrix A with characteristic polynomial

    char(A, x) = x^3 − 1.
Let A ∈ Matn×n(k) with invariant factors e1(x), . . . , es(x), and let Ci be the companion matrix of ei(x). The block diagonal matrix

    diag( C1, C2, . . . , Cs )

is called the rational canonical form of A and is denoted by Rat(A). The word rational is used to indicate that this form is calculated entirely within the field k. Notice that

    char(A, x) = e1(x) · · · es(x)

implies that

    deg e1 + · · · + deg es = deg char(A, x).

Hence, A and Rat(A) have the same dimensions.
Example 6.6. Find the rational canonical form of

    A := [ 2  -2  14 ]
         [ 0  3   -7 ]
         [ 0  0   2  ]

Solution: We found the invariant factors of this matrix in Example 6.5 in the last section. They are e1(x) = x − 2 and e2(x) = (x − 2)(x − 3) = x^2 − 5x + 6. Then the rational form of A is

    Rat(A) = [ 2  0  0  ]
             [ 0  0  -6 ]
             [ 0  1  5  ]
Theorem 6.8. Let k be a field and A ∈ Matn×n(k). Then the following hold:
i) Two matrices in Matn×n(k) are similar if and only if they have the same rational form.
ii) The rational form of A is unique.

Proof. i) Let A be similar to B, say B = P−1 A P. Then xI − B = P−1 (xI − A) P, so xI − A and xI − B have the same Smith normal form. Thus, A and B have the same invariant factors, and hence the same rational form.

Conversely, if A and B have the same rational form, then they have the same invariant factors; each is similar to this common rational form, hence they are similar to each other.

ii) There is only one choice of invariant factors, hence a unique rational form.
Example 6.7. Let A be a 10 by 10 matrix such that its invariant factors are

    e1(x) = x − 2
    e2(x) = (x − 2)(x^3 + x + 1)                    (6.1)
    e3(x) = (x − 2)(x − 3)(x^3 + x + 1)

Expanding,

    e2(x) = x^4 − 2x^3 + x^2 − x − 2
    e3(x) = x^5 − 5x^4 + 7x^3 − 4x^2 + x + 6        (6.2)

Hence, Rat(A) is the block diagonal matrix with blocks

    [ 2 ],    [ 0  0  0  2  ],    [ 0  0  0  0  -6 ]
              [ 1  0  0  1  ]     [ 1  0  0  0  -1 ]
              [ 0  1  0  -1 ]     [ 0  1  0  0  4  ]
              [ 0  0  1  2  ]     [ 0  0  1  0  -7 ]
                                  [ 0  0  0  1  5  ]
Example 6.8. Let A be an 8 by 8 matrix such that its invariant factors are

    e1(x) = x^3 + x + 1
    e2(x) = (x^2 + 2)(x^3 + x + 1) = x^5 + 3x^3 + x^2 + 2x + 2        (6.3)

Then Rat(A) is the block diagonal matrix with blocks

    [ 0  0  -1 ],    [ 0  0  0  0  -2 ]
    [ 1  0  -1 ]     [ 1  0  0  0  -2 ]
    [ 0  1  0  ]     [ 0  1  0  0  -1 ]
                     [ 0  0  1  0  -3 ]
                     [ 0  0  0  1  0  ]
Exercises:
6.18. Find the rational canonical form over Q of the matrix

    [ 1  2 ]
    [ 3  4 ]

6.19. Let A be the 8 by 8 matrix given by

    A = [ 0  0  0  0  0  0  0  1 ]
        [ 1  0  0  0  0  0  0  0 ]
        [ 0  1  0  0  0  0  0  0 ]
        [ 0  0  1  0  0  0  0  0 ]
        [ 0  0  0  1  0  0  0  0 ]
        [ 0  0  0  0  1  0  0  0 ]
        [ 0  0  0  0  0  1  0  0 ]
        [ 0  0  0  0  0  0  1  0 ]

Find its eigenvalues. What about the eigenvalues of A^T?
Proof. Let e1(x), . . . , es(x) be the invariant factors of A, with ei(x) | ei+1(x) for i = 1, . . . , s − 1. We know that

    charA(x) = e1(x) · · · es(x).

Since es(A) = mA(A) = 0 and es(x) | charA(x), then charA(A) = 0.

Since mA(x) is the minimal polynomial,

    deg mA(x) ≤ deg charA(x).

By the Euclidean algorithm,

    charA(x) = q(x) mA(x) + r(x)

with r = 0 or deg r(x) < deg mA(x). Since charA(A) = 0, we get r(A) = 0. Thus r(x) is the zero polynomial; otherwise, after making it monic, r(x) would contradict the minimality of mA(x).
Furthermore, mA(A) = 0. We check that A − I ≠ 0 and (A − I)^2 = 0. Hence the minimal polynomial is

    mA(x) = (x − 1)^2
3) For each of the operations of step 2, perform the corresponding operation on the identity matrix I by converting according to the following rules:
a) Ri ←→ Rj  =⇒  Ci ←→ Cj
b) Ri −→ q(x) · Ri + Rj  =⇒  Ci −→ q(x) · Ci + Cj
c) Ri −→ u · Ri, for u ∈ k  =⇒  Ci −→ u · Ci
4) The matrix obtained after performing these operations on I is the sought matrix C.
Exercises:
6.20. Find the rational form of the 3 by 3 matrix with invariant factors

    e1(x) = x − 1,  e2(x) = x − 1,  e3(x) = x − 1.

6.21. Find the rational canonical forms over Q of the matrices

    A = [ 0  -4  85  ]      B = [ 2  2  1  ]
        [ 1  4   -30 ]          [ 0  2  -1 ]
        [ 0  0   3   ]          [ 0  0  3  ]

6.22. Find the rational canonical form over Q of

    [ 2  2  1 ]
    [ 3  4  1 ]
    [ 1  5  1 ]

6.23. Prove that two non-scalar 2 × 2 matrices over k are similar if and only if they have the same characteristic polynomial.

6.24. Find the rational canonical form of

    [ 0   -1  -1 ]
    [ 0   0   0  ]
    [ -1  0   0  ]

6.25. Determine all possible rational canonical forms for a matrix with characteristic polynomial

    f(x) = x^2 (x^2 + 1)^2

6.26. Determine all possible rational canonical forms for a matrix with characteristic polynomial

6.27. The characteristic polynomial of a given matrix A is

    char(A, x) = (x − 1)^2 · (x + 1) · (x^2 + x + 1).

What are the possible polynomials that can be minimal polynomials of A?

6.28. Find all similarity classes of 2 × 2 matrices with entries in Q and of precise order 4 (i.e., A^4 = I).
Proof. Let f(x) := (x − α)^s. Then the Cayley–Hamilton theorem implies that

    f(A) = (A − αI)^s = 0.

Hence, mA(x) = (x − α)^r, or equivalently mA−αI(x) = x^r. Thus, (A − αI) is similar to the companion matrix D of g(x) := x^r, where

    D = [ 0  1           ]
        [    0  1        ]
        [       .  .     ]
        [          .  1  ]
        [             0  ]

Thus, there is an invertible matrix P such that

    P−1 (A − αI) P = D,

and hence P−1 A P = D + αI, where

    D + αI = [ α  1           ]
             [    α  1        ]
             [       .  .     ]
             [          .  1  ]
             [             α  ]

A matrix of this form is called a Jordan block. A matrix is in Jordan canonical form if it is a block diagonal matrix

    J = diag( J1, J2, . . . , Jn )

with Jordan blocks along the diagonal.
Theorem 6.10. Let A be an n × n matrix with entries in k and assume that k contains all eigenvalues of A. Then A is similar to a matrix in Jordan canonical form.

For each αi, i = 1, . . . , er, we have a Jordan block. Since the product of all the invariant factors equals the characteristic polynomial of A, the combination of all the Jordan blocks along the diagonal will create an n by n matrix (of the same dimensions as A).
Remark 6.2. The Jordan canonical form of a matrix A is diagonal if and only if A is diagonalizable.
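SymPy can compute the Jordan canonical form directly, which is a convenient way to experiment with Remark 6.2. The sketch below (ours, not the text's) uses the 3 × 3 matrix with a 2 × 2 Jordan block for 2 and a 1 × 1 block for 3, which appears in Example 6.12 below.

```python
# Jordan canonical form with SymPy. The chosen matrix is already in
# Jordan form, so J should be essentially the matrix itself.
from sympy import Matrix

A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 3]])
P, J = A.jordan_form()       # A == P * J * P**-1
assert A == P * J * P.inv()
print(J)                     # J is not diagonal: A is not diagonalizable
```

By Remark 6.2, the superdiagonal 1 in J is exactly the obstruction to diagonalizability.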
Example 6.10. Both matrices

    A = [ 0  1  1  1 ]       B = [ 5   2   -8  -8 ]
        [ 1  0  1  1 ]           [ -6  -3  8   8  ]
        [ 1  1  0  1 ]           [ -3  -1  3   4  ]
        [ 1  1  1  0 ]           [ 3   1   -4  -5 ]
Solution: The minimal polynomial for A and B is one of the following polynomials:
We check that (A − 3I) (A + I) = 0. In the same way we check that (B − 3I) (B + I) = 0. Hence, the minimal polynomial of A
and B is
m(x) = (x − 3) (x + 1).
Its Smith normal forms are
$$\operatorname{Smith}(A) = \operatorname{Smith}(B) = \begin{pmatrix} 1 & & & \\ & x+1 & & \\ & & x+1 & \\ & & & (x-3)(x+1) \end{pmatrix}$$
Thus, A and B are similar. Further, A and B are diagonalizable matrices and we can diagonalize them using the techniques of
the previous chapter.
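The check above is quick to replicate numerically. The sketch below (not from the text) copies the two matrices of Example 6.10 into numpy and verifies that $m(x) = (x-3)(x+1)$ annihilates both:

```python
import numpy as np

# The two matrices of Example 6.10.
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [1, 1, 1, 0]])
B = np.array([[ 5,  2, -8, -8],
              [-6, -3,  8,  8],
              [-3, -1,  3,  4],
              [ 3,  1, -4, -5]])
I = np.eye(4)

# (M - 3I)(M + I) = 0, so the minimal polynomial of each matrix
# divides (x - 3)(x + 1); since it has no repeated roots, both
# matrices are diagonalizable with eigenvalues among {3, -1}.
for M in (A, B):
    assert np.allclose((M - 3 * I) @ (M + I), np.zeros((4, 4)))
```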
Example 6.11. Let A be a matrix such that its invariant factors are
$$e_1(x) = (x-2)^2(x^2+1), \qquad e_2(x) = (x-2)^3(x^2+1)^2.$$
Then its rational canonical form is the block diagonal matrix built from the companion matrices of $e_1$ and $e_2$,
$$\operatorname{Rat}(A) = \begin{pmatrix} 0 & 0 & 0 & -4 \\ 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -5 \\ 0 & 0 & 1 & 4 \end{pmatrix} \oplus \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 8 \\ 1 & 0 & 0 & 0 & 0 & 0 & -12 \\ 0 & 1 & 0 & 0 & 0 & 0 & 22 \\ 0 & 0 & 1 & 0 & 0 & 0 & -25 \\ 0 & 0 & 0 & 1 & 0 & 0 & 20 \\ 0 & 0 & 0 & 0 & 1 & 0 & -14 \\ 0 & 0 & 0 & 0 & 0 & 1 & 6 \end{pmatrix}$$
Its Jordan canonical form (over C) consists of the Jordan blocks coming from the elementary divisors $(x-2)^2$, $x+i$, $x-i$, $(x-2)^3$, $(x+i)^2$, $(x-i)^2$:
$$J(A) = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix} \oplus \begin{pmatrix} -i \end{pmatrix} \oplus \begin{pmatrix} i \end{pmatrix} \oplus \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix} \oplus \begin{pmatrix} -i & 1 \\ 0 & -i \end{pmatrix} \oplus \begin{pmatrix} i & 1 \\ 0 & i \end{pmatrix}$$
Example 6.12. Let A be the 3 × 3 matrix
$$A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$$
Solution: Then $\operatorname{char}(A, \lambda) = (\lambda - 2)^2 (\lambda - 3)$. For the eigenvalue $\lambda = 2$, the algebraic multiplicity is 2 and the eigenspace is given by
$$E_2 = \left\{ t \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \;\middle|\; t \in \mathbb{Q} \right\}$$
The geometric multiplicity is 1, hence A is not similar to the diagonal matrix of eigenvalues.
We have
$$xI - A = \begin{pmatrix} x-2 & -1 & 0 \\ 0 & x-2 & 0 \\ 0 & 0 & x-3 \end{pmatrix}.$$
Performing elementary row and column operations (swapping the first two columns, then clearing the first row and column), we obtain successively
$$\begin{pmatrix} -1 & x-2 & 0 \\ x-2 & 0 & 0 \\ 0 & 0 & x-3 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & x-2 & 0 \\ 0 & (x-2)^2 & 0 \\ 0 & 0 & x-3 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 0 \\ 0 & (x-2)^2 & 0 \\ 0 & 0 & x-3 \end{pmatrix}$$
and, after swapping the last two diagonal entries and combining the coprime factors,
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & x-3 & 0 \\ 0 & 0 & (x-2)^2 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & (x-2)^2(x-3) \end{pmatrix}$$
Instead we could have recognized that A was already in the Jordan canonical form. Notice that the geometric multiplicity for
each eigenvalue is 1 and there is one Jordan block for each eigenvalue. Also the algebraic multiplicities of the eigenvalues are 2
and 1 and the corresponding Jordan blocks are of sizes 2 and 1 respectively. We will see that these facts are not a coincidence.
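These multiplicities are easy to confirm numerically. The sketch below (an illustration, not part of the text) checks the characteristic polynomial of the matrix of Example 6.12 and the geometric multiplicity of the eigenvalue 2:

```python
import numpy as np

A = np.array([[2, 1, 0],
              [0, 2, 0],
              [0, 0, 3]])

# coefficients of char(A, x) = (x-2)^2 (x-3) = x^3 - 7x^2 + 16x - 12
assert np.allclose(np.poly(A), [1, -7, 16, -12])

# geometric multiplicity of lambda = 2 is 3 - rank(A - 2I) = 1,
# so there is a single Jordan block for this eigenvalue and A is
# not diagonalizable.
assert np.linalg.matrix_rank(A - 2 * np.eye(3)) == 2
```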
Exercises:
6.29. Let A be a matrix with characteristic polynomial
$$\operatorname{char}(A, x) = x^3 + x^2 + x + 1.$$
Find the rational form of A over Q and the Jordan canonical form of A over C.

$$\operatorname{char}(A, x) = x^5 + 2x^4 - 12x^3 + 4x^2 - 6x + 10$$

6.35. Find the Jordan canonical form of the matrices
$$A = \begin{pmatrix} 0 & -4 & 85 \\ 1 & 4 & -30 \\ 0 & 0 & 3 \end{pmatrix}, \qquad B = \begin{pmatrix} 2 & 2 & 1 \\ 0 & 2 & -1 \\ 0 & 0 & 3 \end{pmatrix}$$

$$A = \begin{pmatrix} 3 & 1 & 0 & -1 \\ 4 & 0 & 0 & 3 \\ -4 & 2 & 2 & -3 \\ 2 & -4 & 0 & 7 \end{pmatrix}$$

6.40. Let A be an n × n matrix which has n distinct eigenvalues $\lambda_1, \ldots, \lambda_n$. Find the Jordan canonical form of A.

6.41. The characteristic polynomial of a 3 × 3 matrix A is
$$\operatorname{char}(A, x) = (x - 1)^2 (x - 2).$$
Find all possibilities for the rational and Jordan canonical form of A.

6.42. Determine if the matrices A and B are similar
$$A = \begin{pmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -2 & 0 \\ 0 & 0 & 0 & -2 \end{pmatrix}, \qquad B = \begin{pmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -2 & 1 \\ 0 & 0 & 0 & -2 \end{pmatrix}$$

6.44. Diagonalize the matrix or explain why it can't be diagonalized.
$$A = \begin{pmatrix} 7 & -1 & 0 & 2 \\ -10 & 4 & 0 & -4 \\ 5 & -1 & 2 & 2 \\ -15 & 3 & 0 & -4 \end{pmatrix}$$

6.45. Let A be an n × n nilpotent matrix. Show that $A^n = 0$.

6.46. Let A be a strictly upper triangular matrix (all entries on the main diagonal and below are 0). Prove that A is nilpotent.

6.47. Let A be the 2 × 2 matrix which corresponds to the rotation of the complex plane by 2π/n. Find the Jordan canonical form of A.

6.49. Determine the set of similarity classes of 3 × 3 matrices
$$\operatorname{char}(A, x) = (x^4 - 1)(x^2 - 1).$$
Chapter 7

Inner products

In this chapter we will study the important concept of an inner product on a vector space. We give the most general definition of the inner product and briefly look at Hermitian products. The rest of the chapter is focused on orthogonal and orthonormal bases, and we will study the Gram–Schmidt orthogonalization process. In the last section a brief introduction to dual spaces is given.
Let V be a vector space over k and consider a function
$$f : V \times V \longrightarrow k, \qquad (u, v) \mapsto f(u, v). \tag{7.1}$$
The function f is called an inner product (scalar product) if the following properties hold for every u, v, w ∈ V and r ∈ k:
We will denote inner products by ⟨u, v⟩ instead of f(u, v). An inner product is called non-degenerate if
A vector space V with an inner product is called an inner space. We give some examples of inner spaces.
Example 7.1. Show that ⟨u, z v⟩ = z̄ ⟨u, v⟩.

For
$$u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} \quad \text{and} \quad v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}$$
Let V be the space of real continuous functions
$$f : [0, 1] \longrightarrow \mathbb{R}.$$
For f, g ∈ V we define
$$⟨f, g⟩ = \int_0^1 f(t) \cdot g(t)\, dt.$$
Using properties of integrals it is easy to verify that this is an inner product.
Example 7.4. Let V be the vector space as above and f(x) = sin x, g(x) = cos x. Compute ⟨f, g⟩.
Solution: We have
$$⟨f, g⟩ = \int_0^1 \sin x \cos x \, dx = \frac{1}{2} \int_0^1 \sin(2x)\, dx = \frac{1}{4}(-\cos 2 + \cos 0) = \frac{1 - \cos 2}{4}.$$
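A quick numerical sanity check of this integral (not in the text) can be done with a midpoint-rule sum:

```python
import numpy as np

n = 100_000
x = (np.arange(n) + 0.5) / n              # midpoints of n subintervals of [0, 1]
approx = np.mean(np.sin(x) * np.cos(x))   # midpoint rule for ∫ sin(x) cos(x) dx
exact = (1 - np.cos(2)) / 4               # the closed form computed above
assert abs(approx - exact) < 1e-8
```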
Definition 7.1. Let V be a vector space and ⟨·, ·⟩ an inner product on V. Let u ∈ V. We call v orthogonal to u if ⟨u, v⟩ = 0, sometimes denoted by u ⊥ v. For a set S ⊂ V its orthogonal set $S^\perp$ is defined as
$$S^\perp := \{ v \in V \mid s \perp v \text{ for all } s \in S \}$$
$$\operatorname{proj}_u(v) = \frac{u \cdot v}{\|u\|} \cdot \frac{u}{\|u\|} = \frac{u \cdot v}{u \cdot u}\, u$$
If we want a vector perpendicular to u we have
$$w = v - \operatorname{proj}_u(v) = v - \frac{u \cdot v}{u \cdot u}\, u. \tag{7.2}$$
as in Eq. (2.3).
Exercise 7.1. Let u and v be vectors in an inner space with inner product ⟨·, ·⟩. Take
$$w = v - \frac{⟨u, v⟩}{⟨u, u⟩}\, u.$$
Prove that w is orthogonal to u.
Notice that for any inner product
$$\overline{⟨u, u⟩} = ⟨u, u⟩.$$
Hence ⟨u, u⟩ ∈ R and the following definition makes sense.
Definition 7.2. An inner product is positive definite if the following hold:
i) ⟨u, u⟩ ≥ 0 for all u ∈ V.
ii) ⟨u, u⟩ > 0 if and only if u ≠ 0.
Theorem 7.1. Let V be a finite dimensional vector space over k with a positive definite inner product. If W is a subspace of V
then
V = W ⊕ W⊥ .
Moreover,
dim V = dim W + dim W ⊥ .
Proof. Exercise.
Next we study separately vector spaces over R and those over C.
$$⟨\cdot, \cdot⟩ : V \times V \longrightarrow \mathbb{R}$$
The most common inner product on the Euclidean space $\mathbb{R}^n$ is the dot product. For vectors in $\mathbb{C}^n$,
$$u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} \quad \text{and} \quad v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix},$$
we define
$$⟨u, v⟩ = u_1 \bar{v}_1 + \cdots + u_n \bar{v}_n.$$
Show that this is a Hermitian product. This particular product we will call the Euclidean inner product.
Notice that for the Euclidean inner product ⟨·, ·⟩
Let V be the space of continuous complex-valued functions
$$f : [0, 1] \longrightarrow \mathbb{C}.$$
For f, g ∈ V we define
$$⟨f, g⟩ = \int_0^1 f(t) \cdot \overline{g(t)}\, dt.$$
Using properties of complex integrals show that this is an inner product.
Example 7.7. (Fourier series) Let V be the space of continuous complex-valued functions
$$f : [-\pi, \pi] \longrightarrow \mathbb{C}.$$
For f, g ∈ V we define
$$⟨f, g⟩ = \int_{-\pi}^{\pi} f(t) \cdot \overline{g(t)}\, dt.$$
For any integer n define
$$f_n(t) = e^{n i t}.$$
Verify that:
i) if m ≠ n then ⟨f_n, f_m⟩ = 0
ii) ⟨f_n, f_n⟩ = 2π
iii) $\dfrac{⟨f, f_n⟩}{⟨f_n, f_n⟩} = \dfrac{1}{2\pi} \displaystyle\int_{-\pi}^{\pi} f(t)\, e^{-int}\, dt.$
The quantity ⟨f, f_n⟩ / ⟨f_n, f_n⟩ is called the Fourier coefficient of f with respect to $f_n$.
Exercises:
7.1. Let V = R² with the Euclidean inner product. As a review of Chapter 2 prove the following for any u, v ∈ V.
i) $\|u + v\|^2 = \|u\|^2 + 2⟨u, v⟩ + \|v\|^2$
ii) $\|u + v\| \leq \|u\| + \|v\|$
iii) $\|u\| = 0$ if and only if u = 0.
iv) $|⟨u, v⟩| \leq \|u\| \cdot \|v\|$

7.2. Let V be the space of real continuous functions
$$f : [0, 1] \longrightarrow \mathbb{R}.$$
For f, g ∈ V we define
$$⟨f, g⟩ = \int_0^1 f(t) \cdot g(t)\, dt.$$
Given $f(x) = x^3$, find g(x) ∈ V such that g is orthogonal to f.

7.3. Let V be the vector space as in the previous exercise and W the set of all polynomials in V. Is W a subspace of V? Given a polynomial
$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$
can you find g(x) ∈ V such that ⟨f, g⟩ = 0?

7.4. Let V := Mat_n(R). Define the inner product of matrices M and N as
$$⟨M, N⟩ = \operatorname{tr}(MN)$$
Show that this is an inner product and it is non-degenerate.

7.5. Prove the Schwarz inequality
$$|⟨u, v⟩| \leq \|u\| \cdot \|v\|$$
for the Hermitian product.

7.6. Prove the following for the Hermitian product:
i) $\|u\| \geq 0$
ii) $\|u\| = 0$ if and only if u = 0.
iii) $\|\alpha u\| = |\alpha|\, \|u\|$
iv) $\|u + v\| \leq \|u\| + \|v\|$

7.7. Let V := Mat_2(R). Let A, B be any matrices in V such that
$$A := \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix}, \quad \text{and} \quad B := \begin{pmatrix} b_1 & b_2 \\ b_3 & b_4 \end{pmatrix}$$
Is the following
$$⟨A, B⟩ = a_1 b_1 + a_2 b_2 + a_3 b_3 + a_4 b_4$$
an inner product on V?
7.8. Let P₂ denote the space of polynomials in k[x] of degree ≤ 2. Let f, g ∈ P₂ such that

7.9. Let P₂ be equipped with the inner product as in the above example. Describe all the polynomials of norm 1.

$$⟨v_i, v_j⟩ = 0.$$
If, in addition, all vectors have norm one then they are called orthonormal.
Lemma 7.1. Orthonormal vectors in $\mathbb{R}^n$ are linearly independent.
Proof. Exercise.
Exercise 7.2. Is the above lemma true for any inner space?
Exercise 7.3 (Pythagorean theorem). Consider vectors u, v ∈ $\mathbb{R}^n$. Prove that
$$\|u + v\|^2 = \|u\|^2 + \|v\|^2$$
holds if and only if u and v are orthogonal. Is this true in any inner space?
Theorem 7.2. If $v_1, \ldots, v_n$ are linearly independent then there is an orthogonal set $w_1, \ldots, w_n$ such that
$$\begin{aligned}
w_1 &= v_1 \\
w_2 &= v_2 - \frac{⟨v_2, w_1⟩}{⟨w_1, w_1⟩}\, w_1 \\
w_3 &= v_3 - \frac{⟨v_3, w_2⟩}{⟨w_2, w_2⟩}\, w_2 - \frac{⟨v_3, w_1⟩}{⟨w_1, w_1⟩}\, w_1 \\
&\;\;\vdots \\
w_{i+1} &= v_{i+1} - \frac{⟨v_{i+1}, w_i⟩}{⟨w_i, w_i⟩}\, w_i - \cdots - \frac{⟨v_{i+1}, w_1⟩}{⟨w_1, w_1⟩}\, w_1
\end{aligned} \tag{7.3}$$
The reader can check that this is the desired orthogonal set.
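The recursion of Eq. (7.3) translates directly into code. The following sketch (an illustration, not from the text) implements it for the dot product on $\mathbb{R}^n$; the input vectors are assumed linearly independent:

```python
import numpy as np

def gram_schmidt(vs):
    """Orthogonalize the rows of vs by the recursion of Eq. (7.3)."""
    ws = []
    for v in vs:
        w = v.astype(float)
        for u in ws:
            # subtract the projection of the original v onto each earlier w_i
            w -= (v @ u) / (u @ u) * u
        ws.append(w)
    return np.array(ws)

W = gram_schmidt(np.array([[1.0, 1, 0], [1, 0, 1], [0, 1, 1]]))
G = W @ W.T
# the Gram matrix of the output is diagonal: every pair is orthogonal
assert np.allclose(G, np.diag(np.diag(G)))
```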
Let
B = {v1 , . . . , vn }
ii) Let
w1 := v1
Example 7.8. Let V = R³ and the inner product on V be the dot product. Let
$$v_1 = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix}$$
Let V be the space of real continuous functions
$$f : [0, 1] \longrightarrow \mathbb{R}.$$
For f, g ∈ V we define
$$⟨f, g⟩ = \int_0^1 f(x) \cdot g(x)\, dx.$$
As shown above, this is an inner product. Let
$$f(x) = x^3, \qquad g(x) = 1.$$
Since both are continuous, f, g ∈ V. Find an orthogonal basis of Span(f, g).
Solution: Take
$$S = \{f, 1\}.$$
We want to find an orthogonal set W such that f ∈ W. Let $w_1 = f$. Then
$$w_2 = 1 - \frac{⟨1, f⟩}{⟨f, f⟩}\, f = 1 - \frac{\int_0^1 x^3\, dx}{\int_0^1 x^6\, dx}\, x^3 = 1 - \frac{7}{4}\, x^3$$
for all i = 1, . . . , m.
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 1 & 1 & 1 \end{pmatrix}$$
Exercises:
7.11. Find an orthogonal basis for the nullspace of the matrix
$$A := \begin{pmatrix} 2 & -2 & 14 \\ 0 & 3 & -7 \\ 0 & 0 & 2 \end{pmatrix}$$

7.12. Find an orthogonal basis for the nullspace of the matrix
$$\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 0 \\ 1 & 1 & 3 \end{pmatrix}$$

7.13. Find an orthogonal basis for the nullspace of the matrix
$$A = \begin{pmatrix} 3 & 1 & 0 & -1 \\ 4 & 0 & 0 & 3 \\ -4 & 2 & 2 & -3 \\ 2 & -4 & 0 & 7 \end{pmatrix}$$

7.14. Let V be the space of real continuous functions. Given $f(x) = x^2$ and $g(x) = e^x$, find an orthogonal set W = {w₁, w₂}

7.15. In the space of real continuous functions find a function g(x) which is orthogonal to f(x) = sin x.

7.16. Show that the following identity holds for any inner product
$$\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$$

7.17. Let V = R⁴ with the dot product as inner product. Let
$$v_1 = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}, \; v_2 = \begin{pmatrix} 2 \\ 0 \\ 2 \\ 1 \end{pmatrix}, \; v_3 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \; v_4 = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}, \; v_5 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 2 \end{pmatrix}$$
be given. Find an orthogonal basis of Span(v₁, v₂, v₃, v₄, v₅).

7.18. Let P₂ denote the space of polynomials in k[x] of degree ≤ 2. Let f, g ∈ P₂ such that
$$f(x) = a_2 x^2 + a_1 x + a_0, \quad \text{and} \quad g(x) = b_2 x^2 + b_1 x + b_0.$$
Define
$$⟨f, g⟩ = a_0 b_0 + a_1 b_1 + a_2 b_2.$$
Let f₁, f₂, f₃, f₄ be given as below
$$f_1 = x^2 + 3, \qquad f_2 = 1 - x, \qquad f_3 = 2x^2 + x + 1 \tag{7.5}$$
Find an orthogonal basis of Span(f₁, f₂, f₃, f₄).

7.19. Find an orthogonal basis for the subspace Span(1, x, √x) of the vector space C[0, 1] of continuous functions on [0, 1], where
$$⟨f, g⟩ = \int_0^1 f(x) g(x)\, dx.$$

7.20. Find an orthonormal basis for the plane
$$x + 7y - z = 0.$$
If the ellipse
$$x_1^2 + \frac{x_2^2}{4} = 1$$
is mapped by a suitable linear transformation, then it is transformed to the unit circle $x_1^2 + x_2^2 = 1$. So linear transformations change shapes of objects in R².
How should a linear transformation be such that it preserves shapes? Obviously, it has to preserve distances.
This motivates the following:
Definition 7.3. A linear transformation T : Rn → Rn is called orthogonal if it preserves the length:
||T(x)|| = ||x||,
for all x ∈ Rn . The corresponding matrix of an orthogonal map is called an orthogonal matrix.
Proposition 7.1. Orthogonal transformations preserve orthogonality. In other words, if u is orthogonal to v then T(u) is
orthogonal to T(v).
We have
||T(u) + T(v)||2 = ||T(u + v)||2 = ||(u + v)||2 = ||u||2 + ||v||2 = ||T(u)||2 + ||T(v)||2
Theorem 7.4. i) A linear transformation T : Rⁿ → Rⁿ is orthogonal if and only if the image of the standard basis is an orthonormal basis.
ii) An n × n matrix is orthogonal if and only if its columns form an orthonormal basis for Rⁿ.
Proof. Let B = {e1 , . . . , en } be the standard basis. Then, by Proposition 7.1 the set
{T(e1 ), . . . , T(en )}
Theorem 7.5. Let A be an orthogonal matrix. Then
$$A^{-1} = A^T.$$
Proof. Let A be a given orthogonal matrix. From Theorem 7.4 its columns form an orthonormal basis, say
$$A = \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \ldots & v_n \\ | & | & & | \end{pmatrix}$$
Then,
$$A^T A = \begin{pmatrix} - & v_1^T & - \\ - & v_2^T & - \\ & \vdots & \\ - & v_n^T & - \end{pmatrix} \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \ldots & v_n \\ | & | & & | \end{pmatrix} = \begin{pmatrix} v_1 \cdot v_1 & v_1 \cdot v_2 & \cdots & v_1 \cdot v_n \\ v_2 \cdot v_1 & v_2 \cdot v_2 & \cdots & v_2 \cdot v_n \\ \vdots & \vdots & \ddots & \vdots \\ v_n \cdot v_1 & v_n \cdot v_2 & \cdots & v_n \cdot v_n \end{pmatrix} = I,$$
since $v_1, \ldots, v_n$ is an orthonormal set.
Another property of orthogonal matrices is the following.
Proposition 7.2. Let A be an orthogonal matrix. For all u and v ∈ Rn we have that
u · v = (Au) · (Av)
Proof. Exercise.
Summarizing all properties of orthogonal matrices we have the following.
Theorem 7.6. Let A be an n × n matrix. Then the following statements are equivalent.
i) A is an orthogonal matrix
ii) The transformation T(x) = Ax preserves length (in other words ‖Ax‖ = ‖x‖).
iii) The columns of A form an orthonormal basis
iv) AT A = In .
v) A−1 = AT .
vi) A preserves the dot product, in other words u · v = (Au) · (Av).
Proof. Combine all the results proved above to show the equivalence of these statements.
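These equivalences are easy to observe numerically. The sketch below (not from the text) checks iv), v), ii), and vi) for a rotation matrix, the standard example of an orthogonal matrix:

```python
import numpy as np

theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a rotation, hence orthogonal

x = np.array([3.0, -1.0])
y = np.array([0.5, 2.0])

assert np.allclose(Q.T @ Q, np.eye(2))                        # iv) Q^T Q = I
assert np.allclose(np.linalg.inv(Q), Q.T)                     # v)  Q^{-1} = Q^T
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))   # ii) preserves length
assert np.isclose((Q @ x) @ (Q @ y), x @ y)                   # vi) preserves dot product
```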
Since we will be using the transpose more often in the coming sections let’s summarize some of its properties.
Proposition 7.3. The following are true:
i) $(A + B)^T = A^T + B^T$
ii) $(rA)^T = r A^T$
iii) $(AB)^T = B^T A^T$
iv) $\operatorname{rank} A^T = \operatorname{rank} A$
v) $(A^T)^{-1} = (A^{-1})^T$.
Proof. Exercise
Proof. Let V be a subspace of Rⁿ with an orthonormal basis $u_1, \ldots, u_m$. The projection of x onto V is given by
$$\operatorname{proj}_V(x) = (u_1 \cdot x)\, u_1 + \cdots + (u_m \cdot x)\, u_m = u_1 u_1^T x + \cdots + u_m u_m^T x = \begin{pmatrix} | & & | \\ u_1 & \cdots & u_m \\ | & & | \end{pmatrix} \begin{pmatrix} - & u_1^T & - \\ & \vdots & \\ - & u_m^T & - \end{pmatrix} x = Q Q^T x$$
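The formula $\operatorname{proj}_V(x) = QQ^T x$ is simple to exercise in code. In this sketch (an illustration with an assumed orthonormal pair, not from the text) we build P = QQᵀ and verify the defining properties of an orthogonal projection:

```python
import numpy as np

# An orthonormal basis of a 2-dimensional subspace V of R^4 (assumed example).
u1 = np.array([1.0, 1, 1, 1]) / 2
u2 = np.array([1.0, -1, -1, 1]) / 2
Q = np.column_stack([u1, u2])

P = Q @ Q.T                        # proj_V(x) = Q Q^T x
x = np.array([4.0, 0, 2, -2])
p = P @ x

assert np.allclose(P @ p, p)       # projecting twice changes nothing: P^2 = P
assert np.allclose(P @ u1, u1)     # vectors already in V are fixed
assert np.isclose((x - p) @ u1, 0) # the residual x - p is orthogonal to V
assert np.isclose((x - p) @ u2, 0)
```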
Let us go back to the case of the line.
Example 7.12. Consider a line L in R2 with equation
L : y = ax + b.
Find the matrix of the orthogonal projection onto L.
Proof. We have noted before that if the line doesn’t pass through the origin, the projection is not even a linear map.
However, let’s just pretend that we don’t even know that.
Notice that a directional vector for L is
$$u = \begin{pmatrix} -b/a \\ b \end{pmatrix} = \frac{b}{a} \begin{pmatrix} -1 \\ a \end{pmatrix}.$$
Its norm is $\|u\| = \frac{b}{a}\sqrt{a^2 + 1}$. Hence, {v} is an orthonormal basis for L, where
$$v = \frac{u}{\|u\|} = \frac{1}{\sqrt{a^2 + 1}} \begin{pmatrix} -1 \\ a \end{pmatrix}.$$
Notice that there is no b anymore in this vector. That’s because this is a vector not on the original line, but on the
line parallel to L which goes through the origin.
Then from Theorem 7.7 the matrix P is
" #! ! " #
1 −1 1 h i 1 1 −a
P= √ √ −1 a = 2 .
a2 + 1 a a2 + 1 a2 + 1 −a a
Let us check how this will work with the directional vector u ∈ L. We have
−(a + 1)
" # " #! " 2 # " #
1 1 −a b −1 b b −1
Pu = 2 · = = = u,
a(a2 + 1 a(a2 + 1)
2
a + 1 −a a a a a a
as expected.
We have already seen the above example; see Fig. 1.2.
Exercise 7.4. Find the matrix of the orthogonal projection onto the subspace V of R⁴ such that V = Span(u, v), where
$$u = \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} \quad \text{and} \quad v = \frac{1}{2} \begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix}$$
Notice that u and v already form an orthonormal basis for V.
Exercise 7.5. Given a plane P in R3 going through the origin, say with equation
ax + by + cz = 0.
Find the matrix of the orthogonal map onto P.
Exercises:
7.21. Check whether the matrix is orthogonal
$$\begin{pmatrix} 1 & 1 & -1 \\ 3 & 2 & -5 \\ 2 & 2 & 0 \end{pmatrix}$$

7.22. For a field k the space Mat_{n×n}(k) of n × n matrices with entries in k is a vector space; see ??. Consider the map T : Mat_{n×n}(k) → Mat_{n×n}(k) given by
$$T(A) = A^T.$$
Is this a linear map? Prove your answer.
x x1 x2 x3 x4 ... xn
y y1 y2 y3 y4 ... yn
Table 7.1
Geometrically two of these points (xi , yi ) determine a line. However, we are looking for the line that is "closest"
to all the given points. Let us assume that the equation of f (x) is given by
f (x) = ax + b.
Then we have
$$y_i = a x_i + b, \quad \text{for } i = 1, \ldots, n.$$
In matrix notation we have
$$\begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a x_1 + b \\ a x_2 + b \\ \vdots \\ a x_n + b \end{pmatrix}$$
or we write this as
$$A\, v \approx \vec{y},$$
where
$$A = \begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix}, \qquad v = \begin{pmatrix} a \\ b \end{pmatrix}, \qquad \vec{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$
The problem now becomes to determine $v = \begin{pmatrix} a \\ b \end{pmatrix}$ such that the error vector $Av - \vec{y}$ is minimal. The concept of minimal depends on the type of application. The method of least squares is based on the idea that we require that the magnitude $\|Av - \vec{y}\|$ is minimal. Denote $d := Av - \vec{y}$. Then, $d_i = a x_i + b - y_i$. Minimizing $\|Av - \vec{y}\|$ means minimizing $\|Av - \vec{y}\|^2$, which means minimizing
$$\sum_{i=1}^{n} (a x_i + b - y_i)^2$$
for all v ∈ W. Because the dot product is a non-degenerate inner product then
$$A^T A v_0 - A^T \vec{y} = 0$$
and
v0 = (AT A)−1 AT ~y
The matrix
P := (AT A)−1 AT
is sometimes called the projection matrix of A.
Next we provide some examples for different polynomial approximations.
Example 7.13. Let the following data be given
x | 1 2 2 5
y | 2 3 5 7
Find the line that best fits the data.
Solution: Then
$$A := \begin{pmatrix} 1 & 1 \\ 2 & 1 \\ 2 & 1 \\ 5 & 1 \end{pmatrix}, \quad \text{and} \quad \vec{y} = \begin{pmatrix} 2 \\ 3 \\ 5 \\ 7 \end{pmatrix}$$
We have
$$A^T A = \begin{pmatrix} 34 & 10 \\ 10 & 4 \end{pmatrix}$$
The least squares solution is
$$v_0 = (A^T A)^{-1} A^T \vec{y} = \frac{1}{6} \begin{pmatrix} 7 \\ 8 \end{pmatrix}$$
Hence, the best fitting line to the above data is
$$y = \frac{7}{6}\, x + \frac{4}{3}$$
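This fit can be reproduced with a standard least squares routine. The sketch below (an illustration, not from the text) solves the same problem with `numpy.linalg.lstsq` and checks it against the normal equations:

```python
import numpy as np

x = np.array([1.0, 2, 2, 5])
y = np.array([2.0, 3, 5, 7])
A = np.column_stack([x, np.ones_like(x)])   # rows (x_i, 1)

v0, *_ = np.linalg.lstsq(A, y, rcond=None)  # minimizes ||A v - y||
assert np.allclose(v0, [7/6, 4/3])          # the line y = (7/6) x + 4/3

# same answer from the normal equations (A^T A) v0 = A^T y
assert np.allclose(np.linalg.solve(A.T @ A, A.T @ y), v0)
```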
As we will see in the next example the least squares method has its limitations. As expected not everything in
applications is linear. If we approximate a given data with a linear model then this model might not fit the data
very well. In the next example we see that sometimes such an approximation is not close at all to the data.
Example 7.14. Let the following data be given
x | 1 2 3 4 5
y | 2 5 4 7 2
Find a linear function that best fits the data.
Solution: Then
$$A := \begin{pmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \\ 4 & 1 \\ 5 & 1 \end{pmatrix}, \quad \text{and} \quad \vec{y} = \begin{pmatrix} 2 \\ 5 \\ 4 \\ 7 \\ 2 \end{pmatrix}$$
The least squares solution is
$$v_0 = (A^T A)^{-1} A^T \vec{y} = \frac{1}{5} \begin{pmatrix} 1 \\ 17 \end{pmatrix}$$
Figure 7.1: Fitting of the above data by the least squares method.
Let
$$A\, v \approx \vec{y}$$
be given. A least squares solution of the matrix equation $A v \approx \vec{y}$ is a vector $v_0$ such that $\|A v_0 - \vec{y}\| \leq \|A v - \vec{y}\|$ for all v.
Let $v_1$ and $v_2$ denote the column vectors of A. The vector $Av = a v_1 + b v_2$ lies in the space W = Span(v₁, v₂). We want to find a vector $v_0 \in W$ such that the dot product $Av \cdot (A v_0 - \vec{y}) = 0$ for all v ∈ W. Then we have
$$(Av)^T (A v_0 - \vec{y}) = v^T (A^T A v_0 - A^T \vec{y}) = 0$$
for all v ∈ W. Because the dot product is a non-degenerate inner product then
$$A^T A v_0 - A^T \vec{y} = 0$$
and
v0 = (AT A)−1 AT ~y
The matrix
P := (AT A)−1 AT
is called the projection matrix of A. The vector Av0 is called the orthogonal projection of ~y on the column space of
A.
x x1 x2 x3 x4 ... xr
y y1 y2 y3 y4 ... yr
Table 7.2
v0 = (AT A)−1 AT ~y
Example 7.15. Let the following data be given as in the previous example.
x 1 2 3 4 5
y 2 5 4 7 2
Find a polynomial of degree 2 that best fits the data.
Solution: Then
$$A := \begin{pmatrix} 1 & 1 & 1 \\ 4 & 2 & 1 \\ 9 & 3 & 1 \\ 16 & 4 & 1 \\ 25 & 5 & 1 \end{pmatrix}, \quad \text{and} \quad \vec{y} = \begin{pmatrix} 2 \\ 5 \\ 4 \\ 7 \\ 2 \end{pmatrix}$$
The least squares solution is
Figure 7.2: Fitting of the above data by the least squares method.
$$v_0 = (A^T A)^{-1} A^T \vec{y} = \begin{pmatrix} -6/7 \\ 187/35 \\ -13/5 \end{pmatrix}$$
Hence, the best fitting degree 2 polynomial to the above data is
$$y = -\frac{6}{7}\, x^2 + \frac{187}{35}\, x - \frac{13}{5}$$
Fig. 7.2 presents the graph of the data and of the function. Notice how we get a better approximation than in the linear case.
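The quadratic fit, too, can be reproduced with `numpy.linalg.lstsq`; the only change from the linear case is the design matrix. A short sketch (not from the text):

```python
import numpy as np

x = np.array([1.0, 2, 3, 4, 5])
y = np.array([2.0, 5, 4, 7, 2])
A = np.column_stack([x**2, x, np.ones_like(x)])  # rows (x_i^2, x_i, 1)

v0, *_ = np.linalg.lstsq(A, y, rcond=None)
# y = -(6/7) x^2 + (187/35) x - 13/5, matching the computation above
assert np.allclose(v0, [-6/7, 187/35, -13/5])
```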
Example 7.16. Find degree 3 and 4 polynomials that approximate the data of the previous example.
$$y = -\frac{1}{3}\, x^3 + \frac{15}{7}\, x^2 - \frac{53}{21}\, x + 3$$
The graph is presented in Fig. 7.3. Compare this with the degree 1 and 2 polynomials to see that we get a better fit.
Since we have five points on the plane, by using a degree 4 polynomial we are able to find a polynomial that will pass through the points. The least squares method will find this unique solution when it exists. In this case, the degree 4 polynomial that fits the data is
$$y = -\frac{5}{6}\, x^4 + \frac{29}{3}\, x^3 - \frac{235}{6}\, x^2 + \frac{196}{3}\, x - 33$$
and the graph is presented in Fig. 7.4.
The least squares method can be used for many other applications. $(A^T A)^{-1}$ exists if A has independent column vectors. Thus, we have a unique least squares solution if null(A) = {0}.
Example 7.17. Find the least squares solution to the system
$$\begin{aligned}
x_1 - x_2 &= 4 \\
3x_1 + 2x_2 + x_3 &= 3 \\
3x_1 + 2x_2 - 5x_3 &= 1 \\
2x_1 + x_2 - x_3 &= 3
\end{aligned}$$
Solution: We have
$$A\, v \approx \vec{y}$$
where
$$A = \begin{pmatrix} 1 & -1 & 0 \\ 3 & 2 & 1 \\ 3 & 2 & -5 \\ 2 & 1 & -1 \end{pmatrix}, \qquad \vec{y} = \begin{pmatrix} 4 \\ 3 \\ 1 \\ 3 \end{pmatrix}.$$
Then the orthogonal projection of $\vec{y}$ on the column space of A is the vector $A v_0$ given by
$$A v_0 = \begin{pmatrix} 1142/275 \\ 179/55 \\ 331/275 \\ 123/55 \end{pmatrix} \approx \begin{pmatrix} 4.1527 \\ 3.2545 \\ 1.2036 \\ 2.2364 \end{pmatrix}.$$
Exercises:
7.23. Let the following data be given
x | 1 2 3 4 5
y | 8 13 18 23 28
Find a linear function that best fits the data.

7.24. Let the following data be given
x | 0 1 3 2 -2
y | 2 1 -1 -5 4
Find a linear function that best fits the data.

7.25. Find a degree 2 polynomial that best fits the following data:
x | 0 1 3 2 5
y | -2 -1 2 4 2

7.26. Find a degree 3 polynomial that best fits the data:
x | 0 1 3 5 -2
y | 2 3 5 0 -4

7.27. Find the least squares solution to the system
$$\begin{aligned}
x_1 - x_2 &= 4 \\
3x_1 + 2x_2 &= 3 \\
3x_1 + 2x_2 &= 1
\end{aligned}$$
and the corresponding orthogonal projection.

7.28. Find the least squares solution to the system
$$\begin{aligned}
x_1 - 11x_2 &= 1 \\
3x_1 + x_3 &= 2 \\
x_1 + 2x_2 &= 1 \\
2x_1 + x_2 - x_3 &= 31
\end{aligned}$$
and the corresponding orthogonal projection.

7.29. Find the least squares solution to the system
$$\begin{aligned}
5x_1 - 12x_2 &= 4 \\
x_1 + 3x_2 &= -2 \\
6x_1 + 2x_2 &= -1
\end{aligned}$$
and the corresponding orthogonal projection.

7.30. Find the least squares solution to the system
$$\begin{aligned}
3x_1 - x_2 + 3x_3 &= 4 \\
3x_1 + 7x_2 + x_3 &= 3 \\
3x_1 + 2x_2 - x_3 &= 21 \\
2x_1 + x_2 - x_3 &= 4
\end{aligned}$$
$$c_i := ⟨v_i, v_i⟩$$
for i = 1, . . . , n. We can reorder the basis B such that
where p + s + r = n. Sylvester's theorem says that the numbers p, s, r don't depend on the choice of the orthogonal basis B. We normalize the basis as follows. Let
$$v_i' := \begin{cases} v_i, & \text{if } c_i = 0 \\[4pt] \dfrac{v_i}{\sqrt{c_i}}, & \text{if } c_i > 0 \\[4pt] \dfrac{v_i}{\sqrt{-c_i}}, & \text{if } c_i < 0 \end{cases} \tag{7.8}$$
Exercises:
What is the signature of Rⁿ with the usual Euclidean inner product?

7.31. Let W be the space generated by
$$v_1 = (1, 2, 3, 4), \quad v_2 = (2, 0, 2, 1), \quad v_3 = (1, 1, 1, 1), \quad v_4 = (1, 2, 3, 4), \quad v_5 = (0, 0, 1, 2). \tag{7.9}$$
Find the signature of the Euclidean product for W.

7.32. Let P₂ denote the space of polynomials in k[x] of degree ≤ 2. Let f, g ∈ P₂ such that
$$f(x) = a_2 x^2 + a_1 x + a_0, \quad \text{and} \quad g(x) = b_2 x^2 + b_1 x + b_0.$$
Define
$$⟨f, g⟩ = a_0 b_0 + a_1 b_1 + a_2 b_2.$$
Find the signature of this inner product for P₂.
$$\varphi_i(v_j) := \begin{cases} 1, & j = i \\ 0, & j \neq i \end{cases} \tag{7.10}$$
The functionals $\{\varphi_1, \ldots, \varphi_n\}$ form a basis for $V^\star$.
Proof. Exercise.
Definition 7.5. The basis {φ1 , . . . , φn } of V ? is called the dual basis.
The dual space is a very important concept in linear algebra. Below we give a few more examples of functionals
which are important in different areas of mathematics.
Example 7.19. Let V be a vector space over k with scalar product h·, ·i. Fix an element u ∈ V. The map
$$V \longrightarrow k, \qquad v \mapsto ⟨v, u⟩$$
is a functional.
Example 7.20. Let V be a vector space of continuous real-valued functions on the interval [0, 1]. Define
δ : V −→ R
such that δ( f ) = f (0). Then δ is a functional called the Dirac functional.
Theorem 7.10. Let V be a finite dimensional vector space over k with a non-degenerate scalar product. The map
$$\Phi : V \longrightarrow V^\star, \qquad v \mapsto L_v, \tag{7.11}$$
where $L_v$ is the functional $L_v(w) := ⟨w, v⟩$, is an isomorphism.
Proof. See for example [Lan87, pg. 128].
Exercises:
7.33. Let V be a vector space of finite dimension. Prove that $\dim V = \dim V^\star$.

7.34. Let V = Mat_{n×n}(R). Describe $V^\star$.

7.35. Let V = R^{2n}. Describe $V^\star$.
Chapter 8
Symmetric matrices
The theory of symmetric matrices is tied closely with the classical theory of quadratic forms. In the first section we give a brief overview of quadratic forms and how to associate them with matrices. In the next few sections we study in more detail the symmetric matrices. The main focus of this chapter can be summarized in the problem of diagonalizing a quadratic form. For example, a quadratic equation can be brought to a diagonal form such as
$$\frac{x^2}{4} + \frac{y^2}{9} + \frac{z^2}{16} = 1,$$
which makes it a lot easier to recognize its shape.
∆ f = −4 det M f
Remark 8.1. There are many authors who define binary forms as
Change of coordinates

A change of coordinates is any linear map R² → R² given by some matrix $M = \begin{pmatrix} \lambda_1 & \lambda_2 \\ \lambda_3 & \lambda_4 \end{pmatrix} \in \operatorname{Mat}_2(\mathbb{R})$,
$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} \lambda_1 & \lambda_2 \\ \lambda_3 & \lambda_4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$
Then
$$f(\lambda_1 x + \lambda_2 y,\; \lambda_3 x + \lambda_4 y) = (a\lambda_1^2 + b\lambda_1\lambda_3 + c\lambda_3^2)\, x^2 + (2a\lambda_1\lambda_2 + b\lambda_1\lambda_4 + b\lambda_2\lambda_3 + 2c\lambda_3\lambda_4)\, xy + (a\lambda_2^2 + b\lambda_2\lambda_4 + c\lambda_4^2)\, y^2.$$
So we have
$$f(\lambda_1 x + \lambda_2 y,\; \lambda_3 x + \lambda_4 y) = v^T\, M^T M_f M\, v.$$
Exercise 8.1. If A is a symmetric matrix, then for any matrix B, the matrix BT AB is symmetric.
Two binary quadratic forms f (x, y) and g(x, y) are called equivalent if they are related through an invertible
change of coordinates. In other words, if there exists an invertible matrix M ∈ GL2 (R) such that
So we are ready to answer the natural question: if two binary forms f and g are related by a change of coordinates,
how are their matrices M f and M g related?
Lemma 8.2. Two binary quadratic forms f and g are related through a change of coordinates M ∈ Mat2 (R) if and only if their
corresponding matrices satisfy,
M f = MT M g M.
Two binary forms are equivalent over R if and only if there exists M ∈ GL2 (R) such that
M f = Mt M g M.
Proof. We already proved the first claim. To prove the second claim let us assume that M ∈ GL2 (R) such that
M f = MT M g M.
Two matrices A and B are called congruent over R if there is an invertible M ∈ GL2 (R) such that
A = MT BM.
for some fixed constant d ∈ R. Can we somehow use the matrix $M_f$ to determine the shape of the graph? Without any loss of generality we can assume that d = 1 by applying the transformation
$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} \sqrt{d}\, x \\ \sqrt{d}\, y \end{pmatrix}.$$
If the quadratic were of the form
$$f(x, y) = \alpha_1 x^2 + \alpha_2 y^2 \tag{8.1}$$
then this would be much easier to graph. We have seen such graphs from high school. We call binary quadratics as in Eq. (8.1) diagonal, since their corresponding matrices are diagonal
$$\begin{pmatrix} \alpha_1 & 0 \\ 0 & \alpha_2 \end{pmatrix}$$
So how can we change a binary quadratic to a diagonal quadratic? We know how to diagonalize matrices. So
maybe the same procedure can be used to diagonalize quadratics? Let’s give it a try.
The characteristic polynomial of $A = M_f$ is
$$\operatorname{char}(A, x) = x^2 - (a + c)\, x + ac - \frac{b^2}{4}.$$
Its eigenvalues are
$$\lambda_1 = \frac{a+c}{2} + \frac{\sqrt{(a-c)^2 + b^2}}{2} \quad \text{and} \quad \lambda_2 = \frac{a+c}{2} - \frac{\sqrt{(a-c)^2 + b^2}}{2},$$
and their corresponding eigenvectors are
$$v_1 = \begin{pmatrix} \dfrac{b}{(c-a) + \sqrt{(c-a)^2 + b^2}} \\[6pt] 1 \end{pmatrix} \quad \text{and} \quad v_2 = \begin{pmatrix} \dfrac{b}{(c-a) - \sqrt{(c-a)^2 + b^2}} \\[6pt] 1 \end{pmatrix}.$$
Hence,
$$P = \begin{pmatrix} \dfrac{b}{(c-a) + \sqrt{(c-a)^2 + b^2}} & \dfrac{b}{(c-a) - \sqrt{(c-a)^2 + b^2}} \\[6pt] 1 & 1 \end{pmatrix}.$$
Since D is given by
D = P−1 AP
we can make this work if somehow P−1 = PT as in Lemma 8.2. But we know exactly about matrices with this
property, thanks to Theorem 7.5. They are the orthogonal matrices.
So our next challenge becomes to find an orthogonal matrix P and a diagonal matrix D such that A = PDP−1 ,
or in other words to choose the eigenvectors in the diagonalization process such that the transition matrix C is
orthogonal.
the sign of f(x, 1) is determined by the following: if $\alpha_1 < \alpha_2$ are its roots, then f(x, 1) has the opposite sign of a in the interval $(\alpha_1, \alpha_2)$ and it has the sign of a everywhere else.
Let us make the substitution t = x/y. Then the sign of f(x, y) is the same as the sign of
$$g(t) = a t^2 + b t + c.$$
From the above discussion, this is always positive if and only if a > 0 and $\Delta_g = \Delta_f < 0$.
For values y = 0 we have $f(x, 0) = a x^2$, so f(x, y) is not positive definite since for x = 0 it is f(0, 0) = 0. This completes the proof.
$$q : \mathbb{R}^n \to \mathbb{R}$$
such that
$$q(x) = \sum_{i,j=1}^{n} a_{i,j}\, x_i x_j, \tag{8.2}$$
Exercise 8.2. Prove that for a given form q the matrix Aq is unique.
Exercise 8.3. Prove that the set Qn of all quadratic forms over R forms a subspace in the space of all functions from Rn to R.
What is the dimension of this space?
Binary forms are the simplest of all quadratic forms. Quadratic forms
$$q : \mathbb{R}^3 \to \mathbb{R}$$
$$q(x) = a_{1,1} x_1^2 + a_{1,2} x_1 x_2 + a_{1,3} x_1 x_3 + a_{2,2} x_2^2 + a_{2,3} x_2 x_3 + a_{3,3} x_3^2$$
are the next case. For example, the ellipsoid
$$q(x, y, z) = \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2}$$
has corresponding matrix
$$A_q = \begin{pmatrix} \frac{1}{a^2} & 0 & 0 \\ 0 & \frac{1}{b^2} & 0 \\ 0 & 0 & \frac{1}{c^2} \end{pmatrix}$$
Exercises:
8.1. Prove that there is a one to one correspondence between the set of positive definite quadratic forms and the upper half plane
$$\mathbb{H} = \{ z \in \mathbb{C} \mid \Im(z) > 0 \}.$$

8.2. Let
$$q(x, y) = a\, x^2 + 2b\, xy + c\, y^2$$
be a quadratic form. From methods of multivariable calculus determine the global extrema of this function. Can you accomplish this via linear algebra methods?
$$A = S^T D S$$
Expressing a matrix A in the above form would be beneficial for obvious reasons: not only do we change the basis of the vector space so that A becomes a diagonal matrix, but we do so preserving distances. The natural question is, which matrices are orthogonally diagonalizable? We will answer this question in the remainder of this lecture.
Lemma 8.5. If A is orthogonally diagonalizable then AT = A.
Proof. If A is orthogonally diagonalizable then there exists an orthogonal matrix $S \in GL_n(\mathbb{R})$ such that
$$A = S D S^T,$$
for some diagonal matrix D. Then,
AT = (SDST )T = (ST )T · DT · ST = A.
Hence, $A^T = A$.

Example 8.2. Let $A = \begin{pmatrix} 1 & 2 \\ 2 & -2 \end{pmatrix}$. Find S orthogonal such that $S^T A S$ is diagonal.
Solution: Since for an orthogonal matrix S we have $S^T = S^{-1}$, we are looking for a matrix such that $S^{-1} A S$ is diagonal. We follow the same method as in Section 5.3. The characteristic polynomial is
$$\operatorname{char}(A, \lambda) = (1 - \lambda)(-2 - \lambda) - 4 = \lambda^2 + \lambda - 6 = (\lambda - 2)(\lambda + 3)$$
For λ = 2 we have
$$E_2 = \operatorname{Span}\!\begin{pmatrix} 2 \\ 1 \end{pmatrix}$$
and for λ = −3 we have
$$E_{-3} = \operatorname{Span}\!\begin{pmatrix} -1 \\ 2 \end{pmatrix}$$
Then the matrices S and D are
$$S = \begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix}, \quad \text{and} \quad D = \begin{pmatrix} 2 & 0 \\ 0 & -3 \end{pmatrix}$$
The matrix S is not orthogonal, since its columns do not form an orthonormal basis for R². We can fix this by taking
$$S = \frac{1}{\sqrt{5}} \begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix}.$$
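Orthogonal diagonalization of a symmetric matrix is exactly what `numpy.linalg.eigh` computes. The sketch below (not from the text) checks Example 8.2 numerically:

```python
import numpy as np

A = np.array([[1.0, 2],
              [2, -2]])

# eigh returns eigenvalues in ascending order together with an
# orthonormal set of eigenvectors, i.e. an orthogonal S with S^T A S diagonal.
lam, S = np.linalg.eigh(A)
assert np.allclose(lam, [-3, 2])
assert np.allclose(S.T @ S, np.eye(2))          # S is orthogonal
assert np.allclose(S.T @ A @ S, np.diag(lam))   # S^T A S = D
```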
Next, we see how to do this in general.
Theorem 8.1. Let v1 and v2 be eigenvectors of a symmetric matrix A belonging to distinct eigenvalues λ1 and λ2 . Then, v1
and v2 are orthogonal.
Proof. Consider the product vT1 Av2 . Then we have,
Theorem 8.2. If A is a symmetric matrix, then all its eigenvalues are real.
Proof. Since complex eigenvalues occur in pairs via the conjugate, consider such a pair α ± iβ and the corresponding eigenvectors v ± iw, respectively. Note that
$$(v + iw)^T A (v - iw) = (v + iw)^T (\alpha - i\beta)(v - iw) = (\alpha - i\beta)\, (\|v\|^2 + \|w\|^2).$$
Also,
$$(v + iw)^T A (v - iw) = (A(v + iw))^T (v - iw) = (\alpha + i\beta)(v + iw)^T (v - iw) = (\alpha + i\beta)\, (\|v\|^2 + \|w\|^2).$$
Hence, α + iβ = α − iβ and we are done.
Exercise 8.4. Prove the above theorem using the definition of the dot product for vector spaces over C.
Next, we consider the main result of this lecture, the so called spectral theorem.
Theorem 8.3 (Spectral theorem). A matrix A is orthogonally diagonalizable if and only if A is symmetric.
B = {vi,1 , . . . , vi,si }
Theorem 8.4 (Principal Axes Theorem). Let q(x) be a quadratic form and A its corresponding matrix with
$$A = Q D Q^T$$
Proof. Let $q(x) = x^T A x$. By the Spectral Theorem there exist matrices Q and D such that Q is orthogonal and D is diagonal with the eigenvalues of A as entries in the main diagonal and
$$A = Q D Q^T.$$
Then we have
$$D = Q^{-1} A\, (Q^T)^{-1} = Q^T A Q.$$
$$q(Qx) = q\!\left( \frac{2x + y}{\sqrt{5}},\; \frac{x - 2y}{\sqrt{5}} \right) = 6x^2 + y^2,$$
Figure 8.2: The ellipse after the transformation
as expected.
Diagonalize the quadratic form q(x) and graph the equation again.
Proof. The corresponding matrix is
$$A = \begin{pmatrix} -7 & -6 \\ -6 & 2 \end{pmatrix}.$$
Its eigenvalues are $\lambda_1 = 5$ and $\lambda_2 = -10$ and the corresponding eigenvectors are
$$v_1 = \begin{pmatrix} -\tfrac{1}{2} \\ 1 \end{pmatrix} \quad \text{and} \quad v_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.$$
Normalizing them we have
$$v_1 = \frac{1}{\sqrt{5}} \begin{pmatrix} -1 \\ 2 \end{pmatrix}, \qquad v_2 = \frac{1}{\sqrt{5}} \begin{pmatrix} 2 \\ 1 \end{pmatrix}.$$
So the matrix for the coordinate change is
$$Q = \frac{1}{\sqrt{5}} \begin{pmatrix} -1 & 2 \\ 2 & 1 \end{pmatrix}.$$
If we check the change of coordinates we have
$$q(Qx) = q\!\left( \frac{-x + 2y}{\sqrt{5}},\; \frac{2x + y}{\sqrt{5}} \right) = 5x^2 - 10y^2.$$
The red graph is the initial one and the blue graph is the graph of the quadratic in the diagonal form.

Figure 8.3: The hyperbola after the transformation
Let us see another example.
Exercise 8.5. Let T : R2 → R2 be a linear map such that
T(x) = Ax,
for A a 2 × 2 invertible symmetric matrix. Show that the unit circle is mapped to an ellipse under T. Find the lengths of the
semi-major and the semi-minor axis of the ellipse in terms of the eigenvalues of A.
Solution: Since A is invertible, its eigenvalues $\lambda_1, \lambda_2$ are nonzero and real. Assume that $|\lambda_1| \geq |\lambda_2|$. We denote by $v_1, v_2$ the corresponding orthonormal eigenbasis.
Let $x = \begin{pmatrix} x \\ y \end{pmatrix}$ be a vector on the unit circle. Then,
x = v1 cos θ + v2 sin θ
Then,
T(x) = cos θ · T(v1 ) + sin θ · T(v2 ) = cos θ · (λ1 v1 ) + sin θ · (λ2 v2 )
which is on the ellipse with semi-major axis ||λ1 v1 || = |λ1 | and semi-minor axis ||λ2 v2 || = |λ2 |.
Notice that A is orthogonally decomposed as
A = Q diag( (5 − √17)/2 , (5 + √17)/2 ) Q^T
for a suitable orthogonal matrix Q.
" #
2 2
Example 8.5. The unit circe under the transformation x → Ax, where A = is transformed to the ellipse as shown
2 3
Fig. 8.4.
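This can be confirmed numerically. The numpy sketch below (our illustration, not part of the text) maps sample points of the unit circle through A and checks that, written in the eigenbasis coordinates, the image points satisfy an ellipse equation whose semi-axes are the eigenvalues of A:

```python
import numpy as np

# Map the unit circle through A = [[2,2],[2,3]] and verify the image is the
# ellipse with semi-axes |lambda_1|, |lambda_2| (eigenvalues (5 ± sqrt(17))/2).
A = np.array([[2.0, 2.0],
              [2.0, 3.0]])
lam, Q = np.linalg.eigh(A)                 # A = Q diag(lam) Q^T

theta = np.linspace(0, 2 * np.pi, 100)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # points on the unit circle
image = A @ circle                                   # their images under A

# In eigenbasis coordinates (u, v) = Q^T x, the image satisfies
# (u / lam1)^2 + (v / lam2)^2 = 1.
uv = Q.T @ image
check = (uv[0] / lam[0])**2 + (uv[1] / lam[1])**2
print(np.allclose(check, 1.0))
```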
f_Q = λ1 x^2 + λ2 y^2.
From high school we know that the corresponding graph is an ellipse if λ1 and λ2 have the same sign and a
hyperbola if they have different signs.
Example 8.6. Find the shape of the equation
Determine linear substitutions for x, y, z such that the monomials xy, xz, yz disappear.
A ternary quadratic form is a second degree polynomial in three variables x, y, z; the corresponding equation has the form
F(x, y, z) = h.
A is called the matrix associated with the quadratic form F(x, y, z). Sometimes it is useful to rotate the xyz-axes so that the equation of the above surface does not contain the terms xy, yz, xz. Such quadratic forms are called diagonal quadratic forms. This is equivalent to asking that the associated matrix be diagonal.
Example 8.8. Let the quadratic form q(x, y, z) = 2x^2 + 3y^2 + 3z^2 − 2xy − 2xz − 2yz be given. Its associated matrix is
A =
[  2 −1 −1 ]
[ −1  3 −1 ]
[ −1 −1  3 ]
in(A) := (n1 , n2 , n3 ),
where n1 , n2 , n3 denote the number of positive, negative, and zero eigenvalues of A, respectively.
Lemma 8.6. Let F(x, y, z) be a ternary quadratic form and A its associated matrix. The following are true:
with
char (A, x) = (x2 − 81)(x − 18).
So the eigenvalues are
λ1 = 18, λ2 = 9, λ3 = −9.
From Lemma 8.6 we already know that this is a hyperboloid with one sheet.
The normalized eigenvectors are:
v1 = (1/3) (2, 1, 2) ,  v2 = (1/3) (−2, 2, 1) ,  v3 = (1/3) (−1, −2, 2).
Solution: We notice that our previous techniques do not apply directly here, since this is not a quadratic form: the equation contains linear terms. However, we can group all the x terms together and all the y terms together and complete the square:
9x^2 − 18x + 4y^2 − 16y = 9(x − 1)^2 + 4(y − 2)^2 − 25.
q(x) = x^2 + 2xy + 2xz + y^2 − 2yz + z^2.
The matrix associated to q(x) is
A =
[ 1  1  1 ]
[ 1  1 −1 ]
[ 1 −1  1 ]
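The eigenvalues of this matrix, which give the diagonal form of q, can be checked numerically; the numpy sketch below is our own illustration, not part of the text:

```python
import numpy as np

# Eigenvalues of the matrix of q(x) = x^2 + 2xy + 2xz + y^2 - 2yz + z^2.
A = np.array([[1.0,  1.0,  1.0],
              [1.0,  1.0, -1.0],
              [1.0, -1.0,  1.0]])
lam = np.linalg.eigvalsh(A)            # ascending order
print(np.round(lam, 10))               # eigenvalues -1, 2, 2
# so q is equivalent to the diagonal form 2x^2 + 2y^2 - z^2
```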
Exercises:
8.3. Find an orthogonal matrix S and a diagonal matrix D such that A = S^T D S, where
A =
[ 3 2 ]
[ 2 3 ]

8.4. Find an orthogonal matrix S and a diagonal matrix D such that A = S^T D S, where
A =
[ 3  3 ]
[ 3 −5 ]

8.5. Find an orthogonal matrix S and a diagonal matrix D such that A = S^T D S, where
A =
[ 0 0 3 ]
[ 0 2 0 ]
[ 3 0 0 ]

8.7. Find an orthogonal matrix S and a diagonal matrix D such that A = S^T D S, where
A =
[ 1 0 1 ]
[ 0 1 0 ]
[ 1 0 1 ]

8.8. Prove that the algebraic multiplicities equal the geometric multiplicities for all the eigenvalues of the following matrix. What is the kernel of A?
A =
[ 1 1 1 1 1 ]
[ 1 1 1 1 1 ]
[ 1 1 1 1 1 ]
[ 1 1 1 1 1 ]
[ 1 1 1 1 1 ]

8.9. Find all the eigenvalues and their multiplicities of the following matrix
A =
[ 3 1 1 1 1 ]
[ 1 3 1 1 1 ]
[ 1 1 3 1 1 ]
[ 1 1 1 3 1 ]
[ 1 1 1 1 3 ]

8.10. Let the unit sphere in R^3 with equation

8.11. Classify the quadratic surface
2x^2 + 4y^2 − 5z^2 + 3xy − 2xz + 4yz = 2.

8.12. Classify the quadratic surface
x^2 + y^2 − z^2 + 3xy − 5xz + 4yz = 1.

8.13. Classify the quadratic surface
x^2 + y^2 + z^2 = 1.
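For exercises of this shape, a numerical cross-check is easy to set up. The sketch below treats Exercise 8.3 with numpy.linalg.eigh (an approach we assume for illustration, not one the text prescribes):

```python
import numpy as np

# numpy.linalg.eigh returns A = Q D Q^T with orthonormal eigenvector columns,
# so S = Q^T satisfies A = S^T D S, as Exercise 8.3 asks.
A = np.array([[3.0, 2.0],
              [2.0, 3.0]])
lam, Qmat = np.linalg.eigh(A)          # eigenvalues 1 and 5
D = np.diag(lam)
S = Qmat.T
print(np.allclose(A, S.T @ D @ S))
print(lam)
```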
Proof. Exercise.
Theorem 8.5. A symmetric matrix A is positive definite if and only if all of its eigenvalues are positive. A is positive semidefinite if and only if all of its eigenvalues are positive or zero.
Proof. The proof is rather straightforward. If λ1 , . . . , λn are the eigenvalues of A, then in its diagonal form q(x) = λ1 x1^2 + · · · + λn xn^2, which is positive for every x ≠ 0 exactly when all λi > 0.
The principal submatrices A^(i) are the matrices obtained by chopping off all the rows and columns of index > i of A. We have the following:
Theorem 8.6. A symmetric n × n matrix A is positive definite if and only if det A(m) > 0 for all principal submatrices A(m) ,
m = 1, . . . , n.
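This criterion is easy to compare against the eigenvalue test numerically. The sketch below (the test matrix is our own assumption) checks both characterizations on a classic positive definite matrix:

```python
import numpy as np

def is_positive_definite_minors(A):
    """Sylvester-style criterion: all leading principal minors are positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[:m, :m]) > 0 for m in range(1, n + 1))

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])     # tridiagonal, positive definite
by_minors = is_positive_definite_minors(A)
by_eigs = bool(np.all(np.linalg.eigvalsh(A) > 0))
print(by_minors, by_eigs)              # the two criteria agree
```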
λ1 ≥ λ2 ≥ · · · ≥ λn .
i) λ1 ≥ q(x) ≥ λn for every unit vector x.
ii) q(x) attains a maximum value when x is a unit eigenvector of λ1 . This maximum value is q(x) = λ1 .
iii) q(x) attains a minimum value when x is a unit eigenvector of λn . This minimum value is q(x) = λn .
Proof. Exercise.
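The bounds in part i) can be observed numerically; in the sketch below the example matrix is our own assumption:

```python
import numpy as np

# For every unit vector x, lambda_1 >= x^T A x >= lambda_n.
rng = np.random.default_rng(0)
A = np.array([[4.0, 1.0],
              [1.0, 2.0]])
lam = np.linalg.eigvalsh(A)            # ascending: [lambda_n, lambda_1]

for _ in range(1000):
    x = rng.normal(size=2)
    x /= np.linalg.norm(x)             # random unit vector
    q = x @ A @ x
    assert lam[0] - 1e-12 <= q <= lam[1] + 1e-12
print("lambda_1 >= q(x) >= lambda_n held for all samples")
```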
Exercises:
Proof. Exercise.
Hence all eigenvalues of A^T A are positive or zero. Let us assume they are
λ1 ≥ λ2 ≥ · · · ≥ λm ≥ 0.
We denote by
σi = √λi ,  i = 1, . . . , m.
The singular values of A are the square roots of the eigenvalues of the m × m matrix A^T A. Usually we write the singular values σ1 , . . . , σm of a matrix in decreasing order
σ1 ≥ · · · ≥ σm ≥ 0.
The following theorem shows that the number of singular values that are equal to zero is invariant under any change of basis.
Theorem 8.8 (Singular values and rank). If A is an n × m matrix of rank r, then the singular values σ1 , . . . , σr are nonzero
and
σr+1 = · · · = σm = 0.
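Before turning to the proof, the statement can be checked numerically on a rank-deficient example (the matrix below is our own illustration):

```python
import numpy as np

# A rank-2 matrix: row 2 is twice row 1, so one singular value is zero.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
sigma = np.linalg.svd(A, compute_uv=False)   # decreasing order
rank = np.linalg.matrix_rank(A)
print(rank, np.round(sigma, 6))
# exactly `rank` of the singular values are nonzero
print(int(np.sum(sigma > 1e-10)) == rank)
```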
Proof. We start by
Solution: We have
A^T A =
[ 5 0 0 4 ]
[ 0 1 0 0 ]
[ 0 0 1 0 ]
[ 4 0 0 5 ]
Then,
char(A^T A, x) = (x − 9)(x − 1)^3
so the eigenvalues are
λ1 = 9, λ2 = 1
and the singular values
σ1 = 3, σ2 = 1.
Next we take a non-symmetric matrix.
Example 8.15. Let the following matrix be given:
A =
[ 2 1 −1 1 ]
[ 1 1  0 3 ]
[ 1 0  1 0 ]
[ 1 0 −3 2 ]
Solution: We have
A^T A =
[  7  3 −4  7 ]
[  3  2 −1  4 ]
[ −4 −1 11 −7 ]
[  7  4 −7 14 ]
Then,
char(A^T A, x) = x^4 − 34x^3 + 253x^2 − 508x + 144.
This polynomial is irreducible over Q, so we find its eigenvalues numerically.
Theorem (Singular value decomposition). Every n × m matrix A of rank r can be written as
A = UΣV^T,
where U is an orthogonal n × n matrix, V is an orthogonal m × m matrix, and Σ is an n × m matrix whose first r diagonal entries are the nonzero singular values σ1 , . . . , σr of A, while all the other entries are zero.
v1 , v2 , . . . , vm .
Let
u1 = (1/σ1) Av1 ,  u2 = (1/σ2) Av2 ,  . . . ,  ur = (1/σr) Avr .
Then take
V = [v1 |v2 | · · · |vr ] ,  U = [u1 |u2 | · · · |ur ]
and let Σ be the n × m matrix with σ1 , . . . , σr on the main diagonal and zeros elsewhere:
Σ =
[ diag(σ1 , . . . , σr )  0 ]
[           0             0 ]
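The construction above can be sketched numerically; the small full-rank matrix below is our own illustration, not from the text:

```python
import numpy as np

# SVD by the construction above: v_i from A^T A, then u_i = (1/sigma_i) A v_i.
# A is an assumed full-rank example, so no singular value is zero.
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

lam, V = np.linalg.eigh(A.T @ A)       # eigenvalues in ascending order
order = np.argsort(lam)[::-1]          # reorder so sigma_1 >= sigma_2
lam, V = lam[order], V[:, order]
sigma = np.sqrt(lam)

U = (A @ V) / sigma                    # column i is (1/sigma_i) A v_i
Sigma = np.diag(sigma)
print(np.allclose(A, U @ Sigma @ V.T)) # the factorization recovers A
```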
Next we see some examples.
Example 8.16. Find a singular value decomposition for
A =
[  6 2 ]
[ −7 6 ]
Solution: We have
A^T A =
[  85 −30 ]
[ −30  40 ]
with characteristic polynomial
char(A^T A, x) = x^2 − 125x + 2500 = (x − 100)(x − 25),
so the singular values are σ1 = 10 and σ2 = 5.
Hence
U = (1/√5)
[  1 2 ]
[ −2 1 ]
Finally,
Σ =
[ 10 0 ]
[  0 5 ]
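A quick check of this example with numpy's built-in decomposition (our shortcut, not the book's method):

```python
import numpy as np

# Singular values of the matrix from Example 8.16.
A = np.array([[ 6.0, 2.0],
              [-7.0, 6.0]])
sigma = np.linalg.svd(A, compute_uv=False)
print(np.round(sigma, 10))             # decreasing: 10 and 5
```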
A =
[ 2 0 0 1 ]
[ 0 1 0 0 ]
[ 0 0 1 0 ]
[ 1 0 0 2 ]
v1 = (1/√2) (1, 0, 0, 1)^T
Then,
A = Q
[ 3 0 0 0 ]
[ 0 1 0 0 ]
[ 0 0 1 0 ]
[ 0 0 0 1 ]
Q^T
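The eigenvalues in this decomposition can be confirmed numerically (a sketch in numpy, our own check):

```python
import numpy as np

# The matrix above should have eigenvalues 3, 1, 1, 1.
A = np.array([[2.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 2.0]])
lam = np.linalg.eigvalsh(A)            # ascending order
print(np.round(lam, 10))
```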
in the standard form. Write down the standard form and identify the surface.
A =
[  1 2 −1 ]
[  2 1  1 ]
[ −1 1 −2 ]
v1 = (1/√3) (1, 1, 1) ,  v2 = (1/√6) (−1, −1, 2) ,  v3 = (1/√2) (1, −1, 0).
Hence, the orthogonal transitional matrix is Q = [v1 |v2 |v3 ]. The diagonalized form will be
Given the plane
ax + by + cz = 0,
consider the reflection map T : R^3 → R^3 which takes a point P to its reflection with respect to the given plane. Is this map linear? If so, determine its corresponding matrix. Prove your answers.
Solution: We just give a brief outline here. We call the plane ax + by + cz = 0 the plane P.
Take the vector u = (a, b, c)^T; this is called the normal vector to the plane P. Let A be a point and v := OA→.
Use what we have already learned to find the projection proj_P v of v onto the plane. Then the vector perpendicular from A to the plane is
v − proj_P v.
The symmetric point A′ of the point A with respect to P is represented by the vector
OA′→ = v − 2 (v − proj_P v).
Since this map is given through multiplication by a matrix, it must be linear.
This transformation is sometimes called a Householder transformation and is widely used in optics, computer vision, etc.
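The reflection just described can be written down as a matrix and tested numerically; in the sketch below the plane coefficients are placeholder values of our choosing:

```python
import numpy as np

def reflection_matrix(a, b, c):
    """Householder reflection across the plane ax + by + cz = 0."""
    u = np.array([a, b, c], dtype=float)       # normal vector of the plane
    return np.eye(3) - 2.0 * np.outer(u, u) / (u @ u)

H = reflection_matrix(1.0, -1.0, 2.0)          # an assumed example plane
print(np.allclose(H @ H, np.eye(3)))           # reflecting twice is the identity
print(np.allclose(H.T @ H, np.eye(3)))         # H is orthogonal
```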
Exercise 8.6. Prove the linearity of the reflection to a plane using only the geometry of vectors in R3 .
Exercise 8.7. Find the orthogonal projection of v = (3, −1, 2)^T onto the plane V in R^3 with equation
x − y + 2z = 0.
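One way to approach this exercise numerically (a sketch of the projection formula, not the book's worked solution) is to subtract from v its component along the plane's normal:

```python
import numpy as np

# Project v onto the plane x - y + 2z = 0 with normal n = (1, -1, 2).
v = np.array([3.0, -1.0, 2.0])
n = np.array([1.0, -1.0, 2.0])
proj = v - (v @ n) / (n @ n) * n       # remove the component along n
print(proj)                            # (5/3, 1/3, -2/3)
```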
Exercises:
8.14. Find the singular values of
A =
[ p −q ]
[ q  p ]
Explain the results geometrically.

8.15. Find the singular values of
A =
[ 1 1 ]
[ 0 1 ]
Index
linearly independent, 28
parallel, 28
perpendicular, 30
unit, 24
vector space, 51
vectors in Rn , 28
zero matrix, 33