Contents

6.1 Vector Spaces and Subspaces
    Review of R^n
    6.1.1 Vector Spaces
    6.1.2 Subspaces
    6.1.3 Spanning Sets as a Subspace
6.2 Linear Independence, Basis, and Dimension
    6.2.1 Linear Independence
    6.2.2 Basis and Coordinates
    6.2.3 Dimension
6.3 Change of Basis
    6.3.1 Change-of-Basis Matrix
    6.3.2 Gauss-Jordan Method for Computing the Change-of-Basis Matrix
4.1 Introduction to Eigenvalues and Eigenvectors
4.2 Introduction to Determinants
4.3 Eigenvalues and Eigenvectors
4.4 Similarity and Diagonalization
6.4 Linear Transformations
    6.4.1 Linear Transformations
    6.4.2 Composition of Linear Transformations
    6.4.3 Inverse of Linear Transformations
6.5 The Kernel and Range of a Linear Transformation
    6.5.1 Kernel and Range
    6.5.2 Rank and Nullity
    6.5.3 One-to-One and Onto
    6.5.4 Isomorphism of Vector Spaces
6.6 The Matrix of a Linear Transformation
    6.6.1 Matrix of a Linear Transformation
    6.6.2 Matrices of Composite and Inverse Linear Transformations
    6.6.3 Change of Basis and Similarity
6.7 An Application: Homogeneous Linear Differential Equations
    6.7.1 First-Order Homogeneous Linear Differential Equations
    6.7.2 Second-Order Homogeneous Linear Differential Equations
    6.7.3 Linear Codes
5.1 Orthogonality in R^n
    5.1.1 Review
    5.1.2 Orthogonal Sets
    5.1.3 Orthonormal Sets
5.2 Orthogonal Complements and Orthogonal Projections
    5.2.1 Orthogonal Complements
    5.2.2 Orthogonal Projections
    5.2.3 The Orthogonal Decomposition Theorem
5.3 The Gram-Schmidt Process and the QR Factorization
    5.3.1 The Gram-Schmidt Process (Algorithm)
    5.3.2 The QR Factorization
5.4 Orthogonal Diagonalization of Symmetric Matrices
5.5 An Application: Quadratic Forms
    5.5.1 Quadratic Forms
    5.5.2 Constrained Optimization Problems
    5.5.3 Graphing Quadratic Equations
7.1 Inner Product Spaces
    7.1.1 Inner Products
    7.1.2 Length, Distance, and Orthogonality
    7.1.3 Orthogonal Projections and the Gram-Schmidt Process
7.2 Norms and Distance Functions
    7.2.1 Norms
    7.2.2 Distance Functions
    7.2.3 Matrix Norms
7.3 Least Squares Approximation
    7.3.1 The Best Approximation Theorem
    7.3.2 Least Squares Approximation
    7.3.3 Least Squares via the QR Factorization
    7.3.4 Orthogonal Projection Revisited
7.4 The Singular Value Decomposition
    7.4.1 The Singular Values of a Matrix
    7.4.2 The Singular Value Decomposition
6.1 Vector Spaces and Subspaces

Review of R^n

Vectors, addition, scalar multiplication, dot product, cross product, length, distance.
6.1.1 Vector Spaces

Definition. Let V be a set on which two operations, addition and scalar multiplication, are defined, satisfying the following axioms:

A1. If ⃗u, ⃗v ∈ V, then ⃗u + ⃗v ∈ V.
A2. If ⃗u, ⃗v ∈ V, then ⃗u + ⃗v = ⃗v + ⃗u.
A3. If ⃗u, ⃗v, ⃗w ∈ V, then ⃗u + (⃗v + ⃗w) = (⃗u + ⃗v) + ⃗w.
A4. There exists an element ⃗0 in V such that ⃗0 + ⃗u = ⃗u for every ⃗u.
A5. For every ⃗u ∈ V, there exists an element −⃗u in V such that ⃗u + (−⃗u) = ⃗0.
S1. Scalar multiplication is defined for every scalar c and every ⃗u in V, and c⃗u ∈ V.
S2. Scalar multiplication satisfies the distributive law: c(⃗u + ⃗v) = c⃗u + c⃗v.
S3. Scalar multiplication satisfies the second distributive law: (c + d)⃗u = c⃗u + d⃗u.
S4. Scalar multiplication satisfies the associative law: (cd)⃗u = c(d⃗u).
S5. For every ⃗u ∈ V, 1⃗u = ⃗u.

Then V is called a vector space.
Remark. Usually, we use ⊕ for general addition, and ⊙ or ⊗ for general multiplication.
Example 1
1. R^n with the usual operations is a vector space.
2. C^n with the usual operations is a vector space.
3. P, the set of all polynomials, with the usual operations is a vector space.
4. M_mn, the set of all m × n matrices, with the usual operations is a vector space.
5. F(R), the set of all real-valued functions on R, with the usual operations is a vector space.
6. F[a, b], the set of all real-valued functions on [a, b], with the usual operations is a vector space.
Example 2
1. Z, the set of integers, with the usual operations is NOT a vector space.
2. R^2 with scalar multiplication c(x, y) = (cx, 0) is not a vector space.

Property 1
(i) ⃗w + ⃗v = ⃗u + ⃗v implies ⃗w = ⃗u.
(ii) 0⃗v = ⃗0.
(iii) c⃗0 = ⃗0.
(iv) a⃗v = ⃗0 implies a = 0 or ⃗v = ⃗0.
6.1.2 Subspaces
A set U is a subspace of a vector space V if U is a vector space with respect to the operations
of V.
Theorem 1 A subset U of a vector space V is a subspace of V if and only if
(i) the zero vector is in U;
(ii) if ⃗x is in U, then a⃗x is in U for any scalar a; and
(iii) if ⃗x, ⃗y are in U, then ⃗x + ⃗y is in U.
Examples
(1) {⃗0} is a subspace. A subspace that is not {⃗0} is a proper subspace.
(2) A line through the origin in space is a subspace; a plane through the origin in space is a subspace.
(3) S_1 = {(s, 2s, 3) : s ∈ R} is not a subspace.
(4) S_2 = {[s, s^2]^T : s ∈ R} is not a subspace. It does not satisfy (ii).
(5) S_3 = {(s, t) : s^2 = t^2, s, t ∈ R} is not a subspace. It does not satisfy (iii).
(6) S_4 = {[s, 2s + 3t, 5t]^T : s, t ∈ R} is a subspace.
(7) R^n is a subspace of itself.
(8) S_5 = {[s + 1, t]^T : s, t ∈ R} is a subspace.
(9) S_6 = {[a, b, c]^T : a = 3b + 2c, a, b, c ∈ R} is a subspace.
(10) S_7 = {[a, b; c, d] : a = 3b + 2c, a, b, c ∈ R} is a subspace of M_22.
(11) P_n, the set of all polynomials of degree less than or equal to n, is a subspace of P.
(12) S_8 = {p ∈ P_2 : p(1) = 0} is a subspace of P.
6.1.3 Spanning Sets as a Subspace

Let S = {v_1, v_2, ..., v_k} be a set of vectors in a vector space V, and let

span S = {c_1 v_1 + c_2 v_2 + ... + c_k v_k : c_1, ..., c_k ∈ R}.

It is a subspace of V, called the span of S.

Example. The span of a single nonzero vector in space is a line through the origin. The span of two nonparallel nonzero vectors u and v in space is a plane through the origin with normal vector u × v.

Let V be a subspace, and let S be a subset of V. If span S = V, then S is a spanning set of V. In particular, V itself is a spanning set of V. A subspace generally has more than one spanning set.
Property 2
(i) If X ∈ S, then X ∈ span S.
(ii) If a subspace W contains every vector in S, then W contains span S. As an example of using this property, span{X + Y, X, Y} = span{X, Y}.
(iii) If ⃗b is a linear combination of v_1, v_2, ..., v_k, then span{⃗b, v_1, v_2, ..., v_k} = span{v_1, v_2, ..., v_k}.
(iv) R^n = span{E_1, E_2, ..., E_n}.
(v) null A = the span of the basic solutions of AX = 0.
(vi) im A = the span of the columns of A.
Example 3
(i) Verify that [1 2 2 1]^T is in span{[2 1 2 0]^T, [0 3 2 2]^T}.
Solution: The corresponding system is consistent.
(ii) Verify that the set of vectors S = {[1 2 3]^T, [1 0 1]^T, [2 1 1]^T} spans R^3.
Solution: For any [a b c]^T in R^3, the corresponding system is consistent.
(iii) Find a, b such that X = [a, b, a + b, a − b]^T is in span{X_1, X_2, X_3}, where X_1 = [1 1 1 1]^T, X_2 = [1 0 1 2]^T, X_3 = [1 0 1 0]^T.
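The consistency checks above reduce to solving small linear systems. The following is a minimal numerical sketch of part (ii), assuming numpy is available (not part of the notes themselves): the three vectors span R^3 exactly when the matrix having them as columns has rank 3, so that V x = b is consistent for every right-hand side b.

```python
import numpy as np

# Spanning test for Example 3(ii): rank 3 means the columns span R^3.
V = np.column_stack(([1, 2, 3], [1, 0, 1], [2, 1, 1]))
print(np.linalg.matrix_rank(V))      # 3, so the columns span R^3

b = np.array([4.0, 3.0, 5.0])        # an arbitrary target vector
x = np.linalg.solve(V, b)            # coefficients with V @ x == b
print(np.allclose(V @ x, b))         # True: the system is consistent
```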
6.2 Linear Independence, Basis, and Dimension

6.2.1 Linear Independence

Definition 1 A set of vectors {⃗v_1, ..., ⃗v_m} in V is linearly independent if the vector equation

x_1 ⃗v_1 + x_2 ⃗v_2 + ... + x_m ⃗v_m = ⃗0

implies that x_1 = x_2 = ... = x_m = 0. The set is said to be linearly dependent if there is a nontrivial solution to the vector equation.
Example 4 {[1, 2, 3]^T} is linearly independent; {[0, 0, 0]^T} is linearly dependent.
Example 5 Given ⃗v_1 = [1, 2, 3]^T, ⃗v_2 = [3, 5, 8]^T, ⃗v_3 = [1, 1, 2]^T, show that {⃗v_1, ⃗v_2, ⃗v_3} is linearly dependent and find the linear combination.
Solution: −2⃗v_1 + ⃗v_2 − ⃗v_3 = ⃗0.
Theorem 2
1. A set of two vectors is linearly dependent if and only if one of the vectors is a multiple of the other.
2. A set of two or more vectors is linearly dependent if and only if at least one vector may be written as a linear combination of the others.
3. If a set contains more vectors than there are entries in each vector, then the set is linearly dependent.
4. If the zero vector is in a set of vectors, then the set of vectors is linearly dependent.
Example 6
1. In C(R), the set {sin x, cos x, sin(2x)} is linearly independent: evaluating a sin x + b cos x + c sin(2x) = 0 at x = 0, π/2, π/4 forces a = b = c = 0.
2. In P_2, the set {1, x, x^2} is linearly independent.
3. In P_2, the set {1 + x + x^2, 1 − x + 3x^2, 1 + 3x − x^2} is linearly dependent.
Example 7 Let {⃗v_1, ⃗v_2, ⃗v_3, ⃗v_4} be linearly independent. Determine whether the following set is linearly independent or dependent: S = {⃗v_1 − ⃗v_2, ⃗v_2 − ⃗v_3, ⃗v_3 − ⃗v_4, ⃗v_4 − ⃗v_1}.
Solution: We need to set up the equation

c_1(⃗v_1 − ⃗v_2) + c_2(⃗v_2 − ⃗v_3) + c_3(⃗v_3 − ⃗v_4) + c_4(⃗v_4 − ⃗v_1) = ⃗0
⇒ (c_1 − c_4)⃗v_1 + (−c_1 + c_2)⃗v_2 + (−c_2 + c_3)⃗v_3 + (−c_3 + c_4)⃗v_4 = ⃗0
⇒ c_1 = c_2 = c_3 = c_4; e.g., c_1 = c_2 = c_3 = c_4 = 1 is a nontrivial solution.

Thus the set is dependent.
6.2.2 Basis and Coordinates

Definition 2 A basis for a subspace V is a linearly independent set of vectors that spans V. We denote it by B_V; when V is clear, we just write the basis as B. The number of vectors in a basis for a subspace V is called the dimension of V and is denoted by dim V.

Properties of a basis:
• There is more than one basis for a subspace, except for the simplest subspace {⃗0}.
• A basis is a largest possible linearly independent set, and a smallest possible spanning set, for a subspace.
Example 8
1. Let e_1 = [1, 0, ..., 0]^T, e_2 = [0, 1, ..., 0]^T, ..., e_n = [0, 0, ..., 1]^T. Then {e_1, ..., e_n} is a basis for R^n, which is called the standard basis of R^n.
2. The set {1, x, x^2, ..., x^n} is the standard basis of P_n.
3. Let E_ij be the m × n matrix whose (i, j) entry is 1 and all other entries are 0. Then the set {E_11, ..., E_mn} is the standard basis of M_mn.
Example 9 1. Find a basis for each of the following subspaces:

U = {(a + 2b + 3c, a − c, b + a, a − b) : a, b, c ∈ R}
V = {(a, b, c, d) : a + 2b = c, 3b − 2c = d; a, b, c, d ∈ R}

Solution:
• U = {a(1, 1, 1, 1) + b(2, 0, 1, −1) + c(3, −1, 0, 0) : a, b, c ∈ R} = Span{(1, 1, 1, 1), (2, 0, 1, −1), (3, −1, 0, 0)}.
Next we show that the set of three vectors S = {(1, 1, 1, 1), (2, 0, 1, −1), (3, −1, 0, 0)} is linearly independent. If

a(1, 1, 1, 1) + b(2, 0, 1, −1) + c(3, −1, 0, 0) = ⃗0,

then

(a + 2b + 3c, a − c, b + a, a − b) = (0, 0, 0, 0) ⇒ a = 0, b = 0, c = 0.

Thus S is independent, and hence a basis of U.
• For the subspace V, from a + 2b = c and 3b − 2c = d we get d = −2a − b. Thus V = {(a, b, a + 2b, −2a − b) : a, b ∈ R} = {a(1, 0, 1, −2) + b(0, 1, 2, −1) : a, b ∈ R} = Span{(1, 0, 1, −2), (0, 1, 2, −1)}.
Similarly we can show that the set of two vectors T = {(1, 0, 1, −2), (0, 1, 2, −1)} is linearly independent. Thus T is a basis of V.
2. Find a basis for the following subspace of M_22:

W = { [2x, 3x; y + z, y + z − x] : x, y, z ∈ R }

Solution:

W = { x[2, 3; 0, −1] + y[0, 0; 1, 1] + z[0, 0; 1, 1] : x, y, z ∈ R }
  = Span{ [2, 3; 0, −1], [0, 0; 1, 1], [0, 0; 1, 1] }
  = Span{ [2, 3; 0, −1], [0, 0; 1, 1] }.

Next we show that the set of two matrices S = { [2, 3; 0, −1], [0, 0; 1, 1] } is linearly independent. Set up the equation

a[2, 3; 0, −1] + b[0, 0; 1, 1] = 0.

Then

[2a, 3a; b, b − a] = [0, 0; 0, 0] ⇒ 2a = 0, 3a = 0, b = 0, b − a = 0 ⇒ a = 0, b = 0.

Thus S is independent, and hence a basis of W.
Definition 3 Given a basis B = {⃗v_1, ..., ⃗v_p} for a subspace V, let ⃗x ∈ V. Then ⃗x may be written as a linear combination of ⃗v_1, ..., ⃗v_p:

⃗x = c_1 ⃗v_1 + ... + c_p ⃗v_p.

The weights c_1, ..., c_p are called the coordinates of ⃗x relative to the basis B. These coordinates may be written as a vector

[⃗x]_B = [c_1, ..., c_p]^T,

called the coordinate vector of ⃗x with respect to B (or the B-coordinate vector of ⃗x).
Example 10 Let A = [⃗a_1 ⃗a_2 ⃗a_3 ⃗a_4 ⃗a_5] = [1, −3, 2, 5, 3; 0, 0, 4, 7, 4; 0, 0, 0, 0, 0].
1) The pivot columns are ⃗a_1 and ⃗a_3, so B_ColA = {⃗a_1, ⃗a_3}.
2) [⃗a_2]_{B_ColA} = [−3, 0]^T, [⃗a_4]_{B_ColA} = [3/2, 7/4]^T, [⃗a_5]_{B_ColA} = [1, 1]^T.
3) dim Col A = 2.

Note: The order of the vectors in the basis B influences the coordinate vector [⃗x]_B.
Example 11 Find [p(x)]_B, where B = {1 + x, 1 − x + 3x^2, 1 + 3x − x^2} and p(x) = 2 − 5x − x^2.
Property 3
(i) [c_1 ⃗u_1 + ... + c_k ⃗u_k]_B = c_1 [⃗u_1]_B + ... + c_k [⃗u_k]_B.
(ii) {⃗u_1, ..., ⃗u_k} is linearly independent iff {[⃗u_1]_B, ..., [⃗u_k]_B} is linearly independent.

Proof. (i) follows from the definition; (ii) follows from (i).
Theorem 3 Given a basis B = {⃗v_1, ..., ⃗v_n} for a vector space V:
1. Any set of more than n vectors in V is linearly dependent.
2. Any set of fewer than n vectors in V cannot span V.
3. Every basis for V has exactly n vectors.

Proof. 1. Let {⃗u_1, ..., ⃗u_k} be a set of vectors in V with k > n. Then {[⃗u_1]_B, ..., [⃗u_k]_B} is a set of more than n vectors in R^n, so it is dependent; by Property 3(ii), so is {⃗u_1, ..., ⃗u_k}.
Parts 2 and 3 follow from 1.
6.2.3 Dimension

Definition 4 The dimension of a vector space V is defined to be the number of vectors in a basis. We write it as dim V.

Example 12
1. dim{⃗0} = 0.
2. dim R^n = n.
3. dim P_n = n + 1.
4. dim M_mn = mn.

Example 13 Extend the linearly independent set {1 + x + x^2, 1 + 3x − x^2} to a basis of P_2.
6.3 Change of Basis

6.3.1 Change-of-Basis Matrix

Question: Given two bases B and C for a vector space V, if [v]_B is known, how do we find [v]_C?
Definition 5 Let B = {⃗u_1, ..., ⃗u_n} and C = {⃗v_1, ..., ⃗v_n} be two bases for V. The change-of-basis matrix from B to C is defined as

P_{C←B} = [ [⃗u_1]_C · · · [⃗u_n]_C ].

Theorem 4 Let B = {⃗u_1, ..., ⃗u_n} and C = {⃗v_1, ..., ⃗v_n} be two bases for V. Then
a. P_{C←B} [⃗x]_B = [⃗x]_C.
b. P_{C←B} is the unique matrix P such that P[⃗x]_B = [⃗x]_C for all ⃗x ∈ V.
c. P_{C←B} is invertible and its inverse is P_{B←C}.

Proof. a. Let ⃗x = c_1 ⃗u_1 + ... + c_n ⃗u_n. Then [⃗x]_C = c_1 [⃗u_1]_C + ... + c_n [⃗u_n]_C = P_{C←B} [⃗x]_B.
b. Let p_i be the ith column of P. Then p_i = P e_i = P [⃗u_i]_B = [⃗u_i]_C, which is the ith column of P_{C←B}.
c. Since the columns of P_{C←B} are linearly independent, the matrix P_{C←B} is invertible; applying part a twice gives P_{B←C} P_{C←B} [⃗x]_B = [⃗x]_B for all ⃗x, so the inverse is P_{B←C}.
Example 14 In M_22, let B = {E_11, E_12, E_21, E_22} and C = {W, X, Y, Z}, where

W = [1, 0; 0, 0], X = [1, 1; 0, 0], Y = [1, 1; 1, 0], Z = [1, 1; 1, 1].

(i) Find P_{C←B} and P_{B←C}.
(ii) Let U = [1, 2; 3, 4]; verify that P_{C←B} [U]_B = [U]_C and P_{B←C} [U]_C = [U]_B.
6.3.2 Gauss-Jordan Method for Computing the Change-of-Basis Matrix

Let B = {⃗u_1, ..., ⃗u_n} and C = {⃗v_1, ..., ⃗v_n} be two bases for V, and let E be any basis of V. Then row reduction gives

[ [⃗v_1]_E · · · [⃗v_n]_E | [⃗u_1]_E · · · [⃗u_n]_E ] → [ I | P_{C←B} ],

i.e.,

[ P_{E←C} | P_{E←B} ] → [ I | P_{C←B} ].
Example 15 Let B = {1 + x, 1 − x + 3x^2, 1 + 3x − x^2} and C = {1 − x, 1 + x + 2x^2, 1 + 2x − x^2}. Find P_{C←B} by the Gauss-Jordan method. (A numerical version of this computation is sketched below.)
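As a sketch of what the Gauss-Jordan computation produces, assuming numpy is available: represent each polynomial by its coordinate vector in the standard basis E = {1, x, x^2}; row reducing [P_{E←C} | P_{E←B}] to [I | P_{C←B}] is then the same as solving the matrix equation P_{E←C} P = P_{E←B}.

```python
import numpy as np

# Columns are coordinate vectors in the standard basis {1, x, x^2}.
B = np.column_stack(([1, 1, 0], [1, -1, 3], [1, 3, -1]))   # 1+x, 1-x+3x^2, 1+3x-x^2
C = np.column_stack(([1, -1, 0], [1, 1, 2], [1, 2, -1]))   # 1-x, 1+x+2x^2, 1+2x-x^2

P_CB = np.linalg.solve(C, B)   # same result as reducing [C | B] -> [I | P]
print(P_CB)
```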
4.1 Introduction to Eigenvalues and Eigenvectors
Definition 6 An eigenvector of an n × n matrix A is a nonzero vector ⃗x such that A⃗x = λ⃗x for some scalar λ. A scalar λ is called an eigenvalue of A if there is a nontrivial solution ⃗x of A⃗x = λ⃗x; such an ⃗x is called an eigenvector corresponding to λ.
To determine whether a given value λ is an eigenvalue of a matrix A we need to ﬁnd a
nonzero vector ⃗x such that A⃗x = λ⃗x. This is the same as determining whether the matrix
equation
(A −λI)⃗x = 0
has a nontrivial solution.
Example 16 Let A = [1, 6; 5, 2], ⃗u = [1, 1]^T, ⃗v = [6, −5]^T, ⃗w = [1, 0]^T. Note that

A⃗u = [7, 7]^T = 7⃗u,  A⃗v = [−24, 20]^T = −4⃗v,  A⃗w = [1, 5]^T ≠ λ⃗w for any λ.

Thus ⃗u is an eigenvector corresponding to λ = 7, ⃗v is an eigenvector corresponding to λ = −4, and ⃗w is not an eigenvector.
Definition 7 The set of all eigenvectors for a particular eigenvalue λ of a matrix A, together with the zero vector, is a subspace, called the eigenspace of A corresponding to λ. We write it as E_λ.
Example 17 The eigenvalues of a triangular matrix are the entries on its main diagonal.
4.2 Introduction to Determinants

Definition 8 Let A = [a, b; c, d]. The determinant of A is defined as

det A = |A| = |a, b; c, d| = ad − bc.

For an n × n matrix A, let A_ij be the matrix obtained from A by deleting the ith row and jth column. The (i, j)th cofactor of A is the number

c_ij = (−1)^{i+j} det A_ij.

Then

det A = a_i1 c_i1 + a_i2 c_i2 + ... + a_in c_in,

which is called the cofactor expansion across the ith row. Similarly,

det A = a_1j c_1j + a_2j c_2j + ... + a_nj c_nj,

which is called the cofactor expansion down the jth column.
Example 18 Calculate det A, where

A = [1, 3, 5; 2, 1, 1; 3, 4, 2].

Solution: We do the cofactor expansion across the 2nd row.

det A = a_21 c_21 + a_22 c_22 + a_23 c_23
      = 2(−1)^{2+1} det[3, 5; 4, 2] + 1·(−1)^{2+2} det[1, 5; 3, 2] + 1·(−1)^{2+3} det[1, 3; 3, 4]
      = 2(14) + (−13) + 5 = 20.
Example 19 Calculate det A, where

A = [5, 3, 5, 7; 0, 1, 1, 9; 0, 0, 2, 12; 0, 0, 0, 12].

Solution: A is an upper triangular matrix, so det A = 5(1)(2)(12) = 120.
4.3 Eigenvalues and Eigenvectors

Characteristic equation: det(A − λI) is called the characteristic polynomial of A, and

det(A − λI) = 0

is called the characteristic equation.

Theorem 5 The solutions of the characteristic equation are the eigenvalues of A.

Example 20 Let A = [4, 5; −1, 0]. Find all eigenvalues.
Sol: det(A − λI) = λ^2 − 4λ + 5 = 0 ⇒ λ = 2 ± i.
Example 21 Find all eigenvalues of A, where

A = [3, 2, −1; 0, 1, 2; 0, 0, 3].

Solution: The characteristic polynomial is

det(A − λI) = |3 − λ, 2, −1; 0, 1 − λ, 2; 0, 0, 3 − λ| = (3 − λ)^2 (1 − λ).

The solutions of the characteristic equation det(A − λI) = 0 are 3, 3, 1.
Algebraic and Geometric multiplicity:
• The algebraic multiplicity of an eigenvalue is equal to the number of times it is a root
of the characteristic equation.
• The geometric multiplicity of an eigenvalue is the dimension of its eigenspace.
Example 22 Let

A = [3, 2, −1; 0, 1, 2; 0, 0, 3].

The eigenvalue 3 has algebraic multiplicity 2 and the eigenvalue 1 has algebraic multiplicity 1. Find the corresponding eigenspaces and their geometric multiplicities.
Solution: When λ = 3,

A − 3I = [0, 2, −1; 0, −2, 2; 0, 0, 0] →(R2 + R1) [0, 2, −1; 0, 0, 1; 0, 0, 0] →(R1 + R2) [0, 2, 0; 0, 0, 1; 0, 0, 0].

Thus (A − 3I)⃗x = 0 has the solution

⃗x = [x_1, x_2, x_3]^T = [t, 0, 0]^T = t[1, 0, 0]^T.

The eigenspace has basis {[1, 0, 0]^T}. The geometric multiplicity of the eigenvalue 3 is 1.
When λ = 1,

A − I = [2, 2, −1; 0, 0, 2; 0, 0, 2] →(R3 − R2) [2, 2, −1; 0, 0, 2; 0, 0, 0] →(R1 + (1/2)R2) [2, 2, 0; 0, 0, 2; 0, 0, 0].

Thus (A − I)⃗x = 0 has the solution

⃗x = [x_1, x_2, x_3]^T = [t, −t, 0]^T = t[1, −1, 0]^T.

The eigenspace has basis {[1, −1, 0]^T}. The geometric multiplicity of the eigenvalue 1 is 1.
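A quick numerical cross-check of Example 22, assuming numpy is available (a sketch, not part of the notes): the eigenvalues come out as 3, 1, 3, and the eigenspace basis vector found above for λ = 3 can be verified directly.

```python
import numpy as np

A = np.array([[3.0, 2.0, -1.0],
              [0.0, 1.0,  2.0],
              [0.0, 0.0,  3.0]])
vals, vecs = np.linalg.eig(A)
print(np.round(vals, 6))          # 3, 1, 3: algebraic multiplicities 2 and 1

v = np.array([1.0, 0.0, 0.0])     # the basis of E_3 found above
print(np.allclose(A @ v, 3 * v))  # True; yet E_3 is only 1-dimensional
```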
Theorem 6 (The Invertible Matrix Theorem) Let A be a square n × n matrix. Then the following statements are equivalent.
1. A is an invertible matrix.
2. A is row equivalent to the identity matrix.
3. A has n pivot positions.
4. The equation A⃗x = ⃗0 has only the trivial solution.
5. The columns of A form a linearly independent set.
6. The linear transformation T_A : R^n → R^n is 1:1.
7. The equation A⃗x = ⃗b has at least one solution for each ⃗b ∈ R^n.
8. The columns of A span R^n.
9. The linear transformation T_A : R^n → R^n is onto.
10. There is an n × n matrix C such that CA = I.
11. There is an n × n matrix D such that AD = I.
12. A^T is invertible.
13. The columns of A form a basis for R^n.
14. Col A = R^n.
15. dim Col A = n.
16. rank A = n.
17. Nul A = {⃗0}.
18. dim Nul A = 0.
19. The number 0 is not an eigenvalue of A.
20. The determinant of A is not zero.
Property 4 Let λ be an eigenvalue of A with corresponding eigenvector ⃗x.
(1) For any positive integer n, λ^n is an eigenvalue of A^n with corresponding eigenvector ⃗x.
(2) If A is invertible, then 1/λ is an eigenvalue of A^{-1} with corresponding eigenvector ⃗x.
(3) If A is invertible, then for any integer n, λ^n is an eigenvalue of A^n with corresponding eigenvector ⃗x.
Theorem 7 If ⃗v_1, ..., ⃗v_r are eigenvectors that correspond to distinct eigenvalues λ_1, ..., λ_r of an n × n matrix A, then the set {⃗v_1, ..., ⃗v_r} is linearly independent.

Proof. We use induction on r. When r = 1, the set {⃗v_1} is independent since an eigenvector is nonzero.
Assume that when r = k the set is linearly independent. When r = k + 1, if on the contrary the set is dependent, then

⃗v_{k+1} = c_1 ⃗v_1 + · · · + c_k ⃗v_k.

Applying A to both sides,

λ_{k+1} ⃗v_{k+1} = c_1 λ_1 ⃗v_1 + · · · + c_k λ_k ⃗v_k.

We also have

λ_{k+1} ⃗v_{k+1} = c_1 λ_{k+1} ⃗v_1 + · · · + c_k λ_{k+1} ⃗v_k.

By subtraction,

0 = c_1 (λ_1 − λ_{k+1}) ⃗v_1 + · · · + c_k (λ_k − λ_{k+1}) ⃗v_k.

Since the eigenvalues are distinct, c_1 = · · · = c_k = 0, i.e., ⃗v_{k+1} = ⃗0, a contradiction.
4.4 Similarity and Diagonalization

Similar matrices: Two matrices A and B are similar if there is an invertible matrix P such that

A = P B P^{-1}.

Theorem 8 If n × n matrices A and B are similar, then they have the same characteristic polynomial and hence the same eigenvalues (with the same multiplicities).

Proof.

det(A − λI) = det(P B P^{-1} − λ P P^{-1})
            = det[P (B − λI) P^{-1}]
            = det(P) det(B − λI) det(P^{-1})
            = det(P) det(B − λI) (1/det(P))
            = det(B − λI).

A diagonal matrix is a matrix whose off-diagonal entries are all zero. We will only consider diagonal matrices that are square.

Definition 9 If A is a square n × n matrix and A is similar to a diagonal matrix D, then A is said to be diagonalizable.

Theorem 9 (Diagonalization Theorem) Let A be an n × n matrix.
• A is diagonalizable if and only if A has n linearly independent eigenvectors. If A = P D P^{-1}, where D is a diagonal matrix, then the columns of P are n linearly independent eigenvectors of A. In this case, the diagonal entries of D are eigenvalues of A that correspond, respectively, to the eigenvectors in P.
• If A has n distinct eigenvalues, then A is diagonalizable.
• A is diagonalizable if and only if the sum of the dimensions of the distinct eigenspaces equals n, if and only if the dimension of the eigenspace for each eigenvalue equals the algebraic multiplicity of the eigenvalue. (Generally, the dimension of the eigenspace for each eigenvalue is less than or equal to the algebraic multiplicity of the eigenvalue.)
For an n × n matrix A, if A is diagonalizable and B_k is a basis for the eigenspace corresponding to the eigenvalue λ_k, k = 1, ..., p, then the total collection of vectors in the sets B_1, ..., B_p forms an eigenvector basis of R^n.

Example 23 A = [1, 2, 3, 4; 0, 3, 2, 4; 0, 0, 5, −1; 0, 0, 0, 7] is diagonalizable: it has 4 distinct eigenvalues.
B = [3, −1; 1, 5] is not diagonalizable: λ = 4 is its only eigenvalue, with only one independent eigenvector, [−1, 1]^T.
Example 24 Let A = [2, 3; 4, 1].
1) Find P and D such that A = P D P^{-1}.
2) Calculate A^4.
Sol: 1) det(A − λI) = (λ − 5)(λ + 2).
When λ = 5: ⃗x = x_2 [1, 1]^T.
When λ = −2: ⃗x = (x_2/4) [−3, 4]^T. Thus

P = [1, −3; 1, 4], D = [5, 0; 0, −2];  or  P = [−3, 1; 4, 1], D = [−2, 0; 0, 5].

2) Let P = [1, −3; 1, 4]; then P^{-1} = (1/7)[4, 3; −1, 1], and

A^4 = (P D P^{-1})^4 = P D^4 P^{-1} = [1, −3; 1, 4] [5, 0; 0, −2]^4 (1/7)[4, 3; −1, 1]
    = (1/7) [1, −3; 1, 4] [625, 0; 0, 16] [4, 3; −1, 1]
    = [364, 261; 348, 277].
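The same computation can be checked numerically. A minimal sketch, assuming numpy is available:

```python
import numpy as np

# Example 24: diagonalize A, then A^4 = P D^4 P^{-1}.
A = np.array([[2.0, 3.0], [4.0, 1.0]])
vals, P = np.linalg.eig(A)                   # eigenvalues 5 and -2
A4 = P @ np.diag(vals ** 4) @ np.linalg.inv(P)
print(np.round(A4))                          # [[364, 261], [348, 277]]
print(np.allclose(A4, np.linalg.matrix_power(A, 4)))  # True
```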
6.4 Linear Transformations
6.4.1 Linear Transformations
Deﬁnition 10 Let V and W be vector spaces. Then a linear transformation T from V to W
is a function with domain V and range a subset of W satisfying
1) T(u + v) = T(u) + T(v)
2) T(cu) = cT(u)
for any vectors u and v in V and scalar c.
Example 25 T_A(⃗x) = A⃗x is linear for any matrix A.

Example 26 T : R^n → R^n defined by T(⃗x) = r⃗x, with r a scalar, is linear.

Example 27 T : R^n → R^m defined by T(⃗x) = A⃗x + ⃗b with ⃗b ≠ ⃗0 is nonlinear.

Example 28 T : M_nn → R defined by T(A) = |A| = det(A) is nonlinear.

Example 29 Let V be the vector space of (infinitely) differentiable functions and define D to be the function from V to V given by D(f(t)) = f′(t). Then D is a linear transformation.
Proof. Since

D(f(t) + g(t)) = (f(t) + g(t))′ = f′(t) + g′(t) = D(f(t)) + D(g(t)),
D(cf(t)) = (cf(t))′ = c f′(t) = c D(f(t)).

Properties: If T is linear, then
• T(⃗0) = ⃗0.
• T(c⃗u + d⃗v) = cT(⃗u) + dT(⃗v) for all ⃗u, ⃗v in the domain of T and all scalars c, d.
Example 30 Let T : R^2 → P_3 satisfy T([2, 1]^T) = 1 − x − x^2 and T([1, 2]^T) = 2x + x^3. Find T([3, 4]^T).
6.4.2 Composition of Linear Transformations
Deﬁnition 11 Let U, V and W be vector spaces. Let T : U → V and S : V → W be two
transformations. Then the composition of S with T is S ◦ T, deﬁned by:
(S ◦ T)(u) = S(T(u))
for any vectors u in U.
Theorem 10 If T and S are linear, then S ◦ T is linear.
Example 31 Let T : R^2 → M_22 and S : M_22 → P_3 be two linear transformations defined by:

T([a, b]^T) = [a, b; a + b, a − b],  S([a, b; c, d]) = a + (b + c)x − dx^2 + (a + c)x^3.

Find S ◦ T.
6.4.3 Inverse of Linear Transformations

Definition 12 A linear transformation T : V → W is invertible if there is a linear transformation S : W → V such that

S ◦ T = I_V and T ◦ S = I_W.

In this case, S is called the inverse of T and is denoted by S = T^{-1}.

Remark. T^{-1} is unique.
6.5 The Kernel and Range of a Linear Transformation

6.5.1 Kernel and Range

Definition 13 Let T : V → W be a linear transformation. The kernel is ker(T) = {v ∈ V : T(v) = ⃗0}, and the range is range(T) = {T(v) : v ∈ V}.

Example 32 (i) ker(T_A) = null(A).
(ii) If D : P_3 → P_2 is differentiation, then ker(D) = R (the constant polynomials), range(D) = P_2.
(iii) If S : P_1 → R is given by S(p(x)) = ∫_0^1 p(x)dx, then

ker(S) = {−b/2 + bx : b ∈ R}, range(S) = R.

(iv) If T : M_22 → M_22 is given by T(A) = A^T, then ker(T) = {0}, range(T) = M_22.

Theorem 11 Let T : V → W be a linear transformation. Then ker(T) is a subspace of V and range(T) is a subspace of W.
6.5.2 Rank and Nullity

Definition 14 Let T : V → W be a linear transformation. Then nullity(T) = dim ker(T) and rank(T) = dim range(T).

Example 33 Find the rank and nullity of each of the following:
(i) T_A.
(ii) D : P_3 → P_2.
(iii) S : P_1 → R given by S(p(x)) = ∫_0^1 p(x)dx.
(iv) T : M_22 → M_22 given by T(A) = A^T.

Rank Theorem: Let T : V → W be a linear transformation. Then

nullity(T) + rank(T) = dim V.
Proof. Let dim V = n, and let {v_1, ..., v_k} be a basis for ker(T). Extend it to a basis of V: {v_1, ..., v_k, v_{k+1}, ..., v_n}. We only need to prove that {T(v_{k+1}), ..., T(v_n)} is a basis for range(T).
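For the matrix transformation T_A of Example 33(i), the Rank Theorem can be illustrated numerically. A sketch, assuming numpy and scipy are available:

```python
import numpy as np
from scipy.linalg import null_space   # assumes scipy is installed

# Rank Theorem for T_A(x) = Ax: nullity + rank = number of columns n.
A = np.array([[1.0, -3.0, 2.0, 5.0],
              [0.0,  0.0, 4.0, 7.0]])
rank = np.linalg.matrix_rank(A)       # dim range(T_A)
nullity = null_space(A).shape[1]      # dim ker(T_A)
print(rank, nullity, rank + nullity)  # 2 2 4
```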
Example 34 Let T : R^2 → M_22 and S : M_22 → P_3 be two linear transformations defined by:

T([a, b]^T) = [a, b; a + b, a − b],  S([a, b; c, d]) = a − b + (b + c)x + (a − d)x^2 + (a + c)x^3.

Find rank(T), rank(S), nullity(T), nullity(S).
6.5.3 One-to-One and Onto

Definition 15 Let T : V → W be a linear transformation. If T maps distinct vectors in V to distinct vectors in W, then T is called one-to-one. If range(T) = W, then T is called onto.

Theorem 12 Let T_A : R^n → R^m be a linear transformation with standard matrix A. Then:
1. T_A is onto if and only if the columns of A span R^m.
2. T_A is 1:1 if and only if the columns of A are linearly independent.
3. T_A is 1:1 if and only if A⃗x = T_A(⃗x) = ⃗0 has only the trivial solution.

Example 35 Let A = [1, 0, 9; 0, 3, 7]. Is T_A : R^3 → R^2 onto? Is it 1:1?
Sol: T_A is onto, since the columns of A span R^2. T_A is not 1:1, since the columns of A are linearly dependent.
Example 36 Let T : R^2 → R^3 be given by

T(x_1, x_2) = (x_1 − 2x_2, −x_1 + 3x_2, 3x_1 − 2x_2) = [1, −2; −1, 3; 3, −2] [x_1, x_2]^T.

Is T onto? Is it 1:1?
Sol: T is not onto: A has at most two pivots, so the columns of A cannot span R^3. T is 1:1, since the columns of A are linearly independent.

Theorem 13 A linear transformation T : V → W is one-to-one iff ker(T) = {⃗0}.
Proof. If T(u) = T(v), then T(u − v) = ⃗0.

Theorem 14 Let dim V = dim W. A linear transformation T : V → W is one-to-one if and only if it is onto.
Example 37 Let T : R^2 → P_1 be defined by:

T([a, b]^T) = a − b + (b + a)x.

Show that T is onto and one-to-one.

Theorem 15 A linear transformation T : V → W is invertible if and only if it is one-to-one and onto.

6.5.4 Isomorphism of Vector Spaces

Definition 16 A linear transformation T : V → W is an isomorphism if it is one-to-one and onto. Then we say that V is isomorphic to W, and we write V ≅ W.

Example 38 Show that R^{n+1} and P_n are isomorphic.
Proof. Let T(e_j) = x^{j−1}, j = 1, ..., n + 1, extended linearly.

Theorem 16 Let dim V < ∞ and dim W < ∞. Then V ≅ W if and only if dim V = dim W.

Example 39 Show that M_33 and P_9 are NOT isomorphic.
6.6 The Matrix of a Linear Transformation

6.6.1 Matrix of a Linear Transformation

Definition 17 Let V and W be two vector spaces with dim V = n and dim W = m. Let B = {v_1, ..., v_n} be a basis of V and C be a basis of W. Then

A = [ [T(v_1)]_C · · · [T(v_n)]_C ]

is called the matrix of T with respect to bases B and C. We write A = [T]_{C←B}. When V = W and B = C, we simply write [T]_{C←B} as [T]_B.

Theorem 17 For every v ∈ V, A[v]_B = [T(v)]_C.
Proof. Define isomorphisms N : V → R^n and M : W → R^m by

N(v) = [v]_B, M(w) = [w]_C.

Then N(v_i) = e_i, and

(M ◦ T ◦ N^{-1})([v]_B) = [T(v)]_C.
Example 40 Let T : R^3 → R^2 be given by

T([x, y, z]^T) = [x + 2y, y − 3z]^T.

Let B = {e_1, e_2, e_3} and C = {e_1, e_2}.
(i) Find the matrix of T with respect to bases B and C.
(ii) Verify A[v]_B = [T(v)]_C for v = [1, 2, 3]^T.
Example 41 Let T : P_2 → P_2 be given by

T(p(x)) = p(2 + x).

Let B = {1, x, x^2}.
(i) Find the matrix [T]_B.
(ii) Use (i) to calculate T(1 − x − x^2).
6.6.2 Matrices of Composite and Inverse Linear Transformations

Theorem 18 Let U, V and W be finite-dimensional vector spaces with bases B, C, and D respectively. Let T : U → V and S : V → W be two linear transformations. Then

[S ◦ T]_{D←B} = [S]_{D←C} [T]_{C←B}.

Proof. Let v ∈ U. Then

[(S ◦ T)(v)]_D = [S(T(v))]_D = [S]_{D←C} [T(v)]_C = [S]_{D←C} [T]_{C←B} [v]_B.
Example 42 Let T : R^2 → M_22 and S : M_22 → P_3 be two linear transformations defined by:

T([a, b]^T) = [a, b; a + b, a − b],  S([a, b; c, d]) = a + (b + c)x − dx^2 + (a + c)x^3.

Let B, C, and D be the standard bases of R^2, M_22 and P_3 respectively. Find [S ◦ T]_{D←B}.
Solution:

T([1, 0]^T) = [1, 0; 1, 1],  T([0, 1]^T) = [0, 1; 1, −1].

Thus

[T]_{C←B} = [1, 0; 0, 1; 1, 1; 1, −1].

Also,

S(E_11) = 1 + x^3, S(E_12) = x, S(E_21) = x + x^3, S(E_22) = −x^2.

Thus

[S]_{D←C} = [1, 0, 0, 0; 0, 1, 1, 0; 0, 0, 0, −1; 1, 0, 1, 0].
Theorem 19 Let U and V be n-dimensional vector spaces with bases B and C respectively. Let T : U → V be a linear transformation. Then T is invertible if and only if the matrix [T]_{C←B} is invertible, and we have

[T^{-1}]_{B←C} = ([T]_{C←B})^{-1}.

Proof. Let v ∈ ker(T). Then

[T]_{C←B} [v]_B = [T(v)]_C = [⃗0]_C = ⃗0.
Example 43 Let S : M_22 → P_3 be the linear transformation defined by:

S([a, b; c, d]) = a + (b + c)x − dx^2 + (a + c)x^3.

Let C and D be the standard bases of M_22 and P_3 respectively. Find [S^{-1}]_{C←D}.
Solution:

[S^{-1}]_{C←D} = [1, 0, 0, 0; 0, 1, 1, 0; 0, 0, 0, −1; 1, 0, 1, 0]^{-1}.
6.6.3 Change of Basis and Similarity

Theorem 20 Let V be a finite-dimensional vector space with bases B and C. Let T : V → V be a linear transformation. Then

[T]_C = P^{-1} [T]_B P,

where P = P_{B←C} is the change-of-basis matrix from C to B.

Proof.

[I]_{B←C} [T]_{C←C} = [I ◦ T]_{B←C} = [T ◦ I]_{B←C} = [T]_{B←B} [I]_{B←C},

i.e., P [T]_C = [T]_B P, so [T]_C = P^{-1} [T]_B P and [T]_C ∼ [T]_B.
Example 44 Let T : R^2 → R^2 be defined by:

T([a, b]^T) = [a + 3b, 2a + 2b]^T.

Let E be the standard basis of R^2, and let C = {[1, 1]^T, [3, −2]^T}. Find [T]_C.
Solution:

[T]_E = [1, 3; 2, 2],  P = P_{E←C} = [1, 3; 1, −2].

Thus

[T]_C = P^{-1} [T]_E P = [4, 0; 0, −1].
Definition 18 Let V be a finite-dimensional vector space and let T : V → V be a linear transformation. If there is a basis C of V such that [T]_C is a diagonal matrix, then T is called diagonalizable.

Example 45 Let T : P_2 → P_2 be given by

T(p(x)) = p(2x − 1).

Let E be the standard basis of P_2, and let B = {1 + x, 1 − x, x^2}.
(i) Find the matrix [T]_B.
(ii) Show that T is diagonalizable by finding a basis C such that [T]_C is a diagonal matrix.
Solution:
(i) [T]_E = [1, −1, 1; 0, 2, −4; 0, 0, 4], P_{E←B} = [1, 1, 0; 1, −1, 0; 0, 0, 1]. Thus

[T]_B = P_{E←B}^{-1} [T]_E P_{E←B} = [1, 0, −3/2; −1, 2, 5/2; 0, 0, 4].

(ii) C = {1, −1 + x, 1 − 2x + x^2}, for which [T]_C = diag(1, 2, 4).
6.7 An Application: Homogeneous Linear Differential Equations

6.7.1 First-Order Homogeneous Linear Differential Equations

First-order homogeneous linear differential equation:

y′(t) + a y(t) = 0,

where a is a constant.

Theorem 21 Let S = {y : y′ + ay = 0}. Then
(i) S is a subspace of F.
(ii) {e^{−at}} is a basis of S, so dim S = 1.

Proof. Let x(t) be a general solution. Then [x(t) e^{at}]′ = 0.
Example 46 A bacteria culture grows at a rate proportional to its size. After 2 hours there are 40 bacteria and after 4 hours the count is 120. Find an expression for the population after t hours.
Solution: We measure the time t in hours. Let P(t) be the population at t hours; then

dP/dt = kP.

The solution of the equation is

P(t) = P(0) e^{kt}.

Since P(2) = 40 and P(4) = 120, we obtain

40 = P(0) e^{2k}, 120 = P(0) e^{4k}.

These imply that

P(0) = 40/3 and e^{2k} = 3, i.e., k = (ln 3)/2.

We thus have

P(t) = (40/3) 3^{t/2} = (40/3) (√3)^t = (40/3) e^{t (ln 3)/2}.
Example 47 The half-life of Sodium-24 is 15 hours. Suppose you have 100 grams of Sodium-24. How many grams remain after 27 minutes (keep three decimals)?
Solution: Let m(t) be the amount after t hours. Then

m(t) = m(0) (1/2)^{t/H},

where m(0) = 100 and H = 15 hours. Note that 27 minutes = 27/60 = 0.45 hours. Thus

m(0.45) = 100 (1/2)^{0.45/15} = 100 (1/2)^{0.03} ≈ 97.942 g.
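Both growth/decay formulas are easy to sanity-check; a minimal sketch in plain Python:

```python
# Example 46: P(t) = (40/3) * 3^(t/2); Example 47: m(t) = 100 * (1/2)^(t/15).
P = lambda t: (40 / 3) * 3 ** (t / 2)
print(P(2), P(4))                  # approximately 40 and 120

m = lambda t: 100 * 0.5 ** (t / 15)
print(round(m(0.45), 3))           # 97.942
```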
6.7.2 Second-Order Homogeneous Linear Differential Equations

Second-order homogeneous linear differential equation:

y″(t) + a y′(t) + b y(t) = 0,

where a and b are constants.

Theorem 22 Let S = {y : y″(t) + a y′(t) + b y(t) = 0}, and let λ_1 and λ_2 be the two solutions of the characteristic equation λ^2 + aλ + b = 0. Then
(i) S is a subspace of F.
(ii) If λ_1 ≠ λ_2, then {e^{λ_1 t}, e^{λ_2 t}} is a basis of S, so dim S = 2.
(iii) If λ_1 = λ_2, then {e^{λ_1 t}, t e^{λ_1 t}} is a basis of S, so dim S = 2.

Proof. Omitted.
Example 48 Find the solution spaces of y″(t) − y′(t) − 12y(t) = 0 and y″(t) − 6y′(t) + 9y(t) = 0.
6.7.3 Linear Codes

For the purposes of coding, we will be working with linear algebra over Z_p.
Let Z_p^n be the set of vectors of length n (n entries) such that each entry is an integer between 0 and p − 1 (inclusive). The only scalars are 0, 1, ..., p − 1. Addition is carried out mod p:

a + b ≡ c (mod p).

For example, 3 + 4 ≡ 2 (mod 5).

Definition 19 A linear code C is a subspace of Z_p^n. If dim C = k, then C is called an (n, k) code.
Example 49 Let

C_1 = { [0,0,0,0]^T, [1,1,1,1]^T, [0,0,1,1]^T, [1,1,0,0]^T },
C_2 = { [0,0,0,0]^T, [1,1,1,1]^T, [0,1,1,1]^T, [1,1,0,0]^T }.

Show that C_1 is a linear code, but C_2 is not a linear code. (A small computational check is sketched below.)
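Over Z_2 the only scalars are 0 and 1, so a subset containing the zero vector is a subspace exactly when it is closed under addition mod 2. A sketch of that test, assuming numpy is available; `is_linear_code` is a hypothetical helper name:

```python
import numpy as np
from itertools import product

def is_linear_code(vectors):
    """Check closure of a subset of Z_2^n under addition mod 2."""
    vecs = {tuple(v) for v in vectors}
    return all(tuple((np.array(u) + np.array(v)) % 2) in vecs
               for u, v in product(vecs, repeat=2))

C1 = [(0,0,0,0), (1,1,1,1), (0,0,1,1), (1,1,0,0)]
C2 = [(0,0,0,0), (1,1,1,1), (0,1,1,1), (1,1,0,0)]
print(is_linear_code(C1), is_linear_code(C2))   # True False
```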
5.1 Orthogonality in R^n

5.1.1 Review

Definition 20 Let ⃗u, ⃗v be two vectors in R^n, i.e.,

⃗u = [u_1, u_2, ..., u_n]^T, ⃗v = [v_1, v_2, ..., v_n]^T.

The number ⃗u^T ⃗v = u_1 v_1 + u_2 v_2 + ... + u_n v_n is called the inner product or dot product of ⃗u and ⃗v and is denoted by ⃗u · ⃗v.

Definition 21 The length or norm of a vector ⃗v = [v_1, v_2, ..., v_n]^T is defined by

‖⃗v‖ = √(⃗v · ⃗v) = √(v_1^2 + ... + v_n^2).

If c is a scalar, then ‖c⃗v‖ = |c| ‖⃗v‖.

Normalization: A vector of length 1 is called a unit vector. Given any nonzero vector ⃗v, we can change it into a unit vector: ⃗v/‖⃗v‖ is a unit vector in the direction of ⃗v, and −⃗v/‖⃗v‖ is a unit vector in the opposite direction. This process of constructing a unit vector in the direction of a given vector ⃗v is called normalizing ⃗v.

Definition 22 Let ⃗u, ⃗v be two vectors in R^n. The distance between ⃗u and ⃗v is the length of the vector ⃗u − ⃗v:

dist(⃗u, ⃗v) = ‖⃗u − ⃗v‖.
Angle between two vectors: Let ⃗u, ⃗v be two nonzero vectors, and let θ (0 ≤ θ ≤ π) be the angle between them. Then

cos θ = (⃗u · ⃗v) / (‖⃗u‖ ‖⃗v‖).
5.1.2 Orthogonal Sets

Definition 23 (Orthogonality) Let ⃗u, ⃗v be two vectors in R^n. They are said to be orthogonal if ⃗u · ⃗v = 0.

Theorem 23 (The Pythagorean Theorem) Two vectors ⃗u, ⃗v are orthogonal if and only if

‖⃗u + ⃗v‖^2 = ‖⃗u‖^2 + ‖⃗v‖^2.

Definition 24 If each pair of distinct vectors in a set is orthogonal, then the set is called an orthogonal set.
Example 50 Is the set ⃗v_1 = [0, 1, −2, 1]^T, ⃗v_2 = [0, 0, 1, 2]^T, ⃗v_3 = [0, −5, −2, 1]^T an orthogonal set?
Sol: We need to check:

⃗v_1 · ⃗v_2 = 0 + 0 + (−2) + 2 = 0
⃗v_1 · ⃗v_3 = 0 + (−5) + 4 + 1 = 0
⃗v_2 · ⃗v_3 = 0 + 0 + (−2) + 2 = 0.

Thus ⃗v_1 ⊥ ⃗v_2, ⃗v_1 ⊥ ⃗v_3, ⃗v_2 ⊥ ⃗v_3, and the set is orthogonal.
Theorem 24 If a set of nonzero vectors is orthogonal, then the set is linearly independent.

Definition 25 (Orthogonal Basis) If S = {⃗v_1, ⃗v_2, ..., ⃗v_m} is an orthogonal set of nonzero vectors, then it is called an orthogonal basis for the subspace W = Span{⃗v_1, ⃗v_2, ..., ⃗v_m}.

Example 51 Let ⃗v_1 = [1, −2, 1]^T, ⃗v_2 = [0, 1, 2]^T, ⃗v_3 = [−5, −2, 1]^T. Show that {⃗v_1, ⃗v_2, ⃗v_3} is an orthogonal basis for R^3.
Proof.

⃗v_1 · ⃗v_2 = 0 + (−2) + 2 = 0
⃗v_1 · ⃗v_3 = (−5) + 4 + 1 = 0
⃗v_2 · ⃗v_3 = 0 + (−2) + 2 = 0.

Thus ⃗v_1 ⊥ ⃗v_2, ⃗v_1 ⊥ ⃗v_3, ⃗v_2 ⊥ ⃗v_3. The set {⃗v_1, ⃗v_2, ⃗v_3} is an orthogonal set of nonzero vectors, so the vectors are linearly independent. Three such vectors automatically form a basis for R^3.
Theorem 25 If ⃗y is a vector in W = Span{⃗v_1, ⃗v_2, ..., ⃗v_m}, where S = {⃗v_1, ⃗v_2, ..., ⃗v_m} is an orthogonal set of nonzero vectors, then ⃗y may be written uniquely as a linear combination of the vectors in S:

⃗y = c_1 ⃗v_1 + c_2 ⃗v_2 + ... + c_m ⃗v_m,  c_k = (⃗y · ⃗v_k)/(⃗v_k · ⃗v_k),  k = 1, ..., m.

Proof.

⃗y · ⃗v_1 = (c_1 ⃗v_1 + c_2 ⃗v_2 + ... + c_m ⃗v_m) · ⃗v_1 = c_1 ⃗v_1 · ⃗v_1 + c_2 ⃗v_2 · ⃗v_1 + ... + c_m ⃗v_m · ⃗v_1 = c_1 ⃗v_1 · ⃗v_1 + 0,

and similarly for the other coefficients.
Example 52 Let ⃗v_1 = [1, −2, 1]^T, ⃗v_2 = [0, 1, 2]^T, ⃗v_3 = [−5, −2, 1]^T, ⃗x = [3, 2, 1]^T. Represent ⃗x as a linear combination of {⃗v_1, ⃗v_2, ⃗v_3}.
5.1.3 Orthonormal Sets

Definition 26 If S = {⃗u_1, ⃗u_2, ..., ⃗u_m} is an orthogonal set of unit vectors, then it is called an orthonormal set. If an orthonormal set S spans some subspace W, then S is called an orthonormal basis for W. (S is an orthogonal set that spans W, so it is linearly independent and thus a basis for W.)

Example 53 The set {⃗e_1, ⃗e_2, ..., ⃗e_n} is an orthonormal set that spans R^n, thus an orthonormal basis for R^n, which is called the standard basis for R^n.

Theorem 26 A matrix A has orthonormal columns if and only if A^T A = I.

Definition 27 An n × n matrix U is called an orthogonal matrix if its columns form an orthonormal set.
Theorem 27 An n × n matrix U is an orthogonal matrix if and only if U^{-1} = U^T.

Example 54 Show that A = [√2/6, √2/2, −2/3; 4√2/6, 0, 1/3; √2/6, −√2/2, −2/3] is an orthogonal matrix.
Proof. It is easy to check that A^T A = I. Thus A^{-1} = A^T.

Example 55 The same A = [√2/6, √2/2, −2/3; 4√2/6, 0, 1/3; √2/6, −√2/2, −2/3] has orthonormal columns.
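The A^T A = I check of Examples 54-55 is easy to verify numerically. A sketch, assuming numpy is available:

```python
import numpy as np

s = np.sqrt(2)
A = np.array([[    s / 6,  s / 2, -2 / 3],
              [4 * s / 6,      0,  1 / 3],
              [    s / 6, -s / 2, -2 / 3]])
print(np.allclose(A.T @ A, np.eye(3)))     # True: orthonormal columns (Theorem 26)
print(np.allclose(np.linalg.inv(A), A.T))  # True: A^{-1} = A^T (Theorem 27)
```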
Theorem 28 Let A be an m × n matrix with orthonormal columns, and let ⃗x, ⃗y be in R^n. Then:
1. ‖A⃗x‖ = ‖⃗x‖.
2. (A⃗x) · (A⃗y) = ⃗x · ⃗y.
3. (A⃗x) · (A⃗y) = 0 if and only if ⃗x · ⃗y = 0.

Proof. 1.

‖A⃗x‖^2 = (A⃗x)^T (A⃗x) = ⃗x^T A^T A ⃗x = ⃗x^T I ⃗x = ⃗x^T ⃗x = ‖⃗x‖^2.

2.

(A⃗x) · (A⃗y) = (A⃗x)^T (A⃗y) = ⃗x^T A^T A ⃗y = ⃗x^T I ⃗y = ⃗x^T ⃗y.
Property 5 Let A be an orthogonal matrix. Then:
(i) A^{-1} is orthogonal.
(ii) det(A) = ±1.
(iii) If λ is an eigenvalue of A, then |λ| = 1.
(iv) If A and B are orthogonal n × n matrices, then so is AB.
5.2 Orthogonal Complements and Orthogonal Projections

5.2.1 Orthogonal Complements

Definition 28 (Orthogonal Complement) Let W be a plane and L a line intersecting W at a point where L is orthogonal to W. We call L the orthogonal complement of W and denote it by L = W^⊥. Similarly, we may think of W as being perpendicular to L, so W may be called the orthogonal complement of L, denoted W = L^⊥. In general, the orthogonal complement W^⊥ of a subspace W of R^n is the set of all vectors orthogonal to every vector in W.
Properties of the Orthogonal Complement:
• A vector ⃗x is in W^⊥ if and only if ⃗x is orthogonal to every vector in a set that spans W.
• W^⊥ is a subspace.
• W ∩ W^⊥ = {⃗0}.
• If W = span{w_1, ..., w_k}, then v ∈ W^⊥ if and only if v · w_i = 0 for all i.
• (Row A)^⊥ = Nul A, (Col A)^⊥ = Nul A^T.
Example 56 Let

W = span{ [−1, 2, 1]^T, [3, 1, 1]^T },  ⃗x = [1, 4, −7]^T.

1) Show that ⃗x ∈ W^⊥.
2) Find all other vectors in W^⊥.
Solution: 1)

[−1, 2, 1]^T · ⃗x = 0,  [3, 1, 1]^T · ⃗x = 0.

2) Let ⃗x = [x_1, x_2, x_3]^T ∈ W^⊥. Then

[−1, 2, 1]^T · ⃗x = 0 and [3, 1, 1]^T · ⃗x = 0 ⇒ A⃗x = 0, A = [−1, 2, 1; 3, 1, 1] ∼ [−4, 1, 0; 0, 7, 4].

The solution of this system is x_1 = (1/4)x_2, x_3 = −(7/4)x_2, i.e.,

⃗x = (1/4) x_2 [1, 4, −7]^T.
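Part 2) says that W^⊥ is the null space of the matrix whose rows span W. A sketch of that computation, assuming numpy and scipy are available:

```python
import numpy as np
from scipy.linalg import null_space   # assumes scipy is installed

A = np.array([[-1.0, 2.0, 1.0],
              [ 3.0, 1.0, 1.0]])
N = null_space(A)                     # one column, proportional to [1, 4, -7]
print(N[:, 0] / N[0, 0])              # [1., 4., -7.]
```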
5.2.2 Orthogonal Projections

Given two vectors ⃗v and ⃗x, we would like to write ⃗v as a linear combination of two orthogonal vectors: one in the direction of ⃗x and another vector ⃗y orthogonal to ⃗x. So

⃗v = α⃗x + ⃗y ⇒ α = (⃗v · ⃗x)/(⃗x · ⃗x) ⇒ ⃗y = ⃗v − ((⃗v · ⃗x)/(⃗x · ⃗x)) ⃗x.

Definition 29

proj_⃗x ⃗v = ((⃗v · ⃗x)/(⃗x · ⃗x)) ⃗x

is called the orthogonal projection of ⃗v onto ⃗x, and

perp_⃗x ⃗v = ⃗v − ((⃗v · ⃗x)/(⃗x · ⃗x)) ⃗x

is the component of ⃗v orthogonal to ⃗x. If W = Span{⃗v_1, ⃗v_2, ..., ⃗v_m}, where S = {⃗v_1, ⃗v_2, ..., ⃗v_m} is an orthogonal set of nonzero vectors, then the orthogonal projection of ⃗y onto W is defined as

proj_W ⃗y = c_1 ⃗v_1 + c_2 ⃗v_2 + ... + c_m ⃗v_m,  c_k = (⃗y · ⃗v_k)/(⃗v_k · ⃗v_k),  k = 1, ..., m.

The component of ⃗y orthogonal to W is

perp_W ⃗y = ⃗y − proj_W ⃗y.
Example 57 Let ⃗v = [1, −2]^T, ⃗x = [1, 2]^T.
1) Find the orthogonal projection of ⃗v onto ⃗x.
2) Write ⃗v as the sum of two vectors, one in Span{⃗x} and one orthogonal to ⃗x.
3) Find the distance from ⃗v to the line through ⃗x and the origin (i.e., L = Span{⃗x}).
Sol: 1)

v̂ = ((⃗v · ⃗x)/(⃗x · ⃗x)) ⃗x = (−3/5)[1, 2]^T = [−0.6, −1.2]^T.

2) The component of ⃗v orthogonal to ⃗x is

⃗v − v̂ = [1.6, −0.8]^T.

Thus

⃗v = [−0.6, −1.2]^T + [1.6, −0.8]^T.

3) The distance is

‖⃗v − v̂‖ = ‖[1.6, −0.8]^T‖ = √(1.6^2 + 0.8^2) = √3.2.
5.2.3 The Orthogonal Decomposition Theorem

Theorem 29 Let W be a subspace of R^n with orthogonal basis {⃗v_1, ⃗v_2, ..., ⃗v_m}. Then each ⃗y in R^n can be written uniquely in the form

⃗y = ŷ + ⃗z,  ŷ ∈ W,  ⃗z ∈ W^⊥,

where

ŷ = ((⃗y · ⃗v_1)/(⃗v_1 · ⃗v_1)) ⃗v_1 + · · · + ((⃗y · ⃗v_m)/(⃗v_m · ⃗v_m)) ⃗v_m,  ⃗z = ⃗y − ŷ.
Example 58 Let ⃗y = [1, 1, 1, 1]^T, ⃗v_1 = [0, 1, −2, 1]^T, ⃗v_2 = [0, 0, 1, 2]^T, ⃗v_3 = [0, −5, −2, 1]^T. Let W = Span{⃗v_1, ⃗v_2, ⃗v_3}. Find proj_W ⃗y.
Sol: Since

⃗v_1 · ⃗v_2 = 0 + 0 + (−2) + 2 = 0
⃗v_1 · ⃗v_3 = 0 + (−5) + 4 + 1 = 0
⃗v_2 · ⃗v_3 = 0 + 0 + (−2) + 2 = 0,

we have ⃗v_1 ⊥ ⃗v_2, ⃗v_1 ⊥ ⃗v_3, ⃗v_2 ⊥ ⃗v_3, so {⃗v_1, ⃗v_2, ⃗v_3} is an orthogonal basis of W. Then

proj_W ⃗y = ŷ = ((⃗y · ⃗v_1)/(⃗v_1 · ⃗v_1)) ⃗v_1 + ((⃗y · ⃗v_2)/(⃗v_2 · ⃗v_2)) ⃗v_2 + ((⃗y · ⃗v_3)/(⃗v_3 · ⃗v_3)) ⃗v_3
         = 0 [0, 1, −2, 1]^T + (3/5) [0, 0, 1, 2]^T + (−6/30) [0, −5, −2, 1]^T
         = [0, 1, 1, 1]^T.
Property: If we have an orthogonal basis {⃗v_1, ⃗v_2, ..., ⃗v_m} for W and if ⃗y ∈ W, then proj_W ⃗y = ⃗y.
Proof. Since ⃗y ∈ W,

⃗y = d_1 ⃗v_1 + d_2 ⃗v_2 + ... + d_m ⃗v_m ⇒ ⃗y · ⃗v_1 = d_1 ⃗v_1 · ⃗v_1, ..., ⃗y · ⃗v_m = d_m ⃗v_m · ⃗v_m.

Thus

proj_W ⃗y = ((⃗y · ⃗v_1)/(⃗v_1 · ⃗v_1)) ⃗v_1 + · · · + ((⃗y · ⃗v_m)/(⃗v_m · ⃗v_m)) ⃗v_m
         = ((d_1 ⃗v_1 · ⃗v_1)/(⃗v_1 · ⃗v_1)) ⃗v_1 + · · · + ((d_m ⃗v_m · ⃗v_m)/(⃗v_m · ⃗v_m)) ⃗v_m
         = d_1 ⃗v_1 + d_2 ⃗v_2 + ... + d_m ⃗v_m = ⃗y.
Theorem 30 Let W be a subspace of R^n with orthonormal basis {⃗u_1, ⃗u_2, ..., ⃗u_m}. Then for each ⃗y in R^n,

ŷ = proj_W ⃗y = (⃗y · ⃗u_1) ⃗u_1 + · · · + (⃗y · ⃗u_m) ⃗u_m.

If U = [⃗u_1 ⃗u_2 · · · ⃗u_m], then

proj_W ⃗y = U U^T ⃗y.
Example 59 Let ⃗y = [1, 1, 1]^T, ⃗u_1 = [2/3, 1/3, 2/3]^T, ⃗u_2 = [−2/3, 2/3, 1/3]^T. Let W = Span{⃗u_1, ⃗u_2}. Find proj_W ⃗y.
Note that ⃗u_1 · ⃗u_2 = 0 and ‖⃗u_1‖ = ‖⃗u_2‖ = 1, so {⃗u_1, ⃗u_2} is an orthonormal basis of W. Then

proj_W ⃗y = U U^T ⃗y = [2/3, −2/3; 1/3, 2/3; 2/3, 1/3] [2/3, −2/3; 1/3, 2/3; 2/3, 1/3]^T [1, 1, 1]^T = [8/9, 7/9, 11/9]^T.
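The U U^T formula is easy to check numerically. A sketch, assuming numpy is available:

```python
import numpy as np

# Example 59: for orthonormal columns U, proj_W(y) = U U^T y.
U = np.array([[ 2/3, -2/3],
              [ 1/3,  2/3],
              [ 2/3,  1/3]])
y = np.array([1.0, 1.0, 1.0])
print(U @ U.T @ y)                      # [8/9, 7/9, 11/9]
print(np.allclose(U.T @ U, np.eye(2)))  # True: the columns are orthonormal
```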
Theorem 31 (The Best Approximation Theorem) Let W be a subspace of R^n, ⃗y in R^n, and ŷ the orthogonal projection of ⃗y onto W. Then ŷ is the closest point in W to ⃗y, in the sense that

‖⃗y − ŷ‖ < ‖⃗y − ⃗v‖

for all ⃗v ∈ W distinct from ŷ.
Example 60 Let ⃗y = [3, −1, 1, 13]^T, ⃗v_1 = [1, −2, −1, 2]^T, ⃗v_2 = [−4, 1, 0, 3]^T. Let W = Span{⃗v_1, ⃗v_2}. Find the distance from ⃗y to W.
Sol: The closest point in W to ⃗y is proj_W ⃗y, so the distance is ‖⃗y − proj_W ⃗y‖. Note that ⃗v_1 · ⃗v_2 = 0, so {⃗v_1, ⃗v_2} is an orthogonal basis of W, and

proj_W ⃗y = ((⃗y · ⃗v_1)/(⃗v_1 · ⃗v_1)) ⃗v_1 + ((⃗y · ⃗v_2)/(⃗v_2 · ⃗v_2)) ⃗v_2 = [−1, −5, −3, 9]^T
⇒ ⃗y − proj_W ⃗y = [4, 4, 4, 4]^T ⇒ ‖⃗y − proj_W ⃗y‖ = 8.
Theorem 32 If W is a subspace of R^n, then

dim W + dim W^⊥ = n.

Proof. (1) A basis of W and a basis of W^⊥ together form a linearly independent set; (2) by the decomposition theorem, any vector in R^n can be written as a linear combination of this set.
5.3 The Gram-Schmidt Process and the QR Factorization

5.3.1 The Gram-Schmidt Process (Algorithm)

Let S = {X_1, X_2, ..., X_k} be a set of vectors, and let

F_1 = X_1
F_2 = X_2 − ((X_2 · F_1)/‖F_1‖^2) F_1
· · ·
F_k = X_k − ((X_k · F_1)/‖F_1‖^2) F_1 − ((X_k · F_2)/‖F_2‖^2) F_2 − · · · − ((X_k · F_{k−1})/‖F_{k−1}‖^2) F_{k−1}.

Then {F_1, F_2, ..., F_k} is an orthogonal set.
Example 61 Consider the following independent set S = {X_1, X_2, X_3} of vectors from R^4:

X_1 = [1, 1, 1, 1]^T, X_2 = [6, 0, 0, 2]^T, X_3 = [−1, −1, 2, 4]^T.

Use the Gram-Schmidt algorithm to convert the set S = {X_1, X_2, X_3} into an orthogonal set B = {F_1, F_2, F_3}.
Solution: Let F_1 = X_1 = [1, 1, 1, 1]^T. Then

F_2 = X_2 − ((X_2 · F_1)/‖F_1‖^2) F_1 = [6, 0, 0, 2]^T − (8/4)[1, 1, 1, 1]^T = [4, −2, −2, 0]^T,

F_3 = X_3 − ((X_3 · F_1)/‖F_1‖^2) F_1 − ((X_3 · F_2)/‖F_2‖^2) F_2
    = [−1, −1, 2, 4]^T − (4/4)[1, 1, 1, 1]^T − (−6/24)[4, −2, −2, 0]^T
    = [−1, −2.5, 0.5, 3]^T.
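The algorithm of 5.3.1 translates directly into code. A sketch applied to Example 61, assuming numpy is available; `gram_schmidt` is a hypothetical helper name:

```python
import numpy as np

def gram_schmidt(X):
    """Columns of X in, columns F_1, ..., F_k of an orthogonal set out."""
    F = []
    for x in X.T:
        f = x.astype(float)
        for g in F:                        # subtract projection onto each F_i
            f = f - (x @ g) / (g @ g) * g
        F.append(f)
    return np.column_stack(F)

X = np.column_stack(([1, 1, 1, 1], [6, 0, 0, 2], [-1, -1, 2, 4]))
F = gram_schmidt(X)
print(F.T)                                 # rows F1, F2, F3 as in the example
print(np.round(F.T @ F, 10))               # diagonal: pairwise orthogonal
```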
5.3.2 The QR Factorization

The Gram-Schmidt process (algorithm) yields the following QR factorization:

Theorem 33 If A is an m × n matrix with linearly independent columns, then

A = QR,

where Q is an m × n matrix with orthonormal columns and R is an upper triangular matrix.

Proof. Let {X_1, X_2, ..., X_k} be the columns of A. Then there exists an orthonormal set {F_1, F_2, ..., F_k} such that

X_1 = c_11 F_1
X_2 = c_21 F_1 + c_22 F_2
· · ·
X_k = c_k1 F_1 + c_k2 F_2 + · · · + c_kk F_k.

Remark. c_ij = X_i · F_j.
Example 62

[1, 0, −3; 0, 2, −1; 1, 0, 1; 1, 3, 5] = [1/√3, −1/√10, −3/√23; 0, 2/√10, −3/√23; 1/√3, −1/√10, 1/√23; 1/√3, 2/√10, 2/√23] [√3, √3, √3; 0, √10, √10; 0, 0, √23].
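A numerical check of Example 62, assuming numpy is available. Note that numpy's Q may differ from the factorization above by the signs of columns, but Q R = A either way:

```python
import numpy as np

A = np.array([[1.0, 0.0, -3.0],
              [0.0, 2.0, -1.0],
              [1.0, 0.0,  1.0],
              [1.0, 3.0,  5.0]])
Q, R = np.linalg.qr(A)        # Q: 4x3 orthonormal columns; R: 3x3 upper triangular
print(np.allclose(Q @ R, A))  # True
print(np.round(R, 4))         # compare with [sqrt(3) sqrt(3) sqrt(3); 0 sqrt(10) sqrt(10); 0 0 sqrt(23)]
```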
5.4 Orthogonal Diagonalization of Symmetric Matrices

Definition 30 A square matrix A is orthogonally diagonalizable if there exist an orthogonal matrix Q and a diagonal matrix D such that

Q^T A Q = D.

Conditions under which a matrix is orthogonally diagonalizable:

Spectral Theorem. Let A be a real square matrix. Then A is orthogonally diagonalizable if and only if A is symmetric.

Property 6 Let A be symmetric.
(i) If A is real, then all eigenvalues are real.
(ii) Any two eigenvectors corresponding to distinct eigenvalues are orthogonal.
Proof. Use x · y = x^T y.

Method to orthogonally diagonalize a symmetric matrix: the columns of Q consist of orthonormal bases of all the eigenspaces.

Example 63 Orthogonally diagonalize the matrix A = [2, 1, 1; 1, 2, 1; 1, 1, 2].
Solution: Step 1: Find all eigenvalues. The characteristic polynomial is −λ^3 + 6λ^2 − 9λ + 4, so λ = 1, 1, 4.
Step 2: Find a basis for each eigenspace:
Basis for E_1: [−1, 0, 1]^T, [−1, 1, 0]^T.
Basis for E_4: [1, 1, 1]^T.
Step 3: Find orthogonal bases for each eigenspace using the Gram-Schmidt process:
For E_1: [−1, 0, 1]^T, [−1/2, 1, −1/2]^T.
Basis for E_4: [1, 1, 1]^T.
Step 4: Normalize to get orthonormal bases for each eigenspace:
For E_1: [−1/√2, 0, 1/√2]^T, [−1/√6, 2/√6, −1/√6]^T.
Basis for E_4: [1/√3, 1/√3, 1/√3]^T.
Step 5: Construct Q and D:

Q = [−1/√2, −1/√6, 1/√3; 0, 2/√6, 1/√3; 1/√2, −1/√6, 1/√3],  D = [1, 0, 0; 0, 1, 0; 0, 0, 4].
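A numerical check of Example 63, assuming numpy is available: for a symmetric matrix, `eigh` returns an orthogonal Q with Q^T A Q = D, the Spectral Theorem in action.

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])
vals, Q = np.linalg.eigh(A)                      # eigenvalues 1, 1, 4
print(np.round(vals, 6))
print(np.allclose(Q.T @ A @ Q, np.diag(vals)))   # True
print(np.allclose(Q.T @ Q, np.eye(3)))           # True: Q is orthogonal
```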
5.5 An Application: Quadratic Forms

5.5.1 Quadratic Forms

Definition 31 A quadratic form in n variables is a function f : R^n → R of the form

f(x) = x^T A x, x ∈ R^n,

where A is a symmetric n × n matrix. We call A the matrix associated with f.
• Quadratic form in 2 variables: ax^2 + by^2 + cxy.
• Quadratic form in 3 variables: ax^2 + by^2 + cz^2 + dxy + exz + fyz.

Example 64 Find the matrix associated with the quadratic form

f(x_1, x_2, x_3) = 2x_1^2 + 3x_2^2 − x_3^2 − 8x_1x_2 + 10x_2x_3.

Solution: The coefficients of the squared terms x_i^2 go on the diagonal as a_ii of A; the coefficients of the cross-product terms x_i x_j are split in half between a_ij and a_ji. Thus

A = [2, −4, 0; −4, 3, 5; 0, 5, −1].
The Principal Axes Theorem: Every quadratic form can be diagonalized. Specifically, if A is the n × n symmetric matrix associated with the quadratic form x^T A x, and if Q is an orthogonal matrix such that Q^T A Q = D is a diagonal matrix, then the change of variable

x = Qy

transforms x^T A x into the quadratic form y^T D y:

x^T A x = y^T D y = λ_1 y_1^2 + λ_2 y_2^2 + · · · + λ_n y_n^2,

where λ_1, λ_2, ..., λ_n are the eigenvalues of A and y = [y_1, ..., y_n]^T. The process is called diagonalizing a quadratic form.
Example 65 Find a change of variable that transforms the quadratic form associated with the matrix A = [5, 2; 2, 8] into a quadratic form with no cross-product terms. Test the result with x = [2, 1]^T.
Solution: The quadratic form is f(x) = 5x_1^2 + 8x_2^2 + 4x_1x_2.
Step 1: Find all eigenvalues: λ = 9, 4.
Step 2: Find corresponding unit eigenvectors:
When λ = 9: q_1 = [1/√5, 2/√5]^T.
When λ = 4: q_2 = [2/√5, −1/√5]^T.
Step 3: Construct Q and D:

Q = [1/√5, 2/√5; 2/√5, −1/√5],  D = [9, 0; 0, 4].

Step 4: Let x = Qy; then

f(y) = f(y_1, y_2) = 9y_1^2 + 4y_2^2.

Step 5: When x = [2, 1]^T, y = Q^T x = [4/√5, 3/√5]^T, and

f(x) = 36, f(y) = 36.
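A numerical sketch of this diagonalization, assuming numpy is available:

```python
import numpy as np

# Example 65: diagonalizing f(x) = 5x1^2 + 8x2^2 + 4x1x2.
A = np.array([[5.0, 2.0], [2.0, 8.0]])
vals, Q = np.linalg.eigh(A)        # eigenvalues 4 and 9, orthonormal eigenvectors
x = np.array([2.0, 1.0])
y = Q.T @ x                        # the change of variable x = Q y
print(x @ A @ x, vals @ (y * y))   # 36.0 36.0: x^T A x = sum of lambda_i y_i^2
```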
Definition 32 A quadratic form f(x) = x^T A x (and also a symmetric matrix A) is classified as one of the following:
1. positive definite if f(x) > 0 for all x ≠ 0;
2. positive semidefinite if f(x) ≥ 0 for all x;
3. negative definite if f(x) < 0 for all x ≠ 0;
4. negative semidefinite if f(x) ≤ 0 for all x;
5. indefinite if f(x) takes on both positive and negative values.

Theorem 34 A quadratic form f(x) = x^T A x (and also a symmetric matrix A) is
1. positive definite if and only if the eigenvalues of A are positive;
2. positive semidefinite if and only if the eigenvalues of A are nonnegative;
3. negative definite if and only if the eigenvalues of A are negative;
4. negative semidefinite if and only if the eigenvalues of A are nonpositive;
5. indefinite if and only if A has both positive and negative eigenvalues.
Proof. Let x = Qy.
Example 66 Classify f(x, y, z) = y^2 + 2xy + 4xz + 2yz as positive definite, negative definite, indefinite, or none of these.
Solution: The eigenvalues of the associated matrix A are −2, 0, and 3. Thus f is indefinite.
5.5.2 Constrained Optimization Problems

Theorem 35 Let f(x) = x^T A x be a quadratic form, and let λ_1 ≥ λ_2 ≥ · · · ≥ λ_n be the eigenvalues of A. Then the following are true subject to the constraint ‖x‖ = 1:
1. λ_1 ≥ f(x) ≥ λ_n.
2. The maximum value of f(x) is λ_1, and it occurs when x is a unit eigenvector corresponding to λ_1.
3. The minimum value of f(x) is λ_n, and it occurs when x is a unit eigenvector corresponding to λ_n.

Proof. Use x = Qy; then 1 = x^T x = y^T y.
Example 67 Find the maximum and minimum of f(x, y) = 5x^2 + 2y^2 + 4xy subject to x^2 + y^2 = 1, and determine values of x and y at which each occurs.
Solution: max f = 6, attained at (x, y) = (2/√5, 1/√5); min f = 1, attained at (x, y) = (1/√5, −2/√5).
5.5.3 Graphing Quadratic Equations

Example 68 Identify and graph the conic (a curve obtained by slicing a cone with a plane) whose equation is

5x_1^2 + 4x_1x_2 + 2x_2^2 − (28/√5)x_1 − (4/√5)x_2 + 4 = 0.

Solution: With

Q = [1/√5, 2/√5; −2/√5, 1/√5],  D = [1, 0; 0, 6],

let x = Qy. Then the equation becomes

(y_1 − 2)^2 + 6(y_2 − 1)^2 = 6.

It is an ellipse with center (2, 1) in the y-coordinates, main axis along [1/√5, −2/√5]^T and second axis along [2/√5, 1/√5]^T.
7.1 Inner Product Spaces

7.1.1 Inner Products

Definition 33 Suppose u, v, and w are vectors in a vector space V and c is any scalar. An inner product on the vector space V is a function that associates with each pair of vectors in V, say u and v, a real number denoted by ⟨u, v⟩ that satisfies the following axioms:
(a) ⟨u, v⟩ = ⟨v, u⟩;
(b) ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩;
(c) ⟨cu, v⟩ = c⟨u, v⟩;
(d) ⟨u, u⟩ ≥ 0, and ⟨u, u⟩ = 0 if and only if u = 0.
A vector space along with an inner product is called an inner product space.

Example 69 Let ⃗u = [u_1, u_2, ..., u_n]^T and ⃗v = [v_1, v_2, ..., v_n]^T be two vectors in R^n. The following are inner products:
• Dot product: ⟨u, v⟩ = ⃗u · ⃗v = ⃗u^T ⃗v = u_1 v_1 + u_2 v_2 + ... + u_n v_n.
• Weighted dot product: ⟨u, v⟩ = w_1 u_1 v_1 + w_2 u_2 v_2 + ... + w_n u_n v_n, where w_1, ..., w_n are positive scalars.
Example 70
• Let A be a symmetric, positive definite matrix. Then ⟨u, v⟩ = u^T A v defines an inner product.
• ⟨A, B⟩ = trace(A^T B) defines an inner product on M_22.
• ⟨f, g⟩ = ∫_a^b f(x)g(x)dx defines an inner product on C[a, b].
• ⟨a_1 + b_1 x + c_1 x^2, a_2 + b_2 x + c_2 x^2⟩ = a_1 a_2 + b_1 b_2 + c_1 c_2 defines an inner product on P_2.
Property 7 Suppose u, v, and w are vectors in an inner product space V and c is any scalar.
(a) ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩;
(b) ⟨u, cv⟩ = c⟨u, v⟩;
(c) ⟨u, 0⟩ = ⟨0, v⟩ = 0.
7.1.2 Length, Distance, and Orthogonality

Definition 34 Suppose u and v are vectors in an inner product space V.
1. The length or norm of v is ‖v‖ = √⟨v, v⟩.
2. The distance between u and v is d(u, v) = ‖u − v‖.
3. u and v are orthogonal if ⟨u, v⟩ = 0.
A vector of length 1 is called a unit vector.

Example 71
• Consider the inner product ⟨f, g⟩ = ∫_0^1 f(x)g(x)dx on C[0, 1]. Given f(x) = 1 + 3x and g(x) = 1 − 3x, calculate ‖f‖, d(f, g), and ⟨f, g⟩.
• Consider the inner product ⟨a_1 + b_1 x + c_1 x^2, a_2 + b_2 x + c_2 x^2⟩ = a_1 a_2 + b_1 b_2 + c_1 c_2 on P_2. Given f(x) = 1 + 3x and g(x) = 1 − 3x, calculate ‖f‖, d(f, g), and ⟨f, g⟩.
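The integrals in the first bullet can be evaluated symbolically. A sketch, assuming sympy is available:

```python
import sympy as sp

# First part of Example 71 with <f, g> = integral of f*g over [0, 1].
x = sp.symbols('x')
f, g = 1 + 3*x, 1 - 3*x
ip = lambda u, v: sp.integrate(u * v, (x, 0, 1))
print(ip(f, f))                   # 7, so ||f|| = sqrt(7)
print(sp.sqrt(ip(f - g, f - g)))  # d(f, g) = 2*sqrt(3)
print(ip(f, g))                   # -2
```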
Pythagoras' Theorem. Let u, v be vectors in an inner product space V. Then u and v are orthogonal if and only if ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2.
Proof. Expand ‖u + v‖^2 = ⟨u + v, u + v⟩.
7.1.3 Orthogonal Projections and the Gram-Schmidt Process

Example 72 Apply the Gram-Schmidt process to the basis {1, x, x^2, x^3} of P_3 to find a basis that is orthogonal with respect to the inner product ⟨f, g⟩ = ∫_{−1}^1 f(x)g(x)dx.
Solution: {1, x, x^2 − 1/3, x^3 − (3/5)x}. Remark. These are called Legendre polynomials. If we divide them by their lengths, they are called normalized Legendre polynomials.
Deﬁnition 35 Let W be an inner product space with orthogonal basis {w
1
, · · · , w
k
}. Then
the orthogonal projection of v onto W is deﬁned as
proj
W
(v) =
< w
1
, v >
< w
1
, w
1
>
w
1
+· · · +
< w
k
, v >
< w
k
, w
k
>
w
k
.
The component of v orthogonal to W is
perp
W
(v) −proj
W
(v).
Example 73 Let p(x) = 5x³ + 2x² − x + 3. Find the projection of p(x) onto P₃ with the Legendre polynomials as a basis.
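Since p(x) already lies in P₃, its projection onto P₃ is p itself; the useful output is the set of coordinates of p relative to the Legendre basis. A sympy sketch of that computation:

```python
import sympy as sp

x = sp.symbols('x')
ip = lambda f, g: sp.integrate(f * g, (x, -1, 1))

legendre = [1, x, x**2 - sp.Rational(1, 3), x**3 - sp.Rational(3, 5) * x]
p = 5 * x**3 + 2 * x**2 - x + 3

# With an orthogonal basis, proj = sum of <L_i, p>/<L_i, L_i> * L_i
coeffs = [ip(L, p) / ip(L, L) for L in legendre]
proj = sp.expand(sum(c * L for c, L in zip(coeffs, legendre)))
print(coeffs)  # [11/3, 2, 2, 5]
print(proj)    # equals p, as expected
```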
Cauchy-Schwarz Inequality. Let u, v be vectors in an inner product space V. Then

|< u, v >| ≤ ‖u‖ ‖v‖,

with equality holding if and only if u and v are scalar multiples of each other.
Proof. If u ≠ 0, then

< u, v >² / ‖u‖² = ‖proj_u(v)‖² = ‖v‖² − ‖perp_u(v)‖² ≤ ‖v‖².
The Triangle Inequality. Let u, v be vectors in an inner product space V. Then ‖u + v‖ ≤ ‖u‖ + ‖v‖, with equality holding if and only if one of u and v is a nonnegative scalar multiple of the other.
7.2 Norms and Distance Functions
7.2.1 Norms
Deﬁnition 36 A norm on a vector space V is a mapping that associates with each vector v a real number ‖v‖ such that, for all u, v and scalars c:
(a) ‖v‖ ≥ 0, and ‖v‖ = 0 if and only if v = 0.
(b) ‖cv‖ = |c| ‖v‖.
(c) ‖u + v‖ ≤ ‖u‖ + ‖v‖.
A vector space along with a norm is called a normed linear space.
Example 74 Let ⃗u = [u₁, u₂, …, u_n]^T be a vector in Rⁿ. The following are norms:
• The sum norm: ‖u‖_s = |u₁| + |u₂| + … + |u_n|.
• The max norm (∞-norm, or uniform norm): ‖u‖_∞ = max{|u₁|, |u₂|, …, |u_n|}.
• The p-norm: ‖u‖_p = (|u₁|^p + |u₂|^p + … + |u_n|^p)^{1/p}. When p = 2, it is called the Euclidean norm.
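numpy exposes all of these through np.linalg.norm; a small sketch with an illustrative vector:

```python
import numpy as np

u = np.array([1.0, -2.0, 3.0])

sum_norm = np.linalg.norm(u, 1)        # |u1| + |u2| + |u3| = 6
max_norm = np.linalg.norm(u, np.inf)   # max |u_i| = 3
p_norm   = np.linalg.norm(u, 3)        # (|u1|^3 + |u2|^3 + |u3|^3)^(1/3)
euclid   = np.linalg.norm(u)           # p = 2 is the default
print(sum_norm, max_norm, p_norm, euclid)
```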
Example 75 Let v ∈ Z₂ⁿ. Define ‖v‖_H = w(v), the weight of v, which counts the number of 1's in v. Then it is a norm, called the Hamming norm.
7.2.2 Distance Function
Deﬁnition 37 From any norm, we define d(u, v) = ‖u − v‖.
Property 8 (a) d(u, v) ≥ 0, and d(u, v) = 0 if and only if u = v.
(b) d(u, v) = d(v, u).
(c) d(u, w) ≤ d(u, v) + d(v, w).
A function d satisfying these three properties is called a metric. A vector space that possesses such a function is called a metric space.
Example 76 Let ⃗u = [1, 2, 3]^T, ⃗v = [4, 4, 4]^T. Calculate d_E(u, v), d_s(u, v), and d_∞(u, v).

Example 77 Let ⃗u = [1, 1, 0]^T, ⃗v = [0, 1, 1]^T ∈ Z₂³. Calculate d_H(u, v).
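A short numpy sketch computing the three distances of Example 76 and the Hamming distance of Example 77:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 4, 4])

d_E   = np.linalg.norm(u - v)          # sqrt(9 + 4 + 1) = sqrt(14)
d_s   = np.linalg.norm(u - v, 1)       # 3 + 2 + 1 = 6
d_inf = np.linalg.norm(u - v, np.inf)  # max{3, 2, 1} = 3

# Hamming distance in Z_2^3: number of positions where the vectors differ
a = np.array([1, 1, 0])
b = np.array([0, 1, 1])
d_H = np.sum((a + b) % 2)              # = 2

print(d_E, d_s, d_inf, d_H)
```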
7.2.3 Matrix Norms
Deﬁnition 38 A matrix norm on M_nn is a mapping that associates with each matrix A a real number ‖A‖ such that, for all A, B and scalars c:
(a) ‖A‖ ≥ 0, and ‖A‖ = 0 if and only if A = O.
(b) ‖cA‖ = |c| ‖A‖.
(c) ‖A + B‖ ≤ ‖A‖ + ‖B‖.
(d) ‖AB‖ ≤ ‖A‖ ‖B‖.
A matrix norm on M_nn is said to be compatible with a vector norm ‖x‖ on Rⁿ if, for all A and x,

‖Ax‖ ≤ ‖A‖ ‖x‖.
Example 78 • The Frobenius norm:

‖A‖_F = √( ∑_{i,j=1}^n a_ij² ).

1. Show that it is compatible with the Euclidean norm.
2. Show that it is a matrix norm.
• ‖A‖ = max_{‖x‖=1} ‖Ax‖ defines another matrix norm, called the operator norm induced by the vector norm ‖x‖.
• ‖A‖₁ = max_{j=1,…,n} { ∑_{i=1}^n |a_ij| } = max_{j=1,…,n} { ‖a_j‖_s } defines a matrix norm, where a_j is the jth column of A.
• ‖A‖_∞ = max_{i=1,…,n} { ∑_{j=1}^n |a_ij| } = max_{i=1,…,n} { ‖b_i‖_s } defines a matrix norm, where b_i is the ith row of A.
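np.linalg.norm also covers these matrix norms (via the ord argument); the matrix A below is an illustrative choice:

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, 4.0]])

fro      = np.linalg.norm(A, 'fro')    # sqrt of the sum of squared entries
one_norm = np.linalg.norm(A, 1)        # max column sum of |a_ij| = 6
inf_norm = np.linalg.norm(A, np.inf)   # max row sum of |a_ij| = 7
op_norm  = np.linalg.norm(A, 2)        # operator norm induced by ||x||_E

# Compatibility of the Frobenius norm with the Euclidean norm
x = np.array([1.0, 1.0])
print(np.linalg.norm(A @ x) <= fro * np.linalg.norm(x))  # True
print(fro, one_norm, inf_norm, op_norm)
```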
7.3 Least Squares Approximation
7.3.1 The Best Approximation Theorem
Deﬁnition 39 Let W be a subspace of a normed linear space V. Then for any v ∈ V, the best approximation to v in W is the vector v̄ ∈ W such that

‖v − v̄‖ < ‖v − w‖

for any w ∈ W different from v̄.

The Best Approximation Theorem. Let W be a finite-dimensional subspace of an inner product space V. Then for any v ∈ V, the best approximation to v in W is proj_W(v).

Proof. v − proj_W(v) is orthogonal to proj_W(v) − w, so by Pythagoras' Theorem ‖v − w‖² = ‖v − proj_W(v)‖² + ‖proj_W(v) − w‖² > ‖v − proj_W(v)‖² whenever w ≠ proj_W(v).
Example 79 Let

⃗y = [1, 1, 1, 1]^T, ⃗v₁ = [0, 1, −2, 1]^T, ⃗v₂ = [0, 0, 1, 2]^T, ⃗v₃ = [0, −5, −2, 1]^T,

and let W = Span{⃗v₁, ⃗v₂, ⃗v₃}. Find the best approximation to ⃗y in W, i.e., proj_W ⃗y.

Sol: Since

⃗v₁ · ⃗v₂ = 0 + 0 + (−2) + 2 = 0,
⃗v₁ · ⃗v₃ = 0 + (−5) + 4 + 1 = 0,
⃗v₂ · ⃗v₃ = 0 + 0 + (−2) + 2 = 0,

we have ⃗v₁ ⊥ ⃗v₂, ⃗v₁ ⊥ ⃗v₃, and ⃗v₂ ⊥ ⃗v₃, so {⃗v₁, ⃗v₂, ⃗v₃} is an orthogonal basis of W. Then

proj_W ⃗y = ŷ = (⃗y · ⃗v₁ / ⃗v₁ · ⃗v₁) ⃗v₁ + (⃗y · ⃗v₂ / ⃗v₂ · ⃗v₂) ⃗v₂ + (⃗y · ⃗v₃ / ⃗v₃ · ⃗v₃) ⃗v₃
= 0 ⃗v₁ + (3/5) ⃗v₂ + (−6/30) ⃗v₃
= (3/5)[0, 0, 1, 2]^T − (1/5)[0, −5, −2, 1]^T
= [0, 1, 1, 1]^T.
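The arithmetic can be verified with a few lines of numpy:

```python
import numpy as np

y  = np.array([1.0, 1.0, 1.0, 1.0])
v1 = np.array([0.0, 1.0, -2.0, 1.0])
v2 = np.array([0.0, 0.0, 1.0, 2.0])
v3 = np.array([0.0, -5.0, -2.0, 1.0])

# Projection onto W using the orthogonal basis {v1, v2, v3}
proj = sum((y @ v) / (v @ v) * v for v in (v1, v2, v3))
print(proj)  # [0. 1. 1. 1.]

# y - proj must be orthogonal to each basis vector
print([(y - proj) @ v for v in (v1, v2, v3)])  # all (near) zero
```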
7.3.2 Least Squares Approximation
The line y = a + bx is called the line of best fit (or the least squares approximating line) for the points (x₁, y₁), …, (x_n, y_n) if it minimizes

(a + bx₁ − y₁)² + … + (a + bx_n − y_n)² = ‖A⃗x − ⃗b‖²,

where

A = [1 x₁; 1 x₂; …; 1 x_n],   ⃗x = [a; b],   ⃗b = [y₁; y₂; …; y_n].
Deﬁnition 40 Let A be an m × n matrix and ⃗b ∈ Rᵐ. A least squares solution of A⃗x = ⃗b is a vector ⃗y ∈ Rⁿ such that

‖⃗b − A⃗y‖ ≤ ‖⃗b − A⃗x‖

for all ⃗x ∈ Rⁿ.

The Least Squares Theorem. Let A be an m × n matrix and ⃗b ∈ Rᵐ. A least squares solution of A⃗x = ⃗b always exists. Moreover:

a. X is a least squares solution of A⃗x = ⃗b if and only if X is a solution of the normal equation

A^T A⃗x = A^T⃗b.

b. A has linearly independent columns if and only if A^T A is invertible. In this case, the least squares solution of A⃗x = ⃗b is unique and given by

X = (A^T A)⁻¹ A^T⃗b.

Proof. a. Let AX = proj_{col(A)} ⃗b. Then (⃗b − AX) ⊥ col(A). Thus A^T(⃗b − AX) = 0.
b. rank(A) = n if and only if A^T A is invertible.
Example 80 Find the least squares approximating line for the data points (1, 2), (2, 2) and (3, 4).

Solution: Let the line be y = a + bx. Then

[1 1; 1 2; 1 3] [a; b] = [2; 2; 4].

The normal equation is

[1 1; 1 2; 1 3]^T [1 1; 1 2; 1 3] [a; b] = [1 1; 1 2; 1 3]^T [2; 2; 4],  ⇒  [3 6; 6 14] [a; b] = [8; 18].

The solution is

[a; b] = [3 6; 6 14]⁻¹ [8; 18] = [2/3; 1].
Remark. Similarly, we can ﬁnd the parabola that gives the best least squares approximation
to data points.
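A quick numpy check of Example 80, both via the normal equation and via np.linalg.lstsq:

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([2.0, 2.0, 4.0])

# Solve the normal equation A^T A x = A^T b directly ...
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# ... or let lstsq minimize ||Ax - b|| for us
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(x_normal, x_lstsq)  # both [2/3, 1]: the line y = 2/3 + x
```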
7.3.3 Least Squares via the QR Factorization
Theorem 36 Let A be an m × n matrix with linearly independent columns and let ⃗b ∈ Rᵐ. If A = QR is a QR factorization of A, then the unique least squares solution of A⃗x = ⃗b is

X = R⁻¹Q^T⃗b.

Proof. Since Q^TQ = I and R is invertible, the normal equation A^TA⃗x = A^T⃗b becomes R^TQ^TQR⃗x = R^TQ^T⃗b, i.e., R⃗x = Q^T⃗b.
Example 81 Use the QR factorization to find a least squares solution of

[1 0 −3; 0 2 −1; 1 0 1; 1 3 5] ⃗x = [1; 1; 1; 1].

Solution: By 5.3.2,

[1 0 −3; 0 2 −1; 1 0 1; 1 3 5] = [1/√3 −1/√10 −3/√23; 0 2/√10 −3/√23; 1/√3 −1/√10 1/√23; 1/√3 2/√10 2/√23] [√3 √3 √3; 0 √10 √10; 0 0 √23].

Thus X = R⁻¹Q^T⃗b = [4/5; 38/115; −3/23].
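numpy reproduces the QR route (the signs of its QR factors may differ from the hand computation, but the least squares solution agrees):

```python
import numpy as np

A = np.array([[1.0, 0.0, -3.0],
              [0.0, 2.0, -1.0],
              [1.0, 0.0,  1.0],
              [1.0, 3.0,  5.0]])
b = np.ones(4)

Q, R = np.linalg.qr(A)           # reduced QR; columns of Q are orthonormal
x = np.linalg.solve(R, Q.T @ b)  # R x = Q^T b, R upper triangular

print(x)  # [0.8, 0.3304..., -0.1304...] = [4/5, 38/115, -3/23]
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```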
7.3.4 Orthogonal Projection Revisited
Theorem 37 Let W be a subspace of Rᵐ, and let A be an m × n matrix whose columns form a basis of W. If v ∈ Rᵐ, then

proj_W(v) = A(A^TA)⁻¹A^Tv.

The linear transformation P that projects Rᵐ onto W has A(A^TA)⁻¹A^T as its standard matrix.

Proof. Let X be the unique least squares solution of A⃗x = v. Then AX = proj_{col(A)}(v) = proj_W(v).

Example 82 Let W = {(x, y, z) ∈ R³ | x − y + 2z = 0}. Find the orthogonal projection of v = [3, −1, 2]^T onto W and give the standard matrix of the linear transformation P that projects R³ onto W.

Solution: Let A = [1 −1; 1 1; 0 1], whose columns form a basis of W.
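A numpy sketch of Example 82 using the formula of Theorem 37:

```python
import numpy as np

# Columns form a basis of W = {(x, y, z) : x - y + 2z = 0}
A = np.array([[1.0, -1.0],
              [1.0,  1.0],
              [0.0,  1.0]])
v = np.array([3.0, -1.0, 2.0])

P = A @ np.linalg.inv(A.T @ A) @ A.T  # standard matrix of the projection
print(P)      # (1/6) * [[5, 1, -2], [1, 5, 2], [-2, 2, 2]]
print(P @ v)  # [5/3, 1/3, -2/3], which satisfies x - y + 2z = 0
```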
7.4 The Singular Value Decomposition
7.4.1 The Singular Values of a Matrix
Deﬁnition 41 Let A be an m × n matrix. The singular values of A are the square roots of the eigenvalues of A^TA and are denoted by σ₁, …, σ_n, ordered so that σ₁ ≥ … ≥ σ_n.

Example 83 Let A = [1 1; 1 0; 0 1]. Then σ₁ = √3, σ₂ = 1.
7.4.2 The Singular Values Decomposition
The Singular Value Decomposition: Let A be an m × n matrix whose singular values satisfy σ₁ ≥ … ≥ σ_r > 0 and σ_{r+1} = … = σ_n = 0. Let

V = [v₁ ⋯ v_n],

where {v₁, …, v_n} is an orthonormal basis of Rⁿ consisting of eigenvectors of A^TA. Let

U = [u₁ ⋯ u_m],

where {u₁, …, u_m} is an orthonormal basis of Rᵐ extended from the set {u₁, …, u_r}, with

u₁ = (1/σ₁)Av₁, …, u_r = (1/σ_r)Av_r.

Let

Σ = [D O; O O],   D = diag{σ₁, …, σ_r}.

Then

A = UΣV^T,

which is called a singular value decomposition (SVD) of A. The columns of U are called left singular vectors of A; the columns of V are called right singular vectors of A.

Example 84 Let A = [1 1; 1 0; 0 1]. Find a singular value decomposition (SVD) of A.
Solution:
Step 1. Orthogonally diagonalize A^TA to find V:

V = [1/√2 −1/√2; 1/√2 1/√2].

Step 2. Find Σ: σ₁ = √3, σ₂ = 1. Thus

Σ = [√3 0; 0 1; 0 0].

Step 3. Find U:

u₁ = (1/σ₁)Av₁ = [2/√6, 1/√6, 1/√6]^T,   u₂ = (1/σ₂)Av₂ = [0, −1/√2, 1/√2]^T.

Then apply the Gram-Schmidt Process to e₁ (or e₂, or e₃) to obtain u₃.
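np.linalg.svd performs all three steps at once (up to signs of the singular vectors, which are not unique):

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])

U, s, Vt = np.linalg.svd(A)  # full SVD: U is 3x3, Vt is 2x2
print(s)                     # [sqrt(3), 1]

# Rebuild A = U Sigma V^T
Sigma = np.zeros((3, 2))
Sigma[:2, :2] = np.diag(s)
print(np.allclose(U @ Sigma @ Vt, A))  # True
```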
Theorem 38 Let A be an m × n matrix with singular values σ₁ ≥ … ≥ σ_r > 0 and σ_{r+1} = … = σ_n = 0, and let A = UΣV^T be a singular value decomposition of A. Then

1. A = σ₁u₁v₁^T + … + σ_r u_r v_r^T.
2. rank(A) = r.
3. {u₁, …, u_r} is an orthonormal basis for col(A).
4. {u_{r+1}, …, u_m} is an orthonormal basis for null(A^T).
5. {v₁, …, v_r} is an orthonormal basis for row(A).
6. {v_{r+1}, …, v_n} is an orthonormal basis for null(A).
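A small numpy sketch illustrating Theorem 38 for the matrix of Example 84: the rank, the outer-product expansion, and the orthogonality of the remaining left singular vectors to col(A).

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
U, s, Vt = np.linalg.svd(A)
r = np.sum(s > 1e-10)  # numerical rank, here r = 2

# Outer-product expansion A = sigma_1 u1 v1^T + ... + sigma_r ur vr^T
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r))
print(np.allclose(A_rebuilt, A))  # True

col_basis = U[:, :r]   # orthonormal basis for col(A)
null_At   = U[:, r:]   # orthonormal basis for null(A^T)
print(np.allclose(A.T @ null_At, 0))  # True: these vectors kill A^T
```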
58
5.1 Orthogonality in Rn
. 5.1.1 Review . . . . . . . . . . . . 5.1.2 Orthogonal Set . . . . . . 5.1.2 Orthonormal Set . . . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
31 31 32 33 34 34 36 37 40 40 41 42 44 44 46 47 48 48 49
5.2 Orthogonal Complements and Orthogonal Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.2.1 Orthogonal Complements . . . . . . . . . . . . . . . . . . . . . . . . 5.2.2 Orthogonal Projections . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.3 The orthogonal Decomposition Theorem . . . . . . . . . . . . . .
5.3 The GramSchmidt Process and the QR Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.3.1 The GramSchmidt Process (Algorithm) . . . . . . . . . . . . . . 5.3.2 The QR Factorization . . . . . . . . . . . . . . . . . . . . . . . . . .
5.4 Orthogonal Diagonalization of Symmetric Matrix 5.5 An Application: Quadratic Forms . . . . . . . . . . . . 5.5.1 Quadratic Forms . . . . . . . . . . . . . . . . . . . . . . . 5.5.2 Constrained Optimization Problem . . . . . . 5.5.3 Graphing quadratic equations . . . . . . . . . . . 7.1 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . 7.1.1 Inner Product . . . . . . . . . . . . . . . . . . . . . . . . . . 7.1.2 Length, Distance and Orthogonality . . . . . . 7.1.3 Orthogonal Projections and the GramSchmidt Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 7.2 Norms and Distance Functions . . . . . . . . . . . . . . . 51 7.2.1 Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 7.2.2 Distance Function . . . . . . . . . . . . . . . . . . . . . . 51 7.2.3 Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . . 52 7.3 Least Squares Approximation . . . . . . . . . . . . . . . . 53 7.3.1 The Best Approximation Theorem . . . . . . . 53 7.3.2 Least Squares Approximation . . . . . . . . . . . 54 7.3.3 Least Squares via the QR Factorization . . 55 7.3.4 Orthogonal Projection Revisited . . . . . . . . . 56 7.4 The Singular Value Decomposition . . . . . . . . . . . 57 7.4.1 The Singular Values of a Matrix . . . . . . . . . 57 7.4.2 The Singular Values Decomposition . . . . . . 57
2
6.1 Vector Spaces and Subspaces
Review of Rn Vectors, addition, scalar multiplication, dot product, cross product, length, distance.
6.1.1 Vector Spaces Deﬁnition. Let V be a set of elements. If the operations addition and scalar multiplication are deﬁned in V satisfying the following axioms: A1. If ⃗ , ⃗ ∈ V , then ⃗ + ⃗ ∈ V . u v u v A2. If ⃗ , ⃗ ∈ V , then ⃗ + ⃗ = ⃗ + ⃗ . u v u v v u A3. If ⃗ , ⃗ , w ∈ V , then ⃗ + (⃗ + w) = (⃗ + ⃗ ) + w. u v ⃗ u v ⃗ u v ⃗ A4. There exists an element, denoted by ⃗ in V, such that ⃗ + ⃗ = ⃗ for every ⃗ . 0, 0 u u u A5. For every ⃗ ∈ V , there exists an element, denoted by −⃗ , in V such that ⃗ +(−⃗ ) = ⃗ u u u u 0. S1. Operation scalar multiplication is deﬁned for every number c and every ⃗ in V, and u c⃗ ∈ V . u S2. Operation scalar multiplication satisﬁes the distributive law: c(⃗ + ⃗ ) = c⃗ + c⃗ . u v u v S3. Operation scalar multiplication satisﬁes the second distributive law: (c+d)⃗ = c⃗ +d⃗ . u u u S4. Operation scalar multiplication satisﬁes the associative law: (cd)⃗ = c(d⃗ ). u u S5. For every element u ∈ V , 1⃗ = ⃗ . u u Then V is called a vector space. Remark. Usually, we use ⊕ for general addition, ⊙ or ⊗ for general multiplication.
Example 1
1. Rn with the usual operations is a vector space.
2. Cn with the usual operations is a vector space. 3. P =the set of all polynomials with the usual operations are vector space. 4. Mm n = the set of all m by n matrices with the usual operations is a vector space. 5. F (R) of all realvalued functions on R with the usual operations is a vector space. 6. F [a, b] of all realvalued functions on [a, b] with the usual operations is a vector space. 3
Example 2
1. Z of integers with the usual operations is NOT a vector space.
2. R2 with c(x, y) = (cx, 0) is not a vector space. Property 1 (i) w + ⃗ = ⃗ + ⃗ implies w = ⃗ . ⃗ v u v ⃗ u ⃗ (ii) 0⃗ = 0. v (iii) c⃗ = ⃗ 0 0. (iv) av = 0 implies a = 0 or v = 0. 6.1.2 Subspaces A set U is a subspace of a vector space V if U is a vector space with respect to the operations of V. Theorem 1 U is a subspace of V if (i) the zero vector is in U; (ii) if x is in U, then ax is in U for any scalar a, and (iii) if x, y are in U, then x + y is in U. Examples (1) {⃗ is a subspace. A subspace that is not {⃗ is a proper subspace. 0} 0} (2) A line through the origin in the space is a subspace; A plane through the origin in the space is a subspace. (3) S1 = {(s, 2s, 3)s ∈} is not a subspace. R} {[ ] s (4) S2 = s ∈ R is not a subspace. It does not satisfy (ii). s2 (5) S3 = {(s, t)s2 = 2 , s, t ∈ R} is not a subspace. It does not satisfy (iii). t s (6) S4 = 2s + 3t s, t ∈ R is a subspace. 5t (7) Rn is {[ a subspace of itself. ] } s+1 (8) S5 = s, t ∈ R is a subspace. t a (9) S6 = b a = 3b + 2c, a, b, c ∈ R is a subspace. c {[ ] } a b (10) S7 = a = 3b + 2c, a, b, c ∈ R is a subspace of M22 . c d
4
(11) Pn the set of all polynomials with degree less than or equal to n is a subspace of P . (12) S2 = {p ∈ P2 p(1) = 0} is a subspace of P .
6.1.3 Spanning sets as a subspace Let S = {v1 , v2 , · · · , vk }. span S = {c1 v1 + c2 v2 + · · · + ck vk c1 , · · · , ck ∈ R}. It is a subspace of V, called the span of S, denoted by span S. Example. The span of a single nonzero vector in the space is a line through the origin. The span of a two nonparallel nonzero vectors u and v in the space is a plane through the origin with normal vector u × v. Let V be a subspace, and let S be a subset of V. If span S = V, then S is a spanning set of V. In particular, V itself is a spanning set of V. A subspace generally has more than one spanning set.
Property 2 (i) If X ∈ S, then X ∈ spanS. (ii) If a subspace W contains every vector in S, then W contains span S. As an example of using the second property, span{X + Y, X, Y } = span{X, Y }. (iii) If ⃗ is a linear combination of v1 , v2 , ..., vk , then b spanS = {⃗ v1 , v2 , , vk } = spanS = {v1 , v2 , ..., vk }. b, (iv) Rn = span{E1 , E2 , ..., En }. (v) null A = the span of the basic solutions of AX=0. (vi) im A = the span of the columns of A. Example 3 (i) Verify that [1 2 0 1]T is in span{[2 1 2 0]T , [0 3 2 2]T }. Solution: The corresponding system is consistent. (ii) Verify that the set of vectors S = {[1 2 3]T , [1 0 1]T , [2 1 1]T } spans R3 . Solution: For any [a b c]T in R3 , The corresponding system is consistent. (iii) Find a,b such that X = [a b a+b ab]T is in span{X1, X2, X3}, where X1 = [1 1 1 1]T , X2 = [1 0 1 2]T , X3 = [1 0 1 0]T .
5
x.6. Theorem 2 1. Basis. 1 − x + 3x2 . 3.2 Linear Independence. Solution: −2v1 + v2 − v3 = ⃗ ⃗ ⃗ ⃗ 0. If the zero vector is in a set of vectors. the set {1 + x + x2 . In C(R). If a set contains more vectors than entries in each vector. 2. then the set of vectors is linearly dependent. 1 + 3x − x2 } is linearly dependent. v2 = 5 . 6 . Example 6 1. xm = 0. 0 is linearly dependent.2. 3. The set is said to be linearly dependent if there is a nontrivial solution to the vector equation. · · · . 2. · · · . v2 . the set {sin x. cos x. 3 0 −1 3 1 ⃗ ⃗ ⃗ ⃗ ⃗ Example 5 Given v1 = 2 . x2 . In P2 . A set of two vectors is linearly dependent if and only if one of the vectors is a multiple of the other. ⃗m } in V is linearly independent if the vector equation v v x1⃗1 + x2⃗2 + · · · + xm⃗m = ⃗ v v v 0 implies that x1 . v3 } is linearly ⃗ 2 8 3 dependent and ﬁnd the linear combination. 1 0 Example 4 2 is linearly independent. sin(2x)} is linearly dependent. and Dimension 6. the set {1. 4. In P2 . then the set is linearly dependent. Show that {v1 . A set of two or more vectors is linearly dependent if and only if at least one vector may be written as a linearly combination of the others.1 Linear Independence Deﬁnition 1 A set of vectors {⃗1 . v3 = 1 . x2 } is linearly independent.
c ∈ R} V = {(a.. we just write the basis as B. Then the set {E11 .. ⃗ ⃗ ⃗ ⃗ ⃗ ⃗ ⃗ ⃗ Solution: We need to set up the equation c1 (v1 − v2 ) + c2 (v2 − v3 ) + c3 (v3 − v4 ) + c4 (v4 − v1 ) = ⃗ ⇒ ⃗ ⃗ ⃗ ⃗ ⃗ ⃗ ⃗ ⃗ 0. c1 = c2 = c3 = c4 . • A basis is the largest spanning set of linearly independent vectors for a subspace. Properties of basis: • There is more than one basis for a subspace.g. 0 0 for R . all other entries are 0. Then {e1 . . Thus dependent. 3. d ∈ R} Solution: 7 . Example 9 1.. 1 0 0 0 1 0 Example 8 1. (c1 − c4 )v1 + (−c1 + c2 )v2 + (−c2 + c3 )v3 + (−c3 + c4 )v4 = ⃗ ⇒ ⃗ ⃗ ⃗ ⃗ 0.. Find a basis to each of the following subspaces: U = {(a + 2b + 3c. e2 = . c.. e. Emn } is the standard basis of Mmn . . en = .... x. The number of vectors in a basis for a subspace V is called the dimension of V and is denoted by dim V . xn } is the standard basis of Pn . We denote it by BV . . 6. a − c. a − b)a. The set {1. v2 .2 Basis and Coordinates Deﬁnition 2 A basis for a subspace V is a linearly independent set of vectors that spans V. Let e1 = . v3 − v4 . Determine if the following set is lin⃗ ⃗ ⃗ ⃗ early independent or dependent: S = {v1 − v2 .Example 7 Let {v1 . .. = 1. n 1 2. . c. .. . .. v2 − v3 . .2. 3b − 2c = d. Let Eij be the matrix of m × n where the (i. d)a + 2b = c. .. which is called the standard basis of Rn . . en } is a basis . v4 } be linearly independent. j) entry is 1. except for the simplest subspace {⃗ 0}. a.. . c1 = c2 = c3 = c4 . b. When V is clear.. b. b + a. v3 . b. x2 . v4 − v1 }.
0. 0). 1 1 1 1 0 −1 1 1 ] [ 8 . ⇒ a = 0. (3. 1. (2. 1. −2). z ∈ R 1 1 ] [ ]} {[ ] [ ]} 0 0 0 0 2 3 0 0 . 0. a − b) = (0. c ∈ R} = Span{(1. 0. (0. b ∈ R} = Span{(1. 0. 1). c = 0. −1)} is linearly independent. 1. 0. 1. −1). 0. −1)a. 1. 1. b. 1. −1)}. 0. b + a. 2. −1) + c(3. 0. (2. (3. 2. 0)} is linearly independent. then (a + 2b + 3c. −1. 1. y. a − c. 1. y. Thus V = {(a. 1. 0 −1 0 0 1 1 ] } 0 0 +z x. 2. 0. Similarly we can show that the set of two vectors T = {(1. 1. If a(1. Thus S is independent. Thus T is a basis of V . −1. 1) + b(2. 1. z ∈ R y+z y+z−x Solution: { [ ] [ 2 3 W = x +y 0 −1 {[ ] [ 2 3 = Span . −2) + b(0. 0. 0)}. a + 2b. 2. from a + 2b = c. 0. 1. = Span . . 1. b ∈ R} = {a(1. 1. which is a basis of U . −1. 3b − 2c = d we imply that d = −2a − b. b. 0)a. −2a − b)a. −1. • For the subspace V . 1. 1.2 : {[ ] } 2x 3x W = x. Find a basis to the following subspace of M2. d = 0. 0. 1). 0. 1) + b(2. (0. −2). −1) + c(3. Next we show that the set of three vectors S = {(1. b = 0.• U = {a(1. 1. 0) = ⃗ 0. −1).
[⃗ ]B = .. + cp vp .. a a a ⃗ ⃗ 0 0 0 0 0 1) Pivot columns are ⃗ 1 ⃗ 3 . . which is a basis of W . Then ⃗ may be v ⃗ x x written as a linear combination of ⃗1 . Set up the equation [ a then [ 2 3 0 −1 ] +b ] = [ 2 3 0 −1 ] ] [ . Let ⃗ ∈ V . (or the Bcoordinate vector of ⃗ ). Thus S is independent. cp are called the coordinates of ⃗ relative to the basis B. So BColA = {⃗ 1 . Deﬁnition 3 Given a basis B = {⃗1 .{[ Next we show that the set of two vectors S = pendent. ⃗ 3 }.. . b − a = 0. ] ... [⃗ 4 ]BColA = a [⃗ 2 ]BColA = a 1 7/4 0 3) dim BColA = 2.. [⃗ 5 ]BColA = a . b = 0. . 0 0 1 1 ]} is linearly inde 0 0 1 1 [ =⃗ 0. . b = 0. x v ⃗ The weights c1 .. x x 1 −3 2 5 3 Example 10 Let A = [⃗ 1 ⃗ 2 ⃗ 3 a4 a5 ] = 0 0 4 7 4 .. These coordinates x may be written as a vector c1 . x . cp called the coordinate vector of ⃗ with respect to B. vp } for a subspace V.. 2a 3a b b−a 0 0 0 0 ⇒ 2a = 0. a a a a 2) [ ] [ ] [ ] 1 3/2 −3 . Note: The order of the vectors in the basis B inﬂuences the coordinate vector [⃗ ]B .. vp : v ⃗ ⃗ = c1⃗1 + . ⇒ a = 0. x 9 .. 3a = 0.
Then v ⃗ 1. We write it as dim V . 0} 2. Example 13 Extend the following linearly independent set {1 + x + x2 . uk } is linearly independent iﬀ {[⃗ 1 ]B . dim{⃗ = 0...3 Dimension Deﬁnition 4 The dimension of a vector space V is deﬁned to be the number of vectors in a basis.. uk } be a set of vectors in V. Property 3 (i) [c1⃗ 1 + . Any set of less than n vectors in V can not span V. vn } for a vector space V. 1 + 3x − x2 } to a basis of P2 . (ii) is from (i). where B = {1 + x. u ⃗ u u Proof. [⃗ k ]B } is a set u ⃗ u u n of vectors in R with more than n vectors. So dependent. u ⃗ u u (ii) {⃗ 1 . Theorem 3 Given a basis B = {⃗1 .Example 11 Find [p(x)]B . 10 . dim Pn = n + 1.. Any set of more than n vectors in V is linearly dependent. 3.. · · · . 1 + 3x − x2 }. 2 and 3 are from 1.2. 1 − x + 3x2 . . + ck uk ]B = c1 [⃗ 1 ]B + · · · + ck [⃗ k ]B . dim Mmn = mn.. Then {[⃗ 1 ]B . k > n... [⃗ k ]B } is linearly independent. . 4. 6. 3.... 2. Proof. Example 12 1. Let {⃗ 1 . . 1. (i) is from the deﬁnition. p(x) = 2 − 5x − x2 . Every basis for V has exactly n vectors. dim Rn = n. · · · .
Then u ⃗ v ⃗ [[[⃗1 ]E · · · [⃗n ]E ]  [[⃗ 1 ]E · · · [⃗ n ]E ]] → [IPC←B ]. vn } be two bases for V.6. a. un } and C = {⃗1 . PC←B is the unique matrix P such that P [⃗ ]B = [⃗ ]C for all ⃗ ∈ V .. Let E be any basis of V . [PE←C PE←B ] → [IPC←B ]. Let ⃗ = c1⃗ 1 + .. Let pi be the ith column of P . The changeofu ⃗ v ⃗ basis matrix from B to C is deﬁned as: PC←B = [[⃗ 1 ]C · · · [⃗ n ]C ] ..Y = .3 Change of Basis 6. x u ⃗ x u u x b.. x x b. verify that 3 4 PC←B [U ]B = [U ]C . Then [⃗ ]C = c1 [⃗ 1 ]C + · · · + ck [⃗ k ]C = PC←B [⃗ ]B . ... if [v]B is known. un } and C = {⃗1 . 11 .. Then u ⃗ v ⃗ a. E12 . X.. u u Theorem 4 Let B = {⃗ 1 .. 6... . . so the matrix PC←B is invertible.Z = . Example 14 In M22 ... let B = {E11 .. [ ] 1 2 (ii) Let U = .. . c. Z}. which is the ith column of PC←B . x x x c.1 Changeofbasis Matrix Question: Given two bases B and C for a vector space V . Proof. where [ ] [ ] [ ] [ ] 1 0 1 1 1 1 1 1 W = ..3.X = . 0 0 0 0 1 0 1 1 (i) Find PC←B and PB←C .. . Y. v v u u i. how to ﬁnd [v]C ? Deﬁnition 5 Let B = {⃗ 1 . un } and C = {⃗1 . PB←C [U ]C = [U ]B . Since the columns of PC←B are linearly independent.3. Then pi = P ei = P [ui ]B = [ui ]C . vn } be two bases for V. vn } be two bases for V. + cn un . E22 }. and C = {W. PC←B is invertible and the inverse is PB←C . PC←B [⃗ ]B = [⃗ ]C .. E21 ..2 GaussJordan Method for computing Changeofbasis Matrix Let B = {⃗ 1 .e.. ..
1 Introduction to Eigenvalues and Eigenvectors Deﬁnition 6 An eigenvector of an n × n matrix A is a nonzero vector ⃗ such that A⃗ = λ⃗ x x x for some scalar λ. [ ] [ ] [ ] [ ] 1 6 1 6 1 Example 16 Let A = . x x x To determine whether a given value λ is an eigenvalue of a matrix A we need to ﬁnd a nonzero vector ⃗ such that A⃗ = λ⃗ .Example 15 Let B = {1 + x. 1 + 3x − x2 } and C = {1 − x. ⃗ is an eigenvector corresponding to λ = −4. A scalar λ is called an eigenvalue of A if there is a nontrivial solution ⃗ x such that A⃗ = λ⃗ . 4. 12 . Find PC←B by GaussJordan method. 1 + x + 2x2 .w = ⃗ . u v w is not an eigenvector. ⃗ Thus ⃗ is an eigenvector corresponding to λ = 7. ⃗ Deﬁnition 7 The set of all eigenvectors for a particular eigenvalue λ of a matrix A is a subspace and so is called the eigenspace of A corresponding to λ. We write it as Eλ . v Aw = ⃗ [ 1 5 ] ̸= λw.⃗= u .⃗= v . 1 + 2x − x2 }. ⃗ is called the eigenvector corresponding to λ. 1 − x + 3x2 . Example 17 The eigenvalues of a triangular matrix are the entries on its main diagonal. u A⃗ = v [ −24 20 ] = −4⃗ . This is the same as determining whether the matrix x x x equation (A − λI)⃗ = 0 x has a nontrivial solution. Note that 5 2 1 −5 0 [ A⃗ = u 7 7 ] = 7⃗ .
Example 18 Calculate det A.. det A = 5(1)(2)(12) = 120. det A = a21 c21 + a22 c22 + a23 c23 [ ] [ ] [ ] 3 5 1 5 1 3 = 2(−1)2+1 det + (−1)2+2 det + (−1)2+3 det 4 2 3 2 3 4 = 2(14) + (−13) + 5 = 20. det A = ai1 ci1 + ai2 ci2 + . The determinant of A is deﬁned as c d det A = A = a b c d = ad − bc.4.. Solution: A is an upper triangular matrix.. + anj cnj . 13 . Similarly. j)th cofactor of A is the number. which is called a cofactor expansion across the jth column. cij = (−1)i+j det Aij . For a n × n matrix A. 3 4 2 Solution: We do cofactor expansion across the 2nd row. The (i. where A= 5 0 0 0 3 1 0 0 5 7 1 9 2 12 0 12 . which is called a cofactor expansion across the ith row. let Aij be the matrix obtained from A by deleting the ith row and jth column. Example 19 Calculate det A.2 Introduction to Determinants [ ] a b Deﬁnition 8 Let A = .. + ain cin . det A = a1j c1j + a2j c2j + . where 1 3 5 A = 2 1 1 .
• The geometric multiplicity of an eigenvalue is the dimension of its eigenspace. Find all eigenvalues. −1 0 Sol: det(A − λI) = λ2 − 4λ + 5 ⇒ λ = 2 ± i. Algebraic and Geometric multiplicity: • The algebraic multiplicity of an eigenvalue is equal to the number of times it is a root of the characteristic equation. The solutions of the characteristic equation det(A − λI) = 0 are 3.3 Eigenvalues and Eigenvectors Characteristic equation: det(A − λI) is called the characteristic polynomial of A and det(A − λI) = 0 is called the characteristic equation. Theorem 5 The solutions of the characteristic equation are the eigenvalues of A. [ ] 4 5 Example 20 Let A = . 0 0 3 The eigenvalue 3 has algebraic multiplicity 2 and the eigenvalue 1 has algebraic multiplicity 1. Example 21 Find all eigenvalues of A. Find the corresponding eigenspaces and their geometric multiplicities. 0 0 3 Solution: The characteristic polynomial is 3−λ 2 −1 0 1−λ 2 0 0 3−λ det(A − λI) = = (3 − λ)2 (1 − λ).3.1. where 3 2 −1 A = 0 1 2 . 14 . Example 22 Let 3 2 −1 A = 0 1 2 .4.
The equation A⃗ = ⃗ has at least one solution for each ⃗ ∈ ℜn . x 0 0 x3 1 The eigenspace has a basis −1 . 3. Then the following statements are equivalent. A has n pivot positions. The equation A⃗ = ⃗ has only the trivial solution. 2. A is an invertible matrix. The columns of A form a linearly independent set. 4. 2 2 −1 2 2 −1 2 2 0 1 A − λI = A − I = 0 0 2 R3 − R2 0 0 2 R1 − R2 0 0 2 . x b b 15 . 7. 0 When λ = 1. 1.Solution: When λ = 3. −− − − −→ −−−→ 0 0 0 − −2 − 0 0 0 0 0 2 Thus (A − I)⃗ = 0 has the solution x x1 1 t ⃗ = x2 = −t = t −1 . x 0 5. 0 Theorem 6 (The Invertible Matrix Theorem) Let A be a square n × n matrix. 6. The linear transformation TA : ℜn → ℜn is 1:1. 0 2 −1 0 2 −1 0 2 0 A − λI = A − 3I = 0 −2 2 R2 + R1 0 0 1 R1 + R2 0 0 1 . A is row equivalent to the identity matrix. x x3 0 0 1 The eigenspace has a basis 0 . The geometric multiplicity of the eigenvalue 3 is 1. −− − − −→ −− − − −→ 0 0 0 0 0 0 0 0 0 Thus (A − 3I)⃗ = 0 has the solution x t 1 x1 ⃗ = x2 = 0 = t 0 . The geometric multiplicity of the eigenvalue 1 is 1.
20.. 13. . vk+1 = 0. AT is invertible. Property 4 Let λ be an eigenvalue of A with corresponding eigenvector x. rank A = n 17. Apply A to both sides. if on the contrary that the set is dependent. 19. 10. λr ⃗ ⃗ of an n × n matrix A. We also have λk+1 vk+1 = c1 λk+1 v1 + · · · ck λk+1 vk .8. Assume that when r = k. The linear transformation TA : ℜn → ℜn is onto. ⃗ ⃗ Proof. We use induction. 12. 0 = c1 (λ1 − λk+1 )v1 + · · · ck (λk − λk+1 )vk .. . . Col A = ℜn 15. 9. dim ColA = n 16. 1 (2) If A is invertible. Nul A = {0} 18. the set is linearly independent. Thus c1 = · · · = ck = 0. . . i. When r = k + 1. λn is an eigenvalue of An with corresponding eigenvector x. . The columns of A span ℜn .. then λ is an eigenvalue of A−1 with corresponding eigenvector x. There is an n × n matrix C such that CA = I. 16 . Theorem 7 If v1 . (3) If A is invertible. vr are eigenvectors that correspond to distinct eigenvalues λ1 . then for any integer n. then vk+1 = c1 v1 + · · · ck vk . By subtraction. dim NulA = 0... λk+1 vk+1 = c1 λ1 v1 + · · · ck λk vk . The number 0 is not an eigenvalue of A. 11..e. The determinant of A is not zero. 14. vr } is linearly independent. a contradiction.. There is an n × n matrix D such that AD = I. λn is an eigenvalue of An with corresponding eigenvector x. then the set {v1 . (1) For any positive integer n. The columns of A form a basis for ℜn .
Deﬁnition 9 If A is a square n × n matrix and A is similar to a diagonal matrix D then A is said to be diagonalizable. If A = P DP −1 . • A is diagonalizable if and only if A has n linearly independent eigenvectors. then they have the same characteristics polynomial and hence the same eigenvalues (with the same multiplicities). 17 . Theorem 8 If n × n matrices A and B are similar. • If A has n distinct eigenvalues. if and only if the dimension of the eigenspace for each eigenvalue equals the algebraic multiplicity of the eigenvalue. (Generally.4 Similarity and Diagonalization Similar matrices: Two matrices A and B are similar if there is an invertible matrix P such that. Well only be considering diagonal matrices that are square. • A is diagonalizable if and only if the sum of the dimensions of the distinct eigenspaces equals n. the the columns of P are n linearly independent eigenvectors of A. A diagonal matrix is a matrix with non zero values along its diagonal and zeros on its oﬀ diagonal entries. to the eigenvectors in P. where D is a diagonal matrix. then A is diagonalizable. A = P BP −1 . the diagonal entries of D are eigenvalues of A that correspond. the dimension of the eigenspace for each eigenvalue is less than or equal to the algebraic multiplicity of the eigenvalue). det(A − λI) = det(P BP −1 − λP P −1 ) = det[P (B − λI)P −1 ) = det(P ) det(B − λI) det(P −1 ) 1 = det(P ) det(B − λI) det(P ) = det(B − λI). Proof.4. In this case. respectively. Theorem 9 (Diagonalization Theorem) Let A be an n × n matrix.
−1 1 [ 1 −3 1 4 [ −3 1 4 1 ] . k = 1. Thus 4 4 [ P = [ 2) Let P = 1 −3 1 4 1 −3 1 4 ] ..D = ] .. or P = [ ] 4 3 . Sol: 1) det(A − λI) = (λ ] 5)(λ + 2). 2) Calculate A4 . then the total collection of vectors in the sets B1 .. [ − 1 When λ = 5: ⃗ = x2 x . Example 23 A = [ B= 3 −1 1 5 ] 1 0 0 0 2 3 0 0 3 4 2 4 5 −1 0 7 is diagonalizable: 4 distinct eigenvalues.. is not diagonalizable: λ = 4. p. then P −1 = 1 7 [ 5 0 0 −2 ] . . 4 1 1) Find P and D such that A = P DP −1 . one eigenvector [ ] 2 3 Example 24 Let A = . if A is diagonalizable and Bk is a basis for the eigenspace corresponding to the eigenvalue λk . { }4 A4 = P DP −1 = P D4 P −1 = [ 1 −3 1 4 ][ ][ 5 0 0 −2 [ = ]4 1 7 [ 4 3 −1 1 ] ..D = [ −2 0 0 5 ] . ] 1 = 7 625 0 0 16 ][ 4 3 −1 1 ] 364 261 348 277 18 .. Bp forms an eigenvector basis of Rn . 1 [ ] −3 When λ = −2: ⃗ = x2 x . [ −1 1 ] ..For an n × n matrix A.
4. ⃗ in the domain of T and all scalars c. Then D is a linear transformation.1 Linear Transformations Deﬁnition 10 Let V and W be vector spaces. x x Example 27 T : Rn → Rm by T (⃗ ) = A⃗ + ⃗ ⃗ ̸= ⃗ Then T is nonlinear. then • T (⃗ = ⃗ 0) 0.4 Linear Transformations 6. d. Then T is nonlinear. b 0. Proof. 4 19 . Since D(f (t) + g(t)) = (f (t) + g(t))′ = f ′ (t) + g ′ (t) = D(f (t)) + D(g(t)). • T (c⃗ + d⃗ ) = cT (⃗ ) + dT (⃗ ) for all ⃗ . Example 28 T : Mnn → R by T (A) = A = det(A). Find 1 2 [ ] 3 T( ). Example 29 Let V be the vector space of (inﬁnitely) diﬀerentiable functions and deﬁne D to be the function from V to V given by D(f (t)) = f ′ (t). Properties: If T is linear. and T ( ) = 2x + x3 . u v u v u v [ ] [ ] 2 1 Example 30 Let T : R2 → P3 by T ( ) = 1 − x − x2 . Then T is linear. Then a linear transformation T from V to W is a function with domain V and range a subset of W satisfying 1) T(u + v) = T(u) + T(v) 2) T(cu) = cT(u) for any vectors u and v in V and scalar c. r a scalar.6. D(cf (t)) = (cf (t))′ = cf ′ (t) = cD(f (t)). Example 25 TA (⃗ ) = A⃗ is linear for any matrix A. x x b. x x Example 26 T : Rn → Rn by T (⃗ ) = r⃗ .
b a+b a−b c d Find S ◦ T . and denoted by S = T −1 .4. T −1 is unique.3 Inverse of Linear Transformations Deﬁnition 12 A linear transformation T : V → W is invertible if there is a linear transformation S : W → v such that S ◦ T = IV and T ◦ S = IW .4. S( ) = a + (b + c)x − dx2 + (a + c)x3 . S is called the inverse of T . Remark. 6. deﬁned by: (S ◦ T )(u) = S(T (u)) for any vectors u in U. Let T : U → V and S : V → W be two transformations. V and W be vector spaces. then S ◦ T is linear. In this case.2 Composition of Linear Transformations Deﬁnition 11 Let U. Theorem 10 If T and S are linear.6. 20 . Then the composition of S with T is S ◦ T . Example 31 Let T : R2 → M22 and S : M22 → P3 be two linear transformations deﬁned by: [ ] [ ] [ ] a a b a b T( )= .
21 . then ker(T ) = {0}. (iv) If T : M22 Theorem 11 Let T : V → W be a linear transformation. 6. Example 32 (i) ker(TA ) = null(A). (iv) T : M22 → M22 is given by T (A) = AT .6. · · · T (vn )} is a basis for range(T ). then b ker(S) = {− + bxb ∈ R}. (ii) D : P3 → P2 .2 Rank and nullity Deﬁnition 14 Let T : V → W be a linear transformation.5 The Kernel and Range of a Linear Transformation 6. Then ker(T ) is a subspace of V .5. We only need to prove that {T (vk+1 ). vk+1 . vk . vk } be a basis for ker(T ). Example 33 Find the rank and nullity of the following: (i) TA . and let {v1 . Rank Theorem: Let T : V → W be a linear transformation. 2 → M22 is given by T (A) = AT . · · · vn }. nullity(T ) = dim ker(T ). · · · . ∫1 (iii) If S : P1 → R is given by S(p(x)) = 0 p(x)dx. range(T ) = M22 . Then we can extend it to a basis of V : {v1 . rank(T ) = dim range(T ). · · · .1 Kernel and range Deﬁnition 13 Let T : V → W be a linear transformation. ∫1 (iii) S : P1 → R is given by S(p(x)) = 0 p(x)dx. (ii) If D : P3 → P2 . Proof. 0}. then ker(D) = R. Kernel ker(T ) = {v ∈ V T (v) = ⃗ Range range(T ) = {T (v)v ∈ V }. range(D) = P2 . Then nullity(T ) + rank(T ) = dim V.5. range(S) = R. range(T ) is a subspace of W . Let dim V = n.
Example 34 Let T : R2 → M22 and S : M22 → P3 be two linear transformations deﬁned by: [ ] [ ] [ ] a a b a b T( )= . TA is onto if and only if the columns of A span Rm . Theorem 13 A linear transformation T : V → W is onetoone iﬀ ker(T ) = {0}. nullity(S). If range(T ) = W . −x1 + 3x2 . since the columns of A span ℜ2 . Example 36 Let T : R2 → R3 be given by ] 1 −2 [ x1 T (x1 .5. 1:1? Sol: T is not ONTO. since the columns of A are linearly dependent. 6. 22 . Is TA : R3 → R2 ONTO. x2 3 −2 Is T ONTO. 2. TA is 1:1 if and only if the columns of A are linearly independent. then T (u − v) = 0. 3. then T is called onto. 3x1 − 2x2 ) = −1 3 . the columns of A can not span R3 . If T maps distinct vectors in V to distinct vectors in W . b a+b a−b c d Find rank(T ). If T (u) = T (v). x x [ ] 1 0 9 Example 35 Let A = . Proof. TA is 1:1 if and only if A⃗ = TA (⃗ ) = 0 has only the trivial solution. nullity(T ). 1:1? 0 3 7 Sol: TA is ONTO. S( ) = a − b + (b + c)x + (a − d)x2 + (a + c)x3 . since A has at most two pivots. TA is 1:1.3 OnetoOne and Onto Deﬁnition 15 Let T : V → W be a linear transformation. x2 ) = (x1 − 2x2 . 1. since the columns of A are linearly independent. TA is not 1:1. then T is called onetoone. A linear transformation T : V → W is onetoone if and only if it is onto. Theorem 12 Let TA : Rn → Rm be a linear transformation with standard matrix A. rank(S). Then. Theorem 14 Let dim V = dim W .
Proof. = Example 38 Show that Rn+1 and Pn are isomorphic. Let T (ej ) = xj−1 . Then we say that V is isomorphic to W and we write V ∼ W . = Example 39 Show that M33 and P9 are NOT isomorphic.5. b Show that T is onto and onetoone. Theorem 16 Let dim V < ∞.3 Isomorphism of Vector Spaces Deﬁnition 16 A linear transformation T : V → W is isomorphism if it is onetoone and onto. Theorem 15 A linear transformation T : V → W is invertible if and only if it is onetoone and onto. 23 . Then V ∼ W if and only if dim V = dim W. dim W < ∞. 6.Example 37 Let T : R2 → P1 be deﬁned by: [ ] a T( ) = a − b + (b + a)x.
(i) Find the matrix of T with respect to bases B and C. Proof. Then N (vi ) = ei . e2 }. A[v]B = [T (v)]C . x2 }. (i) Find the matrix [T ]B .6. Thus (M ◦ T ◦ N )([v]B ) = [T (v)]C . 1 (ii) Verify A[v]B = [T (v)]C for 2 . . x.1 Matrix of Linear Transformation Deﬁnition 17 Let V and W be two vector spaces with dim V = n and dim W = m. Let B = {v1 .6. when V = W and B = C. Theorem 17 For every v ∈ V . vn } be a basis of V and C be a basis of W . Example 40 Let T : R3 → R2 be given by [ ] x x + 2y . e3 } and C = {e1 . (ii) Use (i) to calculate T (1 − x − x2 ). We write A = [T ]C←B . T ( y ) = y − 3z z Let B = {e1 . Then A = [[T (v1 )]C · · · [T (vn )]C ] is called the matrix of T with respect to bases B and C. e2 . we simply write [T ]C←B as [T ]B . 3 Example 41 Let T : P2 → P2 be given by T (p(x)) = p(2 + x). · · · . Deﬁne isomorphisms N : V → Rn and M : W → Rm as follows: N (v) = [v]B . 24 M (w) = [w]C .6 The Matrix of a Linear Transformation 6. Let B = {1.
Solution: T( Thus [ 1 0 ] )= [ 1 0 1 1 ] . Let T : U → V and S : V → W be two linear transformations. S(E22 ) = −x2 . Then T is invertible if and only if the matrix [T ]C←B is invertible. Thus [S]D←C = 1 0 0 1 0 1 0 0 0 0 1 0 0 −1 1 0 . S(E21 ) = x + x3 . and D respectively. Let T : U → V be a linear transformation. b a+b a−b c d Let B. and D be standard bases of R2 . Let v ∈ B.2 Matrices of Composite and inverse Linear Transformations Theorem 18 Let U. 0 T( 1 [T ]C←B = [ ] )= . Then [S ◦ T ]D←B = [S]D←C [T ]C←B . Proof. C. M22 and P3 respectively. Let v ∈ ker(T ). Then [S ◦ T (v)]D = [S(T (v))]D = [s]D←C [T (v)]C = [S]D←C [T ]C←B . V and W be ﬁnitedimensional vector spaces with bases B. Find [S ◦ T ]D←B . S( ) = a + (b + c)x − dx2 + (a + c)x3 . Then [T ]C←B [v]B = [T (v)]C = [⃗ C = ⃗ 0] 0. Proof. C. S(E12 ) = x. [ 0 1 1 −1 ] . Example 42 Let T : R2 → M22 and S : M22 → P3 be two linear transformations deﬁned by: [ ] [ ] [ ] a a b a b T( )= . 25 . 1 0 0 1 1 1 1 −1 S(E11 ) = 1 + x3 .6. And we have [T −1 ]B←C = ([T ]C←B )−1 .6. Theorem 19 Let U and V be ndimensional vector spaces with bases B and C respectively.
c d Let C. Example 44 Let T : R2 → R2 be deﬁned by: [ ] [ ] a a + 3b T( )= . 6. 1 1 ] [ . 1 3 2 2 1 3 1 −2 4 0 0 −1 26 . Thus [T ]C ∼ [T ]B . Then [T ]C = P −1 [T ]B P.3 Change of Basis and Similarity Theorem 20 Let V be a ﬁnitedimensional vector space with bases B and C respectively. Find [T ]C . b 2a + 2b {[ Let E be the standard basis of R2 . Solution: [S −1 ]C←D = 1 0 0 1 0 1 0 0 0 0 1 0 0 −1 1 0 −1 . and D be standard bases of M22 and P3 respectively. and let C = Solution: [T ]E = Thus [T ]C = [ ] . [I]B←C [T ]C←C = [I ◦ T ]B←C = [T ◦ I]B←C = [T ]B←B [I]B←C . ] . [ 3 −2 ]} . Proof.Example 43 Let S : M22 → P3 be a linear transformation deﬁned by: [ ] a b S( ) = a + (b + c)x − dx2 + (a + c)x3 . Let T : V → V be a linear transformation. [ P = ] . Find [S −1 ]C←D .6. where P is the changeofbasis matrix from C to B.
x2 }. 27 . 0 0 4 (ii) C = {1. Let E be the standard basis of R2 . −1 + x. If there is a basis C of V such that [T ]C is a diagonal matrix.Deﬁnition 18 Let V be a ﬁnitedimensional vector space and let T : V → V be a linear transformation. (i) Find the matrix [T ]B . Thus 0 0 4 0 0 1 Solution: 3 1 0 −2 5 [T ]B = −1 2 2 . (ii) Show that T is diagonalizable by ﬁnding a basis C such that [T ]C is a diagonal matrix. 1 − x. 1 − 2x + x2 }. PE←B = 1 −1 0 . Example 45 Let T : P2 → P2 be given by T (p(x)) = p(2x − 1). then T is called diagonalizable. and let B = {1 + x. 1 1 0 1 −1 1 (i) [T ]E = 0 2 −4 .
Let x(t) be a general solution. then we have dP = kP. where a is a constant. Find an expression for the population after t hours. These imply that P (0) = and e2k = 3. Note that P (2) = 40 and P (4) = 120. Proof.7 An Application: Homogeneous Linear Diﬀerential Equation 6.6. 3 3 3 28 .7. Example 46 A bacteria culture growth at a rate proportional to its size. Then (i) S is a subspace of F . Then [x(t)eat ]′ = 0. 40 3 120 = P (0)e4k . dt The solution of the equation is P (t) = P (0)ekt . We thus have P (t) = or k = ln 3/2. Let P (t) be the population at t hours. Solution: We measure the time t in hours. (ii) {e−at } is a basis of S. Theorem 21 Let S = {yy ′ + ay = 0}. 40 t/2 40 √ t 40 (ln 3/2)t 3 = 3 = e . we obtain 40 = P (0)e2k .1 First Order Homogeneous Linear Diﬀerential Equation First Order Homogeneous Linear Diﬀerential Equation: y ′ (t) + ay(t) = 0. dim S = 1. After 2 hours there are 40 bacteria and after 4 hours the count is 120.
2 Second Order Homogeneous Linear Diﬀerential Equation Second Order Homogeneous Linear Diﬀerential Equation: y ′′ (t) + ay ′ (t) + by(t) = 0. Only scalars are 0.7. 2 2 6.03 = 97. p − 1. dim S = 2.7. For example. (iii) If λ1 = λ2 .942g. where a and b are constants. 6. Omitted. (ii) If λ1 ̸= λ2 . Thus 1 1 m(0. 2 where m(0) = 100. How many grams remaining after 27 minutes (keep three decimals)? Solution: Assume m(t) be the amount after t hours. we will be working with linear algebra. then {eλ1 t . Proof. k) p code. eλ2 t } is a basis of S.45 hours. then {eλ1 t . Theorem 22 Let S = {yy ′′ (t) + ay ′ (t) + by(t) = 0}.3 Linear Codes For the purposes of coding. Note that 27 minutes = 27/60=0.. 29 . 3 + 4 = 2(mod 5).Example 47 The halflife of Sodium24 is 15 hours. Example 48 Find the solution spaces of y ′′ (t)−y ′ (t)−12y(t) = 0 and y ′′ (t)−6y ′ (t)+9y(t) = 0..45/15 = 100( )0. Let Zn be the set of vectors of length n (n entries) such that each entry is an integer p between 0 and p1 (inclusive). dim S = 2. Then (i) S is a subspace of F . Suppose you have 100 grams of Sodium24. If dim C = k. teλ1 t } is a basis of S. Deﬁnition 19 A linear code C is a subspace of Zn .45) = 100( )0. then C is called (n. The addition is a + b = c(mod p).. Then 1 m(t) = m(0)( )t/H . H = 15 hours. and let λ1 and λ2 be the two solutions of the characteristic equation λ2 + aλ + b = 0. 1..
. . 1 0 1 0 0 1 0 1 Example 49 Let C1 = . . 0 1 1 0 0 1 1 0 1 0 1 0 0 1 1 1 C2 = . . 30 . . 0 1 1 0 0 1 1 0 Show that C1 is a linear code. but C2 is not a linear code. .
⃗ be two vectors in Rn . v v If c is a scalar. then c⃗  = c⃗ . v Deﬁnition 22 Let ⃗ . v v Normalization: When the length of a vector is 1 that vector is called a unit vector.. v u v v1 v2 . ⃗ ) = ⃗ − ⃗ . v This process of constructing a unit vector in the direction of a given vector ⃗ is called v normalizing ⃗ . ⃗ be two vectors in Rn .e. . Given any vector ⃗ we can change it into a unit vector: v ⃗ v ⃗  v is a unit vector in the direction of ⃗ . .1 Review Deﬁnition 20 Let ⃗ . vn ⃗  = v √ √ 2 2 ⃗ · ⃗ = v1 + . Deﬁnition 21 The length or norm of a vector ⃗ = v . un . u v u1 u2 ⃗ = . u v u v 31 .5. v − ⃗ v ⃗  v is a unit vector in the opposite direction of ⃗ . . + un vn is called the inner product or dot product of ⃗ and u v u ⃗ and is denoted as ⃗ · ⃗ . The distance between ⃗ . ⃗ is the length of the u v u v vector ⃗ − ⃗ : u v dist(⃗ .. . vn The number ⃗ T ⃗ = u1 v1 + u2 v2 + .1. i.1 Orthogonality in Rn 5. v1 v2 . + vn ... .⃗ = u v . is a deﬁned by.
vm }. ⃗ are orthogonal if and only if u v ⃗ + ⃗ 2 = ⃗ 2 + ⃗ 2 .Angle between two vectors: Let ⃗ . They are said to be orthogonal u v if ⃗ · ⃗ = 0. Deﬁnition 25 (Orthogonal Basis) If S = {v1 . . ⃗2 = 1 . ⃗ v Thus v1 ⊥⃗2 . v v 1 2 1 3 Show that {⃗1 . the set is orthogonal.. ⃗ ⃗ ⃗ 1 0 −5 Example 51 Let v1 = −2 .2 Orthogonal Set Deﬁnition 23 (Orthogonality) Let ⃗ .. v2 .1. Then ⃗ ·⃗ u v cos θ = . v2 . then the set is linearly independent. ⃗3 } is an orthogonal basis for R .. u v u v Deﬁnition 24 If each pair of distinct vectors in a set is orthogonal then the set is called an orthogonal set. . then it is called an orthogonal basis for the subspace W = Span{v1 . ⃗2 . v2 ⊥⃗3 .. v v v 32 . 0 0 0 1 0 −5 v v v Example 50 Is the set ⃗1 = . vm } is an orthogonal set of nonzero ⃗ ⃗ ⃗ vectors. ⃗3 = an orthogonal set? −2 1 −2 1 2 1 Sol: We need to check v1 · ⃗2 = 0 + 0 + (−2) + 2 = 0 ⃗ v v1 · ⃗3 = 0 + (−5) + 4 + 1 = 0 ⃗ v v2 · ⃗3 = 0 + 0 + (−2) + 2 = 0.. u v Theorem 23 (The Pythagorean Theorem) Two vectors ⃗ . v1 ⊥⃗3 . ⃗ v ⃗ v ⃗ v Theorem 24 If a set of nonzero vectors is orthogonal. ⃗2 = . ⃗ ⃗  u v 5. ⃗3 = −2 . ⃗ be two vectors.. and let θ ≤ π be the angle between u v them. ⃗ be two vectors in Rn .
. k = 1. y ⃗ ⃗ v Proof. ⃗3 } is orthogonal of nonzero vectors. v2 ⊥⃗3 . which is called the standard basis for Rn ... If an orthonormal set S spans some subspace W. where S = {v1 . ⃗ ⃗ 3 −5 0 1 x x v v Example 52 Let v1 = −2 .) Example 53 The set {e1 . . 33 ck = ⃗ · ⃗k y v . thus an orthonormal ⃗ ⃗ ⃗ n basis for R .. + cm⃗m . (S is an orthogonal set that spans W. ⃗ · v1 = (c1 v1 + c2 v2 + .. then it may be written uniquely as a linear combination of the vectors in S.. ⃗2 = 1 . .. ⃗k · ⃗k v v . v2 . ⃗ v Thus v1 ⊥⃗2 . then it is called ⃗ ⃗ ⃗ an orthonormal set. ⃗3 } .1. Theorem 26 A matrix A has orthonormal columns if and only if AT A = I... Represent ⃗ as a 1 1 2 1 linear combination of {⃗1 . v1 ⊥⃗3 . . um } is an orthogonal set of unit vectors. The set {⃗1 .. . vm }. en } is an orthonormal set that spans Rn ... Deﬁnition 27 An n × n matrix U is called an orthogonal matrix if its columns form an orthonormal set.2 Orthonormal Set Deﬁnition 26 If S = {u1 . v2 . + cm vm · v1 ⃗ ⃗ ⃗ ⃗ ⃗ ⃗ = c1 v1 · v1 + 0. ⃗ = 2 .. so is linearly independent and thus a basis for W. so they are ⃗ v ⃗ v ⃗ v v v v linearly independent...... v1 · ⃗2 = 0 + (−2) + 2 = 0 ⃗ v v1 · ⃗3 = (−5) + 4 + 1 = 0 ⃗ v v2 · ⃗3 = 0 + (−2) + 2 = 0. Three such vectors automatically form a basis for R3 . + cm⃗m ) · v1 y ⃗ ⃗ ⃗ v ⃗ = c1 v1 · v1 + c2 v2 · v1 + .. v v v 5. ⃗ = c1 v1 + c2 v2 + .Proof. Theorem 25 If ⃗ is a vector in W = Span{v1 . then S is called an orthonormal basis for W.. ⃗2 . . ⃗2 . u2 . m. ⃗3 = −2 . vm } is an y ⃗ ⃗ ⃗ ⃗ ⃗ ⃗ orthogonal set of nonzero vectors.. e2 .
(A⃗ ) · (A⃗ ) = ⃗ · ⃗ . x x x x x x x x x x 2. 1. It is easy to see that AT A = I. x y x y Proof. L is orthogonal to W. A⃗ 2 = (A⃗ )T (A⃗ ) = ⃗ T AT A⃗ = ⃗ T I⃗ = ⃗ T ⃗ = ⃗ 2 . A⃗  = ⃗ . we may think of W as 34 . (iii) If λ is an eigenvalue of A. then so is AB. x y x y 3. Then (i) A−1 is orthogonal. and let ⃗ . √ √ 2/6 2/2 −2/3 √ Example 54 Show that A = 4 2/6 0 1/3 is an orthogonal matrix.2 Orthogonal Complements and Orthogonal Projections 5. √ √ 2/6 − 2/2 −2/3 Theorem 28 Let A be an m × n matrix with orthonormal columns. 5. (A⃗ ) · (A⃗ ) = 0 if and only if ⃗ · ⃗ = 0. ⃗ be in Rn . Thus A−1 = AT . √ √ 2/6 2/2 −2/3 √ Example 55 A = 4 2/6 0 1/3 has orthonormal columns.Theorem 27 An n × n matrix U is an orthogonal matrix if and only if U −1 = U T . 1. (A⃗ ) · (A⃗ ) = (A⃗ )T (A⃗ ) = ⃗ T AT A⃗ = ⃗ T I⃗ = ⃗ T ⃗ . √ √ 2/6 − 2/2 −2/3 Proof. Similarly.2. (iv) If A and B are orthogonal n × n matrices. x y Then.1 Orthogonal Complements Deﬁnition 28 (Orthogonal Complement) Let W be a plane and L a line intersecting W. then λ = ±1. We call L the orthogonal complement of W and denote it by L = W ⊥ . At the point of intersection of the line L to the plane W. (ii) det(A) = ±1. x y x y x y x y x y Property 5 Let A be an orthogonal matrix. x x 2.
W = L⊥ .. . x3 = −7/4x2 . x 2) Find other vectors in W ⊥ . 1 . x −7 1) Show that ⃗ ∈ W ⊥ . • If W = span{w1 . Example 56 Let −1 3 W = span 2 .being perpendicular to L and so may be called the orthogonal complement of L and is denoted by. x x • W ⊥ is a subspace. 1 x1 2) Let ⃗ = x2 ∈ W ⊥ . Properties of Orthogonal Complement: • A vector ⃗ is in W ⊥ if and only if ⃗ is orthogonal to every vector in a set that spans W. • (RowA)⊥ = N ulA. then v ∈ W ⊥ if and only if v · wi = 0 for all i. A = x 3 1 3 x 1 ·⃗ = 0 ⇒ 1 ] [ ] 1 −4 1 0 ∼ .. Solution: 1) −1 x 2 · ⃗ = 0. 1 3 x 1 · ⃗ = 0. 1 [ −1 2 A⃗ = 0. i.. 1 ⃗ = 1/4x2 4 . 1 0 7 4 The solution of this is: x1 = 1/4x2 . (ColA)⊥ = N ulAT . Then x x3 −1 x 2 · ⃗ = 0. wk }.. ∩ • W W ⊥ = {0}. x −7 35 .e. 1 1 1 ⃗ = 4 .
If W = Span{v1 . where S = {v1 . m. [ 1 2 ] = [ −0.2.. v x x Sol: 1) v x ˆ ⃗ · ⃗ ⃗ = −3 ⃗= v x ⃗ ·⃗ x x 5 2) The component of ⃗ orthogonal to ⃗ is v x [ ˆ ⃗ −⃗ = v v Thus ⃗= v [ ] + 36 1. [ −0. v2 . v x x 3) Find the distance from ⃗ to the line through ⃗ and the origin (i.. we would like to write ⃗ as a linear combination of two orthogonal v x v vectors: one vector in the direction of the vector ⃗ and another vector.⇒ ⃗ = ⃗ − y v ⃗..2 Orthogonal Projections Given two vectors ⃗ and ⃗ .. .2 ... ⇒ α = v x y Deﬁnition 29 ⃗ ·⃗ v x ⃗ ·⃗ v x . x y x ⃗ = α⃗ + ⃗ . So. .6 −1. L = Span{⃗ })..2 [ ] .6 3.. k = 1. y y y ] [ ] 1 1 Example 57 Let ⃗ = v ..e. then the orthogonal projection of ⃗ onto W is deﬁned y as ⃗ · ⃗k y v projW ⃗ = c1 v1 + c2 v2 + . ] . vm } v x ⃗ ⃗ ⃗ ⃗ ⃗ ⃗ is an orthogonal set of nonzero vectors. ck = y ⃗ ⃗ v .⃗= x −2 2 1) Find the orthogonal projection of ⃗ onto ⃗ . x ⃗ ·⃗ x x ⃗ ·⃗ x x ⃗ ·⃗ v x ⃗ x ⃗ ·⃗ x x is called the orthogonal projection of ⃗ onto ⃗ and v x proj⃗ ⃗ = xv perp⃗ ⃗ = ⃗ − v xv ⃗ ·⃗ v x ⃗ x ⃗ ·⃗ x x is the component of ⃗ orthogonal to ⃗ ... orthogonal to ⃗ . + cm⃗m .6 −1. ⃗k · ⃗k v v The complement of ⃗ orthogonal to W is y perpW ⃗ = ⃗ − projW ⃗ . . one in Span{⃗ } and one orthogonal to ⃗ . ⃗ .5.. v2 .6 3.2 ] .2 1. v x 2) Write ⃗ as the sum of two vectors. vm }.
Example 58 Let ⃗ = y Find projW ⃗ .3 The orthogonal Decomposition Theorem Theorem 29 Let W be a subspace of Rn with orthogonal basis {v1 . y y z where ˆ ⃗ ∈ W.8..3) The distance is [ ˆ ⃗ − ⃗  =  v v 1. Property: If we have an orthogonal basis {v1 .. ⃗2 = 0 0 1 2 v . vm } for W and if ⃗ ∈ W .6 3.. .2 ]  = √ 1. ⃗3 }. vm }. .22 = √ 12. then projW ⃗ = ⃗ ⃗ ⃗ y y ⃗. the set {⃗1 . ⃗2 . v1 ⊥⃗3 . z y y 1 1 1 1 v .. v2 . y 37 . ⃗ ∈ W ⊥ . Let W = Span{⃗1 . ⃗1 = 0 1 −2 1 v .. ⃗2 . y Sol: Since v1 · ⃗2 = 0 + 0 + (−2) + 2 = 0 ⃗ v v1 · ⃗3 = 0 + (−5) + 4 + 1 = 0 ⃗ v v2 · ⃗3 = 0 + 0 + (−2) + 2 = 0.62 + 3. v2 ⊥⃗3 ..2. v2 . ⃗ v ⃗ v ⃗ v v v v y v y v y v ˆ = ⃗ · ⃗1 v1 + ⃗ · ⃗2 v2 + ⃗ · ⃗3 v3 = 0 ⃗ ⃗ ⃗ projW ⃗ = ⃗ y y ⃗1 · ⃗1 v v ⃗2 · ⃗2 v v ⃗3 · ⃗3 v v 0 1 −2 1 3 + 5 0 0 1 2 −6 + 5 0 −5 −2 1 = 0 6 3 0 . Then each ⃗ ⃗ ⃗ ⃗ y in Rn can be written uniquely in the form ˆ ⃗ = ⃗ + ⃗. 5. ⃗3 } is orthogonal basis of W. ⃗ ⃗= y ⃗1 · ⃗1 v v ⃗m · ⃗m v v ˆ ⃗ = ⃗ − ⃗. ⃗3 = 0 −5 −2 1 v v v . y z ⃗ · ⃗1 y v ⃗ · ⃗m y v ˆ v1 + · · · + ⃗ vm . ⃗ v Thus v1 ⊥⃗2 .
⃗1 = 0 −1 1 3 2 13 the distance from ⃗ to W. ⃗ 1 = 1/3 . Find . ⃗ · vm = dm vm · vm . ⃗ ⃗ ⃗ ⃗ u u T 2/3 −2/3 2/3 −2/3 1 8/9 projW ⃗ = U U T ⃗ = 1/3 2/3 1/3 2/3 1 = 7/9 . ⃗ 2 = 2/3 .. u2 . .. ⃗2 }. Find y 1/3 2/3 1 projW ⃗ . + dm vm . ... {⃗ 1 . ⇒ y ⃗ ⃗ ⃗ ⃗ · v1 = d1 v1 · v1 . y y y u ⃗ y u ⃗ If U = [u1 u2 · · · um ] then ⃗ ⃗ ⃗ projW ⃗ = U U T ⃗ . u1  = u2  = 1. Since ⃗ ∈ W . y ⃗ = d1 v1 + d2 v2 + . y Theorem 30 Let W be a subspace of Rn with orthonormal basis {u1 . ⃗ in Rn and y ˆ ˆ ⃗ be the orthogonal projection of ⃗ onto W. Let W = Span{⃗1 . Then ⃗ is the closest point in W to ⃗ in the sense y y y y that ˆ ⃗ − ⃗  < ⃗ − ⃗  y y y v ˆ for all ⃗ ∈ W distinct from ⃗ . ⃗ 2 } is orthonomal basis of W.. y 38 v v . Let W = Span{⃗ 1 . um }.. y ˆ ⃗ = projW ⃗ = (⃗ · ⃗ 1 )u1 + · · · + (⃗ · ⃗ m )um . y y −2/3 2/3 1 u u u u Example 59 Let ⃗ = 1 . y Note that u1 · u2 = 0.Proof. ⃗ 2 }.. y ⃗ ⃗ ⃗ y ⃗ ⃗ ⃗ Thus projW ⃗ = y ⃗ · ⃗1 y v ⃗ · ⃗m y v v1 + · · · + ⃗ vm ⃗ ⃗1 · ⃗1 v v ⃗m · ⃗m v v d1 v1 · v1 ⃗ ⃗ dm vm · vm ⃗ ⃗ = v1 + · · · + ⃗ vm ⃗ ⃗1 · ⃗1 v v ⃗m · ⃗m v v = d1 v1 + d2 v2 + .. Then for ⃗ ⃗ ⃗ n each ⃗ in R . + dm vm ⃗ ⃗ ⃗ = ⃗. v y −4 1 3 1 −2 −1 v v Example 60 Let ⃗ = y . y y 2/3 1/3 2/3 1/3 1 11/9 Theorem 31 (The Best Approximation Theorem) Let W be a subspace of Rn . ⃗2 = ...
then dim W + dim W ⊥ = n. So the distance is ⃗ − projW ⃗ . ⃗ ⃗ v v projW ⃗ = y ⃗ · ⃗1 y v ⃗ · ⃗2 y v v1 + ⃗ v2 = ⃗ ⃗1 · ⃗1 v v ⃗2 · ⃗2 v v −1 −5 −3 9 . (2) By decomposition theorem. ⇒ ⃗ − projW ⃗  = 8. Note that y y y y v1 · v2 = 0. Theorem 32 If W is a subspace of Rn . ⇒ ⃗ − projW ⃗ = y y 4 4 4 4 y y . {⃗1 . Proof.Sol: The closest point in W to ⃗ is projW ⃗ . (1) A basis of W and a basis of W ⊥ form a linearly independent set. any vector in Rn can be written as a linear combination of the set. ⃗2 } is orthogonal basis of W. 39 .
X2 = . 2 2 F1  F2  Fk−1 2 Then {F1 . · · · . 1 0 2 1 2 4 Use the GramSchmidt algorithm to convert the set S = {X1 .1 The GramSchmidt Process (Algorithm) Let S = {X1 . Fk } is an orthogonal set. Example 61 Consider the following independent set S = {X1 .5. F1 2 0 4 1 −2 0 2 1 −1 −1 1 4 −1 4 1 −6 −2 −2. F2 .5 3 4 1 0 40 . Then 4 6 1 0 8 1 −2 X 2 · F1 F2 = X2 − F1 = − = .3. X3 } of vectors from R4 : 1 6 −1 1 0 −1 X1 = . − − = 2 2 F1  F2  2 4 1 24 −2 0.3 The GramSchmidt Process and the QR Factorization 5. F3 }. Xk } be a set of vectors. Solution: Let F1 = X1 = 1 1 1 1 . · · · . X2 . X2 .5 X3 · F1 X 3 · F2 F3 = X3 − F1 − F2 = . X3 } into an orthogonal set B = {F1 . F2 . X3 = . and let F1 F2 Fk = = ··· = Xk − X1 X2 − X 2 · F1 F1 F1 2 Xk · F1 X k · F2 Xk · Fk−1 F1 − F2 − · · · − − Fk−1 . X2 .
Xk } be the columns of A. and R is an upper triangular matrix. Then there exists orthonormal set {F1 .2 The QR Factorization By the GramSchmidt Process (Algorithm). where Q is an m × n matrix with orthonormal columns. · · · . √ 0 0 23 41 . Example 62 1 0 1 1 1 √ − √1 − √3 0 −3 3 10 23 √2 − √3 2 −1 0 10 23 = 1 √ √1 √1 0 1 3 − 10 23 1 √ √2 √2 3 5 3 10 23 √ √ √ 3 3 3 √ √ 0 10 10 . Fk } such that X1 X2 Xk = = ··· = ck1 F1 + ck2 F2 + ckk Fk . · · · . Proof.5. Let {X1 . X2 . we get the following QR Factorization: Theorem 33 If A is m × n matrix with linearly independent columns. F2 . then A = QR. cij = Xi · Fj .3. c11 F1 c21 F1 + c22 F2 Remark.
1 . Let A be a real square matrix. 1 Step 3: Find orthogonal bases to each eigenspace using GramSchmidt Process: 42 . Conditions under which a matrix is orthogonally diagonalizable: Spectral Theorem. 2 1 1 Example 63 Orthogonally diagonalize the matrix A = 1 2 1 1 1 2 Solution: Step 1: Find all eigenvalues: The characteristic polynomial is −λ3 +6λ2 −9λ+4. Method to orthogonally diagonalize a symmetric matrix: Columns of Q consist of orthonormal bases of all eigenspaces. Property 6 Let A be symmetric. then all eigenvalues are real. Proof. (ii) Any two eigenvectors corresponding to distinct eigenvalues are orthogonal.5. 4. Step 2: Find bases to each eigenspace: −1 −1 Basis for E1 : 0 . 1 0 1 Basis for E4 : 1 . Then A is orthogonally diagonalizable if and only if A is symmetric. Using x · y = xT y. (i) If A is real. So λ = 1.4 Orthogonal Diagonalization of Symmetric Matrix Deﬁnition 30 A square matrix A is orthogonally diagonalizable if there exist an orthogonal matrix Q and a diagonal matrix D such that QT AQ = D.
√ 1/ 3 Step 5: Construct Q and D: √ √ √ 1 0 0 −1/ 2 −1/ 6 1/ 3 √ √ Q= 0 2/ 6 1/ 3 . 2/ 6 . √ √ √ 0 0 4 −1/ 2 −1/ 6 1/ 3 43 . D = 0 1 0 . 1 Step 4: Find orthonormal bases to eigenspace: √ each √ −1/ 2 −1/ 6 √ . For E1 : 0 √ √ 1/ 2 −1/ 6 √ 1/ 3 √ Basis for E4 : 1/ 3 . 1 . 1 −1/2 1 Basis for E4 : 1 . −1 −1/2 For E1 : 0 .
then the change of variable x = Qy transforms xT Ax into the quadratic form y T Dy: 2 2 2 xT Ax = y T Dy = λ1 y1 + λ2 y2 + · · · + λn yn .5. 1 2 3 Solution: The coeﬃcients of the squared terms x2 go on the diagonal as aii of A. if A is the n × n symmetric matrix associated with the quadratic form xT Ax. · · · . Speciﬁcally. x ∈ Rn . x2 . x3 ) = 2x2 + 3x2 − 1x3 − 8x1 x2 + 10x2 x3 . • Quadric form in 2 variables: ax2 + by 2 + cxy. λn are the eigenvalues of A. and if Q is an orthogonal matrix such that QT AQ = D is an diagonal matrix. the i coeﬃcients of the crossproduct terms xi xj are split into half between aij and aji . y = . 0 5 −1 The Principle Axes Theorem: Every quadratic form can be diagonalized.5 An Application: Quadratic Forms 5.1 Quadratic Forms Deﬁnition 31 A quadric form in n variables is a function f : Rn → R of the form f (x) = xT Ax. We call A the matrix associated with f . . where A is a symmetric n × n matrix. y1 . Example 64 Find the matrix associated with the quadratic form f (x1 . The process is called diagowhere λ1 . . Thus 2 −4 0 A = −4 3 5 .5. • Quadric form in 3 variables: ax2 + by 2 + cz 2 + dxy + exz + f yz. yn nalizing a quadratic form. λ2 . 44 .
Step 2: Find corresponding ] [ √ unit eigenvectors: 1/ 5 √ . Step 5: When x = . 1 2 Step 1: Find all eigenvalues: λ = 9. positive deﬁnite if and only if the eigenvalues of A are positive. 45 . y = QT x = 3/ 5 1 f (x) = 36. 4. negative deﬁnite if f (x) < 0 for all x ̸= 0. 3. When λ = 4: q2 = −1/ 5 Step 3: Construct Q and D: ] [ √ [ √ ] 1/ 5 2/ 5 9 0 √ √ Q= . f (y) = 36. 4. When λ = 9: q1 = 2/ 5 [ √ ] 2/ 5 √ . positive deﬁnite if f (x) > 0 for all x ̸= 0.Example 65 Find a change of variable that transforms the quadratic form associated with [ ] 5 2 the matrix A = as a quadratic form with no crossproduct terms. negative deﬁnite if f (x) ≤ 0 for all x. Test the result 2 8 [ ] 2 with x = . 2. 1 Solution: The quadratic form is f (x) = 5x2 + 8x2 + 4x1 x2 . [ ] [ √ ] 2 4/ 5 √ . indeﬁnite if f (x) takes on both positive and negative values. 5. positive semideﬁnite if f (x) ≥ 0 for all x. Theorem 34 A quadratic form f (x) = xT Ax ( and also a symmetric matrix A) is 1. y2 ) = 9y1 + 4y2 .D = . Deﬁnition 32 A quadratic form f (x) = xT Ax ( and also a symmetric matrix A) is classiﬁed as one of the following: 1. then 2 2 f (y) = f (y1 . 2/ 5 −1/ 5 0 4 Step 4: Let x = Qy.
Definition 32 A quadratic form f(x) = xᵀAx (and also the symmetric matrix A) is classified as one of the following:
1. positive definite if f(x) > 0 for all x ≠ 0.
2. positive semidefinite if f(x) ≥ 0 for all x.
3. negative definite if f(x) < 0 for all x ≠ 0.
4. negative semidefinite if f(x) ≤ 0 for all x.
5. indefinite if f(x) takes on both positive and negative values.

Theorem 34 A quadratic form f(x) = xᵀAx (and also the symmetric matrix A) is
1. positive definite if and only if all eigenvalues of A are positive.
2. positive semidefinite if and only if all eigenvalues of A are nonnegative.
3. negative definite if and only if all eigenvalues of A are negative.
4. negative semidefinite if and only if all eigenvalues of A are nonpositive.
5. indefinite if and only if A has both positive and negative eigenvalues.

Example 66 Classify f(x, y, z) = y² + 2xy + 4xz + 2yz as positive definite, negative definite, indefinite, or none of these.
Solution: The eigenvalues of the associated matrix A are 3, 0, and −2. Since A has both positive and negative eigenvalues, f is indefinite.

5.5.2 Constrained Optimization Problem

Theorem 35 Let f(x) = xᵀAx be a quadratic form, and let λ₁ ≥ λ₂ ≥ · · · ≥ λₙ be the eigenvalues of A. Then the following are true subject to the constraint ‖x‖ = 1:
1. λ₁ ≥ f(x) ≥ λₙ.
2. The maximum value of f(x) is λ₁, and it occurs when x is a unit eigenvector corresponding to λ₁.
3. The minimum value of f(x) is λₙ, and it occurs when x is a unit eigenvector corresponding to λₙ.
Proof. Let x = Qy, where QᵀAQ = D is diagonal. Then 1 = xᵀx = yᵀy, and f(x) = yᵀDy = λ₁y₁² + · · · + λₙyₙ², which is trapped between λₙ and λ₁.

Example 67 Find the maximum and minimum of f(x, y) = 5x² + 2y² + 4xy subject to x² + y² = 1, and determine values of x and y for which each occurs.
Solution: max f is 6, when (x, y) = (2/√5, 1/√5); min f is 1, when (x, y) = (1/√5, −2/√5).
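Both Theorem 34 and Theorem 35 reduce to eigenvalue computations, as the sketch below shows (assuming NumPy; classify is our helper, and a small tolerance stands in for exact zero tests):

    import numpy as np

    def classify(A, tol=1e-12):
        """Classify the form x^T A x by the signs of the eigenvalues of A."""
        w = np.linalg.eigvalsh(A)
        if np.all(w > tol):    return "positive definite"
        if np.all(w >= -tol):  return "positive semidefinite"
        if np.all(w < -tol):   return "negative definite"
        if np.all(w <= tol):   return "negative semidefinite"
        return "indefinite"

    # Example 66: f(x, y, z) = y^2 + 2xy + 4xz + 2yz
    A = np.array([[0.0, 1.0, 2.0],
                  [1.0, 1.0, 1.0],
                  [2.0, 1.0, 0.0]])
    print(classify(A))          # indefinite (eigenvalues 3, 0, -2)

    # Example 67: on the unit circle, max f = largest eigenvalue and
    # min f = smallest, attained at corresponding unit eigenvectors.
    B = np.array([[5.0, 2.0],
                  [2.0, 2.0]])
    w, Q = np.linalg.eigh(B)    # ascending order: [1. 6.]
    print(w[-1], Q[:, -1])      # 6.0 at +/-(2, 1)/sqrt(5)
    print(w[0],  Q[:, 0])       # 1.0 at +/-(1, -2)/sqrt(5)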
5.5.3 Graphing Quadratic Equations

Example 68 Identify and graph the conic (a curve obtained by slicing a cone with a plane) whose equation is
    5x₁² + 4x₁x₂ + 2x₂² − (28/√5)x₁ − (4/√5)x₂ + 4 = 0.
Solution: The matrix of the quadratic part is A = [ 5 2 ; 2 2 ], with eigenvalues 1 and 6. Let
    Q = [  1/√5  2/√5 ]        D = [ 1  0 ]
        [ −2/√5  1/√5 ],           [ 0  6 ]
and let x = Qy. The equation becomes y₁² + 6y₂² − 4y₁ − 12y₂ + 4 = 0, and completing the square gives
    (y₁ − 2)² + 6(y₂ − 1)² = 6.
It is an ellipse with center (2, 1) in the y-coordinates, with its main axis along q₁ = (1/√5, −2/√5) and its second axis along q₂ = (2/√5, 1/√5).
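The change of variable in Example 68 can be verified symbolically. A sketch assuming SymPy is available:

    import sympy as sp

    x1, x2, y1, y2 = sp.symbols('x1 x2 y1 y2')
    s5 = sp.sqrt(5)

    conic = 5*x1**2 + 4*x1*x2 + 2*x2**2 - 28/s5*x1 - 4/s5*x2 + 4

    # x = Qy, with the orthonormal eigenvectors of [[5, 2], [2, 2]]
    # as the columns of Q.
    Q = sp.Matrix([[ 1/s5, 2/s5],
                   [-2/s5, 1/s5]])
    y = Q * sp.Matrix([y1, y2])

    new_eq = sp.expand(conic.subs({x1: y[0], x2: y[1]}, simultaneous=True))
    print(new_eq)   # y1**2 + 6*y2**2 - 4*y1 - 12*y2 + 4
    # Completing the square: (y1 - 2)**2 + 6*(y2 - 1)**2 = 6, an ellipse.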
7.1 Inner Product Spaces

7.1.1 Inner Products

Definition 33 An inner product on a vector space V is a function that associates with each pair of vectors u and v in V a real number < u, v > satisfying the following axioms for all vectors u, v, w in V and all scalars c:
(a) < u, v > = < v, u >.
(b) < u, v + w > = < u, v > + < u, w >.
(c) < cu, v > = c < u, v >.
(d) < u, u > ≥ 0, and < u, u > = 0 if and only if u = 0.
A vector space along with an inner product is called an inner product space.

Example 69 Let u = (u₁, · · · , uₙ) and v = (v₁, · · · , vₙ) be two vectors in Rⁿ. The following are inner products:
• The dot product: < u, v > = u · v = uᵀv = u₁v₁ + u₂v₂ + · · · + uₙvₙ.
• A weighted dot product: < u, v > = w₁u₁v₁ + w₂u₂v₂ + · · · + wₙuₙvₙ, where w₁, · · · , wₙ are positive scalars.

Example 70 The following are also inner products:
• Let A be a symmetric, positive definite matrix. Then < u, v > = uᵀAv defines an inner product on Rⁿ.
• < a₁ + b₁x + c₁x², a₂ + b₂x + c₂x² > = a₁a₂ + b₁b₂ + c₁c₂ defines an inner product on P₂.
• < A, B > = trace(AᵀB) defines an inner product on M₂₂.
• < f, g > = ∫ₐᵇ f(x)g(x) dx defines an inner product on C[a, b].
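Each of these inner products is a one-liner to implement. The sketch below (assuming NumPy; the helper names and the choice of weights are ours) spot-checks the symmetry and positivity axioms numerically for two of them:

    import numpy as np

    w = np.array([1.0, 2.0, 3.0])          # positive weights, our choice
    def ip_weighted(u, v):                 # weighted dot product on R^3
        return np.sum(w * u * v)

    def ip_trace(A, B):                    # <A, B> = trace(A^T B) on M22
        return np.trace(A.T @ B)

    u = np.array([1.0, 0.0, 2.0])
    v = np.array([3.0, 1.0, -1.0])
    print(ip_weighted(u, v), ip_weighted(v, u))   # -3.0 -3.0 (symmetry)
    print(ip_weighted(u, u) >= 0)                 # True (positivity)

    A = np.array([[1.0, 2.0], [0.0, 1.0]])
    print(ip_trace(A, A) >= 0)                    # True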
Property 7 Suppose u, v, and w are vectors in an inner product space V and c is any scalar. Then:
(a) < u + v, w > = < u, w > + < v, w >.
(b) < u, cv > = c < u, v >.
(c) < u, 0 > = < 0, v > = 0.

7.1.2 Length, Distance and Orthogonality

Definition 34 Suppose u and v are vectors in an inner product space V.
1. The length (or norm) of v is ‖v‖ = √< v, v >. A vector of length 1 is called a unit vector.
2. The distance between u and v is d(u, v) = ‖u − v‖.
3. u and v are orthogonal if < u, v > = 0.

Example 71
• Consider the inner product < f, g > = ∫₀¹ f(x)g(x) dx on C[0, 1]. Given f(x) = 1 + 3x and g(x) = 1 − 3x, calculate ‖f‖, d(f, g), and < f, g >.
• Consider the inner product < a₁ + b₁x + c₁x², a₂ + b₂x + c₂x² > = a₁a₂ + b₁b₂ + c₁c₂ on P₂. Given f(x) = 1 + 3x and g(x) = 1 − 3x, calculate ‖f‖, d(f, g), and < f, g >.

Pythagoras' Theorem Let u, v be vectors in an inner product space V. Then u and v are orthogonal if and only if ‖u + v‖² = ‖u‖² + ‖v‖².
Proof. ‖u + v‖² = < u + v, u + v > = ‖u‖² + 2< u, v > + ‖v‖², which equals ‖u‖² + ‖v‖² if and only if < u, v > = 0.

7.1.3 Orthogonal Projections and the Gram-Schmidt Process

Example 72 Apply the Gram-Schmidt Process to the basis {1, x, x², x³} of P₃ to find a basis that is orthogonal with respect to the inner product < f, g > = ∫₋₁¹ f(x)g(x) dx.
Solution: {1, x, x² − 1/3, x³ − (3/5)x}. These are called the Legendre polynomials. If we divide them by their lengths, they are called the normalized Legendre polynomials.
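Both Example 71 and Example 72 can be reproduced with symbolic integration. A sketch assuming SymPy is available (the helper names ip and ip01 are ours):

    import sympy as sp

    x = sp.symbols('x')
    ip = lambda f, g: sp.integrate(f * g, (x, -1, 1))   # for Example 72

    # Example 72: Gram-Schmidt on {1, x, x^2, x^3}.
    ortho = []
    for p in [1, x, x**2, x**3]:
        for q in ortho:
            p = p - ip(p, q) / ip(q, q) * q
        ortho.append(sp.expand(p))
    print(ortho)        # [1, x, x**2 - 1/3, x**3 - 3*x/5]

    # Example 71, first bullet, on C[0, 1]:
    ip01 = lambda f, g: sp.integrate(f * g, (x, 0, 1))
    f, g = 1 + 3*x, 1 - 3*x
    print(sp.sqrt(ip01(f, f)))          # sqrt(7)   = ||f||
    print(sp.sqrt(ip01(f - g, f - g)))  # 2*sqrt(3) = d(f, g)
    print(ip01(f, g))                   # -2        = <f, g>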
Definition 35 Let {w₁, · · · , wₖ} be an orthogonal basis for a subspace W of an inner product space V, and let v ∈ V. The orthogonal projection of v onto W is defined as
    proj_W(v) = (< w₁, v >/< w₁, w₁ >) w₁ + · · · + (< wₖ, v >/< wₖ, wₖ >) wₖ.
The component of v orthogonal to W is perp_W(v) = v − proj_W(v).

Example 73 Let p(x) = 5x³ + 2x² − x + 3. Find the projection of p(x) onto P₃ with the Legendre polynomials as a basis (that is, express p(x) in terms of the Legendre basis).

The Cauchy-Schwarz Inequality Let u, v be vectors in an inner product space V. Then
    |< u, v >| ≤ ‖u‖ ‖v‖,
with equality holding if and only if u and v are scalar multiples of each other.
Proof. If u ≠ 0, then < u, v >²/‖u‖² = ‖proj_u(v)‖² = ‖v‖² − ‖perp_u(v)‖² ≤ ‖v‖²; multiplying through by ‖u‖² gives the result.

The Triangle Inequality Let u, v be vectors in an inner product space V. Then
    ‖u + v‖ ≤ ‖u‖ + ‖v‖,
with equality holding if and only if one of u, v is a nonnegative scalar multiple of the other.
Proof. ‖u + v‖² = ‖u‖² + 2< u, v > + ‖v‖² ≤ ‖u‖² + 2‖u‖‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)², by the Cauchy-Schwarz Inequality.
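For Example 73, the projection formula of Definition 35 yields the coordinates of p(x) with respect to the Legendre basis; since p already lies in P₃, the projection reproduces p itself. A sketch assuming SymPy:

    import sympy as sp

    x = sp.symbols('x')
    ip = lambda f, g: sp.integrate(f * g, (x, -1, 1))

    # Orthogonal Legendre basis of P3 from Example 72:
    L = [sp.S(1), x, x**2 - sp.Rational(1, 3), x**3 - sp.Rational(3, 5)*x]

    p = 5*x**3 + 2*x**2 - x + 3
    coeffs = [ip(q, p) / ip(q, q) for q in L]
    print(coeffs)       # [11/3, 2, 2, 5], the Legendre coordinates of p
    print(sp.expand(sum(c * q for c, q in zip(coeffs, L))))
                        # 5*x**3 + 2*x**2 - x + 3, i.e. p itself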
7.2 Norms and Distance Functions

7.2.1 Norms

Definition 36 A norm on a vector space V is a mapping that associates with each vector v a real number ‖v‖ such that, for all vectors u, v and scalars c:
(a) ‖v‖ ≥ 0, and ‖v‖ = 0 if and only if v = 0.
(b) ‖cv‖ = |c| ‖v‖.
(c) ‖u + v‖ ≤ ‖u‖ + ‖v‖.
A vector space along with a norm is called a normed linear space.

Example 74 Let u = (u₁, u₂, · · · , uₙ) be a vector in Rⁿ.
• The sum norm: ‖u‖ₛ = |u₁| + |u₂| + · · · + |uₙ|.
• The max norm (∞-norm, or uniform norm): ‖u‖∞ = max{|u₁|, |u₂|, · · · , |uₙ|}.
• The p-norm: ‖u‖ₚ = (|u₁|ᵖ + |u₂|ᵖ + · · · + |uₙ|ᵖ)^(1/p). When p = 2, it is the Euclidean norm; when p = 1, it is the sum norm.

Example 75 Let v ∈ Z₂ⁿ. Define ‖v‖_H = w(v), the weight of v, which counts the number of 1's in v. Then it is a norm, called the Hamming norm.

7.2.2 Distance Functions

Definition 37 From any norm, we define the distance function d(u, v) = ‖u − v‖.

Property 8 (a) d(u, v) ≥ 0, and d(u, v) = 0 if and only if u = v.
(b) d(u, v) = d(v, u).
(c) d(u, w) ≤ d(u, v) + d(v, w).
A function d satisfying these three properties is called a metric, and a vector space that possesses such a function is called a metric space.
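The norms of Example 74 and the Hamming norm of Example 75 are all available in (or trivial to write with) NumPy; a minimal sketch:

    import numpy as np

    u = np.array([1.0, -2.0, 3.0])
    print(np.sum(np.abs(u)))       # sum norm: 6.0
    print(np.max(np.abs(u)))       # max norm: 3.0
    print(np.linalg.norm(u))       # Euclidean norm (p = 2): sqrt(14)
    print(np.linalg.norm(u, 3))    # p-norm with p = 3

    v = np.array([1, 1, 0])        # a vector over Z_2
    print(np.sum(v))               # Hamming norm: the number of 1's, here 2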
Example 76 Let u = (1, 2, 3) and v = (4, 4, 4). Calculate d_E(u, v), d_s(u, v), and d∞(u, v).

Example 77 Let u = (1, 1, 0) and v = (0, 1, 1) in Z₂³. Calculate d_H(u, v).

7.2.3 Matrix Norms

Definition 38 A matrix norm on Mₙₙ is a mapping that associates with each matrix A a real number ‖A‖ such that, for all matrices A, B and scalars c:
(a) ‖A‖ ≥ 0, and ‖A‖ = 0 if and only if A = O.
(b) ‖cA‖ = |c| ‖A‖.
(c) ‖A + B‖ ≤ ‖A‖ + ‖B‖.
(d) ‖AB‖ ≤ ‖A‖ ‖B‖.
A matrix norm on Mₙₙ is said to be compatible with a vector norm ‖x‖ on Rⁿ if ‖Ax‖ ≤ ‖A‖ ‖x‖ for all A and x.

Example 78
• The Frobenius norm: ‖A‖_F = √(Σᵢ,ⱼ aᵢⱼ²).
  1. Show that it is a matrix norm.
  2. Show that it is compatible with the Euclidean norm.
• ‖A‖₁ = maxⱼ ‖aⱼ‖ₛ = maxⱼ Σᵢ |aᵢⱼ|, where aⱼ is the jth column of A (the maximum absolute column sum), defines a matrix norm.
• ‖A‖∞ = maxᵢ ‖bᵢ‖ₛ = maxᵢ Σⱼ |aᵢⱼ|, where bᵢ is the ith row of A (the maximum absolute row sum), defines a matrix norm.
• ‖A‖ = max over ‖x‖ = 1 of ‖Ax‖ defines another matrix norm, called the operator norm induced by the vector norm ‖x‖.
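numpy.linalg.norm implements all four matrix norms of Example 78. The sketch below also spot-checks compatibility of the Frobenius norm with the Euclidean norm (a single test case, not a proof):

    import numpy as np

    A = np.array([[1.0, -2.0],
                  [3.0,  4.0]])

    print(np.linalg.norm(A, 'fro'))   # Frobenius norm: sqrt(30)
    print(np.linalg.norm(A, 1))       # max absolute column sum: 6.0
    print(np.linalg.norm(A, np.inf))  # max absolute row sum: 7.0
    print(np.linalg.norm(A, 2))       # operator norm induced by the
                                      # Euclidean norm (largest singular value)

    x = np.array([1.0, 1.0])          # one compatibility spot-check
    print(np.linalg.norm(A @ x) <= np.linalg.norm(A, 'fro') * np.linalg.norm(x))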
7.3 Least Squares Approximation

7.3.1 The Best Approximation Theorem

Definition 39 Let W be a subspace of a normed linear space V and let v ∈ V. The best approximation to v in W is the vector v̄ ∈ W such that
    ‖v − v̄‖ < ‖v − w‖
for every w ∈ W different from v̄.

The Best Approximation Theorem Let W be a finite-dimensional subspace of an inner product space V. Then for any v ∈ V, the best approximation to v in W is proj_W(v).
Proof. For any w ∈ W with w ≠ proj_W(v), the vector v − proj_W(v) is orthogonal to proj_W(v) − w, so by Pythagoras' Theorem ‖v − w‖² = ‖v − proj_W(v)‖² + ‖proj_W(v) − w‖² > ‖v − proj_W(v)‖².

Example 79 Let y = (1, 1, 1, 1), v₁ = (0, 1, −2, 1), v₂ = (0, 0, 1, 2), v₃ = (0, −5, −2, 1), and let W = Span{v₁, v₂, v₃}. Find the best approximation to y in W, i.e., proj_W(y).
Solution: Since
    v₁ · v₂ = 0 + 0 + (−2) + 2 = 0,
    v₁ · v₃ = 0 + (−5) + 4 + 1 = 0,
    v₂ · v₃ = 0 + 0 + (−2) + 2 = 0,
we have v₁ ⊥ v₂, v₁ ⊥ v₃, v₂ ⊥ v₃, so the set {v₁, v₂, v₃} is an orthogonal basis of W. Thus
    proj_W(y) = ŷ = (y · v₁/v₁ · v₁) v₁ + (y · v₂/v₂ · v₂) v₂ + (y · v₃/v₃ · v₃) v₃
              = (0/6) v₁ + (3/5) v₂ + (−6/30) v₃ = (0, 1, 1, 1).
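With the vectors as reconstructed above, the orthogonal-basis projection formula of Example 79 is a one-line computation; a sketch assuming NumPy:

    import numpy as np

    y  = np.array([1.0,  1.0,  1.0, 1.0])
    v1 = np.array([0.0,  1.0, -2.0, 1.0])
    v2 = np.array([0.0,  0.0,  1.0, 2.0])
    v3 = np.array([0.0, -5.0, -2.0, 1.0])

    # {v1, v2, v3} is orthogonal, so the projection is computed termwise.
    proj = sum((y @ v) / (v @ v) * v for v in (v1, v2, v3))
    print(proj)     # [0. 1. 1. 1.]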
7.3.2 Least Squares Approximation

The line y = a + bx is called the line of best fit (or the least squares approximating line) for the points (x₁, y₁), · · · , (xₙ, yₙ) if it minimizes
    (a + bx₁ − y₁)² + · · · + (a + bxₙ − yₙ)² = ‖Ax − b‖²,
where A is the n × 2 matrix with rows (1, xᵢ), x = (a, b), and b = (y₁, y₂, · · · , yₙ).

Definition 40 Let A be an m × n matrix and let b ∈ Rᵐ. A least squares solution of Ax = b is a vector x̄ ∈ Rⁿ such that
    ‖b − Ax̄‖ ≤ ‖b − Ax‖
for all x ∈ Rⁿ.

The Least Squares Theorem Let A be an m × n matrix and let b ∈ Rᵐ. Then:
a. A least squares solution x̄ of Ax = b always exists. Moreover, x̄ is a least squares solution of Ax = b if and only if x̄ is a solution of the normal equations AᵀAx = Aᵀb.
b. A has linearly independent columns if and only if AᵀA is invertible. In this case, the least squares solution of Ax = b is unique and given by
    x̄ = (AᵀA)⁻¹Aᵀb.
Proof. (a) Let Ax̄ = proj_col(A)(b). Then (b − Ax̄) ⊥ col(A), so Aᵀ(b − Ax̄) = 0, i.e., AᵀAx̄ = Aᵀb. (b) rank(A) = n if and only if AᵀA is invertible.

Example 80 Find the least squares approximating line for the data points (1, 2), (2, 2), and (3, 4).
Solution: Let the line be y = a + bx. Then
    A = [ 1  1 ]        b = [ 2 ]
        [ 1  2 ],           [ 2 ]
        [ 1  3 ]            [ 4 ]
The normal equations are
    [ 3   6 ] [ a ]   [  8 ]
    [ 6  14 ] [ b ] = [ 18 ],
so
    (a, b) = [ 3 6 ; 6 14 ]⁻¹ (8, 18) = (2/3, 1).
The least squares approximating line is y = 2/3 + x.

Remark. Similarly, we can find the parabola that gives the best least squares approximation to a set of data points.
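Example 80 can be solved either through the normal equations or with NumPy's built-in least squares routine; a sketch:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0]])
    b = np.array([2.0, 2.0, 4.0])

    x = np.linalg.solve(A.T @ A, A.T @ b)        # normal equations
    print(x)                                     # [0.667 1.], i.e. y = 2/3 + x

    x2, *_ = np.linalg.lstsq(A, b, rcond=None)   # same answer, more stable
    print(x2)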
7.3.3 Least Squares via the QR Factorization

Theorem 36 Let A be an m × n matrix with linearly independent columns and let b ∈ Rᵐ. If A = QR is a QR factorization of A, then the unique least squares solution of Ax = b is
    x̄ = R⁻¹Qᵀb.
Proof. Using the facts that QᵀQ = I and R is invertible, the normal equations AᵀAx = Aᵀb become RᵀQᵀQRx = RᵀQᵀb, i.e., RᵀRx = RᵀQᵀb, so Rx = Qᵀb and x̄ = R⁻¹Qᵀb.

Example 81 Use the QR factorization to find a least squares solution of Ax = b, where
    A = [ 1  0  −3 ]
        [ 0  2  −1 ]
        [ 1  0   1 ]
        [ 1  3   5 ]
Solution: By the Gram-Schmidt computation of Section 5.3.2,
    A = QR = [ 1/√3  −1/√10  −3/√23 ] [ √3   √3    √3  ]
             [   0    2/√10  −3/√23 ] [  0  √10   √10  ]
             [ 1/√3  −1/√10   1/√23 ] [  0    0   √23  ]
             [ 1/√3   2/√10   2/√23 ]
and the least squares solution is x̄ = R⁻¹Qᵀb for the given right-hand side b.
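The right-hand side b of Example 81 was not recoverable from the source, so the sketch below uses a placeholder b to illustrate the QR route and confirms it agrees with the normal equations (assuming NumPy):

    import numpy as np

    A = np.array([[1.0, 0.0, -3.0],
                  [0.0, 2.0, -1.0],
                  [1.0, 0.0,  1.0],
                  [1.0, 3.0,  5.0]])
    b = np.array([1.0, 1.0, 1.0, 1.0])     # placeholder right-hand side

    Q, R = np.linalg.qr(A)                 # reduced QR: Q is 4x3, R is 3x3
    x = np.linalg.solve(R, Q.T @ b)        # x = R^{-1} Q^T b
    print(np.allclose(x, np.linalg.solve(A.T @ A, A.T @ b)))   # True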
7.3.4 Orthogonal Projection Revisited

Theorem 37 Let W be a subspace of Rᵐ and let A be an m × n matrix whose columns form a basis of W. If v ∈ Rᵐ, then
    proj_W(v) = A(AᵀA)⁻¹Aᵀv.
The linear transformation P that projects Rᵐ onto W has A(AᵀA)⁻¹Aᵀ as its standard matrix.
Proof. Let x̄ be the unique least squares solution of Ax = v. Then Ax̄ = proj_col(A)(v) = proj_W(v), and by the Least Squares Theorem x̄ = (AᵀA)⁻¹Aᵀv, so proj_W(v) = Ax̄ = A(AᵀA)⁻¹Aᵀv.

Example 82 Let W = {(x, y, z) ∈ R³ : x − y + 2z = 0}. Find the orthogonal projection of v = (3, −1, 2) onto W and give the standard matrix of the linear transformation P that projects R³ onto W.
Solution: Let
    A = [ 1  −1 ]
        [ 1   1 ]
        [ 0   1 ],
whose columns form a basis of W. Then AᵀA = [ 2 0 ; 0 3 ], and the standard matrix of P is
    A(AᵀA)⁻¹Aᵀ = (1/6) [  5  1  −2 ]
                       [  1  5   2 ]
                       [ −2  2   2 ],
so proj_W(v) = A(AᵀA)⁻¹Aᵀv = (5/3, 1/3, −2/3).
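The standard matrix and projection of Example 82 take only a few lines to confirm (assuming NumPy):

    import numpy as np

    A = np.array([[1.0, -1.0],
                  [1.0,  1.0],
                  [0.0,  1.0]])   # columns form a basis of W
    v = np.array([3.0, -1.0, 2.0])

    P = A @ np.linalg.inv(A.T @ A) @ A.T   # standard matrix of P
    print(P)        # (1/6) * [[5, 1, -2], [1, 5, 2], [-2, 2, 2]]
    print(P @ v)    # [ 1.667  0.333 -0.667] = (5/3, 1/3, -2/3)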
7.4 The Singular Value Decomposition

7.4.1 The Singular Values of a Matrix

Definition 41 Let A be an m × n matrix. The singular values of A are the square roots of the eigenvalues of AᵀA; they are denoted by σ₁, · · · , σₙ and ordered so that σ₁ ≥ · · · ≥ σₙ.

Example 83 Let
    A = [ 1  1 ]
        [ 1  0 ]
        [ 0  1 ]
Then σ₁ = √3 and σ₂ = 1.

7.4.2 The Singular Value Decomposition

The Singular Value Decomposition Let A be an m × n matrix whose singular values satisfy σ₁ ≥ · · · ≥ σᵣ > 0 and σᵣ₊₁ = · · · = σₙ = 0. Let V = [v₁ · · · vₙ], where {v₁, · · · , vₙ} is an orthonormal basis of Rⁿ consisting of eigenvectors of AᵀA. Let
    u₁ = (1/σ₁)Av₁, · · · , uᵣ = (1/σᵣ)Avᵣ,
and let U = [u₁ · · · uₘ], where {u₁, · · · , uₘ} is an orthonormal basis of Rᵐ extended from the set {u₁, · · · , uᵣ}. Let D = diag{σ₁, · · · , σᵣ} and let Σ be the m × n matrix
    Σ = [ D  O ]
        [ O  O ]
Then A = UΣVᵀ, which is called a singular value decomposition (SVD) of A. The columns of U are called left singular vectors of A, and the columns of V are called right singular vectors of A.

Example 84 Let
    A = [ 1  1 ]
        [ 1  0 ]
        [ 0  1 ]
Find a singular value decomposition (SVD) of A.
Solution:
Step 1: Orthogonally diagonalize AᵀA to find V:
    V = [ 1/√2  −1/√2 ]
        [ 1/√2   1/√2 ]
Step 2: Find Σ: σ₁ = √3 and σ₂ = 1, so
    Σ = [ √3  0 ]
        [  0  1 ]
        [  0  0 ]
Step 3: Find U:
    u₁ = (1/σ₁)Av₁ = (2/√6, 1/√6, 1/√6),    u₂ = (1/σ₂)Av₂ = (0, −1/√2, 1/√2).
Then apply the Gram-Schmidt Process to {u₁, u₂, e₁} (or e₂, or e₃) to obtain u₃ and complete U. With these choices, A = UΣVᵀ.

Theorem 38 Let A = UΣVᵀ be a singular value decomposition of an m × n matrix A, with singular values σ₁ ≥ · · · ≥ σᵣ > 0 and σᵣ₊₁ = · · · = σₙ = 0. Then:
1. rank(A) = r.
2. {u₁, · · · , uᵣ} is an orthonormal basis for col(A).
3. {uᵣ₊₁, · · · , uₘ} is an orthonormal basis for null(Aᵀ).
4. {v₁, · · · , vᵣ} is an orthonormal basis for row(A).
5. {vᵣ₊₁, · · · , vₙ} is an orthonormal basis for null(A).
6. A = σ₁u₁v₁ᵀ + · · · + σᵣuᵣvᵣᵀ (the outer product form of the SVD).
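NumPy computes the SVD of Example 84 directly, and the outer-product form of Theorem 38(6) is easy to verify from it; a sketch:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 0.0],
                  [0.0, 1.0]])

    U, s, Vt = np.linalg.svd(A)     # full SVD: U is 3x3, Vt is 2x2
    print(s)                        # [1.732... 1.], i.e. sqrt(3) and 1

    # Outer-product form: A = sigma_1 u1 v1^T + sigma_2 u2 v2^T.
    A_back = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
    print(np.allclose(A, A_back))   # True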