
Level-up!

Linear Algebra 2
Morris Alper

Morris Alper – ITC Level-up December 2020 cohort


Agenda

● Matrices
● Matrix Multiplication
● Nullspace
● Determinants
● Matrix Inverse
● Eigenvectors and Eigenvalues
● Diagonalization
● Bonus: Vector Calculus
Matrices
Matrices

Recall: a system of linear equations is a set of 𝑚 linear equations in 𝑛 variables:

𝑎1,1 𝑥1 + 𝑎1,2 𝑥2 + ⋯ + 𝑎1,𝑛 𝑥𝑛 = 𝑏1
𝑎2,1 𝑥1 + 𝑎2,2 𝑥2 + ⋯ + 𝑎2,𝑛 𝑥𝑛 = 𝑏2
⋮
𝑎𝑚,1 𝑥1 + 𝑎𝑚,2 𝑥2 + ⋯ + 𝑎𝑚,𝑛 𝑥𝑛 = 𝑏𝑚

A solution is a set of values (𝑥1, 𝑥2, …, 𝑥𝑛) = (𝑐1, 𝑐2, …, 𝑐𝑛) that satisfies the equations.
Matrices

We can write the coefficients of the linear equations as a matrix:

    [ 𝑎1,1  ⋯  𝑎1,𝑛 ]
𝐴 = [  ⋮    ⋱   ⋮   ]
    [ 𝑎𝑚,1  ⋯  𝑎𝑚,𝑛 ]

The matrix 𝐴 is a rectangular array of numbers. 𝐴 has 𝑚 rows and 𝑛 columns, so we say that 𝐴 is an 𝑚 × 𝑛 (“m by n”) matrix.
Matrices

Example 1: For the system of equations

𝑥 + 2𝑦 + 5𝑧 = 1
2𝑥 − (1/2)𝑦 = 3

the coefficient matrix is the 2 × 3 matrix

𝐴 = [ 1    2    5 ]
    [ 2  −1/2   0 ]
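In code, a matrix is naturally represented as a two-dimensional array. A minimal sketch of the example above, assuming NumPy is available:

```python
import numpy as np

# Coefficient matrix of  x + 2y + 5z = 1,  2x - (1/2)y = 3
A = np.array([[1, 2, 5],
              [2, -0.5, 0]])

print(A.shape)  # (2, 3) -- 2 rows, 3 columns, i.e. a 2 x 3 matrix
```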
Matrices

Example 2: For the system of equations

𝑥1 + 2𝑥2 − 2𝑥3 + 3𝑥4 = 4
−𝑥1 + 2𝑥2 − 𝑥3 + 7𝑥4 = 3
3𝑥1 + 3𝑥2 + 𝑥4 = 2
−𝑥1 − 𝑥2 + 5𝑥3 + 5𝑥4 = 1

the coefficient matrix is the 4 × 4 matrix

𝐴 = [  1   2  −2   3 ]
    [ −1   2  −1   7 ]
    [  3   3   0   1 ]
    [ −1  −1   5   5 ]
Matrices

We can also write vectors as matrices, in two formats:

1 × n matrices (row vectors):

𝒗 = [ 4  3  −1 ]

n × 1 matrices (column vectors):

    [  4 ]
𝒗 = [  3 ]
    [ −1 ]
Matrices

We often write the variables and values of a system of linear equations as column vectors.

Example 1:

𝑥 + 2𝑦 + 5𝑧 = 1
2𝑥 − (1/2)𝑦 = 3

𝐴 = [ 1    2    5 ]    𝒗 = [ 𝑥 ]    𝒃 = [ 1 ]
    [ 2  −1/2   0 ]        [ 𝑦 ]        [ 3 ]
                           [ 𝑧 ]
Matrices

Example 2:

𝑥1 + 2𝑥2 − 2𝑥3 + 3𝑥4 = 4
−𝑥1 + 2𝑥2 − 𝑥3 + 7𝑥4 = 3
3𝑥1 + 3𝑥2 + 𝑥4 = 2
−𝑥1 − 𝑥2 + 5𝑥3 + 5𝑥4 = 1

𝐴 = [  1   2  −2   3 ]    𝒗 = [ 𝑥1 ]    𝒃 = [ 4 ]
    [ −1   2  −1   7 ]        [ 𝑥2 ]        [ 3 ]
    [  3   3   0   1 ]        [ 𝑥3 ]        [ 2 ]
    [ −1  −1   5   5 ]        [ 𝑥4 ]        [ 1 ]
Matrices

The transpose 𝑀ᵀ of a matrix 𝑀 is defined by flipping its rows and columns, as shown below:

              [  4 ]
[ 4  3  −1 ]ᵀ = [  3 ]
              [ −1 ]

[  1   2  −2   3 ]ᵀ   [  1  −1   3  −1 ]
[ −1   2  −1   7 ]  = [  2   2   3  −1 ]
[  3   3   0   1 ]    [ −2  −1   0   5 ]
[ −1  −1   5   5 ]    [  3   7   1   5 ]
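Assuming NumPy, the transpose is available as the `.T` attribute; a quick sketch checking the 4 × 4 example above:

```python
import numpy as np

M = np.array([[ 1,  2, -2,  3],
              [-1,  2, -1,  7],
              [ 3,  3,  0,  1],
              [-1, -1,  5,  5]])

print(M.T)      # rows and columns flipped
print(M.T[0])   # first row of M^T = first column of M: [ 1 -1  3 -1]
```

Note that transposing twice gives back the original matrix: (𝑀ᵀ)ᵀ = 𝑀.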
Matrix Multiplication
Matrix Multiplication

We saw that we can represent systems of linear equations with vectors and matrices as shown below.

𝑥 + 2𝑦 + 5𝑧 = 1        𝐴 = [ 1    2    5 ]    𝒗 = [ 𝑥 ]    𝒃 = [ 1 ]
2𝑥 − (1/2)𝑦 = 3            [ 2  −1/2   0 ]        [ 𝑦 ]        [ 3 ]
                                                  [ 𝑧 ]

Now we will define matrix multiplication so that we can write the system of linear equations using only vectors and matrices.
Matrix Multiplication

𝐴 = [ 1    2    5 ]    𝒗 = [ 𝑥 ]
    [ 2  −1/2   0 ]        [ 𝑦 ]
                           [ 𝑧 ]

𝐴𝒗 = [ 1𝑥 + 2𝑦 + 5𝑧      ]   [ 𝑥 + 2𝑦 + 5𝑧   ]
     [ 2𝑥 − (1/2)𝑦 + 0𝑧 ] = [ 2𝑥 − (1/2)𝑦   ]
Matrix Multiplication

So letting

𝐴 = [ 1    2    5 ]    𝒗 = [ 𝑥 ]    𝒃 = [ 1 ]
    [ 2  −1/2   0 ]        [ 𝑦 ]        [ 3 ]
                           [ 𝑧 ]

the system of linear equations becomes

𝐴𝒗 = 𝒃
Matrix Multiplication

In general, suppose we are given m × n and n × o matrices:

𝐴 = [ 𝑎1,1  ⋯  𝑎1,𝑛 ]    𝐵 = [ 𝑏1,1  ⋯  𝑏1,𝑜 ]
    [  ⋮    ⋱   ⋮   ]        [  ⋮    ⋱   ⋮   ]
    [ 𝑎𝑚,1  ⋯  𝑎𝑚,𝑛 ]        [ 𝑏𝑛,1  ⋯  𝑏𝑛,𝑜 ]

Then we can define the matrix product 𝑃 = 𝐴𝐵 to be the m × o matrix whose (i,j)-th element is

𝑝𝑖,𝑗 = 𝑎𝑖,1 𝑏1,𝑗 + 𝑎𝑖,2 𝑏2,𝑗 + ⋯ + 𝑎𝑖,𝑛 𝑏𝑛,𝑗 = Σₖ₌₁ⁿ 𝑎𝑖,𝑘 𝑏𝑘,𝑗
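The summation formula translates directly into a triple loop. A minimal sketch in plain Python, with NumPy's `@` operator shown for comparison:

```python
import numpy as np

def matmul(A, B):
    """P = AB from the definition: p[i][j] = sum over k of a[i][k] * b[k][j]."""
    m, n, o = len(A), len(B), len(B[0])
    assert len(A[0]) == n, "inner dimensions must match"
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(o)]
            for i in range(m)]

A = [[1, 3], [-1, 2]]
B = [[0, 1], [1, 2]]
print(matmul(A, B))                # [[3, 7], [2, 3]]
print(np.array(A) @ np.array(B))   # same result via NumPy
```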
Matrix Multiplication

Example 1:

[  1  3 ] [ 0  1 ]   [  1⋅0 + 3⋅1    1⋅1 + 3⋅2 ]   [ 3  7 ]
[ −1  2 ] [ 1  2 ] = [ −1⋅0 + 2⋅1   −1⋅1 + 2⋅2 ] = [ 2  3 ]
Matrix Multiplication

Example 2:

                    [  1 ]        [ 0 ]
Q: For vectors 𝒗 = [ −2 ],  𝒘 = [ 1 ],  what is 𝒗ᵀ𝒘?
                    [  1 ]        [ 2 ]

                         [ 0 ]
A: 𝒗ᵀ𝒘 = [ 1  −2  1 ]  [ 1 ] = 1⋅0 + (−2)⋅1 + 1⋅2 = 0
                         [ 2 ]

This is another way to write the dot product 𝒗 ⋅ 𝒘.
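Assuming NumPy, the dot product of 1-D arrays can be written with the same `@` operator:

```python
import numpy as np

v = np.array([1, -2, 1])
w = np.array([0, 1, 2])

print(v @ w)         # 0 -- the dot product v . w
print(np.dot(v, w))  # equivalent
```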
Matrix Multiplication

Properties of Matrix Multiplication

● Linearity: 𝐴(𝐵 + 𝐶) = 𝐴𝐵 + 𝐴𝐶, (𝐴 + 𝐶)𝐵 = 𝐴𝐵 + 𝐶𝐵

● Associativity: 𝐴(𝐵𝐶) = (𝐴𝐵)𝐶

● Transpose: (𝐴𝐵)ᵀ = 𝐵ᵀ𝐴ᵀ

Note: Matrix multiplication is NOT COMMUTATIVE: in general, 𝐴𝐵 ≠ 𝐵𝐴


Matrix Multiplication

The n × n identity matrix 𝐼𝑛 (or simply 𝐼 for short) is the matrix with 1’s along the diagonal and 0’s elsewhere:

     [ 1  0  ⋯  0 ]
𝐼𝑛 = [ 0  1  ⋯  0 ]
     [ ⋮  ⋮  ⋱  ⋮ ]
     [ 0  0  ⋯  1 ]

                  [ 1  0 ]            [ 1  0  0 ]
For example, 𝐼₂ = [ 0  1 ]  and 𝐼₃ = [ 0  1  0 ].
                                      [ 0  0  1 ]

In general, 𝐼𝐴 = 𝐴 and 𝐴𝐼 = 𝐴 for any matrix 𝐴 of compatible size.
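A quick sketch of these identities, assuming NumPy (`np.eye(n)` builds 𝐼𝑛):

```python
import numpy as np

I3 = np.eye(3)   # the 3 x 3 identity matrix
A = np.array([[1.0, 2.0, 5.0],
              [2.0, -0.5, 0.0]])

# IA = A and AI = A, using identities of compatible sizes:
# I2 on the left (A has 2 rows), I3 on the right (A has 3 columns)
print(np.array_equal(np.eye(2) @ A, A))  # True
print(np.array_equal(A @ I3, A))         # True
```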
Nullspace
Nullspace

A system of linear equations where the constant terms are all zero is called a homogeneous system of linear equations:

𝑎1,1 𝑥1 + 𝑎1,2 𝑥2 + ⋯ + 𝑎1,𝑛 𝑥𝑛 = 0
𝑎2,1 𝑥1 + 𝑎2,2 𝑥2 + ⋯ + 𝑎2,𝑛 𝑥𝑛 = 0
⋮
𝑎𝑚,1 𝑥1 + 𝑎𝑚,2 𝑥2 + ⋯ + 𝑎𝑚,𝑛 𝑥𝑛 = 0
Nullspace

A system of linear equations where the constant terms are all zero is called a homogeneous system of linear equations:

𝐴𝒗 = 𝟎

𝐴 = [ 𝑎1,1  ⋯  𝑎1,𝑛 ]    𝒗 = [ 𝑥1 ]    𝟎 = [ 0 ]
    [  ⋮    ⋱   ⋮   ]        [ ⋮  ]        [ ⋮ ]
    [ 𝑎𝑚,1  ⋯  𝑎𝑚,𝑛 ]        [ 𝑥𝑛 ]        [ 0 ]
Nullspace

By linearity, if 𝐴𝒗 = 𝟎 and 𝐴𝒗′ = 𝟎, then 𝐴(𝒗 + 𝒗′) = 𝐴𝒗 + 𝐴𝒗′ = 𝟎 + 𝟎 = 𝟎.

Similarly, for any scalar 𝑐, 𝐴(𝑐𝒗) = 𝑐𝐴𝒗 = 𝑐𝟎 = 𝟎.

So the set of vectors 𝑵𝑨 = {𝒗 : 𝐴𝒗 = 𝟎} is a vector space. It is called the nullspace of 𝐴.
Nullspace

To find a basis for the nullspace of a matrix, we use the fact that we can multiply both sides of any equation in the system of linear equations by a nonzero constant, or add one equation to another, and the solutions will be equivalent.

So we can use row reduction to convert the matrix to reduced row-echelon form, and the nullspace will be the same.
Nullspace

Example:

𝐴 = [ 4  2  2  −2 ]
    [ 3  1  1  −3 ]
    [ 3  2  2   0 ]

                               [ 1  0  0  −2 ]
Applying row reduction:  𝐴′ = [ 0  1  1   3 ]
                               [ 0  0  0   0 ]
Nullspace

Example:

𝐴′ = [ 1  0  0  −2 ]
     [ 0  1  1   3 ]
     [ 0  0  0   0 ]

                                          [ 2𝑥4        ]
The general solution for 𝐴′𝒗 = 𝟎 is 𝒗 = [ −𝑥3 − 3𝑥4  ]
                                          [ 𝑥3         ]
                                          [ 𝑥4         ]
Nullspace

Example:

    [ 2𝑥4        ]        [  0 ]        [  2 ]
𝒗 = [ −𝑥3 − 3𝑥4  ] = 𝑥3  [ −1 ] + 𝑥4  [ −3 ]
    [ 𝑥3         ]        [  1 ]        [  0 ]
    [ 𝑥4         ]        [  0 ]        [  1 ]

So the nullspace of 𝐴 has basis { (0, −1, 1, 0)ᵀ, (2, −3, 0, 1)ᵀ }.
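Assuming NumPy, we can sanity-check this basis: multiplying 𝐴 by each basis vector should give the zero vector.

```python
import numpy as np

A = np.array([[4, 2, 2, -2],
              [3, 1, 1, -3],
              [3, 2, 2,  0]])

# The two basis vectors of the nullspace found by row reduction above
b1 = np.array([0, -1, 1, 0])
b2 = np.array([2, -3, 0, 1])

print(A @ b1)   # [0 0 0]
print(A @ b2)   # [0 0 0]
```

For an orthonormal nullspace basis computed directly, SciPy offers `scipy.linalg.null_space`.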
Determinants
Determinants

The determinant is a number defined for any square matrix.

                        [ 𝑎1,1  ⋯  𝑎1,𝑛 ]
For an n × n matrix 𝐴 = [  ⋮    ⋱   ⋮  ] , we write the determinant as |𝐴| or det(𝐴).
                        [ 𝑎𝑛,1  ⋯  𝑎𝑛,𝑛 ]
Determinants

For 2 × 2 matrices the determinant can be calculated as:

| 𝑎  𝑏 |
| 𝑐  𝑑 | = 𝑎𝑑 − 𝑏𝑐

Examples:

| 1  0 |
| 0  1 | = 1⋅1 − 0⋅0 = 1

| −1   3 |
| −1  −5 | = (−1)⋅(−5) − 3⋅(−1) = 8
Determinants

For 3 × 3 matrices the determinant can be calculated as:

| 𝑎  𝑏  𝑐 |
| 𝑑  𝑒  𝑓 | = 𝑎 | 𝑒  𝑓 | − 𝑏 | 𝑑  𝑓 | + 𝑐 | 𝑑  𝑒 |
| 𝑔  ℎ  𝑖 |     | ℎ  𝑖 |     | 𝑔  𝑖 |     | 𝑔  ℎ |

(A similar definition holds for larger matrices; see the Wikipedia article on the Laplace expansion for more information.)
Determinants

Example:

|  2  1  −1 |
| −2  4   4 | = 2 | 4  4 | − 1 | −2  4 | + (−1) | −2  4 |
|  3  1   1 |     | 1  1 |     |  3  1 |        |  3  1 |

= 2(4⋅1 − 4⋅1) − 1((−2)⋅1 − 4⋅3) + (−1)((−2)⋅1 − 4⋅3)
= 2⋅0 − 1⋅(−14) + (−1)⋅(−14)
= 28
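Assuming NumPy, `np.linalg.det` confirms the worked example (it uses floating-point LU decomposition, hence the rounding):

```python
import numpy as np

A = np.array([[ 2, 1, -1],
              [-2, 4,  4],
              [ 3, 1,  1]])

print(round(np.linalg.det(A)))  # 28
```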
Determinants

Some properties of the determinant (for more, see Wikipedia):

● det(𝐼) = 1

● det(𝐴𝐵) = det(𝐴) det(𝐵)

● If 𝐴′ equals 𝐴 with all elements of one row multiplied by 𝑐, then det(𝐴′) = 𝑐 det(𝐴)

● If 𝐴′ equals 𝐴 with a multiple of one row added to another, then det(𝐴′) = det(𝐴)

● If 𝐴 has two identical rows or columns, then det(𝐴) = 0
Matrix Inverse
Matrix Inverse

For a square (n × n) matrix 𝐴, the inverse matrix 𝐴⁻¹ is the unique matrix such that 𝐴⁻¹𝐴 = 𝐴𝐴⁻¹ = 𝐼 (if it exists).

If 𝐴⁻¹ exists we say that 𝐴 is invertible or nonsingular. If it does not exist, then 𝐴 is singular.

This lets us “undo” multiplication by 𝐴; for example, if we have a system of linear equations 𝐴𝒗 = 𝒃 and 𝐴 is invertible, then multiplying both sides on the left by 𝐴⁻¹ gives 𝒗 = 𝐴⁻¹𝒃.
Matrix Inverse

Special case:

For a 2 × 2 invertible matrix 𝐴 = [ 𝑎  𝑏 ],
                                   [ 𝑐  𝑑 ]

𝐴⁻¹ = (1 / det(𝐴)) [  𝑑  −𝑏 ]
                    [ −𝑐   𝑎 ]

where det(𝐴) = 𝑎𝑑 − 𝑏𝑐.


Matrix Inverse

Example:

To solve the system of linear equations

2𝑥 + 𝑦 = 5
−𝑥 + 2𝑦 = 3

we calculate

[ 𝑥 ]   [  2  1 ]⁻¹ [ 5 ]         [ 2  −1 ] [ 5 ]   [  7/5 ]
[ 𝑦 ] = [ −1  2 ]   [ 3 ] = (1/5) [ 1   2 ] [ 3 ] = [ 11/5 ]
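In practice, assuming NumPy, one calls `np.linalg.solve` rather than forming the inverse explicitly (it is equivalent to computing 𝐴⁻¹𝒃, but more stable and efficient):

```python
import numpy as np

# 2x + y = 5,  -x + 2y = 3  written as  Av = b
A = np.array([[ 2, 1],
              [-1, 2]])
b = np.array([5, 3])

v = np.linalg.solve(A, b)
print(v)  # x = 7/5 = 1.4, y = 11/5 = 2.2
```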
Matrix Inverse

An efficient way to calculate the inverse of larger matrices is to use row reduction with the matrix augmented with the identity matrix on the right:

𝐴 = [  0  −3  −2 ]
    [  1  −4  −2 ]
    [ −3   4   1 ]

[  0  −3  −2 | 1  0  0 ]                        [ 1  0  0 |  4  −5  −2 ]
[  1  −4  −2 | 0  1  0 ] ⇒ (row reduction) ⇒ [ 0  1  0 |  5  −6  −2 ]
[ −3   4   1 | 0  0  1 ]                        [ 0  0  1 | −8   9   3 ]

       [  4  −5  −2 ]
𝐴⁻¹ = [  5  −6  −2 ]
       [ −8   9   3 ]
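Assuming NumPy, `np.linalg.inv` confirms the result of the row reduction above:

```python
import numpy as np

A = np.array([[ 0, -3, -2],
              [ 1, -4, -2],
              [-3,  4,  1]])

A_inv = np.linalg.inv(A)
print(np.round(A_inv))                    # [[4, -5, -2], [5, -6, -2], [-8, 9, 3]] (up to floating point)
print(np.allclose(A @ A_inv, np.eye(3)))  # True
```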
Matrix Inverse

Some properties of the matrix inverse (for more, see Wikipedia):

● det(𝐴⁻¹) = 1 / det(𝐴)

● (𝐴⁻¹)⁻¹ = 𝐴

● (𝑐𝐴)⁻¹ = (1/𝑐) 𝐴⁻¹

● (𝐴𝐵)⁻¹ = 𝐵⁻¹𝐴⁻¹

● det(𝐴) = 0 if and only if 𝐴 has no inverse (is singular)

● 𝐴𝒗 = 𝟎 has nontrivial (non-zero) solutions for 𝒗 if and only if 𝐴 is singular
Eigenvectors and Eigenvalues
Eigenvectors and Eigenvalues

For any square matrix 𝐴, if there exists a (column) vector 𝒗 and constant 𝜆 such that 𝐴𝒗 = 𝜆𝒗, then we say that 𝒗 is an eigenvector of 𝐴 with eigenvalue 𝜆.

Example:

𝐴 = [ 6  −7 ]    𝒗 = [ 1 ]    𝐴𝒗 = [ 6 ] = 6𝒗
    [ 0   3 ]        [ 0 ]         [ 0 ]

𝐴 has eigenvector 𝒗 = (1, 0)ᵀ with eigenvalue 𝜆 = 6.
Eigenvectors and Eigenvalues

Geometrically, eigenvectors represent directions of stretching/squishing space, and their eigenvalues are the ratios of stretching/squishing.

Example: For the linear transformation illustrated on the right, the blue vector is an eigenvector with eigenvalue 1.

Reproduced from https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors#/media/File:Mona_Lisa_eigenvector_grid.png
Eigenvectors and Eigenvalues

The equation 𝐴𝒗 = 𝜆𝒗 is equivalent to 𝐴𝒗 − 𝜆𝒗 = 𝟎.

By linearity this is equivalent to (𝐴 − 𝜆𝐼)𝒗 = 𝟎.

We know that this has non-trivial (non-zero) solutions for 𝒗 if and only if det(𝐴 − 𝜆𝐼) = 0.

So we can find the eigenvalues of 𝐴 by calculating det(𝐴 − 𝜆𝐼) (the characteristic polynomial) and finding the roots of this polynomial in 𝜆.
Eigenvectors and Eigenvalues

Example:

𝐴 = [ 6  −7 ]
    [ 0   3 ]

det(𝐴 − 𝜆𝐼) = | 6−𝜆   −7  |
              |  0   3−𝜆 |
            = (6 − 𝜆)(3 − 𝜆) − (−7)⋅0
            = (𝜆 − 6)(𝜆 − 3)

This is zero iff 𝜆 = 3 or 𝜆 = 6, so these are the eigenvalues of 𝐴.
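Assuming NumPy, `np.linalg.eigvals` computes the eigenvalues numerically and agrees with the characteristic-polynomial calculation:

```python
import numpy as np

A = np.array([[6, -7],
              [0,  3]])

eigenvalues = np.linalg.eigvals(A)
print(np.sort(np.real(eigenvalues)))  # [3. 6.]
```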


Eigenvectors and Eigenvalues

Example:

To find the eigenvectors of 𝐴, we plug in the eigenvalues and solve, as shown here for the eigenvalue 3:

(𝐴 − 3𝐼)𝒗 = 𝟎

𝐴 − 3𝐼 = [ 3  −7 ] has nullspace span((7, 3)ᵀ), so (7, 3)ᵀ (or any multiple) is an
         [ 0   0 ]
eigenvector with eigenvalue 3.
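A quick numerical check of this eigenvector, assuming NumPy:

```python
import numpy as np

A = np.array([[6, -7],
              [0,  3]])
v = np.array([7, 3])   # candidate eigenvector for eigenvalue 3

print(A @ v)           # [21  9], which equals 3 * [7 3]
```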
Diagonalization
Diagonalization

Consider the matrix 𝐴 = [ 1  2 ].
                        [ 2  1 ]

det(𝐴 − 𝜆𝐼) = | 1−𝜆    2  | = (1 − 𝜆)² − 2⋅2 = (𝜆 − 3)(𝜆 + 1)
              |  2   1−𝜆 |

So 𝐴 has eigenvalues 𝜆1 = 3 and 𝜆2 = −1.

Solving for eigenvectors as shown before gives 𝒗𝟏 = (1, 1)ᵀ (eigenvalue 3) and 𝒗𝟐 = (−1, 1)ᵀ (eigenvalue −1).
Diagonalization

Eigenvectors of 𝐴 = [ 1  2 ]:   𝒗𝟏 = (1, 1)ᵀ (𝜆1 = 3)   𝒗𝟐 = (−1, 1)ᵀ (𝜆2 = −1)
                    [ 2  1 ]

Any vector 𝒘 ∈ ℝ² can be written as 𝒘 = 𝑎𝒗𝟏 + 𝑏𝒗𝟐. In other words, 𝒘 has coordinates (𝑎, 𝑏) in the basis 𝒗𝟏, 𝒗𝟐.

Then 𝐴𝒘 = 𝐴(𝑎𝒗𝟏 + 𝑏𝒗𝟐) = 𝑎𝐴𝒗𝟏 + 𝑏𝐴𝒗𝟐 = 3𝑎𝒗𝟏 − 𝑏𝒗𝟐, so 𝐴𝒘 has coordinates (3𝑎, −𝑏) in the basis 𝒗𝟏, 𝒗𝟐.

In general: it is easier to work with a matrix when it is expressed in the coordinates of a basis of eigenvectors.
Diagonalization

Eigenvectors of 𝐴 = [ 1  2 ]:   𝒗𝟏 = (1, 1)ᵀ (𝜆1 = 3)   𝒗𝟐 = (−1, 1)ᵀ (𝜆2 = −1)
                    [ 2  1 ]

𝐴 = 𝑄Λ𝑄⁻¹

where 𝑄 = [𝒗𝟏 𝒗𝟐] = [ 1  −1 ]
                      [ 1   1 ]

and Λ = [ 𝜆1   0 ] = [ 3   0 ]
        [  0  𝜆2 ]   [ 0  −1 ]
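A quick check of this factorization, assuming NumPy (the name `L` stands in for Λ):

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])
Q = np.array([[1, -1],
              [1,  1]])   # columns are the eigenvectors v1, v2
L = np.diag([3, -1])      # Lambda: the eigenvalues on the diagonal

print(np.allclose(Q @ L @ np.linalg.inv(Q), A))  # True -- A = Q Lambda Q^{-1}
```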
Diagonalization

Example:

                                              [ 2  1  0 ]
Q: Find a general formula for 𝑀ⁿ, where 𝑀 = [ 1  2  0 ]  and n is a positive integer.
                                              [ 0  1  2 ]
Diagonalization

         [ 2  1  0 ]
A:  𝑀 = [ 1  2  0 ]  has eigenvectors:
         [ 0  1  2 ]

𝒗𝟏 = (1, 1, 1)ᵀ (𝜆1 = 3),   𝒗𝟐 = (0, 0, 1)ᵀ (𝜆2 = 2),   𝒗𝟑 = (1, −1, 1)ᵀ (𝜆3 = 1)

Let 𝑄 = [𝒗𝟏 𝒗𝟐 𝒗𝟑] = [ 1  0   1 ]        [ 𝜆1  0   0 ]   [ 3  0  0 ]
                        [ 1  0  −1 ]    Λ = [  0  𝜆2  0 ] = [ 0  2  0 ]
                        [ 1  1   1 ]        [  0  0  𝜆3 ]   [ 0  0  1 ]

Then

                [ 1  0   1 ] [ 3  0  0 ] [  1/2   1/2  0 ]
𝑀 = 𝑄Λ𝑄⁻¹ =  [ 1  0  −1 ] [ 0  2  0 ] [ −1     0    1 ]
                [ 1  1   1 ] [ 0  0  1 ] [  1/2  −1/2  0 ]

                   [ 1  0   1 ] [ 3ⁿ  0   0 ] [  1/2   1/2  0 ]
𝑀ⁿ = 𝑄Λⁿ𝑄⁻¹ =  [ 1  0  −1 ] [ 0   2ⁿ  0 ] [ −1     0    1 ]
                   [ 1  1   1 ] [ 0   0   1 ] [  1/2  −1/2  0 ]

           [ 3ⁿ + 1            3ⁿ − 1   0    ]
 = (1/2) ⋅ [ 3ⁿ − 1            3ⁿ + 1   0    ]
           [ −2ⁿ⁺¹ + 3ⁿ + 1   3ⁿ − 1   2ⁿ⁺¹ ]
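Assuming NumPy, we can sanity-check the closed form against repeated multiplication. Note that 𝑀 here is the matrix consistent with the stated eigenvectors and the final formula (middle row [1, 2, 0]):

```python
import numpy as np

M = np.array([[2, 1, 0],
              [1, 2, 0],
              [0, 1, 2]])

def M_power(n):
    """Closed-form M^n from the diagonalization above."""
    return 0.5 * np.array([
        [3**n + 1,               3**n - 1, 0          ],
        [3**n - 1,               3**n + 1, 0          ],
        [-2**(n + 1) + 3**n + 1, 3**n - 1, 2**(n + 1) ],
    ])

for n in range(1, 6):
    print(n, np.allclose(M_power(n), np.linalg.matrix_power(M, n)))  # all True
```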
Bonus: Vector Calculus
Bonus: Vector Calculus

Linear algebra is extremely useful for optimizing functions of multiple variables, which are common in machine learning.

For example, consider the function

𝑓(𝑥, 𝑦) = 𝑥² + 𝑦² − 𝑥

Q: How can we find the values of 𝑥, 𝑦 that minimize 𝑓(𝑥, 𝑦)?

Bonus: Vector Calculus

Plots of 𝑓(𝑥, 𝑦) = 𝑥² + 𝑦² − 𝑥 (3D plot and contour plot):


Bonus: Vector Calculus

Define the gradient

∇𝑓(𝑥, 𝑦) = [ ∂𝑓/∂𝑥 (𝑥, 𝑦) ]
            [ ∂𝑓/∂𝑦 (𝑥, 𝑦) ]

For 𝑓(𝑥, 𝑦) = 𝑥² + 𝑦² − 𝑥, the gradient is

∇𝑓 = [ 2𝑥 − 1 ]
      [ 2𝑦     ]

This is a vector pointing in the direction of maximum increase of the function at each (𝑥, 𝑦).
Bonus: Vector Calculus

To find critical points such as local minima and maxima, solve ∇𝑓 = 𝟎.

In our case, [ 2𝑥 − 1 ] = [ 0 ] gives critical point (𝑥, 𝑦) = (0.5, 0).
             [ 2𝑦     ]   [ 0 ]

The function takes its minimum value at this point.
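In machine learning, such minima are usually found numerically by gradient descent: repeatedly stepping against the gradient. A minimal sketch for this 𝑓 (the step size 0.1, iteration count, and starting point are arbitrary choices):

```python
import numpy as np

def grad_f(p):
    x, y = p
    return np.array([2 * x - 1, 2 * y])  # gradient of f(x, y) = x^2 + y^2 - x

p = np.array([3.0, -2.0])    # arbitrary starting point
for _ in range(200):
    p = p - 0.1 * grad_f(p)  # step against the gradient

print(p)  # approximately [0.5, 0.0], the minimum found analytically above
```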
Bonus: Vector Calculus

It turns out that for twice-differentiable functions we can check whether critical points are minima, maxima, or neither (e.g. saddle points) using the Hessian matrix, which contains the function’s second partial derivatives:

       [ ∂²𝑓/∂𝑥²    ∂²𝑓/∂𝑥∂𝑦 ]   [ 2  0 ]
𝐇𝑓 = [ ∂²𝑓/∂𝑥∂𝑦  ∂²𝑓/∂𝑦²   ] = [ 0  2 ]

Because all of the eigenvalues of 𝐇𝑓 are positive at the critical point (0.5, 0), it is a (local) minimum; this is similar to the second derivative test for univariate functions.
Further Reading
Further Reading

● https://www.mathsisfun.com/algebra/matrix-multiplying.html

● https://textbooks.math.gatech.edu/ila/determinants-definitions-properties.html

● https://www.purplemath.com/modules/mtrxinvr.htm

● https://lpsa.swarthmore.edu/MtrxVibe/EigMat/MatrixEigen.html

● https://yutsumura.com/how-to-diagonalize-a-matrix-step-by-step-explanation/
