
Contents

1 Linear Systems 3

1.1 Systems of Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2 Linear Systems and Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

1.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

1.4.1 Traffic flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

1.4.2 Data fitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2 Euclidean space 17

2.1 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.1.1 Geometry of vector operations . . . . . . . . . . . . . . . . . . . . . . . . 18

2.2 Span . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2.2.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.2.3 Ax = b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.3 Linear independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.3.1 Spans and linear independence . . . . . . . . . . . . . . . . . . . . . . . . 25

3 Matrices 27

3.1 Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3.1.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3.1.2 What does T tell us? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.1.3 Geometry of Linear Transformations . . . . . . . . . . . . . . . . . . . . . 31

3.2 Matrix Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.2.1 The basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.2.2 Transpose of a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

3.2.3 Powers of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

3.3 Inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

3.3.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

3.3.2 Finding A^-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42


4 Subspaces 45

4.1 Introduction to subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.1.1 Span and nullspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.1.2 Kernel and range of linear transformations . . . . . . . . . . . . . . . . . 49

4.2 Basis and Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

4.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

4.2.2 Finding a basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

4.2.3 Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

4.3 Row and Column Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

5 Determinants 61

5.1 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

5.1.1 A Formula for the Determinant . . . . . . . . . . . . . . . . . . . . . . . . 62

5.1.2 Applications of the determinant . . . . . . . . . . . . . . . . . . . . . . . . 65

6.1 The Eigenvalue Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

6.1.1 Motivating examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

6.1.2 Finding Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

6.1.3 Finding Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

6.1.4 Multiplicity of an Eigenvalue . . . . . . . . . . . . . . . . . . . . . . . . . 73

6.1.5 Remarks about Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . 74

6.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

6.3 Change of Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

6.3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

6.3.2 Changing it . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

6.3.3 Change of Basis Matrices between Nonstandard Bases . . . . . . . . . . . 81

6.4 Diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

6.4.1 How to diagonalize . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

6.4.2 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84


Introduction

Linear algebra is ubiquitous in advanced applications of mathematics. Here are some examples of applications that we will discuss during the quarter.

Ch. 1 Scheduling and optimization: Lufthansa pilots, flight attendants, cleaning, and maintenance schedules.

Ch. 2 Video Games and computer animation: quickly update a 3d scene by rotating and scaling

vectors.

Ch. 4 Computer vision: rectify points in stereo vision.

Ch. 6 Google PageRank : determine how to list search results.

1 Linear Systems

1.1 Systems of Linear Equations

Let's motivate linear algebra with an example.

Example 1.1. Xavier is two years older than Yang. Their ages sum to 6.

x + y = 6
x - y = 2.

There are three ways to think about this: substitution, doing algebra with equations, or

graphically. Many of you are probably used to solving this with substitution. For example, if

we solve the second equation for x, we find that

x = y + 2.

Now, we can plug this into the first equation to get

y + 2 + y = 6

which means y = 2, and x = 4. There is nothing wrong with this method, but we'll soon be trying to solve more complicated systems (with many variables and many equations), so we might want to focus on a different method.

We could solve this equation in another way by doing a little equation algebra. For instance,

if we just add the first equation to the second, we get

(x + y) + (x - y) = 6 + 2

or

2x = 8

which immediately tells us that x = 4. We can then solve for y to see y = 2 as well. This is the

process we are going to formalize in this course.

Finally, another beneficial way to do this would be to graph the linear system. The solution

is the intersection of two lines. We know how to graph lines and can just sketch them to find

the solution. We'll come back to this throughout the course.
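The little system above can also be checked numerically; a minimal sketch using the 2x2 elimination formula (Cramer's rule), which is not the method the course develops but is handy for checking answers:

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve the pair a1*x + b1*y = c1 and a2*x + b2*y = c2.

    Uses the 2x2 elimination formula; raises if the lines are
    parallel or identical (no unique solution).
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("no unique solution (parallel or identical lines)")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

# Example 1.1: x + y = 6 and x - y = 2  ->  x = 4, y = 2
print(solve_2x2(1, 1, 6, 1, -1, 2))  # (4.0, 2.0)
```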


What if we wanted to solve a harder question?

Example 1.2. Ling, an engineer, was just hired to orchestrate the upkeep of 7 bridges in the

San Francisco area. She has a budget of $700,000. Older bridges cost more per day to repair.

In fact, the nth oldest bridge costs n thousand dollars per day. They need more work too. She

plans to spend 2 days more on each bridge than the one built just before it. How many days

does she allocate to each bridge?

What if the same scheme were kept, but instead Ling had 100 bridges and $770,000,000? How many days should the first bridge get? How could you program a computer to

solve this? (answer 10 days)

What if Ling had M bridges and 5M (M + 1) thousand dollars? How many days should the

first bridge get? Can a computer even solve this question? (answer 76 days)

We'll come back to this later. The course will be focused on studying these equations and systems of these equations. Let's begin with the terminology.

Notation 1.3. Rather than x and y, we might have many variables, so we use x1 , x2 , . . . , xn to denote different variables. Other ways they might appear are xi , yi , wi , zi . We also let a1 , . . . , an ∈ R denote constants. Here ai ∈ R means ai is an element of the set of real numbers. A linear equation is an equation of the form

a1 x1 + a2 x2 + · · · + an xn = b.

Example 1.5.

2x1 + 3x2 + x3 = 6

is an example of a linear equation with 3 variables.

Definition 1.7. A solution to a linear equation is a set of numbers (s1 , s2 , . . . , sn ) that satisfies

the equation.

For example, if we substitute x1 = 1, x2 = 1, x3 = 1 into the equation

2x1 + 3x2 + x3 = 6,

we get

2 + 3 + 1 = 6,

which shows that (1, 1, 1) is a solution to the equation.

Definition 1.9. The solution set to a linear equation is the set of all solutions to a given

equation.

A linear system is a collection of m linear equations in n variables:

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
. . .
am1 x1 + am2 x2 + · · · + amn xn = bm

Notice the double subscripts; aij means ith down and jth over when arranged in standard form.

For example, the bridge problem of Example 1.2 (with 7 bridges) leads to the system

-x1 + x2 = 2
-x2 + x3 = 2
-x3 + x4 = 2
-x4 + x5 = 2
-x5 + x6 = 2
-x6 + x7 = 2
x1 + 2x2 + 3x3 + 4x4 + 5x5 + 6x6 + 7x7 = 700

Can you tell that x1 = 17? Later we'll introduce a quick way to solve this system.
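The value x1 = 17 can be checked by brute arithmetic; a quick sketch, assuming the schedule reading above (each older bridge gets 2 more days, and the nth oldest bridge costs n thousand dollars per day):

```python
# Check x1 = 17 for Ling's 7 bridges: days increase by 2 per older bridge,
# and the nth oldest bridge costs n thousand dollars per day.
days = [17 + 2 * k for k in range(7)]                 # 17, 19, ..., 29
cost = sum((n + 1) * d for n, d in enumerate(days))   # in thousands of dollars
print(days[0], cost)  # 17 700  -> exactly the $700,000 budget
```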

Example 1.12.

2x1 + 3x2 + x3 = 6
x1 - x2 - x3 = 2
2x1 + x2 - 7x3 = 1

Definition 1.13. A solution to a linear system is a set of numbers (s1 , s2 , . . . , sn ) that simultaneously satisfy all of the equations in the linear system.

Definition 1.14. We say a linear system is consistent if it has at least one solution. We say

it is inconsistent if it has no solutions.


Keep in mind that there can be more than one solution to a linear system; for example the

system

2x1 + 3x2 = 6

4x1 + 6x2 = 12

has infinitely many solutions (because the second equation is just twice the first).

We can also use the graph perspective to analyze solution sets.

In the example of two lines intersecting (solving a system of two equations with two variables),

we know the lines could either intersect in one point (meaning there is exactly one solution), or

intersect nowhere by being parallel (meaning there are no solutions), or be the same line (meaning

there are infinitely many solutions). This is our first indication of the following theorem, which we'll prove soon.

Theorem 1.15. A linear system has no solutions, exactly one solution, or infinitely many

solutions.

Two questions: Why is linearity important here? And, how do we actually find these solutions? We want to develop an algorithm to systematically find the solutions. First, let's start with a few special types of systems.

Example 1.16. Solve the system

x1 + 2x2 + 2x3 = 2
x2 - x3 = 1
x3 = 2.

This triangular system is something we can solve by back-substitution (i.e., start at the bottom and substitute upwards to find all the variables). We start at the bottom, where x3 = 2, and then plug that in to the second equation to find that x2 - 2 = 1, or x2 = 3, and then finally plug these into the first equation to find x1 = -8. Therefore, we find the (unique) solution is

(-8, 3, 2).
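Back-substitution is easy to express in code; a short sketch run on the triangular system above (entries as reconstructed, since the extraction appears to have dropped minus signs):

```python
def back_substitute(U, b):
    """Solve U x = b for an upper-triangular U by back-substitution:
    start at the bottom row and substitute upward."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - known) / U[i][i]
    return x

# The triangular system of Example 1.16 (second equation read as x2 - x3 = 1):
U = [[1, 2, 2],
     [0, 1, -1],
     [0, 0, 1]]
print(back_substitute(U, [2, 1, 2]))  # [-8.0, 3.0, 2.0]
```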


Definition 1.17. The above system is an example of a triangular system, where the linear

system has n equations and n variables, and the first nonzero term of the k-th equation is the

k-th variable.

We can see this in the above example, and it is true in general because back-substitution

will always give us that solution.

x2 - x3 + 2x4 = 5
3x4 = 9.

This system is not triangular (there are more variables than equations), but we can still try to solve it with back-substitution. The last equation tells us that x4 = 3, and if we plug that into the second equation, we see that x2 = x3 - 1 and that we have no information about x3 !

Because there are no restrictions on x3 , it is a free variable and can take any value. We assign it a free parameter (a letter representing any number we want), say x3 = s. Then x2 = s - 1, and we can plug all of this into the first equation to find x1 = 2 + s. In other words, the solutions are

(2 + s, -1 + s, s, 3)

where s ∈ R.

Definition 1.20. The above system is an example of a system in echelon form. We say a

system is in echelon form if

(a) Every variable is the leading variable (the first nonzero term) of at most one equation.

(b) The system is organized in a stair-step pattern, meaning the leading variables move down and to the right.

2x1 - 4x2 = 11
x1 - x2 = 5.

Theorem 1.22. A system in echelon form always has at least one solution.

This is true for the same reason as the analogous fact about triangular systems; we can find

the solutions by back-substitution.


1.2 Linear Systems and Matrices

In the last section, we saw that if a system was in echelon form, we could solve it with back

substitution. In this section, we'll figure out how to convert systems into echelon form to easily find solutions. To do that, we're going to introduce matrices.

Definition 1.23. A matrix is a rectangular array of numbers.

Example 1.24.

[1 2]       [3 1 2]
[3 4]  and  [1 4 7]

are examples of matrices.

Notation 1.25. We will use both hard [] and soft () brackets for matrices. They both mean

the same thing.

Each linear system corresponds to two matrices.

Definition 1.26. The linear system

a21 x1 + a22 x2 + + a2n xn = b2

..

.

am1 x1 + am2 x2 + + amn xn = bm .

a11 a12 ... a1n

a21 a22 ... a2n

.

.. ..

. .

am1 am2 ... amn

It has augmented matrix

a11 a12 ... a1n b1

a21 a22 ... a2n b2

.

.. ..

. .

am1 am2 ... amn bm

For the first few chapters, we'll focus on the augmented matrix only. Note that the book

leaves out the vertical line separating the a and b terms.

Example 1.27. The linear system

4x1 + 2x2 + x3 = 2
x1 - x2 + 4x3 = 1

has augmented matrix

[4  2 1 | 2]
[1 -1 4 | 1].


We can use the matrix to systematically find solutions to systems of equations. First, we'll introduce echelon and reduced echelon forms, and then elementary operations.

A = [10  7 20  4]
    [ 3  2  6  1]
    [ 1 -4  2 -9]

Definition 1.28. A matrix is in echelon form if

(a) the leading terms (the first nonzero entries of each row) move down and to the right in a

stair step pattern.

(b) any rows of zeros are grouped at the bottom of the matrix.

echelon(A) = [1 7/10 2 2/5]
             [0  1   0  2 ]
             [0  0   0  0 ]

0 0 0 0

Definition 1.29. A matrix is in reduced echelon form if, in addition to (a) and (b) above,

(c) each leading term is a 1 and is the only nonzero term in its column.

reduced echelon(A) = [1 0 2 -1]
                     [0 1 0  2]
                     [0 0 0  0]

Definition 1.30. Elementary operations are operations that we can do to a linear system that don't change its solution set. There are three of them:

1. Interchange two equations (rows).
2. Multiply an equation (row) by a nonzero constant.
3. Add a multiple of one equation (row) to another.

Now, we will outline an algorithm for converting a matrix into an equivalent matrix in reduced

echelon form, and see how that gives us the solutions to the corresponding linear system. We

will outline the process with an example.

-2x1 - 4x2 - 6x3 = -2
-x1 - x2 - 2x3 = 1
x1 + 3x2 + 5x3 = 2.


1. Write the linear system as the corresponding augmented matrix.

In the example,

[-2 -4 -6 | -2]
[-1 -1 -2 |  1]
[ 1  3  5 |  2]

2. If necessary, interchange two rows to get a nonzero entry in the upper left corner. Multiply

the first row by a constant to make that entry a 1.

In the example, this means multiplying the first row by -1/2,

[ 1  2  3 | 1]
[-1 -1 -2 | 1]
[ 1  3  5 | 2]

3. Add multiples of the first row to the other rows to zero out the rest of the column.

In the example, this corresponds to adding R1 to R2 and adding -R1 to R3,

[1 2 3 | 1]
[0 1 1 | 2]
[0 1 2 | 1]

4. Move over one column and down one row and repeat.

In this example, we move to the entry in the second row and second column. It's already a 1, so we proceed to step 3 to zero out the rest of the second column by adding -2R2 to the first row and -R2 to R3,

[1 0 1 | -3]
[0 1 1 |  2]
[0 0 1 | -1]

Now, we move down and over again to the third row and third column. It's also a 1, so we proceed to step 3 and zero out the rest of the column,

[1 0 0 | -2]
[0 1 0 |  3]
[0 0 1 | -1]

In the example, if we translate back to a system of equations, we see that

x1 = -2
x2 = 3
x3 = -1

so the solution is (-2, 3, -1). It's always a good idea to check this!


This process will work for any system.

[-1 -3 -9 | -7]
[ 1  2  6 |  5]
[ 2  7 21 | 16]

-1R1

[1 3  9 |  7]
[1 2  6 |  5]
[2 7 21 | 16]

-1R1 + R2
-2R1 + R3

[1  3  9 |  7]
[0 -1 -3 | -2]
[0  1  3 |  2]

-1R2

[1 3 9 | 7]
[0 1 3 | 2]
[0 1 3 | 2]

-1R2 + R3

[1 3 9 | 7]
[0 1 3 | 2]
[0 0 0 | 0]

Reduced echelon form:

[1 0 0 | 1]
[0 1 3 | 2]
[0 0 0 | 0]
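The four steps of the algorithm can be sketched as a short routine; a minimal version using exact fractions to avoid round-off (an illustration of the method, not the textbook's code), run on the example just worked:

```python
from fractions import Fraction

def rref(M):
    """Convert a matrix (list of rows) to reduced echelon form by the
    steps in the text: find a pivot, scale it to 1, zero out the rest
    of its column, then move down one row and over to the next column."""
    A = [[Fraction(x) for x in row] for row in M]
    nrows, ncols = len(A), len(A[0])
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, nrows) if A[i][c] != 0), None)
        if pivot is None:
            continue                          # no pivot here; move right
        A[r], A[pivot] = A[pivot], A[r]       # interchange rows
        A[r] = [x / A[r][c] for x in A[r]]    # scale the pivot to 1
        for i in range(nrows):                # clear the rest of the column
            if i != r and A[i][c] != 0:
                A[i] = [a - A[i][c] * p for a, p in zip(A[i], A[r])]
        r += 1
        if r == nrows:
            break
    return A

# The worked example above, reducing to [1 0 0 | 1], [0 1 3 | 2], [0 0 0 | 0]:
for row in rref([[-1, -3, -9, -7], [1, 2, 6, 5], [2, 7, 21, 16]]):
    print([str(x) for x in row])
```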

Example 1.33. No solutions.

[10 3 | -5]
[ 3 1 |  3]
[ 5 2 |  4]

1/10R1

[1 3/10 | -1/2]
[3  1   |  3  ]
[5  2   |  4  ]

-3R1 + R2
-5R1 + R3

[1 3/10 | -1/2]
[0 1/10 |  9/2]
[0 1/2  | 13/2]

10R2

[1 3/10 | -1/2]
[0  1   |  45 ]
[0 1/2  | 13/2]

-1/2R2 + R3

[1 3/10 | -1/2]
[0  1   |  45 ]
[0  0   | -16 ]

-1/16R3

[1 3/10 | -1/2]
[0  1   |  45 ]
[0  0   |  1  ]

Reduced echelon form:

[1 0 | 0]
[0 1 | 0]
[0 0 | 1]

Let's make a few observations about systems in reduced echelon form.

1. In reduced echelon form, if the last row looks like [0 . . . 0 | 1], that translates to the

equation 0 = 1, which is FALSE, so the system has NO SOLUTIONS.

2. If there is not a row like that, then either each column has a leading one, meaning there

is exactly one solution, or there is at least one column without a leading one, meaning it

is free and can take any value, so there are infinitely many solutions.

3. Combining the two statements above, we've actually proven Theorem 1.15, because these are the only possibilities in reduced echelon form.

4. If there are more variables than equations (more columns than rows in the matrix), then

each column CANNOT have a leading one. This means the system has either no solutions

or infinitely many, but it cannot have exactly one.

Now we introduce a special type of linear system that ALWAYS has solutions.

Definition 1.34. A linear system is homogeneous if all of the constant terms are equal to 0,

i.e. it looks like

a11 x1 + a12 x2 + · · · + a1n xn = 0
a21 x1 + a22 x2 + · · · + a2n xn = 0
. . .
am1 x1 + am2 x2 + · · · + amn xn = 0.

Note that these systems always have a solution, namely (0, 0, . . . , 0), called the trivial solution. There may be (infinitely many) other solutions, but we at least know the system is consistent.
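A tiny check of this fact, using the coefficients of Example 1.12 with the right-hand sides set to 0 (signs as read above; the exact values don't matter, which is the point):

```python
# Any homogeneous linear system is satisfied by the trivial solution x = 0.
# Coefficient rows borrowed from Example 1.12, homogenized.
rows = [[2, 3, 1], [1, -1, -1], [2, 1, -7]]
x = [0, 0, 0]
all_zero = all(sum(c * xi for c, xi in zip(row, x)) == 0 for row in rows)
print(all_zero)  # True
```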


1.3

SKIP

1.4 Applications

1.4.1 Traffic flow

Arcata, on the northern coast of California, is a small college town with a central plaza (Figure

1). Figure 2 shows the streets surrounding and adjacent to the town's central plaza. As indicated by the arrows, all streets in the vicinity of the plaza are one-way.

Figure 2 Traffic volumes around the Arcata plaza. Traffic flows north and south on G and H

streets, respectively, and east and west on 8th and 9th streets, respectively. The number of

cars flowing on and off the plaza during a typical 15-minute period on a Saturday morning is

also shown. Our goal is to find x1 , x2 , x3 , and x4 , the volume of traffic along each side of the

plaza. The four intersections are labeled A, B, C, and D. At each intersection, the number of

cars entering the intersection must equal the number leaving. For example, the number of cars

entering A is 100 + x1 and the number exiting is 20 + x2 . Since these must be equal, we end up

with the equation

A : 100 + x1 = 20 + x2

Applying the same reasoning to intersections B, C, and D, we arrive at three more equations,

B : x4 + 30 = x1 + 100
C : x2 + 25 = x3 + 95
D : x3 + 75 = x4 + 15.

Rewriting the equations in the usual form, we obtain the system:

x1 - x2 = -80
-x1 + x4 = 70
x2 - x3 = 70
x3 - x4 = -60


To solve the system, we populate an augmented matrix and transform to echelon form.

[ 1 -1  0  0 | -80]
[-1  0  0  1 |  70]
[ 0  1 -1  0 |  70]
[ 0  0  1 -1 | -60]

1R1 + R2

[1 -1  0  0 | -80]
[0 -1  0  1 | -10]
[0  1 -1  0 |  70]
[0  0  1 -1 | -60]

1R2 + R3

[1 -1  0  0 | -80]
[0 -1  0  1 | -10]
[0  0 -1  1 |  60]
[0  0  1 -1 | -60]

-1R3

[1 -1  0  0 | -80]
[0 -1  0  1 | -10]
[0  0  1 -1 | -60]
[0  0  1 -1 | -60]

-1R3 + R4

[1 -1  0  0 | -80]
[0 -1  0  1 | -10]
[0  0  1 -1 | -60]
[0  0  0  0 |   0]

Reduced echelon form:

[1 0 0 -1 | -70]
[0 1 0 -1 |  10]
[0 0 1 -1 | -60]
[0 0 0  0 |   0]

The last coordinate x4 = s1 is free. Back substitution yields the general solution

x1 = -70 + s1 ,  x2 = 10 + s1 ,  x3 = -60 + s1 ,  x4 = s1 ,

where s1 is a free parameter. A moment's thought reveals why it makes sense that this system has infinitely many solutions. There can be an arbitrary number of cars simply circling the plaza, perhaps looking for a parking space. Note also that since each of x1 , x2 , x3 , and x4 must be nonnegative, it follows that the parameter s1 ≥ 70. The analysis performed here can be carried

over to much more complex traffic questions, or to other similar settings, such as computer

networks.
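The general solution can be checked against the intersection equations directly; a small script, with signs read off from the worked solution (the extraction appears to have dropped minus signs, and intersection B is taken to relate x4 and x1):

```python
# Check the general solution x = (s - 70, s + 10, s - 60, s) against the
# four conservation equations (cars in = cars out at each intersection).
def flows(s):
    return s - 70, s + 10, s - 60, s

for s in (70, 100, 250):              # any s >= 70 keeps every flow nonnegative
    x1, x2, x3, x4 = flows(s)
    assert 100 + x1 == 20 + x2        # intersection A
    assert x4 + 30 == x1 + 100        # intersection B
    assert x2 + 25 == x3 + 95         # intersection C
    assert x3 + 75 == x4 + 15         # intersection D
    assert min(x1, x2, x3, x4) >= 0
print("conservation holds at every sampled s >= 70")
```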


1.4.2 Data fitting

Given a set of ordered pairs we want to find a curve that goes through all of them.

Example 1.35. Find a quadratic through (0, 1), (1, 2) and (2, 7).

y = a1 x^2 + a2 x + a3 .

So we have a system:

a3 = 1

a1 + a2 + a3 = 2

4a1 + 2a2 + a3 = 7

[0 0 1 | 1]
[1 1 1 | 2]
[4 2 1 | 7]

R1 <-> R2

[1 1 1 | 2]
[0 0 1 | 1]
[4 2 1 | 7]

-4R1 + R3

[1  1  1 |  2]
[0  0  1 |  1]
[0 -2 -3 | -1]

R2 <-> R3

[1  1  1 |  2]
[0 -2 -3 | -1]
[0  0  1 |  1]

-1/2R2

[1 1  1  |  2 ]
[0 1 3/2 | 1/2]
[0 0  1  |  1 ]

Reduced echelon form:

[1 0 0 |  2]
[0 1 0 | -1]
[0 0 1 |  1]

So the quadratic is y = 2x^2 - x + 1.
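Data fitting like this reduces to a small linear solve; a sketch using Gaussian elimination with partial pivoting (the function name is illustrative, not from the text):

```python
def fit_parabola(points):
    """Fit y = a1*x^2 + a2*x + a3 through three points by solving the
    3x3 linear system with Gaussian elimination (partial pivoting)."""
    A = [[x * x, x, 1.0, float(y)] for x, y in points]   # augmented rows
    n = 3
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))  # pick largest pivot
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]                # scale pivot to 1
        for r in range(n):
            if r != c:                                    # clear the column
                A[r] = [v - A[r][c] * w for v, w in zip(A[r], A[c])]
    return [row[3] for row in A]

# Example 1.35: the parabola through (0,1), (1,2), (2,7) is y = 2x^2 - x + 1
print(fit_parabola([(0, 1), (1, 2), (2, 7)]))  # [2.0, -1.0, 1.0]
```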

Example 1.36. Find the equation of the plane through (1, 1, 0), (1, 2, 1) and (0, 0, 1). A plane has equation

a1 x1 + a2 x2 + a3 x3 - b = 0.

Plugging in the three points gives the augmented matrix

[1 1 0 -1 | 0]
[1 2 1 -1 | 0]
[0 0 1 -1 | 0]


-1R1 + R2

[1 1 0 -1 | 0]
[0 1 1  0 | 0]
[0 0 1 -1 | 0]

Reduced echelon form:

[1 0 0 -2 | 0]
[0 1 0  1 | 0]
[0 0 1 -1 | 0]

a1 = 2b, a2 = -b, a3 = b.

So we have infinitely many solutions. But what is one solution? Well, set b equal to anything nonzero. For instance, if b = 1 then we have

2x1 - x2 + x3 - 1 = 0.


2 Euclidean space

2.1 Vectors

Definition 2.1. A vector is an ordered list of real numbers (called components) arranged in a vertical column or in angle brackets ⟨ ⟩.

Example 2.2. A general vector in Rn , a specific vector in column form, and the same vector written with angle brackets:

(u1 , u2 , . . . , un ),   (1, 2, 3),   ⟨1, 2, 3⟩.

We need to know where vectors live.

Notation 2.3. We write a ∈ R to mean that a is a real number. Often we call a a scalar. Vectors, on the other hand, live in higher-dimensional space. The set of all n-dimensional vectors is denoted Rn ; here R is the set of all real numbers. We write ~u ∈ Rn .

Definition 2.4. Let ~u, ~v ∈ Rn , say ~u = (u1 , u2 , . . . , un ) and ~v = (v1 , v2 , . . . , vn ) written as columns. Then

~u = ~v  if and only if  ui = vi for all 1 ≤ i ≤ n,

~u + ~v = (u1 + v1 , u2 + v2 , . . . , un + vn ),

c~u = (cu1 , cu2 , . . . , cun ).

3 3 11

Example 2.5. 2 + 2 2 = 4

0 0 6

Vectors have all the nice properties that we associate with scalars, like c(~u - ~v ) + ~w = ~w + c~u - c~v . See the book for a list and proof of all of these properties. There are other vector operations; you might remember the dot product from multivariable calculus.

Example 2.7.

(3, 2, 1) · (3, 2, 1) = 9 + 4 + 1 = 14.


Notation 2.8. The dot product allows us to write linear equations compactly. Fix b ∈ R. Call ~c = (c1 , . . . , cn ) and ~x = (x1 , . . . , xn ). Then

c1 x1 + · · · + cn xn = b   is the same as   ~c · ~x = b.
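This compact notation is one line of code; for instance, the dot product recovers the earlier linear equation 2x1 + 3x2 + x3 = 6 at the point (1, 1, 1):

```python
# A linear equation c1*x1 + ... + cn*xn = b is a single dot product c . x = b.
def dot(c, x):
    return sum(ci * xi for ci, xi in zip(c, x))

print(dot([2, 3, 1], [1, 1, 1]))  # 6: so (1, 1, 1) satisfies the equation
print(dot([3, 2, 1], [3, 2, 1]))  # 14, as in Example 2.7
```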

We are interested in something that looks similar to a dot product, but involves summing

scalars times vectors.

Definition 2.9. Given c1 , . . . , cm ∈ R, a linear combination of vectors ~u1 , . . . , ~um is the vector c1 ~u1 + · · · + cm ~um .

Notation 2.10. Given vectors ~u1 , . . . , ~um it is convenient to write the matrix with columns corresponding to these vectors as [~u1 · · · ~um ].

We won't elaborate on this much now, but this notation allows us to concisely describe a linear combination.

Fact 2.11. Fix ~b = (b1 , . . . , bn ). Call ~c = (c1 , . . . , cm ) and A = [~u1 · · · ~um ] with ~ui ∈ Rn . We will see later that a linear combination is the same as A~c, and that a system of equations is the same as A~c = ~b.

General solutions can be written in terms of linear combinations of vectors. This is called the vector form of the general solution. For the traffic-flow example, the general solution

x1 = -70 + s1 ,  x2 = 10 + s1 ,  x3 = -60 + s1 ,  x4 = s1

becomes

~x = (-70, 10, -60, 0) + s1 (1, 1, 1, 1).

We'll see that there are several other ways to represent solutions. There are visual ways to understand the vector operations:


Application 2.13. Vector graphics such as PDF and CAD use vectors to make scalable

images. Rather than recording image pixel data, these record everything as either a line, circle,

ellipse, or curve with start and end points. Rescaling is then multiplying by a constant, so vector graphics can be scaled infinitely! Basically the file has thousands of instructions that read like:

Points in a vector graphic are described by a vector giving their position from the origin.

Connect the points together with a straight line (or curve) of a certain thickness.

Advantage: The size of the file does not change if the image needs to be larger; the same instructions will make the image as large as you like, and yet the line will always remain smooth and clear. We'll see later that rotations, reflections and translations are very easy with vectors.

Simple geometric shapes such as letters, autoshapes and logos are ideal for storing as vector

images. Some images are not suitable for storing as a vector image. For example, a detailed

photograph would need so many vector instructions to recreate it, that it is better to store it

as, say, a BITMAP.


Example 2.14. Consider an arrow image stored as a vector graphic consisting of three lines. Each line is stored as a vector and a base point for the vector. Suppose that the base of the arrow is at the origin and the arrow has length 4.

Describe the vector and base point of the three lines in the arrow. (There are many choices that could work.) Then, rescale it to be 4 times larger in size. What would make it point left instead of right?

Solution: The line from base to tip is rooted at (0, 0) with the vector (4, 0). The two points of the arrowhead are the vectors (-1, 1) and (-1, -1), based at (4, 0). To rescale, we multiply everything by 4. Multiplying everything by -1 would change the direction.

2.2 Span

Think of vectors as building blocks: the span is the set of all points we can build with our vectors. Better yet, think of vectors like your favorite smoothie ingredients. Imagine you want to blend them together to hit a certain nutrient profile (fat, protein, carbs, iron, calcium, . . .). Studying the span asks which profiles we can make with the vectors and which ones we cannot. We expand on this here with an example in R2 .

2.2.1 Introduction

Question 2.15. How many vectors ~u1 , ~u2 , . . . , ~um does it take so that for any ~x = (x1 , x2 , . . . , xn ) in Rn we can find c1 , c2 , . . . , cm ∈ R so that c1 ~u1 + c2 ~u2 + · · · + cm ~um = ~x?

More succinctly, how many vectors do we need to express every point in Rn as a linear combination?

Example 2.16. Let's look at R2 . Imagine we have vectors ~u = (1, 2) and ~v = (1, 0) and we want to use them to get to the point (2, 6). With ~v we can adjust our horizontal position as we please. So, we first use ~u to get to height 6. This is at 3~u = (3, 6). Next, we need to shift back by 1 in the x-coordinate. This requires ~v . So

3~u - ~v = (2, 6).

In this manner we can express any point in R2 using ~u and ~v . To see this, take (x, y) ∈ R2 and solve

c1 ~u + c2 ~v = (x, y).

This is equivalent to

c1 + c2 = x
2c1 = y.

Example 2.17. What if we take ~u1 = (2, 1, 1) and ~u2 = (1, 2, 3) and ask where we can go with these? We can go to any point

x1 ~u1 + x2 ~u2 = (2x1 + x2 , x1 + 2x2 , x1 + 3x2 ).

Notice that we cannot form (1, 0, 0) with these:

[2 1 | 1]       [1 0 | 0]
[1 2 | 0]   →   [0 1 | 0]
[1 3 | 0]       [0 0 | 1]

The last row reads 0 = 1, so we can't make everything with just these two vectors.
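The failure in Example 2.17 can be confirmed with two lines of arithmetic; solving the last two equations and then testing the first:

```python
# Is (1, 0, 0) in span{(2, 1, 1), (1, 2, 3)}?  The last two coordinates give
# x1 + 2*x2 = 0 and x1 + 3*x2 = 0; subtracting them forces x2 = 0, then x1 = 0.
x2 = 0 / (3 - 2)
x1 = -2 * x2
print(2 * x1 + x2 == 1)  # False: the first coordinate gives 0, not 1
```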

Given a set of vectors in Rn we are interested in the set of all points we can make with linear

combinations. This is called the span.

Definition 2.18. The span of vectors ~u1 , . . . , ~um ∈ Rn is the set of all linear combinations c1 ~u1 + · · · + cm ~um where c1 , . . . , cm ∈ R.

2.2.2 Properties

Whether or not a linear system has a solution tells us whether a vector belongs to the span.

Theorem 2.19. Let ~u1 , . . . , ~um ∈ Rn . Then ~v ∈ span{~u1 , . . . , ~um } if and only if the linear system with augmented matrix [~u1 · · · ~um | ~v ] has at least one solution.


Example 2.21. Is span{(1, 1, 2), (1, 0, 1), (1, 0, 0)} = R3 ?

Call these vectors ~u1 , ~u2 , ~u3 . Given (a, b, c) ∈ R3 we want to find x1 , x2 , x3 so that

x1 ~u1 + x2 ~u2 + x3 ~u3 = (a, b, c).

This gives the matrix

[1 1 1 | a]       [1 0 0 | b        ]
[1 0 0 | b]   →   [0 1 0 | -2b + c  ]
[2 1 0 | c]       [0 0 1 | a + b - c]

Yes!

Example 2.22. Is span{(1, 1, 1), (3, 1, -1), (2, 1, 0)} = R3 ?

Let's set up a matrix and reduce.

[1  3 2 | a]       [1 0 1/2 | -1/2 a + 3/2 b]
[1  1 1 | b]   →   [0 1 1/2 |  1/2 a - 1/2 b]
[1 -1 0 | c]       [0 0  0  |  a - 2b + c   ]

No! The span is the plane a - 2b + c = 0.

Theorem 2.23. If m < n then ~u1 , . . . , ~um ∈ Rn do not span Rn . If m ≥ n then they may or may not span Rn .

Proof idea for m < n: Row reduce to get a row of all zeros, keeping track of the operations. Augment, setting the all-zero row equal to 1, then reverse the operations. The new augmented vector is not in the span.

For m ≥ n: The vectors ~ei = (0, . . . , 0, 1, 0, . . . , 0) (1 in the ith place) do span Rn . But if we take 2n copies of just ~e1 , say, this does not span Rn .

2.2.3 Ax = b

With spans we are interested in solving vector equations. It will be useful to have a compact

way to express these.

Definition 2.24. Let ~a1 , . . . , ~am ∈ Rn be vectors. Set A = [~a1 · · · ~am ]. Let ~x = (x1 , . . . , xm ) with xi ∈ R. We define

A~x = x1 ~a1 + · · · + xm ~am .

Example 2.25. Kyle pays very close attention to what he eats post-workout, and his macronutrient needs vary greatly from week to week. He typically blends a smoothie with yogurt, banana, avocado and tofu. These have macronutrient ratios per 100g as

          Y   B   A   T
Protein   9   1   2   5
Fat       9   1  15   3
Carb      4  22   1   2

If A is the nutrition matrix, then ~x = (x1 , x2 , x3 , x4 ) is the recipe Kyle follows, and A~x is the resulting smoothie.

Note that A needs to have the same number of columns as ~x has rows.

Example 2.26. Let A be the matrix with rows (0, 0, 1), (1, 1, 1), (4, 2, 1). The system

A~x = (1, 2, 7)

corresponds to the augmented matrix

[0 0 1 | 1]
[1 1 1 | 2]
[4 2 1 | 7].

We are interested in detecting how a set of vectors relate to one another, in particular whether we can get by with fewer vectors. We start with an abstract definition, then give an example.


Definition 2.27. A set of vectors ~u1 , . . . , ~um Rn is linearly independent if the only solution

to

x1 ~u1 + + xm ~um = ~0

is the trivial solution ~x = ~0 (x1 = 0, x2 = 0, . . . , xm = 0). Otherwise the set is linearly dependent. Another way to say this is that the homogeneous equation A~x = ~0 has only one solution.

Example 2.28. Is {(1, 0, 0), (1, 1, 0), (1, 1, 2)} linearly independent? That is, does

x1 (1, 0, 0) + x2 (1, 1, 0) + x3 (1, 1, 2) = (0, 0, 0)

have only the trivial solution? We reduce the corresponding augmented matrix to echelon form,

[1 1 1 | 0]       [1 0 0 | 0]
[0 1 1 | 0]   →   [0 1 0 | 0]
[0 0 2 | 0]       [0 0 1 | 0]

So our only solution is x1 = x2 = x3 = 0; therefore the set of vectors is linearly independent.
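The independence test generalizes; a small sketch that forward-eliminates the matrix whose columns are the given vectors and checks that every column gets a pivot (function name illustrative, not from the text):

```python
from fractions import Fraction

def is_independent(vectors):
    """Return True if the given vectors (as tuples of components) are
    linearly independent: eliminate the matrix whose COLUMNS are the
    vectors and check that every column receives a pivot."""
    rows = [list(map(Fraction, r)) for r in zip(*vectors)]  # columns -> matrix
    m = len(vectors)
    rank = 0
    for c in range(m):
        piv = next((i for i in range(rank, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue                              # no pivot in this column
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for i in range(rank + 1, len(rows)):      # clear entries below the pivot
            f = rows[i][c] / rows[rank][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank == m

print(is_independent([(1, 0, 0), (1, 1, 0), (1, 1, 2)]))  # True (Example 2.28)
print(is_independent([(1, 2), (2, 4)]))                   # False: multiples
```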

Notation 2.29. Notice that when row reducing matrices for homogeneous equations the augmented column doesn't change. So it's okay to omit it and add it back at the end.

This is a big deal in computing efficiency, but we will look at a more pragmatic example first.

Example 2.30. Kyle pays very close attention to what he eats post-workout, and his macronutrient needs vary greatly from week to week. He typically blends a smoothie with yogurt, banana, avocado and tofu. These have macronutrient ratios per 100g as

          Y   B   A   T
Protein   9   1   2   5
Fat       9   1  15   3
Carb      4  22   1   2

Kyle is curious whether he can cut down on ingredients and still hit any macro-goal he might

have.


Say that (y, b, a, t) is the amount of each ingredient he uses. He reduces this (homogeneous) matrix:

[9  1  2 5 | 0]       [1 0 0  745/1261 | 0]
[9  1 15 3 | 0]   →   [0 1 0 -12/1261  | 0]
[4 22  1 2 | 0]       [0 0 1 -2/13     | 0]

Letting s = t, this has solution set

(-745/1261 s, 12/1261 s, 2/13 s, s).

Set s = 1261 and we have

Y = (1/745)(12B + 194A + 1261T ).

So for any unit of yogurt, we can instead use the blend described above. This means we can eliminate yogurt from the ingredient list and still make smoothies with the same macro-decomposition!

We further observe that if we were to similarly try to eliminate bananas, we would be left

with the equation

B = (1/12)(745Y - 194A - 1261T ).

The negative coefficients on the right hand side make this substitution impossible in practice.

The same is true for avocados and tofu.

Theorem 2.31. If n < m, then ~u1 , . . . , ~um ∈ Rn are linearly dependent. If m ≤ n, then they might be linearly independent.

Proof:

x1 ~u1 + · · · + xm ~um = ~0

is a homogeneous linear system of n equations and m unknowns. Since m > n, there must be infinitely many solutions, and therefore a nontrivial solution, meaning the set of vectors is linearly dependent.

Span and linear independence, while different concepts, are fundamentally related. If we consider a set of two vectors {~u1, ~u2}, the set is linearly dependent precisely when ~u1 is a scalar multiple of ~u2 or vice-versa. But this just corresponds to ~u1 ∈ span{~u2}. This connection was hinted at in our macronutrient example, and we now generalize it.

Theorem 2.32. Let {~u1 , . . . , ~um } be a set of vectors in Rn . The set is linearly dependent if

and only if one of the vectors is in the span of the other vectors.


Example 2.33. Is the set of vectors

  ~u1 = [ 1 ]   ~u2 = [ 1 ]   ~u3 = [ 0 ]   ~u4 = [ −2 ]
        [ 0 ]         [ 2 ]         [ 1 ]         [  0 ]
        [ 0 ]         [ 3 ]         [ 1 ]         [ −2 ]
        [ 1 ]         [ 0 ]         [ 1 ]         [  4 ]

linearly dependent?

Answering this question corresponds to solving a homogeneous system, and to save time we may omit the column of zeros in our augmented matrix. Converting the matrix to reduced echelon form,

  [ 1  1  0  −2 ]      [ 1  0  0   0 ]
  [ 0  2  1   0 ]  →   [ 0  1  0  −2 ]
  [ 0  3  1  −2 ]      [ 0  0  1   4 ]
  [ 1  0  1   4 ]      [ 0  0  0   0 ]

Letting x4 = t, the solution set to x1 ~u1 + x2 ~u2 + x3 ~u3 + x4 ~u4 = ~0 is (0, 2t, −4t, t). In particular, this tells us that ~u1 is not in span{~u2, ~u3, ~u4}. This example shows an important caveat of the above theorem: a set being linearly dependent only implies that some vector is in the span of the others, but this need not be true for every vector.

Theorem 2.34. Let {~u1, . . . , ~um} be a set of vectors in R^n. Then {~u1, . . . , ~um} is linearly independent if and only if the system [~u1 · · · ~um : ~v] has at most one solution for every ~v ∈ R^n.

So geometrically speaking, if it's possible to reach a vector ~v with a linearly independent set {~u1, . . . , ~um}, then the displacement along ~u1, . . . , ~um is always the same.

When m = n, we have a much stronger statement.

Theorem 2.35. Let {~u1 , . . . , ~un } be a set of vectors in Rn . The following are equivalent:


3 Matrices

Let's review the story so far. We have been interested in solving systems of linear equations A~x = ~b, where we try to solve for ~x in terms of A and ~b. We studied linear independence and span to understand better when we have solutions. To do this we introduced a bit of matrix algebra, explaining how a matrix acts on a vector. Here we look at this from a more geometric viewpoint. Let's start with an example.

Example 3.1. You have a picture of a cat that is winking, and want it to cock its head to the

side in a cute way. How do you animate this?

Example 3.2. For an art project you are going to film the shadows of two statues over the

course of a 1 hour period. You want the statues to give each other a high-5 at the end of the

hour.

You need to calculate how the shadows move across the (say perfectly flat) pavement. The shadow cast by the sun at any given instant can be thought of as a function from R^3 to R^2. We are interested in characterizing properties of this function.


3.1 Linear Transformations

When we take an n × m matrix A and multiply it by a vector ~v ∈ R^m we get a linear combination of the column vectors in A. This new vector, A~v, lives in R^n. So, we can think of A as a function that takes vectors in R^m to vectors in R^n.

Example 3.4. Let

  A = [ 1 2 1 ]        [ x1 ]        [ 2 ]
      [ 0 1 3 ] , ~x = [ x2 ] , ~b = [ 0 ] .
      [ 1 2 5 ]        [ x3 ]        [ 7 ]

Then A~x = ~b reads

  [ 1 2 1 ] [ x1 ]   [ 2 ]
  [ 0 1 3 ] [ x2 ] = [ 0 ] .
  [ 1 2 5 ] [ x3 ]   [ 7 ]

This is the same as

  [ 1 2 1 : 2 ]
  [ 0 1 3 : 0 ] .
  [ 1 2 5 : 7 ]

Fact 3.5. A nice general form of the above is if A = [~a1 · · · ~am] then

  A~x = x1 ~a1 + · · · + xm ~am.

Notation 3.6. Say that f is a function that takes vectors ~x ∈ R^m and outputs values f(~x) ∈ R^n. We will write f : R^m → R^n. For example if f(x) = (x, x + 1) then f : R → R^2. Often we call f a map.

Definition 3.7. The domain of a function is the set of possible inputs. The image of an input is the result of applying the function to it. We call the range the set of all images. The codomain is the space where the images live.

Notation 3.8. We denote the domain and codomain of a function f by Dom(f) and Cod(f). Also, we let Range(f) denote the range.

3.1.1 Basics

Definition 3.9. A function T : R^m → R^n is a linear transformation if for all vectors ~u, ~v ∈ R^m and scalars r we have

  1. T(~u + ~v) = T(~u) + T(~v)
  2. T(r~u) = r T(~u).

Fact 3.10. T(~0) = ~0. This is because of linearity: T(~u) = T(~u + ~0) = T(~u) + T(~0), and subtracting T(~u) from both sides gives T(~0) = ~0.
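These two properties can be sanity-checked numerically for any matrix map T(~x) = A~x. A small sketch; the matrix and vectors here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 4)).astype(float)  # an arbitrary 3 x 4 matrix
x, y = rng.random(4), rng.random(4)
r = 2.5

print(np.allclose(A @ (x + y), A @ x + A @ y))  # True: additivity
print(np.allclose(A @ (r * x), r * (A @ x)))    # True: homogeneity
```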


Example 3.11. T(x, y) = (x + y, x, y) is a linear transformation that takes R^2 → R^3, but T(x, y) = (xy, y) is not a linear transformation: for instance T(2, 2) = (4, 2), while 2 T(1, 1) = (2, 2).

This may look daunting, but it is a property satisfied by matrices. That is, for any n × m matrix A, vectors ~x, ~y ∈ R^m, and scalar r we have

1. A(~x + ~y) = A~x + A~y

2. A(r~x) = r(A~x).

This will be our focus for this chapter. We record this fact as a theorem.

Theorem 3.12. Let A = [~a1 · · · ~am] be a matrix with ~ai ∈ R^n and define T(~x) = A~x. Then T : R^m → R^n is a linear transformation.

Example 3.13. Let

  A = [  1  2 ]
      [ −2  3 ]
      [  0  1 ]

and define T(~x) = A~x. Then T : R^2 → R^3 is a linear transformation. The domain is R^2 and the codomain is R^3. Let ~u = (1, 3). Then

  A~u = 1 [  1 ]     [ 2 ]   [ 7 ]
          [ −2 ] + 3 [ 3 ] = [ 7 ] .
          [  0 ]     [ 1 ]   [ 3 ]

So (7, 7, 3) is the image of ~u under T. But is any vector (a, b, c) in the range of T? That is, does there exist ~x = (x1, x2) such that T(~x) = (a, b, c)? Well this would mean

  [  1  2 ] [ x1 ]   [  x1 + 2x2  ]   [ a ]
  [ −2  3 ] [ x2 ] = [ −2x1 + 3x2 ] = [ b ] .
  [  0  1 ]          [     x2     ]   [ c ]

No: we need x2 = c and x1 = (3c − b)/2, which forces a = (3c − b)/2 + 2c. So a isn't free, and not every (a, b, c) is in the range.

Or, we could have row reduced this system and seen it was inconsistent for some values of (a, b, c):

  [ 1 2 : a ]
  [ 0 1 : (2/7) a + (1/7) b ]
  [ 0 0 : −(2/7) a − (1/7) b + c ]

Theorem 3.14. Let A = [~a1 · · · ~am] be an n × m matrix and let T : R^m → R^n with T(~x) = A~x be a linear transformation. Then,

1. ~w ∈ Range(T) if and only if A~x = ~w is a consistent linear system.

2. Range(T) = span{~a1, . . . , ~am}.


3.1.2 What does T tell us?

Theorem 3.15. For any linear transformation T there is a matrix A so that T(~x) = A~x.

Proof: Let ~ei be the vector with a 1 in the ith place and 0s elsewhere. The matrix A = [T(~e1) · · · T(~em)] is the matrix we want. Any ~x ∈ R^m can be written as x1 ~e1 + · · · + xm ~em. We then have

  T(~x) = x1 T(~e1) + · · · + xm T(~em) = A~x.
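The proof is constructive, so it can be turned directly into code: feed the standard basis vectors to T and stack the results as columns. The map T below is just an illustration (it is the linear map T(x, y) = (x + y, x, y)).

```python
import numpy as np

def T(x):
    # T(x1, x2) = (x1 + x2, x1, x2), a linear map R^2 -> R^3
    return np.array([x[0] + x[1], x[0], x[1]])

m = 2
A = np.column_stack([T(e) for e in np.eye(m)])  # A = [T(e1) T(e2)]

x = np.array([3.0, -1.0])
print(np.allclose(A @ x, T(x)))  # True: the matrix reproduces T
```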

So, we are interested in properties of T . Here are a few things one can say about any function:

Definition 3.16. We say that T is one-to-one if distinct inputs have distinct images, that is, T(~u) = T(~v) implies ~u = ~v. We say that T is onto if every ~v in the codomain is equal to T(~u) for some ~u in the domain.

Theorem 3.18. Let A = {~a1, . . . , ~an} and T : R^n → R^n be given by T(~x) = A~x. The following are equivalent:

1. A spans R^n.

2. A is linearly independent.

3. A~x = ~b has a unique solution for every ~b ∈ R^n.

4. T is onto.

5. T is one-to-one.


3.1.3 Geometry of Linear Transformations

There are three basic transformations: dilations, rotations and translations. Let's look at these three examples in more depth.

Before we can describe a dilation, we need some notion of the size of a vector.

Definition 3.19. Given ~x ∈ R^n we let ‖~x‖ = √(x1² + x2² + · · · + xn²).

Example 3.20. A dilation by a factor of r ∈ R+ is just the map T(~x) = r~x. Notice that ‖r~x‖ = r‖~x‖ so we are indeed scaling by r. Can you think of the matrix attached to this map? It is

  Ar = [ r 0 ]           [ x1 ]   [ r x1 ]
       [ 0 r ] ,  and Ar [ x2 ] = [ r x2 ] .

Notice that this works in any dimension! Also, note that if we want to scale in two directions differently we multiply by the matrix

  Ar,s = [ r 0 ]
         [ 0 s ] .

Some food for thought... what happens if r < 0? Well, we still scale, but something else

geometric also happens.

Example 3.21. A rotation of every vector in R^2 by θ is given by the matrix

  Aθ = [ cos θ  −sin θ ]
       [ sin θ   cos θ ] .

Let's see why with the vector (1, 0):

  Aθ [ 1 ]   [ cos θ ]
     [ 0 ] = [ sin θ ] .

If we think about a general vector (x, y) then the trig is a little trickier. This is a CAMPS exercise. If you're curious, here is the resulting vector:

  Aθ [ x ]   [ cos θ  −sin θ ] [ x ]   [ x cos θ − y sin θ ]
     [ y ] = [ sin θ   cos θ ] [ y ] = [ x sin θ + y cos θ ] .

Say we want to rotate in R^3 about one of the axes x, y, or z by θ. We use the matrices (respectively):

  Ax,θ = [ 1    0       0    ]   Ay,θ = [  cos θ  0  sin θ ]   Az,θ = [ cos θ  −sin θ  0 ]
         [ 0  cos θ  −sin θ  ]          [    0    1    0   ]          [ sin θ   cos θ  0 ]
         [ 0  sin θ   cos θ  ]          [ −sin θ  0  cos θ ]          [   0       0    1 ] .
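The 2-D rotation matrix from Example 3.21 is easy to play with numerically. A sketch: rotating (1, 0) lands on (cos θ, sin θ), and two rotations compose by adding angles.

```python
import numpy as np

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

theta = np.pi / 3
# rotating (1, 0) by theta gives (cos theta, sin theta)
print(np.allclose(rot(theta) @ [1.0, 0.0], [np.cos(theta), np.sin(theta)]))
# rotations compose additively: A_a A_b = A_(a+b)
a, b = 0.4, 1.1
print(np.allclose(rot(a) @ rot(b), rot(a + b)))
```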


More complicated rotations are possible by multiplying these matrices together, but we won't get into that now.

Example 3.22. Say we want to translate a vector (x, y) ∈ R^2 over 3 and up 2. The natural thing to try would be

  T [ x ]   [ x + 3 ]
    [ y ] = [ y + 2 ] .

But this isn't a linear transformation, because T(r~x) ≠ r T(~x). It turns out that to make this work we need to think of R^2 as a plane sitting inside R^3.

Thinking of vectors as points, let's start with the vector ~e1 = (1, 0). We would like to translate it over 3 and up 2 to get (4, 2). We've seen that this isn't possible with a linear transformation from R^2 → R^2. The fix is to go up a dimension and map the plane into R^3 via the function

  f [ x ]   [ x ]
    [ y ] = [ y ] .
            [ 1 ]

Note this isn't a linear transformation because f(~0) ≠ ~0. Can you see where in R^3 this takes R^2? This gives

  f(~e1) = [ 1 ]
           [ 0 ]
           [ 1 ] .

Now consider the matrix

  Aa,b = [ 1 0 a ]
         [ 0 1 b ]
         [ 0 0 1 ] .

We have

  Aa,b [ x ]   [ 1 0 a ] [ x ]   [ x + a ]
       [ y ] = [ 0 1 b ] [ y ] = [ y + b ] .
       [ 1 ]   [ 0 0 1 ] [ 1 ]   [   1   ]
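The homogeneous-coordinate trick from Example 3.22 can be sketched directly: embed (x, y) as (x, y, 1) and multiply by A_{a,b}.

```python
import numpy as np

def translate(a, b):
    # the 3 x 3 matrix A_{a,b} acting on points (x, y, 1)
    return np.array([[1.0, 0.0, a],
                     [0.0, 1.0, b],
                     [0.0, 0.0, 1.0]])

e1 = np.array([1.0, 0.0, 1.0])   # the point (1, 0), lifted to the plane z = 1
moved = translate(3.0, 2.0) @ e1
print(moved)  # [4. 2. 1.]: the point (1, 0) shifted over 3 and up 2
```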

This works because this is what's called a shear. Think of pushing the sides of this slinky:


3.2 Matrix Algebra

3.2.1 The basics

Definition 3.23. Let c ∈ R. Let A = [~a1 · · · ~am] with ~ai ∈ R^n. We can write the row vectors of A as ~a1^r, ~a2^r, . . . , ~an^r. This lets us write A alternatively as

  A = [ ~a1^r ]
      [ ~a2^r ]
      [  ...  ]
      [ ~an^r ] .

Note that A is an n × m matrix. Let B = [~b1 · · · ~bk] with ~bi ∈ R^m. Notice that B is an m × k matrix. We define addition, scalar multiplication and matrix multiplication:

1. A + B = [~a1 + ~b1 · · · ~am + ~bm]. This only works if A and B have the same dimensions.

2. cA = [c ~a1 · · · c ~am]

3. AB = [A~b1 · · · A~bk].

Explicitly, the entry in the ith row and jth column of the product matrix is given by ~ai^r · ~bj. The resulting matrix is n × k.

Fact 3.24. Say that A is n × m and B is j × k. We can only multiply A and B if m = j. The resulting matrix is n × k.
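The row-dot-column entry formula can be checked against the built-in product. A sketch with arbitrary matrices:

```python
import numpy as np

A = np.array([[1.0, 1.0, 2.0],
              [1.0, 0.0, -1.0]])   # 2 x 3
B = np.array([[1.0, 2.0, 3.0],
              [1.0, 2.0, 0.0],
              [1.0, 2.0, 0.0]])    # 3 x 3

# (AB)_ij = (row i of A) . (column j of B)
entrywise = np.array([[A[i] @ B[:, j] for j in range(B.shape[1])]
                      for i in range(A.shape[0])])
print(np.allclose(entrywise, A @ B))  # True
```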

Example 3.26.

  [ 1 1  2 ] [ 1 2 3 ]   [ 4 8 3 ]
  [ 1 0 −1 ] [ 1 2 0 ] = [ 0 0 3 ] .
             [ 1 2 0 ]

Note that if we switched the matrices then the product would be undefined.

Definition 3.27. Let S : R^m → R^p and T : R^p → R^n be linear transformations. We write W = T ∘ S if W : R^m → R^n is the linear transformation given by

  W(~x) = T(S(~x)).

If we have matrices A and B such that T(~x) = A~x and S(~x) = B~x, then

  W(~x) = A(B~x) = (AB)~x.


Example 3.29. Let T : R^2 → R^2 be the linear transformation that rotates a vector ~x by θ. Let

  Aθ = [ cos θ  −sin θ ]
       [ sin θ   cos θ ] .

Then T ∘ T : R^2 → R^2 rotates a vector by 2θ, and indeed the angle sum formulas show Aθ Aθ = A2θ.

Our first example showed matrices for which AB was defined, but not BA. In general, even

when both products are defined they are not the same.

Example 3.30.

  [ 1 2 ] [ −1 1 ]   [  7 5 ]
  [ 3 0 ] [  4 2 ] = [ −3 3 ]

  [ −1 1 ] [ 1 2 ]   [  2 −2 ]
  [  4 2 ] [ 3 0 ] = [ 10  8 ]

Moreover, if A is n × m and B is m × n, then AB is n × n and BA is m × m. So the two products may not even have the same dimensions.

Definition 3.31. We define the following direction vectors of R^m:

  ~e1 = (1, 0, . . . , 0),  ~e2 = (0, 1, . . . , 0),  . . . ,  ~em = (0, 0, . . . , 1).

Although AB is not necessarily equal to BA, there are several properties of matrix multipli-

cation that coincide with the behavior weve come to expect from multiplication.

Fact 3.32. Let A, B and C be matrices of compatible sizes and r a scalar. It holds that

1. A(BC) = (AB)C

2. A(B + C) = AB + AC

3. (A + B)C = AC + BC

4. r(AB) = (rA)B = A(rB)

5. AI = A

6. IA = A

The one we choose presently concerns identity matrices.

Theorem 3.33. Let A be an n × m matrix. Then

  In A = A = A Im.


Equally important are the ways in which matrix algebra differs from the algebra of the real numbers. It's important to keep these rules in mind to prevent errors.

Theorem 3.34. Let A, B, and C be nonzero matrices.

(a) It is possible that AB ≠ BA.

(b) AB = 0 does not imply that A = 0 or B = 0.

(c) AC = BC does not imply A = B or C = 0.

Example 3.35. We've already seen an example of (a). For part (b) we consider the product

  [ 1 2 ] [  2  2 ]   [ 0 0 ]
  [ 2 4 ] [ −1 −1 ] = [ 0 0 ] .

For part (c) we observe

  [ 3 1 ] [ 1 1 ]   [ 5 5 ]   [ 5 0 ] [ 1 1 ]
  [ 2 1 ] [ 2 2 ] = [ 4 4 ] = [ 0 2 ] [ 2 2 ] .

Sometimes it is convenient to switch the rows and the columns. This is called taking the transpose

of a matrix.

Definition 3.36. Let A be a matrix. We call the matrix that puts the columns of A into the

rows, and the rows of A into the columns the transpose of A. This is denoted by AT .

Notice that if A is n × m then A^T is m × n.

Example 3.37.

  [ 1 0 0 1 ]                  [ 1 0 3 ]
  [ 0 1 0 3 ]  has transpose   [ 0 1 3 ]
  [ 3 3 0 3 ]                  [ 0 0 0 ]
                               [ 1 3 3 ] .

Transposes have several properties:

Fact 3.38. Let A and B be n × m matrices, C an m × k matrix, and s a scalar. It holds that

1. (A + B)^T = A^T + B^T

2. (sA)^T = s A^T

3. (AC)^T = C^T A^T.
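The third property, with its reversed order, is the one that surprises people. A quick numerical check on made-up matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((4, 3))
C = rng.random((3, 5))

# the transpose of a product reverses the order of the factors
print(np.allclose((A @ C).T, C.T @ A.T))  # True
```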

Definition 3.39. A matrix A is symmetric if A = A^T. It is skew-symmetric if A = −A^T.

Example 3.40.

  A = [ 2 1 0 ]            [ 2 1 0 ]
      [ 1 1 0 ]  and A^T = [ 1 1 0 ]   (symmetric);
      [ 0 0 2 ]            [ 0 0 2 ]

  A = [  0  2 1 ]            [ 0 −2 −1 ]
      [ −2  0 4 ]  and A^T = [ 2  0 −4 ]   (skew-symmetric).
      [ −1 −4 0 ]            [ 1  4  0 ]


Example 3.41. Two neat applications of skew-symmetric matrices to physics are as follows:

1. The cross product matrix B with the property that B~x = ~b × ~x,

  B = [  0  −b3   b2 ]
      [  b3   0  −b1 ]
      [ −b2  b1   0  ] ,

is skew-symmetric.
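This cross product matrix can be built and checked against numpy's own cross product (a sketch with arbitrary vectors):

```python
import numpy as np

def cross_matrix(b):
    # the skew-symmetric matrix with B x = b cross x
    b1, b2, b3 = b
    return np.array([[0.0, -b3,  b2],
                     [ b3, 0.0, -b1],
                     [-b2,  b1, 0.0]])

b = np.array([1.0, 2.0, 3.0])
x = np.array([-2.0, 0.5, 4.0])
B = cross_matrix(b)

print(np.allclose(B @ x, np.cross(b, x)))  # True: B x = b cross x
print(np.allclose(B, -B.T))                # True: skew-symmetric
```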

2. The moment of inertia of a rigid body made of points ~ri with masses mi. Call the center of mass C~ and let ~ri′ = ~ri − C~. Then the inertia tensor can be written in terms of the skew-symmetric cross product matrices [~ri′] of the vectors ~ri′, namely I = −Σ mi [~ri′]².

If A is n × n then A·A is also n × n, as is A·A·A, A·A·A·A, etc. It therefore makes sense to talk about powers of square matrices, and in general

  A^k = A · A · · · A   (k terms).

Example 3.42. Consider the graph with vertices A, B, C, D, E, F, G and edges A–B, A–C, B–D, B–E, C–F, and C–G. We can encode this graph with an adjacency matrix where we put a 1 for any two vertices that have an edge between them.

        A  B  D  E  C  F  G
    A [ 0  1  0  0  1  0  0 ]
    B [ 1  0  1  1  0  0  0 ]
    D [ 0  1  0  0  0  0  0 ]
Q = E [ 0  1  0  0  0  0  0 ]
    C [ 1  0  0  0  0  1  1 ]
    F [ 0  0  0  0  1  0  0 ]
    G [ 0  0  0  0  1  0  0 ]

A neat feature of matrix multiplication is that Qk counts exactly how many paths of length k

between two vertices there are. The way to see this is to realize that when we take the dot

product of say the B row with the B column we will get a +1 for each connected vertex. Thus,


the result is exactly how many ways one can get from B back to itself in 2 steps. This reasoning

continues for higher powers.

  Q^2 = [ 2 0 1 1 0 1 1 ]     Q^3 = [ 0 4 0 0 4 0 0 ]     Q^4 = [ 8  0 4 4  0 4 4 ]
        [ 0 3 0 0 1 0 0 ]           [ 4 0 3 3 0 1 1 ]           [ 0 10 0 0  6 0 0 ]
        [ 1 0 1 1 0 0 0 ]           [ 0 3 0 0 1 0 0 ]           [ 4  0 3 3  0 1 1 ]
        [ 1 0 1 1 0 0 0 ]           [ 0 3 0 0 1 0 0 ]           [ 4  0 3 3  0 1 1 ]
        [ 0 1 0 0 3 0 0 ]           [ 4 0 1 1 0 3 3 ]           [ 0  6 0 0 10 0 0 ]
        [ 1 0 0 0 0 1 1 ]           [ 0 1 0 0 3 0 0 ]           [ 4  0 1 1  0 3 3 ]
        [ 1 0 0 0 0 1 1 ]           [ 0 1 0 0 3 0 0 ]           [ 4  0 1 1  0 3 3 ]

            A     B     D     E     C     F     G
     A [    0  4096     0     0  4096     0     0 ]
     B [ 4096     0  2080  2080     0  2016  2016 ]
     D [    0  2080     0     0  2016     0     0 ]
Q^13 = E [    0  2080     0     0  2016     0     0 ]
     C [ 4096     0  2016  2016     0  2080  2080 ]
     F [    0  2016     0     0  2080     0     0 ]
     G [    0  2016     0     0  2080     0     0 ]
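The walk-counting claim can be reproduced numerically with the adjacency matrix Q above (vertex order A, B, D, E, C, F, G):

```python
import numpy as np

Q = np.array([[0, 1, 0, 0, 1, 0, 0],
              [1, 0, 1, 1, 0, 0, 0],
              [0, 1, 0, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0, 0],
              [1, 0, 0, 0, 0, 1, 1],
              [0, 0, 0, 0, 1, 0, 0],
              [0, 0, 0, 0, 1, 0, 0]])

Q2 = np.linalg.matrix_power(Q, 2)
# (Q^2)_AA counts walks of length 2 from A back to A:
# one through B and one through C
print(Q2[0, 0])  # 2
```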

Example 3.43. Suppose the internet was simpler and consisted of 6 webpages linked together as follows: page 1 links to pages 2 and 3, page 2 to 1 and 3, page 3 to 2 and 4, page 4 to 5 and 6, page 5 to 3 and 4, and page 6 to 5.

Bryan is surfing the web and randomly clicking on links. Another term for this is a random walk. He starts on page 1. What is the probability that Bryan is at page 5 after 10 clicks?


We can describe how a random walk behaves after 1 click with the transition probability matrix:

  P = [ 0.0 0.5 0.5 0.0 0.0 0.0 ]
      [ 0.5 0.0 0.5 0.0 0.0 0.0 ]
      [ 0.0 0.5 0.0 0.5 0.0 0.0 ]
      [ 0.0 0.0 0.0 0.0 0.5 0.5 ]
      [ 0.0 0.0 0.5 0.5 0.0 0.0 ]
      [ 0.0 0.0 0.0 0.0 1.0 0.0 ]

Here the entry in row i and column j is the probability of clicking from page i to page j. If we start at website #1, the distribution after one click is the row vector ~e1^T P. To figure out the next step, multiply by P again to get ~e1^T P². This continues, so the answer to our question is the 5th entry of ~e1^T P^10.

  P^10 = [ 0.076 0.151 0.225 0.22  0.22  0.108 ]
         [ 0.075 0.152 0.225 0.22  0.22  0.108 ]
         [ 0.076 0.148 0.223 0.225 0.217 0.111 ]
         [ 0.073 0.146 0.217 0.22  0.228 0.116 ]
         [ 0.07  0.149 0.225 0.228 0.22  0.108 ]
         [ 0.076 0.141 0.223 0.217 0.232 0.111 ]

  ~e1^T P^10 = ( 0.076, 0.151, 0.225, 0.22, 0.22, 0.108 ).

This is 0.22. For a transition probability matrix P it holds that the matrix P∞ = lim_{k→∞} P^k describes the average amount of time a random walk spends at each site. Google actually uses something very similar to this to rank pages. They meticulously index the web, put it into a matrix, and the page they show first when you type in a query is the one that has the highest probability in P∞. We'll give more details about this in Chapter 6.
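The 10-click computation is a one-liner with matrix powers, assuming the transition matrix P above:

```python
import numpy as np

P = np.array([[0.0, 0.5, 0.5, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
              [0.0, 0.0, 0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]])

start = np.zeros(6)
start[0] = 1.0                                # begin at page 1
dist = start @ np.linalg.matrix_power(P, 10)  # distribution after 10 clicks

print(np.isclose(dist.sum(), 1.0))  # True: still a probability vector
print(round(dist[4], 2))            # 0.22: probability of being at page 5
```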

There are two matrix properties that are preserved when we take powers of matrices.

Definition 3.44. An n × n matrix A is diagonal if every entry off the main diagonal is zero:

  A = [ a11  0  · · ·  0  ]
      [  0  a22 · · ·  0  ]
      [  :   :   ·.    :  ]
      [  0   0  · · · ann ] .

Theorem 3.45. If A is an n × n diagonal matrix as above then A^k is also diagonal for all k ≥ 1. In fact,

  A^k = [ a11^k   0   · · ·   0   ]
        [   0   a22^k · · ·   0   ]
        [   :     :     ·.    :   ]
        [   0     0   · · · ann^k ] .

Definition 3.46. Let A be an n × n matrix. We say A is upper triangular if it has the form

  A = [ a11 a12 · · · a1n ]
      [  0  a22 · · · a2n ]
      [  :   :   ·.    :  ]
      [  0   0  · · · ann ] .


We say A is lower triangular if it has the form

  A = [ a11  0  · · ·  0  ]
      [ a21 a22 · · ·  0  ]
      [  :   :   ·.    :  ]
      [ an1 an2 · · · ann ] .

A matrix is triangular if it is either upper or lower triangular.

Theorem 3.47. Let A be an n n upper (lower) triangular matrix and k 1. Then Ak is also

upper (lower) triangular.
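Both power properties are easy to observe numerically; a small sketch with made-up matrices:

```python
import numpy as np

# powers of a diagonal matrix are taken entrywise on the diagonal
D = np.diag([2.0, -1.0, 3.0])
print(np.allclose(np.linalg.matrix_power(D, 4),
                  np.diag([16.0, 1.0, 81.0])))  # True

# powers of an upper triangular matrix stay upper triangular
U = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0]])
U3 = np.linalg.matrix_power(U, 3)
print(np.allclose(U3, np.triu(U3)))  # True
```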

3.3 Inverses

In this section we tackle the problem of reversing a linear transformation, to the extent that it's possible. This process has countless applications, including cryptography and 3D transformations.

3.3.1 Basics

Example 3.48. Let T : R^3 → R^3 be given by T(~x) = A~x, where

  A = [  1  3 2 ]
      [ −2 −7 2 ]
      [  2  6 5 ] .

When we solve T(~x) = ~y for some ~y ∈ R^3, we can alternatively interpret this as the inverse of T for ~y, or T⁻¹(~y). To do this we need to solve

    x1 + 3x2 + 2x3 = y1
  −2x1 − 7x2 + 2x3 = y2
    2x1 + 6x2 + 5x3 = y3

Transforming this system into reduced echelon form is equivalent to solving for the xi-s, giving us the system

  x1 = 47y1 + 3y2 − 20y3
  x2 = −14y1 − y2 + 6y3
  x3 = −2y1 + y3 .

So if we set

  B = [  47  3 −20 ]
      [ −14 −1   6 ]
      [  −2  0   1 ] ,

then T⁻¹(~y) = B~y. In particular, T⁻¹ is also a linear transformation.
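Example 3.48 can be checked numerically, assuming the matrices A and B read as above:

```python
import numpy as np

A = np.array([[ 1.0,  3.0, 2.0],
              [-2.0, -7.0, 2.0],
              [ 2.0,  6.0, 5.0]])
B = np.array([[ 47.0,  3.0, -20.0],
              [-14.0, -1.0,   6.0],
              [ -2.0,  0.0,   1.0]])

print(np.allclose(A @ B, np.eye(3)))     # True: B undoes A
print(np.allclose(np.linalg.inv(A), B))  # True: B is the inverse
```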

Definition 3.49. A linear transformation T : R^m → R^n is invertible if T is one-to-one and onto. When T is invertible, the inverse T⁻¹ : R^n → R^m is defined by

  T⁻¹(~y) = ~x if and only if T(~x) = ~y.


Theorem 3.50. Let T : R^m → R^n be a linear transformation. If T is invertible, then m = n.

Example 3.51. It is important to keep in mind that m = n is a necessary but not sufficient requirement for T to be invertible. If T(~x) = A~x with

  A = [ 3 6 ]
      [ 4 8 ] ,

then the columns of this matrix are linearly dependent, so T is neither one-to-one nor onto, and therefore not invertible.

Suppose now that T(~x) = A~x is invertible, and write T⁻¹(~y) = B~y for some matrix B. Then for any ~x ∈ R^n,

  ~x = T(T⁻¹(~x)) = T(B~x) = AB~x.

Since this holds for all ~x, it follows that AB = In, the identity matrix. This yields the corresponding notion of invertible matrices.

Definition 3.52. An n × n matrix A is invertible if there exists an n × n matrix B such that

  AB = I.

Example 3.53. Let

  A = [ 1 2 ]           B = [ 1 2 1 ]
      [ 3 5 ]   and         [ 2 5 1 ] .
                            [ 1 2 0 ]

Since

  [ 1 2 ] [ −5  2 ]   [ 1 0 ]
  [ 3 5 ] [  3 −1 ] = [ 0 1 ] ,

and

  [ 1 2 1 ] [  2 −2  3 ]   [ 1 0 0 ]
  [ 2 5 1 ] [ −1  1 −1 ] = [ 0 1 0 ] ,
  [ 1 2 0 ] [  1  0 −1 ]   [ 0 0 1 ]

both A and B are invertible.

Theorem 3.54. Let A be an invertible matrix with AB = I. Then BA = I, and this matrix

B is unique.

Definition 3.55. If A is invertible, then A⁻¹ is the inverse of A, the unique matrix such that AA⁻¹ = A⁻¹A = I.

A square matrix that is not invertible is also called singular.

Theorem 3.56. Let A and B be invertible n × n matrices. Then

(a) A⁻¹ is invertible, with (A⁻¹)⁻¹ = A

(b) AB is invertible, with (AB)⁻¹ = B⁻¹A⁻¹

(c) If AC = AD then C = D

(d) If AC = 0n×m then C = 0n×m

Example 3.57. Inverses are useful for cryptography. Suppose Chuck Norris has a message:

  When I slice onions, onions cry.

He can encode this as numbers in a simple way, letting 0 denote a space and each other number the ordinal of the letter: a = 1, b = 2, . . .

  23, 8, 5, 14, 0, 9, 19, 12, 9, 3, 5, 0, 15, 14, 9, 15, 14, 19, 15, 14, 9, 15, 14, 19, 0, 3, 18, 25

Let's put it into a 4 × 7 matrix

  B = [ 23  8  5 14  0  9 19 ]
      [ 12  9  3  5  0 15 14 ]
      [  9 15 14 19 15 14  9 ]
      [ 15 14 19  0  3 18 25 ]

This could easily be broken though. So, what he does is scramble it with a 4 × 4 matrix A. Let's say

  A = [ 0 1 0 1 ]
      [ 3 3 3 2 ]
      [ 2 4 2 3 ]
      [ 3 3 2 2 ]

This has inverse

  A⁻¹ = [  1  0 −1  1 ]
        [ −5 −2  3  0 ]
        [  0  1  0 −1 ]
        [  6  2 −3  0 ]

So, instead of B, Chuck can send the message AB:

  AB = [ 0 1 0 1 ] [ 23  8  5 14  0  9 19 ]   [  27  23  22   5  3  33  39 ]
       [ 3 3 3 2 ] [ 12  9  3  5  0 15 14 ] = [ 162 124 104 114 51 150 176 ]
       [ 2 4 2 3 ] [  9 15 14 19 15 14  9 ]   [ 157 124 107  86 39 160 187 ]
       [ 3 3 2 2 ] [ 15 14 19  0  3 18 25 ]   [ 153 109  90  95 36 136 167 ]

Jimmy (the deputy Ranger) decodes it via the formula

  A⁻¹(AB) = (A⁻¹A)B = IB = B.

That is

  [  1  0 −1  1 ] [  27  23  22   5  3  33  39 ]   [ 23  8  5 14  0  9 19 ]
  [ −5 −2  3  0 ] [ 162 124 104 114 51 150 176 ] = [ 12  9  3  5  0 15 14 ]
  [  0  1  0 −1 ] [ 157 124 107  86 39 160 187 ]   [  9 15 14 19 15 14  9 ]
  [  6  2 −3  0 ] [ 153 109  90  95 36 136 167 ]   [ 15 14 19  0  3 18 25 ] .
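The whole round trip can be run numerically, assuming the key matrix A and its inverse as written above:

```python
import numpy as np

A = np.array([[0, 1, 0, 1],
              [3, 3, 3, 2],
              [2, 4, 2, 3],
              [3, 3, 2, 2]])
A_inv = np.array([[ 1,  0, -1,  1],
                  [-5, -2,  3,  0],
                  [ 0,  1,  0, -1],
                  [ 6,  2, -3,  0]])

B = np.array([[23,  8,  5, 14,  0,  9, 19],
              [12,  9,  3,  5,  0, 15, 14],
              [ 9, 15, 14, 19, 15, 14,  9],
              [15, 14, 19,  0,  3, 18, 25]])

ciphertext = A @ B                            # scramble
print(np.array_equal(A_inv @ ciphertext, B))  # True: decoding recovers B
```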


History: This was invented by Lester Hill in 1929, along with a machine that decoded it, the Hill Cipher. Its weakness is that it is completely linear. An opponent who intercepts n² plaintext/ciphertext character pairs can set up a linear system which can (usually) be easily solved; if it happens that this system is indeterminate, it is only necessary to add a few more plaintext/ciphertext pairs. Calculating this solution by standard linear algebra algorithms then takes very little time.

3.3.2 Finding A⁻¹

If A = [~a1 · · · ~an] and B = [~b1 · · · ~bn] are matrices such that AB = In, then breaking down the matrix multiplication column by column tells us that A~bi = ~ei for each i. We can then solve for each ~bi with the following augmented matrices:

  [A : ~e1], [A : ~e2], . . . , [A : ~en].

However, in solving each of these systems we will perform exactly the same row operations, so it makes sense to combine it all into one process by setting up one larger augmented matrix [A : In]. Once we've transformed the left half to In, provided it is possible to do so, then the right half will be transformed to B = A⁻¹.

In short: We perform Gauss-Jordan elimination on [A : In] to get [In : A⁻¹].


Example 3.58. Consider

  A = [ 1 2 3 ]
      [ 0 1 5 ]
      [ 5 6 0 ] .

To find A⁻¹ we will row-reduce [A : I3]:

  [ 1 2 3 : 1 0 0 ]      [ 1  2   3 :  1 0 0 ]      [ 1 2 3 :  1 0 0 ]
  [ 0 1 5 : 0 1 0 ]  →   [ 0  1   5 :  0 1 0 ]  →   [ 0 1 5 :  0 1 0 ]
  [ 5 6 0 : 0 0 1 ]      [ 0 −4 −15 : −5 0 1 ]      [ 0 0 5 : −5 4 1 ]

  →  [ 1 2 0 :  4 −12/5 −3/5 ]      [ 1 0 0 : −6 18/5 7/5 ]
     [ 0 1 0 :  5   −3    −1 ]  →   [ 0 1 0 :  5  −3   −1 ]
     [ 0 0 1 : −1  4/5   1/5 ]      [ 0 0 1 : −1 4/5  1/5 ]

The right half of the final matrix is A⁻¹.
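The Gauss-Jordan procedure is short to implement. A bare-bones sketch that assumes A is invertible (it uses partial pivoting but skips other numerical niceties):

```python
import numpy as np

def gauss_jordan_inverse(A):
    # reduce [A : I] to [I : A^{-1}]
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for i in range(n):
        # swap in the row with the largest pivot, then normalize it
        pivot = i + np.argmax(np.abs(M[i:, i]))
        M[[i, pivot]] = M[[pivot, i]]
        M[i] /= M[i, i]
        # clear the pivot column in every other row
        for j in range(n):
            if j != i:
                M[j] -= M[j, i] * M[i]
    return M[:, n:]

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 5.0],
              [5.0, 6.0, 0.0]])
print(np.allclose(gauss_jordan_inverse(A) @ A, np.eye(3)))  # True
```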

Example 3.59. What does this process look like when our matrix turns out to not be invertible? Consider

  B = [ 1 1 2 ]
      [ 2 1 3 ]
      [ 3 1 4 ] .

We start the elimination process:

  [ 1 1 2 : 1 0 0 ]      [ 1  1  2 :  1 0 0 ]      [ 1  1  2 :  1  0 0 ]
  [ 2 1 3 : 0 1 0 ]  →   [ 0 −1 −1 : −2 1 0 ]  →   [ 0 −1 −1 : −2  1 0 ]
  [ 3 1 4 : 0 0 1 ]      [ 0 −2 −2 : −3 0 1 ]      [ 0  0  0 :  1 −2 1 ]

Our process terminates because it is no longer possible to transform the left half into I3. This is what will happen for any singular matrix.

Fact 3.60. In the 2 × 2 case we even have a nice formula for inverses. If

  A = [ a b ]
      [ c d ] ,

then

  A⁻¹ = 1/(ad − bc) [  d −b ]
                    [ −c  a ] .

We observe that this formula is undefined when ad − bc = 0, and this is precisely when a 2 × 2 matrix is not invertible! We'll explain why in Chapter 5.

Theorem 3.61. Let A be an n × n matrix. The following are equivalent:

(a) A is invertible.

(b) A~x = ~b has a unique solution for all ~b, given by ~x = A⁻¹~b.

(c) A~x = ~0 has only the trivial solution.

Theorem 3.62 (The Big Theorem). Let A = [~a1 · · · ~an] be an n × n matrix and T : R^n → R^n be given by T(~x) = A~x. Then the following are equivalent:

(a) {~a1, . . . , ~an} spans R^n.

(b) {~a1, . . . , ~an} is linearly independent.

(c) A~x = ~b has a unique solution for every ~b ∈ R^n.

(d) T is onto.

(e) T is one-to-one.

(f) T is invertible.

(g) A is invertible.


4 Subspaces

We have studied linear systems of equations, matrices, and linear transformations. Now we are interested in studying the subsets of R^n that these relate to. It is often the case that the span of the columns of a matrix, or the range of a linear transformation, is not all of the codomain.

Example 4.1. The transformation

  T(~x) = [ 1 2 ] ~x
          [ 4 8 ]

has range 4x − y = 0, or equivalently the line y = 4x.

Definition 4.2. A subset S ⊆ R^n is a subspace if S satisfies the following 3 conditions:

1. S contains ~0, that is ~0 ∈ S.

2. If ~u, ~v ∈ S then ~u + ~v ∈ S.

3. If ~u ∈ S and r ∈ R then r~u ∈ S.

Example 4.3. Let

  x + y = 6
  x − y = 2

The solution set is {(4, 2)}, which is not a subspace because ~0 is not a solution.

Example 4.4. Let

   x + y = 6
  5x + 5y = 30

The solution set is {(6 − t, t)}. Notice that ~0 again is not in the solution set, so the solution set is not a subspace.

Theorem 4.5. Let S = span{~u1, . . . , ~um} be a subset of R^n. Then S is a subspace.

This is true because ~0 ∈ S, and linear combinations of vectors in S are in S by the definition of the span.

There is an algorithm to determine if a set S is a subspace:

Step 1. Check whether ~0 ∈ S. If ~0 ∉ S, then S is not a subspace.

Step 2. If you can show that S is generated by a set of vectors, i.e. that it is the span of some set of vectors, then by the previous theorem S is a subspace.

Step 3. Otherwise, show by hand that the second and third conditions of the subspace definition are or aren't met by S.


Example 4.6. S = {~0} and S = R^n are subspaces. We call them the trivial subspaces.

Example 4.7. Let ℓ be a line in R^n not containing the origin. Then ℓ is not a subspace because ~0 ∉ ℓ. However, any line L that contains ~0 is a subspace because it can be written as the span of a single vector.

Example 4.8. Let S be the set of all vectors (v1, v2, . . . , vn) in R^n such that v1 + · · · + vn = 0. Then S is a subspace because (1) ~0 belongs to S, and one can check that the second two conditions hold as well:

(2) ~u, ~v ∈ S implies that (u1 + v1) + · · · + (un + vn) = 0, so ~u + ~v ∈ S.

(3) If ~u ∈ S then ru1 + · · · + run = r(u1 + · · · + un) = 0, so r~u ∈ S.

Matrices also have subspaces attached to them. For instance, homogeneous systems A~x = ~0

have solution sets that are subspaces.

Definition 4.10. The set of solutions to A~x = ~0 is called the null space of A. This is denoted

by null(A).

Example 4.11. We give three more intuitive examples of what the span and null space might describe:

1. Let's suppose that the matrix A represents a physical system. As an example, let's assume our system is a rocket, and A is a matrix representing the directions we can go based on our thrusters. So what do the null space and the span represent?

Well, let's suppose we have a direction that we're interested in. Is it in our column space? If so, then we can move in that direction. The column space is the set of directions that we can achieve based on our thrusters. Let's suppose that we have three thrusters equally spaced around our rocket. If they're all perfectly functional then we can move in any direction. In this case our column space is the entire codomain. But what happens when a thruster breaks? Now we've only got two thrusters. Our linear system will have changed (the matrix A will be different), and our column space will be reduced.


What's the null space? The null space is the set of thruster instructions that completely waste fuel. They're the set of instructions where our thrusters will thrust, but the rocket will not move at all.

Suppose a space ship has thrusters which allow it to move in directions

  ~u1 = (1, 1, 1), ~u2 = (0, 0, 1), ~u3 = (0, 1, 0), ~u4 = (2, 3, −2), ~u5 = (1, −1, 2).

Suppose that each of these vectors has an antipodal thruster that can fire in the opposite direction, so it makes sense to apply a negative thrust in each direction. What thrust amounts t1, t2, t3, t4, t5 result in no motion?

We solve the system A~x = ~0:

  [ 1 0 0  2  1 : 0 ]      [ 1 0 0  2  1 : 0 ]
  [ 1 0 1  3 −1 : 0 ]  →   [ 0 1 0 −4  1 : 0 ]
  [ 1 1 0 −2  2 : 0 ]      [ 0 0 1  1 −2 : 0 ]

So, if we choose t4 and t5, then any thrusting for the first three of the form

  ( −2t4 − t5, 4t4 − t5, −t4 + 2t5 )

will result in no motion. Note that because the span is all of R^3 we can go anywhere with our space ship.

2. Or A can represent measurements on investments. The range is the set of all measurements achievable by a portfolio. The null space is the set of all investments that can be made that wouldn't change the measurements at all.


An investment has 4 measurements (R, α, β, v) attached to it, where R is the average rate of return, α is the amount the stock has outperformed the index, β is the correlation to the index, and v is the riskiness or volatility of the stock. An investor can either buy an index or short it. So, it is possible to have positive and negative linear combinations of the following five investments:

      Quascorp  LundinWorx  GruelInc  Holtheroni  Bruzzard
  R [    3          1          14         1           5    ]
  α [    1          0           2         1           1    ]
  β [    3          1          11         1           2    ]
  v [    4         11          20         1           3    ]

What stock blends will result in a completely neutral portfolio? We set up an augmented matrix to solve for the null space:

  [ 3  1 14 1 5 : 0 ]
  [ 1  0  2 1 1 : 0 ]
  [ 3  1 11 1 2 : 0 ]
  [ 4 11 20 1 3 : 0 ]

  →  [ 1 0 0 0 586/65 : 0 ]
     [ 0 1 0 0  54/65 : 0 ]
     [ 0 0 1 0 101/65 : 0 ]
     [ 0 0 0 1 723/65 : 0 ]

So, the blend is determined by how many stocks of Bruzzard we buy.

3. Another example that we won't work out is room illumination. A math professor at Indiana University once told Matt that the hardest math problem he couldn't solve was buying lighting for his house (true story). Apparently it is really hard to get the light just right in a room. The range of A represents the area of the room that can be illuminated. The null space of A represents the power we can apply to lamps that doesn't change the illumination in the room at all.
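The thruster computation in the rocket example can be verified numerically, assuming the thruster directions ~u1, . . . , ~u5 as reconstructed above (the columns of A):

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0,  2.0,  1.0],
              [1.0, 0.0, 1.0,  3.0, -1.0],
              [1.0, 1.0, 0.0, -2.0,  2.0]])

def wasted_thrust(t4, t5):
    # the null-space family (-2 t4 - t5, 4 t4 - t5, -t4 + 2 t5, t4, t5)
    return np.array([-2*t4 - t5, 4*t4 - t5, -t4 + 2*t5, t4, t5])

for t4, t5 in [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0)]:
    assert np.allclose(A @ wasted_thrust(t4, t5), 0)
print("every wasted-thrust combination produces no motion")
```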


4.1.2 Kernel and range of linear transformations

We can define similar notions for linear transformations.

Definition 4.12. The kernel of a linear transformation T : R^m → R^n is the set of all vectors that T maps to ~0, i.e. all ~x ∈ R^m such that T(~x) = ~0. We denote this by ker T.

Both the kernel and the range are subspaces; this can be checked by using the linearity and scaling properties of linear transformations.

Example 4.13. The linear transformation

  T(x1, x2) = ( x1 − 2x2, −3x1 + 6x2, 2x1 − 4x2 )

tells us what direction a space ship moves given that its two thrusters fire with x1 and x2. We can write T(~x) = A~x with

  A = [  1 −2 ]                  [ 1 −2 ]
      [ −3  6 ]  →  echelon form [ 0  0 ] .
      [  2 −4 ]                  [ 0  0 ]

Augmenting with ~0, this is equivalent to the single equation x1 − 2x2 = 0. So, we can write, for all s ∈ R,

  ker T = { s (2, 1) }.

Notice that range(T) = span{ (1, −3, 2) }. This is just a line. A thruster redesign might be in order...

We can relate the kernel to one-to-one-ness.

We've talked about how homogeneous solution sets, span and range relate to subspaces. Now we want to introduce linear independence.


4.2.1 Introduction

Example 4.16. Bryan is very health conscious. He not only cares about macronutrients, but about the constitution of these macros. Here are the standard recommendations.

  Calories       2000
  Fat            65 g
  Sodium         2,400 mg
  Carbohydrate   300 g
  Protein        50 g

Let (K, F, S, C, P) ∈ R^5 be the nutrition information of 100 g of each food item. Notice that calories are determined by the formula

  K = 9F + 4C + 4P.

Thus, the space of possible nutrition vectors is the subspace W ⊆ R^5 described by all vectors satisfying

  0 = −K + 9F + 4C + 4P.

Bryan also has a narrow palate. Suppose he only likes

whole milk, avocado, tahini, orange juice, pineapple and coconut milk.

Can he hit any nutrition goal with just these? What is the least amount of food he can eat

in order to be able to hit any nutrition goal he might have? We first gather the nutrition

information:

[Nutrition label for avocado, per 100 g serving: 179 calories, 15 g total fat, 7 mg sodium, 9 g total carbohydrate, 2 g protein.]


This gives a matrix of column vectors:

        A    T   O   M   P   C
  K [ 179  638  44  41  41 232 ]
  F [  15   54   0   1   1  24 ]
  S [   7  115   1  44  15   2 ]
  C [   9   21  10   5   6   1 ]
  P [   2   17   1   3   2   3 ]

Suppose also that Bryan can have really targeted workouts, so he can deplete exactly the negative of any of these nutrition vectors. The echelon form of the above is:

  [ 1 0 0 0 5176/18807 8417/12538 ]
  [ 0 1 0 0 1741/18807  1705/6269 ]
  [ 0 0 1 0  3690/6269 8741/12538 ]
  [ 0 0 0 1   811/6269 9483/12538 ]
  [ 0 0 0 0      0          0     ]

Note that had we augmented with a generic vector (K, F, S, C, P) and reduced, the bottom row of the echelon form would read

  [ 0 0 0 0 0 0 : −K + 9F + 4C + 4P ],

which checks out with the fact these vectors all lie in the subspace W.

The above tells us that Bryan can hit every nutritional goal in W. But can he cut out a food item? Yes! These vectors are not linearly independent. Suppose he took out coconut milk and pineapple. Then we would have {~a, ~t, ~o, ~m}, a linearly independent system that still spans W. This sets us up perfectly to define a basis.

Definition 4.17. A set B = {~u1 , . . . , ~um } is a basis for a subspace S if

1. span(B) = S.

2. B is linearly independent.

Theorem 4.18. If B is a basis for S then every element of S can be written as a unique linear

combination of vectors in B.

There are two ways to find a basis. One involves row reducing. It hinges on the idea that the span of the rows of a matrix is unaffected by row operations.

Definition 4.19. Let A be an n × m matrix with row vectors ~a1, . . . , ~an ∈ R^m. We define the rowspace to be span{~a1, . . . , ~an}. We denote this by rowspace(A).


Theorem 4.20. If A and B are equivalent (i.e. there's a way to go from one to the other by row operations) then

  rowspace(A) = rowspace(B).

So if we know that ~u1 , . . . , ~um span S and want to find a minimal set, it is enough to write

these as row vectors then go to echelon form. The remaining non-zero vectors will be linearly

independent, and still span S.

Example 4.21. The one in the book is fine:

1. To find a basis for S = span{~u1 , . . . , ~um }, use the vectors ~u1 , . . . , ~um to form the rows of a matrix A.

2. Reduce A to an echelon form B.

3. The nonzero rows of B form a basis for S.
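This recipe is easy to try in code. Here is a sketch using SymPy (assuming it is available); the matrix is a made-up illustration, not one from the notes:

```python
from sympy import Matrix

# Rows are the spanning vectors ~u1, ~u2, ~u3 (a made-up example).
A = Matrix([
    [1, 2, 3],
    [2, 4, 6],   # 2 * row 1, so it adds no new direction
    [0, 1, 1],
])

# Row reduce; row operations do not change the row space (Theorem 4.20).
R, pivots = A.rref()

# The nonzero rows of the echelon form form a basis for S = span of the rows.
basis = [R.row(i) for i in range(R.rows) if any(R.row(i))]
print(basis)      # two independent rows, so dim(S) = 2
```

The redundant second row drops out automatically, exactly as Theorem 4.20 promises.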

There is another way to find a basis. This uses the fact that row operations preserve linear combinations:

Fact 4.22. If U = [~u1 · · · ~um ] and V = [~v1 · · · ~vm ] are equivalent matrices, then any linear dependence among the ~ui still exists for the ~vi .

In particular, if B is an echelon form of A, then the columns of A corresponding to the pivot columns of B will be linearly independent, and the other columns will be linearly dependent on the pivot columns.


4.2.3 Dimension

Now that we have the notion of a basis, we can finally make sense of dimension. For example, it has always been tempting to say that span{ (1, 0)^T } = R, because it basically looks like the real line. However, this isn't true, because that span lives in R^2 . We will see that this set has dimension 1, and so it looks like R, although it lives somewhere else...

Before defining a dimension we need to establish that every basis for a given subspace will

have the same size.

Theorem 4.24. If S is a subspace of Rn , then every basis of S has the same number of vectors.

Definition 4.25. Let S be subspace of Rn , then the dimension of S is the number of vectors

in any basis of S.

Note that {~0} and Rn have dimensions 0 and n, respectively. To see that Rn has dimension

n we can use the standard basis vectors ~e1 , . . . , ~en .

We are interested in three different tasks:

1. Determining the dimension of a subspace.

2. Expanding a linearly independent set to form a basis.

3. Describing the nullspace of a matrix and exhibiting a basis for it.

Let's look at examples of these.

Example 4.26. Determine the dimension of the subspace

S = span{ (6, 4, 1, −1, −2)^T , (1, 0, 2, 3, 4)^T , (−1, −4, 9, 16, 22)^T , (7, 1, 0, 1, 3)^T }.

We do so by row reducing the corresponding matrix and counting the number of pivots:

[  6 1 −1 7 ]     [ 1 0 −1 0 ]
[  4 0 −4 1 ]     [ 0 1  5 0 ]
[  1 2  9 0 ]  →  [ 0 0  0 1 ]
[ −1 3 16 1 ]     [ 0 0  0 0 ]
[ −2 4 22 3 ]     [ 0 0  0 0 ]

There are three pivots, so the dimension of S is 3. (Indeed, the third column of the echelon form records that the third vector equals −1 times the first plus 5 times the second.)

Before our next example, we give a theorem.

Theorem 4.27. Let U = {~u1 , . . . , ~um } be a set of vectors in a subspace S ⊆ Rn .

1. If U is linearly independent then either U is a basis, or vectors can be added to U to form

a basis of S.

2. If span(U) = S, then either U is already a basis, or vectors can be removed from U to form

a basis.


Now let's expand a set to form a basis. We go about this in a round-about way. Put simply, we add too many vectors in, then remove some using techniques we already know. (We will see this carried out in the final example of this section.) Next, let's look at nullspaces.

Example 4.29. Let

A = [ 1 3 2  0 2  0 ]
    [ 2 6 5  2 4  3 ]
    [ 0 0 5 10 0 15 ]
    [ 2 6 0 −8 4 18 ]

Describe the nullspace of A and exhibit a basis. We row reduce and obtain

[A | ~0] → [ 1 3 0 −4 2 0 | 0 ]
           [ 0 0 1  2 0 0 | 0 ]
           [ 0 0 0  0 0 1 | 0 ]
           [ 0 0 0  0 0 0 | 0 ]

The free variables are x2, x4, x5, and back substitution gives x6 = 0, x3 = −2x4, and x1 = −3x2 + 4x4 − 2x5. So the general solution is

~x = x2 (−3, 1, 0, 0, 0, 0)^T + x4 (4, 0, −2, 1, 0, 0)^T + x5 (−2, 0, 0, 0, 1, 0)^T .

Thus,

nullspace(A) = span{ (−3, 1, 0, 0, 0, 0)^T , (4, 0, −2, 1, 0, 0)^T , (−2, 0, 0, 0, 1, 0)^T }.

We often care about the dimension of the nullspace. This gets a special name.

Definition 4.30. The nullity of a matrix A is the dimension of the nullspace. We denote this

by nullity(A).
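Assuming SymPy is available, the nullspace computation in the example above can be checked directly (one sign in the matrix is ambiguous in the printed notes and is assumed here):

```python
from sympy import Matrix

# The 4x6 matrix from the nullspace example (one sign assumed).
A = Matrix([
    [1, 3, 2,  0, 2,  0],
    [2, 6, 5,  2, 4,  3],
    [0, 0, 5, 10, 0, 15],
    [2, 6, 0, -8, 4, 18],
])

null_basis = A.nullspace()   # one basis vector per free variable
print(len(null_basis))       # 3, so nullity(A) = 3
```

Each returned vector really does satisfy A~x = ~0, which is a useful sanity check on a hand computation.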

Knowing the dimension of a subspace makes it easier to find a basis.

Theorem 4.31. If S has dimension m and U is a set of m vectors that either span S or are

linearly independent, then U is a basis for S.

Example 4.32. Suppose that S is a subspace of dimension 2 that contains the vectors (1, 0, 2)^T and (0, 1, 0)^T . Show that these two vectors form a basis for S.

To do this it suffices to check that these are linearly independent, which either visual inspection or row reducing quickly reveals.


Fact 4.33. If S1 and S2 are subspaces of Rn and S1 is a subset of S2 , then dim(S1) ≤ dim(S2), and dim(S1) = dim(S2) if and only if S1 = S2 .

Fact 4.34. If U = {~u1 , . . . , ~um } is a set of vectors in a subspace S of dimension k, then

1. If m < k, then U does not span S.

2. If m > k, then U is not linearly independent.

In developing a method for finding bases, we started viewing matrices in the context of the subspaces spanned by their row and column vectors.

Definition 4.35. Let A be an n × m matrix. The row space of A is the subspace of Rm

spanned by its row vectors, denoted row(A). Similarly, the column space of A is the subspace

of Rn spanned by the column vectors of A, denoted col(A).

Formalizing what we saw in section 4.2 gives the following theorem.

Theorem 4.36. Let A be a nonzero matrix and B an echelon form of A.

1. The nonzero rows of B form a basis for row(A).

2. The columns of A corresponding to the pivot columns of B form a basis for col(A).

Example 4.37. Find a basis and the dimension of the row and column spaces for

A = [ 1 3 −1 0 3 ]
    [ 2 6  0 2 4 ]
    [ 3 9  3 6 3 ]

Row reducing,

A = [ 1 3 −1 0  3 ]    [ 1 3 −1 0  3 ]
    [ 2 6  0 2  4 ] →  [ 0 0  2 2 −2 ] = B
    [ 3 9  3 6  3 ]    [ 0 0  0 0  0 ]

By the previous theorem, the nonzero rows of B form a basis for row(A), namely

{ (1, 3, −1, 0, 3)^T , (0, 0, 2, 2, −2)^T }.

For the column space, we observe that the pivot columns of B are the first and third columns, so the corresponding columns of A form a basis for col(A), namely

{ (1, 2, 3)^T , (−1, 0, 3)^T }.

Both subspaces have bases consisting of two vectors, so they have dimension 2.
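Assuming SymPy is available, both recipes from Theorem 4.36 can be checked on the matrix from this example (a couple of signs are ambiguous in the printed notes and are assumed here):

```python
from sympy import Matrix

# The matrix from Example 4.37 (signs assumed where the notes are ambiguous).
A = Matrix([
    [1, 3, -1, 0, 3],
    [2, 6,  0, 2, 4],
    [3, 9,  3, 6, 3],
])

R, pivots = A.rref()
print(pivots)            # (0, 2): the first and third columns are pivot columns

# Pivot columns of A itself give a basis for col(A).
col_basis = [A.col(j) for j in pivots]
print(A.rank())          # 2, the common dimension of row(A) and col(A)
```

Note that the basis for col(A) is taken from A, not from the reduced matrix R, since row operations change the column space.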


In the above example the row space and column space of our matrix, even though they were

subspaces of different Euclidean spaces, had the same dimension. It turns out this is always the

case.

Theorem 4.38. For any matrix A, the dimensions of the row space and the column space are

the same.

Proof: Given A, we use elementary row operations to find an echelon form B of A. The

dimension of the row space of A will correspond to the number of nonzero rows of B, while

the dimension of the column space will correspond to the number of pivot columns of B. We

observe that each nonzero row of B has exactly one leading entry (pivot), and by the stair-step pattern of B different nonzero rows have different pivot columns. It follows that

#(nonzero rows of B) = #(pivot columns of B),

and hence the dimension of the row space equals the dimension of the column space.

With this relationship established we can make sense of the rank of a matrix.

Definition 4.39. The rank of a matrix A is the dimension of the row (or column) space of A,

denoted rank(A).

Example 4.40. What is the rank and nullity of

A = [ 1 2 3 0 −1 ]
    [ 0 0 1 3  5 ]
    [ 0 0 0 0  0 ] ?

Well, we see that there are two pivots, so the rank is 2. To determine the nullity we augment with ~0 and observe that there are 3 free variables: x2, x4, x5. Back substitution yields the following solutions for the nullspace:

x3 = −3x4 − 5x5 ,
x1 = −2x2 − 3x3 + x5 = −2x2 + 9x4 + 16x5 .

So

~x = s1 (−2, 1, 0, 0, 0)^T + s2 (9, 0, −3, 1, 0)^T + s3 (16, 0, −5, 0, 1)^T .

These are linearly independent, and thus form a basis for the nullspace. We deduce that the nullity is 3.

In this example we saw that for our 3 × 5 matrix the rank was 2 and the nullity was 3. Notice that 2 + 3 = 5. This holds in general: for any n × m matrix A,

rank(A) + nullity(A) = m.


This is true because, in echelon form, each pivot of A corresponds to a vector in the basis for the column (or row) space of A. The remaining columns correspond to free variables, each of which is a parameter for a basis vector of the solution set of A~x = ~0. So these two quantities sum to m = #columns.
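The rank–nullity relationship is easy to verify computationally. A sketch with SymPy (assuming it is available), using the 3 × 5 matrix from Example 4.40, with one sign assumed:

```python
from sympy import Matrix

# The 3x5 matrix from Example 4.40; rank + nullity must equal #columns.
A = Matrix([
    [1, 2, 3, 0, -1],
    [0, 0, 1, 3,  5],
    [0, 0, 0, 0,  0],
])

rank = A.rank()
nullity = len(A.nullspace())
print(rank, nullity)        # 2 3, and 2 + 3 = 5 = number of columns
```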

Example 4.42. Suppose that A is 37 × 42 and T (~x) = A~x has a kernel with dimension 12. What

is the rank of A? Well, since the kernel of T and the nullspace of A have the same dimension,

we can use the rank-nullity theorem to deduce that rank(A) = 42 12 = 30.

Theorem 4.43 (The Big Theorem). Let A = [~a1 ~an ] be an n n matrix and T : Rn Rn

be given by T (~x) = A~x. Then the following are equivalent:

(d) T is onto.

(e) T is one-to-one.

(f) T is invertible.

(g) A is invertible.

(j) col(A) = Rn

(k) row(A) = Rn

(l) rank(A) = n

(m) nullity(A) = 0.

We end with one last example that ties a lot of this together.

Example 4.44. Let

P = span{ ~u1 , ~u2 } ⊆ R4 , where ~u1 = (1, 0, 1, 1)^T and ~u2 = (1, 0, 2, 2)^T ,

and let Q be the plane (x − y − z = 0) ⊆ R3 . Find a linear transformation T : R4 → R3 such that T (P ) = Q and ker T has dimension 1.

Step 1: We first find a basis for R4 that includes the vectors spanning P . To do this we augment with the standard basis vectors of R4 and row reduce:

[ 1 1 1 0 0 0 ]    [ 1 0 2 0 0 −1 ]
[ 0 0 0 1 0 0 ] →  [ 0 1 −1 0 0 1 ]
[ 1 2 0 0 1 0 ]    [ 0 0 0 1 0  0 ]
[ 1 2 0 0 0 1 ]    [ 0 0 0 0 1 −1 ]

Looking at the pivots we see that {~u1 , ~u2 , ~e2 , ~e3 } is a basis for R4 .

Next we need to find spanning vectors for the subspace Q. We can write x = y + z, so another way to represent vectors (x, y, z)^T ∈ Q is as

(y + z, y, z)^T = y (1, 1, 0)^T + z (1, 0, 1)^T .

This means that ~v1 = (1, 1, 0)^T and ~v2 = (1, 0, 1)^T form a spanning set for Q. We still need to find another vector to make this a basis for R3 . A sure way to do this is to augment again with the standard basis vectors and row reduce:

[ 1 1 1 0 0 ]    [ 1 0 0 1 0 ]
[ 1 0 0 1 0 ] →  [ 0 1 0 0 1 ]
[ 0 1 0 0 1 ]    [ 0 0 1 −1 −1 ]

So {~v1 , ~v2 , ~e1 } is a basis for R3 .

Step 2: We observe that a linear transformation is completely determined by how it acts on a basis. So the map T : R4 → R3 given by

T (~u1 ) = ~v1 , T (~u2 ) = ~v2 , T (~e2 ) = ~e1 , T (~e3 ) = ~0

is a unique transformation and does what we want: T (P ) = span{~v1 , ~v2 } = Q, and since the image is all of R3 , the kernel has dimension 4 − 3 = 1.


5 Determinants

5.1 Determinants

The determinant is a number we associate to a square matrix A. We call it det(A), or sometimes

just |A| (Caution, the determinant can be negative). It will tell us, among other things, whether

or not A is invertible. Rather than say how to compute it, let's first say how we would like det(A) to behave:

Theorem 5.1. Let A be an n × n matrix. The number det(A) has the following properties (writing |a b; c d| for the determinant of the 2 × 2 matrix with rows (a, b) and (c, d)):

1. det A ≠ 0 if and only if A is invertible. (Puts us in the Big Theorem situation of lots being true about the row and column vectors of A, as well as the induced linear transformation and bases.)

2. The volume of the image of the unit cube under the linear transformation T (~x) = A~x is |det(A)|.

3. det I = 1.

4. If B is obtained from A by switching a single row or column with another, then det B = −det A. In the 2 × 2 case:

   |a b; c d| = −|c d; a b|.

5. The determinant is linear in each row separately:

   |ta tb; c d| = t |a b; c d|   and   |a + a′ b + b′; c d| = |a b; c d| + |a′ b′; c d|.

6. If two rows (or columns) are the same, then det A = 0. This is by Property 4, since we could swap the two equal rows and obtain det A = −det A, and so det A = 0.

7. Adding a multiple of one row to another row does not change the determinant. In the 2 × 2 case, by Properties 5 and 6:

   |a b; c − ra d − rb| = |a b; c d| + |a b; −ra −rb| = |a b; c d| − r |a b; a b| = |a b; c d| + 0.

8. All of these row properties hold for columns as well, since det(A^T) = det A.

9. If B is the upper triangular matrix we get from A after row reducing as in Property 7, then det A is the product of the diagonal entries of B times (−1)^(#row exchanges).

10. det(AB) = det A det B.

11. det(A^−1) = 1 / det A.

Formula 5.2. If A is a 1 × 1 matrix, A = [a11 ], then det(A) = a11 .

Formula 5.3. If A is a 2 × 2 matrix, things get a little bit harder, but are still pretty easy to compute. If A is the matrix

[ a11 a12 ]
[ a21 a22 ]

then the determinant of A is given by

det(A) = a11 a22 − a12 a21 .

Note: we've already talked about the quick formula for the inverse of a 2 × 2 matrix: if A is the matrix above, then

A^−1 = 1/(a11 a22 − a12 a21) · [ a22 −a12 ]
                               [ −a21 a11 ] .

So we've already seen the determinant of a 2 × 2 matrix come up, and we know that a 2 × 2 matrix A is invertible if and only if det(A) ≠ 0.

Example 5.4. Find the determinant of the matrix A = [ 2 4 ; 3 −1 ].

The determinant is (2)(−1) − (4)(3) = −14.

Formula 5.5. Once we know the formula for a 2 × 2 determinant, we can compute determinants recursively. In order to do this, we first introduce some notation. If A is an n × n matrix, denote the ij-th entry by aij . Let Mij be the (n − 1) × (n − 1) matrix gotten by crossing out the i-th row and j-th column of A. Mij is called the ij-th minor of A. Finally, let Cij = (−1)^(i+j) det(Mij ). Cij is called the ij-th cofactor of A.

Example 5.6. As an example, consider the matrix

A = [ −2 1 1 ]
    [  3 2 4 ]
    [  2 1 0 ] .

Then

a13 = 1,    M13 = [ 3 2 ]
                  [ 2 1 ]

and

C13 = (−1)^(1+3) det M13 = (−1)^4 (3(1) − (2)(2)) = −1.

We'll use these to compute the determinant of any matrix. Here's the formula: if A is an n × n matrix with entries aij , then

det(A) = a11 C11 + a12 C12 + · · · + a1n C1n .

Example 5.7. Find the determinant of the matrix A = [ −2 1 1 ; 3 2 4 ; 2 1 0 ] from Example 5.6.

The formula above tells us det(A) = a11 C11 + a12 C12 + a13 C13 . We know a11 = −2, a12 = 1, a13 = 1, and

C11 = (−1)^(1+1) det M11 = det [ 2 4 ; 1 0 ] = −4

C12 = (−1)^(1+2) det M12 = −det [ 3 4 ; 2 0 ] = 8

C13 = (−1)^(1+3) det M13 = det [ 3 2 ; 2 1 ] = −1

so

det(A) = (−2)(−4) + 1(8) + 1(−1) = 15.

So we compute determinants of 3 × 3 matrices by computing determinants of 2 × 2 matrices. We compute determinants of 4 × 4 matrices by computing determinants of 3 × 3 matrices, which we do by computing determinants of 2 × 2 matrices, and so on.
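The recursion just described can be written out directly. This is a sketch in plain Python; it is clear but exponentially slow compared to row reduction:

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor M_1j: delete row 1 and column j.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        # Cofactor C_1j = (-1)^(1+j) det(M_1j), weighted by a_1j.
        total += (-1) ** j * A[0][j] * det(minor)
    return total

# The 3x3 matrix from Example 5.6.
A = [[-2, 1, 1],
     [ 3, 2, 4],
     [ 2, 1, 0]]
print(det(A))   # 15
```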


If you are curious, here is a worked-out example for a 4 × 4 matrix:

Example 5.8. Find the determinant of the matrix

A = [ 1 −1 2 1 ]
    [ 0 −2 1 1 ]
    [ 0  3 2 4 ]
    [ 0  2 1 0 ] .

The formula tells us det(A) = a11 C11 + a12 C12 + a13 C13 + a14 C14 . We know

a11 = 1, a12 = −1, a13 = 2, a14 = 1

and, by what we computed above (Example 5.7),

C11 = (−1)^(1+1) det M11 = det [ −2 1 1 ; 3 2 4 ; 2 1 0 ] = 15.

We also need to find

C12 = (−1)^(1+2) det M12 = −det [ 0 1 1 ; 0 2 4 ; 0 1 0 ]

C13 = (−1)^(1+3) det M13 = det [ 0 −2 1 ; 0 3 4 ; 0 2 0 ]

C14 = (−1)^(1+4) det M14 = −det [ 0 −2 1 ; 0 3 2 ; 0 2 1 ] .

Let's do the first one together:

C12 = −det [ 0 1 1 ; 0 2 4 ; 0 1 0 ] = −( 0 · det[2 4; 1 0] − 1 · det[0 4; 0 0] + 1 · det[0 2; 0 1] ) = 0.

Similarly, we can compute C13 = 0 and C14 = 0, and we find that

det(A) = 1(15) − 1(0) + 2(0) + 1(0) = 15.


A lot of the determinants in the above example turned out to be 0, and that's a reflection of the following fact: we can compute determinants by expanding along the first row (what we've been doing in the formula above, using the entries from the first row to compute the

in the following formulas.

Let A be an n × n matrix with entries aij . Then:

1. An expansion across row i: det(A) = ai1 Ci1 + ai2 Ci2 + · · · + ain Cin .

2. An expansion down column j: det(A) = a1j C1j + a2j C2j + · · · + anj Cnj .

Example 5.9. Let's revisit the matrix from Example 5.8 and try applying part (2) of the formula with j = 1 (expanding down the first column). The formula tells us det(A) = 1 · C11 + 0 · C21 + 0 · C31 + 0 · C41 . So we only need to find

C11 = (−1)^(1+1) det M11 = det [ −2 1 1 ; 3 2 4 ; 2 1 0 ] = 15,

and det(A) = 15, as before. Choosing a row or column with many zeros makes the computation much shorter.

Determinants let us quickly find inverses of a matrix:

A^−1 = (1 / det A) adj(A),

where adj(A) is the adjoint of A, the transpose of the matrix of cofactors:

adj(A) = [ C11 C21 · · · Cn1 ]
         [ C12 C22 · · · Cn2 ]
         [  ⋮   ⋮        ⋮  ]
         [ C1n C2n · · · Cnn ] .

Example 5.11.

[ 2 0 1 ]^−1         [ 1 −1  2 ]
[ −1 2 1 ]   = (1/3) [ 0  0  3 ]
[ 0 1 0 ]            [ 1  2 −4 ]


Fact 5.12. If T : Rn → Rn with T (~x) = A~x, then the volume of the image of the unit cube is |det A|.

Fact 5.13. You will see in multivariable calculus that even for non-linear coordinate changes the determinant tells you what happens to the volume of a small cube. For example, the transformation from polar coordinates F (r, θ) = (r cos θ, r sin θ) induces the Jacobian determinant

| cos θ −r sin θ |
| sin θ  r cos θ |  =  r cos²θ + r sin²θ = r.

So the distortion is r. This means the amount of stretching increases as r increases. See the picture below:

What this amounts to is an extra factor of r when we compute double integrals in polar coordinates:

∫∫ f (x, y) dx dy = ∫∫ f (r cos θ, r sin θ) r dr dθ.

Definition 5.14. A matrix is totally unimodular if every square submatrix has determinant in {−1, 0, 1}.

Fact 5.15. If A is totally unimodular and invertible, and ~b has only integer entries, then the unique solution to A~x = ~b is an integer-valued vector.

Example 5.16. The adjacency graphs of certain link structures (where there are two groups of sites that do not link to one another) are totally unimodular. For the matrix above we know that all equations A~x = ~b have integer solutions.

This follows from Cramer's rule, since A~x = ~b means ~x = A^−1 ~b, with each

xi = det(Ai) / det(A),

where Ai is A with its i-th column replaced by ~b: the numerator is an integer, and det(A) = ±1.

It is (among many other things) useful for minimum cost flow problems, like getting goods from a warehouse to a factory where different paths have different costs.


6 Eigenvalues and Eigenvectors

The prefix eigen- is adopted from the German word eigen, meaning "own," "unique to," "peculiar to," or "belonging to," in the sense of idiosyncratic in relation to the originating matrix.

6.1.1 Motivating examples

That said, let's get started with an example about centrality measures.

Example 6.1. Suppose you take Facebook and draw lines between all friends. What is a good way to measure how well-connected an individual is? It maybe isn't just how many friends someone has; it's also how important their friends are.

What if we assume that there are n people and each person has a certain importance score, ci ? To account for the importance of your friends, let's say each score is proportional to the sum of its neighbors' scores:

ci = (1/λ) (sum of neighbors' importance scores).

If ~c is the vector of everyone's importance scores, we can encode this with an equation involving the adjacency matrix A:

~c = (1/λ) A~c.

Thus, the importance scores can be found by solving A~c = λ~c for λ and ~c. For example, if we solve this for the two graphs below we get the vectors (a) (1/2, √2/2, 1/2)^T and (b) (.41, .58, .41, .52, .21)^T .


One more example: hippos.

Example 6.2. Suppose that a large hippo population exists in a wildlife preserve. Hippos produce different average amounts of offspring at different times in their lives.

We start with ~x = (x1, x2, x3)^T hippos in each stage; say ~x = (5, 10, 8)^T , all in thousands. During stages s1, s2, s3 they produce ~b = (b1, b2, b3)^T offspring on average; let's say ~b = (0, 2, 1)^T . Suppose that the probability a hippo in s1 reaches s2 is p1 = .5, and the probability a hippo in s2 reaches s3 is p2 = 1/5.

Does the population stabilize, go extinct, or grow exponentially? At what rate does this happen?

Let's start by putting this data into a matrix:

    [ b1 b2 b3 ]   [ 0   2  1 ]
A = [ p1 0  0  ] = [ 1/2 0  0 ]
    [ 0  p2 0  ]   [ 0  1/5 0 ]

Notice that multiplying the age vector ~x by A gives the age distribution for the next generation:

A^0 ~x = (5, 10, 8)^T , A^1 ~x = (28, 5/2, 2)^T , A^2 ~x = (7, 14, 1/2)^T , A^3 ~x = (57/2, 7/2, 14/5)^T .

It would be nice to find a simpler formula for A^k ~x. That way we could make predictions about the long-term behavior of the hippo population.

Suppose we could solve the equation

A~u = λ~u.

Then applying A repeatedly would give

A^k ~u = λ^k ~u.

Suppose further that there were three solutions (λ1, ~u1), (λ2, ~u2), (λ3, ~u3) to this equation and that the ~ui were linearly independent. Then we could write ~x as a linear combination of the ~ui . This means that

~x = c1 ~u1 + c2 ~u2 + c3 ~u3 .

Now, if we multiply by A repeatedly we have

A^k ~x = A^k (c1 ~u1 + c2 ~u2 + c3 ~u3 )
       = c1 A^k ~u1 + c2 A^k ~u2 + c3 A^k ~u3
       = c1 λ1^k ~u1 + c2 λ2^k ~u2 + c3 λ3^k ~u3 .

This decomposes the matrix multiplication into a simple formula involving just scalars. What is eigen (peculiar) is that we can do this! We will say how soon, but for the hippo example we have

λ1 = −.95, ~u1 = (1, −.53, .11)^T

λ2 = −.10, ~u2 = (1, −4.9, 9.8)^T

λ3 = 1.04, ~u3 = (1, .48, .09)^T

And, by solving a linear system, we get

(c1, c2, c3) = (−11.8, .8, 16).

This gives us a much simpler formula for A^k ~x:

A^k ~x = −11.8(−.95)^k (1, −.53, .11)^T + .8(−.10)^k (1, −4.9, 9.8)^T + 16(1.04)^k (1, .48, .09)^T .

Notice that because |λ1|, |λ2| < 1, the first two terms go to zero as k gets large. So we see that the stable equation is (note we multiply out 16 ~u3 ):

A^k ~x ≈ 1.04^k (16, 7.68, 1.44)^T .

This tells us the rate of growth, 1.04^k , as well as the stabilizing age distribution. For example, 16/(16 + 7.68 + 1.44) ≈ .64 says that about 64% of the hippos are young in the 1000th generation.
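A quick numerical check of the hippo example with NumPy (the eigenvalue quoted in the text, 1.04, is rounded):

```python
import numpy as np

# Stage-structured matrix from the hippo example.
A = np.array([[0.0, 2.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.2, 0.0]])
x = np.array([5.0, 10.0, 8.0])   # initial population, in thousands

# First generations match the values in the text.
print(A @ x)                              # [28.   2.5  2. ]
print(np.linalg.matrix_power(A, 2) @ x)   # [ 7.  14.   0.5]

# The dominant eigenvalue gives the long-run growth rate (about 1.047).
print(max(np.linalg.eigvals(A).real))
```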


The calculation for the hippo example was done on a computer. The one for the centrality

scores is at the end of this section.

In the above examples, for our square matrix A, we heavily relied on knowing scalars λ and nonzero vectors ~u such that A~u = λ~u. We wish to develop a method for finding these for any square matrix.

Definition 6.3. A nonzero vector ~u is an eigenvector of A if there exists some scalar λ such that A~u = λ~u. We then call λ an eigenvalue of A. Finally, the set of all eigenvectors corresponding to a given λ (together with ~0) is called the eigenspace of λ.

Fact 6.4. If ~u is an eigenvector of a matrix with eigenvalue λ, then so is any nonzero scalar multiple of ~u.

Example 6.5. If A = [ 3 5 ; 4 2 ], verify that ~u = (5, 4)^T is an eigenvector of A with corresponding eigenvalue λ = 7.

To verify this, we need to check that A~u = 7~u. But this is true:

A~u = [ 3 5 ; 4 2 ] (5, 4)^T = (35, 28)^T = 7 (5, 4)^T = 7~u.

We want to find λ and ~u such that

A~u = λ~u.

Subtracting λ~u from both sides, we get

A~u − λ~u = ~0,    i.e.    (A − λI)~u = ~0.

Rearranging the eigenvalue problem in this way, we're looking for values of λ such that the equation

(A − λI)~x = ~0

has a nonzero solution ~x = ~u. By the Big Theorem, this equation has a nonzero solution if and only if A − λI is a singular matrix, i.e. det(A − λI) = 0.

By rearranging the problem in this way, we see that we can find eigenvalues of A by determining when det(A − λI) = 0. So we are solving for the zeros of a polynomial in λ, called the characteristic polynomial of A.

So we find the eigenvalues of A by setting its characteristic polynomial to 0 and then solving for λ.


Example 6.7. Find the eigenvalues of the matrix

A = [ 1 0 1 ]
    [ 0 1 2 ]
    [ 0 4 3 ]

Expanding det(A − λI) down the first column:

det(A − λI) = (1 − λ) det [ 1−λ 2 ; 4 3−λ ] + 0 + 0
            = (1 − λ)((1 − λ)(3 − λ) − 8)
            = (1 − λ)(λ^2 − 4λ − 5)
            = (1 − λ)(λ − 5)(λ + 1).

Setting this equal to zero, the eigenvalues are λ = 1, 5, −1.
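Assuming SymPy is available, the eigenvalues in Example 6.7 can be checked in one line:

```python
from sympy import Matrix

# The matrix from Example 6.7.
A = Matrix([[1, 0, 1],
            [0, 1, 2],
            [0, 4, 3]])

# Returns a dict of eigenvalue -> multiplicity.
print(A.eigenvals())   # {1: 1, 5: 1, -1: 1}
```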

Once we know the eigenvalues of A, we can find eigenvectors corresponding to an eigenvalue λ as nonzero solutions to (A − λI)~x = ~0.

Fact 6.8. The eigenspace of λ is the set of all solutions to (A − λI)~x = ~0. It is therefore the nullspace of A − λI, and a subspace of Rn .

Example 6.9. Find the eigenvalues and a basis for each eigenspace of the matrix A = [ 3 3 ; 6 −4 ].

We first want to find the eigenvalues of A, which we do by finding λ such that det(A − λI) = 0.

A − λI = [ 3 3 ; 6 −4 ] − λ [ 1 0 ; 0 1 ] = [ 3−λ 3 ; 6 −4−λ ].

Now we compute

det(A − λI) = (3 − λ)(−4 − λ) − 18 = λ^2 + λ − 30 = (λ − 5)(λ + 6),

so the eigenvalues are λ1 = 5 and λ2 = −6.

We can find the corresponding eigenvectors (and eigenspaces) by solving the equation (A − λI)~x = ~0. For λ1 = 5,

A − 5I = [ −2 3 ; 6 −9 ]

and we solve the equation by reducing

[A − 5I | ~0] = [ −2 3 0 ; 6 −9 0 ] → [ 1 −3/2 0 ; 0 0 0 ].

The solutions to this equation form the eigenspace corresponding to λ1 = 5, so the eigenspace is the set of vectors of the form

s (3/2, 1)^T .

A basis for this eigenspace would be { (3/2, 1)^T }. Note that although ~0 is in the eigenspace, eigenvectors are required to be nonzero, so any value s ≠ 0 gives an eigenvector corresponding to λ1 = 5.

For λ2 = −6,

[A + 6I | ~0] = [ 9 3 0 ; 6 2 0 ] → [ 1 1/3 0 ; 0 0 0 ],

meaning the eigenspace corresponding to λ2 = −6 (the solutions to (A + 6I)~x = ~0) is the set of vectors of the form

s (−1/3, 1)^T .

One basis for this eigenspace would be { (−1/3, 1)^T }.
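A quick numerical check of Example 6.9 with NumPy:

```python
import numpy as np

A = np.array([[3.0, 3.0],
              [6.0, -4.0]])

# Eigenvalues should be 5 and -6.
w, V = np.linalg.eig(A)
print(sorted(w))           # approximately [-6.0, 5.0]

# Check that (3/2, 1) really is an eigenvector for lambda = 5.
u = np.array([1.5, 1.0])
print(A @ u)               # [7.5, 5.0] = 5 * u
```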

Example 6.10. Find the eigenvalues and a basis for each eigenspace of

A = [  1 2  1 ]
    [  1 0 −1 ]
    [ −1 2  3 ] .

First we need to find the characteristic polynomial, det(A − λI). But

det(A − λI) = −λ^3 + 4λ^2 − 4λ = −λ(λ − 2)^2 .

This equals zero when λ is 0 or 2, so the eigenvalues are λ1 = 0 and λ2 = 2. To find the corresponding eigenspaces, we want to find all solutions to (A − λI)~x = ~0. For λ1 = 0:

[A − 0I | ~0] = [ 1 2 1 0 ; 1 0 −1 0 ; −1 2 3 0 ] → [ 1 0 −1 0 ; 0 1 1 0 ; 0 0 0 0 ],

so the eigenspace of λ1 = 0 is

span{ (1, −1, 1)^T }.

For λ2 = 2:

[A − 2I | ~0] = [ −1 2 1 0 ; 1 −2 −1 0 ; −1 2 1 0 ] → [ 1 −2 −1 0 ; 0 0 0 0 ; 0 0 0 0 ].

The solutions to this system (the eigenspace corresponding to λ2 = 2) are given by vectors of the form

s1 (2, 1, 0)^T + s2 (1, 0, 1)^T ,

so a basis for the eigenspace of λ2 = 2 is { (2, 1, 0)^T , (1, 0, 1)^T }.

In the above example, the characteristic polynomial was −λ(λ − 2)^2 . Because of the square, λ2 = 2 appeared as a root twice.

Definition 6.11. The number of times an eigenvalue λ appears as a root of the characteristic polynomial is called the multiplicity of λ.

Fact 6.12. The dimension of the eigenspace associated to λ is always less than or equal to the multiplicity of λ (in the above example, it was actually equal).

Example 6.13. Let

A = [ 2 0 0 0 ]
    [ 1 2 0 0 ]
    [ 0 1 3 0 ]
    [ 0 0 1 3 ] .

This has characteristic polynomial (λ − 2)^2 (λ − 3)^2 , and so has eigenvalues 2 and 3, each with multiplicity 2.

Let's look at the eigenspace for λ = 2. This is the nullspace of

A − 2I = [ 0 0 0 0 ]     [ 1 0 0 0 ]
         [ 1 0 0 0 ]  →  [ 0 1 1 0 ]
         [ 0 1 1 0 ]     [ 0 0 1 1 ]
         [ 0 0 1 1 ]     [ 0 0 0 0 ] .

Solving, x1 = 0, x3 = −x4 and x2 = x4 , so the nullspace is span{ (0, 1, −1, 1)^T }. So the nullspace has dimension 1, despite the fact that λ = 2 has multiplicity 2.

Another thing to be aware of when calculating eigenvalues and eigenspaces is that a matrix need not have solely real eigenvalues.

Example 6.14. Let B = [ 0 −1 ; 1 0 ], the matrix corresponding to a rotation by π/2 in R2 . Thinking about it geometrically, rotating a nonzero vector by π/2 will never yield a scalar multiple of that vector. Therefore our intuition is that this (invertible) matrix should have no real eigenvectors.

Actually working it out, the characteristic polynomial of B is

det(B − λI) = det [ −λ −1 ; 1 −λ ] = λ^2 + 1.

But then λ^2 + 1 = 0 implies that λ = ±√−1 = ±i.

So we see that matrices can have imaginary or complex eigenvalues. However, B is real-valued, so multiplying it by any ~x ∈ Rn will result in another vector in Rn . So any eigenspaces corresponding to these complex eigenvalues will not be subspaces of Rn .


6.1.5 Remarks about Eigenvalues

Things to keep in mind when finding eigenvalues and eigenvectors:

1. If A is an n × n matrix, the characteristic polynomial det(A − λI) should always be a degree n polynomial (you get n λ's from the n terms on the diagonal).

3. If A is a triangular matrix, the eigenvalues of A are just the entries on the diagonal.

4. When you're finding eigenvectors and reducing [A − λI | ~0], you should ALWAYS get free variables and rows of zeros, because we've chosen λ so that A − λI is a singular matrix. If you don't get free variables and rows of zeros, something went wrong in your calculation!

5. Because the eigenvalues of A are all scalars λ such that A − λI is a singular matrix, we can say that A itself is a singular matrix if and only if 0 is an eigenvalue of A. But this means we can add something to the Big Theorem!

Theorem 6.15 (Big Theorem). Let S = {~a1 , . . . , ~an } be a set of n vectors in Rn , and let

A = [~a1 . . . ~an ]. Let T : Rn Rn be the linear transformation T (~x) = A~x. Then the following

are equivalent:

1. S spans Rn .

2. S is linearly independent.

4. T is onto.

5. T is one-to-one.

6. ker(T ) = {~0}.

7. range(T ) = Rn .

8. null(A) = {~0}.

9. col(A) = Rn .

10. row(A) = Rn .

11. nullity(A) = 0.

12. rank(A) = n.

14. det(A) ≠ 0.

15. 0 is not an eigenvalue of A.

Here are the Facebook example calculations from earlier:


Figure 3.2: Eigenvector Centrality Example.

Example 3.2. For the graph shown in Figure 3.2(a), the adjacency matrix is

    [ 0 1 0 ]
A = [ 1 0 1 ] .    (3.10)
    [ 0 1 0 ]

Based on Equation 3.9, we need to solve λCe = ACe , or

(A − λI)Ce = 0.    (3.11)

Assuming Ce = [u1 u2 u3 ]^T ,

[ −λ  1  0 ] [ u1 ]   [ 0 ]
[  1 −λ  1 ] [ u2 ] = [ 0 ] .    (3.12)
[  0  1 −λ ] [ u3 ]   [ 0 ]

Since Ce ≠ [0 0 0]^T , the characteristic equation is

              | −λ  1  0 |
det(A − λI) = |  1 −λ  1 | = 0,    (3.13)
              |  0  1 −λ |

or equivalently,

(−λ)(λ^2 − 1) − 1(−λ) = 2λ − λ^3 = λ(2 − λ^2) = 0.    (3.14)

So the eigenvalues are (−√2, 0, +√2). We select the largest eigenvalue: √2. We compute the corresponding eigenvector:

[ −√2   1    0 ] [ u1 ]   [ 0 ]
[  1  −√2    1 ] [ u2 ] = [ 0 ] .    (3.15)
[  0    1  −√2 ] [ u3 ]   [ 0 ]

Assuming the Ce vector has norm 1, its solution is

Ce = [ u1, u2, u3 ]^T = [ 1/2, √2/2, 1/2 ]^T ,    (3.16)

which denotes that node v2 is the most central node and nodes v1 and v3 have equal centrality values.

Example 3.3. For the graph shown in Figure 3.2(b), the adjacency matrix is as follows:

    [ 0 1 0 1 0 ]
    [ 1 0 1 1 1 ]
A = [ 0 1 0 1 0 ] .    (3.17)
    [ 1 1 1 0 0 ]
    [ 0 1 0 0 0 ]

The eigenvalues of A are (−1.74, −1.27, 0.00, +0.33, +2.68). For eigenvector centrality, the largest eigenvalue is selected: 2.68. The corresponding eigenvector is the eigenvector centrality vector and is

Ce = [ 0.4119, 0.5825, 0.4119, 0.5237, 0.2169 ]^T .    (3.18)
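The 3-node centrality computation in Example 3.2 can be reproduced numerically with NumPy:

```python
import numpy as np

# Adjacency matrix of the 3-node path graph from Example 3.2.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

w, V = np.linalg.eig(A)
i = np.argmax(w)              # index of the largest eigenvalue (sqrt(2))
c = V[:, i]
c = c / np.linalg.norm(c)     # normalize to norm 1
c = np.abs(c)                 # eigenvectors are only defined up to sign

print(c)                      # approximately [0.5, 0.707, 0.5]
```

As expected, the middle node of the path gets the highest centrality score.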

A major problem with eigenvector centrality arises when it considers directed graphs (see Problem 1 in the Exercises). Centrality is only passed

on when we have (outgoing) edges, and in special cases such as when a

node is in a directed acyclic graph, centrality becomes zero, even though

the node can have many edges connected to it. In this case, the problem

can be rectified by adding a bias term to the centrality value. The bias term is added to the centrality values of all nodes, no matter how they are situated in the network (i.e., irrespective of the network topology).


Here's one more example, with the Fuzzy Ball Graph:

Example 6.16.

This has eigenvector [1.0, 1.0, 1.0, 1.0, 1.0, 0.96, 0.95, 0.95, 0.95, 0.95, 0.54, 0.1]T . So we see that

five of the nodes are best connected and the rest have lower scores.

6.2

SKIP

6.3.1 Motivation

Before we get into the technical details of this section, let's first revisit the idea of a basis and what we'd use a basis for. If W is a subspace of Rn , then remember that a basis is a linearly independent spanning set for W . Why would we want this? Remember the main theorem about bases: if B = {~u1 , . . . , ~um } is a basis for W , then any vector w~ in W can be written as a linear combination

combination

w~ = c1 ~u1 + · · · + cm ~um

in a unique way. We call the scalars c1 , . . . , cm the coordinates of w~ in the basis B. In some sense, we're imposing a coordinate system on the subspace W where each basis vector acts like a coordinate axis.

We're used to seeing coordinates in the standard basis. For example, if {~e1 , ~e2 } are the standard basis vectors for R2 , then any vector in R2 can be written as a linear combination of these basis vectors in a unique way, namely

(a, b)^T = a (1, 0)^T + b (0, 1)^T .

This tells us the coordinates of the vector (a, b)^T in the standard basis are just (a, b). This makes sense: we go a units in the direction of ~e1 (the x-direction) and b units in the direction of ~e2 (the y-direction) to get to the point (a, b).


Now, using other bases, we just want to extend this idea to other subspaces. For example,

if I have some plane in R3 , can we find some vector that acts like the x-axis and some vector

that acts like the y-axis? Well, if we have a basis for that plane, we will be able to impose a

coordinate system.

We are also interested in extending our ideas of coordinate systems beyond the standard

basis because some bases are much easier to work with than others. For example, if we have an

n n matrix A and a basis B = {~u1 , . . . , ~un } of Rn consisting of eigenvectors of A, then it is very

easy to compute A~x for any vector ~x in Rn . Why? Well, suppose the eigenvectors {~u1 , . . . , ~un }

correspond to the eigenvalues 1 , . . . , n . Then, we know A~ui = i ~ui . So, if we can find scalars

c1 , . . . , cn such that

~x = c1 ~u1 + + cn ~un ,

then

A~x = c1 A~u1 + + cn A~un = c1 1 ~u1 + + cn n ~un .

This becomes very useful when were trying to find higher powers of A; it says

Ak ~x = c1 Ak ~u1 + + cn Ak ~un = c1 k1 ~u1 + + cn kn ~un .

So, instead of multiplying ~x by A k times, we just have to multiply the eigenvectors by powers

of scalars and add them up. This is much easier to do!

We can investigate some of these ideas further in the future, but for now, that's a bit of motivation for why we'd want to be able to find a basis and then change to a different basis. Now we'll move on to the computational side of how to change to a new basis. We restrict our attention to changing between bases of Rn .

6.3.2 Changing it

Notation 6.17. If B = {~u1 , . . . , ~un } is a basis for Rn , and ~x is a vector in Rn , we write

~xB = (c1 , . . . , cn )_B

to mean that

~x = c1 ~u1 + · · · + cn ~un .

So, when we write ~xB = (c1 , . . . , cn )_B , we DO NOT mean that ~x is equal to the vector with components c1 , . . . , cn , i.e. we DO NOT mean ~x = (c1 , . . . , cn )^T . Instead, we mean that ~x is a linear combination of the basis vectors in B with coefficients given by c1 , . . . , cn .

Example 6.18. Let B be the basis { (2, −1)^T , (1, 3)^T } for R2 . If ~xB = (4, −5)_B , find ~x.

The notation ~xB = (4, −5)_B means that

~x = 4 (2, −1)^T − 5 (1, 3)^T = (3, −19)^T .

Notation 6.19. When we write ~x with no subscript, we mean the coordinates are with respect to the standard basis for Rn . In other words, ~x = (c1 , . . . , cn )^T with no subscript actually means that ~x is the vector whose components are given by c1 , . . . , cn .

Example 6.20. Brian is all about smoothies. His favorite ingredients by far are banana, avocado
and tofu. These have macro profiles:

            ~b   ~a   ~t
  Protein    1    2    5
  Fat        1   15    3
  Carb      22    1    2

These vectors form a basis for the space of macronutrients, R3 . Brian likes these ingredients so
much he would rather express macronutrients in terms of B = {~b, ~a, ~t} than in the standard
basis. So instead of making a smoothie with a macro-profile m~ = (m1 , m2 , m3 ), he would like to write
m~ B = (r1 , r2 , r3 )B , where r1 , r2 and r3 describe the amount of banana, avocado and tofu he needs to
put in his smoothie. Letting B = [~b ~a ~t], the equation m~ = r1 ~b + r2 ~a + r3 ~t is the same as:

m~ = B m~ B ,   or equivalently   B^{-1} m~ = m~ B .

Note that

    B^{-1} = (1/1490) [ -27   -1    69 ]
                      [ -64  108    -2 ]
                      [ 329  -43   -13 ]


So, if his partner wants to go get fried oysters later, Brian can understand the macro composition
of oysters ~o = (13, 12, 9) in terms of his favorite foods by computing

~oB = B^{-1} ~o = (1/745) (-111, 68, 1932) ≈ (-0.1, 0.1, 2.6)B .

So if we take out 10g of banana (i.e. -0.1 of a 100g serving of banana), and add 10g of avocado
and 260g of tofu, we end up with the same macros as fried oysters.
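The smoothie computation can be checked numerically; a minimal sketch (we use `solve` rather than forming the inverse of B explicitly):

```python
import numpy as np

# Columns are the macro profiles of banana, avocado, tofu from the table.
B = np.array([[ 1.0,  2.0, 5.0],   # protein
              [ 1.0, 15.0, 3.0],   # fat
              [22.0,  1.0, 2.0]])  # carbs
o = np.array([13.0, 12.0, 9.0])    # fried oysters, standard basis

o_B = np.linalg.solve(B, o)        # coordinates of o in the basis {b, a, t}
assert np.allclose(B @ o_B, o)     # B sends basis-B coordinates back to o
```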

Recap.

If B = {~u1 , . . . , ~un } is a basis for Rn , and ~x is a vector in Rn , then we can find scalars
c1 , . . . , cn such that

~x = c1 ~u1 + ... + cn ~un .

In our notation, this says

~xB = (c1 , . . . , cn )B .

If U is the matrix whose columns are the basis vectors, U = [~u1 . . . ~un ], then we can rewrite the
first equation as

~x = U ~xB .

In words, this says that multiplying the coordinate vector ~xB of ~x in the basis B by the
matrix U gives us ~x, the coordinate vector of ~x in the standard basis S. Hence, we call U the
change of basis matrix from B to the standard basis.

What if we're given a vector in the standard basis, and we want to find ~xB ? Well, because
B is a basis, the matrix U is invertible, so we can solve for ~xB by multiplying both sides of the
equation above by U^{-1} to get the equation

~xB = U^{-1} ~x.

Here are a few more examples.

Example 6.21. Let B = {~u1 , ~u2 } = {(3, 1), (5, 2)}. Find ~x if ~xB = (2, -1)B .

We can do this problem using the formula ~x = U ~xB . We first set U = [~u1 ~u2 ]. Then,

    ~x = U ~xB = [ 3  5 ] [  2 ]  =  [ 1 ] .
                 [ 1  2 ] [ -1 ]     [ 0 ]

Example 6.22. Let B be as in Example 6.21. Find ~xB if ~x = (1, 1).

We do this problem using the formula ~xB = U^{-1} ~x. We know

    U = [ 3  5 ] ,   so   U^{-1} = [  2  -5 ] .
        [ 1  2 ]                   [ -1   3 ]

Then,

    ~xB = U^{-1} ~x = [  2  -5 ] [ 1 ]  =  (-3, 2)B .
                      [ -1   3 ] [ 1 ]
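Both formulas can be sketched in numpy (the helper names are ours, not the notes'):

```python
import numpy as np

def to_standard(U, x_B):
    # x = U x_B : basis-B coordinates -> standard coordinates
    return U @ x_B

def to_basis(U, x):
    # x_B = U^{-1} x : standard coordinates -> basis-B coordinates
    return np.linalg.solve(U, x)   # solve, instead of forming U^{-1}

U = np.array([[3.0, 5.0],
              [1.0, 2.0]])         # columns are the basis vectors
x_B = np.array([2.0, -1.0])
x = to_standard(U, x_B)
assert np.allclose(to_basis(U, x), x_B)   # round trip recovers x_B
```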

Now we want to find the change of basis matrix from B1 to B2 , where B1 and B2 are any bases
for Rn . Let B1 = {~u1 , . . . , ~un } and B2 = {~v1 , . . . , ~vn }. Let U be the matrix U = [~u1 . . . ~un ] and
V be the matrix V = [~v1 . . . ~vn ]. From the formulas above, we know ~x = U ~xB1 and ~xB2 = V^{-1} ~x.
Substituting the first equation into the second, we see that

~xB2 = V^{-1} U ~xB1 ,   and similarly   ~xB1 = U^{-1} V ~xB2 .
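A sketch of the two-basis conversion with hypothetical bases B1 and B2 of R2:

```python
import numpy as np

U = np.array([[3.0, 5.0],
              [1.0, 2.0]])          # columns: basis B1 (hypothetical)
V = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # columns: basis B2 (hypothetical)

x_B1 = np.array([2.0, -1.0])
x = U @ x_B1                        # standard coordinates
x_B2 = np.linalg.solve(V, x)        # x_{B2} = V^{-1} U x_{B1}
assert np.allclose(V @ x_B2, U @ x_B1)   # same vector, both descriptions
```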

Let's make it real with a smoothie example.

Example 6.23. Recall Brian just wants to think of macronutrients in terms of banana, avocado
and tofu. His friend Adrian only thinks in terms of coconut milk, yogurt and pineapple. These
have macro profiles:

             ~c   ~y   ~p                    ~b   ~a   ~t
    Protein   3    9    2       Protein       1    2    5
A = Fat      24    5    1 ,  B = Fat          1   15    3
    Carb      1    4    6       Carb         22    1    2

One can readily check that A = {~c, ~y , ~p} is a basis. We already saw how to compute fried oysters,
~o = (13, 12, 9), in Brian's basis, ~oB . Say that Brian calls up Adrian and asks him about the macros
in his fried oyster meal. If Brian tells Adrian ~oB , how does Adrian translate to his basis A?
That is, how do we relate ~oB and ~oA ?

First, we compute some inverses:

    B^{-1} = (1/1490) [ -27   -1    69 ]        A^{-1} = (1/1027) [ -26   46     1 ]
                      [ -64  108    -2 ] ,                        [ 143  -16   -45 ] .
                      [ 329  -43   -13 ]                          [ -91    3   201 ]

Notice that

A^{-1} ~o = ~oA ,   and   B ~oB = ~o.

It follows that

A^{-1} B ~oB = ~oA .

This is our conversion formula. If you're curious, we have ~oA ≈ (6.2, 9.8, 55.3)A .

Example 6.24. Let B1 = {(1, 1, 3), (1, 4, 2), (2, 1, 6)} and let B2 = {(1, 0, 1), (1, -3, 0), (2, 1, 2)}.
Find ~xB1 if ~xB2 = (3, 2, 1)B2 .

First set

    U = [ 1  1  2 ]   and   V = [ 1   1  2 ] .
        [ 1  4  1 ]             [ 0  -3  1 ]
        [ 3  2  6 ]             [ 1   0  2 ]

We know ~xB1 = U^{-1} V ~xB2 , so we need to find U^{-1} . This is

    U^{-1} = [ -22   2   7 ] ,
             [   3   0  -1 ]
             [  10  -1  -3 ]

so

                            [ -22   2   7 ] [ 1   1  2 ] [ 3 ]
    ~xB1 = U^{-1} V ~xB2 =  [   3   0  -1 ] [ 0  -3  1 ] [ 2 ]  =  (-129, 16, 60)B1 .
                            [  10  -1  -3 ] [ 1   0  2 ] [ 1 ]

6.4 Diagonalization

6.4.1 How to diagonalize

Suppose A is an n × n matrix with eigenvectors B = {~u1 , . . . , ~un } that form a basis for Rn . Let
λ1 , . . . , λn be the corresponding eigenvalues. If ~x is any vector in Rn , because these eigenvectors
form a basis, we can write

~x = c1 ~u1 + ... + cn ~un ,

or

~xB = (c1 , . . . , cn )B .

Because these are eigenvectors, if we multiply the first equation by A, we get

A~x = c1 λ1 ~u1 + ... + cn λn ~un ,

or

(A~x)B = (λ1 c1 , . . . , λn cn )B .

But, if D is the diagonal matrix D = diag(λ1 , λ2 , . . . , λn ), we have just derived the formula

(A~x)B = (λ1 c1 , . . . , λn cn )B = D ~xB .

Writing P = [~u1 . . . ~un ] for the matrix whose columns are the eigenvectors, we have ~xB = P^{-1} ~x,
so this formula says P^{-1} A~x = D P^{-1} ~x for every ~x. Multiplying both sides by P on the left
gives A~x = P D P^{-1} ~x for every ~x, and therefore

A = P D P^{-1} .

Definition 6.25. We say that an n × n matrix A is diagonalizable if we can find these matrices
P and D (where D is a diagonal matrix) such that A = P D P^{-1} .

In the situation outlined above, we see that we can find P and D by finding a basis for Rn
consisting of eigenvectors {~u1 , . . . , ~un } of A with corresponding eigenvalues λ1 , . . . , λn . Then
P = [~u1 . . . ~un ] and D is the diagonal matrix with the eigenvalues on the diagonal.

Example 6.26. Find matrices P and D to diagonalize

    A = [ 3  -1 ] .
        [ 2   0 ]

First, we want to find the eigenvalues and eigenvectors of A. A computation shows that
the eigenvalues are λ1 = 1 and λ2 = 2, and an eigenvector corresponding to λ1 = 1 is the
vector ~u1 = (1, 2). Similarly, an eigenvector corresponding to λ2 = 2 is the vector ~u2 = (1, 1).
Then, the vectors {~u1 , ~u2 } form a basis for R2 , so we set

    P = [ 1  1 ]   and   D = [ 1  0 ] .
        [ 2  1 ]             [ 0  2 ]

Then, A = P D P^{-1} .
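numpy can produce such a P and D directly; a sketch (`eig` may scale or order the eigenvectors differently than the hand computation, and any such choice works):

```python
import numpy as np

A = np.array([[3.0, -1.0],          # signs as in our reading of the example
              [2.0,  0.0]])

lam, P = np.linalg.eig(A)           # eigenvalues and eigenvector columns
D = np.diag(lam)
assert np.allclose(P @ D @ np.linalg.inv(P), A)   # A = P D P^{-1}
```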

Remark. Not every matrix is diagonalizable. In order to do this computation, we need to find
a basis for Rn consisting of eigenvectors of A, and we cannot always do this. We can, for
example, whenever A has n distinct eigenvalues.

6.4.2 Applications

The main application of this is the following formula: if A = P D P^{-1} , then

A^k = P D^k P^{-1} ,

where D^k is obtained by raising each diagonal entry of D to the kth power.
For example, (P D P^{-1})^2 = P D P^{-1} P D P^{-1} = P D^2 P^{-1} .
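A quick numerical check of the power formula (the matrix is the one from Example 6.26, as we read it):

```python
import numpy as np

A = np.array([[3.0, -1.0],
              [2.0,  0.0]])
lam, P = np.linalg.eig(A)

k = 8
Ak = P @ np.diag(lam**k) @ np.linalg.inv(P)   # A^k = P D^k P^{-1}
assert np.allclose(Ak, np.linalg.matrix_power(A, k))
```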

Example 6.28 (Hippos). Recall that the hippo population model had a matrix:

         b1  b2  b3      [  0    2    1 ]
    A =  p1   0   0   =  [ 1/2   0    0 ]
          0  p2   0      [  0   1/5   0 ]

This has eigenvalues and eigenvectors:

    λ1 = -0.95,  ~u1 = (1, -0.53, 0.11),
    λ2 = -0.10,  ~u2 = (1, -4.9, 9.8),
    λ3 = 1.04,   ~u3 = (1, 0.48, 0.09).

Let P = [~u1 ~u2 ~u3 ] and D = diag(-0.95, -0.10, 1.04). As we take larger and larger powers A^k ,
the matrix D^k starts to look like

    D^k ≈ [ 0  0     0   ] .
          [ 0  0     0   ]
          [ 0  0  1.04^k ]
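A sketch of this long-run behavior (matrix entries as we read the model; exact eigenvalues come from numpy rather than the rounded values above):

```python
import numpy as np

A = np.array([[0.0, 2.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.2, 0.0]])

lam, P = np.linalg.eig(A)
k = 200
# The two eigenvalues inside the unit circle decay away in D^k; only the
# dominant one (about 1.04) survives.
small = sorted(abs(l)**k for l in lam if abs(l) < 1)
assert small[-1] < 1e-3
assert np.abs(lam).max() > 1.0
```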

One final example. Remember that e ≈ 2.718 is a ubiquitous number in math. This number
mysteriously comes up all the time. For example,

    e = lim_{n→∞} (LCM(1, 2, . . . , n))^{1/n} ,

and

    e^{iπ} + 1 = 0.

Here are two from probability. If n people are at the dog park, the dogs run around, then eventually
return to a random owner, the probability that no dog returns to its actual owner is about 1/e.
Or, if you interview n candidates for a job, and must make the hiring decision immediately after
each interview, you should interview n/e of the candidates then hire the next best one to come
along (called the secretary problem, wiki it).

Another way to define e^x from Math 126 is as an infinite sum:

    e^x = x^0/0! + x^1/1! + x^2/2! + ... + x^k/k! + ... .

We can define the exponential of a matrix in the same way:

    e^A = A^0/0! + A^1/1! + A^2/2! + ... + A^k/k! + ... .

This is hard in general, even for computers. But it becomes tractable for diagonalizable
matrices: if A = P D P^{-1} with D = diag(λ1 , . . . , λn ), then

    e^A = P diag(e^{λ1} , e^{λ2} , . . . , e^{λn}) P^{-1} .

Example 6.30. Find e^A for

    A = [ 3  -1 ] .
        [ 2   0 ]

This is the example from just before. We have

    P = [ 1  1 ] ,   P^{-1} = [ -1   1 ] ,   and   D = [ 1  0 ] .
        [ 2  1 ]              [  2  -1 ]               [ 0  2 ]

By the fact above we have

    e^A = P [ e  0   ] P^{-1} = [ 2e^2 - e     e - e^2  ] .
            [ 0  e^2 ]          [ 2e^2 - 2e    2e - e^2 ]
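The diagonalization formula can be checked against the defining series; a sketch with the same matrix (signs as reconstructed here):

```python
import numpy as np

A = np.array([[3.0, -1.0],
              [2.0,  0.0]])
lam, P = np.linalg.eig(A)
expA = P @ np.diag(np.exp(lam)) @ np.linalg.inv(P)   # e^A via P e^D P^{-1}

# Partial sums of I + A + A^2/2! + ... converge to the same matrix.
series = np.zeros_like(A)
term = np.eye(2)
for k in range(1, 30):
    series += term                  # adds A^{k-1} / (k-1)!
    term = term @ A / k
assert np.allclose(expA, series)
```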


Note you can also write things like e^{tA} or 3^A = e^{log(3) A} . It is possible to check from the
definition that d/dt e^{tA} = A e^{tA} . We can use this to solve systems of differential equations.

Example 6.31. Suppose that ~x = (S(t), F(t)) is a function of t and represents people at different
stages of having a facebook account. Let S(t) denote the number of people that are susceptible to
making a facebook, but not yet on it. Let F(t) denote the number of people who have an account.
We assume:

- A fixed fraction b of the susceptible population makes a facebook per unit time (say decades).
- A fixed fraction r of the facebook population deletes their facebook, but still is susceptible
  to making a new account.
- Another proportion d vows to never use facebook again, with 0 ≤ d ≤ 1 - r.

The system of differential equations that models this phenomenon is

    S' = -bS + rF
    F' = bS - rF - dF.

Letting ~x(t) = (S(t), F(t)), the above looks like matrix multiplication. We can express this as

    d/dt ~x(t) = [ -b     r    ] ~x(t).
                 [  b   -r - d ]

Call the above matrix A. Suppose that ~x0 = (S(0), F(0)). The solution to the above is given by
~x(t) = e^{tA} ~x0 .

For the concrete case r = 4/5, b = 2/3, d = 1/10 we have

    A = [ -2/3   4/5  ]
        [  2/3  -9/10 ]

which has eigenvalues and eigenvectors:

    λ1 ≈ -1.52, ~u1 = (1, -1.07);   λ2 ≈ -0.043, ~u2 = (1, 0.78).

If we diagonalize and exponentiate we get

    e^{tA} ≈ [  0.42 e^{-1.52t} + 0.58 e^{-0.043t}    -0.54 e^{-1.52t} + 0.54 e^{-0.043t} ] .
             [ -0.45 e^{-1.52t} + 0.45 e^{-0.043t}     0.57 e^{-1.52t} + 0.42 e^{-0.043t} ]

Now if we plug in initial conditions of say ~x0 = (4, 1) (in billions) we can estimate when no
one is on facebook anymore by solving for when the second coordinate of e^{tA} ~x0 is equal to 0.
This amounts to multiplying the second row of e^{tA} with ~x0 , setting equal to 0, and solving for
t:

    4(-0.45 e^{-1.52t} + 0.45 e^{-0.043t}) + 1(0.57 e^{-1.52t} + 0.42 e^{-0.043t}) = 0.

A solver numerically estimates this at t = 1.31 decades.
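Finally, a sketch of the facebook computation (using the exact rational matrix; numpy recovers the rounded eigendata quoted above):

```python
import numpy as np

A = np.array([[-2/3,  4/5],
              [ 2/3, -9/10]])
x0 = np.array([4.0, 1.0])           # (S(0), F(0)) in billions

lam, P = np.linalg.eig(A)
Pinv = np.linalg.inv(P)

def x(t):
    # x(t) = e^{tA} x0 = P e^{tD} P^{-1} x0
    return P @ np.diag(np.exp(lam * t)) @ Pinv @ x0

h = 1e-6                            # check d/dt x(t) = A x(t) numerically
assert np.allclose((x(1 + h) - x(1)) / h, A @ x(1), atol=1e-4)
```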

