
ECON 5100

4. Linear Algebra
Like real analysis, linear algebra is essentially a study of Rn . But
rather than viewing elements of Rn as “points in space” as real
analysis does, linear algebra views those elements as “arrows”.

What’s useful in viewing things as arrows?


1. We can “add” elements to create a new element, just like
combining two arrows to create a new arrow.
2. We can “scale” an element to create a new element, just like
stretching or squeezing an arrow to create a new arrow.
Vector space

Definition. A set V along with a vector addition “ + ” : V × V → V


and a scalar multiplication “ · ” : R × V → V are jointly called a
vector space (in which elements are called vectors) if and only if:
1. v + u = u + v ∀v, u ∈ V
2. (v + u) + w = v + (u + w) ∀v, u, w ∈ V
3. There exists a zero vector in V (denoted as 0) such that
0 + v = v ∀v ∈ V
4. For each v ∈ V there exists u ∈ V such that v + u = 0
5. a(b · v) = (ab) · v ∀v ∈ V , ∀a, b ∈ R
6. 1 · v = v ∀v ∈ V
7. a · (v + u) = a · v + a · u ∀v, u ∈ V ∀a ∈ R
8. (a + b) · v = a · v + b · v ∀v ∈ V ∀a, b ∈ R

▶ It is important to distinguish between vector addition and numerical
addition, and between scalar multiplication and numerical multiplication.
Proposition 4.1 (Cancellation rule of vector addition) Given a vector space
(V , +, ·) and u, v, w ∈ V , if u + v = w + v, then u = w.

Proof. Fix u, v, w such that u + v = w + v. Let x be the vector such that
v + x = 0. Thus u = u + 0 = u + (v + x) = (u + v) + x = (w + v) + x
= w + (v + x) = w + 0 = w.

Proposition 4.2 Given a vector space (V , +, ·):
1. There is a unique zero vector 0
2. For each v ∈ V , there is a unique u ∈ V such that v + u = 0
3. a · 0 = 0 ∀a ∈ R
4. 0 · v = 0 ∀v ∈ V
5. If a · v + b · u = 0 where a ≠ 0, then v = (−b/a) · u

Proof. (1) If 0 + v = v = 0̂ + v, then 0 = 0̂ by Proposition 4.1. (2) Suppose
v + u = 0 = v + w, then u = w by Proposition 4.1. (3) a · u + a · 0
= a · (u + 0) = a · u = a · u + 0, thus a · 0 = 0 by Proposition 4.1. (4)
v + 0 = v = 1 · v = (1 + 0) · v = 1 · v + 0 · v = v + 0 · v, thus 0 · v = 0 by
Proposition 4.1. (5) a · v + b · u = 0 → (1/a) · (a · v + b · u) = (1/a) · 0 = 0
→ v + (b/a) · u = 0 → v + ((b/a) · u + (−b/a) · u) = 0 + (−b/a) · u
→ v + 0 · u = (−b/a) · u → v = (−b/a) · u.
Example 1. A familiar vector space: (Rn , +, ·)
▶ Rn = {(x1 , ..., xn ) : x1 , ..., xn ∈ R}
▶ Vector addition:
(x1 , ..., xn ) + (y1 , ..., yn ) = (x1 + y1 , ..., xn + yn )
▶ Scalar multiplication:
a · (x1 , ..., xn ) = (ax1 , ..., axn )
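As a quick numerical illustration (a minimal NumPy sketch, with the vectors chosen arbitrarily), componentwise addition and scaling in R3:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0])      # a vector in R^3
    y = np.array([4.0, 5.0, 6.0])

    print(x + y)            # vector addition: [5. 7. 9.]
    print(2.5 * x)          # scalar multiplication: [2.5 5.  7.5]
    print(x + (-1.0) * x)   # additive inverse yields the zero vector: [0. 0. 0.]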

Example 2. A less familiar vector space: (F, +, ·)
▶ F = {f : R → R} (the set of all single-variable real functions)
▶ Vector addition:
f + g = h where h(x) := f (x) + g(x) ∀x ∈ R
▶ Scalar multiplication:
a · f = k where k(x) := af (x) ∀x ∈ R
(Exercise: verify that (F, +, ·) is indeed a vector space.)
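The operations on F can also be sketched in code: f + g and a · f are new functions defined pointwise. A minimal Python sketch (the particular f and g are arbitrary choices):

    import math

    def add(f, g):
        # vector addition in F: (f + g)(x) = f(x) + g(x)
        return lambda x: f(x) + g(x)

    def scale(a, f):
        # scalar multiplication in F: (a · f)(x) = a * f(x)
        return lambda x: a * f(x)

    f = math.sin
    g = lambda x: x ** 2
    h = add(f, g)          # h(x) = sin(x) + x^2
    k = scale(3.0, f)      # k(x) = 3 sin(x)
    print(h(1.0), k(1.0))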


Definition. Given a vector space (V , +, ·), (W , +W , ·W ) is said
to be a subspace of (V , +, ·) if and only if
1. (W , +W , ·W ) is also a vector space
2. W ⊂ V
3. v +W u = v + u ∀v, u ∈ W
4. a ·W v = a · v ∀v ∈ W ∀a ∈ R

3 and 4 say that +W /·W are identical to +/· when restricted to


W . We can thus simply write (W , +, ·) instead of (W , +W , ·W ).

Proposition 4.3. (W , +, ·) is a subspace of (V , +, ·) if W ⊂ V


and
1. W is closed under +: u + v ∈ W ∀u, v ∈ W
2. W is closed under ·: a · v ∈ W ∀v ∈ W , ∀a ∈ R

(Proof left as exercise)


→ ({0}, +, ·) and (R, +, ·) itself are the only subspaces of (R, +, ·).
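Closure is a universally quantified property, so code can only spot-check it. Here is a small sketch that tests the two conditions of Proposition 4.3 on random samples for the candidate subspace W = {(x, 2x) : x ∈ R} of R2 (the set W, the helper in_W and the sampling scheme are illustrative choices, not part of the notes):

    import numpy as np

    def in_W(v, tol=1e-9):
        # membership test for W = {(x, 2x)}: second entry is twice the first
        return abs(v[1] - 2 * v[0]) < tol

    rng = np.random.default_rng(0)
    for _ in range(1000):
        x1, x2, a = rng.normal(size=3)
        u, v = np.array([x1, 2 * x1]), np.array([x2, 2 * x2])
        assert in_W(u + v)      # closed under +
        assert in_W(a * u)      # closed under ·
    print("closure spot-check passed")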
Linear Combination

Definition. Given a vector space (V , +, ·), v ∈ V is said to be a


linear combination of u1 , ..., un ∈ V with coefficients
a1 , ..., an ∈ R if

v = a1 · u1 + ... + an · un .

Definition. Given a vector space (V , +, ·) and a nonempty set of


vectors S ⊂ V , the span of S is the set of all vectors which are
linear combinations of vectors in S.

▶ span(S) contains every vector that can be obtained as a linear
combination of vectors in S.

Proposition 4.4 Given a vector space (V , +, ·) and a nonempty set


of vectors S ⊂ V , (span(S), +, ·) is a subspace of (V , +, ·).

An implication of Proposition 4.3. Proof left as exercise.


Definition. Given a vector space (V , +, ·), a finite set of
vectors {v1 , ..., vn } ⊂ V are said to be linearly independent
if a1 · v1 + ... + an · vn = 0 implies a1 = ... = an = 0.

If a set of vectors are not linearly independent, they are said to


be linearly dependent.
e.g. {(0, 1), (1, 0)} are linearly independent in (R2 , +, ·).

Proposition 4.5 Given a vector space (V , +, ·), a finite set of


vectors S are linearly independent if and only if no vector in
S is a linear combination of the other vectors in S.

▶ Useful in checking linear independence.
→ S ⊂ R2 is linearly dependent if |S| > 2 and (0, 1), (1, 0) ∈ S.

Proof. “If”: Suppose no vector in S = {v1 , ..., vn } is a linear combination of the
other vectors in S. Let a1 · v1 + ... + an · vn = 0. If the vectors in S were not linearly
independent, i.e. ai ≠ 0 for some i, then we would have vi = ∑j≠i (−aj /ai ) · vj , so vi
would be a linear combination of the other vectors in S, a contradiction.
“Only if”: Suppose the vectors in S are linearly independent. If vi = ∑j≠i aj · vj for
some coefficients (aj )j≠i , then ∑j≠i aj · vj + (−1) · vi = 0, a contradiction.
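Numerically, linear independence of finitely many vectors in Rn can be checked by stacking them as columns of a matrix and comparing the rank with the number of vectors. A minimal NumPy sketch (matrix_rank uses a tolerance-based criterion, so this is a numerical rather than exact test):

    import numpy as np

    def linearly_independent(vectors):
        # vectors: list of 1-D arrays of equal length
        M = np.column_stack(vectors)
        return np.linalg.matrix_rank(M) == len(vectors)

    print(linearly_independent([np.array([0., 1.]), np.array([1., 0.])]))   # True
    print(linearly_independent([np.array([1., 1.]), np.array([2., 2.])]))   # False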
Basis

Definition. Given a vector space (V , +, ·), S ⊂ V is a basis of it if


1. V = span(S)
2. Vectors in S are linearly independent

▶ A vector space can have many bases. The following are all bases of
(R2 , +, ·):
{(0, 1), (1, 0)}    {(0, 2), (3, 0)}    {(1, 1), (1, −1)}

▶ But all bases have the same size:

Proposition 4.6 (Dimension Theorem) If (V , +, ·) has a finite basis, then


every basis has the same number of vectors.

(Proof omitted)
▶ This common size of bases is called the dimension of the vector space.
▶ That’s why we call (Rn , +, ·) “n-dimensional”.
▶ Any linearly independent set of n vectors in Rn is a basis.
▶ Canonical basis of Rn : {e1 , ..., en } where ei = (0, ..., 0, 1, 0, ..., 0),
with the 1 in the ith entry.
Proposition 4.7 If (V , +, ·) has dimension n, then:
1. If V = span(S) for some S ⊂ V , then |S| ≥ n.
2. If vectors in S ⊂ V are linearly independent and |S| = n,
then S is a basis of (V , +, ·).
3. If vectors in S ⊂ V are linearly independent and |S| < n,
then there exists a basis S0 where S ⊂ S0 .
4. If (W , +, ·) is a subspace of (V , +, ·), then its dimension
is no greater than n.

Proof left as exercise (all are more or less immediate


implications of Proposition 4.6)
Linear Transformations
Linear transformations are mappings from one vector space to
another that preserve vector addition and scalar multiplication.

Matrices, as we shall see, are closely related to linear


transformations. Understanding linear transformations is
helpful to understanding matrix operations.

We will focus on linear transformations between the spaces Rn ,
equipped with the standard + and · operations.

Definition. A mapping T : Rn → Rm is a linear


transformation if and only if
1. T (v + u) = T (v) + T (u) ∀v, u ∈ Rn .
2. T (a · v) = a · T (v) ∀v ∈ Rn ∀a ∈ R.

▶ Condition 1: preservation of vector addition
▶ Condition 2: preservation of scalar multiplication
Example 1: Projection to the horizontal axis

T : R2 → R2 , T (x1 , x2 ) = (x1 , 0)

Example 2: Matrix multiplication

T : Rn → Rm , T (v) = Av where A is an m × n matrix
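Both examples can be spot-checked numerically. A minimal NumPy sketch representing the projection as a matrix and testing the two defining conditions on a few arbitrary vectors:

    import numpy as np

    A = np.array([[1., 0.],
                  [0., 0.]])          # projection onto the horizontal axis
    T = lambda v: A @ v

    v, u, a = np.array([3., 4.]), np.array([-1., 2.]), 2.5
    print(np.allclose(T(v + u), T(v) + T(u)))   # preservation of addition: True
    print(np.allclose(T(a * v), a * T(v)))      # preservation of scaling: True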

Proposition 4.8 For any linear transformation T : Rn → Rm :
1. T (0n ) = 0m
2. (T (Rn ), +, ·) is a subspace of (Rm , +, ·).
(Recall T (Rn ) := {w ∈ Rm : w = T (v) for some v ∈ Rn }.)

(Proof left as exercise)

We call T (Rn ) the range of T , and the dimension of the


subspace (T (Rn ), +, ·) the rank of T .
▶ Thus rank(T ) ≤ m by Proposition 4.7(4).
Matrix Representation of T
We have already seen that matrix multiplication is a linear
transformation. In fact, every linear transformation T is equivalent to
a matrix multiplication.

Given a linear transformation T : Rn → Rm , here’s how to construct


the matrix AT that is equivalent to T , i.e. T (v) = AT v ∀v:

The ith column of AT is exactly T (ei ), written as a column vector
(recall ei = (0, ..., 0, 1, 0, ..., 0) with the 1 in the ith entry).

→ Each column has m rows because T (ei ) ∈ Rm has m entries, and


there are n columns because there are n ei ’s, so AT is m × n.
Example: T : R2 → R2 where T (x1 , x2 ) = (x1 , 0)
▶ T (1, 0) = (1, 0), so the first column of AT is (1, 0), written as a column.
▶ T (0, 1) = (0, 0), so the second column of AT is (0, 0), written as a column.
▶ Hence
      AT = [ 1  0 ]
           [ 0  0 ]
   Verify: AT (x1 , x2 ) = (1·x1 + 0·x2 , 0·x1 + 0·x2 ) = (x1 , 0) = T (x1 , x2 ).
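The construction of AT can be automated: apply T to each standard basis vector and use the results as columns. A minimal sketch (build_matrix is a hypothetical helper name, and T is assumed to be supplied as a Python function that really is linear):

    import numpy as np

    def build_matrix(T, n):
        # columns of A_T are T(e_1), ..., T(e_n)
        return np.column_stack([T(np.eye(n)[:, i]) for i in range(n)])

    T = lambda v: np.array([v[0], 0.0])   # the projection example above
    A_T = build_matrix(T, 2)
    print(A_T)                            # [[1. 0.]
                                          #  [0. 0.]]
    v = np.array([3., 4.])
    print(np.allclose(A_T @ v, T(v)))     # True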
Proposition 4.9 Let T : Rn → Rm be a linear transformation. Then
range(T ) = span({T (e1 ), ..., T (en )}), i.e. every v ∈ range(T )
can be expressed as a linear combination of T (e1 ), ..., T (en ).

▶ Note: although e1 , ..., en form a basis of Rn , T (e1 ), ..., T (en )
do not necessarily form a basis of range(T ) because there
can be linear dependence (e.g. T (x1 , x2 ) = (x1 , 0))
▶ Thus rank(T ) ≤ min{m, n}.

Proposition 4.10 Let T : Rn → Rm be a linear transformation


and let AT := [T (e1 ) ... T (en )]. Then T (v) = AT v ∀v ∈ Rn .

Proof of Propositions 4.9 and 4.10. Pick any v = (v1 , ..., vn ) and let
u = (u1 , ..., um ) := T (v). Note that v = v1 · e1 + ... + vn · en . Thus
u = T (v) = T (v1 · e1 + ... + vn · en ) = v1 · T (e1 ) + ... + vn · T (en ). This proves
Proposition 4.9. Now we wish to show that uk is the kth entry of AT v for every
k = 1, ..., m. We have uk = v1 Tk (e1 ) + ... + vn Tk (en ) where Tk (ei ) denotes the
kth entry of T (ei ). Given the construction of AT , the kth entry of AT v is also
v1 Tk (e1 ) + ... + vn Tk (en ).
Thus there is a nice bijection T ↦ AT between all linear
transformations from Rn to Rm and m × n matrices:
▶ Each T is represented by a unique matrix AT .
▶ Each A represents some linear transformation (and we
denote it as TA ).
In short, ∀v ∈ Rn :
T (v) = AT v        Av = TA (v)

It turns out that the product of two matrices nicely represents


the composite of two linear transformations.

Proposition 4.11 If T : Rn → Rm and S : Rm → Rl are two


linear transformations, then the matrix AS AT represents
S ◦ T : Rn → Rl , i.e.

AS AT v = S(T (v)) ∀v ∈ Rn .

(Proof left as exercise)
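Proposition 4.11 can be spot-checked numerically: the matrix of the composite equals the matrix product. A minimal sketch with arbitrarily chosen S and T and a random test vector:

    import numpy as np

    A_T = np.array([[1., 2.],
                    [0., 1.],
                    [3., 0.]])        # T : R^2 -> R^3
    A_S = np.array([[1., 0., -1.],
                    [2., 1.,  0.]])   # S : R^3 -> R^2

    S = lambda w: A_S @ w
    T = lambda v: A_T @ v

    v = np.random.default_rng(1).normal(size=2)
    print(np.allclose((A_S @ A_T) @ v, S(T(v))))   # True: A_S A_T represents S ∘ T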


A special kind of linear transformation maps Rn to Rn .
▶ Domain = co-domain
▶ Sometimes called self-transformations.
▶ The corresponding matrix is square, i.e. n × n.

Definition. Linear transformation T : Rn → Rn is said to be


invertible if there exists another mapping U : Rn → Rn such
that U (T (v)) = v ∀v ∈ Rn . U is called the inverse of T and is
denoted as T −1 .

Proposition 4.12 Given a linear transformation T : Rn → Rn ,


its inverse T −1 is also a linear transformation. Moreover, T
is the inverse of T −1 .

(Proof left as exercise) Here’s a useful result which we will

state without proof.

Proposition 4.13 If T is a self-transformation and is


one-to-one, then it is onto, and hence, a bijection.
Proposition 4.14 Given a linear transformation T : Rn → Rn ,
the following statements are equivalent:
1. T is invertible
2. T is a bijection
3. There exists some n × n matrix, denoted as AT−1 , such
that AT−1 AT = In

 
(Recall: In is the n × n identity matrix
      [ 1  0  ...  0 ]
      [ 0  1  ...  0 ]
      [ ...          ]
      [ 0  0  ...  1 ]
which represents the identity transformation T (v) = v.)
Proof. (1) → (2): Suppose T is invertible but not one-to-one, then T −1 (T (v))
would contain more than one element for some v, a contradiction. Then T is a
bijection by Proposition 4.13. (2) → (3): Since T is a bijection, it has an
inverse mapping U : Rn → Rn such that U (T (v)) = v ∀v ∈ Rn , and let
AT−1 := AU , so from U (T (v)) = v ∀v we have AT−1 AT = In because In is the only
matrix such that In v = v ∀v. (3) → (1): Let U be the linear transformation AT−1
represents. Then AT−1 AT v = In v = v ∀v, which implies U (T (v)) = v ∀v.
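Numerically, the matrix in statement 3 of Proposition 4.14 can be computed with np.linalg.inv. A minimal sketch with an arbitrarily chosen invertible matrix:

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 1.]])
    A_inv = np.linalg.inv(A)
    print(np.allclose(A_inv @ A, np.eye(2)))   # True: A^{-1} A = I_n

    v = np.array([3., -4.])
    print(np.allclose(A_inv @ (A @ v), v))     # U(T(v)) = v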
The Rank of a Matrix
The rank of a matrix A is the rank of the linear transformation TA it
represents.

Proposition 4.15 Given an m × n matrix A:


1. rank(A) ≤ min{m, n}.
2. rank(A) is the same as the maximum number of its linearly
independent columns.
3. rank(A) = rank(At ) (where At denotes the transpose of A).
4. rank(A) is the same as the maximum number of its linearly
independent rows.

((1) and (2) are immediate implications of Proposition 4.9. Proof for (3)
is omitted. (4) immediately follows (3).)
▶ Example:
      rank [ 1  2  3 ]
           [ 1  2  4 ]  =  2
           [ 2  4  7 ]
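The rank in the example can be confirmed with np.linalg.matrix_rank (a numerical check based on singular values, so it is subject to floating-point tolerances):

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [1., 2., 4.],
                  [2., 4., 7.]])
    print(np.linalg.matrix_rank(A))    # 2: the third row is the sum of the first two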
Matrix Invertibility

Definition. A square matrix A is said to be invertible if and


only if there exists another square matrix, denoted as A−1
and called the inverse of A, such that A−1 A = In .

▶ By Proposition 4.14, A is invertible if and only if the linear
transformation TA is invertible.
▶ Also, if A is invertible, then A−1 is also invertible, and A is
the inverse of A−1 .

Proposition 4.16 An n × n matrix is invertible if and only if


its rank is n.

(Proof omitted)
▶ Invertibility ↔ TA (e1 ), ..., TA (en ) constitute a basis of Rn .
▶ Invertibility of A can be verified by checking whether all columns or
all rows are linearly independent.
Linear Equations
A system of linear equations has the form

      A        x    =    b
   (m × n)  (n × 1)    (m × 1)

where A is a given matrix and b is a given vector in Rm .


▶ x, on the other hand, is an unknown vector in Rn .

Since A represents TA : Rn → Rm such that TA (v) = Av ∀v ∈ Rn :


−→ Solving the system is the same as answering:
Which vector is transformed to b under T ?

▶ Unique solution: unique v transformed to b by T .
▶ No solution: no v transformed to b by T (T is not onto!)
▶ Multiple solutions: many v transformed to b by T (T is not one-to-one!)
 
Example: A = [ 1  0 ]
             [ 0  0 ]
▶ Recall: A represents the projection transformation T (x1 , x2 ) = (x1 , 0).
▶ If b is on the x-axis (b = (y, 0) for some y): many vectors are
transformed to it → Ax = b has many solutions.
▶ If b is not on the x-axis: no vector is transformed to it → Ax = b has no
solution.
An important case is when there are n equations and n unknowns (A is n × n).

Proposition 4.17 Given an n × n matrix A:


1. If A is invertible, then Ax = b has a unique solution for any b ∈ Rn ,
and this solution is x = A−1 b.
2. If A is not invertible, then Ax = b either has no solution or has
multiple solutions for each b ∈ Rn .

▶ If A is invertible, it represents a bijective self-transformation (Proposition
4.14), so every b in the codomain is associated with a unique x in the domain.
▶ If A is not invertible, so rank(A) < n, it represents a transformation that
“squeezes” Rn into a lower-dimensional subspace of dimension rank(A).
  ▶ Multiple vectors in the domain are squeezed into the same vector in
  the codomain: this is where there are multiple solutions.
  ▶ A lot of “blank space” in the codomain: this is where there is no solution.
Proof. (1): Clearly x = A−1 b is a solution, as A(A−1 b) = (AA−1 )b = In b = b.
Also if Ax = b and Ax′ = b, then left-multiplying both sides of both equations
by A−1 we have x = A−1 b = x′. (2): Suppose A is not invertible, yet Ax = b
has a unique solution for some b. By a result we have not stated, if a linear
transformation T : Rn → Rn is one-to-one at one vector (i.e. Ax = b has a
unique solution for some b), then it is a bijection and hence invertible by
Proposition 4.14, a contradiction.
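In practice the invertible case is solved with np.linalg.solve rather than by forming A−1 explicitly, while a singular coefficient matrix raises an error. A minimal sketch with an arbitrarily chosen system:

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 1.]])           # invertible (rank 2)
    b = np.array([3., 2.])
    x = np.linalg.solve(A, b)          # unique solution of Ax = b
    print(x, np.allclose(A @ x, b))    # [1. 1.] True

    A_sing = np.array([[1., 0.],
                       [0., 0.]])      # the projection matrix: not invertible
    try:
        np.linalg.solve(A_sing, np.array([1., 1.]))   # b off the x-axis: no solution
    except np.linalg.LinAlgError as e:
        print("singular matrix:", e)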
Eigenvalues and Eigenvectors
Let’s still think about self-transformations T : Rn → Rn .
▶ In general, a vector v is transformed to something T (v) quite
different from itself.
▶ But it is possible that a certain vector is “essentially”
transformed to itself, just rescaled:
T (v) = λ · v for some λ ∈ R.
▶ Eigen: “self/own” in German
Let’s focus on matrix representations of transformations.

Definition. Given an n × n matrix A, v ∈ Rn where v ≠ 0n is an
eigenvector of A if and only if

Av = λv · v for some λv ∈ R.

In this case, we call λv the eigenvalue of v.
▶ We do not consider 0n as an eigenvector, although A0n = 0n .
▶ Thus, TA merely rescales v by the scalar λv .
Clearly, if v is an eigenvector of A, then a · v is also an eigenvector of A for any
a ≠ 0, and it has the same eigenvalue as v does. This can be generalized:

Proposition 4.18 Given an n × n matrix A:
1. If E ⊂ Rn is a set of eigenvectors with distinct eigenvalues, then
the vectors in E are linearly independent.
2. A has at most n distinct eigenvalues.
3. If A has n distinct nonzero eigenvalues, then it is invertible.

(Proof of (1) is omitted. (2) and (3) follow (1) immediately.)
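Eigenvalues and eigenvectors can be computed with np.linalg.eig, which returns the eigenvalues together with a matrix whose columns are corresponding eigenvectors. A minimal sketch with an arbitrarily chosen symmetric matrix:

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 2.]])
    eigvals, eigvecs = np.linalg.eig(A)
    print(eigvals)                              # e.g. [3. 1.]
    v = eigvecs[:, 0]                           # eigenvector for the first eigenvalue
    print(np.allclose(A @ v, eigvals[0] * v))   # True: A v = λ v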

When A has n distinct eigenvalues, it is said to be diagonalizable, because
instead of using the standard basis {e1 , ..., en }, we can use E = {x1 , ..., xn }
(with eigenvalues λ1 , ..., λn ) as a basis and give each v the coordinates
(v1 , ..., vn ) if v = v1 · x1 + ... + vn · xn . With this new coordinate representation:
T (v) = T (v1 · x1 + ... + vn · xn ) = v1 · T (x1 ) + ... + vn · T (xn )
      = (λ1 v1 ) · x1 + ... + (λn vn ) · xn = (λ1 v1 , ..., λn vn )

      = [ λ1  ...  0  ]
        [ ...  λi  ... ]  v.
        [ 0   ...  λn ]
In other words, by changing the basis to E, the matrix representation of the
same transformation TA is now diagonal.
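This change of basis can be verified numerically: with P holding the eigenvectors of A as columns, P−1 AP is the diagonal matrix of eigenvalues. A minimal sketch reusing the example matrix from the previous snippet:

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 2.]])
    eigvals, P = np.linalg.eig(A)      # columns of P are the eigenvectors x_1, ..., x_n
    D = np.linalg.inv(P) @ A @ P       # representation of T_A in the basis E
    print(np.allclose(D, np.diag(eigvals)))   # True: diagonal with λ_1, ..., λ_n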
