Here we denote the mapping by the letter T, which stands for transformation.
Notation:
T : R2 → R2 defined by T(u) = Au for all u in R2
We shall call such a mapping a linear transformation.
2 x 2 Linear Transformation (slide 3)
We process this input vector like before and get the output vector Au:
T : R2 → R2 given by
The geometrical interpretation of this matrix can now be seen more clearly by considering a slightly
more complicated diagram, like the letter F.
We see that the resulting diagram is the letter F “lying down”. In fact, the effect of this
transformation is a 90° counterclockwise rotation about the origin.
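The matrix itself is not reproduced in this text, but a 90° counterclockwise rotation about the origin has the standard matrix with columns (0, 1) and (−1, 0). A minimal numpy sketch (the F coordinates below are made up purely for illustration):

```python
import numpy as np

# Standard matrix of a 90-degree counterclockwise rotation about the origin.
A = np.array([[0, -1],
              [1, 0]])

# A few corner points tracing a letter F (illustrative coordinates),
# stored one point per column so that A @ F transforms them all at once.
F = np.array([[0, 0], [0, 3], [2, 3], [0, 2], [1, 2]]).T

rotated = A @ F   # T(u) = Au applied to every column
print(rotated.T)  # the F now "lies down"
```

Applying A four times returns every point to where it started, as expected of a quarter turn.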
Geometric Transformations of R2 (slide 4)
Indeed, many 2 x 2 matrices give transformations that have interesting geometrical interpretations on
the xy-plane. We shall give the four basic types of transformation, namely rotation, reflection, scaling
and shearing.
Rotation
Reflection
The matrix [ −1 0 ; 0 1 ] represents the reflection about the y-axis,
while the matrix [ 1 0 ; 0 −1 ] gives the reflection about the x-axis.
Scaling
The matrix [ a 0 ; 0 a ] represents an enlargement if a > 1, and a contraction if 0 < a < 1.
When the two diagonal entries are different, the x and y scaling will not have the same proportion.
Shearing
The matrix [ 1 k ; 0 1 ] has a shearing effect parallel to the x-axis; how much the letter is slanted
depends on the off-diagonal entry k. Similarly, the matrix [ 1 0 ; k 1 ] has a shearing effect parallel
to the y-axis.
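All four families of matrices can be written down and experimented with directly. A small numpy sketch (the parameter values theta, a and k below are arbitrary illustrative choices, not from the text):

```python
import numpy as np

theta, a, k = np.pi / 2, 2.0, 1.5   # illustrative parameter choices

rotation  = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])  # rotation by theta
reflect_y = np.array([[-1.0, 0.0], [0.0, 1.0]])          # reflection about y-axis
reflect_x = np.array([[1.0, 0.0], [0.0, -1.0]])          # reflection about x-axis
scaling   = np.array([[a, 0.0], [0.0, a]])               # enlargement since a > 1
shear_x   = np.array([[1.0, k], [0.0, 1.0]])             # shear parallel to x-axis
shear_y   = np.array([[1.0, 0.0], [k, 1.0]])             # shear parallel to y-axis

u = np.array([1.0, 1.0])
for name, M in [("rotation", rotation), ("reflect_y", reflect_y),
                ("reflect_x", reflect_x), ("scaling", scaling),
                ("shear_x", shear_x), ("shear_y", shear_y)]:
    print(name, M @ u)
```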
This gives the formula for T:
        [ x1 ]   [ a11 x1 + a12 x2 + … + a1n xn ]
    T ( [ x2 ] ) = [ a21 x1 + a22 x2 + … + a2n xn ]
        [ ⋮  ]   [               ⋮               ]
        [ xn ]   [ am1 x1 + am2 x2 + … + amn xn ]
We can regard x and y as two parameters, and rewrite the formula as a linear combination as shown
in the middle term below:
Then this linear combination can be rewritten in matrix form with (1, 2, 0) and (1, 0, −3) forming the
two columns of a 3 x 2 matrix

        [ 1  1 ]
    A = [ 2  0 ]
        [ 0 −3 ]
So we see that T is a linear transformation with this given standard matrix A.
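As a quick check, the standard matrix reconstructed above can be applied in numpy; the images of e1 and e2 come out as exactly the columns of A:

```python
import numpy as np

A = np.array([[1,  1],
              [2,  0],
              [0, -3]])   # the 3 x 2 standard matrix from the text

def T(u):
    """The linear transformation T : R^2 -> R^3, T(u) = Au."""
    return A @ u

print(T(np.array([1, 0])))  # first column of A
print(T(np.array([0, 1])))  # second column of A
```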
Properties of Linear Transformation (slide 8)
If T : Rn → Rm is a linear transformation, then
1. T(0) = 0 (T preserves zero vector)
2. T(u + v) = T(u) + T(v) (T preserves addition)
3. T(cu) = cT(u) (T preserves scalar multiplication)
4. T( c1u1 + c2u2 + ··· + ckuk ) = c1T(u1) + c2T(u2) + ··· + ckT(uk) (T preserves linear combinations)
All these properties are referred to as the linearity properties of the linear transformation, and they
can be verified easily by rewriting the map T as multiplication of the input vector by the standard
matrix A. For example, in the last property, the left hand side can be written as pre-multiplying the
linear combination by A:

    T( c1u1 + c2u2 + ··· + ckuk ) = A( c1u1 + c2u2 + ··· + ckuk ) = c1Au1 + c2Au2 + ··· + ckAuk
                                  = c1T(u1) + c2T(u2) + ··· + ckT(uk)
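The four properties can also be checked numerically for any standard matrix. The matrix below is randomly generated, purely for illustration:

```python
import numpy as np

# A randomly generated standard matrix; any A gives a linear map u -> Au.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))
T = lambda u: A @ u

u, v = rng.standard_normal(2), rng.standard_normal(2)
c1, c2 = 2.5, -1.0

assert np.allclose(T(np.zeros(2)), np.zeros(3))        # T preserves the zero vector
assert np.allclose(T(u + v), T(u) + T(v))              # T preserves addition
assert np.allclose(T(c1 * u), c1 * T(u))               # T preserves scalar multiplication
assert np.allclose(T(c1*u + c2*v), c1*T(u) + c2*T(v))  # T preserves linear combinations
print("all four linearity properties hold")
```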
Based on this piece of information, how do we find the images of e1 = (1, 0) and e2 = (0, 1) under T?
We shall see that the linearity property of T can help us.
First observe that the two input vectors form a basis for R2 (we just need to check that they are
linearly independent). This means it is possible to express e1 = (1, 0) and e2 = (0, 1) as linear
combinations of this basis.
    (1, 0) = ½(1, 1) + ½(1, −1)   (use Gaussian elimination, or simply by inspection)

    T( (1, 0) ) = T( ½(1, 1) + ½(1, −1) ) = ½ T( (1, 1) ) + ½ T( (1, −1) )   (by linearity property)
                = ½ (4, 2) + ½ (0, 6) = (2, 4)   (from the given information above)
Likewise,
(0, 1) = ½(1, 1) − ½(1, −1)
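The same computation can be scripted: to apply T to any vector, solve for its coordinates relative to the basis (1, 1), (1, −1) and combine the given images with those coefficients. The images T(1, 1) = (4, 2) and T(1, −1) = (0, 6) are read off from the worked computation above:

```python
import numpy as np

# Images of the basis vectors, taken from the worked example:
# T(1, 1) = (4, 2) and T(1, -1) = (0, 6).
basis = np.array([[1.0,  1.0],
                  [1.0, -1.0]])   # basis vectors as columns
images = np.array([[4.0, 0.0],
                   [2.0, 6.0]])   # corresponding images as columns

def T(v):
    c = np.linalg.solve(basis, v)  # coordinates of v relative to the basis
    return images @ c              # combine the images with the same coefficients

print(T(np.array([1.0, 0.0])))    # image of e1: (2, 4), matching the text
print(T(np.array([0.0, 1.0])))    # image of e2
```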
Images of a basis (slide 10)
From the above example, we have the following generalization:
Given linear transformation T : Rn → Rm and a basis {u1, u2, …, un} for Rn.
For any v in Rn, we have v = c1u1 + c2u2 + ··· + cnun and hence T(v) = c1T(u1) + c2T(u2) + ··· + cnT(un).
We say:
the linear transformation T is completely determined by the images T(u1), T(u2), …, T(un) of the basis
{u1, u2, …, un}.
3.2 Eigenvalues and Eigenvectors
Given a square matrix, we can talk about the eigenvalues and eigenvectors associated with this matrix.
They are probably the most commonly used concepts in linear algebra for solving engineering
problems. In Chapter 4, we shall see how these concepts are used in solving systems of differential
equations.
Let x be a nonzero (input) vector in Rn. If the output vector Ax is a scalar multiple of x, say Ax = λx,
then we call x an eigenvector of A, and the scalar λ the eigenvalue of A associated with x.
Geometrically (in 2- or 3-spaces), the arrows representing x and Ax are parallel, pointing either in the
same or opposite directions.
    A = [ 0.96  0.01 ]
        [ 0.04  0.99 ]

and

    B = [ 1  1  1 ]       x = (1, 1, 1),   y = (1, 0, −1),   z = (1, −2, 1)
        [ 1  1  1 ]
        [ 1  1  1 ]
First, we have

    B(kx) = [ 1  1  1 ] [ k ]   [ 3k ]
            [ 1  1  1 ] [ k ] = [ 3k ] = 3(kx)
            [ 1  1  1 ] [ k ]   [ 3k ]
In general,

    Bz = [ 1  1  1 ] [  1 ]   [ 0 ]       [  1 ]
         [ 1  1  1 ] [ −2 ] = [ 0 ] = 0 · [ −2 ] = 0z
         [ 1  1  1 ] [  1 ]   [ 0 ]       [  1 ]
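These eigenvector claims are easy to verify numerically:

```python
import numpy as np

B = np.ones((3, 3))
x = np.array([1, 1, 1])
y = np.array([1, 0, -1])
z = np.array([1, -2, 1])

assert np.allclose(B @ x, 3 * x)   # eigenvalue 3
assert np.allclose(B @ y, 0 * y)   # eigenvalue 0
assert np.allclose(B @ z, 0 * z)   # also eigenvalue 0, and z is not a multiple of y
assert np.allclose(B @ (5 * x), 3 * (5 * x))   # scalar multiples stay eigenvectors
print("all checks pass")
```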
Note:
1. Any nonzero scalar multiple of an eigenvector is again an eigenvector associated with the same
eigenvalue (as the computation with kx above shows).
2. Two eigenvectors that are not scalar multiples of each other may have the same eigenvalue.
Let A be an n x n square matrix. Without being given any input vector, how do we find the eigenvalues
of A? We will derive the answer using concepts that we have learned in earlier chapters.
So the condition now becomes:

    (λI – A)x = 0   (*)   with x ≠ 0   (△)

The matrix equation (*) with x as unknown represents a homogeneous system with the coefficient
matrix λI – A.
Condition (△) implies this linear system has a non-trivial solution. Now recall that a homogeneous
system having a non-trivial solution means the coefficient matrix λI – A is singular, and

    det(λI – A) = 0   (†)
Now if we expand the determinant using the inductive expansion (see section 1.7), the equation (†)
will turn into a polynomial equation in terms of λ. In order for us to find the eigenvalue λ, we need
to solve this polynomial equation.
    det(λI – A) = (λ − 0.96)(λ − 0.99) − (0.01)(0.04) = λ² − 1.95λ + 0.95 = (λ − 1)(λ − 0.95) = 0

which gives the solutions λ = 1 and 0.95.
Hence the eigenvalues of A are 1 and 0.95.
Characteristic Polynomial (slide 8)
Let A be n x n square matrix. Then det(λI – A) is a polynomial of degree n. This polynomial is called the
characteristic polynomial of A.
Example: The characteristic polynomial of A = [ 0.96 0.01 ; 0.04 0.99 ] is

    det(λI – A) = λ² − 1.95λ + 0.95 = (λ − 1)(λ − 0.95)
From the earlier discussion, we have:
λ is an eigenvalue of A ⇔ det(λI – A) = 0 ⇔ λ is a root of the characteristic polynomial
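This equivalence can be checked numerically: for a 2 x 2 matrix the characteristic polynomial is t² − trace(A)t + det(A), and its roots agree with the eigenvalues numpy computes directly:

```python
import numpy as np

A = np.array([[0.96, 0.01],
              [0.04, 0.99]])

# For a 2 x 2 matrix, det(tI - A) = t^2 - trace(A) t + det(A);
# its roots are the eigenvalues.
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
roots = np.sort(np.roots(coeffs))
print(roots)   # approximately [0.95, 1.0]

assert np.allclose(roots, [0.95, 1.0])
assert np.allclose(np.sort(np.linalg.eigvals(A)), [0.95, 1.0])
```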
It may not be easy to factorize a degree 3 polynomial. A shortcut is to try guessing one possible root.
In this case, it is not difficult to guess the root λ = 1. Ultimately, the complete factorization of this
polynomial is given by

    (λ − 1)(λ − √2)(λ + √2)

The eigenvalues of C are the roots of the above polynomial, which are 1, √2 and −√2.
The characteristic polynomial of this matrix is det(λI – A) = λ² + 1. It has complex roots: λ = ±i.
So this matrix has complex eigenvalues. The corresponding eigenvectors will have complex
components, i.e. they come from a vector space over the complex numbers. We shall see some
applications of complex eigenvalues and eigenvectors in solving systems of differential equations in
the last chapter.
3.3 Eigenspaces
We have seen in the previous section that, given a square matrix, there is a systematic way to find
the eigenvalues by solving the characteristic polynomial. In this section, we shall see how to find the
eigenvectors associated with a certain eigenvalue. The collection of all such eigenvectors, together
with the zero vector, forms a subspace of some n-space. This subspace is called an eigenspace.
Eigenspace (slide 2)
Given an n x n square matrix A.
Suppose λ is an eigenvalue of A. Then det(λI – A) = 0. This means the homogeneous linear system

    (λI – A)x = 0

with coefficient matrix λI – A has non-trivial solutions. In fact,
all the non-trivial solutions of the system (λI – A)x = 0 are the eigenvectors of A associated to λ.
The solution space of this homogeneous system is called the eigenspace of A associated with λ and
we denote this subspace of Rn by Eλ.
If u is a nonzero vector in Eλ, then u is an eigenvector of A associated with the eigenvalue λ.
Example 1: Eigenspace (slide 3-5)
    A = [ 0.96  0.01 ]
        [ 0.04  0.99 ]
In the previous section, we found the eigenvalues of A to be 1 and 0.95 by solving its
characteristic polynomial det(λI – A) = (λ − 1)(λ − 0.95). So correspondingly there are two eigenspaces
E1 and E0.95 associated with eigenvalues 1 and 0.95 respectively. We solve for these two eigenspaces
separately.
Any non-zero scalar multiple of (−1, 1) is an eigenvector of A associated with the eigenvalue 0.95.
We have seen before that the eigenvalues of B are 3 and 0, by solving the characteristic polynomial
det(λI – B) = (λ − 3)λ². So B has two eigenspaces E3 and E0 associated with eigenvalues 3 and 0
respectively.
These eigenspaces can be found in a similar way. Here we will only find the eigenspace E0.
Using Gaussian elimination, we solve the system to get the general solution with two parameters s
and t:
    [ x ]     [ −1 ]     [ −1 ]
    [ y ] = s [  1 ] + t [  0 ]
    [ z ]     [  0 ]     [  1 ]
This means any non-zero linear combination of (−1, 1, 0) and (−1, 0, 1) is an eigenvector of B
associated with the eigenvalue 0. The eigenspace can be expressed as the linear span of these two
vectors:

    E0 = span{ (−1, 1, 0), (−1, 0, 1) }
and these two vectors give a basis for the eigenspace E0.
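One way to compute such an eigenspace numerically is to extract the null space of λI − B from an SVD: the rows of Vt whose singular values vanish span the solution space. A sketch (this numerical approach replaces the Gaussian elimination used in the text):

```python
import numpy as np

B = np.ones((3, 3))
lam = 0.0   # the eigenvalue whose eigenspace we want

# Eigenvectors for lam are the non-trivial solutions of (lam*I - B)x = 0.
M = lam * np.eye(3) - B
_, s, Vt = np.linalg.svd(M)
null_basis = Vt[np.isclose(s, 0.0)]   # basis of the eigenspace E_0

print(len(null_basis))                # dim E_0 = 2
for v in null_basis:
    assert np.allclose(B @ v, lam * v)   # each basis vector is an eigenvector
```

The basis found this way spans the same eigenspace as the vectors (−1, 1, 0) and (−1, 0, 1), even though the individual basis vectors differ.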
We use the 3 x 3 identity matrix as an example, but the result can be generalized to any n x n identity
matrix.
    I3 = [ 1  0  0 ]
         [ 0  1  0 ]
         [ 0  0  1 ]
1. The identity matrix I3 has only one eigenvalue 1.
Property 1 is easy to see, as I3 is a diagonal matrix, and the eigenvalues of a diagonal matrix are all its
diagonal entries, which is just 1 in this case. In fact, if you consider the characteristic polynomial of
the identity matrix, it is given by det(λI3 – I3) = (λ − 1)³, which has only one root, 1.
Property 3 says that any non-zero 3-vector in R3 is an eigenvector. To see this, we look at the matrix
with λ = 1:

    λI3 − I3 = [ 0  0  0 ]
               [ 0  0  0 ]
               [ 0  0  0 ]

    (λI3 – I3)x = 0  ⟹  0x = 0
This homogeneous system with zero matrix as the coefficient is satisfied by every vector in R3. This
shows that the eigenspace E1 is R3.
For the three matrices we have seen in this section, let us recap their characteristic polynomials.
    A = [ 0.96  0.01 ]
        [ 0.04  0.99 ]
The characteristic polynomial of A is given by this factorized form: det(λI – A) = (λ − 1)(λ − 0.95).
Note that the power of each of the two factors (λ − 1) and (λ − 0.95) is 1. We call these powers the
respective multiplicities of the eigenvalues 1 and 0.95.
    B = [ 1  1  1 ]
        [ 1  1  1 ]
        [ 1  1  1 ]

The characteristic polynomial of B is det(λI – B) = (λ − 3)λ².
The factor (λ − 3) corresponding to eigenvalue 3 has power 1, and the factor λ² corresponding to
eigenvalue 0 has power 2. So we say the eigenvalue 3 has multiplicity 1, and the eigenvalue 0 has
multiplicity 2.
    I3 = [ 1  0  0 ]
         [ 0  1  0 ]
         [ 0  0  1 ]
The characteristic polynomial of I3 is det(λI – I3) = (λ − 1)³, which has only one factor, with power 3.
Now let’s look at the dimension of the various eigenspaces. Recall that the dimension is the number
of basis vectors for a certain vector space.
For matrix A, we have dim E1 = 1 and dim E0.95 = 1; for matrix B, dim E3 = 1 and dim E0 = 2; and for
I3, dim E1 = 3.
Observe that in each case the dimension of the eigenspace is equal to the multiplicity of the
corresponding eigenvalue given above.
This equality is not true in general. However, there is always a relationship between the two quantities.
Suppose the characteristic polynomial of a matrix A is factorized as follows, with repeated factors
grouped together:

    det(λI − A) = (λ − λ1)^r1 (λ − λ2)^r2 ⋯ (λ − λk)^rk
What we know from this polynomial is: λ1 to λk are all the eigenvalues of A, and r1 to rk are the
respective multiplicities of these eigenvalues. If A is an n x n matrix, then

    r1 + r2 + … + rk = n
    dim Eλi ≤ ri for all i
The inequality above says that the number of basis vectors in each eigenspace cannot be more than
the multiplicity of the eigenvalue in the characteristic polynomial.
Here’s an example. Suppose the characteristic polynomial of a matrix is (λ − 2)³(λ − 4)²(λ − 1). The
multiplicity of the eigenvalue 2 is 3, so

    dim E2 ≤ 3.

In other words, the dimension can be 1, 2 or 3, but we can’t tell the exact value until we find the
eigenspace explicitly.
Similarly, for E4, the multiplicity of the eigenvalue 4 in the characteristic polynomial is 2, so
dim E4 ≤ 2.
As for E1, since the multiplicity of eigenvalue 1 in the characteristic polynomial is 1, we conclude that
dim E1 = 1
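The dimension of an eigenspace (its geometric multiplicity) can be computed as n − rank(λI − A), since it is the nullity of λI − A. Checking the three matrices of this section:

```python
import numpy as np

def geo_mult(A, lam):
    """Dimension of the eigenspace E_lam, i.e. the nullity n - rank(lam*I - A)."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(lam * np.eye(n) - A)

B = np.ones((3, 3))
print(geo_mult(B, 3))           # 1  (multiplicity of eigenvalue 3 is also 1)
print(geo_mult(B, 0))           # 2  (equals the multiplicity of eigenvalue 0)
print(geo_mult(np.eye(3), 1))   # 3  (equals the multiplicity 3 for I3)
```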
3.4 Diagonalizable Matrices
A diagonal matrix is a special type of matrix that has many nice features. For matrices that are
not diagonal, we would like to “convert” them to diagonal matrices. Those matrices that can
be converted are known as diagonalizable matrices. In this section, we will give a more
precise description, and show how this notion is related to eigenvalues and eigenvectors.
Example: Power of Matrix (slide 2-3)
We start with an example as a motivation, which is about taking powers of square matrices.
    A = [ 0.96  0.01 ]
        [ 0.04  0.99 ]

We want to compute the n-th power

    An = [ 0.96  0.01 ]^n
         [ 0.04  0.99 ]
We cannot simply raise each entry to the power n, because of the way we perform matrix
multiplication. Instead, A can be factorized as

    A = [ 1  1 ] [ 1  0    ] [ 1  1 ]^(−1)
        [ 4 −1 ] [ 0  0.95 ] [ 4 −1 ]

The right hand side is a product of three matrices; how they are derived will be explained later.
(You may try to work backward to verify that the product is indeed equal to A.)
The two outer matrices in the product are inverses of each other, which we denote by P and P-1 and
the matrix in the middle is a diagonal matrix, which we denote by D. So we have
A = PDP –1
Our objective is to raise A to the power n. So we do the same to the product on the right.
An = (PDP –1)n = (PDP –1)(PDP –1)(PDP –1) ··· (PDP –1)   (n times)   (*)
One nice property of matrix multiplication is the associative law, which allows us to rearrange
the parentheses on the right hand side of (*):

    An = PD(P –1P)D(P –1P) ··· (P –1P)DP –1   (**)

We observe that there are many pairs of P –1P side by side in (**). All these intermediate pairs of P –1P
can be cancelled, as each product is the identity matrix, and we are left with a much simpler product:
An = PD D ··· DP –1 = PD nP –1.
Note that on the right hand side, we raise D to the power of n, while the powers of P and P-1 remain
unchanged.
As mentioned, diagonal matrices have many nice features, one of them being that multiplication is
easy to perform. In particular, raising a diagonal matrix to the power n just raises its individual
diagonal entries to the power n. So for our D, we have

    Dn = [ 1  0    ]^n = [ 1  0      ]
         [ 0  0.95 ]     [ 0  0.95^n ]
For example, if we want to compute A100, instead of multiplying the matrix A with itself 100 times, we
express it as

    A100 = PD100P –1

Essentially we just need to perform multiplication on three matrices. This is a more
efficient way to get the answer.
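The saving can be checked in numpy. Here P is built from the eigenvectors (1, 4) and (1, −1) of A, the same factorization used again at the end of these notes:

```python
import numpy as np

A = np.array([[0.96, 0.01],
              [0.04, 0.99]])
P = np.array([[1.0,  1.0],
              [4.0, -1.0]])   # eigenvectors (1, 4) and (1, -1) as columns
D = np.diag([1.0, 0.95])

assert np.allclose(A, P @ D @ np.linalg.inv(P))   # A = P D P^(-1)

# A^100 via three multiplications: only the diagonal matrix is powered.
A100 = P @ np.diag([1.0**100, 0.95**100]) @ np.linalg.inv(P)
assert np.allclose(A100, np.linalg.matrix_power(A, 100))
print(A100)
```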
What we have done is converting a non-diagonal matrix A to a diagonal matrix D. When such a
conversion can be carried out, we say the matrix A is a diagonalizable matrix, and P is the matrix that
diagonalizes A to give us a diagonal matrix D.
Diagonalizable Matrix (slide 5)
Let’s state the definition more generally:
An n x n square matrix A is called diagonalizable if we can find a non-singular matrix P such that
P –1AP is a diagonal matrix:
             [ λ1            ]
    P –1AP = [     λ2        ]
             [        ⋱      ]
             [            λn ]

(with all off-diagonal entries zero)
We say: the matrix P diagonalizes A. Matrix P is necessarily non-singular, as we need to take its
inverse matrix.
Example 2: Diagonalizable Matrix (slide 6)
How do we get this matrix P? We shall see at the end of this section that P is related to the
eigenvectors of B.
Example: Non-diagonalizable Matrix (slide 7)
Not all square matrices are diagonalizable. Let us look at this example M, which is a non-
diagonalizable matrix.
    M = [ 2  0 ]
        [ 1  2 ]
In other words, we cannot find a matrix P that diagonalizes M. To see why this is the case, we try to
argue by means of contradiction:
Now let us write P = [ a b ; c d ] and expand the product, equating P –1MP with some diagonal
matrix [ e 0 ; 0 f ], i.e. MP = P [ e 0 ; 0 f ]. Then

    [ 2a      2b     ]   [ ae  bf ]
    [ a + 2c  b + 2d ] = [ ce  df ]

The next thing is to compare the entries on both sides of the equality above. From 2a = ae: if a ≠ 0
then e = 2, but then a + 2c = ce = 2c forces a = 0, a contradiction; hence a = 0. The same argument on
the second column gives b = 0. But then the first row of P is zero, so P is singular.
This means the above assumption is wrong, which implies that we cannot find the required matrix P.
This is an ad hoc method to show a non-diagonalizable matrix. In the next section, we will see a more
systematic way to determine whether a matrix is diagonalizable or not.
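The same conclusion can be reached numerically: the only eigenvalue 2 of M has a one-dimensional eigenspace, so this 2 x 2 matrix cannot supply two linearly independent eigenvectors:

```python
import numpy as np

M = np.array([[2.0, 0.0],
              [1.0, 2.0]])

# The only eigenvalue is 2; its eigenspace has dimension 2 - rank(2I - M).
dim_E2 = 2 - np.linalg.matrix_rank(2 * np.eye(2) - M)
print(dim_E2)   # 1 < 2: not enough independent eigenvectors

assert dim_E2 == 1
assert np.allclose(M @ np.array([0.0, 1.0]), 2 * np.array([0.0, 1.0]))
```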
Diagonalizability (slide 8)
Let us now bring the eigenvalues and eigenvectors into the discussion.
A has two eigenvalues 1 and 0.95, B also has two eigenvalues 3 and 0, while M has one eigenvalue 2
(observe that M is a triangular matrix).
However, it is not the number of eigenvalues that decides whether the matrix is diagonalizable or not.
It is the eigenvectors.
For matrix A, we have one eigenvector associated with each eigenvalue: (1, 4) for eigenvalue 1 and
(−1, 1) for eigenvalue 0.95. For matrix B, we have one eigenvector associated with eigenvalue 3,

    (1, 1, 1)

and two eigenvectors associated with eigenvalue 0:

    (1, 0, −1) and (1, −2, 1)
For matrix M, I leave it to you to check that an eigenvector of M associated with eigenvalue 2 is given
by (0, 1).
Note that we are not saying the matrices above have only 2, 3 and 1 eigenvectors respectively. In fact,
these matrices have infinitely many eigenvectors.
For example, for M, any scalar multiple of (0, 1) is also an eigenvector.
The correct way to say this is to include the condition “linearly independent”:
matrix A has two linearly independent eigenvectors; matrix B has three linearly independent
eigenvectors; while M has only one linearly independent eigenvector.
In the last case, there are “not enough” linearly independent eigenvectors. But how do we tell
whether there are enough eigenvectors for a given matrix? The answer is to compare it with the size
of the matrix.
If we cannot find n eigenvectors of A that are linearly independent, then A is not diagonalizable.
The following two observations not only explain the condition for diagonalizability, but are also
techniques commonly used in matrix multiplication.
In other words, the 1st column of AB is given by Ab1, the 2nd column of AB is given by Ab2 and so on.
You can first multiply the two matrices on the left to see that you will get the right hand side.
Then you can check that the first column of AB is Ab1, the second column is Ab2, and the third
column is Ab3.
Suppose the diagonal entries of D are d1 , d2 , ··· , dn, and the columns of B are b1 , b2 , ··· , bn. Then

    BD = ( d1b1  d2b2  ···  dnbn )

The product of the two matrices on the left hand side yields the matrix on the right, whose i-th
column is dibi.
Let b1 , b2 , b3 be the three columns of the matrix on the left. We again check the matrix on the right
by columns, and see that column 1 is 2b1, column 2 is 3b2, and column 3 is 4b3.
First, we denote the n eigenvectors by u1, u2, …, un and let λ1, λ2, …, λn be the corresponding
eigenvalues. (Note that some of these n eigenvalues may be the same.)
Then we form the n x n matrix P by using the n eigenvectors as its columns and let D be the diagonal
matrix whose entries are the eigenvalues:

    P = ( u1  u2  ···  un ),   D = diag(λ1, λ2, …, λn)
Note that P is guaranteed to be non-singular because its columns are linearly independent. Now

    AP = ( Au1  Au2  ···  Aun ) = ( λ1u1  λ2u2  ···  λnun ) = PD

The first equality above follows from the first observation in the previous segment.
The second equality is due to u1, u2, …, un being eigenvectors of A, so that Aui = λiui.
And by bringing the P on the right hand side to the left, we get P –1AP = D.
This means the matrix P diagonalizes A to give the diagonal matrix D and so we conclude that A is
diagonalizable.
3.5 Diagonalization
In this section, we will introduce a systematic procedure to determine whether a matrix is
diagonalizable or not. The approach will also show how to diagonalize a matrix. This process is called
diagonalization.
Algorithm for Diagonalization (slide 2)
Step 1: Solve the characteristic equation det(λI – A) = 0 to find all distinct eigenvalues λ1, λ2, …, λk.
Step 2: For each λi, find a basis Sλi for the eigenspace Eλi by solving (λiI – A)x = 0.
Step 3: Let S = Sλ1 ∪ Sλ2 ∪ … ∪ Sλk. (Then |S| is the total number of basis vectors from all the
eigenspaces.)
(a) If |S| < n, then A is not diagonalizable.
In this case, there are not enough linearly independent eigenvectors to form a matrix P that
diagonalizes A.
(b) If |S| = n, then A is diagonalizable.
In this case, if S = {u1, u2, …, un}, then the square matrix P = (u1 u2 ··· un) diagonalizes A.
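The three steps can be sketched directly in numpy (assuming real eigenvalues for simplicity, and taking the eigenspace bases from SVD null spaces rather than by Gaussian elimination):

```python
import numpy as np

def diagonalize(A, tol=1e-9):
    """Sketch of the three-step algorithm. Returns (P, D), or None when A is
    not diagonalizable. Real eigenvalues are assumed for simplicity."""
    n = A.shape[0]
    # Step 1: all distinct eigenvalues (roots of the characteristic polynomial).
    distinct = []
    for lam in np.linalg.eigvals(A):
        if not any(abs(lam - mu) < tol for mu in distinct):
            distinct.append(lam)
    # Step 2: a basis of each eigenspace, here read off the SVD null space.
    columns, entries = [], []
    for lam in distinct:
        _, s, Vt = np.linalg.svd(lam * np.eye(n) - A)
        for v in Vt[s < tol]:
            columns.append(v)
            entries.append(lam)
    # Step 3: count the basis vectors collected from all the eigenspaces.
    if len(columns) < n:
        return None                      # |S| < n: not diagonalizable
    return np.column_stack(columns), np.diag(entries)

B = np.ones((3, 3))
P, D = diagonalize(B)
assert np.allclose(np.linalg.inv(P) @ B @ P, D)
assert diagonalize(np.array([[2.0, 0.0], [1.0, 2.0]])) is None  # the matrix M
```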
Example 1: Algorithm for Diagonalization (slide 3-4)
    B = [ 1  1  1 ]
        [ 1  1  1 ]
        [ 1  1  1 ]
Step 3: |S| = |S3| + |S0| = 3, which agrees with the size 3 x 3 of matrix B.
So we conclude B is diagonalizable.
We form the matrix P using the three vectors in S as columns:

    P = [ 1  −1  −1 ]
        [ 1   1   0 ]
        [ 1   0   1 ]

Then

    P –1BP = [ 3  0  0 ]
             [ 0  0  0 ]
             [ 0  0  0 ]
The diagonal matrix on the right has the eigenvalues of B as the diagonal entries. Note that 0
appears two times as there are two eigenvectors associated to this eigenvalue.
Note: In step 3, it is not necessary for you to actually perform the multiplication P-1BP. All you need
to do is to write down the diagonal matrix using the eigenvalues found in step 1.
Remark: Matrix P is not unique. There are many possible matrices that can diagonalize B to give a
diagonal matrix. We can use any other eigenvectors of B as the columns of P, as long as they are
linearly independent.
For example, each column of P can be replaced by a non-zero scalar multiple of itself: the new
columns are scalar multiples of the original eigenvectors, and hence are still eigenvectors of B.
For example, matrix Q is obtained from P by moving the first column to the
last. As long as the three columns are linearly independent eigenvectors, it will
still diagonalize the matrix.
You can check that

    Q –1BQ = [ 0  0  0 ]
             [ 0  0  0 ]
             [ 0  0  3 ]
Take note that the order of the diagonal entries on the right is rearranged accordingly.
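Both orderings are easy to confirm numerically:

```python
import numpy as np

B = np.ones((3, 3))
P = np.array([[1.0, -1.0, -1.0],
              [1.0,  1.0,  0.0],
              [1.0,  0.0,  1.0]])
Q = P[:, [1, 2, 0]]   # move the first column of P to the last position

assert np.allclose(np.linalg.inv(P) @ B @ P, np.diag([3.0, 0.0, 0.0]))
assert np.allclose(np.linalg.inv(Q) @ B @ Q, np.diag([0.0, 0.0, 3.0]))
```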
Example 2: Non-Diagonalizable Matrix (slide 5)
    A = [  1  0  0 ]
        [  1  2  0 ]
        [ −3  5  2 ]
Step 1: The eigenvalues are 1 and 2. (A is a lower triangular matrix, so the eigenvalues can be read
off from the diagonal entries directly.)
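Steps 2 and 3 for this example can be verified with the rank computation dim Eλ = n − rank(λI − A); the eigenvalue 2 has multiplicity 2 but only a one-dimensional eigenspace:

```python
import numpy as np

A = np.array([[ 1, 0, 0],
              [ 1, 2, 0],
              [-3, 5, 2]], dtype=float)

def eig_dim(A, lam):
    """Dimension of the eigenspace E_lam, as the nullity of lam*I - A."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(lam * np.eye(n) - A)

# |S| = dim E_1 + dim E_2 = 1 + 1 = 2 < 3, so A is not diagonalizable.
print(eig_dim(A, 1), eig_dim(A, 2))
```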
Matrix with Maximum Distinct Eigenvalues (slide 7)
Suppose an n x n matrix A has n distinct eigenvalues λ1, λ2, …, λn. Then each eigenvalue corresponds
to an eigenvector, giving eigenvectors u1, u2, …, un. Since all the eigenvalues are distinct, these
eigenvectors are linearly independent. So we have n linearly independent eigenvectors for A, which
implies A is diagonalizable.
What this observation tells us is that, when we carry out the diagonalization algorithm, after step 1,
if we obtain n distinct eigenvalues, then we can straight away conclude that the matrix is
diagonalizable without going through the remaining steps.
However, if we need to find the matrix P that diagonalizes A, we still need to perform steps 2 and 3
of the algorithm to find explicit eigenvectors.
So B is also a diagonalizable matrix despite not having a maximum number of distinct eigenvalues.
Hence, if a matrix is an n x n diagonalizable matrix, it need not have n distinct eigenvalues.
Diagonalization and Linear Transformation (slide 9)
Suppose T : R2 → R2 is defined by T(u) = Au where

    A = [ 5  3 ]
        [ 3  5 ]

A is diagonalizable with eigenvalues 8 and 2, and corresponding eigenvectors (1, 1) and (−1, 1). (Check!)

So we let

    P = [ 1  −1 ]   and   D = [ 8  0 ]
        [ 1   1 ]             [ 0  2 ]
Then the standard matrix A = PDP –1 and T(u) = PDP –1u.
We shall see how the input vector is transformed to the output vector in three steps as follows:

    u → P –1u → DP –1u → PDP –1u
We first draw the standard coordinate system (black arrows) corresponding to the standard basis
vectors. Then we construct a new coordinate system using our two eigenvectors (1, 1) and (−1, 1) as a
basis (red arrows), as shown in the diagram.
Then the new coordinate system will have the new axes (blue dotted arrows) along the two
eigenvectors.
Now for the step u → P –1u , the resulting vector gives the coordinate vector relative to the new
system.
For the second step P –1u → DP –1u, the effect is to do a scaling in the new coordinate system.
To illustrate this, let us use a square as an input figure, and look at the effect.
After multiplying by P-1, we should regard the square in the new coordinate system.
Now the matrix D = [ 8 0 ; 0 2 ] scales along the first axis by a factor of 8 as shown in the diagram
above, transforming the top right and bottom left corners of the square.
At the same time, D scales along the second axis by a factor 2, transforming the top left and bottom
right corners of the square as shown.
So we obtain the four corners of the resulting figure, which is a rhombus.
Finally, DP –1u → PDP –1u, the effect is to bring the vector back to the original coordinate system.
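The three steps can be traced numerically for this A, P and D; composing them reproduces T(u) = Au:

```python
import numpy as np

A = np.array([[5.0, 3.0],
              [3.0, 5.0]])
P = np.array([[1.0, -1.0],
              [1.0,  1.0]])
D = np.diag([8.0, 2.0])
u = np.array([1.0, 0.0])       # any input vector will do

step1 = np.linalg.inv(P) @ u   # coordinates relative to the eigenvector basis
step2 = D @ step1              # scale by 8 and 2 along the new axes
step3 = P @ step2              # return to standard coordinates

assert np.allclose(A, P @ D @ np.linalg.inv(P))
assert np.allclose(step3, A @ u)   # the three steps reproduce T(u) = Au
```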
3.6 Powers of Matrices
In this last section, we shall revisit the power of square matrices that are diagonalizable. We will also
give an example on application of power of matrices to population modeling.
Iterative Systems (slide 2)
Many real life examples come in the form of iterative systems.
These are systems with various stages (usually over a period of time) such that:
• the stages can be described using n-vectors, x0, x1, x2, …
• consecutive stages are related by a fixed n x n matrix A: xk = Axk-1
where xk-1 and xk are n-vectors representing the (k−1)-th and k-th stages respectively.
Here is an example of a system where the stages are represented by 3-vectors:
So we have

    xk = Axk-1 = A(Axk-2) = ··· = A^k x0
If A is diagonalizable, say A = PDP –1, then A^k = PD^kP –1, where D^k is the diagonal matrix with
entries λ1^k, …, λn^k.
This gives a simple way to compute and analyse large powers of A by just performing matrix
multiplication on three matrices.
Example: Power of Diagonalizable Matrices (slide 4)
    A = [ −4  0  −6 ]
        [  2  1   2 ]
        [  3  0   5 ]
The eigenvalues of A are −1, 1 and 2, so A can be diagonalized as A = PDP –1 with D = diag(−1, 1, 2)
and P a matrix of corresponding eigenvectors. Then

    A^m = P [ (−1)^m  0    0   ] P –1
            [   0     1^m  0   ]
            [   0     0    2^m ]
In fact, since this matrix is non-singular (0 is not an eigenvalue), we can even apply negative powers.
In particular, we can let m be −1 to get A –1 in the same form.
    A –1 = P [ (−1)^(−1)  0       0      ] P –1
             [   0        1^(−1)  0      ]
             [   0        0       2^(−1) ]
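A quick numerical check, using one possible choice of eigenvector matrix P for the eigenvalues −1, 1, 2 (this particular P is a reconstruction, not taken from the slides; any valid choice of eigenvectors works):

```python
import numpy as np

A = np.array([[-4.0, 0.0, -6.0],
              [ 2.0, 1.0,  2.0],
              [ 3.0, 0.0,  5.0]])
# Columns are eigenvectors for eigenvalues -1, 1, 2 respectively
# (a reconstructed choice, labeled as such; not from the original slides).
P = np.array([[-2.0, 0.0, -1.0],
              [ 1.0, 1.0,  0.0],
              [ 1.0, 0.0,  1.0]])

assert np.allclose(np.linalg.inv(P) @ A @ P, np.diag([-1.0, 1.0, 2.0]))

A_inv = P @ np.diag([(-1.0)**-1, 1.0**-1, 2.0**-1]) @ np.linalg.inv(P)
assert np.allclose(A_inv, np.linalg.inv(A))
assert np.allclose(A_inv @ A, np.eye(3))
```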
Population Modeling (slide 5-7)
We now apply the power of matrix to population modeling.
Suppose the population in a certain country is divided into rural and urban. Studies show that every
year, 4% of rural population will move to the urban area, while 1% of the urban population will move
to the rural area.
What will be the projected proportion of the two population in the long term?
Answer: 20% of the population will be rural, and 80% will be urban.
How do we compute that?
As this is an iterative system, we need to analyse the population in terms of the year. So we let
an = the rural population in year n, and bn = the urban population in year n, with respect to current
year as the starting point.
Now based on the information we have, the rural population in year n depends on both the rural and
urban populations in year n−1:

    an = 0.96 an-1 + 0.01 bn-1

In other words, 96% of the rural population in year n−1 and 1% of the urban population in year n−1
will contribute to the rural population in year n.
Similarly, 4% of the rural population in year n−1 and 99% of the urban population in year n−1 will
contribute to the urban population in year n:

    bn = 0.04 an-1 + 0.99 bn-1
This is a kind of linear system that we can convert into the matrix equation form as before.
Let

    A = [ 0.96  0.01 ]   and   xn = [ an ]
        [ 0.04  0.99 ]             [ bn ]

Then we have the iterative relation xn = Axn-1, and hence xn = A^n x0.
Writing A^n = PD^nP –1, the diagonal entries of D are 1 and 0.95, which are the eigenvalues of A.
Since 0.95 < 1, when it is raised to a large power it will be close to 0. So we can approximate this
entry by 0, and the resulting product is then approximately

    A^n ≈ [ 1  1 ] [ 1  0 ] [ 1  1 ]^(−1) = [ 0.2  0.2 ]
          [ 4 −1 ] [ 0  0 ] [ 4 −1 ]        [ 0.8  0.8 ]
We can then use this matrix to find the populations in the long term:
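The long-term behaviour can be checked by simply iterating xn = Axn-1 from any starting proportions (the 50/50 split below is just an illustration):

```python
import numpy as np

A = np.array([[0.96, 0.01],
              [0.04, 0.99]])
x = np.array([0.5, 0.5])   # an illustrative starting split (rural, urban)

for _ in range(500):       # iterate x_n = A x_{n-1} for many years
    x = A @ x

print(x)                   # approaches (0.2, 0.8): 20% rural, 80% urban
```

The limit is independent of the starting proportions because the component along the 0.95-eigenvector dies out, leaving only the eigenvalue-1 direction.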