
Chapter 2

Eigenvalues and Eigenvectors

2.1 Introduction
In this chapter, we investigate the phenomenon that for a square matrix A there exist nonzero
vectors x such that Ax is a scalar multiple of x. This is the eigenvalue problem, and it is one
of the most central problems in linear algebra. In this chapter, we describe the classical methods
for solving eigenvalue problems, which arise in many branches of science and engineering and seem
to be a very fundamental part of the structure of the universe. Eigenvalue problems are also important,
in a less direct manner, in numerical applications. For example, finding the condition number
for the solution of a set of linear algebraic equations involves finding the ratio of the largest to the
smallest eigenvalue of the underlying matrix. The eigenvalue problem is also involved when
establishing the stiffness of ordinary differential equation problems.

Definition 2.1 (Eigenvalue Problem)

A scalar λ is called eigenvalue of n × n matrix A if there is a nonzero vector x such that

Ax = λx. (2.1)
Such a vector x is called an eigenvector of A corresponding to λ. For example, the vector x = [1, 1]T
is an eigenvector of the matrix

A = [ 4 2 ]
    [ 2 4 ]

corresponding to the eigenvalue 6 since

Ax = [ 4 2 ] [ 1 ]   [ 6 ]     [ 1 ]
     [ 2 4 ] [ 1 ] = [ 6 ] = 6 [ 1 ] = 6x,

so it follows that x is an eigenvector of A corresponding to the eigenvalue 6. •

Thus in solving eigenvalue problems we are mainly concerned with the task of finding the values of the
parameter λ and the vectors x that satisfy a set of equations of the form (2.1). The linear equation
(2.1) represents the eigenvalue problem, where A is an n × n coefficient matrix, also called the system
matrix, x is an unknown column vector, and λ is an unknown scalar. Since the set of equations has a zero
right-hand side, a very important special case arises. For such a case, one solution of (2.1)

for a real square matrix A is the trivial solution, x = 0. However, there is a set of values for the
parameter λ for which non-trivial solutions for the vector x exist. These non-trivial solutions are
called the eigenvectors, characteristic vectors or latent vectors of a matrix A and the corresponding
values of the parameter λ are called eigenvalues, characteristic values or latent roots of A. The set
of all eigenvalues of A is called the spectrum of A. Eigenvalues may be real or complex, distinct or
multiple. From (2.1), we deduce
Ax = λIx,
(A − λI)x = 0, (2.2)
where I is an n × n identity matrix. The matrix (A − λI) appears as follows

[ (a11 − λ)     a12      · · ·      a1n    ]
[    a21     (a22 − λ)   · · ·      a2n    ]
[     ..         ..       ..         ..    ]
[    an1        an2      · · ·  (ann − λ)  ],
and the result of the multiplication of (2.2) is a set of homogeneous equations of the form
(a11 − λ)x1 +     a12 x2     + · · · +     a1n xn     = 0
    a21 x1  + (a22 − λ)x2    + · · · +     a2n xn     = 0
      ..           ..                        ..              (2.3)
    an1 x1  +     an2 x2     + · · · + (ann − λ)xn    = 0
Then by Cramer's rule, the determinant of the coefficient matrix of the system (2.3) must vanish
if there is to be a non-trivial solution, that is, a solution other than x = 0. Geometrically,
Ax = λx says that under transformation by A, eigenvectors experience only a change in magnitude or
sign: the orientation of Ax in Rn is the same as that of x. The eigenvalue λ is simply the amount
of "stretch" or "shrink" to which the eigenvector x is subjected when transformed by A (see
Figure 2.1).

(a) λ > 1    (b) 1 > λ > 0    (c) −1 < λ < 0    (d) λ < −1

Figure 2.1: Shows the situation in R2 .

Definition 2.2 (Trace of a Matrix)

For an n × n matrix A = (aij ), we define the trace of A to be the sum of the diagonal elements of
A; that is
trace(A) = a11 + a22 + · · · + ann .
For example, for the matrix

    [ 7 3 1 ]
A = [ 2 6 2 ] ,
    [ 1 4 3 ]

trace(A) = 7 + 6 + 3 = 16. •

Theorem 2.1 If A and B are square matrices with the same size, then:

1. trace (AT ) = trace (A).

2. trace (kA) = k trace (A).

3. trace (A + B) = trace (A) + trace (B).

4. trace (A − B) = trace (A) − trace (B).

5. trace (AB) = trace (BA). •

For example, consider the following matrices

    [ 5 −4 2 ]          [ 9  3 −4 ]
A = [ 1  4 3 ]  and B = [ 1  5  2 ] .
    [ 3  2 7 ]          [ 3 −2  8 ]

Then

     [  5 1 3 ]
AT = [ −4 4 2 ] ,
     [  2 3 7 ]

and
trace(AT ) = 16 = trace(A).
Also

     [ 20 −16  8 ]
4A = [  4  16 12 ] ,
     [ 12   8 28 ]

and
trace(4A) = 64 = 4(16) = 4 trace(A).
The sum of the above two matrices is

        [ 14 −1 −2 ]
A + B = [  2  9  5 ] ,
        [  6  0 15 ]

and
trace(A + B) = 38 = 16 + 22 = trace(A) + trace(B).
Similarly, the difference of the above two matrices is

        [ −4 −7  6 ]
A − B = [  0 −1  1 ] ,
        [  0  4 −1 ]

and trace(A − B) = −6 = 16 − 22 = trace(A) − trace(B).


Finally, the products of the above two matrices are

     [ 47 −9 −12 ]        [ 36 −32 −1 ]
AB = [ 22 17  28 ] , BA = [ 16  20 31 ] .
     [ 50  5  48 ]        [ 37  −4 56 ]

Then
trace(AB) = 112 = trace(BA).

We use the author-defined function trac and the following MATLAB commands to get the results
of this example:

>> A = [5 -4 2; 1 4 3; 3 2 7]; B = [9 3 -4; 1 5 2; 3 -2 8];
>> C = A + B; D = A*B; trac(C); trac(D);
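
The listing of the author-defined function trac is not given in this section; a minimal sketch
consistent with how it is called above (an assumption, not the author's actual listing) is:

function t = trac(A)
% TRAC  Trace of a square matrix: the sum of its diagonal entries.
% A minimal sketch of the author-defined function used above (assumed
% implementation); MATLAB's built-in trace(A) returns the same value.
[n, m] = size(A);
if n ~= m
    error('trac: input matrix must be square.');
end
t = sum(diag(A));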

There should be no confusion between diagonal entries and eigenvalues. For a triangular matrix
they are the same, but that is exceptional; normally the pivots, the diagonal entries, and the
eigenvalues are completely different.
The classical method of finding the eigenvalues of a matrix A is to compute the roots of the
characteristic equation

p(λ) = det(A − λI) = |A − λI| = 0. (2.4)

The eigenvectors are then determined by setting one of the nonzero elements of x to unity and
calculating the remaining elements by equating coefficients in the relation (2.2).

2.1.0.1 Eigenvalues of 2 × 2 Matrices


Let λ1 and λ2 be the eigenvalues of a 2 × 2 matrix A; then a quadratic polynomial p(λ) is defined
as

p(λ) = (λ − λ1 )(λ − λ2 ) = λ2 − (λ1 + λ2 )λ + λ1 λ2 .

Note that

trace(A) = λ1 + λ2 and det(A) = λ1 λ2 .

So

p(λ) = λ2 − trace(A)λ + det(A).

For example, if the given matrix is

A = [ 5 4 ]
    [ 3 4 ] .

Then

A − λI = [ 5−λ   4  ]
         [  3   4−λ ] ,

and

p(λ) = |A − λI| = (5 − λ)(4 − λ) − 12 = λ2 − 9λ + 8.
Solving this quadratic polynomial gives λ1 = 8 and λ2 = 1, the eigenvalues of the given matrix.
Notice that

trace(A) = λ1 + λ2 = 9 and det(A) = λ1 λ2 = 8,

which satisfies the above result. •
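
The trace–determinant form of p(λ) reduces the 2 × 2 eigenvalue problem to the quadratic formula;
a short MATLAB sketch using the matrix above:

>> A = [5 4; 3 4];
>> t = trace(A); d = det(A);    % p(lambda) = lambda^2 - t*lambda + d
>> lambda = [(t + sqrt(t^2 - 4*d))/2; (t - sqrt(t^2 - 4*d))/2]   % gives 8 and 1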


The discriminant of a 2 × 2 matrix is defined as D = [trace(A)]2 − 4 det(A). For example, the
discriminant of the matrix

A = [ 8 7 ]
    [ 4 6 ]

can be calculated as

D = [14]2 − 4(20) = 116,

where the trace of A is 14 and the determinant of A is 20. •

Theorem 2.2 If D is the discriminant of a 2 × 2 matrix A, then the following statements hold:

1. The eigenvalues of A are real and distinct when D > 0.

2. The eigenvalues of A are a complex conjugate pair when D < 0.

3. The eigenvalues of A are real and equal when D = 0. •


For example, the eigenvalues of the matrix

A = [ 3 2 ]
    [ 2 4 ]

are real and distinct because

D = [7]2 − 4(8) = 17 > 0.


Also, the eigenvalues of the matrix

A = [ 3 −5 ]
    [ 1  2 ]

are a complex conjugate pair since

D = [5]2 − 4(11) = −19 < 0.


Finally, the matrix

A = [ 1 2 ]
    [ 0 1 ]

has real and equal eigenvalues because

D = [2]2 − 4(1) = 0. •
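
Theorem 2.2 translates directly into a small classification routine; a sketch in MATLAB (the
function name EigType is hypothetical, not from the text):

function EigType(A)
% EIGTYPE  Classify the eigenvalues of a 2 x 2 matrix by its discriminant
% D = trace(A)^2 - 4*det(A), following Theorem 2.2. (Hypothetical helper.)
D = trace(A)^2 - 4*det(A);
if D > 0
    disp('Eigenvalues are real and distinct.');
elseif D < 0
    disp('Eigenvalues are a complex conjugate pair.');
else
    disp('Eigenvalues are real and equal.');
end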

Note that the eigenvectors of a 2 × 2 matrix A corresponding to each of its eigenvalues can be
found easily, just by substituting each eigenvalue into equation (2.2).

Example 2.1 Find the eigenvalues and eigenvectors of the following matrix
A = [  6 −3 ]
    [ −4  5 ] .

Solution. The eigenvalues of the given matrix are real and distinct because

D = [11]2 − 4(18) = 49 > 0.



Then

p(λ) = |A − λI| = (6 − λ)(5 − λ) − 12 = λ2 − 11λ + 18.

Solving this quadratic polynomial gives λ1 = 9 and λ2 = 2, the eigenvalues of the given matrix.
Notice that

trace(A) = λ1 + λ2 = 11 and det(A) = λ1 λ2 = 18.

Now to find the eigenvectors of the given matrix A corresponding to each of these eigenvalues, we
substitute each of the two eigenvalues in (2.2). When λ1 = 9, we have

[ −3 −3 ] [ x1 ]   [ 0 ]
[ −4 −4 ] [ x2 ] = [ 0 ] ,

which implies that

−3x1 − 3x2 = 0, which gives x1 = −x2 .

Hence the eigenvector x1 corresponding to the first eigenvalue, 9, choosing x2 = 1, is

x1 = α[−1, 1]T , where α ∈ R, α ≠ 0.

When λ2 = 2, we have

[  4 −3 ] [ x1 ]   [ 0 ]
[ −4  3 ] [ x2 ] = [ 0 ] .

From it we obtain

4x1 − 3x2 = 0, which gives x1 = (3/4)x2 ,

and

−4x1 + 3x2 = 0, which gives x1 = (3/4)x2 .

Thus choosing x2 = 4, we obtain x2 = α[3, 4]T , where α ∈ R, α ≠ 0, which is the second
eigenvector x2 corresponding to the second eigenvalue, 2. •
We use the author-defined function EigTwo and the following MATLAB commands to get the
results of Example 2.1:

>> A = [6 -3; -4 5]; EigTwo(A)
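
The listing of the author-defined function EigTwo is not given here; a minimal sketch consistent
with the method of this section (an assumption, not the author's actual code) is:

function [lambda, X] = EigTwo(A)
% EIGTWO  Eigenvalues and eigenvectors of a 2 x 2 matrix via the
% characteristic polynomial p(lambda) = lambda^2 - trace(A)*lambda + det(A).
% A minimal sketch of the author-defined function used above (assumption).
t = A(1,1) + A(2,2);                 % trace(A) = lambda1 + lambda2
d = A(1,1)*A(2,2) - A(1,2)*A(2,1);   % det(A)  = lambda1*lambda2
D = t^2 - 4*d;                       % discriminant
lambda = [(t + sqrt(D))/2; (t - sqrt(D))/2];
X = zeros(2,2);
for k = 1:2
    M = A - lambda(k)*eye(2);
    % A nonzero null-space vector of the singular matrix M = [a b; c d]
    % is [-b; a] (or [-d; c] if the first row vanishes).
    if any(M(1,:))
        X(:,k) = [-M(1,2); M(1,1)];
    elseif any(M(2,:))
        X(:,k) = [-M(2,2); M(2,1)];
    else
        X(:,k) = [1; 0];             % M = 0: every vector is an eigenvector
    end
    X(:,k) = X(:,k)/norm(X(:,k));
end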

For matrices of larger size, there is no doubt that the eigenvalue problem is computationally more
difficult than the linear system Ax = b. For a linear system, a finite number of elimination steps
produces the exact answer in a finite time. In the case of the eigenvalue problem, no such steps and
no such formula can exist: the characteristic polynomial of a 5 by 5 matrix is a quintic, and it has
been proved that there can be no algebraic formula for the roots of a general fifth-degree polynomial.
There are, however, a few simple checks on the eigenvalues after they have been computed, and we
mention two of them here:

1. The sum of the n eigenvalues of a matrix A equals the sum of its n diagonal entries, that is,

λ1 + λ2 + · · · + λn = a11 + a22 + · · · + ann ,

which is the trace of A.

2. The product of the n eigenvalues of a matrix A equals the determinant of A, that is,

λ1 λ2 · · · λn = det(A) = |A|.

It should be noted that the system matrix A of (2.1) may be real and symmetric, or real and non-
symmetric, or complex with symmetric real and skew symmetric imaginary parts. These different
types of a matrix A are explained as follows:
1. If the given matrix A is a real symmetric matrix, then the eigenvalues of A are real but not
necessarily positive, and the corresponding eigenvectors are also real. Also, if λi , xi and λj ,
xj satisfy the eigenvalue problem (2.1) and λi and λj are distinct, then

xTi xj = 0, i ≠ j, (2.5)

and

xTi Axj = 0, i ≠ j. (2.6)

The equations (2.5) and (2.6) represent the orthogonality relationships. Note that if
i = j, then in general xTi xi and xTi Axi are not zero. Recalling that xi includes an arbitrary
scaling factor, the product xTi xi must also be arbitrary. However, if the arbitrary scaling
factor is adjusted so that

xTi xi = 1, (2.7)

then

xTi Axi = λi , (2.8)

and the eigenvectors are said to be normalized.
Sometimes the eigenvalues are not distinct, and the eigenvectors associated with these equal or
repeated eigenvalues are not, of necessity, orthogonal. If λi = λj and the other eigenvalues,
λk , are distinct, then

xTi xk = 0, k = 1, 2, · · · , n, k ≠ i, k ≠ j, (2.9)

and

xTj xk = 0, k = 1, 2, · · · , n, k ≠ i, k ≠ j. (2.10)

When λi = λj , the eigenvectors xi and xj are not unique, and any linear combination of them,
that is, axi + bxj , where a and b are arbitrary constants, also satisfies the eigenvalue problem.
One important result is that a symmetric matrix of order n always has n linearly independent
eigenvectors, even if some of the eigenvalues are repeated.

2. If the given matrix A is real and nonsymmetric, then a pair of related eigenvalue problems can arise
as follows:
Ax = λx, (2.11)
and
AT y = βy. (2.12)

By taking the transpose of (2.12), we have

yT A = βyT . (2.13)

The vectors x and y are called the right-hand and left-hand vectors of A respectively. The
eigenvalues of A and AT are identical, that is, λi = βi but the eigenvectors x and y will,
in general, differ from each other. The eigenvalues and eigenvectors of a nonsymmetric real
matrix are either real or pairs of complex conjugates. If λi , xi , yi and λj , xj , yj are solutions
that satisfy the eigenvalue problems of (2.11) and (2.12) and λi and λj are distinct then

yjT xi = 0, i ≠ j, (2.14)

and

yjT Axi = 0, i ≠ j. (2.15)

The equations (2.14) and (2.15) are called the bi-orthogonality relationships. Note that
if, in these equations, i = j, then in general yiT xi and yiT Axi are not zero. The eigenvectors
xi and yi include arbitrary scaling factors, and so the product of these vectors will also be
arbitrary. However, if the vectors are adjusted so that

yiT xi = 1, (2.16)

then

yiT Axi = λi . (2.17)

We cannot, in these circumstances, describe either xi or yi as normalized; the vectors still
include an arbitrary scaling factor, only their product is uniquely chosen. If for a nonsymmetric
matrix λi = λj and the remaining eigenvalues, λk , are distinct, then

yjT xk = 0, yiT xk = 0, k = 1, 2, · · · , n, k ≠ i, k ≠ j, (2.18)

and

xTj xk = 0, ykT xj = 0, k = 1, 2, · · · , n, k ≠ i, k ≠ j. (2.19)

For certain matrices with repeated eigenvalues, the eigenvectors may also be repeated; consequently,
for an nth-order matrix of this type we may have fewer than n distinct eigenvectors.
This type of matrix is called deficient.

3. Let us consider the case when the given A is a complex matrix. One particularly important
complex matrix is the Hermitian matrix, which is defined as

H = A + iB, (2.20)

where A and B are real matrices such that A = AT and B = −B T . Hence A is symmetric
and B is skew symmetric with zero terms on the leading diagonal. Thus, by definition, a
Hermitian matrix H has a symmetric real part and a skew symmetric imaginary part,
making H equal to the transpose of its complex conjugate, denoted by H ∗ . Consider now the
eigenvalue problem
Hx = λx. (2.21)
If λi , xi are solutions of (2.21), then xi is complex but λi is real. Also, if λi , xi and λj , xj
satisfy the eigenvalue problem (2.21) and λi and λj are distinct, then

x∗i xj = 0 and x∗i Hxj = 0, for i ≠ j, (2.22)

where x∗i is the transpose of the complex conjugate of xi . As before, xi includes an arbitrary
scaling factor and the product x∗i xi must also be arbitrary. However, if the arbitrary scaling
factor is adjusted so that

x∗i xi = 1, then x∗i Hxi = λi , (2.23)

and the eigenvectors are then said to be normalized.

A large number of numerical techniques have been developed to solve eigenvalue problems.
Before discussing these numerical techniques, we start with hand calculations, mainly to
reinforce the definition, and solve the following examples.

Example 2.2 Find the eigenvalues and eigenvectors of the following matrix

    [ 3  0 1 ]
A = [ 0 −3 0 ] .
    [ 1  0 3 ]

Solution. Firstly, we shall find the eigenvalues of the given matrix A. From (2.2), we have

[ 3  0 1 ] [ x1 ]   [ 0 ]
[ 0 −3 0 ] [ x2 ] = [ 0 ] .
[ 1  0 3 ] [ x3 ]   [ 0 ]

For non-trivial solutions, using (2.4), we get

| 3−λ    0     1  |
|  0   −3−λ    0  | = −(λ + 3)(λ − 2)(λ − 4) = 0,
|  1     0    3−λ |

which gives the eigenvalues 4, −3, and 2 of the given matrix A. One can note that the sum of these
eigenvalues is 3, and this agrees with the trace of A. •

The characteristic equation of the given matrix can be obtained by using the following MATLAB
commands:

>> syms lambd; A = [3 0 1; 0 -3 0; 1 0 3]; determ(A - lambd*eye(3)); factor(ans)

After finding the eigenvalues of the matrix A, we turn to the problem of finding the corresponding
eigenvectors. The eigenvectors of A corresponding to an eigenvalue λ are the nonzero vectors x
that satisfy (2.2); equivalently, they are the nonzero vectors in the solution space of (2.2). We
call this solution space the eigenspace of A corresponding to λ.

Now to find the eigenvectors of the given matrix A corresponding to each of these eigenvalues, we
substitute each of these three eigenvalues in (2.2). When λ1 = 4, we have
    
[ −1  0  1 ] [ x1 ]   [ 0 ]
[  0 −7  0 ] [ x2 ] = [ 0 ] ,
[  1  0 −1 ] [ x3 ]   [ 0 ]

which implies that

−x1 + 0x2 + x3 = 0 ⇒ x1 = x3
0x1 − 7x2 + 0x3 = 0 ⇒ x2 = 0
x1 + 0x2 − x3 = 0 ⇒ x1 = x3

Solving this system, we get x2 = 0 with x1 = x3 arbitrary. Hence the eigenvector x1 corresponding
to the first eigenvalue, 4, choosing x3 = 1, is

x1 = α[1, 0, 1]T , where α ∈ R, α ≠ 0.

When λ2 = −3, we have

[ 6 0 1 ] [ x1 ]   [ 0 ]
[ 0 0 0 ] [ x2 ] = [ 0 ] ,
[ 1 0 6 ] [ x3 ]   [ 0 ]

which implies that

6x1 + 0x2 + x3 = 0 ⇒ x1 = −(1/6)x3
0x1 + 0x2 + 0x3 = 0 ⇒ x2 arbitrary
x1 + 0x2 + 6x3 = 0 ⇒ x1 = −6x3

which gives the solution x1 = x3 = 0 with x2 arbitrary. Hence the eigenvector x2 corresponding to
the second eigenvalue, −3, choosing x2 = 1, is

x2 = α[0, 1, 0]T , where α ∈ R, α ≠ 0.

Finally, when λ3 = 2, we have

[ 1  0 1 ] [ x1 ]   [ 0 ]
[ 0 −5 0 ] [ x2 ] = [ 0 ] ,
[ 1  0 1 ] [ x3 ]   [ 0 ]

which implies that

x1 + 0x2 + x3 = 0 ⇒ x1 = −x3
0x1 − 5x2 + 0x3 = 0 ⇒ x2 = 0
x1 + 0x2 + x3 = 0 ⇒ x1 = −x3

which gives x2 = 0 with x1 = −x3 arbitrary. Hence, choosing x1 = 1, we obtain

x3 = α[1, 0, −1]T , where α ∈ R, α ≠ 0,

the third eigenvector x3 corresponding to the third eigenvalue, 2, of the matrix. •



MATLAB can handle eigenvalues, eigenvectors, and the characteristic polynomial. The built-in poly
function computes the characteristic polynomial of a matrix:

>> A = [3 0 1; 0 -3 0; 1 0 3]; P = poly(A); PP = poly2sym(P)

The elements of the vector P are arranged in decreasing powers of x. To solve the characteristic
equation (in order to obtain the eigenvalues of A), use the built-in roots function on P:

>> roots(P);

If all we require are the eigenvalues of A, we can use the MATLAB built-in eig function, which is
the basic eigenvalue and eigenvector routine. The command

>> d = eig(A);

returns a vector containing all the eigenvalues of a matrix A. If the eigenvectors are also wanted,
the syntax

>> [X, D] = eig(A);

will return a matrix X whose columns are eigenvectors of A corresponding to the eigenvalues in
the diagonal matrix D.

To get the results of Example 2.2, we use the MATLAB command window as follows:

>> A = [3 0 1; 0 -3 0; 1 0 3]; P = poly(A); PP = poly2sym(P);
>> [X, D] = eig(A); lambda = diag(D); x1 = X(:,1); x2 = X(:,2); x3 = X(:,3);

Example 2.3 Find the eigenvalues and eigenvectors of the following matrix

    [  1 2 2 ]
A = [  0 3 3 ] .
    [ −1 1 1 ]

Solution. From (2.2), we have

[  1 2 2 ] [ x1 ]   [ 0 ]
[  0 3 3 ] [ x2 ] = [ 0 ] .
[ −1 1 1 ] [ x3 ]   [ 0 ]

For non-trivial solutions, using (2.4), we get

| 1−λ   2    2  |
|  0   3−λ   3  | = −λ3 + 5λ2 − 6λ = −λ(λ − 2)(λ − 3) = 0,
| −1    1   1−λ |

which gives the eigenvalues 0, 2, and 3 of the given matrix A. One can note that the sum of these
three eigenvalues is 5, and this agrees with the trace of A.

To find the eigenvectors corresponding to each of these eigenvalues, we substitute each of the three
eigenvalues of A in (2.2). When λ = 0, we have

[  1 2 2 ] [ x1 ]   [ 0 ]
[  0 3 3 ] [ x2 ] = [ 0 ] .
[ −1 1 1 ] [ x3 ]   [ 0 ]

The augmented matrix form of the system is

[  1 2 2 0 ]
[  0 3 3 0 ] ,
[ −1 1 1 0 ]

which can be reduced to

[ 1 2 2 0 ]
[ 0 1 1 0 ] .
[ 0 0 0 0 ]
Thus the components of an eigenvector must satisfy the relations

x1 + 2x2 + 2x3 = 0
0x1 + x2 + x3 = 0
0x1 + 0x2 + 0x3 = 0

This system has infinitely many solutions. Arbitrarily, we choose x2 = 1; then x3 = −1,
whence x1 = 0. This gives the first eigenvector in the form x1 = α[0, 1, −1]T ,
with α ∈ R and α ≠ 0, the most general eigenvector corresponding to the eigenvalue 0. A similar
procedure can be applied to the other two eigenvalues. The result is the two other eigenvectors
x2 = α[4, 3, −1]T and x3 = α[1, 1, 0]T corresponding to the eigenvalues 2 and 3, respectively. •

Example 2.4 Find the eigenvalues and eigenvectors of the following matrix

    [  3  2 −1 ]
A = [  2  6 −2 ] .
    [ −1 −2  3 ]

Solution. From (2.2), we have

[  3  2 −1 ] [ x1 ]   [ 0 ]
[  2  6 −2 ] [ x2 ] = [ 0 ] .
[ −1 −2  3 ] [ x3 ]   [ 0 ]

For non-trivial solutions, using (2.4), we get

| 3−λ    2    −1  |
|  2    6−λ   −2  | = −λ3 + 12λ2 − 36λ + 32 = −(λ − 2)2 (λ − 8) = 0,
| −1    −2   3−λ  |

which gives the eigenvalue 2 of multiplicity 2 and the eigenvalue 8 of multiplicity 1; the sum of
these three eigenvalues is 12, which agrees with the trace of A. When λ = 2, we have
 
           [  1  2 −1 ]
(A − 2I) = [  2  4 −2 ] ,
           [ −1 −2  1 ]

and so from (2.2), we have

x1 + 2x2 − x3 = 0
2x1 + 4x2 − 2x3 = 0
−x1 − 2x2 + x3 = 0

Let x2 = s and x3 = t; then the solution to this system is

[ x1 ]   [ −2s + t ]     [ −2 ]     [ 1 ]
[ x2 ] = [    s    ] = s [  1 ] + t [ 0 ] .
[ x3 ]   [    t    ]     [  0 ]     [ 1 ]

So the two eigenvectors of A corresponding to the eigenvalue 2 are x1 = s[−2, 1, 0]T and
x2 = t[1, 0, 1]T , with s, t ∈ R and s, t ≠ 0. Similarly, we can find the third eigenvector,
x3 = α[0.5, 1, −0.5]T , of A corresponding to the other eigenvalue, 8. •

Note that in all three examples above, and in any other example, there are always infinitely many
choices for each eigenvector. We arbitrarily choose a simple one by setting one or more of the
elements xi equal to a convenient number. Here we have set one of the elements xi equal to 1. •

2.2 Linear Algebra and Eigenvalues Problems

The solution of many physical problems requires the calculation of the eigenvalues and the corresponding
eigenvectors of a matrix associated with a linear system of equations. A matrix A of
order n has n, not necessarily distinct, eigenvalues, which are the roots of the characteristic equation
(2.4). Theoretically, the eigenvalues of A can be obtained by finding the n roots of the characteristic
polynomial p(λ), and then the associated linear systems can be solved to determine the corresponding
eigenvectors. In practice, however, p(λ) is difficult to obtain except for small values of n,
so it is necessary to construct approximation techniques for finding the eigenvalues of A.
Before discussing such approximation techniques for finding eigenvalues and eigenvectors of a given
matrix A, we need some definitions and results from linear algebra.

Definition 2.3 (Real Vector Spaces)

A vector space consists of a nonempty set V of objects (called vectors) that can be added, that can
be multiplied by a real number (called scalar) and for which certain axioms hold. If u and v are
two vectors in V , their sum is expressed as u + v, and scalar product of u by a real number α is
denoted by αu. These operations are called vector addition and scalar multiplication, respectively,
and the following axioms are assumed to hold:

Axioms for Vector Addition



1. If u and v are in V , then u + v is in V.

2. u + v = v + u, for all u and v in V.

3. u + (v + w) = (u + v) + w, for all u, v and w in V.

4. There exists an element 0 in V , called a zero vector, such that u + 0 = u.

5. For each u in V , there is an element −u (called negative of u) in V such that u + (−u) = 0.

Axioms for Scalar Multiplication

1. If u is in V , then αu is in V , for all α ∈ R.

2. α(u + v) = αu + αv, for all u, v ∈ V and α ∈ R.

3. (α + β)u = αu + βu, for all u ∈ V and α, β ∈ R.

4. α(βu) = (αβ)u, for all u ∈ V and α, β ∈ R.

5. 1u = u. •

For example, a real vector space is the space Rn consisting of column vectors, or n-tuples, of real
numbers u = (u1 , u2 , . . . , un )T . Vector addition and scalar multiplication are defined in the usual
manner:

        [ u1 + v1 ]        [ αu1 ]
        [ u2 + v2 ]        [ αu2 ]
u + v = [    ..   ] , αu = [  ..  ] ,
        [ un + vn ]        [ αun ]

whenever

    [ u1 ]           [ v1 ]
    [ u2 ]           [ v2 ]
u = [ ..  ]  and v = [ ..  ] .
    [ un ]           [ vn ]

The zero vector is 0 = (0, 0, . . . , 0)T . The fact that vectors in Rn satisfy all of the vector space
axioms is an immediate consequence of the laws of vector addition and scalar multiplication. •

The following theorem presents several useful properties common to all vector spaces.

Theorem 2.3 If V is a vector space, then:

1. 0u = 0, for all u ∈ V.

2. α0 = 0, for all α ∈ R.

3. If αu = 0, then α = 0 or u = 0.

4. (−1)u = −u, for all u ∈ V. •



Definition 2.4 (Subspaces)

Let V be a vector space and W a nonempty subset of V . If W is a vector space with respect to the
operations in V , then W is called a subspace of V.
For example, every vector space has at least two subspaces: itself and the subspace {0} (called the
zero subspace) consisting of only the zero vector. •
Theorem 2.4 A nonempty subset W ⊂ V of a vector space is a subspace if and only if:
1. For every u, v ∈ W , the sum u + v ∈ W.
2. For every u ∈ W and every α ∈ R, the scalar product αu ∈ W. •
Definition 2.5 (Basis of Vector Space)

Let V be a vector space. A finite set S = {v1 , v2 , . . . , vn } of vectors in V is a basis for V if and
only if any vector v in V can be written, in a unique way, as a linear combination of the vectors
in S, that is, if and only if any vector v has the form
v = k1 v1 + k2 v2 + · · · + kn vn , (2.24)
for one and only one set of real numbers k1 , k2 , . . . , kn . •
Definition 2.6 (Linearly Independent Vectors)

The vectors v1 , v2 , . . . , vn are said to be linearly independent if whenever

k1 v1 + k2 v2 + · · · + kn vn = 0, (2.25)

then all of the coefficients k1 , k2 , . . . , kn must equal zero, that is,

k1 = k2 = · · · = kn = 0. (2.26)

If the vectors v1 , v2 , . . . , vn are not linearly independent, then we say that they are linearly dependent.
In other words, the vectors v1 , v2 , . . . , vn are linearly dependent if and only if there exist numbers
k1 , k2 , . . . , kn , not all zero, for which

k1 v1 + k2 v2 + · · · + kn vn = 0.

Sometimes we say that the set {v1 , v2 , . . . , vn } is linearly independent (or linearly dependent) instead
of saying that the vectors v1 , v2 , . . . , vn are linearly independent (or linearly dependent). •
Example 2.5 Let us consider the vectors v1 = (1, 2) and v2 = (−1, 1) in R2 . To show that the
vectors are linearly independent, we write
k1 v1 + k2 v2 = 0
k1 (1, 2) + k2 (−1, 1) = (0, 0)
(k1 − k2 , 2k1 + k2 ) = (0, 0)
showing that

k1 − k2 = 0 and 2k1 + k2 = 0.

The only solution to the above system is the trivial solution, that is, k1 = k2 = 0. Thus the vectors
are linearly independent. •

The above results can be obtained using the MATLAB command window as follows:

>> v1 = [1 2]'; v2 = [-1 1]'; A = [v1 v2]; null(A)

Note that using the above MATLAB command, we obtained

ans =
Empty matrix: 2-by-0

which means that the only solution to the homogeneous system Ak = 0 is the zero solution k = 0.

Example 2.6 Consider the following functions:

p1 (t) = t2 + t + 2, p2 (t) = 2t2 + t + 3, p3 (t) = 3t2 + 2t + 2.

Show that the set {p1 (t), p2 (t), p3 (t)} is linearly independent.

Solution. Suppose a linear combination of the given polynomials vanishes, that is,

k1 (t2 + t + 2) + k2 (2t2 + t + 3) + k3 (3t2 + 2t + 2) = 0.

By equating the coefficients of the second-degree, first-degree, and constant terms, we get the
following linear system

k1 + 2k2 + 3k3 = 0
k1 + k2 + 2k3 = 0
2k1 + 3k2 + 2k3 = 0

Solving this homogeneous linear system

[ 1 2 3 ] [ k1 ]   [ 0 ]          [ 1 0 0 ] [ k1 ]   [ 0 ]
[ 1 1 2 ] [ k2 ] = [ 0 ] , gives  [ 0 1 0 ] [ k2 ] = [ 0 ] .
[ 2 3 2 ] [ k3 ]   [ 0 ]          [ 0 0 1 ] [ k3 ]   [ 0 ]

Since the only solution to the above system is the trivial solution

k1 = k2 = k3 = 0,

the given functions are linearly independent. •
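
This conclusion can also be checked numerically: the polynomials are independent exactly when the
coefficient matrix of the system above has full rank. A short MATLAB sketch:

>> M = [1 2 3; 1 1 2; 2 3 2];   % coefficient matrix of the system above
>> rank(M)                      % returns 3, so only the trivial solution
>> null(M)                      % empty null space confirms independence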

Example 2.7 Suppose that the set {v1 , v2 , v3 } is linearly independent in a vector space V . Show
that the set {v1 + v2 + v3 , v1 − v2 − v3 , v1 + v2 − v3 } is also linearly independent.

Solution. Suppose a linear combination of the given vectors v1 + v2 + v3 , v1 − v2 − v3 and
v1 + v2 − v3 vanishes, that is,

k1 (v1 + v2 + v3 ) + k2 (v1 − v2 − v3 ) + k3 (v1 + v2 − v3 ) = 0.

We must deduce that k1 = k2 = k3 = 0. Collecting the coefficients of v1 , v2 and v3 , we obtain

(k1 + k2 + k3 )v1 + (k1 − k2 + k3 )v2 + (k1 − k2 − k3 )v3 = 0.

Since {v1 , v2 , v3 } is linearly independent, we have

k1 + k2 + k3 = 0
k1 − k2 + k3 = 0
k1 − k2 − k3 = 0

Solving this linear system, we get the unique solution

k1 = 0, k2 = 0, k3 = 0,

which means that the set {v1 + v2 + v3 , v1 − v2 − v3 , v1 + v2 − v3 } is also linearly independent. •

Theorem 2.5 If {v1 , v2 , . . . , vn } is a set of n linearly independent vectors in Rn , then any vector
x ∈ Rn can be written uniquely as

x = k1 v1 + k2 v2 + · · · + kn vn , (2.27)

for some collection of constants k1 , k2 , . . . , kn . •

Example 2.8 Consider the vectors v1 = (1, 2, 1), v2 = (1, 3, −2) and v3 = (0, 1, −3) in R3 . If
k1 , k2 and k3 are numbers with

k1 v1 + k2 v2 + k3 v3 = 0,

which is equivalent to

k1 (1, 2, 1) + k2 (1, 3, −2) + k3 (0, 1, −3) = (0, 0, 0),

then we have the system

k1 + k2 + 0k3 = 0
2k1 + 3k2 + k3 = 0
k1 − 2k2 − 3k3 = 0

This system has infinitely many solutions, one of which is k1 = 1, k2 = −1, and k3 = 1. So

v1 − v2 + v3 = 0.

Thus, the vectors v1 , v2 and v3 are linearly dependent. •

The above results can be obtained using the MATLAB command window as follows:

>> v1 = [1 2 1]'; v2 = [1 3 -2]'; v3 = [0 1 -3]'; A = [v1 v2 v3]; null(A)

By using this MATLAB command, the answer we obtain means that there is a nonzero solution
to the homogeneous system Ak = 0.

Example 2.9 Find the value of α for which the set {(1, −2), (4, −α)} is linearly dependent.

Solution. Suppose a linear combination of the given vectors (1, −2) and (4, −α) vanishes, that is,

k1 (1, −2) + k2 (4, −α) = 0.

It can be written in linear system form as follows:

k1 + 4k2 = 0
−2k1 − αk2 = 0

or

[  1  4 ] [ k1 ]   [ 0 ]
[ −2 −α ] [ k2 ] = [ 0 ] .

By eliminating, we obtain

[ 1   4  ] [ k1 ]   [ 0 ]
[ 0  8−α ] [ k2 ] = [ 0 ] ,

which shows that the system has infinitely many solutions when α = 8. Thus the given set
{(1, −2), (4, −α)} is linearly dependent for α = 8. •

Theorem 2.6 Let a set {v1 , v2 , . . . , vn } be linearly dependent in a vector space V . Then any set
of vectors in V that contains these vectors will also be linearly dependent. •

Note that any collection of n linearly independent vectors in Rn is a basis for Rn .

Theorem 2.7 If A is an n × n matrix and λ1 , . . . , λn are distinct eigenvalues of A with associated


eigenvectors v1 , . . . , vn , then the set {v1 , . . . , vn } is linearly independent. •

Theorem 2.8 A set of m vectors in Rn is linearly dependent if m > n. •

Example 2.10 Show that the vectors v1 = (1, 3), v2 = (2, 4) and v3 = (3, 1) in R2 are linearly
dependent.

Solution. Suppose a linear combination of the given vectors (1, 3), (2, 4) and (3, 1) vanishes, that
is,

k1 (1, 3) + k2 (2, 4) + k3 (3, 1) = 0.

It can be written in linear system form as follows:

k1 + 2k2 + 3k3 = 0
3k1 + 4k2 + k3 = 0

or

[ 1 2 3 ] [ k1 ]   [ 0 ]
[ 3 4 1 ] [ k2 ] = [ 0 ] .
          [ k3 ]

By solving this system, we see that it has infinitely many solutions, one of which is

k1 = 5, k2 = −4, k3 = 1.

So

5v1 − 4v2 + v3 = 0.

Thus, the vectors v1 , v2 and v3 are linearly dependent; there cannot be more than two linearly
independent vectors in R2 . •

2.2.1 Orthogonal and Orthonormal Sets of Vectors

Here we define orthogonal and orthonormal sets of vectors, which play an important role in
eigenvalue problems.

Definition 2.7 (Orthogonal Set of Vectors)

A set of vectors {v1 , v2 , . . . , vn } in Rn is called an orthogonal set if

viT vj = 0, for all i ≠ j, i, j = 1, 2, . . . , n. (2.28)

For example, the set of vectors {v1 , v2 , v3 } in R3 , where

     [  4 ]        [ −1 ]        [ 2 ]
v1 = [  2 ] , v2 = [  2 ] , v3 = [ 1 ] ,
     [ −5 ]        [  0 ]        [ 2 ]

is an orthogonal set in R3 because

v1T v2 = −4 + 4 + 0 = 0, v2T v3 = −2 + 2 + 0 = 0, v1T v3 = 8 + 2 − 10 = 0.

Theorem 2.9 An orthogonal set of vectors that does not contain the zero vector is linearly
independent. •

The proof of this theorem is beyond the scope of this text and will be omitted. However, the
result is extremely important and can be easily understood and used. We illustrate it by
considering the following matrix

    [  6 −2 2 ]
A = [ −2  5 0 ] ,
    [  2  0 7 ]

which has the eigenvalues 3, 6 and 9. The corresponding eigenvectors of A are [2, 2, −1]T , [−1, 2, 2]T ,
[2, −1, 2]T , and they form an orthogonal set. To show that the vectors are linearly independent, we
write

k1 v1 + k2 v2 + k3 v3 = 0;

then the equation

k1 (2, 2, −1) + k2 (−1, 2, 2) + k3 (2, −1, 2) = (0, 0, 0)

leads to the homogeneous system of three equations in the three unknowns k1 , k2 , and k3 :

2k1 − k2 + 2k3 = 0
2k1 + 2k2 − k3 = 0
−k1 + 2k2 + 2k3 = 0

Thus the vectors will be linearly independent if and only if the above system has only the trivial
solution. Writing the system in augmented matrix form and then row-reducing gives

[  2 −1  2 0 ]       [ 1 0 0 0 ]
[  2  2 −1 0 ]  −→   [ 0 1 0 0 ] ,
[ −1  2  2 0 ]       [ 0 0 1 0 ]

which gives k1 = 0, k2 = 0, k3 = 0. Hence the given vectors are linearly independent. •

Theorem 2.10 The determinant of a matrix is zero if and only if the rows (or columns) of the
matrix form a linearly dependent set. •

Definition 2.8 (Orthonormal Set of Vectors)

A set of vectors {v1 , v2 , . . . , vn } in Rn is called an orthonormal set if it is an orthogonal set of
unit vectors, that is,

viT vi = 1, for all i = 1, 2, . . . , n. (2.29)

For example, the set of vectors {u1 , u2 } in R3 , where

u1 = [1/√3, 1/√3, −1/√3]T , u2 = [−1/√6, 2/√6, 1/√6]T ,

is an orthonormal set in R3 because

uT1 u1 = 1/3 + 1/3 + 1/3 = 1, uT2 u2 = 1/6 + 4/6 + 1/6 = 1, uT1 u2 = −1/√18 + 2/√18 − 1/√18 = 0.

If we have an orthogonal set, we can easily obtain an orthonormal set from it: we simply normalize
each vector. •
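
Normalization is a one-line operation per vector; a short MATLAB sketch using the orthogonal set
from Definition 2.7:

>> v1 = [4 2 -5]'; v2 = [-1 2 0]'; v3 = [2 1 2]';
>> u1 = v1/norm(v1); u2 = v2/norm(v2); u3 = v3/norm(v3);
>> U = [u1 u2 u3]; U'*U        % close to eye(3), confirming orthonormality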

2.2.2 Similarity and Diagonalization of Matrices


Diagonal and triangular matrices are convenient in the sense that their eigenvalues are clearly
displayed. It would be nice if we could relate a given square matrix to a diagonal or triangular
one in such a way that they had exactly the same eigenvalues. The Gaussian elimination procedure
converts a square matrix into a triangular matrix, but it does not preserve the eigenvalues of the
matrix. Here we discuss a different way of transforming a matrix that does behave well with
respect to eigenvalues.

2.2.2.1 Similar Matrices


Here, we discuss a simple but useful result which makes it desirable to find ways to transform a
general n × n matrix A into another matrix B having the same eigenvalues. Unfortunately, the
elementary operations that can be used to reduce A → B are not suitable because the scale-and-subtract
operations alter eigenvalues. What we need are similarity transformations, which occur
frequently in applied linear algebra, very often in the context of relating coordinate systems.

Definition 2.9 (Similar Matrix)

Let A and B be square matrices of the same size. A matrix B is said to be similar to A if there
exists a nonsingular matrix Q such that B = Q−1 AQ. The transformation of A into B in this
manner is called a similarity transformation.
Note that the matrix Q depends on A and B and is not unique for a given pair of similar matrices A
and B. If A is similar to B, then one can write

AQ = QB,

which implies that it is not necessary to compute the inverse of Q to check similarity.


For example, if we have

A = [ 5 −3 ]      B = [ −1 1 ]      Q = [  7 −2 ]
    [ 4 −2 ] ,        [ −6 4 ] ,        [ 10 −3 ] ,

then one can easily show that the matrix A is similar to the matrix B, since

AQ = [ 5 −3 ] [  7 −2 ]   [ 5 −1 ]
     [ 4 −2 ] [ 10 −3 ] = [ 8 −2 ] ,

and

QB = [  7 −2 ] [ −1 1 ]   [ 5 −1 ]
     [ 10 −3 ] [ −6 4 ] = [ 8 −2 ] = AQ.

On the other hand, if we have

A = [  3 −1 ]      B = [  2 1 ]      Q = [ 1  1 ]
    [ −5  7 ] ,        [ −4 6 ] ,        [ 1 −5 ] ,

then

AQ = [  3 −1 ] [ 1  1 ]   [ 2   8  ]
     [ −5  7 ] [ 1 −5 ] = [ 2 −40 ] ,

and

QB = [ 1  1 ] [  2 1 ]   [ −2   7  ]
     [ 1 −5 ] [ −4 6 ] = [ 22 −29 ] ≠ AQ,

and therefore the matrix A is not similar to the matrix B. •
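
The check AQ = QB is easy to automate; a quick MATLAB sketch using the first pair of matrices
above:

>> A = [5 -3; 4 -2]; B = [-1 1; -6 4]; Q = [7 -2; 10 -3];
>> norm(A*Q - Q*B)     % zero (up to roundoff), so B = inv(Q)*A*Q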

Theorem 2.11 Similar matrices have the same eigenvalues.

Proof. Let A and B be similar matrices. Hence there exists a nonsingular matrix Q such that
B = Q−1 AQ. The characteristic polynomial of B is |B − λI|. Substituting for B and using the
multiplicative properties of determinants, we get

|B − λI| = |Q−1 AQ − λI| = |Q−1 (A − λI)Q|
         = |Q−1 ||A − λI||Q| = |A − λI||Q−1 ||Q|
         = |A − λI||Q−1 Q| = |A − λI||I| = |A − λI|.

The characteristic polynomials of A and B are identical, so their eigenvalues are the same. •

Theorem 2.12 Similar matrices have the same traces.

Proof. Let A and B be similar matrices. Hence there exists a nonsingular matrix Q such that
B = Q−1 AQ. Therefore, we have
trace(B) = trace(Q−1 AQ) = trace(Q−1 (AQ)) = trace((AQ)Q−1 )
= trace(AQQ−1 ) = trace(AI) = trace(A).
So the traces of matrices A and B are identical. •

Example 2.11 Consider the following matrices A and Q, where Q is nonsingular. Use the similarity
transformation Q−1 AQ to transform A into a matrix B.

    [ 0 0 −2 ]        [ −1 0 −2 ]
A = [ 1 2  1 ] ,  Q = [  0 1  1 ] .
    [ 1 0  3 ]        [  1 0  1 ]

Solution. Let

             [ −1 0 −2 ]−1 [ 0 0 −2 ] [ −1 0 −2 ]
B = Q−1 AQ = [  0 1  1 ]   [ 1 2  1 ] [  0 1  1 ]
             [  1 0  1 ]   [ 1 0  3 ] [  1 0  1 ]

    [  1 0  2 ] [ 0 0 −2 ] [ −1 0 −2 ]
  = [  1 1  1 ] [ 1 2  1 ] [  0 1  1 ]
    [ −1 0 −1 ] [ 1 0  3 ] [  1 0  1 ]

    [ 2 0 0 ]
  = [ 0 2 0 ] .
    [ 0 0 1 ]

One can also check that

          [ −2 0 −2 ]
AQ = QB = [  0 2  1 ] .
          [  2 0  1 ]

In Example 2.11, the matrix A is transformed into a diagonal matrix B. Not every square
matrix can be "diagonalized" in this manner. In what follows we discuss conditions under which
a matrix can be diagonalized and, when it can, ways of constructing an appropriate transforming
matrix Q. We will find that eigenvalues and eigenvectors play a key role in this discussion. •

2.2.2.2 Diagonalizable Matrices


The best situation (as in Example 2.11) is when a square matrix is similar to a diagonal matrix.
Of special importance for the study of eigenvalues are diagonal matrices. These will be denoted by

    [ λ1  0   0  · · ·  0  ]
    [ 0   λ2  0  · · ·  0  ]
D = [ ..  ..  ..  . .   .. ] ,
    [ 0   0   0  · · ·  λn ]

and D is called the spectral matrix, that is, all the diagonal elements of D are the eigenvalues of A. This
simple but useful result makes it desirable to find ways to transform a general n × n matrix A
into a diagonal matrix having the same eigenvalues. Unfortunately, the elementary operations that
can be used to reduce A → D are not suitable because the scale-and-subtract operations alter
eigenvalues. Here, what we need are similarity transformations.

Definition 2.10 (Diagonalizable Matrix)

A square matrix A is called diagonalizable if there exists an invertible matrix Q such that

D = Q−1 AQ, (2.30)

is a diagonal matrix. Note that all the diagonal elements of D are the eigenvalues of A, and the
invertible matrix Q can be written as

Q = [ x1 | x2 | · · · | xn ] ,

and is called the modal matrix because its columns contain x1 , x2 , . . . , xn , the eigenvectors
of A corresponding to the eigenvalues λ1 , . . . , λn . •

Theorem 2.13 Any matrix having linearly independent eigenvectors corresponding to distinct and
real eigenvalues is diagonalizable, that is,

Q−1 AQ = D,

where D is a diagonal matrix and Q is an invertible matrix.

Proof. Let λ1 , . . . , λn be the eigenvalues of a matrix A, with corresponding linearly independent
eigenvectors x1 , . . . , xn . Let Q be the matrix having x1 , . . . , xn as column vectors, that is,

Q = (x1 · · · xn ).

Since Ax1 = λ1 x1 , . . . , Axn = λn xn , matrix multiplication in terms of columns gives

AQ = (Ax1 · · · Axn ) = (λ1 x1 · · · λn xn )

                    [ λ1      0 ]
   = (x1 · · · xn ) [    ..     ] = QD.
                    [ 0      λn ]

Since the columns of Q are linearly independent, Q is invertible. Thus

          [ λ1      0 ]
Q−1 AQ =  [    ..     ] = D.
          [ 0      λn ]

Therefore, if a square matrix A has n linearly independent eigenvectors, these eigenvectors can be
used as the columns of a matrix Q that diagonalizes A. The diagonal matrix has the eigenvalues of
A as its diagonal elements. Note that the converse of the above theorem also holds, that is, if A is
diagonalizable, then it has n linearly independent eigenvectors. •

Example 2.12 Consider the following matrix

    [ 0 0  1 ]
A = [ 3 7 −9 ] ,
    [ 0 2 −1 ]

which has characteristic equation

p(λ) = λ3 − 6λ2 + 11λ − 6 = (λ − 1)(λ − 2)(λ − 3) = 0.

The eigenvalues of A therefore are 1, 2, and 3, with sum 6, which agrees with the trace of A.
Corresponding to these eigenvalues, the eigenvectors of A are x1 = [1, 1, 1]T , x2 = [1, 3, 2]T , and
x3 = [1, 6, 3]T . Thus the nonsingular matrix Q is given by

    [ 1 1 1 ]              [  3  1 −3 ]
Q = [ 1 3 6 ] ,  and Q−1 = [ −3 −2  5 ] .
    [ 1 2 3 ]              [  1  1 −2 ]

Thus

         [  3  1 −3 ] [ 0 0  1 ] [ 1 1 1 ]   [ 1 0 0 ]
Q−1 AQ = [ −3 −2  5 ] [ 3 7 −9 ] [ 1 3 6 ] = [ 0 2 0 ] = D,
         [  1  1 −2 ] [ 0 2 −1 ] [ 1 2 3 ]   [ 0 0 3 ]

so A is a diagonalizable matrix. •
The above results can be obtained using the MATLAB command window as follows:

>> A = [0 0 1; 3 7 -9; 0 2 -1]; P = poly(A);
>> [X, D] = eig(A); eigenvalues = diag(D);
>> x1 = X(:,1); x2 = X(:,2); x3 = X(:,3); Q = [x1 x2 x3]; D = inv(Q)*A*Q

It is possible for independent eigenvectors to exist even though the eigenvalues are not distinct,
though no theorem exists to show under what conditions they do so. The following example shows
the situation that can arise.
Example 2.13 Consider the following matrix

    [ 2 1 1 ]
A = [ 2 3 2 ] ,
    [ 3 3 4 ]

which has characteristic equation

p(λ) = λ3 − 9λ2 + 15λ − 7 = (λ − 7)(λ − 1)2 = 0.

The eigenvalues of A are 7, of multiplicity one, and 1, of multiplicity two. The eigenvectors
corresponding to these eigenvalues are x1 = [1, 2, 3]T , x2 = [−1, 1, 0]T , and x3 = [−1, 0, 1]T . Thus
the nonsingular matrix Q is given by

    [ 1 −1 −1 ]                    [  1  1  1 ]
Q = [ 2  1  0 ] ,  and Q−1 = (1/6) [ −2  4 −2 ] .
    [ 3  0  1 ]                    [ −3 −3  3 ]

Thus

         [ 7 0 0 ]
Q−1 AQ = [ 0 1 0 ] = D,
         [ 0 0 1 ]

so A is a diagonalizable matrix. •

2.2.3 Computing Powers of a Matrix


There are numerous problems in applied mathematics that require the computation of higher
powers of a square matrix. We now show how diagonalization can be used to simplify
such computations for diagonalizable matrices.
If A is a square matrix and Q is an invertible matrix, then

(Q−1 AQ)2 = Q−1 AQQ−1 AQ = Q−1 AIAQ = Q−1 A2 Q.

More generally, for any positive integer k, we have

(Q−1 AQ)k = Q−1 Ak Q.

It follows from this equation that if A is diagonalizable and Q−1 AQ = D is a diagonal matrix, then

Q−1 Ak Q = (Q−1 AQ)k = Dk .

Solving this equation for Ak yields

Ak = QDk Q−1 . (2.31)

Therefore, in order to compute the kth power of A, all we need to do is compute the kth power of the
diagonal matrix D and then form the product in (2.31) with the matrices Q and Q−1 . Taking the
kth power of a diagonal matrix is easy, for it simply amounts to taking the kth power of each of the
entries on the main diagonal.

Example 2.14 Consider the following matrix

    [  1 1 −4 ]
A = [  2 0 −4 ] ,
    [ −1 1 −2 ]

which has characteristic equation

p(λ) = λ3 + λ2 − 4λ − 4 = (λ + 1)(λ + 2)(λ − 2) = 0.

It gives the eigenvalues 2, −2 and −1 of the given matrix A, with corresponding eigenvectors
[1, 1, 0]T , [1, 1, 1]T and [1, 2, 1]T . Then the factorization A = QDQ−1 becomes

[  1 1 −4 ]   [ 1 1 1 ] [ 2  0  0 ] [  1  0 −1 ]
[  2 0 −4 ] = [ 1 1 2 ] [ 0 −2  0 ] [  1 −1  1 ] ,
[ −1 1 −2 ]   [ 0 1 1 ] [ 0  0 −1 ] [ −1  1  0 ]

and from (2.31), we have

     [ 1 1 1 ] [ (2)k    0      0    ] [  1  0 −1 ]
Ak = [ 1 1 2 ] [  0   (−2)k     0    ] [  1 −1  1 ] .
     [ 0 1 1 ] [  0      0   (−1)k   ] [ −1  1  0 ]

It implies that

     [ 2k + (−2)k − (−1)k      −(−2)k + (−1)k     −2k + (−2)k ]
Ak = [ 2k + (−2)k − 2(−1)k     −(−2)k + 2(−1)k    −2k + (−2)k ] .
     [ (−2)k − (−1)k           −(−2)k + (−1)k     (−2)k       ]
From this formula we can easily compute any power of the given matrix A. For example, if k = 10,
then

      [ 2047 −1023    0 ]
A10 = [ 2046 −1022    0 ] ,
      [ 1023 −1023 1024 ]

is the required 10th power of the matrix. •

The above results can be obtained using the MATLAB command window as follows:

>> A = [1 1 -4; 2 0 -4; -1 1 -2]; P = poly(A); [X, D] = eig(A);
>> eigenvalues = diag(D); x1 = X(:,1); x2 = X(:,2); x3 = X(:,3);
>> Q = [x1 x2 x3]; A10 = Q*D^10*inv(Q);

Example 2.15 Show that the following matrix A is not diagonalizable.

A = [ 5 −3 ]
    [ 3 −1 ] .

Solution. To compute the eigenvalues and corresponding eigenvectors of the given matrix A, we
have the characteristic equation of the form

|A − λI| = λ2 − 4λ + 4 = (λ − 2)(λ − 2) = 0,

which gives the repeated eigenvalue 2 of the given matrix A. To find the corresponding eigenvectors,
we solve (2.2) for λ = 2; we get

[ 3 −3 ] [ x1 ]
[ 3 −3 ] [ x2 ] = 0.

Solving the above homogeneous system gives 3x1 − 3x2 = 0, and we have x1 = x2 = α. Thus the
eigenvectors are the nonzero vectors of the form α[1, 1]T . The eigenspace is a one-dimensional space.
A is a 2 × 2 matrix, but it does not have two linearly independent eigenvectors. Thus A is
not diagonalizable. •
Note that defective matrices cannot be diagonalized because they do not possess enough eigenvectors
to form a basis.
When λ is listed twice as an eigenvalue but only one eigenvector is found, the following theorem
shows what to do in this situation.
Theorem 2.14 (Defective Multiplicity ≥ 2 Eigenvalues)

Suppose that the n × n constant matrix A has an eigenvalue λ of multiplicity ≥ 2 and all the
eigenvectors associated with λ are scalar multiples of an eigenvector v. Then there are infinitely
many vectors w such that
(A − λI)w = v.
Moreover, if w is any such vector, then

x1 = v and x2 = w

are linearly independent. •



Using Theorem 2.14, we can find the second (generalized) eigenvector of the matrix of Example 2.15
using (2.2) for λ = 2; we get

[ 3 −3 ] [ w1 ]       [ 1 ]
[ 3 −3 ] [ w2 ] = v = [ 1 ] .

Solving the above nonhomogeneous system gives 3w1 − 3w2 = 1, and choosing w1 = 0 we have
w2 = −1/3. Thus one such vector is w = [0, −1/3]T ; any vector of the form w + βv, β ∈ R, also
works.

Theorem 2.15 (Defective Multiplicity ≥ 3 Eigenvalues)

Suppose that the n × n constant matrix A has an eigenvalue λ of multiplicity ≥ 3 and all the
eigenvectors associated with λ are scalar multiples of an eigenvector v. Then there are infinitely
many vectors w such that
(A − λI)w = v, (2.32)

and if w is any such vector then there are infinitely many vectors u such that

(A − λI)u = w. (2.33)

If w satisfies equation (2.32) and u satisfies equation (2.33), then

x1 = v, x2 = w, and x3 = u

are linearly independent. •

Theorem 2.16 Suppose that the n × n constant matrix A has an eigenvalue λ of multiplicity
≥ 3 and all the eigenvectors associated with λ are linear combinations of two linearly independent
eigenvectors v1 and v2 . Then there are constants α1 and α2 (not both zero) such that, for

v3 = α1 v1 + α2 v2 , (2.34)

there are infinitely many vectors w such that

(A − λI)w = v3 . (2.35)

If w satisfies equation (2.35), then

x1 = v1 , x2 = v2 , x3 = w + v3

are linearly independent. •



Example 2.16 (a) Show that the following matrix A is diagonalizable.
(b) Find a diagonal matrix D that is similar to A.
(c) Determine the similarity transformation that diagonalizes A.
(d) Compute A5 for the matrix A.

A = [ 2 1 ]
    [ 3 4 ] .
Solution. (a) To compute the eigenvalues and corresponding eigenvectors of the given matrix A,
we have the characteristic equation of the form

|A − λI| = λ2 − 6λ + 5 = (λ − 1)(λ − 5) = 0,

it gives the distinct eigenvalues 1 and 5 of A, and one can easily compute the corresponding
eigenvectors by solving (2.2) for λ1 = 1 and λ2 = 5, which gives

x1 = α1 [−1, 1]T and x2 = α2 [−1, −3]T , where α1 , α2 ≠ 0.

Since A, a 2 × 2 matrix, has two linearly independent eigenvectors, it is diagonalizable.


(b) The matrix A is similar to the diagonal matrix D, which has diagonal elements λ1 = 1 and
λ2 = 5. Thus

A = [ 2 1 ]                    D = [ 1 0 ]
    [ 3 4 ]  is similar to         [ 0 5 ] .
(c) To find the similarity transformation that diagonalizes the given matrix A, we select two
convenient linearly independent eigenvectors, say

x1 = [ −1 ]  and x2 = [ −1 ]
     [  1 ]           [ −3 ] .

Let these vectors be the column vectors of the diagonalizing matrix

Q = [ −1 −1 ]
    [  1 −3 ] .

We then get

         [ −1 −1 ]−1 [ 2 1 ] [ −1 −1 ]
Q−1 AQ = [  1 −3 ]   [ 3 4 ] [  1 −3 ]

       = [ −3/4  1/4 ] [ 2 1 ] [ −1 −1 ]   [ 1 0 ]
         [ −1/4 −1/4 ] [ 3 4 ] [  1 −3 ] = [ 0 5 ] = D.

(d) To find the 5th power of the given matrix A, we use the values of the matrices Q and D:

     [ −1 −1 ] [ 1^5   0  ] [ −3/4  1/4 ]   [  782  781 ]
A5 = [  1 −3 ] [  0   5^5 ] [ −1/4 −1/4 ] = [ 2343 2344 ] ,

the required power of the given matrix A. •

Example 2.17 (a) Find a diagonalizable matrix A whose eigenvalues are λ1 = 2, λ2 = −1 and whose
corresponding eigenvectors are x1 = [−1, 1]T , x2 = [−2, 1]T .
(b) Compute A5 for the matrix A.
(c) For what vector b does the linear system A5 x = b have the solution x = [1, 1]T ?

Solution. (a) The matrix A is similar to the diagonal matrix D, which has diagonal elements
λ1 = 2 and λ2 = −1. Thus

D = [ 2  0 ]
    [ 0 −1 ] .

To find the similarity transformation that diagonalizes the matrix A, we use the two given linearly
independent eigenvectors,

x1 = [ −1 ]  and x2 = [ −2 ]
     [  1 ]           [  1 ] ,

and let these vectors be the column vectors of the diagonalizing matrix

Q = [ −1 −2 ]
    [  1  1 ] .

Thus we get

        [ −1 −2 ] [ 2  0 ] [ −1 −2 ]−1
QDQ−1 = [  1  1 ] [ 0 −1 ] [  1  1 ]

      = [ −1 −2 ] [ 2  0 ] [  1  2 ]   [ −4 −6 ]
        [  1  1 ] [ 0 −1 ] [ −1 −1 ] = [  3  5 ] = A,

the required diagonalizable matrix A.
(b) To find the 5th power of the diagonalizable matrix A, we use the matrices Q and D and get

               [ −1 −2 ] [ 2^5     0    ] [  1  2 ]   [ −34 −66 ]
A5 = QD5 Q−1 = [  1  1 ] [  0   (−1)^5 ] [ −1 −1 ] = [  33  65 ] ,

the required power of the diagonalizable matrix A.


(c) To find the vector b, we compute A5 x as follows:

       [ −34 −66 ] [ 1 ]   [ −100 ]
A5 x = [  33  65 ] [ 1 ] = [   98 ] = b,

the required column vector b. •

Definition 2.11 (Orthogonal Matrix)

It is a square matrix whose inverse can be determined by transposing it, that is,

A−1 = AT .

In other words, a square matrix whose columns form an orthonormal set is called an orthogonal
matrix. Such matrices do occur in some engineering problems. The matrix used to obtain a rotation
of coordinates about the origin of a Cartesian system is one example of an orthogonal matrix. For
example, consider the orthogonal matrix

A = [ 0.6 −0.8 ]          [  0.6 0.8 ]
    [ 0.8  0.6 ] ,  A−1 = [ −0.8 0.6 ] = AT .
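
Orthogonality is easy to verify numerically; a quick MATLAB check on the rotation-type matrix
above:

>> A = [0.6 -0.8; 0.8 0.6];
>> norm(A'*A - eye(2))      % zero up to roundoff, so A' = inv(A)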

Theorem 2.17 Let A be an orthogonal matrix; then:

1. A−1 is orthogonal.

2. det(A) = ±1.

3. If λ is an eigenvalue of A, then |λ| = 1.

4. If A and B are orthogonal square matrices, then so is AB. •

2.2.4 Orthogonal Diagonalization


Let Q be an orthogonal matrix, that is, Q−1 = QT . If such a matrix is used in a similarity
transformation, the transformation becomes D = QT AQ. This type of similarity transformation
is much easier to work with, since the inverse of Q is simply its transpose. There is therefore
considerable advantage in searching for situations where a reduction to a diagonal matrix using an
orthogonal matrix is possible.

Definition 2.12 (Orthogonally Diagonalizable Matrix)

A square matrix A is said to be orthogonally diagonalizable if there exists an orthogonal matrix Q
such that D = Q−1 AQ = QT AQ is a diagonal matrix. •

The following theorem tells us that the set of orthogonally diagonalizable matrices is in fact the
set of symmetric matrices.
Theorem 2.18 A square matrix A is orthogonally diagonalizable if and only if it is a symmetric
matrix.

Proof. Suppose that a matrix A is orthogonally diagonalizable; then there exists an orthogonal
matrix Q such that D = QT AQ and A = QDQT . Taking the transpose gives

AT = (QDQT )T = (QT )T (QD)T = QDQT = A.

Thus A is symmetric. The converse is also true, and its proof is omitted. •


Symmetric Matrices

Now our next goal is to devise a procedure for orthogonally diagonalizing a symmetric matrix, but
before we can do so, we need an important theorem about the eigenvalues and eigenvectors of
symmetric matrices.
Theorem 2.19 If A is a symmetric matrix, then:
(a) The eigenvalues of A are all real numbers.
(b) Eigenvectors from distinct eigenvalues are orthogonal. •

Theorem 2.20 The following conditions are equivalent for an n × n matrix Q:

(a) Q is invertible and Q−1 = QT .

(b) The rows of Q are orthonormal.

(c) The columns of Q are orthonormal. •

Diagonalization of a Symmetric Matrix

As a consequence of the preceding theorem, we obtain the following procedure for orthogonally
diagonalizing a symmetric matrix.

1. Find a basis for each eigenspace of A.

2. Find an orthonormal basis for each eigenspace.

3. Form the matrix Q whose columns are these orthonormal vectors.

4. The matrix D = QT AQ will be a diagonal matrix.

Example 2.18 Consider the following matrix

    [  3 −1  0 ]
A = [ −1  2 −1 ] ,
    [  0 −1  3 ]

which has a characteristic equation

λ3 − 8λ2 + 19λ − 12 = 0,

and it gives the eigenvalues 1, 3, and 4 for the given matrix A. Corresponding to these
eigenvalues, the eigenvectors of A are x1 = [1, 2, 1]T , x2 = [1, 0, −1]T , and x3 = [1, −1, 1]T , and
they form an orthogonal set. Note that the vectors

u1 = x1 /kx1 k2 = (1/√6)[1, 2, 1]T , u2 = x2 /kx2 k2 = (1/√2)[1, 0, −1]T , u3 = x3 /kx3 k2 = (1/√3)[1, −1, 1]T

form an orthonormal set, since they inherit orthogonality from x1 , x2 , and x3 , and in addition

ku1 k2 = ku2 k2 = ku3 k2 = 1.

Then an orthogonal matrix Q is formed from this orthonormal set of vectors as

    [ 1/√6   1/√2   1/√3 ]
Q = [ 2/√6    0    −1/√3 ] ,
    [ 1/√6  −1/√2   1/√3 ]

and

        [ 1/√6   2/√6   1/√6 ] [  3 −1  0 ] [ 1/√6   1/√2   1/√3 ]
QT AQ = [ 1/√2    0    −1/√2 ] [ −1  2 −1 ] [ 2/√6    0    −1/√3 ] ,
        [ 1/√3  −1/√3   1/√3 ] [  0 −1  3 ] [ 1/√6  −1/√2   1/√3 ]

which implies that

        [ 1 0 0 ]
QT AQ = [ 0 3 0 ] = D.
        [ 0 0 4 ]

Note that the eigenvalues 1, 3 and 4 of A are real and its eigenvectors form an orthonormal set,
since they inherit orthogonality from x1 , x2 and x3 , which satisfies the preceding theorem. •

The results of Example 2.18 can be obtained using the MATLAB command window as:

>> A = [3 -1 0; -1 2 -1; 0 -1 3]; P = poly(A); PP = poly2sym(P);
>> [X, D] = eig(A); lambda = diag(D); x1 = X(:,1); x2 = X(:,2); x3 = X(:,3);
>> u1 = x1/norm(x1); u2 = x2/norm(x2); u3 = x3/norm(x3);
>> Q = [u1 u2 u3]; D = Q'*A*Q;

Theorem 2.21 (Principal Axis Theorem)

The following conditions are equivalent for an n × n matrix A:


(a) A has an orthonormal set of n eigenvectors.
(b) A is orthogonally diagonalizable.
(c) A is symmetric. •

2.2.4.1 Special Matrices


In the following section we shall discuss some extremely important properties of the eigenvalue
problem. Before doing so, we discuss some special matrices.
Definition 2.13 (Conjugate of Matrix)

If the entries of an n × n matrix A are complex numbers, we can write


A = (aij ) = (bij + icij ),
where bij and cij are real numbers. The conjugate of a matrix A is a matrix
Ā = (āij ) = (bij − icij ).
For example, the conjugate of the matrix

    [ 2   π    i ]         [ 2   π   −i ]
A = [ 3   7    0 ]  is Ā = [ 3   7    0 ] .
    [ 4  1−i   4 ]         [ 4  1+i   4 ]
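
MATLAB computes the conjugate directly; a quick sketch with the matrix above:

>> A = [2 pi 1i; 3 7 0; 4 1-1i 4];
>> Abar = conj(A)        % negates the imaginary part of every entry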
