Module 3
More on Vector Spaces and Linear Transformations
1.1 Let X be a vector space over the field F and, given a set I, let us write
X^I := {x : I → X | x is a function}.
For x ∈ X^I, we shall write x(i) ∈ X as x_i ∈ X at each i ∈ I. For x, u ∈ X^I, α ∈ F, define αx + u ∈ X^I by (αx + u)_i := αx_i + u_i.
Verify that this turns X^I into a vector space (over F).
1.2 Write (X^I)_0 := {x ∈ X^I | x_i = 0 for all but finitely many i ∈ I}.
Verify that (X^I)_0 is a vector subspace of X^I.
1.3 Take X = F and show that I is (in bijective correspondence with) a basis of (F^I)_0.
Hint:
Define functions e_i : I → F by e_i(j) = δ_ij, i.e. e_i(j) = 1 if i = j and 0 if i ≠ j. Then each e_i ∈ (F^I)_0, and if i ≠ j we have e_i(i) = 1 while e_j(i) = 0, so that e_i ≠ e_j. Thus e : I → (F^I)_0 defined by e(i) := e_i is injective, which means that e : I → e(I) ⊆ (F^I)_0 is bijective. We thus identify i ∈ I with e_i ∈ (F^I)_0. If x ∈ (F^I)_0, we have x_i = 0 for all but finitely many i ∈ I; let us say x_{k_1}, …, x_{k_n} are the nonzero elements of F among them. Then
(e_{k_l} x_{k_l})(i) = (e_{k_l}(i)) x_{k_l} = x_{k_l} if k_l = i, and 0 if k_l ≠ i,
and thus x_i = x(i) = Σ_{l=1}^{n} (e_{k_l} x_{k_l})(i) at each i ∈ I, which means x = Σ_{l=1}^{n} e_{k_l} x_{k_l}, i.e. each x ∈ (F^I)_0 can be expressed as a finite linear combination of distinct vectors e_{k_l} ∈ e(I) ⊆ (F^I)_0 in exactly one way.
Thus the set e(I) is a basis of (F^I)_0, and since we agreed to identify I with e(I) via e : I → e(I), we can say I is a basis of (F^I)_0; this of course means that dim (F^I)_0 = |I|, where |I| is the (finite or infinite) number of elements of I. The vector space (F^I)_0 is called the vector space freely generated by I.
To sum up:
Given any finite or infinite number d, take a set I with |I| = d. The vector space (F^I)_0 has dimension d, and e(I) ⊆ (F^I)_0 supplies one explicit basis for (F^I)_0; this will also be expressed by saying that (F^I)_0 is freely generated by I, as well as by saying that I is a basis for (F^I)_0.
1.4 As instances of (1.3) above,
(i) The vector space of dimension 0 is {0}, freely generated by ∅, and ∅ is a basis.
(ii) The vector space of dimension 1 is F, freely generated by a singleton {1}, with e_1 = 1 ∈ F as a basis.
(iii) The vector space of dimension n ∈ N is F^n, freely generated by some n-element set {1, 2, …, n}, with
e_i = (0, …, 0, 1, 0, …, 0)^T (the 1 in the i-th row)
as a basis. We have, for x ∈ F^n, x = Σ_{i=1}^{n} e_i x_i = (x_1, x_2, …, x_n)^T, x_i ∈ F, obtained for a unique n-tuple (x_1, …, x_n) of scalars.
Note that e_i : {1, 2, …, n} → F, e_i(j) = δ_ij for 1 ≤ i ≤ n and 1 ≤ j ≤ n, is completely described by the n × 1 column display above; further, for n = 0 we have (i), and for n = 1 we have (ii).
1.5 A different display style is in vogue for the case I = N = {0, 1, 2, …}. A function x : N → X is of course just a sequence (x(0), x(1), …) of vectors x(i) ∈ X; they will be written x_i now, and the sequence will be displayed as x = Σ_{n=0}^{∞} x_n λ^n. The vector space X^N will be denoted by X[[λ]], with (X^N)_0 being denoted by X[λ]. Thus for x, u ∈ X[[λ]], α ∈ F, we write
αx + u = Σ_{n=0}^{∞} (αx + u)_n λ^n = Σ_{n=0}^{∞} (αx_n + u_n) λ^n = α Σ_{n=0}^{∞} x_n λ^n + Σ_{n=0}^{∞} u_n λ^n.
For X = F therefore, an element α ∈ F[[λ]] is written as α = Σ_{n=0}^{∞} α_n λ^n; this will be called a power series in the (single) indeterminate λ (one also says variable for indeterminate) with coefficients α_n ∈ F.
An element a = Σ_{n=0}^{m} a_n λ^n ∈ (F^N)_0 will be called a polynomial of degree m if a_m ≠ 0; the zero polynomial, for which a_n = 0 at each n, will be said to be of degree −∞ (some authors would say the degree is undefined). We note that each nonzero α ∈ F can be regarded as a polynomial of degree 0; when we wish to think of F as sitting inside F[λ] = (F^N)_0 in this way, we say it is a constant. Thus F is a subspace of F[λ], which is in turn a subspace of F[[λ]].
An explicit basis of F[λ] is given by the set {1, λ, λ², …}. Thus the dimension of F[λ] is not finite.
Verify that P_n[λ], the collection of polynomials of degree at most n (thus a = a(λ) ∈ P_n[λ] iff a = a_0 + a_1 λ + … + a_m λ^m with 0 ≤ m ≤ n), is a subspace of F[λ], and that {1, λ, λ², …, λ^n} is a basis of P_n[λ]. Thus dim P_n[λ] = n + 1 and, as such, P_n[λ] ≅ F^{n+1}.
1.6 (i) Given α = Σ_{n=0}^{∞} α_n λ^n ∈ F[[λ]] and x = Σ_{n=0}^{∞} x_n λ^n ∈ X[[λ]], define xα := Σ_{n=0}^{∞} (xα)_n λ^n where (xα)_n := Σ_{p+q=n} x_p α_q. Verify that xα ∈ X[[λ]].
Hint:
For any n ∈ N, only finitely many p, q ∈ N exist with p + q = n, so that Σ_{p+q=n} x_p α_q is a finite sum of vectors in X and thus is in X; let us remind ourselves that each x_p α_q ∈ X since x_p ∈ X, α_q ∈ F.
(ii) In particular, with α = Σ_{n=0}^{∞} α_n λ^n ∈ F[[λ]] and β = Σ_{n=0}^{∞} β_n λ^n ∈ F[[λ]], we obtain αβ = Σ_{n=0}^{∞} γ_n λ^n ∈ F[[λ]], where γ_n = (αβ)_n := Σ_{p+q=n} α_p β_q ∈ F, and thus αβ ∈ F[[λ]]; we say that αβ is the convolution of α and β. Verify that with convolution as multiplication, F[[λ]] becomes a ring with identity (what is the identity?) of which F[λ] is a subring (well known to you as the polynomial ring). Further, verify that these are commutative rings.
Hint:
Repeat what you did for the polynomial ring F[λ] earlier; the same proof works for F[[λ]] as well.
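As a computational aside (not part of the handout's hints): the convolution product above is exactly coefficient-wise multiplication of (truncated) power series. The short Python/NumPy sketch below, with arbitrarily chosen coefficient sequences, computes (αβ)_n = Σ_{p+q=n} α_p β_q directly and compares with NumPy's built-in polynomial product; it also illustrates commutativity.

import numpy as np

def convolve(alpha, beta, terms):
    # Coefficients of the product, (alpha*beta)_n = sum_{p+q=n} alpha_p beta_q,
    # truncated to the first `terms` coefficients.
    gamma = np.zeros(terms)
    for n in range(terms):
        gamma[n] = sum(alpha[p] * beta[n - p]
                       for p in range(n + 1)
                       if p < len(alpha) and (n - p) < len(beta))
    return gamma

alpha = np.array([1.0, 2.0, 0.0, 3.0])   # 1 + 2λ + 3λ^3
beta  = np.array([4.0, 0.0, 5.0])        # 4 + 5λ^2

print(convolve(alpha, beta, 6))          # coefficients of the product
print(convolve(beta, alpha, 6))          # same answer: convolution is commutative
print(np.convolve(alpha, beta))          # NumPy's polynomial product agrees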
1.7 For vector spaces X and Y over F, we shall also say X →A Y is F-linear, instead of simply saying A is linear, when we wish to emphasize the role of F.
(i) Verify that the map λ : X[[λ]] → X[[λ]] defined by λ(x) := λx (that is, x = Σ_{n=0}^{∞} x_n λ^n ↦ Σ_{n=0}^{∞} x_n λ^{n+1} = λx) is F-linear. (Since it shifts the sequence (x_0, x_1, …) to (0, x_0, x_1, …), it is called the right shift operator.)
Thus if A : X[[λ]] → Y[[λ]] is F-linear, it raises two F-linear transformations:
Aλ : X[[λ]] →λ X[[λ]] →A Y[[λ]], and
λA : X[[λ]] →A Y[[λ]] →λ Y[[λ]].
(ii) Show that λA = Aλ iff Aλ^k = λ^k A for each k ∈ N.
Hint:
Clearly, for k = 0 the result holds, λ^0 being the identity. If Aλ^k = λ^k A, we have Aλ^{k+1} = (Aλ^k)λ = (λ^k A)λ = λ^k (Aλ) = λ^k (λA) = λ^{k+1} A.
(iii) Prove that for A : X[[λ]] → Y[[λ]], the following are equivalent:
(a) A is F[[λ]]-linear (i.e. A(x + uα) = A(x) + (A(u))α for x, u ∈ X[[λ]] and α ∈ F[[λ]]; this does not of course mean that X[[λ]] and Y[[λ]] are vector spaces over F[[λ]], since F[[λ]] is not a field, but it does mean that they are modules over the ring F[[λ]], a concept which we do not cover in this course).
(b) A is F[λ]-linear (i.e. A(x + ua) = A(x) + A(u)a for x, u ∈ X[[λ]] and a ∈ F[λ]; again, F[λ] is not a field).
(c) A is F-linear with λA = Aλ.
Hint:
F[[λ]]-linearity of A certainly entails the F[λ]-linearity of A, which in turn entails the F-linearity of A, since F is a subring of F[λ] and F[λ] is a subring in turn of F[[λ]]. Further, if A is F[[λ]]-linear, we do have λA = Aλ since λ ∈ F[[λ]], and thus (a) ⇒ (b) ⇒ (c) is established. If (c) is true, we have Aλ^k = λ^k A for each k ∈ N, and if a = a_0 + a_1 λ + … + a_n λ^n ∈ F[λ], we find
(Aa)(x) = A((a_0 + a_1 λ + … + a_n λ^n)x)
= (Aa_0)(x) + (Aa_1 λ)(x) + … + (Aa_n λ^n)(x)
= a_0 A(x) + a_1 (Aλ)(x) + … + a_n (Aλ^n)(x)
= a_0 A(x) + a_1 λ(A(x)) + … + a_n λ^n A(x)
= aA(x) at each x ∈ X[[λ]],
which means that A is in fact F[λ]-linear. Thus (c) ⇒ (b) is also true.
Now assume (b). Given α = Σ_{n=0}^{∞} α_n λ^n ∈ F[[λ]], write it as α = (Σ_{n=0}^{p−1} α_n λ^n) + (Σ_{n=0}^{∞} α_{n+p} λ^n)λ^p = a + βλ^p with a = Σ_{n=0}^{p−1} α_n λ^n ∈ F[λ], β = Σ_{n=0}^{∞} α_{n+p} λ^n ∈ F[[λ]] and λ^p ∈ F[λ]. Then for x ∈ X[[λ]], we have
A(xα) = A(xa + xβλ^p)   [∵ x ∈ X[[λ]], a ∈ F[λ] ⊆ F[[λ]], β ∈ F[[λ]], λ^p ∈ F[λ] ⊆ F[[λ]], so that βλ^p ∈ F[[λ]], and for γ, δ ∈ F[[λ]] we have x(γ + δ) = xγ + xδ]
= A(xa + uλ^p)   [with u = xβ ∈ X[[λ]], since x ∈ X[[λ]] and β ∈ F[[λ]]]
= A(x)a + A(u)λ^p   [∵ A : X[[λ]] → Y[[λ]] is F[λ]-linear by the assumed (b)],
while (Ax)α = (Ax)(a + βλ^p) = (Ax)a + (Ax)(βλ^p)   [∵ a ∈ F[λ], βλ^p ∈ F[[λ]], A(x) ∈ Y[[λ]], and for y ∈ Y[[λ]], γ, δ ∈ F[[λ]], y(γ + δ) = yγ + yδ].
Therefore (Ax)α and A(xα) have the same coefficients for the powers λ^k of λ with k < p. But p was arbitrary; therefore (Ax)α and A(xα) have the same coefficient for all powers of λ, and thus A(xα) = (Ax)α for any x ∈ X[[λ]], α ∈ F[[λ]], i.e. A is F[[λ]]-linear and (a) holds.
This proves the advertised result (a) ⇔ (b) ⇔ (c).
2. Finding the dual basis to a given basis e_1, …, e_n of X will be done by applying the defining formula e^i|e_j⟩ = δ_ij. Thus suppose we want to find the basis dual to L = {L_0, …, L_n} of P_{n+1} (= the space of polynomials with degree at most n), where the L_i, 0 ≤ i ≤ n, are the Lagrange polynomials (refer to Handout-I, page (7)). If the required basis (for X*) is given by {L^i | 0 ≤ i ≤ n}, then we have L^i : P_{n+1} → F given by L^i|a := a(λ_i)   (∵ a(λ) = Σ_{i=0}^{n} a(λ_i) L_i(λ), as proved on Handout-I, page (7)).
(i) Find the basis dual to {e_i = λ^i | 0 ≤ i ≤ n} of P_{n+1}.
Hint:
Since by the Taylor formula a(λ) = Σ_{i=0}^{n} ((D^i(a))(0)/i!) λ^i, we shall have e^i|a = (D^i(a))(0)/i! for any a = a(λ) ∈ P_{n+1}.
Since we did not say F = R, you may worry about the validity of this Hint. However, the following should make you happy.
Let us say a vector space 𝒜 is an (associative) algebra (with an identity) iff the vectors actually form an (associative) ring (with identity) and further
α(xu) = (αx)u = x(αu)
holds for all x, u ∈ 𝒜, α ∈ F.
Example 0.1. 𝒜 = L(X, X) is an algebra when BA, for A, B : X → X, is regarded as multiplication; because of the correspondence A ↔ a between L(X, X) and Mat_n(F) for dim X = n < ∞, these two are actually the same.
Example 0.2. The polynomials F[λ] form an algebra, since they form a ring under convolution as multiplication.
Now given an algebra 𝒜, say that a linear transformation D : 𝒜 → 𝒜 is a derivation if it satisfies
D(xu) = (Dx)u + x(Du) for all x, u ∈ 𝒜,
and prove the Taylor formula, copying it from your calculus text, assuming char F = 0.
This may still leave you unsatisfied, since P_{n+1} is not an algebra: x, u ∈ P_{n+1} does not mean xu ∈ P_{n+1}. However, convince yourself that the dual basis can still be found by the trick suggested.
If you don't believe that Derivation is a concept of Linear Algebra, you may assume for this problem that F = R.
(ii) Take e_0 = 1, e_1 = 1 + λ, e_2 = λ + λ². Prove that this is a basis for P_2[λ] and find the dual basis e^0, e^1, e^2.
Hint:
Any a(λ) ∈ P_2[λ] is uniquely given as a_0 + a_1 λ + a_2 λ² with a_0 = x_0 + x_1, a_1 = x_1 + x_2, a_2 = x_2 when we write
a(λ) = x_0 e_0 + x_1 e_1 + x_2 e_2 = (x_0 + x_1) + (x_1 + x_2)λ + x_2 λ².
Thus x_0 = a_0 − a_1 + a_2, x_1 = a_1 − a_2, x_2 = a_2.
Since e^i|a = x_i, we see that the required e^i : P_2[λ] → F are given by
e^0|a = e^0|(a_0 + a_1 λ + a_2 λ²) = a_0 − a_1 + a_2 = x_0,
e^1|a = x_1 = a_1 − a_2,
e^2|a = x_2 = a_2.
3. We already know that a vector space X is accompanied by its twin, X*. However, complex vector spaces, i.e. vector spaces over F = C, actually come as quadruplets. To be explicit, suppose X is a complex vector space. Then there is another scalar multiplication given by α · x := ᾱx, where the scalar multiplication on the RHS is the scalar multiplication in X. The abelian group X of the vectors then turns into another vector space, call it X̄, under this new scalar multiplication. We shall say X̄ is the conjugate-space of X. Then there is naturally (X̄)*; let us write X̄* = (X̄)*. So we have X, X̄, X*, X̄*; these are the four vector spaces signaled by saying "the complex vector space X".
3.1 When we say X →A Y is a linear map, we mean that it is a homomorphism of the abelian groups:
A(x + u) = Ax + Au (the addition in X is preserved by A into the addition in Y), and
A(αx) = α(Ax) (the scalar multiplication in X is preserved by A into the scalar multiplication in Y).
Thus a linear map X̄ →A Y would be such that
A(x + u) = Ax + Au and A(α · x) = α(Ax), i.e. A(ᾱx) = α(Ax).
Such a map is called a semi-linear map X → Y (rather than a linear map X̄ →A Y); other names for semi-linear are conjugate-linear and anti-linear. Thus what is called a sesqui[= 1½]-linear form on X in Handout-II is simply a bilinear form s : X̄ × X → C.
Note:
If X →A Y →B Z is a scheme of conjugate-linear maps, X →BA Z is linear (BA(αx + u) = B(ᾱA(x) + A(u)) = αBA(x) + BA(u)).
To sum up:
In a linear map the scalar emerges unscathed; in a conjugate-linear map the scalar is flipped into its conjugate.
3.2 We equip X̄ with the inner product ⟨x|u⟩‾ := ⟨u|x⟩ (the RHS being the inner product in X).
Since we have ⟨u|x⟩ = conj⟨x|u⟩, we obtain
⟨x|u⟩‾ = ⟨u|x⟩ = conj⟨x|u⟩,
i.e. the inner product ⟨·|·⟩‾ is linear in the first variable and conjugate-linear in the second variable (while the inner product on X was conjugate-linear in the first and linear in the second variable) if considered as an inner product on X; note however that the definition is quite consistent, since we do have
⟨x|α · u⟩‾ = ⟨ᾱu|x⟩ = α⟨u|x⟩ = α⟨x|u⟩‾,
so that with the scalar multiplication as defined on X̄, we indeed have conjugate-linearity in the first variable and linearity in the second variable.
Note:
It is important to be clear about these things. Thus some textbooks assert that the import of the Riesz Representation theorem is to say that if X is a Hilbert space, it is isomorphic to its dual (= transpose) X*; this is false, since the bijection displayed is conjugate-linear and not linear.
4. Prove that R is a vector space of infinite dimension over F = Q.
Hint:
Since π ∈ R is transcendental, there is no nonzero polynomial a = a_0 + a_1 λ + … + a_n λ^n with a_i ∈ Q for which a(π) = 0, i.e. 0 = a_0 + a_1 π + … + a_n π^n with a_i ∈ Q forces a_i = 0. This means {1, π, π², …, π^n} is linearly independent over Q for any n, and thus we cannot have a finite-dimensional space over Q; any basis must therefore have an infinite number of elements. Thus linear independence is tied to the field F under consideration.
5. Given a = a(λ) ∈ F[λ], a = a_0 + a_1 λ + … + a_m λ^m, a_m ≠ 0 (i.e. deg(a) = m), and A : X → X with dim(X) = n < ∞, we get a(A) : X → X defined by
a(A) := a_0 Id + a_1 A + … + a_m A^m   (A^0 = Id on X).
5.1 Verify
(i) a(λ) = b(λ) ⇒ a(A) = b(A),
(ii) a(λ) + b(λ) = c(λ) ⇒ a(A) + b(A) = c(A),
(iii) a(λ)b(λ) = c(λ) ⇒ a(A)·b(A) = c(A),
(iv) a(λ) = d(λ)q(λ) + r(λ) (q(λ) is the quotient and r(λ) the remainder on dividing a(λ) by d(λ)) ⇒ a(A) = d(A)q(A) + r(A), and illustrate this by
a(λ) = λ³ + 3λ² + 2λ + μ (μ ∈ F), a(λ) = d(λ)q(λ) + r(λ) for
d(λ) = λ + 1,  A = [ 1 2 ; 1 0 ].
Hint:
q(λ) = λ² + 2λ, r(λ) = μ, d(A) = A + Id_2 = [ 2 2 ; 1 1 ], q(A) = A² + 2A = [ 5 6 ; 3 2 ], a(A) = [ 16+μ 16 ; 8 8+μ ] = d(A)q(A) + r(A), r(A) = [ μ 0 ; 0 μ ].
Here A : F² → F² is given by Ax = [ 1 2 ; 1 0 ][ x_1 ; x_2 ] ∈ F² for x = [ x_1 ; x_2 ] ∈ F².
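As a purely numerical cross-check of (iv) (not part of the handout; μ is given an arbitrary sample value just to run it), the following Python/NumPy sketch verifies that a(A) = d(A)q(A) + r(A) for the data above.

import numpy as np

A  = np.array([[1.0, 2.0], [1.0, 0.0]])
I2 = np.eye(2)
mu = 7.0                                      # any sample value of μ

aA = A @ A @ A + 3 * A @ A + 2 * A + mu * I2  # a(A) = A^3 + 3A^2 + 2A + μI
dA = A + I2                                   # d(A)
qA = A @ A + 2 * A                            # q(A)
rA = mu * I2                                  # r(A)

print(np.allclose(aA, dA @ qA + rA))          # True: a(A) = d(A)q(A) + r(A)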
(v) Prove that if a : (X, e) → (X, e) represents A (i.e. a = [a^i_j] = [e^i(Ae_j)] = [⟨e^i|Ae_j⟩]_{n×n} is the matrix of A with respect to the basis e = (e_1, …, e_n)) and p(λ) = p_0 + p_1 λ + … + p_m λ^m ∈ F[λ], then p(A) is represented by p : (X, e) → (X, e) with p = [p^i_j], where p^i_j is the (i, j) entry of the matrix p(a) = p_0 + p_1 a + … + p_m a^m in Mat_n(F). Illustrate this for X = F², p(λ) = λ² + 3, Ax = [ x_1 + x_2 ; x_1 − 2x_2 ].
Hint:
For e = {e_1, e_2}, e_1 = [ 1 ; 0 ], e_2 = [ 0 ; 1 ], we have Ae_1 = [ 1 ; 1 ], Ae_2 = [ 1 ; −2 ], and thus the matrix of A is a = [ 1 1 ; 1 −2 ], while (p(A))(x) = (A² + 3Id)(x) = [ 5x_1 − x_2 ; −x_1 + 8x_2 ], supplying (p(A))(e_1) = [ 5 ; −1 ], (p(A))(e_2) = [ −1 ; 8 ] and thus [ 5 −1 ; −1 8 ] as the matrix of p(A) (with respect to e), which is indeed a² + 3Id_2. Note further that with d(λ) = λ − 1 we have q(λ) = λ + 1, r(λ) = 4, and
(p(A))[ x_1 ; x_2 ] = [ 5x_1 − x_2 ; −x_1 + 8x_2 ] = [ x_1 − x_2 ; −x_1 + 4x_2 ] + [ 4x_1 ; 4x_2 ] = (d(A)·q(A) + r(A))[ x_1 ; x_2 ],
exactly as expected, in analogy with (iv), and thus p(A) = d(A)·q(A) + r(A) because p(λ) = d(λ)q(λ) + r(λ).
(vi) Show that p(a) of (v) above has determinant
det p(a) = p(λ_1)·p(λ_2)⋯p(λ_n)
for F = C, where λ_1, …, λ_n are the eigenvalues of a = [a^i_j]_{n×n}.
Hint:
p(λ) can be resolved into linear factors:
p(λ) = γ(λ − μ_1)⋯(λ − μ_m) for γ, μ_1, …, μ_m ∈ C,
and then the matrix polynomial p(a) = γ(a − μ_1 I_n)⋯(a − μ_m I_n). So if χ_a is the characteristic polynomial of a, we get det(a − μ_j I_n) = (−1)^n χ_a(μ_j) = (λ_1 − μ_j)⋯(λ_n − μ_j), since χ_a(λ) = det(λI_n − a) = (λ − λ_1)⋯(λ − λ_n), and thus
det p(a) = γ^n det(a − μ_1 I_n)⋯det(a − μ_m I_n)
= γ^n ∏_{i=1}^{n} (λ_i − μ_1)(λ_i − μ_2)⋯(λ_i − μ_m)
= ∏_{i=1}^{n} γ(λ_i − μ_1)⋯(λ_i − μ_m)
= p(λ_1)p(λ_2)⋯p(λ_n).
(vii) Prove that in (vi), the eigenvalues of p(a) are p(λ_1), …, p(λ_n), and that if the eigenvector w_i corresponds to the eigenvalue λ_i for a, then the eigenvector w_i corresponds to the eigenvalue p(λ_i) for p(a) also, 1 ≤ i ≤ n.
Hint:
We get
χ_{p(a)}(λ) = det[λI_n − p(a)] = det[λI_n − γ(a − μ_1 I_n)⋯(a − μ_m I_n)] = [λ − p(λ_1)]⋯[λ − p(λ_n)],
supplying the eigenvalues p(λ_i), 1 ≤ i ≤ n, for p(a), given that the λ_i are the eigenvalues of a. Further,
(p(a))(w_i) = (p_0 I_n + p_1 a + … + p_m a^m)(w_i) = (p_0 + p_1 λ_i + … + p_m (λ_i)^m) w_i = p(λ_i) w_i,
so that the eigenvalue p(λ_i) corresponds to the eigenvector w_i for p(a).
(viii) In (vii) above, consider arbitrarily given polynomials g(λ), f(λ) with g(λ_i) ≠ 0 for 1 ≤ i ≤ n. Prove that the eigenvalues of h(a) = f(a)g^{−1}(a) = f(a)(g(a))^{−1} = (g(a))^{−1} f(a) are given by h(λ_i) := f(λ_i)/g(λ_i), with corresponding eigenvectors w_i for 1 ≤ i ≤ n.
Hint:
We have det(g(a)) = g(λ_1)⋯g(λ_n) ≠ 0, so (g(a))^{−1} exists, and det(h(a)) = (det f(a))[det g(a)]^{−1} = ∏_{i=1}^{n} f(λ_i)/g(λ_i) = ∏_{i=1}^{n} h(λ_i), which means det[λI_n − h(a)] = ∏_{i=1}^{n} [λ − h(λ_i)] and provides the h(λ_i) as the eigenvalues. Further, (g(a))(w_i) = g(λ_i) w_i and (f(a))(w_i) = f(λ_i) w_i, which together with h(a)g(a)(w_i) = f(a)(w_i) forces (h(a))(w_i) = (f(λ_i)/g(λ_i)) w_i.
(ix) In (viii) above, do we have g^{−1}(a) = (g(a))^{−1} as a polynomial in the matrix a?
Hint:
Writing B = g(a), we know that B is invertible. Suppose it has the minimal polynomial
μ_B(λ) = β_0 + β_1 λ + … + λ^r.
Then β_0 ≠ 0, since otherwise λ = 0 would be an eigenvalue of B = g(a) and B would not be invertible. Now this means β_0 I_n + β_1 B + … + B^r = 0, which supplies
B^{−1} = −(1/β_0)[β_1 I_n + β_2 B + … + B^{r−1}],
so that (g(a))^{−1} = −(1/β_0)[β_1 I_n + β_2 (g(a)) + … + (g(a))^{r−1}], which is a polynomial in the matrix a, since g(a) is a polynomial in the matrix a.
6. Suppose d_{n−1}(λ) is the gcd of all minors of order n − 1 of the matrix λI_n − a, where a = [a^i_j]_{n×n} and χ_a(λ) = det(λI_n − a) is the characteristic polynomial. Prove that the minimal polynomial of the matrix a is given by
μ_a(λ) = χ_a(λ) / d_{n−1}(λ).
Hint:
Since χ_a(λ) = det(λI_n − a) can be expressed as the sum of the elements in the first row of the matrix λI_n − a multiplied by their corresponding cofactors, d_{n−1}(λ) is a divisor of χ_a(λ), which means ψ(λ) := χ_a(λ)/d_{n−1}(λ) is a polynomial, and it must be monic since both χ_a(λ) and d_{n−1}(λ) are monic. Since d_{n−1}(λ) is a divisor of all the elements of adj(λI_n − a) (the adjugate of λI_n − a, not the adjoint), we can write d_{n−1}(λ) r(λ) = adj(λI_n − a), where the gcd of all the elements in the matrix r(λ) (entries from F[λ]) is 1.
Now (λI_n − a) adj(λI_n − a) = χ_a(λ) I_n, so that, on division by d_{n−1}(λ), we get (λI_n − a) r(λ) = ψ(λ) I_n and thus ψ(a) = 0 (remainder theorem). Therefore we do have ψ(λ) as an annihilating polynomial of a. If p(λ) were some annihilating polynomial of a with deg p(λ) < deg ψ(λ), the remainder theorem would supply (λI_n − a) q(λ) = p(λ) I_n; with p(λ) q̃(λ) = ψ(λ), which must exist since p(a) = 0 = ψ(a), we would now have
(λI_n − a) q(λ) q̃(λ) = ψ(λ) I_n,
which supplies r(λ) = q(λ) q̃(λ) and contradicts the fact that the gcd of all the elements in the matrix r(λ) is 1. Thus ψ(λ) must be the annihilating polynomial of a having least degree, i.e. ψ(λ) is the minimal polynomial.
Example 0.3. For a = [ 1 0 1 ; 0 1 0 ; 0 0 −1 ] we have λI_3 − a = [ λ−1 0 −1 ; 0 λ−1 0 ; 0 0 λ+1 ],
adj(λI_3 − a) = [ λ²−1 0 λ−1 ; 0 λ²−1 0 ; 0 0 (λ−1)² ],
with d_2(λ) = λ − 1, χ_a(λ) = (λ − 1)(λ² − 1), and μ_a(λ) = λ² − 1; verify that μ_a(a) = a² − I_3 = 0.
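Example 0.3 can also be checked symbolically. The following SymPy sketch (an added illustration, not part of the handout) computes χ_a, the gcd d_2 of the entries of the adjugate of λI − a, and the quotient μ_a = χ_a/d_2 for the matrix above.

import sympy as sp
from functools import reduce

lam = sp.symbols('lambda')
a = sp.Matrix([[1, 0, 1], [0, 1, 0], [0, 0, -1]])
M = lam * sp.eye(3) - a

chi = sp.factor(M.det())                 # characteristic polynomial (λ-1)^2 (λ+1)
adj = M.adjugate()
d2  = reduce(sp.gcd, list(adj))          # gcd of the entries of adj(λI - a): λ - 1
mu  = sp.cancel(chi / d2)                # (λ-1)(λ+1) = λ^2 - 1

print(chi, d2, mu)
print(a**2 - sp.eye(3))                  # μ_a(a) = a^2 - I_3 = zero matrix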
7. If W is a subspace of a vector space X over F, the collection X/W := {x + W | x ∈ X} is turned into a vector space (over F, the same field) under the operations, given x, u ∈ X, α ∈ F,
(x + W) + (u + W) := (x + u) + W,  α(x + W) := αx + W.
Hint:
X/W is certainly an abelian group, seeing that X is an abelian group and W is hence automatically a normal subgroup (before becoming a subspace). We note that W = 0 + W is the additive identity for this group X/W. Writing [x] for x + W therefore, we see that the elements of X/W are [x] for x ∈ X and the operations supplied are
[x] + [u] := [x + u],  α[x] := [αx].
Since [u] = [u′] means u − u′ ∈ W, and this ensures α(u − u′) ∈ W, i.e. [α(u − u′)] = [αu − αu′] = [0], we get α[u] − α[u′] = 0, which means the scalar multiplication inbuilt in this is well defined; the addition fragment is well defined anyway (from the theory of abelian groups). It is now almost trivially verified that X/W becomes a vector space. X/W is called the quotient space of X by W; its dimension is called the codimension of W.
7.1 dim X = dim W + dim(X/W)
Hint:
For a proof, see the Appendix Handout, page (7).
7.2 The first isomorphism theorem for vector spaces says:
If A : X → Y is a linear transformation, then A(X) ≅ X/ker A.
Prove it.
Hint:
Just repeat the proof of the first isomorphism theorem for groups.
8. Consider the vector space of all functions f : Z → F for which f(j) = f(j + n), i.e. all functions with period n for a fixed integer n. Writing any such f(j) as z_j ∈ F, prove that such a function can be written as z := [ z_{j+1} ; ⋮ ; z_{j+n} ] as soon as any n consecutive integers j + 1, …, j + n are fixed.
In particular, it can be written as z := [ z_0 ; z_1 ; ⋮ ; z_{n−1} ].
8.1 Show that this space can be regarded as the space of all functions Z_n → F and that it is isomorphic to F^n.
Hint:
Take e_0 = [ 1 ; 0 ; ⋮ ; 0 ], e_1 = [ 0 ; 1 ; ⋮ ; 0 ], …, e_{n−1} = [ 0 ; 0 ; ⋮ ; 1 ] as basis.
8.2 Fix a positive integer N. Prove that E_0(j) := 1/√N, …, E_{N−1}(j) := (1/√N) e^{2πi(N−1)j/N} (i = √−1), for j = 0, …, N − 1, supply an orthonormal basis for C^N (in the sense of (8.1)) with respect to the inner product (u|v) := Σ_{n=0}^{N−1} u_n conj(v_n).
Hint:
(E_k|E_m) = Σ_{j=0}^{N−1} E_k(j) conj(E_m(j))
= Σ_{j=0}^{N−1} ((1/√N) e^{2πikj/N})((1/√N) e^{−2πimj/N})   [∵ E_l has j-th component (1/√N) e^{2πilj/N}, where i is the complex number √−1]
= Σ_{j=0}^{N−1} (1/N) e^{2πi(k−m)j/N}.
Thus ‖E_m‖² = 1 (∵ k − m = 0), and for k ≠ m we have a geometric progression with sum
(1/N)·(1 − e^{2πi(k−m)N/N}) / (1 − e^{2πi(k−m)/N}) = 0,
since k − m is an integer. This means {E_0, …, E_{N−1}} is a set of N distinct orthonormal vectors in C^N and provides therefore an orthonormal basis for C^N.
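The orthonormality of the E_m can be checked numerically; the following Python/NumPy sketch (an added illustration, with an arbitrary choice of N) builds the N vectors and verifies that their Gram matrix is the identity.

import numpy as np

N = 8
idx = np.arange(N)
E = np.exp(2j * np.pi * np.outer(idx, idx) / N) / np.sqrt(N)   # row m is E_m

gram = E @ E.conj().T                  # entries are the inner products (E_k|E_m)
print(np.allclose(gram, np.eye(N)))    # True: the E_m are orthonormal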
8.3 In the Hilbert space C^N, prove the Parseval relation
(v|w) = Σ_{k=1}^{N} (v|b_k) conj((w|b_k)),
where b_1, …, b_N is some orthonormal basis (just calculate directly), and derive the Plancherel formula ‖v‖² = Σ_{k=1}^{N} |(v|b_k)|² (just put v = w in Parseval's relation).
8.4 Apply the formulas of (8.3) to the basis found in (8.2) and calculate ‖z‖² = Σ_{m=0}^{N−1} |(z|E_m)|², where (z|E_m) = Σ_{n=0}^{N−1} z(n) (1/√N) e^{−2πimn/N}. (Recall: z(n) = z_n is the n-th component of z, i.e. the value at n ∈ Z_N of z : Z_N → C; we know now that z = [ z_0 ; ⋮ ; z_{N−1} ] ∈ C^N.)
8.5 Write ẑ(m) := Σ_{n=0}^{N−1} z(n) e^{−2πimn/N}, so that ẑ : Z_N → C is given by (ẑ(0), …, ẑ(N−1)) ∈ C^N, and prove that z ↦ ẑ defines a linear transformation DFT : C^N → C^N. (Just verify.)
Note:
In this context one usually writes l²(Z_N) for C^N, to be consistent with the notation l²(N) for the space of square-summable complex sequences,
l²(N) := {z : N → C | Σ_{j=1}^{∞} |z(j)|² < ∞},
with the inner product (z|w) := Σ_{j=1}^{∞} z(j) conj(w(j)) (which is an infinite-dimensional Hilbert space). The linear transformation DFT : l²(Z_N) → l²(Z_N) is known as the discrete Fourier transform.
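For 8.4 and 8.5, the map z ↦ ẑ is exactly what NumPy's fft routine computes (same sign convention, Σ_n z(n) e^{−2πimn/N}), so Plancherel and linearity can be spot-checked numerically; the sketch below is an added illustration with an arbitrary seed and N.

import numpy as np

N = 8
rng = np.random.default_rng(1)
z = rng.standard_normal(N) + 1j * rng.standard_normal(N)

zhat = np.fft.fft(z)                   # ẑ(m) = Σ_n z(n) e^{-2πimn/N}
coeffs = zhat / np.sqrt(N)             # (z|E_m) = ẑ(m)/√N

print(np.allclose(np.linalg.norm(z)**2, np.sum(np.abs(coeffs)**2)))   # ||z||^2 = Σ |(z|E_m)|^2

# Linearity of the DFT, z ↦ ẑ:
w = rng.standard_normal(N) + 1j * rng.standard_normal(N)
print(np.allclose(np.fft.fft(2 * z + w), 2 * np.fft.fft(z) + np.fft.fft(w)))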
9. Recall (Handout-I, page (18), (1.8.3)) that if A : X → Y is a linear transformation, dim X = n < ∞, dim Y = m < ∞, then there exists a basis e for X and a basis d for Y such that the matrix representation of A with respect to these bases is of the form
[ I_r 0 ; 0 0 ],
where I_r is the identity matrix of order r, r being the rank of A. Let us make a definition: say that two linear transformations A, B : X → Y are equivalent iff there exist invertible operators P ∈ L(Y, Y), Q ∈ L(X, X) with PAQ = B. This is then an equivalence relation on L(X, Y) (A = Id_Y A Id_X shows reflexivity, PAQ = B ⇒ A = P^{−1}BQ^{−1} shows symmetry, and PAQ = B, RBS = C ⇒ C = (RP)A(QS) shows transitivity), which should remind us of an equivalence relation on L(X, X) introduced earlier (change-of-basis, page (10)): two linear transformations A, B : X → X are similar iff there exists one invertible P : X → X with PAP^{−1} = B.
There exists in fact a result which enables us to slide similarity into equivalence and vice versa:
A, B : X → X are similar iff λ − A, λ − B : X[λ] → X[λ] are equivalent
(stated in terms of matrices, two n × n matrices a, b are similar iff their characteristic matrices λI_n − a and λI_n − b are equivalent; this is the more popular version. For X[λ], refer to page III-(3) of this handout).
This is in fact a special case of the following:
For a ring K and α, β ∈ K, the following statements are equivalent:
(i) There is an invertible y ∈ K such that β = yαy^{−1}.
(ii) There are invertible p(λ), q(λ) ∈ K[λ] with λ − β = p(λ)(λ − α)q(λ).
Take K = L(X, X); note that we are talking about a noncommutative ring here.
9.1 From this perspective, every linear transformation A : X → Y has a very simple representative out of its equivalence class under the relation termed equivalence introduced above, namely the linear transformation raised by the matrix [ I_r 0 ; 0 0 ] (of course, we are talking about dim X < ∞, dim Y < ∞). This is frequently expressed by saying that every m × n matrix has a normal or canonical form supplied by this matrix.
However, the similarity relation is not simply a special case of the equivalence relation. When talking about equivalence, we are talking about different coordinatizations e and d; the invertibles P and Q introduce new coordinatizations e′, d′ and A becomes B. But PAP^{−1} = B is different: whatever change of basis is done by P is undone by P^{−1}; we are back to the same coordinatization of the vector space X.
The problem here is to find, for each A : X → X, just one basis for X relative to which A is cast into some as-simple-as-possible form.
9.2 We now state the problem in a more precise form.
(i) To begin with, the (left) action of a group G on a set X is defined to be a function G × X → X, written (g, x) ↦ gx, satisfying
1_G x = x and g(g′x) = (gg′)x for all x ∈ X, g, g′ ∈ G;
we say X is a G-set.
Writing x ∼ y (for x, y ∈ X) iff there is g ∈ G such that gx = y supplies an equivalence relation on X (1_G x = x ensures reflexivity, gx = y ⇒ x = g^{−1}y ensures symmetry, and gx = y, g′y = z ⇒ (g′g)x = z ensures transitivity); the equivalence class of x, [x] := {y ∈ X | gx = y for some g ∈ G}, is called the orbit of x.
(ii) A vector space X is made into a GL(X)-set (where GL(X) is the general linear group of X, consisting, under composition, of all invertible A ∈ L(X, X)) via the action x ↦ Ax, with {0} and X \ {0} being the two orbits. Call this the natural action.
(iii) Now consider the vector space L(X, X). Each invertible P in L(X, X) raises a linear transformation
Ad P : L(X, X) → L(X, X)
given by (Ad P)A := PAP^{−1}, and thus L(X, X) becomes a G-set
GL(X) × L(X, X) → L(X, X), (P, A) ↦ (Ad P)A
(Ad(P_1 P_2) = (Ad P_1)(Ad P_2) and Ad(Id) = Id are immediate); the orbit of A ∈ L(X, X) is the similarity class of A,
orbit(A) = {B ∈ L(X, X) | B = (Ad P)A = PAP^{−1} for some P ∈ GL(X)}.
A canonical form for A : X → X is a representative from its orbit by a unique choice, i.e. given by a function which selects exactly one member from each GL(X)-orbit.
To give an illustration, suppose X is a complex vector space, dim X = n < ∞. Then there are two results (we prove neither here; this is just for illustration):
(i) Each A ∈ L(X, X) has a triangular form, i.e. there is some basis for X such that with respect to it, A can be represented by an upper triangular matrix.
This does not provide a canonical form, since two distinct triangular forms [ λ_1 * ; 0 λ_2 ] and [ λ_2 * ; 0 λ_1 ] correspond to the same GL(V)-orbit.
(The result is: each A ∈ L(X, X) is given by an upper triangular matrix with λ_1, …, λ_n on the diagonal, where λ_1, …, λ_n are the n eigenvalues of A with possible repetitions and * means anything.)
(ii) Each A ∈ L(X, X) has a Jordan Canonical Form, i.e. there is some basis for X such that with respect to it, A can be represented by
diag(J_{λ_1}, …, J_{λ_k}), where J_{λ_k} = [ λ_k 1 ; ⋱ ⋱ ; λ_k 1 ; 0 λ_k ]
(λ_k on the diagonal n_k times, 1's on the superdiagonal), n_1 + n_2 + … + n_k = n, and λ_1, λ_2, …, λ_k are the distinct eigenvalues of A. This provides a canonical form, since (but for the ordering of the eigenvalues) it is uniquely given from each GL(V)-orbit. Thus, while both these statements are equivalent in the sense that both the triangular and the Jordan forms are always available for any A : X → X as long as its characteristic polynomial splits into linear factors
χ_A(λ) = (λ − λ_1)^{n_1} ⋯ (λ − λ_k)^{n_k}, n_1 + n_2 + … + n_k = n,
for dim_F X = n, which happens whenever F is algebraically closed (and in particular for F = C), one of them provides a canonical form and the other does not.
(iii) For X equipped with an inner product (indeed, equipped with some bilinear form; remember from III-8 that the inner product is a special kind of bilinear form X̄ × X → C) we ask for more; since we are not interested here in canonical form theory, we halt this discussion at this point and proceed to the two situations which are somewhat special. But note that here the interest is in the G-space X where G is the subgroup of GL(X) preserving the inner product, and thus one GL(X)-orbit may split into several G-orbits.
9.4 Suppose we have n distinct eigenvalues, i.e. in the factorization
χ_A(λ) = (λ − λ_1)^{n_1} ⋯ (λ − λ_k)^{n_k}, n_1 + … + n_k = n,
we have 1 = n_1 = … = n_k, k = n. Then the eigenvectors corresponding to these are automatically linearly independent. But we do know that if there are n linearly independent eigenvectors x_1, …, x_n, then regardless of whether the n eigenvalues λ_1, …, λ_n are repeated or not, we can form the modal matrix P = [ x_1 … x_n ]_{n×n} with a = PDP^{−1}, where (i) D = diag(λ_1, …, λ_n) (each λ_i repeated on the diagonal as many times as it occurs in χ_A(λ)), (ii) a is the matrix of A to start with; further, we know that this amounts to X = ⊕_{j=1}^{k} E_j where E_j = ker(λ_j id − A) = ker(A − λ_j id) is the eigenspace corresponding to the eigenvalue λ_j, and (iii) such matrices are called diagonalizable (eigenspectrum, page 7). We examine these diagonalizable matrices. Noting that whenever we can find r linearly independent eigenvectors x_1, …, x_r, by extending these to a basis x_1, …, x_r, x_{r+1}, …, x_n we have a representation
[ diag(λ_1, …, λ_r) * ; 0 * ]   (* means anything),
the diagonalizable matrix is a particularly favourable instance of this: T : X → X is diagonalizable iff X is the direct sum of the eigenspaces of T, iff T = P^{−1}DP with D = diag(λ_1, …, λ_n) and P a modal matrix, iff there is some basis such that T has the representation diag(λ_1, …, λ_n), iff there are n linearly independent eigenvectors (dim X = n).
9.5 Illustrations
9.5.1 Let A = [ 1 … 1 ; ⋮ ⋱ ⋮ ; 1 … 1 ] be the n × n matrix all of whose entries are 1. With v_1 = [ 1 ; 1 ; ⋮ ; 1 ] and v_i = e_{i−1} − e_i = [ 0 ; ⋮ ; 1 ; −1 ; ⋮ ; 0 ] for 2 ≤ i ≤ n, the vectors v_1, …, v_n are linearly independent (prove it), and with respect to this basis A has the matrix diag(n, 0, …, 0). Thus A is diagonalizable.
9.5.2 Suppose T : X → X with X = T(X) ⊕ ker T and T(X) = {x ∈ X | Tx = x} (so that, writing x = a + b with a ∈ T(X), b ∈ ker T, we have Tx = Ta + Tb = a). Choose a basis x_1, x_2, …, x_r of T(X), extending it to a basis x_1, …, x_r, x_{r+1}, …, x_n of X, where x_{r+1}, …, x_n is some basis of ker T; then, since T(X) = {x ∈ X | Tx = x}, we have that with respect to this basis of X, T has the representation
[ id_r 0 ; 0 0 ].
One says that T is the projection of X onto T(X) along ker T. Prove that this is equivalent to T² = T, and compare with 1.8.3 on page 18 of Handout-I.
9.5.3 Prove that if X = ker(T − id_X) ⊕ ker(T + id_X), then there is a basis of X such that with respect to it, T has the representation
[ id 0 ; 0 −id ]
(choose a basis of ker(T − id_X) and extend it to a basis of X by choosing some basis of ker(T + id_X)). One says that T is a reflection at ker(T − id_X) along ker(T + id_X). Prove that this is equivalent to T² = id_X. (Note that we are requiring that each x ∈ X can be written as ½(id_X + T)x + ½(id_X − T)x, and thus we need the characteristic of F to be ≠ 2.)
9.5.4 Let us recapitulate: if λ_1, …, λ_k are distinct eigenvalues of A : X → X and E_i = ker(A − λ_i), then for the polynomials L_i(λ) = ∏_{j≠i} (λ − λ_j)/(λ_i − λ_j), given 0 = v_1 + … + v_k = Σ_{j=1}^{k} v_j with v_j ∈ E_j, we have
0 = L_i(A) Σ_j v_j = Σ_j L_i(λ_j) v_j = Σ_j δ_{ij} v_j = v_i for 1 ≤ i ≤ k,
which shows that W := E_1 ⊕ … ⊕ E_k is a direct sum (and also that eigenvectors corresponding to λ_1, …, λ_k are linearly independent; a bit slicker than the arguments supplied for 1.1, 1.2 on page 6 of the eigenspectrum handout).
One says that A is diagonalizable iff X = E_1 ⊕ … ⊕ E_k. Now if x ∈ E_j, x ≠ 0, then Ax = λ_j x and thus Ax ∈ E_j again. All told, we have A = Σ λ_i P_i, where P_i : X → X is given by P_i(x) = x_i for x = x_1 + … + x_k ∈ X, x_i ∈ E_i, 1 ≤ i ≤ k. Clearly, we have (P_1 + … + P_k)(x) = x_1 + … + x_k = x, so that P_1 + … + P_k = id, and also P_i P_j = 0 for i ≠ j, which means P_i = P_i id = P_i(P_1 + … + P_k) = P_i², so that P_i is actually a projection (see 9.5.2, III-22).
9.5.5 Now suppose P_i : X → X, 1 ≤ i ≤ k, are given with the properties id = Σ_{i=1}^{k} P_i and P_i P_j = 0 for i ≠ j (so that in particular P_i² = P_i, i.e. each P_i is a projection). We then have X = E_1 ⊕ … ⊕ E_k with E_i ≠ 0, P_i the projection of X onto E_i along ⊕_{j≠i} E_j, E_i = P_i(X). If now we construct the operator
A := Σ_{i=1}^{k} λ_i P_i, λ_i ∈ F all distinct,
we shall have, for x_j ∈ E_j, Ax_j = (Σ λ_i P_i)x_j = λ_j x_j, which means every nonzero x_j = 0 + … + x_j + 0 + … ∈ X is an eigenvector of A with λ_j as eigenvalue; further, the entire spectrum is {λ_1, …, λ_k}, since if an additional distinct eigenvalue λ_{k+1} existed, then E_{k+1} = ker(A − λ_{k+1} id) would be another new summand and X = E_1 ⊕ … ⊕ E_k would not be adequate. But this means that A is diagonalizable.
9.5.6 Suppose A : X → X has the minimal polynomial μ_A(λ) = ∏_{i=1}^{k} (λ − λ_i), so that 0 = μ_A(A) = (λ_1 id − A)⋯(λ_k id − A) ∈ L(X, X). Then if we write P_i = L_i(A) for the Lagrange polynomials L_i, we get P_i = ∏_{j≠i} (A − λ_j id)/(λ_i − λ_j), and clearly,
(a) P_1 + … + P_k = id
(for k = 2: P_1 + P_2 = (A − λ_2 id)/(λ_1 − λ_2) + (A − λ_1 id)/(λ_2 − λ_1) = [(A − λ_2 id) − (A − λ_1 id)]/(λ_1 − λ_2) = (λ_1 − λ_2) id/(λ_1 − λ_2) = id, and so on),
(b) P_i P_j = 0 for i ≠ j, and
(c) A = Σ λ_i P_i,
as simple algebraic consequences, so that we are in the situation of (9.5.5).
9.5.7 The form A = Σ_{i=1}^{k} λ_i P_i obtained in (9.5.4) as well as (9.5.5) above is called the spectral form of the operator A : X → X. Noting that this means
A² = (Σ_i λ_i P_i)(Σ_j λ_j P_j) = Σ_{i,j} λ_i λ_j P_i P_j = Σ_i λ_i² P_i² = Σ_i λ_i² P_i
(∵ P_i is a projection and then P_i² = P_i), and that in general A^n = Σ_i λ_i^n P_i similarly, we observe that for any polynomial a(λ) := a_0 + a_1 λ + … + a_n λ^n we have a(A) = Σ_i a(λ_i) P_i, and summarize this discussion as follows:
Theorem 0.1. If A : X → X is diagonalizable with eigenspaces E_1, …, E_k corresponding to distinct eigenvalues λ_1, …, λ_k, with projections P_1, …, P_k corresponding to the decomposition X = E_1 ⊕ … ⊕ E_k, we get
(i) P_i² = P_i ≠ 0,
(ii) P_i P_j = 0 for i ≠ j,
(iii) id_X = Σ_{i=1}^{k} P_i,
(iv) Im P_i = E_i = ker(λ_i id_X − A),
(v) A = Σ_{i=1}^{k} λ_i P_i,
(vi) a(A) = Σ_{i=1}^{k} a(λ_i) P_i for a(λ) ∈ F[λ],
(vii) P_i = ∏_{j≠i} (A − λ_j id)/(λ_i − λ_j), and
(viii) the minimal polynomial μ_A(λ) of A is μ_A(λ) = ∏_{i=1}^{k} (λ − λ_i).
Conversely: if A : X → X satisfies (v), i.e. A = Σ_{i=1}^{k} λ_i P_i where λ_1, …, λ_k are distinct scalars and the P_i : X → X are nonzero operators satisfying (ii) and (iii), then A is diagonalizable with spectrum {λ_1, …, λ_k} and properties (i)–(viii) all hold. Further, given only that (viii) holds, i.e. μ_A(λ) = ∏_{i=1}^{k} (λ − λ_i) with λ_i all distinct linear factors, A is diagonalizable with spectrum {λ_1, …, λ_k} and properties (i)–(vii) hold with P_i defined by (vii), i.e. P_i := ∏_{j≠i} (A − λ_j id)/(λ_i − λ_j). (Please rework and fill in any details you think are lacking.)
Illustration: Show that the operators P_i : X → X given by P_i := |e_i⟩⟨e^i|, for a basis e_1, …, e_n of X with dual basis e^1, …, e^n, are all projections.
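Formula (vii) of the theorem also lends itself to a direct numerical check; the Python/NumPy sketch below (an added illustration, with an arbitrarily chosen 2 × 2 symmetric matrix having distinct eigenvalues) builds the spectral projections from (vii) and verifies (i)–(iii) and (v).

import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])        # eigenvalues 1 and 3 (distinct)
lams = np.linalg.eigvals(A)
n = len(lams)
I = np.eye(n)

P = []
for i, li in enumerate(lams):
    Pi = I.copy()
    for j, lj in enumerate(lams):
        if j != i:
            Pi = Pi @ (A - lj * I) / (li - lj)      # P_i = Π_{j≠i} (A - λ_j I)/(λ_i - λ_j)
    P.append(Pi)

print(np.allclose(sum(P), I))                                  # Σ P_i = id
print(np.allclose(P[0] @ P[1], 0))                             # P_i P_j = 0 for i ≠ j
print(all(np.allclose(Pi @ Pi, Pi) for Pi in P))               # each P_i is a projection
print(np.allclose(sum(l * Pi for l, Pi in zip(lams, P)), A))   # A = Σ λ_i P_i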
9.6 For X equipped with an inner product, F = C, we seek more: (i) the basis for diagonalization should be orthonormal, and (ii) the modal matrix P used in the diagonalization should preserve the inner product (for modal matrix refer to III-22, (9.4), or the eigenspectrum page 7, as well as page 9 para 2.2).
9.6.1 To generate practice in the quite common usage of the mathematician's convention, we take the inner product to be conjugate-linear in the second variable and linear in the first variable and, for the same reason, use A* for A^+; thus our definition is generated from (Av|w) = (v|A*w) for all v ∈ V, w ∈ W, for the Hilbert adjoint A* : W → V of A : V → W. If A has the matrix a = [a^j_i]_{m×n}, 1 ≤ i ≤ n, 1 ≤ j ≤ m, with respect to chosen orthonormal bases e of V and d of W, then A* has the n × m matrix obtained from a by conjugating and transposing; this is of course simply the tranjugate (= conjugate-transpose = transpose-conjugate) of the matrix a introduced previously in the eigenspectrum. Then (AB)* = B*A*, det(A*) = conj(det(A)), (Id_n)* = Id_n.
9.6.2 Thus given A : V → V, we have A* : V → V. We say A is
(a) normal iff AA* = A*A (this is not in conflict with the definition given on page 9 of the eigenspectrum, as we shall soon show),
(b) self-adjoint iff A = A*,
(c) skew-adjoint iff A = −A*, and
(d) unitary iff A* = A^{−1}.
9.6.3 Since (Ax|Ay) = (x|A*Ay) = (x|y) holds for all x, y iff A* = A^{−1}, we see that A is unitary iff A preserves the inner product.
9.6.4 Since |A − λ id_V| = conj(|(A − λ id_V)*|) = conj(|A* − λ̄ id_V|), λ is an eigenvalue of A iff λ̄ is an eigenvalue of A* (with the same multiplicity).
9.6.5 If Ax = λx and A*y = μy, we get μ̄(x|y) = (x|μy) = (x|A*y) = (Ax|y) = (λx|y) = λ(x|y) (remember the mathematician's convention), so that (λ − μ̄)(x|y) = 0, forcing x ⊥ y if λ ≠ μ̄.
9.6.6 Suppose A : X → X is normal, dim X = n < ∞. If n = 1, an orthonormal basis of eigenvectors of A trivially exists (here X = C, the vector 1 is a basis, and since A is simply multiplication by some λ ∈ C as per the remark on page 2 of the first handout, we do have Ax = λx with x = 1). Assume now that n ≥ 2. Since C is algebraically closed, some eigenvalue λ and a corresponding eigenvector x with ‖x‖ = 1 can be found. The subspace x^⊥ := {u ∈ X | (u|x) = 0} is then A-invariant ((Au|x) = (u|A*x) = (u|λ̄x) = λ(u|x) = λ·0 = 0, so Au ⊥ x). But A restricted to x^⊥ is again normal (prove this) and dim x^⊥ < n, so by the induction hypothesis an orthonormal basis of x^⊥ can surely be found consisting of eigenvectors of A|_{x^⊥}, which are of course eigenvectors of A : X → X (verify this). We simply add x to this basis, obtaining a basis of X, and conclude: if A : X → X is normal, an orthonormal basis of eigenvectors of A exists for X.
If, conversely, X has an orthonormal basis e_1, …, e_n with Ae_i = λ_i e_i, 1 ≤ i ≤ n, then each x ∈ X can be written as Σ_{i=1}^{n} e_i x_i, x_i ∈ C, and (using A*e_i = λ̄_i e_i) we have
AA*(x) = A[Σ_i (A*e_i) x_i] = A[Σ_i λ̄_i e_i x_i] = Σ_i λ̄_i (Ae_i) x_i = Σ_i λ̄_i λ_i e_i x_i = Σ_i λ_i (A*e_i) x_i = A*(Σ_i λ_i e_i x_i) = A*(Σ_i (Ae_i) x_i) = A*A[Σ_i e_i x_i] = A*A(x).
This being true at each x ∈ X, we conclude that AA* = A*A, i.e. A is normal.
To sum up:
A : X → X is normal iff X possesses an orthonormal basis consisting of eigenvectors of A.
9.6.7 Suppose A : X → X is normal, dim_C X = n < ∞. Then there exists an orthonormal basis e_1, …, e_n consisting of eigenvectors of A, say Ae_i = λ_i e_i for 1 ≤ i ≤ n. Then A* = A iff A*e_i = Ae_i iff λ̄_i e_i = λ_i e_i iff (λ̄_i − λ_i)e_i = 0 iff λ̄_i = λ_i for each i, 1 ≤ i ≤ n; that is,
A is self-adjoint (resp. skew-adjoint) iff all eigenvalues of A are real (resp. purely imaginary),
and A* = A^{−1} iff A*e_i = A^{−1}e_i iff λ̄_i e_i = λ_i^{−1} e_i iff λ̄_i = λ_i^{−1} iff |λ_i|² = 1 for each i, 1 ≤ i ≤ n; that is, A is unitary iff all eigenvalues of A have absolute value 1.
(i) This is a result about normal operators; if some linear operator A : X → X has real eigenvalues only, it does not follow that A is self-adjoint.
(ii) Thus if A is normal, we can find a unitary matrix P such that P*DP = A with D diagonal; and note that unitary matrices are exactly those which preserve the inner product.
10. We shall now provide some illustrations and close this discussion.
10.1 The matrix A = [ 1 1 −1 ; −1 2 1 ; 1 1 2 ] is normal, with χ_A(λ) = λ³ − 5λ² + 9λ − 9 supplying the eigenvalues λ_1 = 3, λ_2 = 1 + √2 i, λ_3 = 1 − √2 i; for these we have eigenvectors
x_1 = [ 0 ; 1 ; 1 ], x_2 = [ 1 ; i/√2 ; −i/√2 ], x_3 = [ 1 ; −i/√2 ; i/√2 ],
which are orthogonal with norm ‖x_i‖ = √2, ensuring that the (1/√2)x_i are orthonormal. So we take U := (1/√2)[ x_1 x_2 x_3 ]_{3×3}, which is unitary, and U*AU = diag(3, 1 + √2 i, 1 − √2 i).
Verify all these statements and carry out a similar discussion for
(a) [ 1 1 1 ; 1 1 1 ; 1 1 1 ]  (b) [ 2 1 1 ; 1 2 2 ; 1 2 2 ]  (c) [ 1 1 1 ; 1 2 1 ; 1 1 2 ]
Ans. (a) U = (1/√3)[ 1 1 1 ; 1 ω ω̄ ; 1 ω̄ ω ], ω = ½(−1 + √3 i);
(b) U = (1/√6)[ 2 ⋯ ; 1 ⋯ ; 1 ⋯ ];
(c) U = [ 0 ⋯ ; ⋯ ], the columns being the eigenvectors normalized by the factors 1/√2, (3 + √3)^{−1/2} and (3 − √3)^{−1/2}.
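The computations in 10.1 are easy to check by machine; the Python/NumPy sketch below does so for the sign pattern written above (the off-diagonal signs are a reconstruction, so treat this as an illustration of the procedure rather than as the only possible matrix): it checks normality, unitarity of U, and that U*AU is the expected diagonal matrix.

import numpy as np

A = np.array([[1.0, 1.0, -1.0],
              [-1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])
print(np.allclose(A @ A.T, A.T @ A))             # A is normal

x1 = np.array([0, 1, 1], dtype=complex)
x2 = np.array([1, 1j / np.sqrt(2), -1j / np.sqrt(2)])
x3 = x2.conj()
U = np.column_stack([x1, x2, x3]) / np.sqrt(2)   # each column has norm √2

print(np.allclose(U.conj().T @ U, np.eye(3)))    # U is unitary
print(np.round(U.conj().T @ A @ U, 10))          # diag(3, 1+√2 i, 1-√2 i)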
10.2 Take X = C² with e_0 = [ 1 ; 0 ], e_1 = [ 0 ; 1 ] as basis. Show that the mirror U_M = [ 0 1 ; 1 0 ], the beam splitter U_B = (1/√2)[ 1 1 ; 1 −1 ], and the phase shift U_P = [ e^{iφ} 0 ; 0 1 ] are all unitary.
If ρ_in : X → X is given by |e_0⟩⟨e_0| = ρ_in, calculate
ρ_out := U_B U_M U_P U_B ρ_in U_B^+ U_P^+ U_M^+ U_B^+
(A^+ stands for the Hilbert adjoint of A).
Hint:
ρ_in is just the operator [ 1 ; 0 ][ 1 0 ] = [ 1 0 ; 0 0 ], and U_B U_M U_P U_B = (1/2)[ e^{iφ}+1  e^{iφ}−1 ; 1−e^{iφ}  −(e^{iφ}+1) ] by direct computation; A^+ is just the tranjugate of A. Now find
ρ_out = (1/2)[ 1 + cos φ   i sin φ ; −i sin φ   1 − cos φ ].
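The interferometer computation in 10.2 can be reproduced in a few lines; the Python/NumPy sketch below (an added illustration, with an arbitrary value of φ) checks that each factor is unitary and that the conjugated density matrix agrees with the closed form above.

import numpy as np

phi = 0.7
UM = np.array([[0, 1], [1, 0]], dtype=complex)
UB = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
UP = np.array([[np.exp(1j * phi), 0], [0, 1]])
rho_in = np.array([[1, 0], [0, 0]], dtype=complex)

V = UB @ UM @ UP @ UB
rho_out = V @ rho_in @ V.conj().T

expected = 0.5 * np.array([[1 + np.cos(phi), 1j * np.sin(phi)],
                           [-1j * np.sin(phi), 1 - np.cos(phi)]])
print(np.allclose(rho_out, expected))               # matches the closed form
for U in (UM, UB, UP):
    print(np.allclose(U @ U.conj().T, np.eye(2)))   # each factor is unitary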
10.3 Consider the Pauli matrices
σ_1 := [ 0 1 ; 1 0 ], σ_2 := [ 0 −i ; i 0 ], σ_3 := [ 1 0 ; 0 −1 ].
Show that they are all Hermitian (= self-adjoint) and traceless. For a given unit vector n ∈ R³, define n·σ := n_1 σ_1 + n_2 σ_2 + n_3 σ_3 =: σ_n and calculate σ_n² = I_2, concluding that σ_n is unitary. Find the eigenvalues of σ_n, calculate σ_n(e_0) and |(e_0|σ_n(e_0))|². Writing Π(n) := ½[σ_n² + σ_n] = ½[I_2 + σ_n], show that Π(n) is self-adjoint, a projection, and has trace 1, and find Π(n)[ e^{iφ} cos θ ; sin θ ].
Hint:
Direct calculations. Traceless means trace 0. σ_n is self-adjoint and σ_n² = I_2, so σ_n* = σ_n^{−1} and thus σ_n is unitary; its eigenvalues can thus only be real with modulus 1, but since the trace is 0, they must be +1 and −1. With e_0 = [ 1 ; 0 ] as in 10.2, σ_n(e_0) = n_1 [ 0 ; 1 ] + n_2 [ 0 ; i ] + n_3 [ 1 ; 0 ], and hence |(e_0|σ_n(e_0))|² = n_3². You will need n_1² + n_2² + n_3² = 1 somewhere in all this. It is helpful to note that if [A, B]_+ := AB + BA, then 0 = [σ_1, σ_2]_+ = [σ_2, σ_3]_+ = [σ_3, σ_1]_+ in calculating (Π(n))² = Π(n), proving thus that Π(n) is a projection. Finally,
Π(n)[ e^{iφ} cos θ ; sin θ ] = ½ [ (1 + n_3) e^{iφ} cos θ + (n_1 − i n_2) sin θ ; (n_1 + i n_2) e^{iφ} cos θ + (1 − n_3) sin θ ].
10.4 If the minimal polynomial μ_A(λ) of A ∈ L(X, X) has degree m, prove that for any f(λ) ∈ F[λ] with deg(f(λ)) ≥ m we have f(A) = r(A) for some r(λ) ∈ F[λ] with deg(r(λ)) < m. Calculate f(A) for
f(λ) = 2λ⁵ + 5λ³ + 7λ,  A = [ 1 0 1 ; 0 1 0 ; 0 0 −1 ].
Hint:
f(λ) = q(λ)μ_A(λ) + r(λ) with deg(r(λ)) < m (division algorithm, deg(f(λ)) ≥ deg(μ_A(λ)) = m given). Then since μ_A(A) = 0, we have f(A) = r(A); for the given A the minimal polynomial is μ_A(λ) = λ² − 1, so q(λ) = 2λ³ + 7λ, r(λ) = 14λ, f(A) = 14A.
10.5 Show that (i) self-adjoint ⇒ normal ⇐ unitary, but (ii) A = [ 1 1 ; i 3+2i ] is normal while neither self-adjoint nor unitary, so that the implications are not reversible.
Hint:
AA* = [ 1 1 ; i 3+2i ][ 1 −i ; 1 3−2i ] = [ 2 3−3i ; 3+3i 14 ] = A*A,
and (iii) B = [ 1 i ; 0 1 ] is not normal.
10.6 Show that T is normal ⇔ T − λ id is normal.
Hint: direct calculation.
10.7 Show that A = [ 3 0 0 ; 0 2 5 ; 0 −1 −2 ] is not diagonalizable over R but is diagonalizable over C.
Hint:
Over R there is a single eigenvalue λ = 3 with exactly one independent eigenvector; over C, 3, i, −i are distinct eigenvalues of this 3 × 3 matrix. Find the modal matrix P to calculate
P^{−1}AP = [ 3 0 0 ; 0 i 0 ; 0 0 −i ].
10.8 For A = [ 0 1 ; 0 0 ] and g(λ) = λ² + 1, show that g(A) has e_1 = [ 1 ; 0 ], e_2 = [ 0 ; 1 ] as eigenvectors, and that e_2 is not an eigenvector of A though e_1 is; revisit III-11(vii) to see why this needs to be noted.
10.9 (a) Show that for A, B ∈ Mat_n(F), AB and BA have the same eigenvalues.
Hint:
Since the product of two invertible matrices is again invertible, 0 is an eigenvalue of AB iff AB is non-invertible iff A or B is non-invertible iff BA is non-invertible iff 0 is an eigenvalue of BA.
A nonzero λ is an eigenvalue of AB iff ABx = λx with x ≠ 0; then Bx ≠ 0, and since BA(Bx) = B(ABx) = B(λx) = λBx, we have that Bx is an eigenvector of BA with λ as eigenvalue. Now interchange A and B.
(b) Show that λ is an eigenvalue of T : X → X iff λI − T is not invertible.
(c) Prove that if id − AB is invertible then id − BA is invertible, and (id − BA)^{−1} = id + B(id − AB)^{−1}A.
(d) Examine whatever interconnection among (a), (b), and (c) you think might exist.
10.10 Find a matrix B whose minimal polynomial is λ⁴ − 5λ³ − 2λ² + 7λ + 4.
Hint: How about the companion matrix?
10.11 Find a matrix B whose characteristic polynomial is λ⁴ − 5λ³ − 2λ² + 7λ + 4.
Hint: How about the companion matrix?
10.12 The companion matrix of a(λ) = λ^n + a_{n−1}λ^{n−1} + … + a_1 λ + a_0 ∈ F[λ] was defined as
C_a = [ 0 1 0 … 0 ; 0 0 1 … 0 ; ⋮ ; −a_0 −a_1 −a_2 … −a_{n−1} ]
on page 8 of the eigenspectrum (2.1), where it was proved that χ_{C_a}(λ) = a(λ) = μ_{C_a}(λ). Some authors prefer to define the companion matrix as
[ −a_{n−1} 1 0 … 0 ; −a_{n−2} 0 1 … 0 ; ⋮ ; −a_1 0 0 … 1 ; −a_0 0 0 … 0 ],
some others as
[ −a_{n−1} −a_{n−2} … −a_1 −a_0 ; 1 0 … 0 0 ; ⋮ ; 0 0 … 1 0 ],
still others as
[ 0 … 0 −a_0 ; 1 … 0 −a_1 ; ⋮ ; 0 … 1 −a_{n−1} ].
What happens to the result?
Hint:
χ(λ) = μ(λ) = a(λ) is still true; just verify.
10.13 Find the similarity classes of 2 × 2 matrices A over C.
Hint:
There must be two eigenvalues, say α, β. If there are two independent eigenvectors, the matrix is similar to a diagonal matrix, A = P^{−1}[ α 0 ; 0 β ]P, with α = β possible. If there is only one independent eigenvector, then necessarily α = β and A − αI is similar to [ 0 0 ; 1 0 ], i.e. A − αI = Q^{−1}[ 0 0 ; 1 0 ]Q, so that
A = Q^{−1}( [ α 0 ; 0 α ] + [ 0 0 ; 1 0 ] )Q = Q^{−1}[ α 0 ; 1 α ]Q.
Note that eigenvector means nonzero vector, so at least one nonzero eigenvector must be there. Note further that if α = β, we have (A − αI)² = N² = 0 with N := A − αI, i.e. N has the single eigenvalue 0, so that the eigenspace of N is ker N with dim(ker N) equal to 1 or 2; dim(ker N) = 2 corresponds to N = 0 (ker N = C² then) and we must have A diagonalizable with A = [ α 0 ; 0 α ], while dim(ker N) = 1 means that there is just a one-element basis for ker N, which is the situation A − αI = Q^{−1}[ 0 0 ; 1 0 ]Q above.
10.14 When is A = [ 0 0 0 0 ; a 0 0 0 ; 0 b 0 0 ; 0 0 c 0 ] diagonalizable?
Hint:
0 is the only eigenvalue, so diagonalizability requires dim(ker A) = 4, i.e. nullity A = 4, so that rank A = 0, i.e. A is similar to O, i.e. a = b = c = 0.
10.15 Find the eigenvalues of
A = [ k 1 0 … 0 ; 1 k 1 … 0 ; 0 1 k … 0 ; ⋮ ; 0 0 0 … k ]_{n×n}
if it is given that the X_i = column vector with j-th entry sin(ijπ/(n+1)) are eigenvectors.
Hint:
AX_i = [ k sin(iπ/(n+1)) + sin(2iπ/(n+1)) ; … ; k sin(niπ/(n+1)) + sin((n−1)iπ/(n+1)) ] = [ k + 2 cos(iπ/(n+1)) ] X_i,
so that the eigenvalues are k + 2 cos(iπ/(n+1)), 1 ≤ i ≤ n.
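The eigenvalue formula of 10.15 is easy to confirm numerically; the Python/NumPy sketch below (an added illustration, with arbitrary sample values of n and k) compares the computed spectrum with k + 2cos(iπ/(n+1)) and checks one of the stated eigenvectors.

import numpy as np

n, k = 6, 4.0
A = k * np.eye(n) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)

computed  = np.sort(np.linalg.eigvalsh(A))
predicted = np.sort(k + 2 * np.cos(np.arange(1, n + 1) * np.pi / (n + 1)))
print(np.allclose(computed, predicted))                   # True

i = 2                                                     # check the claimed eigenvector X_i
X = np.sin(i * np.arange(1, n + 1) * np.pi / (n + 1))
print(np.allclose(A @ X, (k + 2 * np.cos(i * np.pi / (n + 1))) * X))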
Suggestion: The syllabus is inordinately long for the roughly 14 lectures allotted. Even so, I have tried to include every term mentioned, with comprehensive arguments and illustrations (= problems with adequate hints) in the handouts. For the examination, which is rather close, and since the treatment is necessarily abstract with new ideas and terminology piling up at every stage of the linear development of the subject, here are a few tips:
1) De-emphasize pages 17-21 (canonical forms) of this handout, the next 3 sheets, para 11 on Banach spaces, and also bilinear forms (pages 19-20) in the first handout.
2) Concentrate on matrices (I am saying it reluctantly, but never mind); it is easier to see what the tranjugate is than what the Hilbert adjoint is.
3) Some of the proofs of the theorems are quite lengthy. Though they form the core of the subject, you should see how they are used in action; it is not necessary to cram the proofs; for that matter, it is bad practice to cram anything.
4) Do not ignore the Dirac notation; it is the easiest way to understand the component forms. Nevertheless, remarks like "a change-of-basis matrix is simply an invertible matrix" are key statements rather than just wily summaries.
5) Work out the illustrations using the hints, on paper, with a pen/pencil; don't read the handouts like a newspaper.
6) Prepare in advance; don't postpone till the day before the exam.
7) There are additional consultation hours, every Thursday and Monday, 1700-1800, in my office. Use them, but don't start studying the handout in search of a question after arriving there (they start 22 April); come prepared with a list of queries.
8) Be happy; maths is good fun if treated with respect.
11 While it is not the intention of this course to go into Banach spaces beyond the definition, it is important to know that
(i) all finite-dimensional normed spaces are Banach spaces, but
(ii) not all finite-dimensional normed spaces are Hilbert spaces; however,
(iii) IPS ⇒ Hilbert is true for finite-dimensional spaces,
(iv) any two norms on a finite-dimensional space are equivalent as metrics, so that
(a) topologically, anything framed in terms of open sets (continuity, compactness, convergence etc.), and
(b) uniformly, anything framed in terms of uniformities (uniform continuity, total boundedness, Cauchy sequences etc.) remains the same,
(v) any linear transformation X → Y between two normed finite-dimensional spaces (hence Banach also) is uniformly continuous, hence continuous,
(vi) nearly all these statements fail for infinite-dimensional spaces.
11.1 On any normed space X (regardless of the dimension) the norm function ‖·‖ : X → R is continuous, and thus attains its maximum as well as minimum on a compact subset. One can prove that the unit disk {x ∈ X | ‖x‖ ≤ 1} as well as the unit sphere {x ∈ X | ‖x‖ = 1} are compact if dim X < ∞; indeed, the unit disk is compact iff dim X < ∞. These things enable one to show that if X and Y are normed spaces and a linear transformation A : X → Y is continuous, one gets
‖A‖ := inf{ γ > 0 : ‖A(x)‖ ≤ γ‖x‖ for all x ∈ X } = sup_{‖x‖≤1} ‖Ax‖ = sup_{‖x‖=1} ‖Ax‖ < ∞,
and that with this definition L(X, Y) turns into a normed space which is Banach iff Y is Banach. In particular, if X′ stands for the space of all continuous linear forms, X′ is Banach whether or not X is Banach. Further, ‖Ax‖ ≤ ‖A‖·‖x‖ clearly holds.
11.2 For dim X < ∞, dim Y < ∞, any A : X → Y is continuous (as stated in 11(v) above) and is represented by some m × n matrix a = [a^j_i]:
A ↔ a = [a^j_i] : (X, e) → (Y, d), dim X = n < ∞, dim Y = m < ∞, as seen earlier (Handout I).
Thus it makes sense to speak of a matrix norm, or norm of a matrix. Since dim L(X, Y) = mn = dim Mat_{m×n}(F) < ∞, such a thing is part of the discussion about finite-dimensional spaces.
(i) But of course, the formula ‖A‖ := sup_{‖x‖≤1} ‖Ax‖ = sup_{‖x‖=1} ‖Ax‖ = sup_{x≠0} ‖Ax‖/‖x‖ supplied in 11.1 is not expected to be the only norm turning L(X, Y), and in particular Mat_{m×n}(F), into a normed space (necessarily Banach also if dim L(X, Y) < ∞). This formula supplies what is known as the operator norm and naturally depends on which two norms on X and Y are being used. We illustrate for m = n = 2, X = Y = C², with A : C² → C² given by A = [ 1 1 ; 1 −1 ].
For ‖z‖_1 := Σ_{i=1}^{2} |z_i| on both X and Y, the operator norm is ‖A‖_1 := sup_{‖x‖_1=1} ‖Ax‖_1 = sup_{‖x‖_1=1} [ |x_1 + x_2| + |x_1 − x_2| ] = 1 + 1 = 2 (attained at x_1 = 0, x_2 = 1, for instance).
For ‖z‖_∞ := max_i |z_i| on both X and Y, the operator norm is ‖A‖_∞ := sup_{‖x‖_∞=1} ‖Ax‖_∞ = max_{‖x‖_∞=1} max{ |x_1 + x_2|, |x_1 − x_2| } = 2.
For m = n, X = Y = C^n, with ‖z‖_2 := +√(z|z) on both X and Y, the operator norm ‖A‖_2 := sup_{‖x‖_2=1} ‖Ax‖_2 is also given by ‖A‖_2 = √λ_max, where λ_max is the maximum eigenvalue of A*A, i.e. λ_max := max{ λ_i | λ_i is an eigenvalue of A*A }. For the A = [ 1 1 ; 1 −1 ] of (i), this gives ‖A‖_2 = √2.
We know that L(X, X) is an algebra (see III-7), and in particular so is Mat_n(F), where dim X = n < ∞. A normed vector space 𝒜 which is also an algebra is called a normed algebra when ‖ab‖ ≤ ‖a‖‖b‖ is satisfied, and not otherwise. Thus while L(X, X) is a normed vector space in various possible ways, it is a normed algebra iff the norm chosen on L(X, X) satisfies ‖BA‖ ≤ ‖B‖‖A‖ for A, B ∈ L(X, X). The term matrix norm is applied in general only to those norms on Mat_n(F) which satisfy this inequality
‖ba‖ ≤ ‖b‖‖a‖ for a, b ∈ Mat_n(F).
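The three operator norms of 11.2(i) can be computed directly; in the Python/NumPy sketch below (an added illustration) the 1- and ∞-operator norms are obtained as the maximum column sum and maximum row sum respectively, and the 2-norm as the largest singular value.

import numpy as np

A = np.array([[1.0, 1.0], [1.0, -1.0]])

norm1   = np.abs(A).sum(axis=0).max()      # ||A||_1  = max column sum = 2
norminf = np.abs(A).sum(axis=1).max()      # ||A||_∞  = max row sum    = 2
norm2   = np.linalg.norm(A, 2)             # ||A||_2  = largest singular value = √2

print(norm1, norminf, norm2, np.sqrt(2))
print(np.isclose(norm2, np.sqrt(max(np.linalg.eigvalsh(A.T @ A)))))   # √λ_max(A*A)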
Addendum: While we promised to say no more on direct sums, you may still wonder how F² = F ⊕ F is claimed where F and F are not disjoint. Actually there is an external direct sum and an internal direct sum, both sliding into each other in the finite-dimensional situation, but rather than worrying about all this at present, note that in F² = F ⊕ F we are saying
[ α ; β ] = [ α ; 0 ] + [ 0 ; β ],
thus [ α ; 0 ] ∈ F, the first summand, [ 0 ; β ] ∈ F, the second summand, and the two summands are disjoint. Similarly, for F³ = F ⊕ F ⊕ F we have
[ α ; β ; γ ] = [ α ; 0 ; 0 ] + [ 0 ; β ; 0 ] + [ 0 ; 0 ; γ ],
so that the three copies of F are disjoint. F[[λ]] has infinite sequences (a_0, a_1, a_2, …) = a(λ), and this is why we say F[[λ]] = F ⊕ F ⊕ ⋯.