
Norms and inner products

Vector norms
A vector norm on a (real or complex) vector space V is a measure $\|x\|$ of the size of the vector x. A norm must satisfy the following simple axioms:
1. If $x \neq 0$ then $\|x\| > 0$.
2. $\|cx\| = |c| \, \|x\|$.
3. $\|x + y\| \leq \|x\| + \|y\|$.
The last axiom is called the Triangle Inequality, because in the usual picture of vector addition x, y, and
x + y are the sides of a triangle.
[Figure: vector addition, with x, y, and x + y forming the sides of a triangle.]
On $\mathbb{R}^n$ or $\mathbb{C}^n$ there is a family of important examples of vector norms, namely the p-norms:
$\|x\|_p = \Big( \sum_i |x_i|^p \Big)^{1/p}$, if $p \geq 1$. (1)
When p = 2 this is the familiar Euclidean norm (distance). As $p \to \infty$ in (1) we get the $\infty$-norm or the max-norm:
$\|x\|_\infty = \max_i |x_i|$. (2)
Here, for comparison of the norms, we picture the sets $\{ x \mid \|x\|_p = 1 \}$ for the three most important p-norms: p = 1, 2, and $\infty$.
[Figure: the unit balls $\|x\|_1 = 1$, $\|x\|_2 = 1$, and $\|x\|_\infty = 1$ in the plane.]
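To make the definitions concrete, here is a minimal numpy sketch (the sample vector is just an illustration) that computes the three norms pictured above directly from (1) and (2) and checks them against numpy.linalg.norm:

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])

norm1 = np.sum(np.abs(x))               # ||x||_1: sum of absolute values
norm2 = np.sqrt(np.sum(np.abs(x)**2))   # ||x||_2: the Euclidean norm
norm_inf = np.max(np.abs(x))            # ||x||_inf: the max-norm (2)

# numpy's built-in norm agrees with the definitions:
assert np.isclose(norm1, np.linalg.norm(x, 1))
assert np.isclose(norm2, np.linalg.norm(x, 2))
assert np.isclose(norm_inf, np.linalg.norm(x, np.inf))
print(norm1, norm2, norm_inf)           # 8.0 5.099... 4.0
```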
With each norm we get different estimates of closeness, but not different notions of convergence. That is, if a sequence of vectors converges in one norm (that is, the norm of the difference between the terms of the sequence and the limit tends to 0), then the sequence converges to the same limit in every other vector norm.
This is because of the following fact: if $\| \cdot \|$ and $\| \cdot \|'$ are two different norms on a finite dimensional space, then there are constants c and c' such that, for all x, we have the inequalities
$\|x\| \leq c \, \|x\|'$ and $\|x\|' \leq c' \, \|x\|$. (3)
For example, for the p-norms we have the following specific constants:
$\|x\|_2 \leq \|x\|_1 \leq \sqrt{n} \, \|x\|_2$
$\|x\|_\infty \leq \|x\|_2 \leq \sqrt{n} \, \|x\|_\infty$
$\|x\|_\infty \leq \|x\|_1 \leq n \, \|x\|_\infty$ (4)
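These constants are easy to test numerically. The following spot check on a random vector (an illustration, of course, not a proof) verifies all three lines of (4):

```python
import numpy as np

n = 10
rng = np.random.default_rng(0)
x = rng.standard_normal(n)

n1 = np.linalg.norm(x, 1)
n2 = np.linalg.norm(x, 2)
ninf = np.linalg.norm(x, np.inf)

# The three chains of inequalities in (4):
assert n2 <= n1 <= np.sqrt(n) * n2
assert ninf <= n2 <= np.sqrt(n) * ninf
assert ninf <= n1 <= n * ninf
```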
If all norms on a finite-dimensional vector space are equivalent, then why is there more than one norm? In applications, the specific estimates afforded by one norm may lead to better results than those afforded by another. For example, in image processing it has been observed in many experiments that estimates using the 2-norm lead to poorer quality than those using the 1-norm. This may seem paradoxical, but bear in mind that in practice the limit is only a theoretical construct: you always have to make do with some particular approximation, and the measure of how good a particular approximation is depends very much on your means of measurement.
By the way, the p-norms can be defined on spaces of functions, using an integral instead of the sum. This is a very useful application of the idea of norms, especially to differential equations and numerical integration. However, the different norms are not equivalent to one another on these spaces: a sequence of functions may converge in one norm but not in another. (For instance, the functions $f_k(x) = x^k$ on [0, 1] satisfy $\|f_k\|_1 = 1/(k+1) \to 0$ while $\|f_k\|_\infty = 1$ for every k.) We can see a hint of this in the inequalities (4) above: the constants depend on the dimension.
Inner products
The 2-norm is associated with a more refined measure of closeness, the inner product. The motivating example of an inner product is the familiar dot product from calc III, a function of two variables from which we can compute both length and angle. So, if $x, y \in \mathbb{R}^n$ then we learned in calc III that
$y^T x = \sum_i x_i y_i = \|x\|_2 \, \|y\|_2 \, \cos \theta(x, y)$, (5)
where $\theta(x, y)$ is the angle between x and y.
In particular, $\|x\|_2 = \sqrt{x^T x}$. To generalize this when we have vectors in $\mathbb{C}^n$ we use the formula
$y^* x = \sum_i x_i \overline{y_i}$. (6)
Although this is not always real, and hence the latter part of formula (5) does not work, we do still have that $\|x\|_2 = \sqrt{x^* x}$ for $x \in \mathbb{C}^n$.
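Here is a small numpy sketch of formula (6); note that numpy.vdot conjugates its first argument, which matches the convention $y^* x = \sum_i x_i \overline{y_i}$:

```python
import numpy as np

x = np.array([1 + 2j, 3 - 1j])
y = np.array([2 - 1j, 1j])

ip = np.sum(x * np.conj(y))            # y^* x, written out as in (6)
assert np.isclose(ip, np.vdot(y, x))   # vdot conjugates its first argument

# The 2-norm still comes from the inner product: ||x||_2 = sqrt(x^* x)
assert np.isclose(np.sqrt(np.vdot(x, x)).real, np.linalg.norm(x))
```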
More generally, we define an inner product on a (real or complex) vector space V to be a function of two variables $\langle x, y \rangle$ that satisfies the following axioms:
1. $\langle x, y \rangle = \overline{\langle y, x \rangle}$.
2. $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$ and $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$.
3. $\langle cx, y \rangle = c \, \langle x, y \rangle$ and $\langle x, cy \rangle = \bar{c} \, \langle x, y \rangle$.
4. If $x \neq 0$ then $\langle x, x \rangle > 0$.
Note that if V is a real vector space then the complex conjugate is to be ignored, since the conjugate of
a real number is real (and conversely).
One of the most important consequences of these axioms is the Cauchy-Schwarz Inequality:
$|\langle x, y \rangle|^2 \leq \langle x, x \rangle \, \langle y, y \rangle$. (7)
As with the dot product, we can use an inner product to define a norm, by the rule
$\|x\| = \sqrt{\langle x, x \rangle}$. (8)
Even though all norms are equivalent from the point of view of convergence, not all norms can be defined by an inner product. So, even though norms are all analytically equivalent, they are geometrically quite different, as we saw in the unit-ball figure above. A geometric property of norms which distinguishes those defined by inner products from all other norms is the Parallelogram Law:
$\|x + y\|^2 + \|x - y\|^2 = 2 \, \|x\|^2 + 2 \, \|y\|^2$. (9)
Let me stress again that this law is valid for the 2-norm, but not for any other p-norm. We can picture
this as saying that the sum of the squares of the diagonals of a parallelogram equals the sum of the
squares of the four sides:
[Figure: a parallelogram with sides x and y and diagonals x + y and x - y.]
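A quick numerical illustration: the Parallelogram Law holds for the 2-norm but already fails for the 1-norm, so the 1-norm cannot come from any inner product. The vectors below are just one convenient counterexample:

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])

def parallelogram_sides(p):
    """Return the two sides of (9) computed in the p-norm."""
    lhs = np.linalg.norm(x + y, p)**2 + np.linalg.norm(x - y, p)**2
    rhs = 2 * np.linalg.norm(x, p)**2 + 2 * np.linalg.norm(y, p)**2
    return lhs, rhs

print(parallelogram_sides(2))   # (4.0, 4.0): the law holds
print(parallelogram_sides(1))   # (8.0, 4.0): the law fails
```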
In a (real or complex) vector space with an inner product we say that vectors x and y are orthogonal (or perpendicular) in case their inner product is 0. We write $x \perp y$ in this case. For a real vector space orthogonality really means what you imagine: the vectors are at right angles to one another. However, in a complex vector space it has a slightly more subtle meaning. For example, in $\mathbb{C}^1$ (that is, the usual complex plane) the inner product of two complex numbers x and y is simply $x \bar{y}$, which is never 0 unless either x or y is. On the other hand, $\mathbb{C}^1$ can also be pictured as $\mathbb{R}^2$. The inner product of x and y as real vectors is the real part of $x \bar{y}$. This phenomenon persists in higher dimensions: the real inner product of vectors $x, y \in \mathbb{C}^n$ is the real part of their complex inner product. Hence if $x \perp y$ as complex vectors then certainly $x \perp y$ as real vectors, but the latter condition is in general weaker than the former.
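Here is the phenomenon in $\mathbb{C}^1$, computed explicitly: x = 1 and y = i have a nonzero complex inner product, yet its real part vanishes, so they are orthogonal as vectors in $\mathbb{R}^2$:

```python
import numpy as np

x, y = 1 + 0j, 1j

complex_ip = x * np.conj(y)   # x * conj(y) = -1j: nonzero, so NOT orthogonal in C^1
real_ip = complex_ip.real     # 0.0: orthogonal as real vectors in R^2
print(complex_ip, real_ip)
```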
Matrix norms
The set of (real or complex) matrices of a given size is a (real or complex) vector space. If we consider matrices of size m × n, say, then this is a vector space of dimension mn. Hence it makes sense to talk about matrix norms. However, there is more to the algebra of matrices than simply addition and scalar multiplication. There is also matrix multiplication, and we want our matrix norms to reflect this.
More precisely, by a matrix norm we mean a norm (in the sense above) on the space of matrices which satisfies the additional axiom
$\|AB\| \leq \|A\| \, \|B\|$. (10)
Here are two simple examples: the max norm
$\|A\| = \max_{i,j} |A_{i,j}|$; (11)
and the Frobenius norm
$\|A\|_F = \sqrt{ \sum_{i,j} |A_{i,j}|^2 }$. (12)
In the context of vector norms we used the notation $\| \cdot \|_\infty$ and $\| \cdot \|_2$ for these definitions, but in the context of matrix norms we reserve this notation for the more important notions of operator norms.
An operator norm is defined by means of the relative effect of an operator on a vector, as measured by some vector norm. So, each vector norm $\|x\|$ gives rise to an associated operator norm by either of the (equivalent) rules
$\|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{\|x\| = 1} \|Ax\|$. (13)
The operator norm is not always easy to compute, but it has the useful theoretical property (the key estimate behind the condition number)
$\|Ax\| \leq \|A\| \, \|x\|$. (14)
In certain cases it is easy to compute the p-operator norm. For $p = \infty$ we have that the (operator) $\infty$-norm of a matrix A is the maximum of the (vector) 1-norms of all its rows:
$\|A\|_\infty = \max_i \| \text{row } i \text{ of } A \|_1$. (15)
For p = 1 we have that the (operator) 1-norm of a matrix A is the maximum of the (vector) 1-norms of all its columns:
$\|A\|_1 = \max_j \| \text{column } j \text{ of } A \|_1$. (16)
Thus $\|A\|_\infty = \|A^*\|_1$.
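Formulas (15) and (16) are easy to check in numpy, whose matrix norms norm(A, inf) and norm(A, 1) are exactly these row and column maxima. A sketch on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

row_max = np.abs(A).sum(axis=1).max()   # max row 1-norm, formula (15)
col_max = np.abs(A).sum(axis=0).max()   # max column 1-norm, formula (16)

assert np.isclose(row_max, np.linalg.norm(A, np.inf))
assert np.isclose(col_max, np.linalg.norm(A, 1))
# And ||A||_inf = ||A^*||_1:
assert np.isclose(np.linalg.norm(A, np.inf), np.linalg.norm(A.conj().T, 1))
```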
Here are some inequalities which we will find very useful. (In each of these, assume A is an $n \times n$ matrix.)
$\|Ax\|_2 \leq \|A\|_F \, \|x\|_2$
$\|A\|_2 = \|A^*\|_2$
$n^{-1/2} \, \|A\|_2 \leq \|A\|_1 \leq n^{1/2} \, \|A\|_2$
$n^{-1/2} \, \|A\|_2 \leq \|A\|_\infty \leq n^{1/2} \, \|A\|_2$
$n^{-1} \, \|A\|_\infty \leq \|A\|_1 \leq n \, \|A\|_\infty$
$\|A\|_2 \leq \|A\|_F \leq n^{1/2} \, \|A\|_2$ (17)
The operator 2-norm is not as easy to compute, and hence we sometimes use the Frobenius norm as a
substitute.
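The last line of (17) is what justifies the substitution: the cheap Frobenius norm brackets the operator 2-norm to within a factor of $\sqrt{n}$. A quick numerical spot check:

```python
import numpy as np

n = 6
rng = np.random.default_rng(2)
A = rng.standard_normal((n, n))

two = np.linalg.norm(A, 2)       # operator 2-norm: the largest singular value
fro = np.linalg.norm(A, 'fro')   # Frobenius norm, formula (12)

assert two <= fro <= np.sqrt(n) * two
```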
Hermitian, symmetric, unitary, and orthogonal matrices
A hermitian matrix A satisfies $A = A^*$. When A is real and hermitian we use the terminology symmetric matrix, instead. Now if $\langle x, y \rangle = y^* x$ then for any matrix A we have that $\langle Ax, y \rangle = \langle x, A^* y \rangle$. Hence another way to say that A is hermitian is to say that it is selfadjoint:
$A = A^* \iff \langle Ax, y \rangle = \langle x, Ay \rangle$. (18)
In particular, if A is hermitian then $\langle Ax, x \rangle$ is real. This is because
$\langle Ax, x \rangle = \langle x, Ax \rangle = \overline{\langle Ax, x \rangle}$.
We say that a hermitian matrix is positive definite in case
$\langle Ax, x \rangle > 0$ for all nonzero x. (19)
This is such an important property that we abbreviate it as hpd, for complex matrices, or spd for real ones.
Finally, we say that Q is unitary if $Q^* = Q^{-1}$. We use the term orthogonal if Q is real and unitary. To say that Q is unitary (or orthogonal) is equivalent to saying that it preserves the (standard) inner product:
$Q^* = Q^{-1} \iff \langle Qx, Qy \rangle = \langle x, y \rangle$. (20)
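To tie the definitions together, here is a sketch that builds a hermitian matrix and a unitary matrix (the latter via a QR factorization, one convenient way to produce one) and checks (18) and (20) numerically:

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

A = B + B.conj().T        # hermitian: A = A^*
Q, _ = np.linalg.qr(B)    # Q is unitary

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

ip = lambda u, v: np.vdot(v, u)   # <u, v> = v^* u, as in the notes

assert np.isclose(ip(A @ x, y), ip(x, A @ y))    # selfadjointness, (18)
assert np.isclose(ip(Q @ x, Q @ y), ip(x, y))    # Q preserves the inner product, (20)
assert np.allclose(Q.conj().T @ Q, np.eye(3))    # Q^* = Q^{-1}
```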
Homework problems: due Monday, 6 February
1. Suppose that A is hermitian.
(a) Show that the eigenvalues of A are real. (Hint: If $Ax = \lambda x$ consider $\langle Ax, x \rangle$.)
(b) Show that if A is hpd then its eigenvalues are positive. Is the converse true?
(c) Suppose that $v_1$ and $v_2$ are eigenvectors for A, with corresponding eigenvalues $\lambda_1$ and $\lambda_2$. Show that if $\lambda_1 \neq \lambda_2$ then $v_1$ is orthogonal to $v_2$.
2. Suppose that Q is unitary.
(a) Show that if $\lambda$ is an eigenvalue of Q then $|\lambda| = 1$. (Hint: If $Qx = \lambda x$ consider $\langle Qx, Qx \rangle$.)
(b) Suppose that $v_1$ and $v_2$ are eigenvectors for Q, with corresponding eigenvalues $\lambda_1$ and $\lambda_2$. Is it true that $v_1$ is orthogonal to $v_2$?
3. Show that rk(A) = 1 if and only if there are nonzero column vectors x and y such that $A = x y^T$. Show that in this case $\|A\|_F = \|A\|_2 = \|x\|_2 \, \|y\|_2$.
4. We say that an $n \times n$ matrix A is strictly upper triangular if all its entries on or below the main diagonal are 0. Show that if A is strictly upper triangular then $A^n = 0$. What does this say about the eigenvalues of A?
5. Show that if A is both unitary and upper triangular then it is diagonal. In this case what can you
say about its diagonal entries?