
The Cauchy-Schwarz Inequality

July 24, 2009

1 Introduction
The Cauchy-Schwarz Inequality states that any time you take an inner product
(i.e. dot product) of two vectors, the magnitude of the result can’t exceed what
you get by just multiplying the two vectors’ lengths. This is the crucial theorem
underlying Heisenberg’s Uncertainty Principle.
In the simple vectors we draw in physics 101, the inequality is obvious,
because the dot product of two vectors $\vec{a}$ and $\vec{b}$ is
$$\vec{a} \cdot \vec{b} = |a||b| \cos\theta \tag{1}$$
with θ the angle between the vectors. Because cos θ ≤ 1, the dot product is less
than or equal to the product of the lengths. Also, the dot product equals the
product of the lengths only when the vectors point in the same direction. Here,
we want to generalize this result to abstract vector spaces, working from the
axioms. Since we’re talking about abstract vectors from now on, rather than
little arrows in a plane, we’ll switch to the bra-ket notation.
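
Before moving to abstract spaces, the inequality for ordinary arrows can be checked numerically. The following is a minimal sketch, assuming real 3-component vectors stored as Python lists; the helper names `dot`, `length`, and `cauchy_schwarz_gap` are illustrative choices, not part of the original text.

```python
import math
import random


def dot(a, b):
    """Ordinary dot product of two equal-length real vectors."""
    return sum(p * q for p, q in zip(a, b))


def length(a):
    """Euclidean length, the square root of a . a."""
    return math.sqrt(dot(a, a))


def cauchy_schwarz_gap(a, b):
    """Return |a||b| - |a . b|, which the inequality says is never negative."""
    return length(a) * length(b) - abs(dot(a, b))


random.seed(0)  # fixed seed so the run is repeatable
for _ in range(1000):
    a = [random.uniform(-1, 1) for _ in range(3)]
    b = [random.uniform(-1, 1) for _ in range(3)]
    # small tolerance to absorb floating-point rounding
    assert cauchy_schwarz_gap(a, b) >= -1e-12

# Vectors pointing in the same direction saturate the inequality.
parallel_gap = cauchy_schwarz_gap([1.0, 2.0, 2.0], [2.0, 4.0, 4.0])
# Perpendicular unit vectors leave the largest possible gap.
orthogonal_gap = cauchy_schwarz_gap([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

A random test like this proves nothing, but it illustrates the two extremes: zero gap for parallel vectors and a gap equal to the product of the lengths for perpendicular ones.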

2 Inner Products and Norms


Vector spaces can exist without inner products, but they’re less interesting that
way because we can’t define orthogonality or take projections. Inner products
are also closely tied to norms. If a vector space has an inner product defined,
we can define the norm of a vector v by
$$|v| = \sqrt{\langle v | v \rangle} \tag{2}$$
Alternatively, if we have a norm but no inner product (as might be the case
in a physical situation, where the norm is what you get by laying down a ruler),
we can define the inner product of two vectors by
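$$4\langle v | w \rangle \equiv |v + w|^2 - |v - w|^2 \tag{3}$$
Although it isn’t immediately obvious, this definition is linear in both vectors,
provided the norm satisfies the parallelogram law. For a complex vector space,
the appropriate definition is
$$4\langle a | b \rangle \equiv |a + b|^2 - |a - b|^2 - i\left(|a + ib|^2 - |a - ib|^2\right). \tag{4}$$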
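
As a sanity check on (4), the sketch below recovers a complex inner product from norms alone. It assumes vectors in C^2 stored as lists of Python complex numbers and an inner product that is conjugate-linear in its first slot; the function names are hypothetical helpers introduced only for this illustration.

```python
def inner(a, b):
    """<a|b> on C^n: conjugate-linear in a, linear in b."""
    return sum(p.conjugate() * q for p, q in zip(a, b))


def norm_sq(a):
    """|a|^2 = <a|a>, returned as a real number."""
    return inner(a, a).real


def add(a, b, scale=1):
    """Return the vector a + scale * b."""
    return [p + scale * q for p, q in zip(a, b)]


def polarization(a, b):
    """Rebuild <a|b> from four squared norms, following equation (4)."""
    sum_diff = norm_sq(add(a, b)) - norm_sq(add(a, b, -1))
    rotated_diff = norm_sq(add(a, b, 1j)) - norm_sq(add(a, b, -1j))
    return (sum_diff - 1j * rotated_diff) / 4


a = [1 + 2j, 3 - 1j]
b = [2 - 1j, 0 + 1j]
direct = inner(a, b)            # computed straight from the components
recovered = polarization(a, b)  # computed from norms only
```

The two values agree, which is the content of (4): in this setting the norm carries all the information in the inner product.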

3 Decomposition of a Vector
When we have an inner product space, there are some vectors whose inner
products are especially simple: parallel and orthogonal vectors. Intuitively,
parallel vectors are vectors that point in the same direction. Formally, two
vectors are parallel when the length of their sum is the sum of their lengths.
Orthogonal (or perpendicular) vectors are defined as having an inner product
of zero. For these vectors, the Pythagorean Theorem tells us that the square
of the length of the sum is the sum of the squares of the lengths.
Looking at any two vectors $|y\rangle$ and $|x\rangle$, it would be nice if we could
understand them simply in terms of parallel and orthogonal vectors, even though
$|y\rangle$ and $|x\rangle$ are in general neither of these. What we’ll try to do then
is take $|y\rangle$ apart into two pieces, one of which is parallel to $|x\rangle$, and
the other orthogonal to it.
Specifically, we want to find some scalar c such that
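$$|y\rangle = c\,|x\rangle + |w\rangle \tag{5}$$
with
$$\langle x | w \rangle = 0 \tag{6}$$
Dotting both sides of (5) with $\langle x|$ gives
$$\begin{aligned}
\langle x | y \rangle &= c\,\langle x | x \rangle + \langle x | w \rangle \\
\langle x | y \rangle &= c\,\langle x | x \rangle + 0 \\
c &= \frac{\langle x | y \rangle}{\langle x | x \rangle}
\end{aligned} \tag{7}$$
Putting this result for c back into (5) yields
$$|y\rangle = \frac{\langle x | y \rangle}{\langle x | x \rangle}\,|x\rangle + |w\rangle. \tag{8}$$
Geometrically, we’re creating a right triangle with $|y\rangle$ as the hypotenuse.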
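
The decomposition in (5) through (8) can be sketched in a few lines. This is an illustrative example only, assuming finite-dimensional vectors stored as Python lists and a nonzero |x>; `decompose` is a hypothetical helper name.

```python
def inner(a, b):
    """<a|b>: conjugate-linear in a, linear in b (also fine for real vectors)."""
    return sum(p.conjugate() * q for p, q in zip(a, b))


def decompose(x, y):
    """Split y into c*x (parallel to x) plus w (orthogonal to x).

    Returns the scalar c from equation (7) and the remainder w.
    """
    xx = inner(x, x)
    if xx == 0:
        raise ValueError("x must be nonzero to project onto it")
    c = inner(x, y) / xx
    w = [q - c * p for p, q in zip(x, y)]
    return c, w


x = [1.0, 1.0, 0.0]
y = [3.0, 1.0, 2.0]
c, w = decompose(x, y)       # c = 2.0, w = [1.0, -1.0, 2.0]
orthogonality = inner(x, w)  # condition (6): this should vanish
```

Here <x|y> = 4 and <x|x> = 2, so c = 2, and the remainder w has zero inner product with x as (6) requires.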

4 Proof of the Cauchy-Schwarz Inequality


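We suspect that
$$|\langle x | y \rangle| \le |x||y|. \tag{9}$$
That’s the Cauchy-Schwarz Inequality. We want to prove it from the axioms
of the inner product, which are
$$\begin{aligned}
&\langle x | x \rangle \ge 0, \text{ and } \langle x | x \rangle = 0 \Leftrightarrow |x\rangle = 0 \\
&\langle x | y \rangle = \langle y | x \rangle^* \\
&\langle x | \alpha y + \beta z \rangle = \alpha \langle x | y \rangle + \beta \langle x | z \rangle
\end{aligned} \tag{10}$$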

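Written out entirely in inner products, the Cauchy-Schwarz inequality is
$$|\langle x | y \rangle| \le \sqrt{\langle x | x \rangle \langle y | y \rangle}. \tag{11}$$
Because both sides are nonnegative, the statement is equivalent to what
we get by squaring both sides.

$$|\langle x | y \rangle|^2 \le \langle x | x \rangle \langle y | y \rangle. \tag{12}$$

If $|x\rangle = 0$, both sides vanish and there is nothing to prove, so take
$|x\rangle \neq 0$. We broke $|y\rangle$ down into parallel and perpendicular parts,
so let’s substitute (8) into (12) and see what we’ve got.
$$|\langle x | y \rangle|^2 \le \langle x | x \rangle \left( \langle w | w \rangle + \left| \frac{\langle x | y \rangle}{\langle x | x \rangle} \right|^2 \langle x | x \rangle \right) \tag{13}$$
Simplifying a little gives that the Cauchy-Schwarz Inequality is equivalent
to
$$|\langle x | y \rangle|^2 \le |\langle x | y \rangle|^2 + \langle x | x \rangle \langle w | w \rangle \tag{14}$$
which is obviously true, because the last term is a product of two nonnegative
numbers. Further, equality holds only when $|x\rangle = 0$ or $|w\rangle = 0$,
meaning $|x\rangle$ and $|y\rangle$ are linearly dependent.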
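
The substitution step is compressed in the text, so here is one way to spell it out. With c as in (7) and the orthogonality condition (6), the cross terms drop out of the expansion of the squared length of the decomposed vector. This is a sketch using the axioms in (10), with the inner product taken to be conjugate-linear in its first slot.

```latex
\begin{align*}
\langle y | y \rangle
  &= \langle c\,x + w \,|\, c\,x + w \rangle \\
  &= |c|^2 \langle x | x \rangle + c^* \langle x | w \rangle
     + c\, \langle w | x \rangle + \langle w | w \rangle \\
  &= |c|^2 \langle x | x \rangle + \langle w | w \rangle \\
  &= \frac{|\langle x | y \rangle|^2}{\langle x | x \rangle} + \langle w | w \rangle .
\end{align*}
```

Multiplying through by the squared length of x gives the right-hand side of (14).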

5 Alternate Proof
Here’s another proof that is a bit more clever. We’ll work in a real vector space
for this one.
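The Cauchy-Schwarz Inequality is trivial if $|x\rangle$ and $|y\rangle$ are dependent. So
assume they are independent, meaning

$$\forall \lambda \in \mathbb{R}, \quad \lambda |y\rangle - |x\rangle \neq 0. \tag{15}$$
Then the squared norm of this difference must be strictly positive.

$$\forall \lambda \in \mathbb{R}, \quad \lambda^2 \langle y | y \rangle - 2\lambda \langle x | y \rangle + \langle x | x \rangle > 0 \tag{16}$$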
The left hand side is quadratic in λ, but the quadratic has no roots because
it’s always greater than zero. Therefore the discriminant is negative. This
means

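$$\begin{aligned}
4\langle x | y \rangle^2 - 4\langle y | y \rangle \langle x | x \rangle &< 0 \\
\langle x | y \rangle^2 &< \langle x | x \rangle \langle y | y \rangle
\end{aligned} \tag{17}$$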

The square root of that is the Cauchy-Schwarz Inequality.
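
The discriminant argument is easy to try on concrete numbers. The sketch below assumes real vectors stored as Python lists; `quadratic_coefficients` and `discriminant` are illustrative names, with the coefficients read off from (16).

```python
def dot(a, b):
    """Real inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def quadratic_coefficients(x, y):
    """Coefficients (a, b, c) of |lam*y - x|^2 = a*lam^2 + b*lam + c."""
    return dot(y, y), -2 * dot(x, y), dot(x, x)


def discriminant(x, y):
    """b^2 - 4ac for the quadratic above; negative when x, y are independent."""
    a, b, c = quadratic_coefficients(x, y)
    return b * b - 4 * a * c


# Independent vectors: the quadratic never touches zero.
x = [1.0, 2.0]
y = [3.0, -1.0]
disc_independent = discriminant(x, y)    # 4*1 - 4*10*5 = -196

# Dependent vectors (y = 2x): there is a double root at lam = 1/2.
disc_dependent = discriminant([1.0, 2.0], [2.0, 4.0])
```

A negative discriminant is the strict inequality (17); a zero discriminant is the equality case that was set aside as trivial.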

6 The Triangle Inequality
Once we have the Cauchy-Schwarz Inequality, we can get the Triangle Inequality,
which says

|x + y| ≤ |x| + |y| (18)


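Both sides are nonnegative, so we can square them. Writing the result in inner
products gives

$$\begin{aligned}
\langle x + y | x + y \rangle &\le \langle x | x \rangle + 2\sqrt{\langle x | x \rangle \langle y | y \rangle} + \langle y | y \rangle \\
2\langle x | y \rangle &\le 2\sqrt{\langle x | x \rangle \langle y | y \rangle}
\end{aligned} \tag{19}$$

where the second line comes from expanding the left side and cancelling
$\langle x | x \rangle + \langle y | y \rangle$, showing that the Triangle Inequality
is equivalent to the Schwarz Inequality. (In a complex vector space,
$\langle x | y \rangle$ in (19) becomes $\operatorname{Re}\langle x | y \rangle$.)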


Another Triangle Inequality is

||x| − |y|| ≤ |x − y| (20)


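This follows from

$$\begin{aligned}
|x||y| &\ge |\langle x | y \rangle| \\
-2|x||y| &\le 2\langle x | y \rangle \\
|x|^2 - 2|x||y| + |y|^2 &\le |x|^2 + 2\langle x | y \rangle + |y|^2 \\
(|x| - |y|)^2 &\le \langle x | x \rangle + 2\langle x | y \rangle + \langle y | y \rangle \\
(|x| - |y|)^2 &\le \langle x + y | x + y \rangle \\
||x| - |y|| &\le |x + y| \\
||x| - |y|| &\le |x - y|
\end{aligned} \tag{21}$$

where the last line replaces $|y\rangle$ with $-|y\rangle$, which leaves $|y|$ unchanged.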

Combining the two Triangle Inequalities gives

||x| − |y|| ≤ |x − y| ≤ |x| + |y|. (22)


This means that when you add or subtract two vectors, the length of the
resultant is always between what you get by adding and subtracting the lengths
of the two vectors directly.
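
The sandwich in (22) can be illustrated numerically. This is a small sketch under the assumption of real vectors with the Euclidean norm; `triangle_bounds` is a hypothetical helper that returns the three quantities in (22) in order.

```python
import math
import random


def length(v):
    """Euclidean length of a real vector."""
    return math.sqrt(sum(c * c for c in v))


def triangle_bounds(x, y):
    """Return (| |x| - |y| |, |x - y|, |x| + |y|), the three terms of (22)."""
    diff = [p - q for p, q in zip(x, y)]
    return abs(length(x) - length(y)), length(diff), length(x) + length(y)


random.seed(1)  # fixed seed so the run is repeatable
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(4)]
    y = [random.uniform(-5, 5) for _ in range(4)]
    lower, middle, upper = triangle_bounds(x, y)
    # small tolerance to absorb floating-point rounding
    assert lower <= middle + 1e-12
    assert middle <= upper + 1e-12

# Lengths 5 and 1: the difference vector must have length between 4 and 6.
example = triangle_bounds([3.0, 4.0], [0.0, 1.0])
```

In the example the difference vector is (3, 3), whose length is the square root of 18, comfortably between 4 and 6.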

7 The Uncertainty Principle


Suppose two Hermitian linear operators A and B have the commutator

[A, B] = C (23)

with C some other linear operator. We’re interested in the simultaneous
eigenvectors of A and B. Or, if they don’t have them, we’re interested in knowing
how close we can get.

When $|\psi\rangle$ is an eigenvector of A, it has zero variance, meaning
$\langle\psi|A^2|\psi\rangle = \langle\psi|A|\psi\rangle^2$.
So the variance is a measure of how close we are to an eigenvector. To make
things easier on ourselves, we could consider the operator
$A' \equiv A - \langle\psi|A|\psi\rangle$.
Adding a constant to an operator doesn’t change its variance, but this particular
constant does kill the mean. So without loss of generality, let’s kill off the
means of A and B.
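With the means removed, the product of the uncertainties satisfies
$$\begin{aligned}
\Delta A \, \Delta B &\equiv \sqrt{\langle\Psi|A^2|\Psi\rangle \langle\Psi|B^2|\Psi\rangle} \\
&= \sqrt{\langle A\Psi|A\Psi\rangle \langle B\Psi|B\Psi\rangle} \\
&\ge |\langle A\Psi|B\Psi\rangle| \\
&= |\langle\Psi|AB|\Psi\rangle| \\
&\ge \left| \frac{1}{2i} \left( \langle\Psi|AB|\Psi\rangle - \langle\Psi|AB|\Psi\rangle^* \right) \right| \\
&= \left| \frac{1}{2i} \langle\Psi|C|\Psi\rangle \right| \\
&= \frac{1}{2} |\langle\Psi|C|\Psi\rangle|
\end{aligned} \tag{24}$$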
We invoked the Cauchy-Schwarz Inequality to go from the second to the
third line. To go from the fourth to fifth line, we said that the magnitude of
a complex number is greater than or equal to the magnitude of its imaginary
part.
For the special case where A is the position operator and B is the momentum
operator, the canonical commutation relation tells us $C = i\hbar$. This is a constant
times the identity, and plugging this into (24) gives

$$\Delta x \, \Delta p \ge \frac{\hbar}{2} \tag{25}$$
Also note that when we asserted that the magnitude of a number is greater
than or equal to that of its imaginary part, we could have made an equivalent
statement about its real part. In this case we’d get an uncertainty relation
involving the anti-commutator.
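
Relation (24) can be tried out on a finite-dimensional example. The sketch below is an illustration, not part of the original derivation: it assumes the 2x2 Pauli matrices sigma_x and sigma_y as the Hermitian operators A and B, a normalized state stored as a two-component list, and hypothetical helper names throughout. The `spread` function subtracts the mean explicitly rather than assuming it has already been removed.

```python
import math

# Assumed example operators: the Pauli matrices sigma_x and sigma_y.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(m, v):
    """Matrix m acting on the column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def inner(a, b):
    """<a|b>: conjugate-linear in a, linear in b."""
    return sum(complex(p).conjugate() * q for p, q in zip(a, b))


def expectation(m, psi):
    """<psi|M|psi> for a normalized state psi."""
    return inner(psi, apply(m, psi))


def spread(m, psi):
    """Standard deviation sqrt(<M^2> - <M>^2) of a Hermitian M in state psi."""
    mean = expectation(m, psi).real
    mean_sq = expectation(matmul(m, m), psi).real
    return math.sqrt(max(mean_sq - mean * mean, 0.0))


def commutator(a, b):
    """[A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def both_sides(psi):
    """Return (Delta A * Delta B, |<C>| / 2), the two sides of (24)."""
    lhs = spread(SIGMA_X, psi) * spread(SIGMA_Y, psi)
    rhs = 0.5 * abs(expectation(commutator(SIGMA_X, SIGMA_Y), psi))
    return lhs, rhs


lhs, rhs = both_sides([1.0, 0.0])  # eigenstate of sigma_z
```

For this particular state both sides equal 1, so the bound is saturated; other normalized states can be passed to `both_sides` to see that the left side never drops below the right.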
