Dan Garber
https://dangar.net.technion.ac.il/
Recap
Definition (Norm)
A function $\|\cdot\| : X \to \mathbb{R}$ is a norm if
1. $\forall x \in X : \|x\| \geq 0$, and $\|x\| = 0$ if and only if $x = 0$ (positivity)
2. $\forall x, y \in X : \|x + y\| \leq \|x\| + \|y\|$ (triangle inequality)
3. $\forall \alpha \in \mathbb{R}, x \in X : \|\alpha x\| = |\alpha| \cdot \|x\|$ (homogeneity)
Definition (Inner product)
An inner product on a (real) vector space $X$ is a function which maps any pair $x, y \in X$ to a real scalar, denoted $\langle x, y \rangle$, which satisfies the following axioms for any $x, y, z \in X$ and scalar $\alpha \in \mathbb{R}$:
1. $\langle x, x \rangle \geq 0$, and $\langle x, x \rangle = 0$ if and only if $x = 0$ (positivity)
2. $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$ (additivity)
3. $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$ (homogeneity)
4. $\langle x, y \rangle = \langle y, x \rangle$ (symmetry)
@ Dan Garber (Technion) Lesson 2 Winter 2020-21 2 / 22
Recap
Theorem
Let $X$ be an inner product space. Then, the function $\|\cdot\| : X \to \mathbb{R}$ given by $\|x\| := \sqrt{\langle x, x \rangle}$ is a norm.
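As a quick sanity check, on $\mathbb{R}^n$ with the standard inner product the induced norm is the familiar Euclidean norm. A minimal sketch in NumPy (the library choice and the example vector are ours, not the slides'):

```python
import numpy as np

# For X = R^n with the standard inner product <x, y> = x^T y,
# the induced norm ||x|| = sqrt(<x, x>) is the Euclidean norm.
x = np.array([3.0, 4.0])
induced_norm = np.sqrt(np.dot(x, x))   # sqrt(<x, x>)
print(induced_norm)                    # 5.0, matches np.linalg.norm(x)
```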
Recap
Definition
Given an inner product space $X$ and non-zero vectors $x^{(1)}, \dots, x^{(n)}$ in $X$, we say that $x^{(1)}, \dots, x^{(n)}$ are mutually orthogonal if and only if $\langle x^{(i)}, x^{(j)} \rangle = 0$ for all $i \neq j$.
Theorem
Given an inner product space $X$, any mutually orthogonal vectors $x^{(1)}, \dots, x^{(n)}$ are linearly independent.
Since, by the last theorem, orthonormal vectors are linearly independent, a set of orthonormal vectors $S$ forms an orthonormal basis for the linear span of $S$.
Example: for $X = \mathbb{R}^n$ with the standard inner product, the set of vectors $\{e_1, \dots, e_n\}$, where $e_i(i) = 1$ and $e_i(j) = 0$ for all $i, j \in \{1, \dots, n\}$ with $i \neq j$, forms an orthonormal basis of $\mathbb{R}^n$.
Corollary
A square $n \times n$ orthonormal matrix $X$ is invertible and $X^{-1} = X^\top$.
That is, $X^\top X = I$. Since $X^{-1}$ exists, using the above we can deduce:
$$X X^{-1} = I \;\Longrightarrow\; X^\top X X^{-1} = X^\top \;\Longrightarrow\; X^{-1} = X^\top.$$
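The corollary is easy to check numerically; a sketch using a $2 \times 2$ rotation matrix (our example, not from the slides) as the orthonormal matrix:

```python
import numpy as np

# A rotation matrix has orthonormal columns, so X^T X = I
# and its inverse equals its transpose.
theta = np.pi / 3
X = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(X.T @ X, np.eye(2))     # X^T X = I
assert np.allclose(np.linalg.inv(X), X.T)  # X^{-1} = X^T
```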
Orthogonal Decomposition of Linear Spaces
Observation
The orthogonal complement S ⊥ is always a subspace.
Example: Let $X = \mathbb{R}^n$ with the standard inner product and consider the subspace $S := \{x \in \mathbb{R}^n \mid \forall i > 1 : x_i = 0\}$.
It is not hard to see that the orthogonal complement is given by $S^\perp = \{x \in \mathbb{R}^n \mid x_1 = 0\}$.
In the projection problem $\min_{y \in S_v} \|y - x\|$ over a one-dimensional subspace $S_v = \mathrm{span}(v)$, the norm used is the one induced by the inner product. For any $y \in S_v$ we can write $\|y - x\|^2 = \|y - x_v\|^2 + \|x_v - x\|^2$, since the cross term vanishes by $(x - x_v) \perp S_v$. Since the first term is always non-negative, it follows that the minimum over $y$ is obtained by taking $y = x_v$, which proves that $x_v$ is indeed the projection we sought. Note in particular that the projection is unique.
$$x_v = \Pi_{S_v}(x) = \frac{\langle x, v \rangle}{\|v\|^2}\, v.$$
xv is usually called the component of x along the direction v.
In particular, if $\|v\| = 1$ then $x_v = \langle x, v \rangle v$.
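The one-dimensional projection formula can be verified numerically; a sketch with an illustrative pair $x, v$ (the specific values are our choice):

```python
import numpy as np

# Projection of x onto the line spanned by v (standard inner product):
#   x_v = (<x, v> / ||v||^2) v
x = np.array([2.0, 1.0])
v = np.array([1.0, 1.0])
x_v = (np.dot(x, v) / np.dot(v, v)) * v
# The residual x - x_v is orthogonal to v, as the projection theorem states.
assert np.isclose(np.dot(x - x_v, v), 0.0)
print(x_v)  # [1.5 1.5]
```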
Projection onto Arbitrary Subspaces
We now extend the previous result to the case where $S$ is not necessarily one-dimensional. Here too, orthogonality plays a key role.
Theorem (projection theorem)
Let $X$ be an inner product space, let $x$ be a given element in $X$, and let $S$ be a subspace of $X$. Then, there exists a unique vector $x^* \in S$ which is the solution to the problem $\min_{y \in S} \|y - x\|$.
Moreover, a necessary and sufficient condition for $x^*$ to be the optimal solution to this problem is that $x^* \in S$ and $(x - x^*) \perp S$.
Consider the previous case, with the difference that now the vectors $x_1, \dots, x_m$ are orthonormal (recall we can always construct an orthonormal basis via the Gram-Schmidt procedure, to be detailed later).
Now, we get
$$\alpha_k = \sum_{i=1}^{m} \alpha_i \langle x_i, x_k \rangle = \langle x, x_k \rangle, \qquad k = 1, \dots, m,$$
In the special case in which the linear space is $X = \mathbb{R}^n$ with the standard inner product, i.e., $x_1, \dots, x_m \in \mathbb{R}^n$, the above can be written in the following matrix form:
$$x^* = \sum_{i=1}^{m} (x^\top x_i)\, x_i = \sum_{i=1}^{m} x_i x_i^\top x = \left( \sum_{i=1}^{m} x_i x_i^\top \right) x = P P^\top x,$$
where $P$ is the $n \times m$ matrix whose columns are exactly the basis vectors $x_1, \dots, x_m$.
$P P^\top$ is called the projection matrix onto $\mathrm{span}(x_1, \dots, x_m)$.
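A small numerical sketch of the projection matrix $P P^\top$, using an illustrative orthonormal basis of a plane in $\mathbb{R}^3$ (the specific vectors are our choice):

```python
import numpy as np

# Stack orthonormal basis vectors x_1, ..., x_m as columns of P (n x m);
# then P P^T projects any x in R^n onto span(x_1, ..., x_m).
P = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])        # orthonormal basis of the x-y plane in R^3
x = np.array([2.0, -3.0, 7.0])
x_star = P @ P.T @ x              # the projection x^* = P P^T x
print(x_star)                     # [ 2. -3.  0.]
assert np.allclose(P.T @ P, np.eye(2))  # columns are indeed orthonormal
```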
We now complement the above results by showing that given any basis $(x_1, \dots, x_m)$ for a subspace $S \subseteq X$, one can construct an orthonormal basis for $S$. This can be done via the Gram-Schmidt procedure.
The actual proof is by induction, showing that for all $i \in \{1, \dots, m\}$, $\mathrm{span}(y_1, \dots, y_i) = \mathrm{span}(x_1, \dots, x_i)$.
This is not a real problem: in the tutorial you will see a more detailed version and proof of Gram-Schmidt which does NOT rely on the projection theorem.
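For concreteness, a minimal sketch of classical Gram-Schmidt over $\mathbb{R}^n$ with the standard inner product, assuming linearly independent inputs (the function name and example vectors are illustrative, not from the slides):

```python
import numpy as np

def gram_schmidt(xs):
    """Classical Gram-Schmidt: from each x_i, subtract its components
    along the already-built y_1, ..., y_{i-1}, then normalize."""
    ys = []
    for x in xs:
        y = x - sum(np.dot(x, q) * q for q in ys)  # remove projections
        ys.append(y / np.linalg.norm(y))           # normalize
    return ys

basis = gram_schmidt([np.array([1.0, 1.0]), np.array([1.0, 0.0])])
# The result is orthonormal and spans the same subspace as the input.
assert np.isclose(np.dot(basis[0], basis[1]), 0.0)
```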