Professional Documents
Culture Documents
1. Problem formulation
We wish to find a set of p orthonormal vectors in Rn which are optimal wrt to a real-valued
function F:
min F ( X ) s.t. X T X = I
X ∈Rn× p
2. Preliminaries
• two subspaces A and B of a vectorspace V are called orthogonal complements, if they
are ortogonal and their direct sum is V.
• To show that two subspaces A and B are equal, it is sufficient to show that A is a sub-
space of B and that dim( A) = dim( B).
• To show that two subspaces A and B are orthogonal complements in V, it is sufficient
to show that A is orthogonal to B, and that dimV = dimA + dimB.
• Representation: If ⟨., .⟩ is an inner product defined on V , then corresponding to any
linear functional L : V → R there is a v ∈ V with the property that ⟨v, u⟩ =
L(u) for all u ∈ V . The vector v is the representation of L. The representation of L
depends on the inner product.
• An invertible linear map L : V → W between two vector spaces V, W is an isomor-
phism. An isomorphism between two spaces shows that the spaces are essentially
identical as far as their vector space structure is concerned. An isomorphism between
a space and itself is an automorphism.
1
( )
• The Euclidean inner product for the vector space of matrices is ⟨ A, B⟩ = tr A T B
• the subspaces consisting of all symmetric and all skew-symmetric matrices are orthog-
onal complements
2.3 Euclidean and Canonical Inner Products for the Tangent Space
• The Stiefel manifold becomes a Riemannian manifold by introducing an inner product in its
tangent spaces.
2
• There are two natural inner products for tangent spaces of Stiefel manifolds: the Euclidean
inner product and the canonical inner product ( )
• we define the Euclidean inner product for 2 elements Z1 and Z2 as ⟨ Z1 , Z2 ⟩e = tr Z1T Z2
• from our representation we know that ⟨ Z, Z ⟩e = tr ( A T A) + tr ( B T B) = ∑i> j 2a2 (i, j) +
∑i,j b2 (i, j), after some reformulation
• thus, the Euclidean metric ⟨ Z, Z ⟩e induced by the Euclidean inner product weights the ele-
ments of A (the ‘A’ coordinates) twice as much as the elements of B (the ‘B’ coordinates),
• the canonical inner product weights the coordinates equally
• the idea is to find the “A” matrix of the tangent vector Z and weigh it by 21 in the inner
product ( ( ) )
• from this, we define the canonical inner product as ⟨ Z1 , Z2 ⟩c = tr Z1T I − 21 XX T Z2
• because of the above argument, we use exclusively the canonical inner product
• If F is a function from Rn× p to R, and X, Z ∈ Rn× p , then DFX : Rn× p → R, called the
∂F
differential of F, gives the derivative of F in the Z direction at X by DFX ( Z ) = ∑i,j ∂X Zi,j =
( T ) [ ] i,j
∂F
tr G Z , where G = ∂X i,j
∈ Rn× p
• Because DFX is a linear functional on Rn× p a representation of DFX in Rn× p of any of its
subspaces is the gradient of F in Rn× p or in the subspace (?). We can view the sum above as
the vector product of G and Z, and thus G is the representation of DFX
• DFX is also a linear functional on V p (Rn )
• DFX restricted to the tangent space has the following representation:
• Lemma 4. Under the canonical inner product, the vector AX with A = ( GXTXGT ) repre-
sents the action of DFX on the tangent space V p (Rn ).
Sketch of the algorithm: Generate X [k+1] from X [k] by a curvilinear search along Y (t) by chang-
ing τ.
The search is carried out using the Armijo-Wolfe rule.
Since the matrix inversion is costly, we use the e Sherman-Morrison-Woodbury formula.
( ) −1 T
We get Y (τ ) = X − τU I + τ2 V T U V X, where U = [ G, X ] and V = [ X, − G ] (this means
concatenation).
This reduces the matrix size from n × n to 2p × 2p.
6. Curvilinear Search
3
Curvilinear search is just traditional linear search applied to the curve Y (τ ). The search termi-
nates when the Armijo-Wolfe conditions are satisfied. The Armijo-Wolfe conditions require two
parameters 0 < ρ1 < ρ2 < 1.
Curvilinear Search: Initialize τ to a non-zero value.
$ \begin{array}{l} Until { F (Y (τ )) ≤ F (Y (0)) + ρ
7.
6.
• Find constraints of the problem
Prove that the feasible set is a manifold
Find the tangent space of the manifold
Find/derive an inner product for the tangent space
Find a representation of the gradient and with this the descent curve
Use curvilinear search, i.e. gradient descent
The Cayley transformation is orthogonal:
• Q = ( I − A)( I + A)T = ( I + A)T ( I − A)
[ ]: