
Orthogonal triangularization: the Householder method

The Gram-Schmidt process is “triangular orthogonalization”. That is, you start with a matrix A and
multiply on the right by a series of upper-triangular matrices until you obtain an orthogonal one:
A R_1 R_2 · · · R_k = Q. (1)
This yields the QR-factors of A:
A = Q(R_k^{-1} · · · R_2^{-1} R_1^{-1}) = QR. (2)
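To make equations (1) and (2) concrete, here is a sketch of classical Gram-Schmidt in Python/NumPy (the function and variable names are ours, not from the notes): each column of A is orthogonalized against the previously computed columns of Q, which amounts to a right-multiplication by an upper-triangular matrix at each step.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt: returns Q with orthonormal columns and
    upper-triangular R such that A = Q R.  (A sketch, names are ours.)"""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # coefficient of q_i in column j
            v -= R[i, j] * Q[:, i]        # subtract the projection onto q_i
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]             # normalize what remains
    return Q, R
```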
Householder’s idea is instead to pursue a method of “orthogonal triangularization”. That is, you
start with a matrix A and multiply on the left by a series of orthogonal matrices until you obtain an
upper-triangular one:
Q_k · · · Q_2 Q_1 A = R. (3)
This also yields the QR-factors of A:
A = (Q_1^* Q_2^* · · · Q_k^*) R = QR. (4)
In the Householder method the Q_i take a particularly nice form: they are reflectors. This means in
particular that Q_i = Q_i^* = Q_i^{-1}. Now reflectors are closely related to orthogonal projectors. Indeed, if
P is the orthogonal projector onto the line spanned by u, then x 7→ x − 2Px is the reflection of x across
H = u^⊥, the hyperplane orthogonal to u. (A hyperplane is a subspace of codimension 1.)
[Figure: the reflection of x across the hyperplane H = u^⊥, showing the projection Px, the point x − Px on H, and the reflection x − 2Px, relative to span{u}.]
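The map x 7→ x − 2Px can be sketched directly (the helper name `reflect` is ours):

```python
import numpy as np

def reflect(x, u):
    """Reflection of x across the hyperplane orthogonal to u:
    P = u u* is the orthogonal projector onto span{u} for unit u,
    and x - 2Px is the mirror image of x across that hyperplane."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)      # normalize so that P = u u*
    Px = (u @ x) * u               # orthogonal projection of x onto span{u}
    return x - 2.0 * Px
```

Applying the reflection twice returns x, which is one way to see that a reflector is its own inverse (and, being self-adjoint, its own adjoint as well).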
The goal of the first Householder reflector is to carry A_1 (the first column of A) onto ±‖A_1‖_2 e_1
(where e_1 is the first standard basis vector). The unit vector u for this reflector must be in the span of
v = 2Px = A_1 ∓ ‖A_1‖_2 e_1.
Which sign do we take? We take the one for which the target ±‖A_1‖_2 e_1 is farthest from x = A_1. Why do this? For reasons of
numerical stability. The point is that to compute u we first compute v and then divide by its length, so
we want ‖v‖_2 to be as large as possible, avoiding cancellation. Note that the choice of sign which moves x the farthest is the
opposite of the sign of its first coordinate, A_11; equivalently, v = A_1 + sign(A_11)‖A_1‖_2 e_1.
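A small numerical illustration of the cancellation (the vector x below is our own example, not from the notes): when x is nearly parallel to e_1, the wrong sign annihilates almost all of v, while the recommended sign keeps ‖v‖_2 large.

```python
import numpy as np

# x is nearly parallel to e_1: the risky case for the sign choice.
x = np.array([1.0, 1e-8])
e1 = np.array([1.0, 0.0])
nrm = np.linalg.norm(x)

v_good = x + np.sign(x[0]) * nrm * e1   # recommended: reflect onto -sign(x_1)||x|| e_1
v_bad  = x - np.sign(x[0]) * nrm * e1   # opposite sign: catastrophic cancellation in v[0]
```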
Now that we have reflected the first column of A into place we apply this same procedure recursively
to the submatrix which ignores the first row and first column of A. Here is the algorithm:
for j = 1:n
    x = A(j:m,j);                          % active part of column j
    x(1) = x(1) + sign(x(1))*norm(x);      % v = x + sign(x_1)*norm(x)*e_1; use +1 when x(1) = 0, since sign(0) = 0 in MATLAB
    u = x/norm(x);                         % normalize to get the unit vector u
    A(j:m,j:n) = A(j:m,j:n) - 2*u*( u'*A(j:m,j:n) );   % apply the reflector I - 2uu* to the active submatrix
end
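For experimentation, here is a direct translation of the algorithm above into Python/NumPy (a sketch; the function and variable names are ours). It records the unit vectors u as it goes, since they are needed later.

```python
import numpy as np

def householder(A):
    """Householder triangularization: overwrite a copy of A with R and
    return the list of Householder unit vectors u_1, ..., u_n."""
    A = np.asarray(A, dtype=float).copy()
    m, n = A.shape
    us = []
    for j in range(n):
        x = A[j:, j].copy()
        sgn = 1.0 if x[0] >= 0 else -1.0     # use +1 when x[0] == 0
        x[0] += sgn * np.linalg.norm(x)      # v = x + sign(x_1)||x|| e_1
        u = x / np.linalg.norm(x)            # unit Householder vector
        A[j:, j:] -= 2.0 * np.outer(u, u @ A[j:, j:])   # apply I - 2uu*
        us.append(u)
    return us, np.triu(A)
```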
At the end of this algorithm we arrive at an upper triangular matrix, but we have not computed
an orthogonal matrix. The orthogonal matrix could be constructed from the sequence of unit vectors
computed. However this is not usually necessary. It is much more efficient simply to store the set of unit
vectors u and use them in sequence whenever we need to compute a product Q∗x or Qx. This method is
called “implicit multiplication”. It is in effect what the last line of the algorithm above does. If we really
need to compute Q from this information we can apply implicit multiplication to the identity matrix.
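Implicit multiplication can be sketched as follows (the function names are ours, and `us` denotes the list of unit vectors produced by the Householder algorithm): Q∗x applies the reflectors in the order they were computed, Qx applies them in reverse order, and Q itself, if really needed, comes from applying Qx to the columns of the identity.

```python
import numpy as np

def apply_Qstar(us, x):
    """Implicit Q* x: apply the reflectors in forward order."""
    x = np.asarray(x, dtype=float).copy()
    for j, u in enumerate(us):
        x[j:] -= 2.0 * u * (u @ x[j:])
    return x

def apply_Q(us, x):
    """Implicit Q x: apply the reflectors in reverse order."""
    x = np.asarray(x, dtype=float).copy()
    for j in reversed(range(len(us))):
        u = us[j]
        x[j:] -= 2.0 * u * (u @ x[j:])
    return x

def form_Q(us, m):
    """Explicit Q, if really needed: implicit multiplication applied
    to the columns of the m-by-m identity matrix."""
    return np.column_stack([apply_Q(us, e) for e in np.eye(m)])
```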
This does raise a subtle point: if this algorithm doesn’t explicitly compute Q, what is meant by “the
computed value” of Q in the theorem on backward stability? It is the implicit algorithm! That is, the
Q̃ of that theorem is the exact product of the reflectors defined by the values of u that are computed
in the Householder algorithm.

Homework problems: due Friday, 31 March
1. Count flops for the Householder method, both theoretically and empirically. How does the efficiency
of this algorithm compare to that of the Gram-Schmidt process?
2. Count flops for the implicit multiplications Q∗ x and Qx, again both theoretically and empirically.
If we add to the Householder method the computation of Q — using implicit multiplication — how
much does this add to the flop count for the Householder method? How does the efficiency of this
combined algorithm compare to that of the Gram-Schmidt process?
