T. Gambill
f(t, x) = x_1 + x_2 t + x_3 t^2

so that

\[
A x = \begin{bmatrix}
1 & t_1 & t_1^2 \\
1 & t_2 & t_2^2 \\
\vdots & \vdots & \vdots \\
1 & t_{21} & t_{21}^2
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
\approx
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{21} \end{bmatrix}
= b
\]
The matrix A has the form of a Vandermonde matrix.
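As a concrete illustration (a NumPy sketch rather than the MATLAB used later in these notes; the data points are made up for the example), the Vandermonde system for a quadratic fit can be assembled and solved in the least squares sense:

```python
import numpy as np

# Hypothetical sample data: 21 points on an exact quadratic
t = np.linspace(0.0, 1.0, 21)
y = 2.0 + 3.0 * t - 1.5 * t**2

# Vandermonde matrix A with columns [1, t, t^2]
A = np.column_stack([np.ones_like(t), t, t**2])

# Least squares solution x minimizing ||b - A x||_2
x, *_ = np.linalg.lstsq(A, y, rcond=None)
print(x)  # recovers the coefficients [2.0, 3.0, -1.5]
```

Because the data lie exactly on a quadratic, the fit recovers the coefficients exactly; with noisy data the same call returns the best fit in the 2-norm.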
Definition
The residual vector is
r = b − Ax.
The least squares solution is obtained by minimizing the square of the 2-norm of the residual, ||r||_2^2.
Definition
The system of normal equations is given by
AT Ax = AT b.
The normal equations have a unique solution if rank(A) = n.
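A small NumPy sketch of the normal equations (the data here are made up for illustration): solving A^T A x = A^T b, and checking that the resulting residual is orthogonal to the columns of A.

```python
import numpy as np

# A small overdetermined system (illustrative data)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

# Solve the normal equations A^T A x = A^T b directly
x = np.linalg.solve(A.T @ A, A.T @ b)

# The residual r = b - A x is orthogonal to the columns of A
r = b - A @ x
print(A.T @ r)  # ~ [0, 0]
```

Note that forming A^T A explicitly is convenient but, as the conditioning discussion below shows, not the numerically safest way to solve a least squares problem.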
T. Gambill (UIUC) CS 357 March 15, 2011 6 / 22
Normal equations, a Geometric View
If the vector b is not in the span (the set of all linear combinations of vectors)
of the columns of A then in order to find the minimum distance from b to the
span of the columns of A we need to find x∗ such that r = (b − Ax∗ ) is
orthogonal to Ax for any x.
Since r must be orthogonal to every column of A, we have A^T r = A^T (b − A x^*) = 0, and thus

A^T b = A^T A x^*
Consider

\[
A = \begin{bmatrix} 1 & 1 \\ \epsilon & 0 \\ 0 & \epsilon \end{bmatrix}
\]

where \(0 < \epsilon < \sqrt{\epsilon_{\text{mach}}}\). The normal equations for this system are given by

\[
A^T A = \begin{bmatrix} 1 + \epsilon^2 & 1 \\ 1 & 1 + \epsilon^2 \end{bmatrix}
\]

which, since \(1 + \epsilon^2\) rounds to 1 in floating-point arithmetic, is computed as the singular matrix

\[
\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\]
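This failure is easy to reproduce (a NumPy sketch; the value of ε is chosen to satisfy 0 < ε < √ε_mach for double precision):

```python
import numpy as np

eps = 1e-9  # 0 < eps < sqrt(machine epsilon) ~ 1.5e-8 for float64

# The matrix from the example above
A = np.array([[1.0, 1.0],
              [eps, 0.0],
              [0.0, eps]])

# In exact arithmetic A has full rank 2, but 1 + eps**2 rounds to 1,
# so the computed A^T A is exactly the singular matrix of all ones
AtA = A.T @ A
print(AtA)                          # [[1. 1.] [1. 1.]]
print(np.linalg.matrix_rank(AtA))   # 1
```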
Definition
Let A ∈ R^{m×n} have rank(A) = n. Then we define the pseudo-inverse A^+ of A as

A^+ = (A^T A)^{-1} A^T

and we define the condition number of A as

cond(A) = ||A||_2 ||A^+||_2
Theorem
cond(A^T A) = (cond(A))^2
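The squaring of the condition number can be checked numerically (a NumPy sketch; the Vandermonde test matrix is an illustrative choice, not from the notes):

```python
import numpy as np

# A mildly ill-conditioned Vandermonde matrix (illustrative choice)
t = np.linspace(0.0, 1.0, 21)
A = np.column_stack([t**k for k in range(4)])

c  = np.linalg.cond(A)        # cond(A) in the 2-norm
c2 = np.linalg.cond(A.T @ A)  # cond(A^T A)

print(c**2, c2)  # the two values agree: cond(A^T A) = cond(A)^2
```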
How can we solve the least squares problem without squaring the condition of
the matrix?
QR factorization.
- For A ∈ R^{m×n}, factor A = QR where
  - Q is an m × m orthogonal matrix
  - R is an m × n upper triangular matrix (since R is m × n upper triangular we can write

    \[
    R = \begin{bmatrix} R' \\ 0 \end{bmatrix}
    \]

    where R' is n × n upper triangular and 0 is the (m − n) × n matrix of zeros)
Definition
A matrix Q is orthogonal if
QT Q = QQT = I
Since Q^T Q = I, orthogonal matrices preserve the Euclidean norm: ||Qv||_2^2 = v^T Q^T Q v = ||v||_2^2. We can therefore apply orthogonal matrices to the residual vector without changing its norm. Note that A is m × n, Q is m × m, R' is n × n, x is n × 1 and b is m × 1.
\[
\|r\|_2^2 = \|b - Ax\|_2^2
= \left\| b - Q \begin{bmatrix} R' \\ 0 \end{bmatrix} x \right\|_2^2
= \left\| Q^T b - Q^T Q \begin{bmatrix} R' \\ 0 \end{bmatrix} x \right\|_2^2
= \left\| Q^T b - \begin{bmatrix} R' \\ 0 \end{bmatrix} x \right\|_2^2
\]

If \(Q^T b = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}\) where c_1 is an n × 1 vector, then

\[
\left\| Q^T b - \begin{bmatrix} R' \\ 0 \end{bmatrix} x \right\|_2^2
= \left\| \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} - \begin{bmatrix} R' x \\ 0 \end{bmatrix} \right\|_2^2
= \|c_1 - R' x\|_2^2 + \|c_2\|_2^2
\]

So the least squares solution solves the n × n triangular system R' x = c_1, and the minimum residual norm is ||c_2||_2.
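The QR-based solve above can be sketched with NumPy (the data are made up for the example):

```python
import numpy as np

# Overdetermined system (illustrative data)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.0, 1.0, 1.0, 3.0])

# Full QR: Q is m x m, R is m x n with R = [R'; 0]
Q, R = np.linalg.qr(A, mode='complete')
n = A.shape[1]

c = Q.T @ b
c1, c2 = c[:n], c[n:]

# Solve R' x = c1; the minimum residual norm equals ||c2||_2
x = np.linalg.solve(R[:n, :n], c1)
print(x)
print(np.isclose(np.linalg.norm(b - A @ x), np.linalg.norm(c2)))  # True
```

No A^T A is ever formed, so the conditioning of the problem is not squared.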
For each column of A, subtract out its component in the previously computed columns. For the simple case of two vectors {a_1, a_2}, first normalize a_1 to obtain

q_1 = a_1 / ||a_1||.

In general,

a_1 = q_1 r_{11}
a_2 = q_1 r_{12} + q_2 r_{22}
⋮
a_n = q_1 r_{1n} + q_2 r_{2n} + ... + q_n r_{nn}
function [Q, R] = gram_schmidt(A)
% Gram-Schmidt QR factorization (modified form): A = Q*R,
% where Q has orthonormal columns and R is upper triangular.
m = size(A,1);
n = size(A,2);
Q = zeros(m,n);
R = zeros(n,n);
for i = 1:n
    R(i,i) = norm(A(:,i),2);
    Q(:,i) = A(:,i)./R(i,i);
    for j = i+1:n
        R(i,j) = Q(:,i)' * A(:,j);        % component of a_j along q_i
        A(:,j) = A(:,j) - R(i,j)*Q(:,i);  % subtract it out
    end
end
end
Assume that A has rank k (and hence k nonzero singular values σi ) and recall
that we want to minimize
\[
\|r\|_2^2 = \|b - Ax\|_2^2.
\]

Substituting the SVD A = USV^T, and writing c = U^T b and y = V^T x, we find that

\[
\|r\|_2^2 = \sum_{i=1}^{k} (c_i - \sigma_i y_i)^2 + \sum_{i=k+1}^{m} c_i^2
\]

which is minimized when y_i = c_i / σ_i for 1 ≤ i ≤ k.
Theorem
Let A be an m × n matrix of rank r and let A = USV T , the singular value
decomposition. The least squares solution of the system Ax = b is
\[
x = \sum_{i=1}^{r} \sigma_i^{-1} c_i \, v_i
\]

where c_i = u_i^T b.
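The SVD formula can be sketched directly in NumPy (same illustrative data as in the QR example; the rank tolerance 1e-12 is an assumption for the sketch):

```python
import numpy as np

# Overdetermined system (illustrative data)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.0, 1.0, 1.0, 3.0])

# Reduced SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = np.sum(s > 1e-12)  # numerical rank (tolerance is an assumption)

# x = sum_{i=1}^{r} (c_i / sigma_i) v_i  with  c_i = u_i^T b
c = U.T @ b
x = Vt.T[:, :r] @ (c[:r] / s[:r])

print(x)  # matches the least squares solution from lstsq
```

Unlike QR, this formula also handles rank-deficient A gracefully: terms with σ_i = 0 are simply dropped from the sum.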