Lecture 11
J. K. Verma
Department of Mathematics
Indian Institute of Technology Bombay
Least squares approximation
1 Suppose we are given data points $(x_1, y_1), \dots, (x_n, y_n)$ and we seek a straight line $y(x) = s + tx$ such that
$$y(x_i) = y_i, \quad i = 1, \dots, n.$$
2 Due to uncertainty in data and experimental error, in practice the points will deviate somewhat from a straight line, so it is impossible to find a linear $y(x)$ that passes through all of them.
3 So we seek a line that fits the data well, in the sense that the errors are made as small as possible.
Least squares approximation
1 A natural question that arises now is: how do we define the error? Consider the following system of linear equations in the variables $s$ and $t$, with known coefficients $x_i, y_i$, $i = 1, \dots, n$:
$$\begin{aligned} s + x_1 t &= y_1 \\ s + x_2 t &= y_2 \\ &\;\;\vdots \\ s + x_n t &= y_n \end{aligned}$$
2 Note that typically $n$ would be much greater than 2. If we can find $s$ and $t$ satisfying all these equations, then we have solved our problem.
3 However, for reasons mentioned above, this is not always possible.
Least squares approximation
1 For given $s$ and $t$, the error in the $i$th equation is $|y_i - s - x_i t|$.
2 There are several ways of combining the errors in the individual equations to
get a measure of the total error.
3 The following are three examples:
$$\sqrt{\sum_{i=1}^{n} (y_i - s - x_i t)^2}, \qquad \sum_{i=1}^{n} |y_i - s - x_i t|, \qquad \max_{1 \le i \le n} |y_i - s - x_i t|.$$
4 Both analytically and computationally, a nice theory exists for the first of these choices, and this is what we shall study; a numerical comparison of the three measures is sketched below.
5 The problem of finding $s, t$ so as to minimize $\sqrt{\sum_{i=1}^{n} (y_i - s - x_i t)^2}$ is the least squares problem.
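As a concrete illustration, here is a minimal sketch, assuming NumPy, that evaluates the three error measures for a candidate line $y = s + tx$; the data and the values of $s, t$ are made up for illustration and are not from the lecture.

```python
import numpy as np

# Made-up data points (x_i, y_i) and a candidate line y = s + t x.
xs = np.array([-1.0, 1.0, 2.0])
ys = np.array([1.0, 1.0, 3.0])
s, t = 1.0, 0.5

r = ys - s - t * xs  # residuals y_i - s - x_i t

print(np.sqrt(np.sum(r**2)))  # least squares error (2-norm of r)
print(np.sum(np.abs(r)))      # sum of absolute errors (1-norm of r)
print(np.max(np.abs(r)))      # maximum error (infinity-norm of r)
```

The three measures are exactly the 2-norm, 1-norm, and infinity-norm of the residual vector $r = (y_1 - s - x_1 t, \dots, y_n - s - x_n t)$.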
Least squares approximation
1 Suppose that
$$A = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \quad b = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad x = \begin{pmatrix} s \\ t \end{pmatrix}, \quad \text{so} \quad Ax = \begin{pmatrix} s + t x_1 \\ s + t x_2 \\ \vdots \\ s + t x_n \end{pmatrix}.$$
2 The least squares problem is to find an $x$ such that $\|b - Ax\|$ is minimized, i.e., to find an $x$ such that $Ax$ is the best approximation to $b$ in the column space $C(A)$ of $A$.
3 This is precisely the problem of finding $x$ such that $b - Ax \in C(A)^\perp$.
4 Note that $b - Ax \in C(A)^\perp \iff A^t(b - Ax) = 0 \iff A^t A x = A^t b$. These are the normal equations for the least squares problem.
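To make the normal equations concrete, here is a minimal sketch, assuming NumPy, that forms $A^t A x = A^t b$ and solves for the line $y = s + tx$; the data arrays are made up for illustration.

```python
import numpy as np

# Made-up data points (x_i, y_i); in practice these come from measurements.
xs = np.array([-1.0, 0.0, 1.0, 2.0])
ys = np.array([1.0, 0.5, 1.0, 3.0])

# Build A with a column of ones (for s) and a column of x-values (for t).
A = np.column_stack([np.ones_like(xs), xs])
b = ys

# Normal equations: (A^t A) x = A^t b.
s, t = np.linalg.solve(A.T @ A, A.T @ b)
print(f"best fit line: y = {s:.4f} + {t:.4f} x")
```

In numerical practice one would usually prefer `np.linalg.lstsq` or a QR-based solver over forming $A^t A$ explicitly, since $A^t A$ squares the condition number; the normal equations are solved here only to mirror the derivation on the slide.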
Least squares approximation
1 Example. Find $s, t$ such that the straight line $y = s + tx$ best fits the following data in the least squares sense: $y = 1$ at $x = -1$, $y = 1$ at $x = 1$, $y = 3$ at $x = 2$.
2 Project $b = \begin{pmatrix} 1 \\ 1 \\ 3 \end{pmatrix}$ onto the column space of $A = \begin{pmatrix} 1 & -1 \\ 1 & 1 \\ 1 & 2 \end{pmatrix}$.
3 Now $A^t A = \begin{pmatrix} 3 & 2 \\ 2 & 6 \end{pmatrix}$ and $A^t b = \begin{pmatrix} 5 \\ 6 \end{pmatrix}$. The normal equations are
$$\begin{pmatrix} 3 & 2 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} s \\ t \end{pmatrix} = \begin{pmatrix} 5 \\ 6 \end{pmatrix}.$$
4 Solving, $s = 9/7$ and $t = 4/7$, so the best fit line is $y = \frac{9}{7} + \frac{4}{7}x$.
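A quick numerical check of this example, a sketch assuming NumPy, comparing the normal-equation solution against NumPy's built-in least squares routine:

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [1.0,  1.0],
              [1.0,  2.0]])
b = np.array([1.0, 1.0, 3.0])

# Solve the normal equations (A^t A) x = A^t b.
s, t = np.linalg.solve(A.T @ A, A.T @ b)
print(s, t)  # 9/7 ≈ 1.2857, 4/7 ≈ 0.5714

# Cross-check with NumPy's least squares solver.
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x)  # same solution
```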
Fitting data points by a polynomial
More generally, we may fit the data by a polynomial of degree $m$:
$$y(x) = s_0 + s_1 x + s_2 x^2 + \cdots + s_m x^m.$$
Fitting data points by a parabola
To fit data points $(t_1, b_1), \dots, (t_m, b_m)$ by a parabola $b = c + dt + et^2$, we seek $c, d, e$ such that
$$\begin{aligned} c + d t_1 + e t_1^2 &= b_1 \\ c + d t_2 + e t_2^2 &= b_2 \\ &\;\;\vdots \\ c + d t_m + e t_m^2 &= b_m \end{aligned}$$
Fitting by a parabola
This is the system $Ax = b$ with
$$A = \begin{pmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ \vdots & \vdots & \vdots \\ 1 & t_m & t_m^2 \end{pmatrix}, \quad x = \begin{pmatrix} c \\ d \\ e \end{pmatrix}.$$
When the columns of $A$ are linearly independent, $A^t A$ is invertible and the normal equations give the least squares solution
$$x = (A^t A)^{-1} A^t b.$$
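A sketch of this parabola fit, assuming NumPy; the data below are made up for illustration.

```python
import numpy as np

# Made-up data points (t_i, b_i).
ts = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
bs = np.array([1.1, 2.9, 7.2, 13.1, 20.8])

# Columns 1, t, t^2 (a Vandermonde-style matrix).
A = np.column_stack([np.ones_like(ts), ts, ts**2])

# Least squares solution of A x = b via the normal equations.
c, d, e = np.linalg.solve(A.T @ A, A.T @ bs)
print(f"b(t) ≈ {c:.3f} + {d:.3f} t + {e:.3f} t^2")
```

NumPy's `np.polyfit(ts, bs, 2)` computes the same fit, with coefficients returned from the highest degree down.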
Review of orthogonal matrices
1 A real $n \times n$ matrix $Q$ is called orthogonal if $Q^t Q = I$.
2 A $2 \times 2$ orthogonal matrix has one of two forms:
$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad \text{or} \quad B = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}.$$
3 $T_A$ represents rotation of $\mathbb{R}^2$ by $\theta$ radians in the anticlockwise direction.
4 The matrix $B$ represents a reflection with respect to the line $y = \tan(\theta/2)\,x$.
5 Definition. A hyperplane in $\mathbb{R}^n$ is a subspace of dimension $n - 1$.
6 A linear map $T : \mathbb{R}^n \to \mathbb{R}^n$ is called a reflection with respect to a hyperplane $H$ if $Tu = -u$ whenever $u \perp H$ and $Tu = u$ for all $u \in H$.
7 Definition. Let $u$ be a unit vector in $\mathbb{R}^n$. The Householder matrix of $u$, for reflection with respect to $L(u)^\perp$, is $H = I - 2uu^t$. Hence $Hu = u - 2u(u^t u) = -u$, and if $w \perp u$ then $Hw = w - 2u(u^t w) = w$.
8 So $H$ induces reflection in the hyperplane perpendicular to the line $L(u)$.
9 Exercise. Show that $H$ is symmetric and orthogonal; a numerical check is sketched below.
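A minimal numerical check of the exercise, a sketch assuming NumPy; the vector $u$ is arbitrary.

```python
import numpy as np

# An arbitrary unit vector u in R^4.
u = np.array([1.0, 2.0, -1.0, 0.5])
u /= np.linalg.norm(u)

# Householder matrix H = I - 2 u u^t.
H = np.eye(len(u)) - 2.0 * np.outer(u, u)

print(np.allclose(H, H.T))               # H is symmetric
print(np.allclose(H.T @ H, np.eye(4)))   # H is orthogonal
print(np.allclose(H @ u, -u))            # H reflects u to -u
```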
The QR decomposition of a matrix
The QR decomposition and the normal equations
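If $A = QR$ where $Q$ has orthonormal columns ($Q^t Q = I$) and $R$ is upper triangular and invertible, the normal equations $A^t A x = A^t b$ become $R^t Q^t Q R x = R^t Q^t b$, i.e., $Rx = Q^t b$. A minimal sketch of this route, assuming NumPy and reusing the data from the example above:

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [1.0,  1.0],
              [1.0,  2.0]])
b = np.array([1.0, 1.0, 3.0])

# Reduced QR decomposition: Q has orthonormal columns, R is upper triangular.
Q, R = np.linalg.qr(A)

# The normal equations A^t A x = A^t b reduce to R x = Q^t b.
x = np.linalg.solve(R, Q.T @ b)
print(x)  # same line as before: s = 9/7, t = 4/7
```

Solving $Rx = Q^t b$ avoids forming $A^t A$, which is why QR-based solvers are preferred numerically.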