
2072U Computational Science I

Winter 2024
Week Topic
1 Introduction
2–3 Solving nonlinear equations in one variable
4–5 Solving systems of (non)linear equations
6 Computational complexity
6–8 Interpolation and least squares
9–10 Differentiation
10 Integration
11–12 Initial value problems & APS
12 Revision
Outline

1. Least Squares: introduction

2. Overdetermined systems of linear equations

3. Overdetermined systems with NumPy

4. A first-order least-squares fit

2072U, Winter 2024 1 / 15


Today’s questions:
◮ Why use least squares instead of interpolation?
◮ What type of equations do we need to solve?
◮ What is “least”? What do we minimise?
◮ How can we do this with NumPy?
◮ What equations do we get for a linear approximation?

2072U, Winter 2024 2 / 15


Least Squares: introduction
◮ Interpolation is useful for approximating smooth functions, e.g.
  ◮ hard-to-evaluate functions (sin, cos, arctanh, . . . ),
  ◮ solutions to differential or functional equations.
◮ Existence of the interpolant is guaranteed.
◮ Many methods exist (Vandermonde matrix, divided differences, cardinal functions, . . . ).
◮ We have an explicit upper bound for the interpolation error.

[Figure: a smooth function and its interpolant]

2072U, Winter 2024 3 / 15


Least Squares: introduction
◮ Interpolation does not make sense for data with noise. In real-world applications, noise stems from
  ◮ uncertainty in measurements,
  ◮ accumulating numerical error,
  ◮ stochastic processes,
  ◮ . . .
◮ Forcing the approximant to interpolate through noisy data gives strange results.

[Figure: an interpolant forced through noisy data]

2072U, Winter 2024 4 / 15


Least Squares: introduction
◮ Instead, we could try to find a low-order polynomial approximant that interpolates the data.
◮ The set of equations for this approximant will be overdetermined.
◮ We can only find an approximate solution to these equations.
◮ We want to do this in a unique, optimal way.
◮ This leads to the least-squares solution.

[Figure: noisy data with a low-order polynomial fit]

2072U, Winter 2024 5 / 15


Overdetermined systems . . .

There are more conditions of the form P(x_i) = y_i than there are coefficients in P. This leads to an overdetermined system of linear equations: a system A~x = ~b where

◮ A ∈ R^{N×M} (i.e., A is "tall and thin")
◮ ~b ∈ R^{N×1} (RHS vector)
◮ ~x ∈ R^{M×1} (vector of unknowns)
◮ N > M (more equations than unknowns)

2072U, Winter 2024 6 / 15


Overdetermined systems . . .

An overdetermined linear system in R^2:

    x_1 − 2x_2 = 1/2
    x_1 −  x_2 = 1
    x_1 +  x_2 = −1

◮ N = 3 equations
◮ M = 2 unknowns
◮ Overdetermined (N > M)
◮ No solution exists
◮ In matrix form,

    [ 1  −2 ]            [ 1/2 ]
    [ 1  −1 ] [ x_1 ]  = [  1  ]
    [ 1   1 ] [ x_2 ]    [ −1  ]

[Figure: the three lines in the (x_1, x_2)-plane]

2072U, Winter 2024 7 / 15
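
A quick numerical check (a minimal sketch, not part of the original slides): intersecting the three lines pairwise with NumPy gives three different points, so no single ~x can satisfy all three equations at once.

    import numpy as np

    A = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0]])
    b = np.array([0.5, 1.0, -1.0])
    # intersect each pair of lines; three distinct points => no common solution
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        print(np.linalg.solve(A[[i, j]], b[[i, j]]))
    # prints [1.5 0.5], [-0.5 -0.5], [0. -1.]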


Overdetermined systems . . .

Reminder: Residuals and norms on R^N.

When solving

    A~y = ~b,   ~y ∈ R^M, A ∈ R^{N×M}, ~b ∈ R^N,

the residual of ~y is

    ~r(~y) = ~b − A~y,

"the amount by which ~y fails to satisfy the system A~x = ~b".

Recall: to quantify the size of vectors, we introduce norms:

◮ ℓ1-norm: ‖~x‖_1 := Σ_{k=1}^{N} |x_k| for any ~x ∈ R^N
◮ ℓ2-norm: ‖~x‖_2 := [ Σ_{k=1}^{N} |x_k|^2 ]^{1/2} for any ~x ∈ R^N
◮ ℓ∞-norm: ‖~x‖_∞ := max_{1≤k≤N} |x_k| for any ~x ∈ R^N

2072U, Winter 2024 8 / 15
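
These norms can be evaluated with numpy.linalg.norm (a minimal sketch):

    import numpy as np

    x = np.array([3.0, -4.0, 1.0])
    print(np.linalg.norm(x, 1))       # l1-norm:  |3| + |-4| + |1| = 8
    print(np.linalg.norm(x, 2))       # l2-norm:  sqrt(9 + 16 + 1) ≈ 5.099
    print(np.linalg.norm(x, np.inf))  # l∞-norm:  max(3, 4, 1) = 4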


Overdetermined systems . . .

How to “solve” an overdetermined system?


◮ A~x = ~b has a solution ~x ∈ R^M iff ~b ∈ range(A).
◮ Generically, no solution exists for overdetermined systems.
⇒ Goal: find a vector ~x* ∈ R^M that minimises the size of the residual ~r(~x*).
◮ To minimise the residual, measure its size with some norm:

    min_{~x ∈ R^M} ‖~r(~x)‖ = min_{~x ∈ R^M} ‖~b − A~x‖

◮ ~x* is a minimiser of ‖~r(~x)‖ in that norm, i.e.,

    ‖~r(~x*)‖ ≤ ‖~r(~y)‖ for all ~y ∈ R^M

2072U, Winter 2024 9 / 15
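
To make the goal concrete, here is a small sketch comparing residual sizes for two candidate vectors of the earlier 3 × 2 example; the second candidate is the least-squares solution computed on a later slide, and it has the smaller ℓ2-residual:

    import numpy as np

    A = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0]])
    b = np.array([0.5, 1.0, -1.0])
    for y in (np.array([1.5, 0.5]), np.array([-3/14, -4/7])):
        r = b - A @ y                   # residual of candidate y
        print(y, np.linalg.norm(r, 2))  # norms ≈ 3.0 and ≈ 0.802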


Overdetermined systems . . .

Conceivable objective functions to minimise:

◮ Using the ℓ∞-norm, define

    Φ_∞(~y) := ‖~b − A~y‖_∞ = max_{1≤k≤N} | b_k − Σ_{ℓ=1}^{M} A_{kℓ} y_ℓ |
             = maximum absolute value of the components of the residual.

◮ Using the ℓ1-norm, define

    Φ_1(~y) := (1/N) ‖~b − A~y‖_1 = (1/N) Σ_{k=1}^{N} | b_k − Σ_{ℓ=1}^{M} A_{kℓ} y_ℓ |
             = average of the absolute values of the residual components.

◮ Both of these norms are nonsmooth (nondifferentiable), which makes the objectives awkward to minimise; this motivates the smooth least-squares objective on the next slide.

2072U, Winter 2024 10 / 15
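
For reference, both objectives are straightforward to evaluate in NumPy (a minimal sketch following the definitions above):

    import numpy as np

    def phi_inf(A, b, y):
        # maximum absolute component of the residual
        return np.linalg.norm(b - A @ y, np.inf)

    def phi_1(A, b, y):
        # average absolute value of the residual components
        return np.linalg.norm(b - A @ y, 1) / len(b)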


Overdetermined systems . . .

Least-squares approximation of A~x = ~b

Define

    Φ(~y) := (1/N) ‖~b − A~y‖_2^2
           = (1/N) Σ_{k=1}^{N} ( b_k − Σ_{ℓ=1}^{M} A_{kℓ} y_ℓ )^2
           = mean-square residual of ~y.

A least-squares approximation of a solution of A~x = ~b is a vector ~x* that minimises Φ over R^M, i.e., such that Φ(~x*) ≤ Φ(~y) for all ~y ∈ R^M.

2072U, Winter 2024 11 / 15
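
The least-squares objective translates directly into code (a minimal sketch of Φ as defined above):

    import numpy as np

    def phi(A, b, y):
        # mean-square residual: (1/N) * ||b - A y||_2^2
        r = b - A @ y
        return (r @ r) / len(b)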


Overdetermined systems . . .

How to minimise this objective function?


◮ The objective Φ for the least-squares approximation is differentiable.
◮ Condition for ~x* minimising Φ: it is a critical point,

    ∂Φ/∂x_k (~x*) = 0   (k = 1 : M)

◮ For N > M with A of full rank, the critical point ~x* of Φ is

    ~x* = (A^T A)^{−1} (A^T ~b),

  i.e., ~x* is determined by solving the normal equations

    A^T A ~x* = A^T ~b

◮ The coefficient matrix A^T A is square (M × M), symmetric, and (for full-rank A) positive definite, so this critical point is the unique minimiser.

2072U, Winter 2024 12 / 15
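
The normal equations translate directly into NumPy (a sketch; adequate for small, well-conditioned problems, while numpy.linalg.lstsq, shown on a later slide, is the more robust route):

    import numpy as np

    def normal_equations_solve(A, b):
        # least-squares solution of the overdetermined system A x = b
        # via the M x M normal equations  A^T A x = A^T b
        return np.linalg.solve(A.T @ A, A.T @ b)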


Overdetermined systems . . .

Previous example: the preceding system A~x = ~b is

    [ 1  −2 ]            [ 1/2 ]
    [ 1  −1 ] [ x_1 ]  = [  1  ]
    [ 1   1 ] [ x_2 ]    [ −1  ]

The normal equations A^T A ~x* = A^T ~b are

    [  1   1  1 ] [ 1  −2 ]             [  1   1  1 ] [ 1/2 ]
    [ −2  −1  1 ] [ 1  −1 ] [ x_1* ]  = [ −2  −1  1 ] [  1  ]
                  [ 1   1 ] [ x_2* ]                  [ −1  ]

or, after multiplying out,

    [  3  −2 ] [ x_1* ]   [ 1/2 ]
    [ −2   6 ] [ x_2* ] = [ −3  ]

    ⇒  ~x* = [ x_1* ] = [ −3/14 ]
             [ x_2* ]   [ −4/7  ]

2072U, Winter 2024 13 / 15
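
The hand computation can be checked numerically (a minimal sketch):

    import numpy as np

    A = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0]])
    b = np.array([0.5, 1.0, -1.0])
    AtA = A.T @ A                      # [[3, -2], [-2, 6]]
    Atb = A.T @ b                      # [0.5, -3]
    print(np.linalg.solve(AtA, Atb))   # [-3/14, -4/7] ≈ [-0.2143, -0.5714]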


Overdetermined systems with NumPy

Least-squares approximation & normal equations


◮ numpy.linalg.lstsq(A, b) computes least-squares approximations of overdetermined systems:

    import numpy as np

    A = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0]])
    b = np.array([0.5, 1.0, -1.0])
    # lstsq returns a 4-tuple: (solution, residuals, rank, singular values)
    xstar, res, rank, sv = np.linalg.lstsq(A, b, rcond=None)

What you need to understand:


◮ Use numpy.linalg.lstsq(A, b) for tall/thin systems.
◮ It also returns the sum of squared residuals ‖~b − A~x*‖_2^2 (divide by N to get the mean-square residual Φ(~x*)).

2072U, Winter 2024 14 / 15
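
As a sanity check (a sketch, reusing A and b from the snippet above), the lstsq solution agrees with the normal-equations solution:

    xstar, res, rank, sv = np.linalg.lstsq(A, b, rcond=None)
    print(np.allclose(xstar, np.linalg.solve(A.T @ A, A.T @ b)))  # True
    print(res / len(b))  # mean-square residual (res is the sum of squares)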


A first-order least-squares fit

Least-squares fit with straight line:


◮ Let Π_1(x) be a straight line, i.e., Π_1(x) = a_0 + a_1 x.
◮ Given {(x_k, y_k)}_{k=0}^{n}, the interpolation conditions are

    Π_1(x_k) = a_0 + a_1 x_k = y_k   (k = 0 : n)

◮ Write this system of equations in matrix form:

    1 · a_0 + x_0 · a_1 = y_0         [ 1  x_0 ]           [ y_0 ]
    1 · a_0 + x_1 · a_1 = y_1         [ 1  x_1 ] [ a_0 ]   [ y_1 ]
      ...                        ⇒    [  ...   ] [ a_1 ] = [ ... ]
    1 · a_0 + x_n · a_1 = y_n         [ 1  x_n ]           [ y_n ]

  i.e., V~a = ~y with V ∈ R^{(n+1)×2}, ~a ∈ R^{2×1}, ~y ∈ R^{(n+1)×1}.

◮ The least-squares approximant is found by solving the 2 × 2 problem V^T V ~a = V^T ~y (see the sketch after this slide).
2072U, Winter 2024 15 / 15
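
A minimal end-to-end sketch of the first-order fit, assuming some synthetic noisy data (the data, slope, and noise level here are made up for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-3.0, 4.0, 20)
    y = 2.0 * x - 1.0 + rng.normal(scale=0.5, size=x.size)  # noisy samples of a line

    V = np.column_stack([np.ones_like(x), x])  # rows [1, x_k]
    a = np.linalg.solve(V.T @ V, V.T @ y)      # 2x2 normal equations V^T V a = V^T y
    print(a)  # recovered coefficients, approximately [-1, 2]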
