
2072U Computational Science I

Winter 2024
Week Topic
1 Introduction
2–3 Solving nonlinear equations in one variable
4–5 Solving systems of (non)linear equations
6 Computational complexity
6–8 Interpolation and least squares
9–10 Differentiation
10 Integration
11–12 Initial value problems & APS
12 Revision
Outline

1. Least Squares: introduction

2. Overdetermined systems of linear equations

3. Overdetermined systems with NumPy

4. A first-order least-squares fit

2072U, Winter 2024 1 / 15


Today’s questions:
◮ Why use least squares instead of interpolation?
◮ What type of equations do we need to solve?
◮ What is “least”? What do we minimise?
◮ How can we do this with NumPy?
◮ What equations do we get for a linear approximation?

2072U, Winter 2024 2 / 15


Least Squares: introduction
◮ Interpolation is useful for approximating smooth functions, e.g.
  ◮ hard-to-evaluate functions (sin, cos, arctanh, . . . ),
  ◮ solutions to differential or functional equations.
◮ Existence of the interpolant is guaranteed.
◮ Many methods exist (Vandermonde matrix, divided differences, cardinal functions, . . . ).
◮ We have an explicit upper bound for the interpolation error.

[Figure: a smooth function and its interpolant]

2072U, Winter 2024 3 / 15


Least Squares: introduction
◮ Interpolation does not make sense for data with noise. In real-world applications, noise stems from
  ◮ uncertainty in measurements,
  ◮ accumulating numerical error,
  ◮ stochastic processes,
  ◮ . . .
◮ Forcing the approximant to interpolate through noisy data gives strange results.

[Figure: an interpolant forced through noisy data]

2072U, Winter 2024 4 / 15


Least Squares: introduction
◮ Instead, we could try to find a low-order polynomial approximant that interpolates the data.
◮ The set of equations for this approximant will be overdetermined.
◮ We can only find an approximate solution to these equations.
◮ We want to do this in a unique, optimal way.
◮ This leads to the least-squares solution.

[Figure: noisy data with a low-order polynomial fit]

2072U, Winter 2024 5 / 15


Overdetermined systems . . .

There are more conditions of the form P(x_i) = y_i than there are coefficients in P. This leads to an overdetermined system of linear equations: a system A~x = ~b where

◮ A ∈ R^{N×M} (i.e., A is "tall and thin")
◮ ~b ∈ R^{N×1} (RHS vector)
◮ ~x ∈ R^{M×1} (vector of unknowns)
◮ N > M (more equations than unknowns)

2072U, Winter 2024 6 / 15


Overdetermined systems . . .

An overdetermined linear system in R^2:

    x_1 − 2x_2 = 1/2
    x_1 −  x_2 = 1
    x_1 +  x_2 = −1

◮ N = 3 equations
◮ M = 2 unknowns
◮ Overdetermined (N > M)
◮ No solution exists
◮ In matrix form,

    [ 1  −2 ]            [ 1/2 ]
    [ 1  −1 ] [ x_1 ]  = [  1  ]
    [ 1   1 ] [ x_2 ]    [ −1  ]

[Figure: the three lines in the (x_1, x_2)-plane]

2072U, Winter 2024 7 / 15
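
A quick numerical check (a minimal sketch, not part of the original slides): intersecting the three lines pairwise with NumPy gives three different points, so no single ~x can satisfy all three equations at once.

    import numpy as np

    A = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0]])
    b = np.array([0.5, 1.0, -1.0])
    # intersect each pair of lines; three distinct points => no common solution
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        print(np.linalg.solve(A[[i, j]], b[[i, j]]))
    # prints [1.5 0.5], [-0.5 -0.5], [0. -1.]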


Overdetermined systems . . .

Reminder: Residuals and norms on R^N.

When solving

    A~y = ~b,   ~y ∈ R^M, A ∈ R^{N×M}, ~b ∈ R^N,

the residual of ~y is

    ~r(~y) = ~b − A~y,

"the amount by which ~y fails to satisfy the system A~x = ~b".

Recall: to quantify the size of vectors, we introduce norms:

◮ ℓ1-norm: ‖~x‖_1 := Σ_{k=1}^{N} |x_k| for any ~x ∈ R^N
◮ ℓ2-norm: ‖~x‖_2 := [ Σ_{k=1}^{N} |x_k|^2 ]^{1/2} for any ~x ∈ R^N
◮ ℓ∞-norm: ‖~x‖_∞ := max_{1≤k≤N} |x_k| for any ~x ∈ R^N

2072U, Winter 2024 8 / 15
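
These norms can be evaluated with numpy.linalg.norm (a minimal sketch):

    import numpy as np

    x = np.array([3.0, -4.0, 1.0])
    print(np.linalg.norm(x, 1))       # l1-norm:  |3| + |-4| + |1| = 8
    print(np.linalg.norm(x, 2))       # l2-norm:  sqrt(9 + 16 + 1) ≈ 5.099
    print(np.linalg.norm(x, np.inf))  # l∞-norm:  max(3, 4, 1) = 4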


Overdetermined systems . . .

How to “solve” an overdetermined system?


◮ A~x = ~b has a solution ~x ∈ R^M iff ~b ∈ range(A).
◮ Generically, no solution exists for overdetermined systems.
⇒ Goal: find a vector ~x* ∈ R^M that minimises the size of the residual ~r(~x*).
◮ To minimise the residual, measure its size with some norm:

    min_{~x ∈ R^M} ‖~r(~x)‖ = min_{~x ∈ R^M} ‖~b − A~x‖

◮ ~x* is a minimiser of ‖~r(~x)‖ in that norm, i.e.,

    ‖~r(~x*)‖ ≤ ‖~r(~y)‖ for all ~y ∈ R^M

2072U, Winter 2024 9 / 15
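
To make the goal concrete, here is a small sketch comparing residual sizes for two candidate vectors of the earlier 3 × 2 example; the second candidate is the least-squares solution computed on a later slide, and it has the smaller ℓ2-residual:

    import numpy as np

    A = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0]])
    b = np.array([0.5, 1.0, -1.0])
    for y in (np.array([1.5, 0.5]), np.array([-3/14, -4/7])):
        r = b - A @ y                   # residual of candidate y
        print(y, np.linalg.norm(r, 2))  # norms ≈ 3.0 and ≈ 0.802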


Overdetermined systems . . .

Conceivable objective functions to minimise:

◮ Using the ℓ∞-norm, define

    Φ_∞(~y) := ‖~b − A~y‖_∞ = max_{1≤k≤N} | b_k − Σ_{ℓ=1}^{M} A_{kℓ} y_ℓ |
             = maximum absolute value of the components of the residual.

◮ Using the ℓ1-norm, define

    Φ_1(~y) := (1/N) ‖~b − A~y‖_1 = (1/N) Σ_{k=1}^{N} | b_k − Σ_{ℓ=1}^{M} A_{kℓ} y_ℓ |
             = average of the absolute values of the residual components.

◮ Both of these norms are nonsmooth (nondifferentiable), which makes the objectives awkward to minimise; this motivates the smooth least-squares objective on the next slide.

2072U, Winter 2024 10 / 15
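
For reference, both objectives are straightforward to evaluate in NumPy (a minimal sketch following the definitions above):

    import numpy as np

    def phi_inf(A, b, y):
        # maximum absolute component of the residual
        return np.linalg.norm(b - A @ y, np.inf)

    def phi_1(A, b, y):
        # average absolute value of the residual components
        return np.linalg.norm(b - A @ y, 1) / len(b)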


Overdetermined systems . . .

Least-squares approximation of A~x = ~b

Define

    Φ(~y) := (1/N) ‖~b − A~y‖_2^2
           = (1/N) Σ_{k=1}^{N} ( b_k − Σ_{ℓ=1}^{M} A_{kℓ} y_ℓ )^2
           = mean-square residual of ~y.

A least-squares approximation of a solution of A~x = ~b is a vector ~x* that minimises Φ over R^M, i.e., such that Φ(~x*) ≤ Φ(~y) for all ~y ∈ R^M.

2072U, Winter 2024 11 / 15
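
The least-squares objective translates directly into code (a minimal sketch of Φ as defined above):

    import numpy as np

    def phi(A, b, y):
        # mean-square residual: (1/N) * ||b - A y||_2^2
        r = b - A @ y
        return (r @ r) / len(b)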


Overdetermined systems . . .

How to minimise this objective function?


◮ The objective Φ for the least-squares approximation is differentiable.
◮ Condition for ~x* minimising Φ: it is a critical point,

    ∂Φ/∂x_k (~x*) = 0   (k = 1 : M)

◮ For N > M with A of full rank, the critical point ~x* of Φ is

    ~x* = (A^T A)^{−1} (A^T ~b),

  i.e., ~x* is determined by solving the normal equations

    A^T A ~x* = A^T ~b

◮ The coefficient matrix A^T A is square (M × M), symmetric, and (for full-rank A) positive definite, so this critical point is the unique minimiser.

2072U, Winter 2024 12 / 15
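
The normal equations translate directly into NumPy (a sketch; adequate for small, well-conditioned problems, while numpy.linalg.lstsq, shown on a later slide, is the more robust route):

    import numpy as np

    def normal_equations_solve(A, b):
        # least-squares solution of the overdetermined system A x = b
        # via the M x M normal equations  A^T A x = A^T b
        return np.linalg.solve(A.T @ A, A.T @ b)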


Overdetermined systems . . .

Previous example: the preceding system A~x = ~b is

    [ 1  −2 ]            [ 1/2 ]
    [ 1  −1 ] [ x_1 ]  = [  1  ]
    [ 1   1 ] [ x_2 ]    [ −1  ]

The normal equations A^T A ~x* = A^T ~b are

    [  1   1  1 ] [ 1  −2 ]             [  1   1  1 ] [ 1/2 ]
    [ −2  −1  1 ] [ 1  −1 ] [ x_1* ]  = [ −2  −1  1 ] [  1  ]
                  [ 1   1 ] [ x_2* ]                  [ −1  ]

or, after multiplying out,

    [  3  −2 ] [ x_1* ]   [ 1/2 ]
    [ −2   6 ] [ x_2* ] = [ −3  ]

    ⇒  ~x* = [ x_1* ] = [ −3/14 ]
             [ x_2* ]   [ −4/7  ]

2072U, Winter 2024 13 / 15
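
The hand computation can be checked numerically (a minimal sketch):

    import numpy as np

    A = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0]])
    b = np.array([0.5, 1.0, -1.0])
    AtA = A.T @ A                      # [[3, -2], [-2, 6]]
    Atb = A.T @ b                      # [0.5, -3]
    print(np.linalg.solve(AtA, Atb))   # [-3/14, -4/7] ≈ [-0.2143, -0.5714]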


Overdetermined systems with NumPy

Least-squares approximation & normal equations


◮ numpy.linalg.lstsq(A, b) computes least-squares approximations of overdetermined systems:

    import numpy as np

    A = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0]])
    b = np.array([0.5, 1.0, -1.0])
    # lstsq returns a 4-tuple: (solution, residuals, rank, singular values)
    xstar, res, rank, sv = np.linalg.lstsq(A, b, rcond=None)

What you need to understand:


◮ Use numpy.linalg.lstsq(A, b) for tall/thin systems.
◮ It also returns the sum of squared residuals ‖~b − A~x*‖_2^2 (divide by N to get the mean-square residual Φ(~x*)).

2072U, Winter 2024 14 / 15
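
As a sanity check (a sketch, reusing A and b from the snippet above), the lstsq solution agrees with the normal-equations solution:

    xstar, res, rank, sv = np.linalg.lstsq(A, b, rcond=None)
    print(np.allclose(xstar, np.linalg.solve(A.T @ A, A.T @ b)))  # True
    print(res / len(b))  # mean-square residual (res is the sum of squares)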


A first-order least-squares fit

Least-squares fit with straight line:


◮ Let Π_1(x) be a straight line, i.e., Π_1(x) = a_0 + a_1 x.
◮ Given {(x_k, y_k)}_{k=0}^{n}, the interpolation conditions are

    Π_1(x_k) = a_0 + a_1 x_k = y_k   (k = 0 : n)

◮ Write this system of equations in matrix form:

    1 · a_0 + x_0 · a_1 = y_0         [ 1  x_0 ]           [ y_0 ]
    1 · a_0 + x_1 · a_1 = y_1         [ 1  x_1 ] [ a_0 ]   [ y_1 ]
      ...                        ⇒    [  ...   ] [ a_1 ] = [ ... ]
    1 · a_0 + x_n · a_1 = y_n         [ 1  x_n ]           [ y_n ]

  i.e., V~a = ~y with V ∈ R^{(n+1)×2}, ~a ∈ R^{2×1}, ~y ∈ R^{(n+1)×1}.

◮ The least-squares approximant is found by solving the 2 × 2 problem V^T V ~a = V^T ~y (see the sketch after this slide).
2072U, Winter 2024 15 / 15
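
A minimal end-to-end sketch of the first-order fit, assuming some synthetic noisy data (the data, slope, and noise level here are made up for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-3.0, 4.0, 20)
    y = 2.0 * x - 1.0 + rng.normal(scale=0.5, size=x.size)  # noisy samples of a line

    V = np.column_stack([np.ones_like(x), x])  # rows [1, x_k]
    a = np.linalg.solve(V.T @ V, V.T @ y)      # 2x2 normal equations V^T V a = V^T y
    print(a)  # recovered coefficients, approximately [-1, 2]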
