
COMP0130 Robot Vision and Navigation

1B: Introduction to Least-Squares Estimation


Dr Paul D Groves
Session Objectives

Show how to:
• Use least-squares estimation to determine unknown parameters from a set of measurements
• Extend least-squares estimation to nonlinear problems
• Account for variation in measurement quality in a least-squares solution
Apply these techniques to some example problems

RVN Lecture 1B – Introduction to Least Squares Estimation 2


Contents

1. Formulating the Problem
2. Linear Least-Squares Estimation
3. Applying Least Squares to Nonlinear Problems
4. Weighted Least-Squares Estimation
RVN Lecture 1B – Introduction to Least Squares Estimation 3


1. Formulating the Problem
Mathematical Notation
a, A   Italic for scalars

a      Bold, lower case for vectors, e.g.
       $\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{pmatrix}$,   $\mathbf{a} = (a_1 \; a_2 \; \cdots \; a_m)^T$

A      Bold capitals for matrices, e.g.
       $\mathbf{A} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{pmatrix}$

Different symbols mean different things
Non-standard notation is occasionally used to avoid clashes

RVN Lecture 1B – Introduction to Least Squares Estimation 4


1. Formulating the Problem
The Problem (1)
We want to build a mathematical model from experimental data.
Suppose z is a function of y:
$z = G(y)$
where G is an unknown function.
If we have some pairs of observations:
$y_1, z_1$   [$z_1 = G(y_1)$]
$y_2, z_2$   [$z_2 = G(y_2)$]
$y_3, z_3$   [$z_3 = G(y_3)$]
how do we find G?
[Figure: example - three observed points on a wall, plotted in the y–z plane]
RVN Lecture 1B – Introduction to Least Squares Estimation 5
1. Formulating the Problem
The Problem (2)
$z = G(y)$
There are lots of options for G.
[Figures: candidate functions through the three observed points - two straight lines, a curve, and an arbitrary wiggly function ("Let's have some fun!")]
RVN Lecture 1B – Introduction to Least Squares Estimation 6
1. Formulating the Problem
The Problem (3)
$z = G(y)$
There are lots of options for G - and even more if you assume the observations have errors.
[Figures: further candidates passing near, but not through, the observed points - a straight line, another curve, another straight line]
RVN Lecture 1B – Introduction to Least Squares Estimation 7
1. Formulating the Problem
The Problem (4)
We need to know what the shape of the function should be.
We use the physics of the problem to propose a suitable model,
e.g. a wall forms a straight line in the horizontal plane.
[Figures: the candidate functions from the previous slides, shown together for comparison]
RVN Lecture 1B – Introduction to Least Squares Estimation 8


1. Formulating the Problem
Assume the shape of the function
We can select a suitable model based on knowledge of the physics:

Linear function:      $z = x_1 + x_2 y$
Quadratic function:   $z = x_1 + x_2 y + x_3 y^2$
Fourier series:       $z = x_1 \cos y + x_2 \sin y + x_3 \cos 2y + x_4 \sin 2y + \cdots$

Now only the coefficients need to be determined, e.g. for the linear function
$\mathbf{x} = (x_1 \; x_2)^T$,
which greatly simplifies the problem.

RVN Lecture 1B – Introduction to Least Squares Estimation 9
1. Formulating the Problem
The Measurement Model
With the shape of the function known,
$z = G(y) = h(\mathbf{x}, y)$
where h is a known function and $\mathbf{x} = (x_1 \; x_2)^T$.
This is a measurement model.
It expresses a known observation or measurement, z, in terms of
• another known observation, y,
• the unknown coefficients of the model, x.
The coefficients, x, are known as states or parameters.
The function h is known as the measurement function.
[Figure: the three observed points with the fitted straight line]
RVN Lecture 1B – Introduction to Least Squares Estimation 10
1. Formulating the Problem
Linear Measurement Models
In general,
$z = h(\mathbf{x}, y)$
If z is a linear function of all of the coefficients, x, then we may write
$z = \mathbf{H}(y)\,\mathbf{x} = H_1(y)\,x_1 + H_2(y)\,x_2 + \cdots$   with   $\mathbf{x} = (x_1 \; x_2 \; \cdots)^T$
where H is the measurement, observation or design matrix, which
• relates the measurements to the states
• is a known matrix function of y, which need not be linear
• is not a function of x when h is a linear function of x
RVN Lecture 1B – Introduction to Least Squares Estimation 11


1. Formulating the Problem
The Measurement Matrix
In general,
$z = h(\mathbf{x}, y)$
For both linear and nonlinear functions of the coefficients, x, the measurement matrix, H, comprises the partial derivatives of the measurement function, h, with respect to the states:
$\mathbf{H}(y) = \dfrac{\mathrm{d}z(\mathbf{x}, y)}{\mathrm{d}\mathbf{x}} = \dfrac{\mathrm{d}h(\mathbf{x}, y)}{\mathrm{d}\mathbf{x}} = \begin{pmatrix} \dfrac{\partial h}{\partial x_1} & \dfrac{\partial h}{\partial x_2} & \cdots & \dfrac{\partial h}{\partial x_n} \end{pmatrix}$
There is one column of H for each component of the state vector, x.

RVN Lecture 1B – Introduction to Least Squares Estimation 12


1. Formulating the Problem
Measurement Model of a Line Function
Suppose z is a linear function of y:
$z = x_1 + x_2 y$
where $x_1$ is the intercept and $x_2$ is the gradient (example: a straight wall).
As z is also a linear function of the coefficients, we can write this as
$z = (1 \;\; y) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$
or $z = \mathbf{H}(y)\,\mathbf{x}$, where $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $\mathbf{H}(y) = (1 \;\; y)$.
RVN Lecture 1B – Introduction to Least Squares Estimation 13
1. Formulating the Problem
Modelling Multiple Measurements
Where the same measurement function, h, applies to multiple measurements:
$z_1 = h(\mathbf{x}, y_1), \quad z_2 = h(\mathbf{x}, y_2), \quad \ldots$
We can write this as
$\mathbf{z} = \mathbf{h}(\mathbf{x}, \mathbf{y}) = \begin{pmatrix} h(\mathbf{x}, y_1) \\ h(\mathbf{x}, y_2) \\ \vdots \end{pmatrix}$   where   $\mathbf{z} = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \end{pmatrix}$, $\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \end{pmatrix}$
Where z is a linear function of all of the coefficients, x:
$\begin{pmatrix} z_1 \\ z_2 \\ \vdots \end{pmatrix} = \mathbf{z} = \mathbf{H}(\mathbf{y})\,\mathbf{x} = \begin{pmatrix} \mathbf{H}(y_1) \\ \mathbf{H}(y_2) \\ \vdots \end{pmatrix} \mathbf{x}$
RVN Lecture 1B – Introduction to Least Squares Estimation 14
1. Formulating the Problem
Matrix Solution of Linear Equations
For a set of linear equations written in matrix-vector form as
$\mathbf{z} = \mathbf{H}\mathbf{x}$
where the number of equations equals the number of unknowns, H is square, so it generally has an inverse.
We can multiply both sides of the equation by this, giving
$\mathbf{H}^{-1}\mathbf{z} = \mathbf{H}^{-1}\mathbf{H}\mathbf{x}$
Multiplying a matrix by its inverse gives the identity matrix, $\mathbf{H}^{-1}\mathbf{H} = \mathbf{I}$, so re-arranging,
$\mathbf{x} = \mathbf{H}^{-1}\mathbf{z}$
BUT… this only works when H is square and nonsingular (i.e., $|\mathbf{H}| \neq 0$).
RVN Lecture 1B – Introduction to Least Squares Estimation 15
1. Formulating the Problem
Example 1: A Straight Line Function (1)
At least two y, z observations are needed to solve for $x_1$ and $x_2$:
$\mathbf{z} = \mathbf{h}(\mathbf{x}, \mathbf{y}) = \begin{pmatrix} h(\mathbf{x}, y_1) \\ h(\mathbf{x}, y_2) \end{pmatrix} = \begin{pmatrix} x_1 + x_2 y_1 \\ x_1 + x_2 y_2 \end{pmatrix}$
where
$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$, $\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$, $\mathbf{z} = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}$
As z is a linear function of both $x_1$ and $x_2$:
$\mathbf{z} = \mathbf{H}(\mathbf{y})\,\mathbf{x} = \begin{pmatrix} 1 & y_1 \\ 1 & y_2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$
[Figure: the line $z = x_1 + x_2 y$ through the two observations $(y_1, z_1)$ and $(y_2, z_2)$]
RVN Lecture 1B – Introduction to Least Squares Estimation 16
1. Formulating the Problem
Example 1: A Straight Line Function (2)
We are solving $\mathbf{z} = \mathbf{H}\mathbf{x}$ where
$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$, $\mathbf{z} = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}$, $\mathbf{H}(\mathbf{y}) = \begin{pmatrix} 1 & y_1 \\ 1 & y_2 \end{pmatrix}$
The measurement matrix, H, is square and non-singular (provided $y_1 \neq y_2$), so it can be inverted.
The solution is thus $\mathbf{x} = \mathbf{H}^{-1}\mathbf{z}$.
Let $(y_1, z_1) = (4, 4)$ and $(y_2, z_2) = (12, 6)$. Therefore:
$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & y_1 \\ 1 & y_2 \end{pmatrix}^{-1} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} 1 & 4 \\ 1 & 12 \end{pmatrix}^{-1} \begin{pmatrix} 4 \\ 6 \end{pmatrix} = \begin{pmatrix} 1.5 & -0.5 \\ -0.125 & 0.125 \end{pmatrix} \begin{pmatrix} 4 \\ 6 \end{pmatrix} = \begin{pmatrix} 3 \\ 0.25 \end{pmatrix}$
See RVN Least-Squares Examples.xlsx on Moodle
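As a minimal sketch (not part of the original slides), the same two-point solve can be reproduced in NumPy; the numbers are the example values above.

```python
import numpy as np

# Two exact observations of the straight line z = x1 + x2*y
y = np.array([4.0, 12.0])
z = np.array([4.0, 6.0])

# Measurement matrix H(y): one row (1, y_i) per observation
H = np.column_stack((np.ones_like(y), y))

# H is square and non-singular, so x = H^-1 z
x = np.linalg.solve(H, z)
print(x)  # [3.   0.25] -> intercept 3, gradient 0.25
```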
RVN Lecture 1B – Introduction to Least Squares Estimation 17
1. Formulating the Problem
General Problem Formulation
A measurement, $z_1$, can depend on multiple known parameters, $y_1, y_2, \ldots$
A known parameter, $y_1$, can impact multiple measurements, $z_1, z_2, \ldots$
Any component of z and h can be a function of any component of y.
Different components of z and h can also be functions of different states.
$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \end{pmatrix}$, $\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \end{pmatrix}$, $\mathbf{z} = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \end{pmatrix}$, $\mathbf{z} = \mathbf{h}(\mathbf{x}, \mathbf{y}) = \begin{pmatrix} h_1(\mathbf{x}, \mathbf{y}) \\ h_2(\mathbf{x}, \mathbf{y}) \\ \vdots \end{pmatrix}$
$\mathbf{H}(\mathbf{y}) = \begin{pmatrix} \partial h_1(\mathbf{x}, \mathbf{y})/\partial x_1 & \partial h_1(\mathbf{x}, \mathbf{y})/\partial x_2 & \cdots \\ \partial h_2(\mathbf{x}, \mathbf{y})/\partial x_1 & \partial h_2(\mathbf{x}, \mathbf{y})/\partial x_2 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$
RVN Lecture 1B – Introduction to Least Squares Estimation 18
1. Formulating the Problem
Handling Real Measurements (1)
Measurements are always subject to error:
$\tilde{z} = z + \varepsilon$
where $\tilde{z}$ is the measured value, z the true value and $\varepsilon$ the error (~ is called 'tilde').
Therefore, states or parameters determined from those measurements are also subject to error:
$\hat{x} = x + e$
where $\hat{x}$ is the estimated value, x the true value and e the error (^ is called 'caret').
For a linear system, if $\hat{\mathbf{x}} = \mathbf{H}^{-1}\tilde{\mathbf{z}}$ and $\mathbf{x} = \mathbf{H}^{-1}\mathbf{z}$, then $\mathbf{e} = \mathbf{H}^{-1}\boldsymbol{\varepsilon}$.
RVN Lecture 1B – Introduction to Least Squares Estimation 19


1. Formulating the Problem
Handling Real Measurements (2)
Because measurements are always subject to error, states estimated from those measurements will also be subject to error.
The effect of random errors can be reduced by using more measurements: the solution error standard deviation satisfies
$\sigma_e \propto \dfrac{1}{\sqrt{m}}$
where m is the number of measurements.
But we cannot use $\mathbf{x} = \mathbf{H}^{-1}\tilde{\mathbf{z}}$ if there are more measurements than states:
1. Only square matrices can be inverted
2. The simultaneous equations will contradict each other because of the measurement errors
We need a new approach: Least-Squares Estimation
RVN Lecture 1B – Introduction to Least Squares Estimation 20
Contents

1. Formulating the Problem
2. Linear Least-Squares Estimation
3. Applying Least Squares to Nonlinear Problems
4. Weighted Least-Squares Estimation
RVN Lecture 1B – Introduction to Least Squares Estimation 21


2. Linear Least-Squares Estimation
More Measurements than States
Due to measurement errors, observations will contradict each other.
Different combinations of measurements give different solutions.
There is no exact solution.
Example: for a linear function $z = x_1 + x_2 y$, each pair of measurements leads to a different solution.
Formulating the problem as $\mathbf{z} = \mathbf{H}(\mathbf{y})\,\mathbf{x}$, H cannot be inverted.
What do we do?
[Figure: three observations with a different line drawn through each pair]
RVN Lecture 1B – Introduction to Least Squares Estimation 22
2. Linear Least-Squares Estimation
Adjusting the Measurements to Fit
We assume that z is subject to measurement error, but ignore errors in y.
We make an adjustment to each z observation to make z fit the function h(x, y). This adjustment is called the residual, v.
(Sometimes the opposite sign convention is used: $v = -\delta z^+$.)
[Figure: measurements $\tilde{z}_1$, $\tilde{z}_2$, $\tilde{z}_3$ adjusted to $\tilde{z}_1 + v_1$, $\tilde{z}_2 + v_2$, $\tilde{z}_3 + v_3$ so that they lie on the fitted line; the tilde denotes a measurement, not the exact value]
RVN Lecture 1B – Introduction to Least Squares Estimation 23
2. Linear Least-Squares Estimation
Modifying the Measurement Model
General measurement model:   $\tilde{\mathbf{z}} + \mathbf{v} = \mathbf{h}(\mathbf{x}, \mathbf{y})$
Linear measurement model:    $\tilde{\mathbf{z}} + \mathbf{v} = \mathbf{H}(\mathbf{y})\,\mathbf{x}$
where
$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$, $\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}$, $\tilde{\mathbf{z}} = \begin{pmatrix} \tilde{z}_1 \\ \tilde{z}_2 \\ \vdots \\ \tilde{z}_m \end{pmatrix}$, $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_m \end{pmatrix}$
(n = number of states, m = number of measurements)
$\mathbf{H}(\mathbf{y}) = \begin{pmatrix} \partial h_1/\partial x_1 & \partial h_1/\partial x_2 & \cdots & \partial h_1/\partial x_n \\ \partial h_2/\partial x_1 & \partial h_2/\partial x_2 & \cdots & \partial h_2/\partial x_n \\ \vdots & \vdots & & \vdots \\ \partial h_m/\partial x_1 & \partial h_m/\partial x_2 & \cdots & \partial h_m/\partial x_n \end{pmatrix}$
How do we solve this?
RVN Lecture 1B – Introduction to Least Squares Estimation 24
2. Linear Least-Squares Estimation
Obtaining a Solution
$\tilde{\mathbf{z}} + \mathbf{v} = \mathbf{H}(\mathbf{y})\,\mathbf{x}$, written out:
$\begin{pmatrix} \tilde{z}_1 \\ \tilde{z}_2 \\ \vdots \\ \tilde{z}_m \end{pmatrix} + \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_m \end{pmatrix} = \begin{pmatrix} H_{11} & H_{12} & \cdots & H_{1n} \\ H_{21} & H_{22} & \cdots & H_{2n} \\ \vdots & \vdots & & \vdots \\ H_{m1} & H_{m2} & \cdots & H_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$
Here $\tilde{\mathbf{z}}$ and H are known; the residuals, v, and states, x, are unknown.
• There are as many residuals as rows in the equation (m)
• There are more unknown terms (m + n) than simultaneous equations (m)
• The problem is underdetermined
• There is no unique solution for states, x, and residuals, v
• We need more information
RVN Lecture 1B – Introduction to Least Squares Estimation 25


2. Linear Least-Squares Estimation
Introducing the Least-Squares Constraint
$\tilde{\mathbf{z}} + \mathbf{v} = \mathbf{H}(\mathbf{y})\,\mathbf{x}$, with $\tilde{\mathbf{z}}$ and H known and v and x unknown.
We need more information to solve this.
The least-squares solution is that which minimises the sum of the squares of the residuals:
$\sum_i v_i^2 = \mathbf{v}^T \mathbf{v}$
It delivers the solution that passes closest to the set of y, z observations.
RVN Lecture 1B – Introduction to Least Squares Estimation 26


2. Linear Least-Squares Estimation
Deriving the Linear Least-Squares Solution (1)
To solve for $\hat{\mathbf{x}}^+$ and v (the caret denotes an estimated value - the solution is not exact; "+" denotes 'a posteriori', i.e. incorporating the measurement data):
$\tilde{\mathbf{z}} + \mathbf{v} = \mathbf{H}(\mathbf{y})\,\hat{\mathbf{x}}^+$   – (1)
Constraint: select values of $\hat{\mathbf{x}}^+$ that minimise the sum of squares of the residuals, $\sum_i v_i^2$. Thus
$\dfrac{\partial}{\partial \hat{\mathbf{x}}^+} \left( \mathbf{v}^T \mathbf{v} \right) = 0$   – (2)
Substituting (1) into (2):
$\dfrac{\partial}{\partial \hat{\mathbf{x}}^+} \left[ \left( \mathbf{H}(\mathbf{y})\,\hat{\mathbf{x}}^+ - \tilde{\mathbf{z}} \right)^T \left( \mathbf{H}(\mathbf{y})\,\hat{\mathbf{x}}^+ - \tilde{\mathbf{z}} \right) \right] = 0$   – (3)
RVN Lecture 1B – Introduction to Least Squares Estimation 27
2. Linear Least-Squares Estimation
Deriving the Linear Least-Squares Solution (2)
From before:
$\dfrac{\partial}{\partial \hat{\mathbf{x}}^+} \left[ \left( \mathbf{H}(\mathbf{y})\,\hat{\mathbf{x}}^+ - \tilde{\mathbf{z}} \right)^T \left( \mathbf{H}(\mathbf{y})\,\hat{\mathbf{x}}^+ - \tilde{\mathbf{z}} \right) \right] = 0$   – (3)
Expanding:
$\dfrac{\partial}{\partial \hat{\mathbf{x}}^+} \left[ \hat{\mathbf{x}}^{+T} \mathbf{H}^T \mathbf{H} \hat{\mathbf{x}}^+ - \hat{\mathbf{x}}^{+T} \mathbf{H}^T \tilde{\mathbf{z}} - \tilde{\mathbf{z}}^T \mathbf{H} \hat{\mathbf{x}}^+ + \tilde{\mathbf{z}}^T \tilde{\mathbf{z}} \right] = 0$   – (4)
Differentiating (noting that $\mathbf{a}^T \mathbf{b} = \mathbf{b}^T \mathbf{a}$):
$2 \hat{\mathbf{x}}^{+T} \mathbf{H}^T \mathbf{H} - 2 \tilde{\mathbf{z}}^T \mathbf{H} = 0$   – (5)
Transposing and rearranging:
$\mathbf{H}^T \mathbf{H} \hat{\mathbf{x}}^+ = \mathbf{H}^T \tilde{\mathbf{z}}$   – (6)
RVN Lecture 1B – Introduction to Least Squares Estimation 28


2. Linear Least-Squares Estimation
Deriving the Linear Least-Squares Solution (3)
From before:
$\mathbf{H}^T \mathbf{H} \hat{\mathbf{x}}^+ = \mathbf{H}^T \tilde{\mathbf{z}}$   – (6)
Multiplying both sides by $(\mathbf{H}^T \mathbf{H})^{-1}$:
$(\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \mathbf{H} \hat{\mathbf{x}}^+ = (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \tilde{\mathbf{z}}$   – (7)
Cancelling:
$\hat{\mathbf{x}}^+ = (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \tilde{\mathbf{z}}$   – (8)
This is the unweighted least-squares solution for a linear problem.
Note that $(\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T$ is the left pseudo-inverse of H.
See also Derivation 1 in RVN Least-Squares Derivations.docx on Moodle
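As an illustrative sketch (not from the slides), equation (8) can be evaluated directly in NumPy. The H and z values below are arbitrary placeholders; in practice numpy.linalg.lstsq or numpy.linalg.pinv computes the same left pseudo-inverse solution more robustly.

```python
import numpy as np

# Arbitrary placeholder problem: m = 3 measurements, n = 2 states
H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
z = np.array([1.1, 2.9, 5.2])

# Equation (8): x+ = (H^T H)^-1 H^T z  (normal equations)
x_normal = np.linalg.inv(H.T @ H) @ H.T @ z

# Equivalent, numerically preferable routes
x_lstsq = np.linalg.lstsq(H, z, rcond=None)[0]
x_pinv = np.linalg.pinv(H) @ z

print(x_normal, x_lstsq, x_pinv)  # all three agree
```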


RVN Lecture 1B – Introduction to Least Squares Estimation 29
2. Linear Least-Squares Estimation
Residuals
The least-squares solution of $\tilde{\mathbf{z}} + \mathbf{v} = \mathbf{H}(\mathbf{y})\,\hat{\mathbf{x}}^+$ is $\hat{\mathbf{x}}^+ = (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \tilde{\mathbf{z}}$.
The residuals are given by
$\mathbf{v} = \mathbf{H}\hat{\mathbf{x}}^+ - \tilde{\mathbf{z}} = \left( \mathbf{H} (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T - \mathbf{I} \right) \tilde{\mathbf{z}}$
[Figure: the fitted line $\hat{\mathbf{z}} = \mathbf{H}(\mathbf{y})\,\hat{\mathbf{x}}^+$ (red), with the measurements $\tilde{z}_1$, $\tilde{z}_2$, $\tilde{z}_3$ adjusted onto it by the residuals $v_1$, $v_2$, $v_3$]
RVN Lecture 1B – Introduction to Least Squares Estimation 30
2. Linear Least-Squares Estimation
Example 2: A Straight Line (1)
We have y, z coordinates of four points along a wall. We assume:
1. The y coordinates are exact
2. The z coordinates have measurement errors
3. The wall is straight
A straight line is represented by $z = x_1 + x_2 y$, where $x_1$ is the intercept and $x_2$ is the gradient.
We use least-squares estimation to obtain values of $x_1$ and $x_2$ from the data.
[Figure: the four measured points (0, 3), (1, 7), (2, 11), (3, 18) in the y–z plane]
See RVN Least-Squares Examples.xlsx on Moodle
RVN Lecture 1B – Introduction to Least Squares Estimation 31


2. Linear Least-Squares Estimation
Example 2: A Straight Line (2)
Our model for a straight line is $z = x_1 + x_2 y$.
This is linear, so $\mathbf{z} = \mathbf{H}(\mathbf{y})\,\mathbf{x}$ and $\tilde{\mathbf{z}} + \mathbf{v} = \mathbf{H}(\mathbf{y})\,\hat{\mathbf{x}}^+$, where
$\tilde{\mathbf{z}} = \begin{pmatrix} \tilde{z}_1 \\ \tilde{z}_2 \\ \tilde{z}_3 \\ \tilde{z}_4 \end{pmatrix} = \begin{pmatrix} 3 \\ 7 \\ 11 \\ 18 \end{pmatrix}$,   $\mathbf{H}(\mathbf{y}) = \begin{pmatrix} 1 & y_1 \\ 1 & y_2 \\ 1 & y_3 \\ 1 & y_4 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}$
The solution is
$\begin{pmatrix} \hat{x}_1^+ \\ \hat{x}_2^+ \end{pmatrix} = \hat{\mathbf{x}}^+ = (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \tilde{\mathbf{z}} = \begin{pmatrix} 2.4 \\ 4.9 \end{pmatrix}$,   i.e.   $z = 2.4 + 4.9y$
See RVN Least-Squares Examples.xlsx on Moodle
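A minimal NumPy check of this example (not part of the slides); it reproduces the coefficients above and also evaluates the residuals defined on the Residuals slide.

```python
import numpy as np

# Wall points from Example 2: y coordinates exact, z coordinates measured
y = np.array([0.0, 1.0, 2.0, 3.0])
z = np.array([3.0, 7.0, 11.0, 18.0])

H = np.column_stack((np.ones_like(y), y))   # rows (1, y_i)

# Unweighted least-squares solution x+ = (H^T H)^-1 H^T z
x_hat = np.linalg.inv(H.T @ H) @ H.T @ z
print(x_hat)   # [2.4 4.9] -> z = 2.4 + 4.9 y

# Residuals v = H x+ - z: the adjustments that put the measurements on the line
v = H @ x_hat - z
print(v)       # [-0.6  0.3  1.2 -0.9]
```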
RVN Lecture 1B – Introduction to Least Squares Estimation 32
Contents

1. Formulating the Problem
2. Linear Least-Squares Estimation
3. Applying Least Squares to Nonlinear Problems
4. Weighted Least-Squares Estimation
RVN Lecture 1B – Introduction to Least Squares Estimation 33


3. Applying Least Squares to Nonlinear Problems
Nonlinear Problems (1)
Unfortunately, observations are not always linear functions of the states.
Example A: finding the centre $(x_1, x_2)$ and radius $x_3$ of a chimney from points $(y, z)$ on its circumference.
Applying Pythagoras' theorem:
$x_3^2 = (y - x_1)^2 + (z - x_2)^2$
$\Rightarrow \; z = x_2 \pm \sqrt{x_3^2 - (y - x_1)^2}$
z is a linear function of $x_2$, but it is a nonlinear function of $x_1$ and $x_3$.
The least-squares method can only solve linear problems.
RVN Lecture 1B – Introduction to Least Squares Estimation 34
3. Applying Least Squares to Nonlinear Problems
Nonlinear Problems (2)
Unfortunately, observations are not always linear functions of the states.
Example B: determining positions from ranging measurements, such as GNSS.
A user at $(x_E, x_N)$ measures ranges to Landmark 1 at $(y_{E1}, y_{N1})$ and Landmark 2 at $(y_{E2}, y_{N2})$:
$\begin{pmatrix} z_{r1} \\ z_{r2} \end{pmatrix} = \begin{pmatrix} h_1(\mathbf{x}, \mathbf{y}) \\ h_2(\mathbf{x}, \mathbf{y}) \end{pmatrix} = \begin{pmatrix} \sqrt{(y_{E1} - x_E)^2 + (y_{N1} - x_N)^2} \\ \sqrt{(y_{E2} - x_E)^2 + (y_{N2} - x_N)^2} \end{pmatrix}$
The least-squares method can only solve linear problems.
RVN Lecture 1B – Introduction to Least Squares Estimation 35
3. Applying Least Squares to Nonlinear Problems
Nonlinear Problems (3)
Unfortunately, observations are not always linear functions of the states.
Example C: determining positions from optical angle measurements.
A user at $(x_E, x_N)$ measures bearings (from north) to Landmark 1 at $(y_{E1}, y_{N1})$ and Landmark 2 at $(y_{E2}, y_{N2})$:
$\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} h_1(\mathbf{x}, \mathbf{y}) \\ h_2(\mathbf{x}, \mathbf{y}) \end{pmatrix} = \begin{pmatrix} \arctan2\left( (y_{E1} - x_E),\, (y_{N1} - x_N) \right) \\ \arctan2\left( (y_{E2} - x_E),\, (y_{N2} - x_N) \right) \end{pmatrix}$
The least-squares method can only solve linear problems.
RVN Lecture 1B – Introduction to Least Squares Estimation 36
3. Applying Least Squares to Nonlinear Problems
Finding an Equivalent Linear Problem
We cannot solve a nonlinear problem using least-squares estimation directly:
$\mathbf{z} = \begin{pmatrix} h_1(\mathbf{x}, \mathbf{y}) \\ h_2(\mathbf{x}, \mathbf{y}) \\ \vdots \\ h_m(\mathbf{x}, \mathbf{y}) \end{pmatrix} = \mathbf{h}(\mathbf{x}, \mathbf{y}) \neq \mathbf{H}(\mathbf{y})\,\mathbf{x}$
Instead, we must formulate an equivalent linear problem, such as
$\delta\mathbf{z} = \mathbf{H}(\mathbf{x}, \mathbf{y})\,\delta\mathbf{x}$
where $\delta\mathbf{z}$ is the change in z and $\delta\mathbf{x}$ is the change in x.
To use least-squares we must essentially turn a nonlinear problem into a linear one.
[Figure: a nonlinear function $z_1(x_1)$ whose local gradient $\mathrm{d}z_1/\mathrm{d}x_1 = H_{11}$ relates small changes $\delta x_1$ and $\delta z_1$]
RVN Lecture 1B – Introduction to Least Squares Estimation 37
3. Applying Least Squares to Nonlinear Problems
Linearisation using Taylor’s Theorem
Applying Taylor's theorem to the measurement model, expanding about a point $\bar{x}$:
$z = h(x, y) = h(\bar{x}, y) + \left.\dfrac{\partial h(x, y)}{\partial x}\right|_{x = \bar{x}} (x - \bar{x}) + \sum_{r=2}^{\infty} \left.\dfrac{\partial^r h(x, y)}{\partial x^r}\right|_{x = \bar{x}} \dfrac{(x - \bar{x})^r}{r!}$
If we select $\bar{x}$ such that the final term is negligible, then
$h(x, y) \approx h(\bar{x}, y) + \dfrac{\partial h(x, y)}{\partial x} (x - \bar{x})$
or
$h(x, y) \approx h(\bar{x}, y) + H(\bar{x}, y)(x - \bar{x})$   where   $H(\bar{x}, y) = \dfrac{\partial h(x, y)}{\partial x}$
Rearranging:
$h(x, y) - h(\bar{x}, y) \approx H(\bar{x}, y)(x - \bar{x})$
This first-order approximation is known as linearisation.


RVN Lecture 1B – Introduction to Least Squares Estimation 38
3. Applying Least Squares to Nonlinear Problems
The Measurement Matrix
H is the measurement (or observation) matrix, which
• relates changes in the measurements to changes in the states
• comprises the partial derivatives of h with respect to the states
• is a function of both x and y
$\mathbf{H}(\mathbf{x}, \mathbf{y}) = \left.\dfrac{\partial \mathbf{h}(\mathbf{x}, \mathbf{y})}{\partial \mathbf{x}}\right|_{\mathbf{x} = \bar{\mathbf{x}}} = \begin{pmatrix} \partial h_1/\partial x_1 & \partial h_1/\partial x_2 & \cdots & \partial h_1/\partial x_n \\ \partial h_2/\partial x_1 & \partial h_2/\partial x_2 & \cdots & \partial h_2/\partial x_n \\ \vdots & \vdots & & \vdots \\ \partial h_m/\partial x_1 & \partial h_m/\partial x_2 & \cdots & \partial h_m/\partial x_n \end{pmatrix}$
Each row corresponds to one component of the function, h, and the measurement, z.
Each column corresponds to one component of the state vector, x.
We calculate H using the predicted values of x, i.e., $\bar{\mathbf{x}} = \hat{\mathbf{x}}^-$.
RVN Lecture 1B – Introduction to Least Squares Estimation 39
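Where an analytical derivative is awkward, H can be approximated column by column with finite differences. This is an illustrative sketch, not part of the lecture material; h can be any measurement function of the form used above.

```python
import numpy as np

def numerical_H(h, x_pred, y, eps=1e-6):
    """Approximate H = dh/dx at the predicted states by central differences.

    h      : callable, h(x, y) -> measurement vector of length m
    x_pred : predicted state vector of length n
    Returns an (m, n) measurement matrix.
    """
    m = len(h(x_pred, y))
    n = len(x_pred)
    H = np.zeros((m, n))
    for j in range(n):              # one column of H per state
        dx = np.zeros(n)
        dx[j] = eps
        H[:, j] = (h(x_pred + dx, y) - h(x_pred - dx, y)) / (2 * eps)
    return H
```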
3. Applying Least Squares to Nonlinear Problems
Linearising the Problem (1)
To use least-squares we must turn a nonlinear problem into a linear one.
To solve for $\hat{\mathbf{x}}^+$ and v:
$\tilde{\mathbf{z}} + \mathbf{v} = \mathbf{h}(\hat{\mathbf{x}}^+, \mathbf{y}) \neq \mathbf{H}(\mathbf{y})\,\hat{\mathbf{x}}^+$
We use a prediction of the states, $\hat{\mathbf{x}}^-$, to predict the measurements:
$\hat{\mathbf{z}}^- = \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y})$
Subtracting this from both sides:
$\tilde{\mathbf{z}} - \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y}) + \mathbf{v} = \mathbf{h}(\hat{\mathbf{x}}^+, \mathbf{y}) - \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y})$
We can then use this to model a linear relationship…
[Figure: the nonlinear measurement function near the prediction $\hat{x}_1^-$, showing the predicted measurement $\hat{z}_1^-$ and the measured value $\tilde{z}_1 + v_1$]
RVN Lecture 1B – Introduction to Least Squares Estimation 40
3. Applying Least Squares to Nonlinear Problems
Linearising the Problem (2)
From the previous slide,
$\tilde{\mathbf{z}} - \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y}) + \mathbf{v} = \mathbf{h}(\hat{\mathbf{x}}^+, \mathbf{y}) - \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y}) \approx \mathbf{H}(\hat{\mathbf{x}}^-, \mathbf{y}) \left( \hat{\mathbf{x}}^+ - \hat{\mathbf{x}}^- \right)$
using the first-order Taylor series approximation (linearisation).
The left-hand side is the measurement innovation, b, i.e. measurements minus predictions, so
$\mathbf{b} + \mathbf{v} \approx \mathbf{H}(\hat{\mathbf{x}}^-, \mathbf{y})\,\delta\mathbf{x}$   where   $\mathbf{b} = \tilde{\mathbf{z}} - \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y})$, $\delta\mathbf{x} = \hat{\mathbf{x}}^+ - \hat{\mathbf{x}}^-$
This can be solved using least-squares estimation.
[Figure: the linearised relationship between the innovation $b_1 + v_1$ and the state change $\delta x_1$]


RVN Lecture 1B – Introduction to Least Squares Estimation 41
3. Applying Least Squares to Nonlinear Problems
Nonlinear Least-Squares Solution
We now have a linear equation to solve for $\delta\mathbf{x}$ and v:
$\mathbf{b} + \mathbf{v} \approx \mathbf{H}(\hat{\mathbf{x}}^-, \mathbf{y})\,\delta\mathbf{x}$
REMEMBER:
$\mathbf{b} = \tilde{\mathbf{z}} - \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y})$,   $\mathbf{H}(\hat{\mathbf{x}}^-, \mathbf{y}) = \left.\dfrac{\partial \mathbf{h}(\mathbf{x}, \mathbf{y})}{\partial \mathbf{x}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^-}$,   $\delta\mathbf{x} = \hat{\mathbf{x}}^+ - \hat{\mathbf{x}}^-$
We select values of $\delta\mathbf{x}$ that minimise the sum of squares of the residuals, $\sum_i v_i^2$.
The solution is the same as for linear least-squares estimation. Thus
$\delta\mathbf{x} \approx (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \mathbf{b}$
giving
$\hat{\mathbf{x}}^+ = \hat{\mathbf{x}}^- + \delta\mathbf{x} \approx \hat{\mathbf{x}}^- + (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \mathbf{b}$
The residuals are
$\mathbf{v} \approx \mathbf{H}\,\delta\mathbf{x} - \mathbf{b} = \left( \mathbf{H} (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T - \mathbf{I} \right) \mathbf{b}$
See Derivation 1 on Moodle
RVN Lecture 1B – Introduction to Least Squares Estimation 42
3. Applying Least Squares to Nonlinear Problems
The Linearisation Error
The solution to the nonlinear equation $\tilde{\mathbf{z}} + \mathbf{v} = \mathbf{h}(\hat{\mathbf{x}}^+, \mathbf{y})$ is
$\hat{\mathbf{x}}^+ \approx \hat{\mathbf{x}}^- + (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \left( \tilde{\mathbf{z}} - \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y}) \right)$
This is only an approximate solution because we have made the assumption
$\sum_{r=2}^{\infty} \left.\dfrac{\partial^r \mathbf{h}(\mathbf{x}, \mathbf{y})}{\partial \mathbf{x}^r}\right|_{\mathbf{x} = \hat{\mathbf{x}}^-} \dfrac{\left( \hat{\mathbf{x}}^+ - \hat{\mathbf{x}}^- \right)^r}{r!} \approx 0$
This is the linearisation approximation.
But $\hat{\mathbf{x}}^+$ will be a better solution than $\hat{\mathbf{x}}^-$.
If we set the predicted states, $\hat{\mathbf{x}}^-$, to the new solution, $\hat{\mathbf{x}}^+$, and compute another least-squares solution, that will be better. This is iteration. (We must recalculate H.)
[Figure: the nonlinear measurement function, showing the prediction $\hat{x}_1^-$, the updated estimate $\hat{x}_1^+$, the predicted measurement $\hat{z}_1^-$ and the measured value $\tilde{z}_1 + v_1$]
RVN Lecture 1B – Introduction to Least Squares Estimation 43
3. Applying Least Squares to Nonlinear Problems
Iterative Least-Squares (ILS)
Iteration minimises the linearisation error for nonlinear least-squares problems:
1. Predict the states, $\hat{\mathbf{x}}^-$
2. Predict the measurements, $\hat{\mathbf{z}}^- = \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y})$
3. Calculate the measurement innovation, b
4. Calculate the measurement matrix, H, using the predicted states
5. Obtain the state vector innovation, $\delta\mathbf{x}$, by least-squares
6. Update the state estimates, $\hat{\mathbf{x}}^+ = \hat{\mathbf{x}}^- + \delta\mathbf{x}$
7. Set $\hat{\mathbf{x}}^- = \hat{\mathbf{x}}^+$ and repeat from step 2 until convergence ($\delta\mathbf{x}$ within threshold), as sketched in the code below
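A minimal sketch of the iteration loop above, assuming an unweighted problem. h_func and H_func are hypothetical callables (not named in the slides) returning the measurement function and measurement matrix for a given state vector; the convergence threshold is illustrative.

```python
import numpy as np

def iterative_least_squares(h_func, H_func, z_meas, y, x_pred, tol=1e-6, max_iter=20):
    """Unweighted iterative (nonlinear) least-squares estimation."""
    x_hat = np.asarray(x_pred, dtype=float)
    for _ in range(max_iter):
        b = z_meas - h_func(x_hat, y)            # measurement innovation
        H = H_func(x_hat, y)                     # measurement matrix at predicted states
        dx = np.linalg.solve(H.T @ H, H.T @ b)   # state vector innovation
        x_hat = x_hat + dx                       # update state estimates
        if np.linalg.norm(dx) < tol:             # converged: dx within threshold
            break
    return x_hat
```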

RVN Lecture 1B – Introduction to Least Squares Estimation 44


3. Applying Least Squares to Nonlinear Problems
Nonlinear Least-Squares Step-by-Step
Establish:
• the unknown states (coefficients) to estimate, x
• the known parameters, y
• the measured parameters, $\tilde{\mathbf{z}}$
1) Determine the measurement model: $\mathbf{z} = \mathbf{h}(\mathbf{x}, \mathbf{y})$
2) Predict the states, $\hat{\mathbf{x}}^-$
3) Calculate the measurement innovation, $\mathbf{b} = \tilde{\mathbf{z}} - \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y})$
4) Calculate the measurement matrix, $\mathbf{H}(\hat{\mathbf{x}}^-, \mathbf{y}) = \left.\dfrac{\partial \mathbf{h}(\mathbf{x}, \mathbf{y})}{\partial \mathbf{x}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^-}$
5) Compute the solution, $\hat{\mathbf{x}}^+ \approx \hat{\mathbf{x}}^- + (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \mathbf{b}$
6) Iterate where necessary
See the Step-by-Step Guide on Moodle
RVN Lecture 1B – Introduction to Least Squares Estimation 45
3. Applying Least Squares to Nonlinear Problems
Example 3: Total Station Positioning (1)
A total station measures 13 ranges between 6 points.
The coordinates of point A are known; the coordinates of the other 5 points are to be determined.
The bearing of A to B (with respect to north) is also measured.
[Figure: the six survey points A, B, C, D, E and F]
States to estimate, x: E & N coordinates of B, C, D, E & F (10 parameters)
Known parameters, y: E & N coordinates of A (2 parameters)
Measurements, z: 13 ranges and one bearing
See RVN Least-Squares Examples.xlsx on Moodle
RVN Lecture 1B – Introduction to Least Squares Estimation 46
3. Applying Least Squares to Nonlinear Problems
Example 3: Total Station Positioning (2)
Step 1: Determine the measurement model - ranging.
Measurement $z_1$ is the range AB (A: coordinates $y_1, y_2$; B: coordinates $x_1, x_2$).
Measurement $z_6$ is the range BC (C: coordinates $x_3, x_4$).
$\begin{pmatrix} z_1 \\ \vdots \\ z_6 \\ \vdots \end{pmatrix} = \begin{pmatrix} h_1(\mathbf{x}, \mathbf{y}) \\ \vdots \\ h_6(\mathbf{x}, \mathbf{y}) \\ \vdots \end{pmatrix} = \begin{pmatrix} \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2} \\ \vdots \\ \sqrt{(x_3 - x_1)^2 + (x_4 - x_2)^2} \\ \vdots \end{pmatrix}$
See RVN Least-Squares Examples.xlsx on Moodle
RVN Lecture 1B – Introduction to Least Squares Estimation 47
3. Applying Least Squares to Nonlinear Problems
Example 3: Total Station Positioning (3)
Step 1 (continued): Determine the measurement model - bearing.
Measurement $z_{14}$ is the bearing AB from north (A: coordinates $y_1, y_2$; B: coordinates $x_1, x_2$):
$z_{14} = h_{14}(\mathbf{x}, \mathbf{y}) = \arctan2\left( (x_1 - y_1),\, (x_2 - y_2) \right)$

Step 2: Predict the states:
Point   Easting   Northing
B       10.10     7.10
C       6.50      9.90
D       6.60      6.60
E       6.60      5.40
F       8.30      4.40

Step 3: Calculate the measurement innovation, $\mathbf{b} = \tilde{\mathbf{z}} - \hat{\mathbf{z}}^-$:
Measurement             Measured   Predicted   b
Range AB = $z_1$        2.882      2.902       -0.020
Range BC = $z_6$        4.491      4.561       -0.070
Bearing AB = $z_{14}$   3.124      3.107       0.017

See RVN Least-Squares Examples.xlsx on Moodle


RVN Lecture 1B – Introduction to Least Squares Estimation 48
3. Applying Least Squares to Nonlinear Problems
Example 3: Total Station Positioning (4)
Step 4: Calculate the measurement matrix - ranging.
Measurement model: $h_1(\mathbf{x}, \mathbf{y}) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$
Differentiate with respect to the 1st state:
$\dfrac{\partial h_1(\mathbf{x})}{\partial x_1} = \dfrac{x_1 - y_1}{\sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}}$
Use the predicted states for the measurement matrix ($\mathbf{x} = \hat{\mathbf{x}}^-$):
$H_{11}(\hat{\mathbf{x}}^-) = \dfrac{\hat{x}_1^- - y_1}{\sqrt{(\hat{x}_1^- - y_1)^2 + (\hat{x}_2^- - y_2)^2}}$
Simplifying, as $\hat{z}_1^- = \sqrt{(\hat{x}_1^- - y_1)^2 + (\hat{x}_2^- - y_2)^2}$:
$H_{11}(\hat{\mathbf{x}}^-) = \dfrac{\hat{x}_1^- - y_1}{\hat{z}_1^-}$
(See the Step-by-Step guide, General Advice, for help with the derivatives.)
See RVN Least-Squares Examples.xlsx on Moodle
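An illustrative sketch (not the spreadsheet's layout) of evaluating a range measurement and its pair of H entries directly from the expressions above, assuming 2-D points stored as NumPy arrays.

```python
import numpy as np

def range_and_H_row(p_from, p_to):
    """Predicted range between two 2-D points and its partial derivatives
    with respect to the 'to' point coordinates (two entries of one row of H)."""
    diff = p_to - p_from
    rng = np.sqrt(diff @ diff)    # predicted range, z_hat
    return rng, diff / rng        # d(range)/d(p_to) = (p_to - p_from) / range

# Usage with placeholder coordinates (A's coordinates are not listed on this slide)
p_A = np.array([12.0, 6.5])           # placeholder only
p_B = np.array([10.10, 7.10])         # predicted coordinates of B from Step 2
rng_AB, H_entries = range_and_H_row(p_A, p_B)
```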
RVN Lecture 1B – Introduction to Least Squares Estimation 49
3. Applying Least Squares to Nonlinear Problems
Example 3: Total Station Positioning (5)
Step 4: Calculate the measurement matrix - bearing.
Measurement model: $h_{14}(\mathbf{x}, \mathbf{y}) = \arctan2\left( (x_1 - y_1),\, (x_2 - y_2) \right)$
Differentiate with respect to the 1st state:
$\dfrac{\partial h_{14}(\mathbf{x})}{\partial x_1} = \dfrac{x_2 - y_2}{(x_1 - y_1)^2 + (x_2 - y_2)^2}$
Use the predicted states for the measurement matrix ($\mathbf{x} = \hat{\mathbf{x}}^-$):
$H_{14,1}(\hat{\mathbf{x}}^-) = \dfrac{\hat{x}_2^- - y_2}{(\hat{x}_1^- - y_1)^2 + (\hat{x}_2^- - y_2)^2}$
Simplifying, as $\hat{z}_1^- = \sqrt{(\hat{x}_1^- - y_1)^2 + (\hat{x}_2^- - y_2)^2}$:
$H_{14,1}(\hat{\mathbf{x}}^-) = \dfrac{\hat{x}_2^- - y_2}{(\hat{z}_1^-)^2}$
(See the Step-by-Step guide, General Advice, for help with the derivatives.)
See RVN Least-Squares Examples.xlsx on Moodle
RVN Lecture 1B – Introduction to Least Squares Estimation 50
3. Applying Least Squares to Nonlinear Problems
Example 3: Total Station Positioning (6)
Step 5: Solve
$\hat{\mathbf{x}}^+ \approx \hat{\mathbf{x}}^- + (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \mathbf{b}$
Step 6: Iterate as needed.
After a second iteration:
Point   Easting   Northing
B       10.05     7.11
C       6.52      9.88
D       6.67      6.72
E       6.72      5.36
F       8.43      4.39
See RVN Least-Squares Examples.xlsx on Moodle


RVN Lecture 1B – Introduction to Least Squares Estimation 51
Contents

1. Formulating the Problem
2. Linear Least-Squares Estimation
3. Applying Least Squares to Nonlinear Problems
4. Weighted Least-Squares Estimation
RVN Lecture 1B – Introduction to Least Squares Estimation 52


4. Weighted Least-Squares Estimation
Measurements of Varying Accuracy
Often, some measurements are more precise than others.

• Positioning accuracy from angular measurements depends on range
• GNSS accuracy can vary between signals

RVN Lecture 1B – Introduction to Least Squares Estimation 53


4. Weighted Least-Squares Estimation
Processing Measurements of Varying Accuracy
A simple straight-line example: equal weighting of measurements is not appropriate where some are much more precise than others.
The function z = h(x, y) should be closer to the more precise measurements.
Residuals should thus be larger for less precise measurements.
We therefore need to give higher weighting to more precise measurements. How do we do this?
[Figure: two precise measurements with small residuals and one imprecise measurement with a large residual about the fitted line]
RVN Lecture 1B – Introduction to Least Squares Estimation 54
4. Weighted Least-Squares Estimation
Mean and Variance
The measurement error is given by
$\varepsilon = \tilde{z} - z$
where $\tilde{z}$ is the measured value and z the true value (~ is called 'tilde').
Least-squares estimation assumes measurement errors are zero mean:
$\mathrm{E}(\varepsilon) = 0 \quad\Leftrightarrow\quad \mathrm{E}(\tilde{z}) = z$
where E is the expectation operator - it gives the mean value of an infinitely large sample.
The variance is then
$\sigma_z^2 = \mathrm{E}\left( \varepsilon^2 \right) = \mathrm{E}\left( (\tilde{z} - z)^2 \right)$
RVN Lecture 1B – Introduction to Least Squares Estimation 55
4. Weighted Least-Squares Estimation
Multiple Measurements
The variances are
$\sigma_{z1}^2 = \mathrm{E}\left( \varepsilon_1^2 \right) = \mathrm{E}\left( (\tilde{z}_1 - z_1)^2 \right)$
$\sigma_{z2}^2 = \mathrm{E}\left( \varepsilon_2^2 \right) = \mathrm{E}\left( (\tilde{z}_2 - z_2)^2 \right)$
$\quad\vdots$
$\sigma_{zm}^2 = \mathrm{E}\left( \varepsilon_m^2 \right) = \mathrm{E}\left( (\tilde{z}_m - z_m)^2 \right)$
Different measurements may have different variances, or the variances may be the same.
Error sources can affect multiple measurements, so we also need to consider covariance:
$C_{z\,ij} = \mathrm{E}\left( (\tilde{z}_i - z_i)(\tilde{z}_j - z_j) \right) = \sigma_{zi}\,\sigma_{zj}\,\rho_{ij}$
where $\sigma_{zi}$, $\sigma_{zj}$ are the measurement error standard deviations and $\rho_{ij}$ is the correlation coefficient, which varies between −1 (fully anticorrelated), 0 (uncorrelated) and +1 (fully correlated).
RVN Lecture 1B – Introduction to Least Squares Estimation 56
4. Weighted Least-Squares Estimation
Measurement Error Covariance Matrix
The measurement error covariance matrix is the expectation of the square of the measurement error vector. It comprises the variances and covariances of all of the measurement errors:
$\mathbf{C}_z = \mathrm{E}\left( \boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T \right) = \mathrm{E}\left[ (\tilde{\mathbf{z}} - \mathbf{z})(\tilde{\mathbf{z}} - \mathbf{z})^T \right] = \begin{pmatrix} \sigma_{z1}^2 & C_{z12} & \cdots & C_{z1m} \\ C_{z21} & \sigma_{z2}^2 & \cdots & C_{z2m} \\ \vdots & \vdots & \ddots & \vdots \\ C_{zm1} & C_{zm2} & \cdots & \sigma_{zm}^2 \end{pmatrix}$
where $\tilde{\mathbf{z}}$ is the vector of measured values and z the vector of true values.
The diagonal elements are variances; the off-diagonal elements are covariances.
Covariance matrices are symmetric: $\mathbf{C}_z^T = \mathbf{C}_z$
This is sometimes called the stochastic model of the measurements.
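A small sketch (not from the slides) of building such a matrix from standard deviations and correlation coefficients; the numbers are arbitrary placeholders.

```python
import numpy as np

# Arbitrary placeholder values: three measurement error standard deviations
sigma = np.array([0.1, 0.1, 0.5])

# Correlation coefficients between the measurement errors (symmetric, ones on the diagonal)
rho = np.array([[1.0, 0.3, 0.0],
                [0.3, 1.0, 0.0],
                [0.0, 0.0, 1.0]])

# C_z[i, j] = sigma_i * sigma_j * rho_ij, so the diagonal holds the variances
C_z = np.outer(sigma, sigma) * rho
```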
RVN Lecture 1B – Introduction to Least Squares Estimation 57
4. Weighted Least-Squares Estimation
Introducing Weighted Least-Squares (1)
The weighted residual is the ratio of the residual, v, to the measurement error standard deviation, $\sigma_z$: the ith weighted residual is $v_i / \sigma_{zi}$, where $\sigma_{zi}^2 = \mathrm{E}(\varepsilon_i^2) = \mathrm{E}\left( (\tilde{z}_i - z_i)^2 \right)$.
Where measurement errors are independent, minimising the sum of squares of the weighted residuals, not the raw residuals, gives higher weighting to more precise measurements.
In general, we minimise $\mathbf{v}^T \mathbf{C}_z^{-1} \mathbf{v}$; for independent errors,
$\mathbf{v}^T \mathbf{C}_z^{-1} \mathbf{v} = \sum_i \dfrac{v_i^2}{\sigma_{zi}^2}$   where   $\mathbf{C}_z = \begin{pmatrix} \sigma_{z1}^2 & 0 & \cdots & 0 \\ 0 & \sigma_{z2}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{zm}^2 \end{pmatrix}$
RVN Lecture 1B – Introduction to Least Squares Estimation 58
4. Weighted Least-Squares Estimation
Introducing Weighted Least-Squares (2)
In general, measurement errors are not independent:
$\mathbf{C}_z = \begin{pmatrix} \sigma_{z1}^2 & C_{z12} & \cdots & C_{z1m} \\ C_{z21} & \sigma_{z2}^2 & \cdots & C_{z2m} \\ \vdots & \vdots & \ddots & \vdots \\ C_{zm1} & C_{zm2} & \cdots & \sigma_{zm}^2 \end{pmatrix}$   (the stochastic model; $\mathbf{C}_z^T = \mathbf{C}_z$)
By minimising $\mathbf{v}^T \mathbf{C}_z^{-1} \mathbf{v}$, errors correlated across different measurements are accounted for.
RVN Lecture 1B – Introduction to Least Squares Estimation 59


4. Weighted Least-Squares Estimation
Linear Weighted Least-Squares Solution
The derivation is similar to unweighted least-squares.
We solve for $\hat{\mathbf{x}}^+$ and v:   $\tilde{\mathbf{z}} + \mathbf{v} = \mathbf{H}(\mathbf{y})\,\hat{\mathbf{x}}^+$
Constraint: minimise $\mathbf{v}^T \mathbf{C}_z^{-1} \mathbf{v}$
Solution:
$\hat{\mathbf{x}}^+ = \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \mathbf{H}^T \mathbf{C}_z^{-1} \tilde{\mathbf{z}}$
(Unweighted solution for comparison: $\hat{\mathbf{x}}^+ = (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \tilde{\mathbf{z}}$)
Residuals:
$\mathbf{v} = \mathbf{H}\hat{\mathbf{x}}^+ - \tilde{\mathbf{z}} = \left( \mathbf{H} \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \mathbf{H}^T \mathbf{C}_z^{-1} - \mathbf{I} \right) \tilde{\mathbf{z}}$
See Derivation 2 on Moodle
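A minimal sketch of evaluating this weighted solution in NumPy (not from the slides); H, z and C_z are placeholder values, and real code would prefer a linear solver over explicit inverses.

```python
import numpy as np

# Placeholder problem: 3 measurements of 2 states, the third measurement being much less precise
H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
z = np.array([3.0, 7.2, 10.0])
C_z = np.diag([0.01, 0.01, 1.0])     # measurement error covariance (independent errors)

W = np.linalg.inv(C_z)               # weight matrix C_z^-1
x_hat = np.linalg.solve(H.T @ W @ H, H.T @ W @ z)   # (H^T Cz^-1 H)^-1 H^T Cz^-1 z
v = H @ x_hat - z                    # residuals: largest for the imprecise measurement
```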


RVN Lecture 1B – Introduction to Least Squares Estimation 60
4. Weighted Least-Squares Estimation
Nonlinear Weighted Least-Squares Solution
The same derivation applies.
We solve for $\delta\mathbf{x}$ and v:   $\mathbf{b} + \mathbf{v} \approx \mathbf{H}(\hat{\mathbf{x}}^-, \mathbf{y})\,\delta\mathbf{x}$
where $\mathbf{b} = \tilde{\mathbf{z}} - \mathbf{h}(\hat{\mathbf{x}}^-, \mathbf{y})$,   $\mathbf{H}(\hat{\mathbf{x}}^-, \mathbf{y}) = \left.\dfrac{\partial \mathbf{h}(\mathbf{x}, \mathbf{y})}{\partial \mathbf{x}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^-}$,   $\delta\mathbf{x} = \hat{\mathbf{x}}^+ - \hat{\mathbf{x}}^-$
Constraint: minimise $\mathbf{v}^T \mathbf{C}_z^{-1} \mathbf{v}$
Solution:
$\hat{\mathbf{x}}^+ \approx \hat{\mathbf{x}}^- + \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{b}$
Iterate where necessary.
(Unweighted solution for comparison: $\hat{\mathbf{x}}^+ \approx \hat{\mathbf{x}}^- + (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \mathbf{b}$)
Residuals:
$\mathbf{v} \approx \mathbf{H}\,\delta\mathbf{x} - \mathbf{b} = \left( \mathbf{H} \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \mathbf{H}^T \mathbf{C}_z^{-1} - \mathbf{I} \right) \mathbf{b}$
See Derivation 2 on Moodle
RVN Lecture 1B – Introduction to Least Squares Estimation 61
4. Weighted Least-Squares Estimation
How Accurate Are the State Estimates?
The state estimation error is given by
$\mathbf{e} = \hat{\mathbf{x}}^+ - \mathbf{x}$
where $\hat{\mathbf{x}}^+$ is the estimated value and x the true value (^ is called 'caret').
Least-squares estimation assumes state estimation errors are zero mean:
$\mathrm{E}(\mathbf{e}) = 0 \quad\Leftrightarrow\quad \mathrm{E}(\hat{\mathbf{x}}^+) = \mathbf{x}$
where E is the expectation operator - it gives the mean value of an infinitely large sample.
The variance (of a single state estimate) is then
$\sigma_x^2 = \mathrm{E}\left( e^2 \right) = \mathrm{E}\left( (\hat{x}^+ - x)^2 \right)$
RVN Lecture 1B – Introduction to Least Squares Estimation 62
4. Weighted Least-Squares Estimation
Multiple States
The variances are
$\sigma_{x1}^2 = \mathrm{E}\left( e_1^2 \right) = \mathrm{E}\left( (\hat{x}_1^+ - x_1)^2 \right)$
$\sigma_{x2}^2 = \mathrm{E}\left( e_2^2 \right) = \mathrm{E}\left( (\hat{x}_2^+ - x_2)^2 \right)$
$\quad\vdots$
$\sigma_{xn}^2 = \mathrm{E}\left( e_n^2 \right) = \mathrm{E}\left( (\hat{x}_n^+ - x_n)^2 \right)$
Different state estimates will usually have different variances.
Error sources can affect multiple measurements, so we also need to consider covariance:
$C_{x\,ij} = \mathrm{E}\left( (\hat{x}_i^+ - x_i)(\hat{x}_j^+ - x_j) \right) = \sigma_{xi}\,\sigma_{xj}\,\rho_{ij}$
where $\sigma_{xi}$, $\sigma_{xj}$ are the state estimation error standard deviations and $\rho_{ij}$ is the correlation coefficient, which will be non-zero unless the state estimation can be partitioned into separate problems.
RVN Lecture 1B – Introduction to Least Squares Estimation 63
4. Weighted Least-Squares Estimation
State Estimation Error Covariance Matrix
The state estimation error covariance matrix is the expectation of the square of the error in the state vector. It comprises the variances and covariances of all of the state estimation errors:
$\mathbf{C}_x = \mathrm{E}\left( \mathbf{e}\mathbf{e}^T \right) = \mathrm{E}\left[ (\hat{\mathbf{x}}^+ - \mathbf{x})(\hat{\mathbf{x}}^+ - \mathbf{x})^T \right] = \begin{pmatrix} \sigma_{x1}^2 & C_{x12} & \cdots & C_{x1n} \\ C_{x21} & \sigma_{x2}^2 & \cdots & C_{x2n} \\ \vdots & \vdots & \ddots & \vdots \\ C_{xn1} & C_{xn2} & \cdots & \sigma_{xn}^2 \end{pmatrix}$
where $\hat{\mathbf{x}}^+$ is the vector of estimated values and x the vector of true values.
The diagonal elements are variances; the off-diagonal elements are covariances.
Covariance matrices are symmetric: $\mathbf{C}_x^T = \mathbf{C}_x$
RVN Lecture 1B – Introduction to Least Squares Estimation 64
4. Weighted Least-Squares Estimation
State Estimation Error Covariance
Weighted linear least-squares solution:   $\hat{\mathbf{x}}^+ = \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \mathbf{H}^T \mathbf{C}_z^{-1} \tilde{\mathbf{z}}$
Weighted nonlinear solution:   $\hat{\mathbf{x}}^+ \approx \hat{\mathbf{x}}^- + \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{b}$
In both cases, we can express the state estimation error, e, as a function of the measurement error, $\boldsymbol{\varepsilon}$, using
$\mathbf{e} = \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \mathbf{H}^T \mathbf{C}_z^{-1} \boldsymbol{\varepsilon}$
The state estimation error covariance is therefore
$\mathbf{C}_x = \mathrm{E}\left( \mathbf{e}\mathbf{e}^T \right) = \mathrm{E}\left[ \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \mathbf{H}^T \mathbf{C}_z^{-1} \boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T \mathbf{C}_z^{-1} \mathbf{H} \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \right]$
$\mathbf{C}_x = \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \mathbf{H}^T \mathbf{C}_z^{-1} \, \mathrm{E}\left( \boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T \right) \mathbf{C}_z^{-1} \mathbf{H} \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1}$
Substituting $\mathrm{E}\left( \boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T \right) = \mathbf{C}_z$:
$\mathbf{C}_x = \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{C}_z \mathbf{C}_z^{-1} \mathbf{H} \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} = \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1} \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right) \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1}$
$\mathbf{C}_x = \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1}$
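A small sketch (not from the slides) of computing the state estimation error covariance for the placeholder weighted problem used earlier; the square roots of its diagonal give the state uncertainties.

```python
import numpy as np

# Placeholder H and C_z (same shapes as the weighted example above)
H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
C_z = np.diag([0.01, 0.01, 1.0])

# C_x = (H^T Cz^-1 H)^-1
C_x = np.linalg.inv(H.T @ np.linalg.inv(C_z) @ H)

state_std = np.sqrt(np.diag(C_x))   # 1-sigma uncertainty of each state estimate
```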
RVN Lecture 1B – Introduction to Least Squares Estimation 65
4. Weighted Least-Squares Estimation
Example 4: Total Station Positioning (1)
Building on Example 3:
A total station measures 13 ranges between 6 points.
The coordinates of point A are known; the coordinates of the other 5 points are to be determined.
The bearing of A to B (with respect to north) is also measured.
[Figure: the six survey points A, B, C, D, E and F]
States to estimate, x: E & N coordinates of B, C, D, E & F (10 parameters)
Known parameters, y: E & N coordinates of A (2 parameters)
Measurements, z: 13 ranges and one bearing
See RVN Least-Squares Examples.xlsx on Moodle
RVN Lecture 1B – Introduction to Least Squares Estimation 66
4. Weighted Least-Squares Estimation
Example 4: Total Station Positioning (2)
We now have the measurement error standard deviation information:
• Ranging measurements: 0.1 m
• Bearing measurement: 0.5° = 8.72×10⁻³ rad
• All measurements are independent
$\mathbf{C}_z = \begin{pmatrix} 0.01 & 0 & \cdots & 0 & 0 \\ 0 & 0.01 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 0.01 & 0 \\ 0 & 0 & \cdots & 0 & 7.62 \times 10^{-5} \end{pmatrix}$
In RVN Least-Squares Examples.xlsx on Moodle, a weighted least-squares solution is calculated.
This is the same as the unweighted solution because the bearing measurement is essential for obtaining a unique solution.
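A one-line sketch (not from the spreadsheet) of the diagonal C_z above: 13 range variances followed by the bearing variance.

```python
import numpy as np

# 13 ranges (sigma = 0.1 m) and one bearing (sigma = 8.72e-3 rad), all independent
sigmas = np.array([0.1] * 13 + [8.72e-3])
C_z = np.diag(sigmas ** 2)    # diagonal entries: 0.01, ..., 0.01, 7.6e-5
```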
RVN Lecture 1B – Introduction to Least Squares Estimation 67
4. Weighted Least-Squares Estimation
Example 4: Total Station Positioning (3)
Calculating the uncertainty of the state estimates using $\mathbf{C}_x = \left( \mathbf{H}^T \mathbf{C}_z^{-1} \mathbf{H} \right)^{-1}$ gives:
State   Uncertainty     State   Uncertainty
$E_B$   0.025           $N_B$   0.086
$E_C$   0.090           $N_C$   0.133
$E_D$   0.100           $N_D$   0.130
$E_E$   0.140           $N_E$   0.140
$E_F$   0.197           $N_F$   0.089
The east coordinate of B is more accurate than the others due to the angle measurement precision.
Details in RVN Least-Squares Examples.xlsx on Moodle
RVN Lecture 1B – Introduction to Least Squares Estimation 68
