
Least Squares Fitting of Data by Linear or Quadratic Structures
David Eberly, Geometric Tools, Redmond WA 98052
https://www.geometrictools.com/
This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy
of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons,
PO Box 1866, Mountain View, CA 94042, USA.
Created: July 15, 1999
Last Modified: September 7, 2021

Contents

1 Introduction
2 The General Formulation for Nonlinear Least-Squares Fitting
3 Affine Fitting of Points Using Height Fields
   3.1 Fitting by a Line in 2 Dimensions
       3.1.1 Pseudocode for Fitting by a Line
   3.2 Fitting by a Plane in 3 Dimensions
       3.2.1 Pseudocode for Fitting by a Plane
   3.3 Fitting by a Hyperplane in n + 1 Dimensions
       3.3.1 Pseudocode for Fitting a Hyperplane
4 Affine Fitting of Points Using Orthogonal Regression
   4.1 Fitting by a Line [1 Dimension]
       4.1.1 Pseudocode for the General Case
   4.2 Fitting by a Hyperplane [(n − 1) Dimensions]
       4.2.1 Pseudocode for the General Case
   4.3 Fitting by a Flat [k Dimensions]
       4.3.1 Pseudocode for the General Case
5 Fitting a Hypersphere to Points
   5.1 Fitting Using Differences of Lengths and Radius
       5.1.1 Pseudocode for the General Case
   5.2 Fitting Using Differences of Squared Lengths and Squared Radius
       5.2.1 Pseudocode for the General Case
       5.2.2 Pseudocode for Circles
       5.2.3 Pseudocode for Spheres
   5.3 Fitting the Coefficients of a Quadratic Equation
       5.3.1 Pseudocode for the General Case
       5.3.2 Pseudocode for Circles
       5.3.3 Pseudocode for Spheres
6 Fitting a Hyperellipsoid to Points
   6.1 Updating the Estimate of the Center
   6.2 Updating the Estimate of the Matrix
   6.3 Pseudocode for the Algorithm
7 Fitting a Cylinder to 3D Points
   7.1 Representation of a Cylinder
   7.2 The Least-Squares Error Function
   7.3 An Equation for the Radius
   7.4 An Equation for the Center
   7.5 An Equation for the Direction
   7.6 Fitting for a Specified Direction
   7.7 Pseudocode and Experiments
   7.8 Fitting a Cylinder to a Triangle Mesh
8 Fitting a Cone to 3D Points
   8.1 Initial Choice for the Parameters of the Error Function
       8.1.1 Simple Attempt to Reconstruct Height Extremes
       8.1.2 Attempt to Reconstruct the Cone Axis Direction
       8.1.3 Attempt to Reconstruct the Cone Vertex
9 Fitting a Paraboloid to 3D Points of the Form (x, y, f(x, y))
1 Introduction

This document describes least-squares minimization algorithms for fitting point sets by linear structures
or quadratic structures. The organization is somewhat different from that of the previous version of the
document. Modifications include the following.

• A section on the general formulation for nonlinear least-squares fitting is now available. The standard approach is to estimate parameters using numerical minimizers (Gauss–Newton or Levenberg–Marquardt).

• A new algorithm for fitting points by a circle, sphere or hypersphere is provided. The algorithm is non-iterative, so the computation time is bounded and small.

• In the previous version, the sections about fitting of points by ellipses or ellipsoids severely lacked details and were not useful for developing algorithms. Several algorithms are now provided for such fitting, including a general approach for fitting points by hyperellipsoids.

• The document for fitting points by a cylinder has been moved to this document. The website hyperlink to the cylinder document has been redirected to this document.

• A section has been added for fitting points by a single-sided cone.

• Pseudocode is now provided for each of the algorithms. Hyperlinks still exist for those algorithms implemented in the GTE source code.

Other documents using least-squares algorithms for fitting points with curve or surface structures are available at the website. The document for fitting points with a torus is new to the website (as of August 2018).

• Least-Squares Fitting of Data with Polynomials
• Least-Squares Fitting of Data with B-Spline Curves
• Least-Squares Reduction of B-Spline Curves
• Fitting 3D Data with a Helix
• Least-Squares Fitting of Data with B-Spline Surfaces
• Fitting 3D Data with a Torus

The document Least-Squares Fitting of Segments by Line or Plane describes a least-squares algorithm where the input is a set of line segments rather than a set of points. The output is a line (segments in 2 dimensions), a plane (segments in 3 dimensions) or a hyperplane (segments in n dimensions).

2 The General Formulation for Nonlinear Least-Squares Fitting

Let F(p) = (F_0(p), F_1(p), ..., F_{n−1}(p)) be a vector-valued function of the parameters p = (p_0, p_1, ..., p_{m−1}). The nonlinear least-squares problem is to minimize the real-valued error function E(p) = |F(p)|^2.

Let J = dF/dp = [dF_r/dp_c] denote the Jacobian matrix, which is the matrix of first-order partial derivatives of the components of F. The matrix has n rows and m columns, and the indexing (r, c) refers to row r and column c. A first-order approximation is

F(p + d) ≈ F(p) + J(p) d    (1)

where d is an m × 1 vector with small length. Consequently, an approximation to the error function is

E(p + d) = |F(p + d)|^2 ≈ |F(p) + J(p) d|^2    (2)
The goal is to choose d to minimize |F(p) + J(p)d|^2 and, hopefully, with E(p + d) < E(p). Choosing an initial p_0, the hope is that the algorithm generates a sequence p_i for which E(p_{i+1}) < E(p_i) and, in the limit, E(p_j) approaches the global minimum of E. The algorithm is referred to as Gauss–Newton iteration.

For a single Gauss–Newton iteration, we need to choose d to minimize |F(p) + J(p)d|^2 where p is fixed. This is a linear least-squares problem which can be formulated using the normal equations

J^T(p) J(p) d = −J^T(p) F(p)    (3)

The matrix J^T J is positive semidefinite. If it is invertible, then

d = −(J^T(p) J(p))^{−1} J^T(p) F(p)    (4)

If it is not invertible, some other algorithm must be used to choose d; one option is to use gradient descent for the step. A Cholesky decomposition can be used to solve the linear system.
During Gauss–Newton iteration, if E does not decrease for a step of the algorithm, one can modify the algorithm to Levenberg–Marquardt iteration. The idea is to smooth the linear system to

(J^T(p) J(p) + λI) d = −J^T(p) F(p)    (5)

where I is the identity matrix of appropriate size and λ > 0 is the smoothing factor. The strategy for choosing the initial λ and how to adjust it as you compute iterations depends on the problem at hand.
For a more detailed discussion, see Gauss–Newton algorithm and Levenberg–Marquardt algorithm. Implementations of the Cholesky decomposition, Gauss–Newton method and Levenberg–Marquardt method in GTE can be found in CholeskyDecomposition.h, GaussNewtonMinimizer.h and LevenbergMarquardtMinimizer.h.
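
To make the iteration concrete, the following is a minimal sketch of Gauss–Newton iteration with Levenberg–Marquardt damping, written in the pseudocode conventions of the listings later in this document. The callbacks F(p) and J(p), the function Solve and the λ-update rule are illustrative assumptions for this sketch, not the GTE interface.

Vector<m> MinimizeLeastSquares(Vector<m> p, Real lambda, int maxIterations)
{
    for (int k = 0; k < maxIterations; ++k)
    {
        Vector<n> Fp = F(p);     // n x 1 vector of residuals at p
        Matrix<n, m> Jp = J(p);  // n x m Jacobian matrix at p
        Matrix<m, m> lhs = Transpose(Jp) * Jp + lambda * Matrix<m, m>::IDENTITY;
        Vector<m> rhs = -(Transpose(Jp) * Fp);
        Vector<m> d;
        if (!Solve(lhs, rhs, d))  // solve (J^T J + lambda I) d = -J^T F, e.g. by Cholesky
        {
            break;
        }
        if (Dot(F(p + d), F(p + d)) < Dot(Fp, Fp))
        {
            p += d;        // the step decreased E(p); accept it
            lambda /= 10;  // and reduce the smoothing
        }
        else
        {
            lambda *= 10;  // the step failed; increase the smoothing and retry
        }
    }
    return p;
}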

3 Affine Fitting of Points Using Height Fields

We have a set of measurements {(X_i, h_i)}_{i=1}^m for which X_i ∈ R^n are sampled independent variables and h_i ∈ R is a sampled dependent variable. The hypothesis is that h is related to X via an affine transformation h = A·X + b, where A is an n × 1 vector of constants and b is a scalar constant. The goal is to estimate A and b from the samples. The choice of name h stresses that the measurement errors are in the direction of height above the plane containing the X measurements.

3.1 Fitting by a Line in 2 Dimensions

The measurements are {(x_i, h_i)}_{i=1}^m where x is an independent variable and h is a dependent variable. The affine transformation we want to estimate is h = ax + b, where a and b are scalars. This defines a line that best fits the samples in the sense that the sum of the squared errors between the h_i and the line values a x_i + b is minimized. Note that the error is measured only in the h-direction.
Define the error function for the least-squares minimization to be

E(a, b) = ∑_{i=1}^m [(a x_i + b) − h_i]^2    (6)

This function is nonnegative and its graph is a paraboloid whose vertex occurs when the gradient satisfies ∇E(a, b) = (∂E/∂a, ∂E/∂b) = (0, 0). This leads to a system of two linear equations in a and b which can be easily solved. Precisely,

0 = ∂E/∂a = 2 ∑_{i=1}^m [(a x_i + b) − h_i] x_i
0 = ∂E/∂b = 2 ∑_{i=1}^m [(a x_i + b) − h_i]    (7)

and so

$$\begin{bmatrix} \sum_{i=1}^m x_i^2 & \sum_{i=1}^m x_i \\ \sum_{i=1}^m x_i & \sum_{i=1}^m 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^m x_i h_i \\ \sum_{i=1}^m h_i \end{bmatrix} \quad (8)$$

The system is solved by standard numerical algorithms. If implemented directly, this formulation can lead to an ill-conditioned linear system. To avoid this, you should first compute the averages x̄ = (∑_{i=1}^m x_i)/m and h̄ = (∑_{i=1}^m h_i)/m and subtract them from the data. The fitted line is of the form h − h̄ = ā(x − x̄) + b̄. The linear system of equations that determines the coefficients is

$$\begin{bmatrix} \sum_{i=1}^m (x_i - \bar{x})^2 & 0 \\ 0 & m \end{bmatrix} \begin{bmatrix} \bar{a} \\ \bar{b} \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^m (x_i - \bar{x})(h_i - \bar{h}) \\ 0 \end{bmatrix} \quad (9)$$

and has solution

$$\bar{a} = \frac{\sum_{i=1}^m (x_i - \bar{x})(h_i - \bar{h})}{\sum_{i=1}^m (x_i - \bar{x})^2}, \quad \bar{b} = 0 \quad (10)$$

In terms of the original inputs, a = ā and b = h̄ − āx̄.

3.1.1 Pseudocode for Fitting by a Line

Listing 1 contains pseudocode for fitting a height line to points in 2 dimensions.

Listing 1. Pseudocode for fitting a height line to points in 2 dimensions. The number of input points must be at least 2. The returned Boolean value is true as long as the denominator of equation (10) is positive; that is, when the points are not all the same point. An implementation in a slightly more general framework is ApprHeightLine2.h.
bool FitHeightLine(int numPoints, Vector2 points[],
    Real& barX, Real& barH, Real& barA)
{
    // Compute the mean of the points.
    Vector2 mean = { 0, 0 };
    for (int i = 0; i < numPoints; ++i)
    {
        mean += points[i];
    }
    mean /= numPoints;

    // Compute the linear system matrix and vector elements.
    Real xxSum = 0, xhSum = 0;
    for (int i = 0; i < numPoints; ++i)
    {
        Vector2 diff = points[i] - mean;
        xxSum += diff[0] * diff[0];
        xhSum += diff[0] * diff[1];
    }

    // Solve the linear system.
    if (xxSum > 0)
    {
        // Compute the fitted line h(x) = barH + barA * (x - barX).
        barX = mean[0];
        barH = mean[1];
        barA = xhSum / xxSum;
        return true;
    }
    else
    {
        // The output is invalid.  The points are all the same.
        barX = 0;
        barH = 0;
        barA = 0;
        return false;
    }
}
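
A minimal usage sketch of Listing 1 follows. The three sample points are hypothetical, chosen to lie near the line h = 2x + 1.

// Hypothetical usage of Listing 1.  The sample points are illustrative only.
Vector2 points[3] = { { 0, 1.1 }, { 1, 2.9 }, { 2, 5.0 } };
Real barX, barH, barA;
if (FitHeightLine(3, points, barX, barH, barA))
{
    // The fitted line is h(x) = barH + barA * (x - barX).  In the original
    // form h = a * x + b, the coefficients are a = barA and b = barH - barA * barX.
    Real a = barA, b = barH - barA * barX;
}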

3.2 Fitting by a Plane in 3 Dimensions

The measurements are {(x_i, y_i, h_i)}_{i=1}^m where x and y are independent variables and h is a dependent variable. The affine transformation we want to estimate is h = a_0 x + a_1 y + b, where a_0, a_1 and b are scalars. This defines a plane that best fits the samples in the sense that the sum of the squared errors between the h_i and the plane values a_0 x_i + a_1 y_i + b is minimized. Note that the error is measured only in the h-direction.
Define the error function for the least-squares minimization to be

E(a_0, a_1, b) = ∑_{i=1}^m [(a_0 x_i + a_1 y_i + b) − h_i]^2    (11)

This function is nonnegative and its graph is a hyperparaboloid whose vertex occurs when the gradient satisfies ∇E(a_0, a_1, b) = (∂E/∂a_0, ∂E/∂a_1, ∂E/∂b) = (0, 0, 0). This leads to a system of three linear equations in a_0, a_1 and b which can be easily solved. Precisely,

0 = ∂E/∂a_0 = 2 ∑_{i=1}^m [(a_0 x_i + a_1 y_i + b) − h_i] x_i
0 = ∂E/∂a_1 = 2 ∑_{i=1}^m [(a_0 x_i + a_1 y_i + b) − h_i] y_i
0 = ∂E/∂b = 2 ∑_{i=1}^m [(a_0 x_i + a_1 y_i + b) − h_i]    (12)

and so

$$\begin{bmatrix} \sum_{i=1}^m x_i^2 & \sum_{i=1}^m x_i y_i & \sum_{i=1}^m x_i \\ \sum_{i=1}^m x_i y_i & \sum_{i=1}^m y_i^2 & \sum_{i=1}^m y_i \\ \sum_{i=1}^m x_i & \sum_{i=1}^m y_i & \sum_{i=1}^m 1 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ b \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^m x_i h_i \\ \sum_{i=1}^m y_i h_i \\ \sum_{i=1}^m h_i \end{bmatrix} \quad (13)$$

The system is solved by standard numerical algorithms. If implemented directly, this formulation can lead to an ill-conditioned linear system. To avoid this, you should first compute the averages x̄ = (∑_{i=1}^m x_i)/m, ȳ = (∑_{i=1}^m y_i)/m and h̄ = (∑_{i=1}^m h_i)/m and subtract them from the data. The fitted plane is of the form h − h̄ = ā_0(x − x̄) + ā_1(y − ȳ) + b̄. The linear system of equations that determines the coefficients is

$$\begin{bmatrix} \ell_{00} & \ell_{01} & 0 \\ \ell_{01} & \ell_{11} & 0 \\ 0 & 0 & m \end{bmatrix} \begin{bmatrix} \bar{a}_0 \\ \bar{a}_1 \\ \bar{b} \end{bmatrix} = \begin{bmatrix} r_0 \\ r_1 \\ 0 \end{bmatrix} \quad (14)$$

where ℓ_{00} = ∑_{i=1}^m (x_i − x̄)^2, ℓ_{01} = ∑_{i=1}^m (x_i − x̄)(y_i − ȳ), ℓ_{11} = ∑_{i=1}^m (y_i − ȳ)^2, r_0 = ∑_{i=1}^m (h_i − h̄)(x_i − x̄) and r_1 = ∑_{i=1}^m (h_i − h̄)(y_i − ȳ),

and has solution

$$\bar{a}_0 = \frac{\ell_{11} r_0 - \ell_{01} r_1}{\ell_{00} \ell_{11} - \ell_{01}^2}, \quad \bar{a}_1 = \frac{\ell_{00} r_1 - \ell_{01} r_0}{\ell_{00} \ell_{11} - \ell_{01}^2}, \quad \bar{b} = 0 \quad (15)$$

In terms of the original inputs, a_0 = ā_0, a_1 = ā_1 and b = h̄ − ā_0 x̄ − ā_1 ȳ.

3.2.1 Pseudocode for Fitting by a Plane

Listing 2 contains pseudocode for fitting a height plane to points in 3 dimensions.

Listing 2. Pseudocode for fitting a height plane to points in 3 dimensions. The number of input points
must be at least 3. The returned Boolean value is true as long as the matrix of the linear system has nonzero
determinant. An implementation in a slightly more general framework is ApprHeightPlane3.h.
bool FitHeightPlane(int numPoints, Vector3 points[],
    Real& barX, Real& barY, Real& barH, Real& barA0, Real& barA1)
{
    // Compute the mean of the points.
    Vector3 mean = { 0, 0, 0 };
    for (int i = 0; i < numPoints; ++i)
    {
        mean += points[i];
    }
    mean /= numPoints;

    // Compute the linear system matrix and vector elements.
    Real xxSum = 0, xySum = 0, xhSum = 0, yySum = 0, yhSum = 0;
    for (int i = 0; i < numPoints; ++i)
    {
        Vector3 diff = points[i] - mean;
        xxSum += diff[0] * diff[0];
        xySum += diff[0] * diff[1];
        xhSum += diff[0] * diff[2];
        yySum += diff[1] * diff[1];
        yhSum += diff[1] * diff[2];
    }

    // Solve the linear system.
    Real det = xxSum * yySum - xySum * xySum;
    if (det != 0)
    {
        // Compute the fitted plane h(x,y) = barH + barA0 * (x - barX) + barA1 * (y - barY).
        barX = mean[0];
        barY = mean[1];
        barH = mean[2];
        barA0 = (yySum * xhSum - xySum * yhSum) / det;
        barA1 = (xxSum * yhSum - xySum * xhSum) / det;
        return true;
    }
    else
    {
        // The output is invalid.  The points are all the same or they are collinear.
        barX = 0;
        barY = 0;
        barH = 0;
        barA0 = 0;
        barA1 = 0;
        return false;
    }
}

3.3 Fitting by a Hyperplane in n + 1 Dimensions

The measurements are {(X_i, h_i)}_{i=1}^m where the n components of X_i are independent variables and h_i is a dependent variable. The affine transformation we want to estimate is h = A·X + b, where A is an n × 1 vector of constants and b is a scalar constant. This defines a hyperplane that best fits the samples in the sense that the sum of the squared errors between the h_i and the hyperplane values A·X_i + b is minimized. Note that the error is measured only in the h-direction.
Define the error function for the least-squares minimization to be

E(A, b) = ∑_{i=1}^m [(A·X_i + b) − h_i]^2    (16)

This function is nonnegative and its graph is a hyperparaboloid whose vertex occurs when the gradient satisfies ∇E(A, b) = (∂E/∂A, ∂E/∂b) = (0, 0). This leads to a system of n + 1 linear equations in A and b which can be easily solved. Precisely,

0 = ∂E/∂A = 2 ∑_{i=1}^m [(A·X_i + b) − h_i] X_i
0 = ∂E/∂b = 2 ∑_{i=1}^m [(A·X_i + b) − h_i]    (17)

and so

$$\begin{bmatrix} \sum_{i=1}^m X_i X_i^T & \sum_{i=1}^m X_i \\ \sum_{i=1}^m X_i^T & \sum_{i=1}^m 1 \end{bmatrix} \begin{bmatrix} A \\ b \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^m h_i X_i \\ \sum_{i=1}^m h_i \end{bmatrix} \quad (18)$$

The system is solved by standard numerical algorithms. If implemented directly, this formulation can lead to an ill-conditioned linear system. To avoid this, you should first compute the averages X̄ = (∑_{i=1}^m X_i)/m and h̄ = (∑_{i=1}^m h_i)/m and subtract them from the data. The fitted hyperplane is of the form h − h̄ = Ā·(X − X̄) + b̄. The linear system of equations that determines the coefficients is

$$\begin{bmatrix} \sum_{i=1}^m (X_i - \bar{X})(X_i - \bar{X})^T & 0 \\ 0^T & m \end{bmatrix} \begin{bmatrix} \bar{A} \\ \bar{b} \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^m (h_i - \bar{h})(X_i - \bar{X}) \\ 0 \end{bmatrix} \quad (19)$$

and has solution

$$\bar{A} = \left( \sum_{i=1}^m (X_i - \bar{X})(X_i - \bar{X})^T \right)^{-1} \left( \sum_{i=1}^m (h_i - \bar{h})(X_i - \bar{X}) \right), \quad \bar{b} = 0 \quad (20)$$

In terms of the original inputs, A = Ā and b = h̄ − Ā · X̄.

3.3.1 Pseudocode for Fitting a Hyperplane

Listing 3 contains pseudocode for fitting a height hyperplane to points in n + 1 dimensions.

Listing 3. Pseudocode for fitting a height hyperplane to points in n + 1 dimensions. The number of input points must be at least n + 1. The returned Boolean value is true as long as the matrix of the linear system has nonzero determinant.
bool FitHeightHyperplane(int numPoints, Vector<n + 1> points[],
    Vector<n>& barX, Real& barH, Vector<n>& barA)
{
    // Compute the mean of the points.
    Vector<n + 1> mean = Vector<n + 1>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        mean += points[i];
    }
    mean /= numPoints;

    // Compute the linear system matrix and vector elements.  The function
    // Vector<n> Head<n>(Vector<n + 1> V) returns (V[0], ..., V[n - 1]).
    Matrix<n, n> L = Matrix<n, n>::ZERO;
    Vector<n> R = Vector<n>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        Vector<n + 1> diff = points[i] - mean;
        Vector<n> XminusBarX = Head<n>(diff);
        Real HminusBarH = diff[n];
        L += OuterProduct(XminusBarX, XminusBarX);  // (X[i] - barX) * (X[i] - barX)^T
        R += HminusBarH * XminusBarX;
    }

    // Solve the linear system.
    Real det = Determinant(L);
    if (det != 0)
    {
        // Compute the fitted hyperplane h(X) = barH + Dot(barA, X - barX).
        barX = Head<n>(mean);
        barH = mean[n];
        barA = SolveLinearSystem(L, R);  // solve L * A = R
        return true;
    }
    else
    {
        // The output is invalid.  The points appear not to live on a
        // hyperplane; they might live in an affine subspace of dimension
        // smaller than n.
        barX = Vector<n>::ZERO;
        barH = 0;
        barA = Vector<n>::ZERO;
        return false;
    }
}

4 Affine Fitting of Points Using Orthogonal Regression

We have a set of measurements {X_i}_{i=1}^m for which X_i ∈ R^n are sampled independent variables. The hypothesis is that the points are sampled from a k-dimensional affine subspace in n-dimensional space. Such a space is referred to as a k-dimensional flat. The classic cases include fitting a line to points in n dimensions and fitting a plane to points in 3 dimensions. The latter is a special case of fitting a hyperplane, an (n − 1)-dimensional flat, to points in n dimensions.
In the height-field fitting algorithms, the least-squares errors were measured in a specified direction (the height direction). An alternative is to measure the errors in the direction perpendicular to the purported affine subspace. This approach is referred to as orthogonal regression.

4.1 Fitting by a Line [1 Dimension]

The algorithm may be applied to sample points {X_i}_{i=1}^m in any dimension n. Let the line have origin A and unit-length direction D, both n × 1 vectors. Define Y_i = X_i − A, which can be written as Y_i = (D·Y_i)D + D_i^⊥, where D_i^⊥ is the perpendicular vector from X_i to its projection on the line. The squared length of this vector is |D_i^⊥|^2 = |Y_i − (D·Y_i)D|^2. The error function for the least-squares minimization is E(A, D) = ∑_{i=1}^m |D_i^⊥|^2. Two alternate forms for this function are

E(A, D) = ∑_{i=1}^m Y_i^T (I − D D^T) Y_i    (21)

and

$$E(A, D) = D^T \left( \sum_{i=1}^m \left[ (Y_i \cdot Y_i) I - Y_i Y_i^T \right] \right) D = D^T M D \quad (22)$$

where M is a positive semidefinite symmetric matrix that depends on A and the Y_i but not on D.
Compute the derivative of equation (21) with respect to A to obtain

$$\frac{\partial E}{\partial A} = -2 \left( I - D D^T \right) \sum_{i=1}^m Y_i \quad (23)$$

At a minimum value of E, it is necessary that this derivative is zero, which it is when ∑_{i=1}^m Y_i = 0, implying A = (1/m) ∑_{i=1}^m X_i, the average of the sample points. In fact there are infinitely many solutions, A + sD for any scalar s. This is simply a statement that A is a point on the best-fit line, but any other point on the line may serve as the origin for that line.
Equation (22) is a quadratic form D^T M D whose minimum over unit-length D is the smallest eigenvalue of M, computed using standard eigensystem solvers. A corresponding unit-length eigenvector D completes our construction of the least-squares line. The covariance matrix of the input points is C = ∑_{i=1}^m Y_i Y_i^T. Defining δ = ∑_{i=1}^m Y_i^T Y_i, we see that M = δI − C, where I is the identity matrix. Therefore, M and C have the same eigenspaces: if CV = λV, then MV = (δ − λ)V. The eigenspace corresponding to the minimum eigenvalue of M is the same as the eigenspace corresponding to the maximum eigenvalue of C. In an implementation, it is sufficient to process C and avoid the additional cost to compute M.

4.1.1 Pseudocode for the General Case

Listing 4 contains pseudocode for fitting a line to points in n dimensions with n ≥ 2.

Listing 4. Pseudocode for fitting a line to points in n dimensions using orthogonal regression. The number
of input points must be at least 2. The returned Boolean value is true as long as the covariance matrix of
the linear system has a 1-dimensional eigenspace for the maximum eigenvalue of the covariance matrix.
bool FitOrthogonalLine(int numPoints, Vector<n> points[],
    Vector<n>& origin, Vector<n>& direction)
{
    // Compute the mean of the points.
    Vector<n> mean = Vector<n>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        mean += points[i];
    }
    mean /= numPoints;

    // Compute the covariance matrix of the points.
    Matrix<n, n> C = Matrix<n, n>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        Vector<n> diff = points[i] - mean;
        C += OuterProduct(diff, diff);  // diff * diff^T
    }

    // Compute the eigenvalues and eigenvectors of C, where the eigenvalues are sorted
    // in nondecreasing order (eigenvalues[0] <= eigenvalues[1] <= ...).
    Real eigenvalues[n];
    Vector<n> eigenvectors[n];
    SolveEigensystem(C, eigenvalues, eigenvectors);

    // Set the output information.
    origin = mean;
    direction = eigenvectors[n - 1];

    // The fitted line is unique when the maximum eigenvalue has multiplicity 1.
    return eigenvalues[n - 2] < eigenvalues[n - 1];
}

Specializations for 2 and 3 dimensions are simple, computing only the upper-triangular elements of C and
passing them to specialized eigensolvers for 2 and 3 dimensions. Implementations are ApprOrthogonalLine2.h
and ApprOrthogonalLine3.h.
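
As a usage sketch of Listing 4 with n = 3 (the sample points are hypothetical and chosen to lie near a line through the origin):

// Hypothetical usage of Listing 4 for n = 3.  The points lie near the line
// through the origin with direction (1,1,0)/sqrt(2).
Vector<3> points[4] = { { 0, 0, 0 }, { 1, 1, 0.1 }, { 2, 2, -0.1 }, { 3, 3, 0 } };
Vector<3> origin, direction;
if (FitOrthogonalLine(4, points, origin, direction))
{
    // The fitted line is origin + t * direction for real t, where direction is
    // a unit-length eigenvector for the maximum eigenvalue of the covariance
    // matrix C.
}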

4.2 Fitting by a Hyperplane [(n − 1) Dimensions]

The algorithm may be applied to sample points {X_i}_{i=1}^m in any dimension n. Let the hyperplane be defined implicitly by N·(X − A) = 0, where N is a unit-length normal to the hyperplane and A is a point on the hyperplane. Define Y_i = X_i − A, which can be written as Y_i = (N·Y_i)N + N_i^⊥, where N_i^⊥ is a vector that is perpendicular to N. The squared length of the projection of Y_i onto the normal line for the hyperplane is (N·Y_i)^2. The error function for the least-squares minimization is E(A, N) = ∑_{i=1}^m (N·Y_i)^2. Two alternate forms for this function are

E(A, N) = ∑_{i=1}^m Y_i^T (N N^T) Y_i    (24)

and

$$E(A, N) = N^T \left( \sum_{i=1}^m Y_i Y_i^T \right) N = N^T C N \quad (25)$$

where C = ∑_{i=1}^m Y_i Y_i^T is the covariance matrix of the Y_i.

Compute the derivative of equation (24) with respect to A to obtain

$$\frac{\partial E}{\partial A} = 2 \left( N N^T \right) \sum_{i=1}^m Y_i \quad (26)$$

At a minimum value of E, it is necessary that this derivative is zero, which it is when ∑_{i=1}^m Y_i = 0, implying A = (1/m) ∑_{i=1}^m X_i, the average of the sample points. In fact there are infinitely many solutions, A + W, where W is any vector perpendicular to N. This is simply a statement that the average is on the best-fit hyperplane, but any other point on the hyperplane may serve as the origin for that hyperplane.
Equation (25) is a quadratic form N^T C N whose minimum over unit-length N is the smallest eigenvalue of C, computed using standard eigensystem solvers. A corresponding unit-length eigenvector N completes our construction of the least-squares hyperplane.

4.2.1 Pseudocode for the General Case

Listing 5 contains pseudocode for fitting a hyperplane to points in n dimensions with n ≥ 3.

Listing 5. Pseudocode for fitting a hyperplane to points in n dimensions using orthogonal regression. The number of input points must be at least n. The returned Boolean value is true as long as the covariance matrix of the linear system has a 1-dimensional eigenspace for the minimum eigenvalue of the covariance matrix.
bool FitOrthogonalHyperplane(int numPoints, Vector<n> points[],
    Vector<n>& origin, Vector<n>& normal)
{
    // Compute the mean of the points.
    Vector<n> mean = Vector<n>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        mean += points[i];
    }
    mean /= numPoints;

    // Compute the covariance matrix of the points.
    Matrix<n, n> C = Matrix<n, n>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        Vector<n> diff = points[i] - mean;
        C += OuterProduct(diff, diff);  // diff * diff^T
    }

    // Compute the eigenvalues and eigenvectors of C, where the eigenvalues are sorted
    // in nondecreasing order (eigenvalues[0] <= eigenvalues[1] <= ...).
    Real eigenvalues[n];
    Vector<n> eigenvectors[n];
    SolveEigensystem(C, eigenvalues, eigenvectors);

    // Set the output information.
    origin = mean;
    normal = eigenvectors[0];

    // The fitted hyperplane is unique when the minimum eigenvalue has multiplicity 1.
    return eigenvalues[0] < eigenvalues[1];
}

A specialization for 3 dimensions is simple, computing only the upper-triangular elements of C and passing them to a specialized eigensolver for 3 dimensions. An implementation is ApprOrthogonalPlane3.h.

4.3 Fitting by a Flat [k Dimensions]

Orthogonal regression to fit n-dimensional points by a line or by a hyperplane can be generalized to fitting by an affine subspace called a k-dimensional flat, where you may choose k such that 1 ≤ k ≤ n − 1. A line is a 1-flat and a hyperplane is an (n − 1)-flat.

For dimension n = 3, we fit with flats that are either lines (k = 1) or planes (k = 2). For dimensions n ≥ 4 and flat dimensions 1 ≤ k ≤ n − 1, the generalization of orthogonal regression is the following. The bases mentioned here are for the linear portion of the affine subspace; that is, the basis vectors are relative to an origin at a point A. The flat has an orthonormal basis {F_j}_{j=1}^k and the orthogonal complement has an orthonormal basis {P_j}_{j=1}^{n−k}. The union of the two bases is an orthonormal basis for R^n. Any input point X_i, 1 ≤ i ≤ m, can be represented by

$$X_i = A + \sum_{j=1}^k f_{ij} F_j + \sum_{j=1}^{n-k} p_{ij} P_j = A + \left( \sum_{j=1}^k f_{ij} F_j \right) + \left( \sum_{j=1}^{n-k} p_{ij} P_j \right) \quad (27)$$

The left-parenthesized term is the portion of X_i that lives in the flat and the right-parenthesized term is the portion that is the deviation of X_i from the flat. The least-squares problem is about choosing the two bases so that the sum of squared lengths of the deviations is as small as possible.
Define Y_i = X_i − A. The basis coefficients are f_{ij} = F_j·Y_i and p_{ij} = P_j·Y_i. The squared length of the deviation is ∑_{j=1}^{n−k} p_{ij}^2. The error function for the least-squares minimization is the sum of the squared lengths for all inputs, E = ∑_{i=1}^m ∑_{j=1}^{n−k} p_{ij}^2. Two alternate forms for this function are

$$E(A, P_1, \ldots, P_{n-k}) = \sum_{i=1}^m Y_i^T \left( \sum_{j=1}^{n-k} P_j P_j^T \right) Y_i \quad (28)$$

and

$$E(A, P_1, \ldots, P_{n-k}) = \sum_{j=1}^{n-k} P_j^T \left( \sum_{i=1}^m Y_i Y_i^T \right) P_j = \sum_{j=1}^{n-k} P_j^T C P_j \quad (29)$$

where C = ∑_{i=1}^m Y_i Y_i^T.

Compute the derivative of equation (28) with respect to A to obtain

$$\frac{\partial E}{\partial A} = 2 \left( \sum_{j=1}^{n-k} P_j P_j^T \right) \sum_{i=1}^m Y_i \quad (30)$$

At a minimum value of E, it is necessary that this derivative is zero, which it is when ∑_{i=1}^m Y_i = 0, implying A = (1/m) ∑_{i=1}^m X_i, the average of the sample points. In fact there are infinitely many solutions, A + W, where W is any vector in the orthogonal complement of the subspace spanned by the P_j; this subspace is the one spanned by the F_j. This is simply a statement that the average is on the best-fit flat, but any other point on the flat may serve as the origin for that flat. Because we are choosing A to be the average of the input points, the matrix C is the covariance matrix for the input points.
The last term in equation (29) is a sum of quadratic forms involving the matrix C and the vectors P_j that form an orthonormal set (unit length, mutually perpendicular). The minimum value of each quadratic form is the smallest eigenvalue λ_1 of C, so we may choose P_1 to be a unit-length eigenvector of C corresponding to λ_1. We must choose P_2 to be unit length and perpendicular to P_1. If the eigenspace for λ_1 is 1-dimensional, the next smallest value we can attain by the quadratic form is the smallest eigenvalue λ_2 for which λ_1 < λ_2, and P_2 is chosen to be a corresponding unit-length eigenvector. However, if λ_1 has an eigenspace of dimension larger than 1, we can choose P_2 to be a vector in that eigenspace that is perpendicular to P_1.
Generally, let {λ_ℓ}_{ℓ=1}^r be the r distinct eigenvalues of the covariance matrix C; we know 1 ≤ r ≤ n and λ_1 < λ_2 < ··· < λ_r. Let the dimension of the eigenspace for λ_ℓ be d_ℓ ≥ 1; we know that ∑_{ℓ=1}^r d_ℓ = n. List the eigenvalues and eigenvectors in order of increasing eigenvalue, including repeated values,

$$\underbrace{\lambda_1, \ldots, \lambda_1}_{d_1 \text{ terms}}, \; \underbrace{\lambda_2, \ldots, \lambda_2}_{d_2 \text{ terms}}, \; \ldots, \; \underbrace{\lambda_r, \ldots, \lambda_r}_{d_r \text{ terms}}; \quad \underbrace{V_1^1, \ldots, V_{d_1}^1}_{d_1 \text{ terms}}, \; \underbrace{V_1^2, \ldots, V_{d_2}^2}_{d_2 \text{ terms}}, \; \ldots, \; \underbrace{V_1^r, \ldots, V_{d_r}^r}_{d_r \text{ terms}} \quad (31)$$

The list has n items. The eigenvalue λ_ℓ has an eigenspace with orthonormal basis {V_j^ℓ}_{j=1}^{d_ℓ}. In this list, choose the first n − k eigenvectors to be P_1 through P_{n−k} and choose the last k eigenvectors to be F_1 through F_k.
It is possible that one (or more) of the P j and one (or more) of the F j are in the eigenspace for the same
eigenvalue. In this case, the fitted flat is not unique, and one should re-examine the choice of dimension k
for the fitted flat. This is analogous to the following situations in dimension n = 3:

• The input points are nearly collinear but you are trying to fit those points with a plane. The covariance matrix likely has two distinct eigenvalues λ_1 < λ_2 with d_1 = 2 and d_2 = 1. The basis vectors are P_1 = V_1^1, F_1 = V_2^1 and F_2 = V_1^2. The first two of these are from the same eigenspace.

• The input points are spread out over a portion of a plane (and are not nearly collinear) but you are trying to fit those points with a line. The covariance matrix likely has two distinct eigenvalues λ_1 < λ_2 with d_1 = 1 and d_2 = 2. The basis vectors are P_1 = V_1^1, P_2 = V_1^2 and F_1 = V_2^2. The last two of these are from the same eigenspace.

• The input points are not well fit by a flat of any dimension. For example, your input points are uniformly distributed over a sphere. The covariance matrix likely has one distinct eigenvalue λ_1 of multiplicity d_1 = 3. Neither a line nor a plane is a good fit to the input points; in either case, a P-vector and an F-vector are in the same eigenspace.

The computational algorithm is to compute the average A and the covariance matrix C of the points. Use an eigensolver whose output eigenvalues are sorted in nondecreasing order. Choose the P_j to be the first n − k eigenvectors and the F_j to be the last k eigenvectors output by the eigensolver.

4.3.1 Pseudocode for the General Case

Listing 6 contains pseudocode for fitting a k-dimensional flat to points in n dimensions where n ≥ 2 and 1 ≤ k ≤ n − 1.

Listing 6. Pseudocode for fitting a k-dimensional flat to points in n dimensions using orthogonal regression.
The number of input points must be at least k + 1. The returned Boolean value is true as long as the
covariance matrix of the linear system has a basis of eigenvectors sorted by nondecreasing eigenvalues for
which the following holds. The subbasis that spans the linear space of the flat and the subbasis that spans
the orthogonal complement of the linear space of the flat do not both contain basis vectors from the same
eigenspace.
bool FitOrthogonalFlat(int numPoints, Vector<n> points[],
    Vector<n>& origin, Vector<n> flatBasis[k], Vector<n> complementBasis[n - k])
{
    // Compute the mean of the points.
    Vector<n> mean = Vector<n>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        mean += points[i];
    }
    mean /= numPoints;

    // Compute the covariance matrix of the points.
    Matrix<n, n> C = Matrix<n, n>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        Vector<n> diff = points[i] - mean;
        C += OuterProduct(diff, diff);  // diff * diff^T
    }

    // Compute the eigenvalues and eigenvectors of C, where the eigenvalues are sorted
    // in nondecreasing order (eigenvalues[0] <= eigenvalues[1] <= ...).
    Real eigenvalues[n];
    Vector<n> eigenvectors[n];
    SolveEigensystem(C, eigenvalues, eigenvectors);

    // Set the output information.  The basis for the fitted flat corresponds
    // to the largest variances and the basis for the complement of the fitted
    // flat corresponds to the smallest variances.
    origin = mean;
    for (int i = 0; i < n - k; ++i)
    {
        complementBasis[i] = eigenvectors[i];
    }
    for (int i = 0; i < k; ++i)
    {
        flatBasis[i] = eigenvectors[n - k + i];
    }

    // The fitted flat and its complement do not have vectors from the same eigenspace.
    return eigenvalues[n - k - 1] < eigenvalues[n - k];
}

In the special case of fitting with a line (k = 1), the line direction is the only element in flatBasis. In the special
case of fitting with a hyperplane (k = n − 1), the hyperplane normal is the only element in complementBasis.

5 Fitting a Hypersphere to Points

The classic cases are fitting 2-dimensional points by circles and 3-dimensional points by spheres. In n dimensions, the objects are called hyperspheres, defined implicitly by the quadratic equation |C − X|^2 = r^2, where C is the center and r is the radius. Three algorithms are presented for fitting points by hyperspheres.

5.1 Fitting Using Differences of Lengths and Radius

The sample points are {X_i}_{i=1}^m. The least-squares error function involves the squares of differences between lengths and radius,

E(C, r) = ∑_{i=1}^m (|C − X_i| − r)^2    (32)

The minimization is based on computing points where the gradient of E is zero. The partial derivative with respect to r is

$$\frac{\partial E}{\partial r} = -2 \sum_{i=1}^m (|C - X_i| - r) \quad (33)$$

Setting the derivative equal to zero and solving for the radius,

$$r = \frac{1}{m} \sum_{i=1}^m |C - X_i| \quad (34)$$

which says that the radius is the average of the distances from the sample points to the center C. The partial derivative with respect to C is

$$\frac{\partial E}{\partial C} = 2 \sum_{i=1}^m (|C - X_i| - r) \frac{\partial |C - X_i|}{\partial C} = 2 \sum_{i=1}^m (|C - X_i| - r) \frac{C - X_i}{|C - X_i|} = 2 \sum_{i=1}^m \left( (C - X_i) - r \frac{C - X_i}{|C - X_i|} \right) \quad (35)$$

Setting the derivative equal to zero and solving for the center,

$$C = \frac{1}{m} \sum_{i=1}^m X_i + r \left( \frac{1}{m} \sum_{i=1}^m \frac{C - X_i}{|C - X_i|} \right) = \frac{1}{m} \sum_{i=1}^m X_i + \left( \frac{1}{m} \sum_{i=1}^m |C - X_i| \right) \left( \frac{1}{m} \sum_{i=1}^m \frac{C - X_i}{|C - X_i|} \right) \quad (36)$$

The average of the samples is X̄. Define the length L_i = |C − X_i|; the average of the lengths is L̄. Define the unit-length vector U_i = (C − X_i)/|C − X_i|; the average of the unit-length vectors is Ū. Equation (36) becomes

C = X̄ + L̄ Ū =: F(C)    (37)

where the last equality defines the vector-valued function F. The function depends on the independent variable C because both L̄ and Ū depend on C. Fixed-point iteration can be applied to solve equation (37),

C_0 = X̄;  C_{i+1} = F(C_i), i ≥ 0    (38)

Depending on the distribution of the samples, it is possible to choose a different initial guess for C_0 that (hopefully) leads to faster convergence.

5.1.1 Pseudocode for the General Case

Listing 7 contains pseudocode for fitting a hypersphere to points. The case n = 2 is for circles and the case
n = 3 is for spheres.

Listing 7. Fitting a hypersphere to points using least squares based on squared differences of lengths and radius. If you want the incoming hypersphere center to be the initial guess for the center, set inputCenterIsInitialGuess to true; otherwise, the initial guess is computed to be the average of the samples. The maximum number of iterations is also specified. The returned function value is the number of iterations used.
int FitHypersphere(int numPoints, Vector<n> X[], int maxIterations, bool inputCenterIsInitialGuess,
    Vector<n>& center, Real& radius)
{
    // Compute the average of the data points.
    Vector<n> averageX = X[0];
    for (int i = 1; i < numPoints; ++i)
    {
        averageX += X[i];
    }
    averageX /= numPoints;

    // The initial guess for the center is either the incoming center or the
    // average of the sample points.
    if (!inputCenterIsInitialGuess)
    {
        center = averageX;
    }

    int iteration;
    for (iteration = 0; iteration < maxIterations; ++iteration)
    {
        // Update the estimate for the center.
        Vector<n> previousCenter = center;

        // Compute average L and average U.
        Real averageL = 0;
        Vector<n> averageU = Vector<n>::ZERO;
        for (int i = 0; i < numPoints; ++i)
        {
            Vector<n> CmXi = center - X[i];
            Real length = Length(CmXi);
            if (length > 0)
            {
                averageL += length;
                averageU += CmXi / length;
            }
        }
        averageL /= numPoints;
        averageU /= numPoints;

        center = averageX + averageL * averageU;
        radius = averageL;

        // Test for convergence.
        if (center == previousCenter)
        {
            ++iteration;
            break;
        }
    }

    return iteration;
}

The convergence test uses an exact equality, that the previous center C′ and the current center C are the same. In practice you might want to specify a small ε > 0 and instead exit when |C − C′| ≤ ε.
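
A sketch of that variant, which replaces the convergence test in Listing 7 and avoids the square root by comparing squared lengths, is

// Tolerance-based convergence test; epsilon > 0 is chosen by the caller.
Vector<n> diff = center - previousCenter;
if (Dot(diff, diff) <= epsilon * epsilon)
{
    ++iteration;
    break;
}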
Specializations for 2 and 3 dimensions have the same implementation as the general case. Implementations
are ApprCircle2.h and ApprSphere3.h.

5.2 Fitting Using Differences of Squared Lengths and Squared Radius

The sample points are {X_i}_{i=1}^m. The least-squares error function involves the squares of differences between squared lengths and squared radius,

E(C, r^2) = ∑_{i=1}^m (|C − X_i|^2 − r^2)^2    (39)

The minimization is based on computing points where the gradient of E is zero. The partial derivative with respect to r^2 is

$$\frac{\partial E}{\partial r^2} = -2 \sum_{i=1}^m \left( |C - X_i|^2 - r^2 \right) \quad (40)$$

Define Δ_i = C − X_i. Setting the derivative to zero and solving for the squared radius,

$$r^2 = \frac{1}{m} \sum_{i=1}^m |C - X_i|^2 = \frac{1}{m} \sum_{i=1}^m \Delta_i^T \Delta_i \quad (41)$$

which says that the squared radius is the average of the squared distances from the sample points to the center C. The partial derivative with respect to C is

$$\frac{\partial E}{\partial C} = 4 \sum_{i=1}^m \left( |C - X_i|^2 - r^2 \right) (C - X_i) = 4 \sum_{i=1}^m \left( \Delta_i^T \Delta_i - r^2 \right) \Delta_i \quad (42)$$

Setting this to zero, the center and radius must satisfy

$$\sum_{i=1}^m \left( \Delta_i^T \Delta_i - r^2 \right) \Delta_i = 0 \quad (43)$$

Expanding the squared lengths,

$$\Delta_i^T \Delta_i = |C|^2 - 2 C^T X_i + |X_i|^2 \quad (44)$$
Substituting into equation (41),

$$r^2 = |C|^2 - 2 C^T \left( \frac{1}{m} \sum_{i=1}^m X_i \right) + \frac{1}{m} \sum_{i=1}^m |X_i|^2 = |C|^2 - 2 C^T A + \frac{1}{m} \sum_{i=1}^m |X_i|^2 \quad (45)$$

where A = (∑_{i=1}^m X_i)/m is the average of the samples. Define Y_i = X_i − A. Some algebra will show that

$$\begin{aligned}
\Delta_i^T \Delta_i - r^2 &= \left( |C|^2 - 2 C^T X_i + |X_i|^2 \right) - \left( |C|^2 - 2 C^T A + \tfrac{1}{m} \textstyle\sum_{j=1}^m |X_j|^2 \right) \\
&= -2 C^T Y_i + |X_i|^2 - \tfrac{1}{m} \textstyle\sum_{j=1}^m |X_j|^2 \\
&= -2 (C - A)^T Y_i + |Y_i|^2 - \tfrac{1}{m} \textstyle\sum_{j=1}^m |Y_j|^2 \\
&= -2 (C - A)^T Y_i + B_i
\end{aligned} \quad (46)$$

where the last equality defines B_i. Equation (43) becomes

$$\begin{aligned}
0 &= \sum_{i=1}^m \left( \Delta_i^T \Delta_i - r^2 \right) \Delta_i \\
&= \sum_{i=1}^m \left( -2 (C - A)^T Y_i + B_i \right) \left( (C - A) - Y_i \right) \\
&= -2 \left( (C - A)^T \sum_{i=1}^m Y_i \right) (C - A) + 2 \left( \sum_{i=1}^m Y_i Y_i^T \right) (C - A) + \left( \sum_{i=1}^m B_i \right) (C - A) - \sum_{i=1}^m B_i Y_i
\end{aligned} \quad (47)$$

It is easily shown that ∑_{i=1}^m Y_i = 0 and ∑_{i=1}^m B_i = 0; therefore,

$$0 = 2 \left( \sum_{i=1}^m Y_i Y_i^T \right) (C - A) - \sum_{i=1}^m \left( Y_i^T Y_i \right) Y_i \quad (48)$$

The least-squares center is obtained by solving the previous equation,

$$C = A + \frac{1}{2} \left( \sum_{i=1}^m Y_i Y_i^T \right)^{-1} \sum_{i=1}^m \left( Y_i^T Y_i \right) Y_i \quad (49)$$

5.2.1 Pseudocode for the General Case

Listing 8 contains pseudocode for fitting a hypersphere to points. The case n = 2 is for circles and the case
n = 3 is for spheres.

Listing 8. Fitting a hypersphere to points using least squares based on squared differences of squared lengths and squared radius. The algorithm requires inverting the covariance matrix. If the matrix is invertible, the output center and radius are valid and the function returns true. If the matrix is not invertible, the function returns false, and the center and radius are invalid (but set to zero so at least they are initialized).
bool FitHypersphere(int numPoints, Vector<n> X[], Vector<n>& center, Real& radius)
{
    // Compute the average of the data points.
    Vector<n> A = Vector<n>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        A += X[i];
    }
    A /= numPoints;

    // Compute the covariance matrix M of the Y[i] = X[i] - A and the right-hand
    // side R of the linear system M * (C - A) = R.
    Matrix<n, n> M = Matrix<n, n>::ZERO;
    Vector<n> R = Vector<n>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        Vector<n> Y = X[i] - A;
        Matrix<n, n> YYT = OuterProduct(Y, Y);  // Y * Transpose(Y)
        Real YTY = Dot(Y, Y);  // Transpose(Y) * Y
        M += YYT;
        R += YTY * Y;
    }
    R /= 2;

    // Solve the linear system M * (C - A) = R for the center C.  The function
    // 'bool Solve(M, R, S)' tries to solve the linear system M * S = R.  If M is
    // invertible, the function returns true and S is the solution.  If M is not
    // invertible, the function returns false and S is invalid.
    Vector<n> CmA;
    if (Solve(M, R, CmA))
    {
        center = A + CmA;
        Real rsqr = 0;
        for (int i = 0; i < numPoints; ++i)
        {
            Vector<n> delta = X[i] - center;
            rsqr += Dot(delta, delta);
        }
        rsqr /= numPoints;
        radius = sqrt(rsqr);
        return true;
    }
    else
    {
        center = Vector<n>::ZERO;
        radius = 0;
        return false;
    }
}

5.2.2 Pseudocode for Circles

Listing 9 is a specialization of Listing 8 for circles in 2 dimensions. The matrix inversion requires only
computing the upper-triangular part of the covariance matrix and uses cofactors for inversion.

Listing 9. Fitting a circle to points using least squares based on squared differences of squared lengths and squared radius. The algorithm requires inverting the covariance matrix. If the matrix is invertible, the output center and radius are valid and the function returns true. If the matrix is not invertible, the function returns false, and the center and radius are invalid (but set to zero so at least they are initialized).
bool FitCircle(int numPoints, Vector2 X[], Vector2& center, Real& radius)
{
    // Compute the average of the data points.
    Vector2 A = { 0, 0 };
    for (int i = 0; i < numPoints; ++i)
    {
        A += X[i];
    }
    A /= numPoints;

    // Compute the covariance matrix M of the Y[i] = X[i] - A and the right-hand
    // side R of the linear system M * (C - A) = R.
    Real M00 = 0, M01 = 0, M11 = 0;
    Vector2 R = { 0, 0 };
    for (int i = 0; i < numPoints; ++i)
    {
        Vector2 Y = X[i] - A;
        Real Y0Y0 = Y[0] * Y[0], Y0Y1 = Y[0] * Y[1], Y1Y1 = Y[1] * Y[1];
        M00 += Y0Y0;  M01 += Y0Y1;  M11 += Y1Y1;
        R += (Y0Y0 + Y1Y1) * Y;
    }
    R /= 2;

    // Solve the linear system M * (C - A) = R for the center C.
    Real det = M00 * M11 - M01 * M01;
    if (det != 0)
    {
        center[0] = A[0] + (M11 * R[0] - M01 * R[1]) / det;
        center[1] = A[1] + (M00 * R[1] - M01 * R[0]) / det;
        Real rsqr = 0;
        for (int i = 0; i < numPoints; ++i)
        {
            Vector2 delta = X[i] - center;
            rsqr += Dot(delta, delta);
        }
        rsqr /= numPoints;
        radius = sqrt(rsqr);
        return true;
    }
    else
    {
        center = { 0, 0 };
        radius = 0;
        return false;
    }
}

5.2.3 Pseudocode for Spheres

Listing 10 is a specialization of Listing 8 for spheres in 3 dimensions. The matrix inversion requires only
computing the upper-triangular part of the covariance matrix and uses cofactors for inversion.

Listing 10. Fitting a sphere to points using least squares based on squared differences of squared lengths and squared radius. The algorithm requires inverting the covariance matrix. If the matrix is invertible, the output center and radius are valid and the function returns true. If the matrix is not invertible, the function returns false, and the center and radius are invalid (but set to zero so at least they are initialized).
bool FitSphere(int numPoints, Vector3 X[], Vector3& center, Real& radius)
{
    // Compute the average of the data points.
    Vector3 A = { 0, 0, 0 };
    for (int i = 0; i < numPoints; ++i)
    {
        A += X[i];
    }
    A /= numPoints;

    // Compute the covariance matrix M of the Y[i] = X[i] - A and the right-hand
    // side R of the linear system M * (C - A) = R.
    Real M00 = 0, M01 = 0, M02 = 0, M11 = 0, M12 = 0, M22 = 0;
    Vector3 R = { 0, 0, 0 };
    for (int i = 0; i < numPoints; ++i)
    {
        Vector3 Y = X[i] - A;
        Real Y0Y0 = Y[0] * Y[0], Y0Y1 = Y[0] * Y[1], Y0Y2 = Y[0] * Y[2];
        Real Y1Y1 = Y[1] * Y[1], Y1Y2 = Y[1] * Y[2], Y2Y2 = Y[2] * Y[2];
        M00 += Y0Y0;  M01 += Y0Y1;  M02 += Y0Y2;
        M11 += Y1Y1;  M12 += Y1Y2;  M22 += Y2Y2;
        R += (Y0Y0 + Y1Y1 + Y2Y2) * Y;
    }
    R /= 2;

    // Solve the linear system M * (C - A) = R for the center C.
    Real cof00 = M11 * M22 - M12 * M12;
    Real cof01 = M02 * M12 - M01 * M22;
    Real cof02 = M01 * M12 - M02 * M11;
    Real det = M00 * cof00 + M01 * cof01 + M02 * cof02;
    if (det != 0)
    {
        Real cof11 = M00 * M22 - M02 * M02;
        Real cof12 = M01 * M02 - M00 * M12;
        Real cof22 = M00 * M11 - M01 * M01;
        center[0] = A[0] + (cof00 * R[0] + cof01 * R[1] + cof02 * R[2]) / det;
        center[1] = A[1] + (cof01 * R[0] + cof11 * R[1] + cof12 * R[2]) / det;
        center[2] = A[2] + (cof02 * R[0] + cof12 * R[1] + cof22 * R[2]) / det;
        Real rsqr = 0;
        for (int i = 0; i < numPoints; ++i)
        {
            Vector3 delta = X[i] - center;
            rsqr += Dot(delta, delta);
        }
        rsqr /= numPoints;
        radius = sqrt(rsqr);
        return true;
    }
    else
    {
        center = { 0, 0, 0 };
        radius = 0;
        return false;
    }
}

5.3 Fitting the Coefficients of a Quadratic Equation

The general quadratic equation that represents a hypersphere in n dimensions is

b_0 + b_1·X + b_2 |X|^2 = 0    (50)

where b_0 and b_2 ≠ 0 are scalar constants and b_1 is an n × 1 vector of scalars. Define

$$b = \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix}, \quad V = \begin{bmatrix} 1 \\ X \\ |X|^2 \end{bmatrix} \quad (51)$$

which are both (n + 2) × 1 vectors. The quadratic equation is b·V = 0. Because b is not zero, we can remove a degree of freedom by requiring |b| = 1.
Given samples {X_i}_{i=1}^m, we can estimate the constants using a least-squares algorithm where the error function is

$$E(b) = \sum_{i=1}^m (b \cdot V_i)^2 = b^T \left( \sum_{i=1}^m V_i V_i^T \right) b = b^T V b \quad (52)$$

where as a tuple, V_i = (1, X_i, |X_i|^2), and where V = ∑_{i=1}^m V_i V_i^T is a positive semidefinite matrix (symmetric, eigenvalues are nonnegative). The error function is minimized by a unit-length eigenvector b that corresponds to the minimum eigenvalue of V. Equation (50) is factored into

$$\left| X + \frac{b_1}{2 b_2} \right|^2 = \left| \frac{b_1}{2 b_2} \right|^2 - \frac{b_0}{b_2} \quad (53)$$

from which we see that the hypersphere center is C = −b_1/(2b_2) and the radius is r = √((|b_1|^2 − 4 b_0 b_2)/(4 b_2^2)).
As is typical of these types of problems, it is better to translate the samples by their average to obtain numerical robustness when computing with floating-point arithmetic. Define A = (∑_{i=1}^m X_i)/m and Y_i = X_i − A. Equation (50) becomes

f_0 + f_1·Y + f_2 |Y|^2 = 0    (54)

where

$$b = \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} f_0 - f_1 \cdot A + f_2 |A|^2 \\ f_1 - 2 f_2 A \\ f_2 \end{bmatrix} = \begin{bmatrix} 1 & -A^T & |A|^2 \\ 0 & I & -2A \\ 0 & 0^T & 1 \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \end{bmatrix} = R f \quad (55)$$

with I the n × n identity matrix and 0 the n × 1 zero vector. The last equality defines the (n + 2) × (n + 2) upper-triangular matrix R and the (n + 2) × 1 vector f. The error function is now

$$E(f) = b^T V b = (R f)^T V (R f) = f^T \left( R^T V R \right) f = f^T W f \quad (56)$$

where the last equality defines the positive semidefinite matrix W. Observe that

$$W = R^T \left( \sum_{i=1}^m V_i V_i^T \right) R = \sum_{i=1}^m \left( R^T V_i \right) \left( R^T V_i \right)^T = \sum_{i=1}^m W_i W_i^T \quad (57)$$

where

$$W_i = R^T V_i = \begin{bmatrix} 1 & 0^T & 0 \\ -A & I & 0 \\ |A|^2 & -2A^T & 1 \end{bmatrix} \begin{bmatrix} 1 \\ X_i \\ |X_i|^2 \end{bmatrix} = \begin{bmatrix} 1 \\ X_i - A \\ |X_i - A|^2 \end{bmatrix} \quad (58)$$
Therefore, we can subtract A from the samples, compute the matrix W, extract its eigenvector f corresponding to its minimum eigenvalue, compute b = Rf and then compute the center and radius using equation (53). The matrix W, written as a block matrix and divided by the number of points m to keep the intermediate numerical values on the order of the sample values, is

$$W = \begin{bmatrix} 1 & 0^T & \frac{1}{m} \sum_{i=1}^m |Y_i|^2 \\ 0 & \frac{1}{m} \sum_{i=1}^m Y_i Y_i^T & \frac{1}{m} \sum_{i=1}^m |Y_i|^2 Y_i \\ \frac{1}{m} \sum_{i=1}^m |Y_i|^2 & \frac{1}{m} \sum_{i=1}^m |Y_i|^2 Y_i^T & \frac{1}{m} \sum_{i=1}^m |Y_i|^4 \end{bmatrix} \quad (59)$$

5.3.1 Pseudocode for the General Case

Listing 11 contains pseudocode for fitting a hypersphere to points. The case n = 2 is for circles and the case
n = 3 is for spheres.

Listing 11. Fitting a hypersphere to points using least squares to fit the coefficients of a quadratic equation defining the hypersphere. The algorithm requires computing the squared radius in terms of the components of an eigenvector. For samples that are not distributed well on a hypersphere, the purported squared radius might be negative. The function returns true when that computation is nonnegative or false when it is negative.
bool FitHypersphere(int numPoints, Vector<n> X[], Vector<n>& center, Real& radius)
{
    // Compute the average of the data points and the squared length of the average.
    Vector<n> A = Vector<n>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        A += X[i];
    }
    A /= numPoints;
    Real sqrLenA = Dot(A, A);

    // Compute the components of W.  Block(r, c, rsize, csize) is an accessor to the block
    // whose upper-left location is (r, c) and whose size is rsize-by-csize.  The indices
    // r and c are zero-based.
    Matrix<n + 2, n + 2> W = Matrix<n + 2, n + 2>::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        Vector<n> Y = X[i] - A;
        Matrix<n, n> YYT = OuterProduct(Y, Y);  // Y * Transpose(Y)
        Real YTY = Dot(Y, Y);  // Transpose(Y) * Y
        Vector<n> YTYY = YTY * Y;
        Real YTYYTY = YTY * YTY;
        W.Block(0, 0, 1, 1) += 1;
        W.Block(0, n + 1, 1, 1) += YTY;
        W.Block(1, 1, n, n) += YYT;
        W.Block(1, n + 1, n, 1) += YTYY;
        W.Block(n + 1, 0, 1, 1) += YTY;
        W.Block(n + 1, 1, 1, n) += YTYY;
        W.Block(n + 1, n + 1, 1, 1) += YTYYTY;
    }
    W /= numPoints;

    // Compute the eigenvalues and eigenvectors of W, where the eigenvalues are sorted
    // in nondecreasing order (eigenvalues[0] <= eigenvalues[1] <= ...).
    Real eigenvalues[n + 2];
    Vector<n + 2> eigenvectors[n + 2];
    SolveEigensystem(W, eigenvalues, eigenvectors);

    // Block(i, isize) is an accessor to the block whose initial location is i and whose
    // size is isize.
    Real f0 = eigenvectors[0].Block(0, 1);
    Vector<n> f1 = eigenvectors[0].Block(1, n);
    Real f2 = eigenvectors[0].Block(n + 1, 1);
    Real b0 = f0 - Dot(f1, A) + f2 * sqrLenA;
    Vector<n> b1 = f1 - 2 * f2 * A;
    Real b2 = f2;

    if (b2 != 0)
    {
        Real discr = Dot(b1, b1) - 4 * b0 * b2;
        if (discr >= 0)
        {
            center = -b1 / (2 * b2);
            radius = sqrt(discr / (4 * b2 * b2));
            return true;
        }
    }

    center = Vector<n>::ZERO;
    radius = 0;
    return false;
}

5.3.2 Pseudocode for Circles

Listing 12 contains pseudocode for fitting a circle to points.

Listing 12. Fitting a circle to points using least squares to fit the coefficients of a quadratic equation defining the circle. The algorithm requires computing the squared radius in terms of the components of an eigenvector. For samples that are not distributed well on a circle, the purported squared radius might be negative. The function returns true when that computation is nonnegative or false when it is negative.
bool FitCircle(int numPoints, Vector2 X[], Vector2& center, Real& radius)
{
    // Compute the average of the data points and the squared length of the average.
    Vector2 A = { 0, 0 };
    for (int i = 0; i < numPoints; ++i)
    {
        A += X[i];
    }
    A /= numPoints;
    Real sqrLenA = Dot(A, A);

    // Compute the upper-triangular components of W.
    Matrix4x4 W = Matrix4x4::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        Vector2 Y = X[i] - A;
        Real Y0Y0 = Y[0] * Y[0], Y0Y1 = Y[0] * Y[1], Y1Y1 = Y[1] * Y[1];
        Real RR = Y0Y0 + Y1Y1, RRRR = RR * RR;
        Real Y0RR = Y[0] * RR, Y1RR = Y[1] * RR;
        W(0, 3) += RR;
        W(1, 1) += Y0Y0;
        W(1, 2) += Y0Y1;
        W(1, 3) += Y0RR;
        W(2, 2) += Y1Y1;
        W(2, 3) += Y1RR;
        W(3, 3) += RRRR;
    }
    W /= numPoints;
    W(0, 0) = 1;

    // Fill in the lower-triangular components of W.
    W(3, 0) = W(0, 3);
    W(2, 1) = W(1, 2);
    W(3, 1) = W(1, 3);
    W(3, 2) = W(2, 3);

    // Compute the eigenvalues and eigenvectors of W, where the eigenvalues are sorted
    // in nondecreasing order:
    // eigenvalues[0] <= eigenvalues[1] <= eigenvalues[2] <= eigenvalues[3]
    Real eigenvalues[4];
    Vector4 eigenvectors[4];
    SolveEigensystem(W, eigenvalues, eigenvectors);

    Real f0 = eigenvectors[0][0];
    Vector2 f1 = { eigenvectors[0][1], eigenvectors[0][2] };
    Real f2 = eigenvectors[0][3];
    Real b0 = f0 - Dot(f1, A) + f2 * sqrLenA;
    Vector2 b1 = f1 - 2 * f2 * A;
    Real b2 = f2;

    if (b2 != 0)
    {
        Real discr = Dot(b1, b1) - 4 * b0 * b2;
        if (discr >= 0)
        {
            center = -b1 / (2 * b2);
            radius = sqrt(discr / (4 * b2 * b2));
            return true;
        }
    }

    center = Vector2::ZERO;
    radius = 0;
    return false;
}

5.3.3 Pseudocode for Spheres

Listing 13 contains pseudocode for fitting a sphere to points.

25
Listing 13. Fitting a sphere to points using least squares to fit the coefficients of a quadratic equation
defining the sphere. The algorithm requires computing the squared radius in terms of the components
of an eigenvector. For samples that are not well distributed on a sphere, the purported squared radius
might be negative. The function returns true when that computation is nonnegative or false when it is
negative.
bool FitSphere(int numPoints, Vector3 X[], Vector3& center, Real& radius)
{
    // Compute the average of the data points and the squared length of the average.
    Vector3 A = { 0, 0, 0 };
    for (int i = 0; i < numPoints; ++i)
    {
        A += X[i];
    }
    A /= numPoints;
    Real sqrLenA = Dot(A, A);

    // Compute the upper-triangular components of W.
    Matrix5x5 W = Matrix5x5::ZERO;
    for (int i = 0; i < numPoints; ++i)
    {
        Vector3 Y = X[i] - A;
        Real Y0Y0 = Y[0] * Y[0], Y0Y1 = Y[0] * Y[1], Y0Y2 = Y[0] * Y[2];
        Real Y1Y1 = Y[1] * Y[1], Y1Y2 = Y[1] * Y[2], Y2Y2 = Y[2] * Y[2];
        Real RR = Y0Y0 + Y1Y1 + Y2Y2, RRRR = RR * RR;
        Real Y0RR = Y[0] * RR, Y1RR = Y[1] * RR, Y2RR = Y[2] * RR;
        W(0, 4) += RR;
        W(1, 1) += Y0Y0;
        W(1, 2) += Y0Y1;
        W(1, 3) += Y0Y2;
        W(1, 4) += Y0RR;
        W(2, 2) += Y1Y1;
        W(2, 3) += Y1Y2;
        W(2, 4) += Y1RR;
        W(3, 3) += Y2Y2;
        W(3, 4) += Y2RR;
        W(4, 4) += RRRR;
    }
    W /= numPoints;
    W(0, 0) = 1;

    // Fill in the lower-triangular components of W.
    W(4, 0) = W(0, 4);
    W(2, 1) = W(1, 2);
    W(3, 1) = W(1, 3);
    W(4, 1) = W(1, 4);
    W(3, 2) = W(2, 3);
    W(4, 2) = W(2, 4);
    W(4, 3) = W(3, 4);

    // Compute the eigenvalues and eigenvectors of W, where the eigenvalues are sorted
    // in nondecreasing order:
    // eigenvalues[0] <= eigenvalues[1] <= eigenvalues[2] <= eigenvalues[3] <= eigenvalues[4]
    Real eigenvalues[5];
    Vector5 eigenvectors[5];
    SolveEigensystem(W, eigenvalues, eigenvectors);

    Real f0 = eigenvectors[0][0];
    Vector3 f1 = { eigenvectors[0][1], eigenvectors[0][2], eigenvectors[0][3] };
    Real f2 = eigenvectors[0][4];
    Real b0 = f0 - Dot(f1, A) + f2 * sqrLenA;
    Vector3 b1 = f1 - 2 * f2 * A;
    Real b2 = f2;

    if (b2 != 0)
    {
        Real discr = Dot(b1, b1) - 4 * b0 * b2;
        if (discr >= 0)
        {
            center = -b1 / (2 * b2);
            radius = sqrt(discr / (4 * b2 * b2));
            return true;
        }
    }

    center = Vector3::ZERO;
    radius = 0;
    return false;
}

6 Fitting a Hyperellipsoid to Points

The classic cases are fitting 2-dimensional points by ellipses and 3-dimensional points by ellipsoids. In
n-dimensions, the objects are called hyperellipsoids, defined implicitly by the quadratic equation
(X − C)T S(X − C) = 1 (60)
The center of the hyperellipsoid is the n × 1 vector C, the n × n matrix S is positive definite, and the n × 1
vector X is any point on the hyperellipsoid. The matrix S can be factored using an eigendecomposition,
S = RDRT , where R is an n × n rotation matrix whose n × 1 columns are unit-length vectors U i for
0 ≤ i < n that form a right-handed orthonormal basis. The n × n matrix D is diagonal with diagonal
elements di = 1/e2i for 0 ≤ i < n. The extent ei > 0 is the distance from C to the two extreme points of the
hyperellipsoid in the direction U i . The matrix S can be expressed as
S = Σ_{i=0}^{n−1} U_i U_i^T / e_i² = Σ_{i=0}^{n−1} V_i V_i^T / ( e_i² |V_i|² )    (61)

The second equality uses vectors V_i that are not necessarily unit length, which is useful when robust
computations require arbitrary precision arithmetic for exact results. It is not necessary to compute the
length of V_i; rather, we compute |V_i|² = V_i · V_i to avoid the floating-point rounding errors that occur
when normalizing V_i to obtain U_i.
Let the samples be {X_i}_{i=1}^m. A nonlinear least-squares error function used for the fitting is

E(C, S) = (1/m) Σ_{i=1}^m [ (X_i − C)^T S (X_i − C) − 1 ]²    (62)

An iterative algorithm is presented for locating the parameters C and S. It is based on gradient descent,
which requires derivative computations. The symmetric matrix S has n(n + 1)/2 parameters consisting of
the upper-triangular part of the matrix. The vector C has n parameters.
Define ∆i = X i − C. The first-order derivative of E with respect to the center is
∂E/∂C = (−4/m) Σ_{i=1}^m ( ∆_i^T S ∆_i − 1 ) S ∆_i    (63)

The first-order derivative of E with respect to the component s_rc of S is

∂E/∂s_rc = (2/m) Σ_{i=1}^m ( ∆_i^T S ∆_i − 1 ) ∆_i^T B_rr ∆_i,             r = c
∂E/∂s_rc = (2/m) Σ_{i=1}^m ( ∆_i^T S ∆_i − 1 ) ∆_i^T (B_rc + B_cr) ∆_i,    r ≠ c    (64)
where Brc is the matrix whose elements are all 0 except for the entry in row r and column c which is 1.
The algorithm alternates two gradient-descent steps, updating the center and the matrix in turn.
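To make the alternation concrete, the following is a minimal sketch of the error function of equation (62) in the style of Listing 14; it sketches the quantity being minimized and is not necessarily the exact Geometric Tools implementation.

// Evaluate E(C, S) of equation (62) for the current center C and matrix S.
template <size_t N, typename Real>
Real ErrorFunction(std::vector<Vector<N, Real>> const& points,
    Vector<N, Real> const& C, Matrix<N, N, Real> const& S)
{
    Real error = 0;
    for (size_t i = 0; i < points.size(); ++i)
    {
        Vector<N, Real> delta = points[i] - C;
        Real term = Dot(delta, S * delta) - 1;  // (X_i - C)^T S (X_i - C) - 1
        error += term * term;
    }
    return error / (Real)points.size();
}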

6.1 Updating the Estimate of the Center

Given initial guesses C and S for the hyperellipsoid, the center can be updated to C 0 = C + tN for t ≥ 0,
where N = −∂E/∂C is the negative of the gradient of E in the C direction at the initial guess. This is one
step of the gradient descent algorithm, searching in the direction of largest decrease of E in the C parameter.
Define the function

g(t) = E(C + tN, S)
     = (1/m) Σ_{i=1}^m [ (tN − ∆_i)^T S (tN − ∆_i) − 1 ]²
     = (1/m) Σ_{i=1}^m [ (N^T S N) t² − 2(N^T S ∆_i) t + (∆_i^T S ∆_i − 1) ]²    (65)
     = (1/m) Σ_{i=1}^m ( c t² − 2 b_i t + a_i )²

where c = N^T S N, b_i = N^T S ∆_i and a_i = ∆_i^T S ∆_i − 1. Some algebra will show that

g(t) = g0 + g1 t + g2 t² + g3 t³ + g4 t⁴    (66)

where

g0 = (1/m) Σ_{i=1}^m a_i²,  g1 = (−4/m) Σ_{i=1}^m a_i b_i,  g2 = (4/m) Σ_{i=1}^m b_i² + (2c/m) Σ_{i=1}^m a_i,  g3 = (−4c/m) Σ_{i=1}^m b_i,  g4 = c²    (67)

The derivative is g′(t) = g1 + 2g2 t + 3g3 t² + 4g4 t³.


Observe that g(t) ≥ 0 because it is the nonnegative error function evaluated along a line in the domain. In
practice, one never expects the minimum of E to be zero, so g(t) > 0. In particular, g(0) > 0. The search
is in the largest direction of decrease for E along any line through the initial guess, which implies g′(0) ≤ 0.
If g′(0) = 0 with g(0) a local minimum, the line search terminates. If g′(0) < 0, notice that the coefficient
g4 > 0, so g′(∞) = ∞. These conditions guarantee that g′(t) has at least one root for t > 0. Of all such
roots, let T be the root for which g(T) is the smallest value. It must be the case that g(T) < g(0), so in fact
E(C + T N, S) < E(C, S), and the updated estimate for the center is C′ = C + T N.
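A sketch of one center update consistent with equations (63), (66) and (67) follows. SolveCubic is a hypothetical helper that returns the real-valued roots of c0 + c1 t + c2 t² + c3 t³ = 0; it is not part of the document's pseudocode, and the sketch is not necessarily the exact Geometric Tools implementation.

// Perform one line search C' = C + T N and return the updated error g(T).
template <size_t N, typename Real>
Real UpdateCenter(std::vector<Vector<N, Real>> const& points,
    Matrix<N, N, Real> const& S, Vector<N, Real>& C)
{
    size_t const m = points.size();

    // negGradient = -dE/dC from equation (63).
    Vector<N, Real> negGradient;  // the zero vector
    for (size_t i = 0; i < m; ++i)
    {
        Vector<N, Real> delta = points[i] - C;
        negGradient += (Dot(delta, S * delta) - 1) * (S * delta);
    }
    negGradient *= (Real)4 / (Real)m;

    // The coefficients of g(t) from equation (67), with c = N^T S N.
    Real c = Dot(negGradient, S * negGradient);
    Real g0 = 0, g1 = 0, g2 = 0, g3 = 0, g4 = c * c;
    for (size_t i = 0; i < m; ++i)
    {
        Vector<N, Real> delta = points[i] - C;
        Real a = Dot(delta, S * delta) - 1;
        Real b = Dot(negGradient, S * delta);
        g0 += a * a;
        g1 += -4 * a * b;
        g2 += 4 * b * b + 2 * c * a;
        g3 += -4 * c * b;
    }
    g0 /= (Real)m; g1 /= (Real)m; g2 /= (Real)m; g3 /= (Real)m;

    // Of the positive roots of g'(t) = g1 + 2 g2 t + 3 g3 t^2 + 4 g4 t^3, choose
    // the root T that minimizes g(T); T = 0 when no positive root decreases g.
    Real roots[3];
    int numRoots = SolveCubic(g1, 2 * g2, 3 * g3, 4 * g4, roots);
    Real T = 0, minG = g0;  // g(0) = g0
    for (int k = 0; k < numRoots; ++k)
    {
        Real t = roots[k];
        if (t > 0)
        {
            Real g = g0 + t * (g1 + t * (g2 + t * (g3 + t * g4)));
            if (g < minG) { minG = g; T = t; }
        }
    }
    C += T * negGradient;
    return minG;
}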

6.2 Updating the Estimate of the Matrix

Given initial guesses C and S for the hyperellipsoid, the matrix can be updated to S 0 = S + tN for t ≥ 0,
where N = −∂E/∂S is the negative of the gradient of E in the S direction at the initial guess. This is
another step of the gradient descent algorithm, searching in the direction of largest decrease of E in the S
parameter. Define the function

h(t) = E(C, S + tN)
     = (1/m) Σ_{i=1}^m [ ∆_i^T (S + tN) ∆_i − 1 ]²
     = (1/m) Σ_{i=1}^m [ (∆_i^T N ∆_i) t + (∆_i^T S ∆_i − 1) ]²    (68)
     = (1/m) Σ_{i=1}^m ( b_i t + a_i )²

where b_i = ∆_i^T N ∆_i and a_i = ∆_i^T S ∆_i − 1. Some algebra will show that

h(t) = h0 + h1 t + h2 t²    (69)

where

h0 = (1/m) Σ_{i=1}^m a_i²,  h1 = (2/m) Σ_{i=1}^m a_i b_i,  h2 = (1/m) Σ_{i=1}^m b_i²    (70)

The derivative is h′(t) = h1 + 2h2 t.
Observe that h(t) ≥ 0 because it is the nonnegative error function evaluated along a line in the domain. In
practice, one never expects the minimum of E to be zero, so h(t) > 0. In particular, h(0) > 0. The search
is in the largest direction of decrease for E along any line through the initial guess, which implies h′(0) ≤ 0.
If h′(0) = 0 with h(0) a local minimum, the line search terminates. If h′(0) < 0, notice that the coefficient
h2 > 0, so h′(∞) = ∞. These conditions guarantee that h′(t) has at least one root for t > 0. Of all such
roots, let T be the root for which h(T) is the smallest value. It must be the case that h(T) < h(0).
We need to ensure that S + T N is positive definite. If it is positive definite, the updated estimate for the
matrix is S 0 = S + T N . If it is not positive definite, the root is halved and S 0 = S + (T /2)N is tested for
positive definiteness. If it is still not positive definite, the halving is repeated until a power p > 0 is obtained
for which S 0 = S + (T /2p )N is positive definite.
The test for positive definiteness uses Sylvester's criterion, which says that S is positive definite if all of
its leading principal minors are positive. The minors are determinants of the upper-left 1 × 1 block, the
upper-left 2 × 2 block, and so on through the determinant of S itself. For n = 2, S is positive definite when

det[ s00 ] > 0,   det[ s00 s01 ; s01 s11 ] > 0    (71)

For n = 3, S is positive definite when

det[ s00 ] > 0,   det[ s00 s01 ; s01 s11 ] > 0,   det[ s00 s01 s02 ; s01 s11 s12 ; s02 s12 s22 ] > 0    (72)

where each matrix is written with semicolons separating its rows.
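For n = 3, a direct sketch of this test using the Matrix3x3 and Real types of the earlier pseudocode is shown next; the determinants of equation (72) are expanded by hand.

// Sylvester's criterion for a symmetric 3x3 matrix: all leading principal
// minors must be positive.
bool IsPositiveDefinite(Matrix3x3 S)
{
    Real minor1 = S(0, 0);
    Real minor2 = S(0, 0) * S(1, 1) - S(0, 1) * S(0, 1);
    Real minor3 =
        S(0, 0) * (S(1, 1) * S(2, 2) - S(1, 2) * S(1, 2))
        - S(0, 1) * (S(0, 1) * S(2, 2) - S(1, 2) * S(0, 2))
        + S(0, 2) * (S(0, 1) * S(1, 2) - S(1, 1) * S(0, 2));
    return minor1 > 0 && minor2 > 0 && minor3 > 0;
}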

6.3 Pseudocode for the Algorithm

The initial guesses for C and S can be provided by the caller or they can be estimated internally. In the
Geometric Tools implementation, the internal algorithm is to compute an oriented bounding box for the
input points and use the box center as the initial C. The initial S is computed from the box axes and box
extents using equation (61). Listing 14 contains pseudocode for the algorithm.

Listing 14. Pseudocode for fitting a hyperellipsoid to points. The number of iterations is specified by the
caller. The function can be called multiple times; the first call can use the internal estimate of the initial
center and matrix, and a subsequent call uses the hyperellipsoid from the previous call. The function returns
the error function value, which the caller can monitor to decide whether or not to make subsequent calls.

template <size_t N, typename Real>
Real DoFit(vector<Vector<N, Real>> const& points, size_t numIterations,
    bool useHyperellipsoidForInitialGuess, Hyperellipsoid<N, Real>& hyperellipsoid)
{
    Vector<N, Real> C;
    Matrix<N, N, Real> A;  // the zero matrix
    if (useHyperellipsoidForInitialGuess)
    {
        C = hyperellipsoid.center;
        for (size_t i = 0; i < N; ++i)
        {
            Matrix<N, N, Real> product = OuterProduct(hyperellipsoid.axis[i], hyperellipsoid.axis[i]);
            A += product / (hyperellipsoid.extent[i] * hyperellipsoid.extent[i]);
        }
    }
    else
    {
        OrientedBox<N, Real> box = GetBoundingBox(points);
        C = box.center;
        for (size_t i = 0; i < N; ++i)
        {
            Matrix<N, N, Real> product = OuterProduct(box.axis[i], box.axis[i]);
            A += product / (box.extent[i] * box.extent[i]);
        }
    }

    Real error = ErrorFunction(points, C, A);

    for (size_t i = 0; i < numIterations; ++i)
    {
        error = UpdateMatrix(points, C, A);
        error = UpdateCenter(points, A, C);
    }

    // Extract the hyperellipsoid axes and extents.
    std::array<Real, N> eigenvalue;
    std::array<Vector<N, Real>, N> eigenvector;
    DoEigendecomposition(A, eigenvalue, eigenvector);

    hyperellipsoid.center = C;
    for (size_t i = 0; i < N; ++i)
    {
        hyperellipsoid.axis[i] = eigenvector[i];
        hyperellipsoid.extent[i] = 1 / sqrt(eigenvalue[i]);
    }

    return error;
}

7 Fitting a Cylinder to 3D Points

This document describes an algorithm for fitting a set of 3D points with a cylinder. The assumption is that
the underlying data is modeled by a cylinder and that errors have caused the points not to be exactly on
the cylinder. You could very well try to fit a random set of points, but the algorithm is not guaranteed to
produce a meaningful solution.

7.1 Representation of a Cylinder

An infinite cylinder is specified by an axis containing a point C and having unit-length direction W . The
radius of the cylinder is r > 0. Two more unit-length vectors U and V may be defined so that {U , V , W }

is a right-handed orthonormal set; that is, all vectors are unit-length, mutually perpendicular, and with
U × V = W , V × W = U , and W × U = V . Any point X may be written uniquely as

X = C + y0 U + y1 V + y2 W = C + RY (73)

where R is a rotation matrix whose columns are U , V , and W and where Y is a column vector whose rows
are y0 , y1 , and y2 . To be on the cylinder, we need

r² = y0² + y1²
   = (U · (X − C))² + (V · (X − C))²    (74)
   = (X − C)^T (U U^T + V V^T)(X − C)
   = (X − C)^T (I − W W^T)(X − C)

where I is the identity matrix. Because the unit-length vectors form an orthonormal set, it is necessary that
I = U U T + V V T + W W T . A finite cylinder is obtained by bounding the points in the axis direction,

|y2 | = |W · (X − C)| ≤ h/2 (75)

where h > 0 is the height of the cylinder.
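As a small sketch of equations (74) and (75), a point-on-cylinder test can be written as follows; the function name and the tolerance parameter epsilon are illustrative, not part of the document's pseudocode.

// Test whether X lies on the finite cylinder of equations (74) and (75). The
// tolerance epsilon absorbs floating-point rounding error; Vector3 and Real are
// the pseudocode types used in the listings of this document.
bool OnCylinder(Vector3 X, Vector3 C, Vector3 W, Real r, Real h, Real epsilon)
{
    Vector3 delta = X - C;
    Real y2 = Dot(W, delta);                          // the height W . (X - C)
    Real sqrDistance = Dot(delta, delta) - y2 * y2;   // (X - C)^T (I - W W^T) (X - C)
    return abs(sqrDistance - r * r) <= epsilon && abs(y2) <= h / 2;
}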

7.2 The Least-Squares Error Function

Let {X_i}_{i=1}^n be the input point set. An error function for a cylinder fit based on equation (74) is

E(r², C, W) = Σ_{i=1}^n F(X_i; r², C, W)² = Σ_{i=1}^n [ (X_i − C)^T (I − W W^T)(X_i − C) − r² ]²    (76)

where the cylinder axis is a line containing point C and having unit-length direction W and the cylinder
radius is r. Thus, the error function involves 6 parameters: 1 for the squared radius r², 3 for the point
C, and 2 for the unit-length direction W. These parameters form the 6-tuple q in the generic discussion
presented previously.
For numerical robustness, it is advisable to subtract the sample mean A = ( Σ_{i=1}^n X_i )/n from the samples,
X_i ← X_i − A. This preconditioning is assumed in the mathematical derivations to follow, in which case
Σ_{i=1}^n X_i = 0.

In the following discussion, define

P = I − W W^T,   r_i² = (C − X_i)^T P (C − X_i)    (77)

The matrix P represents a projection onto a plane with normal W, so P² = P, and P depends only on the
direction W. The term r_i² depends on the center C and the direction W. The error function is written
concisely as E = Σ_{i=1}^n (r_i² − r²)².
2

7.3 An Equation for the Radius
The partial derivative of the error function with respect to the squared radius is ∂E/∂r² = −2 Σ_{i=1}^n (r_i² − r²).
Setting this to zero, we have the constraint

0 = Σ_{i=1}^n (r_i² − r²)    (78)

which leads to

r² = (1/n) Σ_{i=1}^n r_i²    (79)
Thus, the squared radius is the average of the squared distances of the projections of X i − C onto a plane
containing C and having normal W . The right-hand side depends on the parameters C and W .
Observe that

r_i² − r² = r_i² − (1/n) Σ_{j=1}^n r_j²
         = (C − X_i)^T P (C − X_i) − (1/n) Σ_{j=1}^n (C − X_j)^T P (C − X_j)
         = C^T P C − 2X_i^T P C + X_i^T P X_i − (1/n) Σ_{j=1}^n ( C^T P C − 2X_j^T P C + X_j^T P X_j )
         = −2X_i^T P C + X_i^T P X_i + ( (2/n) Σ_{j=1}^n X_j^T ) P C − (1/n) Σ_{j=1}^n X_j^T P X_j    (80)
         = ( (2/n) Σ_{j=1}^n X_j − 2X_i )^T P C + X_i^T P X_i − (1/n) Σ_{j=1}^n X_j^T P X_j
         = −2X_i^T P C + X_i^T P X_i − (1/n) Σ_{j=1}^n X_j^T P X_j

The last equality is based on the precondition Σ_{j=1}^n X_j = 0.
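In the style of the later listings, a sketch of equations (77) and (79) for a given center C and direction W is shown next; the function name is illustrative.

// Compute r^2 of equation (79) as the average of r_i^2 = (C - X_i)^T P (C - X_i),
// where P = I - W W^T is applied implicitly via the identity
// delta^T P delta = delta . delta - (W . delta)^2 for unit-length W.
Real SquaredRadius(int n, Vector3 X[], Vector3 C, Vector3 W)
{
    Real rSqr = 0;
    for (int i = 0; i < n; ++i)
    {
        Vector3 delta = C - X[i];
        Real dot = Dot(W, delta);
        rSqr += Dot(delta, delta) - dot * dot;
    }
    return rSqr / n;
}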

7.4 An Equation for the Center


The partial derivative with respect to the center is ∂E/∂C = −4 Σ_{i=1}^n (r_i² − r²) P (X_i − C). Setting this to
zero, we have the constraint

0 = Σ_{i=1}^n (r_i² − r²) P (X_i − C)
  = Σ_{i=1}^n (r_i² − r²) P X_i − ( Σ_{i=1}^n (r_i² − r²) ) P C    (81)
  = Σ_{i=1}^n (r_i² − r²) P X_i

where the last equality is a consequence of Equation (78). Multiply equation (80) by P X_i, sum over i, and
use equation (81) to obtain

0 = −2P ( Σ_{i=1}^n X_i X_i^T ) P C + Σ_{i=1}^n (X_i^T P X_i) P X_i − ( (1/n) Σ_{j=1}^n X_j^T P X_j ) P Σ_{i=1}^n X_i
  = −2P ( Σ_{i=1}^n X_i X_i^T ) P C + Σ_{i=1}^n (X_i^T P X_i) P X_i    (82)

where the last equality is based on the precondition Σ_{i=1}^n X_i = 0. We wish to solve this equation for C, but
observe that C + tW are valid centers for all t. It is sufficient to compute a center that has no component

in the W -direction; that is, we may construct a point for which C = P C. It suffices to solve Equation (82)
for P C written as the linear system A(P C) = B/2 where
A = P ( (1/n) Σ_{i=1}^n X_i X_i^T ) P,    B = (1/n) Σ_{i=1}^n ( X_i^T P X_i ) P X_i    (83)

The projection matrix is symmetric, P = P^T, a condition that leads to the right-hand side of the equation
defining A. We have used P = P² to introduce an additional P factor, X_i^T P X_i = X_i^T P² X_i = X_i^T P^T P X_i,
which leads to the right-hand side of the equation defining B.
The matrix A is singular because the projection matrix P is singular, so we cannot directly invert A to solve
the equation. The linear system involves terms that live only in the plane perpendicular to W , so in fact
the linear system reduces to two equations in two unknowns in the projection space and is solvable as long
as the coefficient matrix is invertible.
Choose U and V so that {U , V , W } is a right-handed orthonormal set; then P X i = µi U + νi V and
P C = k0 U + k1 V , where µi = U · X i , νi = V · X i , k0 = U · P C, and k1 = V · P C. The matrix A becomes
A = ( (1/n) Σ_{i=1}^n µ_i² ) U U^T + ( (1/n) Σ_{i=1}^n µ_i ν_i ) ( U V^T + V U^T ) + ( (1/n) Σ_{i=1}^n ν_i² ) V V^T    (84)

and the vector B becomes

B = (1/n) Σ_{i=1}^n ( µ_i² + ν_i² )( µ_i U + ν_i V )    (85)

The vector A(P C) becomes

A(P C) = ( k0 (1/n) Σ_{i=1}^n µ_i² + k1 (1/n) Σ_{i=1}^n µ_i ν_i ) U + ( k0 (1/n) Σ_{i=1}^n µ_i ν_i + k1 (1/n) Σ_{i=1}^n ν_i² ) V    (86)

Equating this to B/2 and grouping the coefficients for U and V leads to the linear system

[ (1/n) Σ_{i=1}^n µ_i²      (1/n) Σ_{i=1}^n µ_i ν_i ] [ k0 ]   [ (1/2)(1/n) Σ_{i=1}^n (µ_i² + ν_i²) µ_i ]
[ (1/n) Σ_{i=1}^n µ_i ν_i   (1/n) Σ_{i=1}^n ν_i²    ] [ k1 ] = [ (1/2)(1/n) Σ_{i=1}^n (µ_i² + ν_i²) ν_i ]    (87)

The coefficient matrix is the covariance matrix of the projection of the samples onto the plane perpendicular
to W . Intuitively, this matrix is invertible as long as the projections do not lie on a line. If the matrix
is singular (or nearly singular numerically), the original samples lie on a plane (or nearly lie on a plane
numerically). They are not fitted well by a cylinder or, if you prefer, they are fitted by a cylinder with
infinite radius.
The matrix system of Equation (87) has solution

[ k0 ]   (1/δ) [  (1/n) Σ_{i=1}^n ν_i²     −(1/n) Σ_{i=1}^n µ_i ν_i ] [ (1/2)(1/n) Σ_{i=1}^n (µ_i² + ν_i²) µ_i ]
[ k1 ] =       [ −(1/n) Σ_{i=1}^n µ_i ν_i   (1/n) Σ_{i=1}^n µ_i²    ] [ (1/2)(1/n) Σ_{i=1}^n (µ_i² + ν_i²) ν_i ]    (88)

where δ = ( (1/n) Σ_{i=1}^n µ_i² )( (1/n) Σ_{i=1}^n ν_i² ) − ( (1/n) Σ_{i=1}^n µ_i ν_i )² is the determinant of the coefficient matrix,

which produces the cylinder center P C = k0 U + k1 V ; use this instead of C in Equation (74).

Although the solution appears to depend on the choice of U and V, it does not. Let W = (w0, w1, w2) and
define the skew symmetric matrix

S = [  0   −w2   w1 ]
    [  w2   0   −w0 ]    (89)
    [ −w1   w0    0 ]
By definition of skew symmetry, S^T = −S. This matrix represents the cross product operation: Sξ = W × ξ
for any vector ξ. Because {U, V, W} is a right-handed orthonormal set, it follows that SU = V and
SV = −U. It can be shown also that S = V U^T − U V^T. Define the matrix Â by

Â = ( (1/n) Σ_{i=1}^n ν_i² ) U U^T − ( (1/n) Σ_{i=1}^n µ_i ν_i ) ( U V^T + V U^T ) + ( (1/n) Σ_{i=1}^n µ_i² ) V V^T = S A S^T    (90)

Effectively, this generates a 2 × 2 matrix that is the adjugate of the 2 × 2 matrix representing A. It has the
property

ÂA = δP,   δ = ( (1/n) Σ_{i=1}^n µ_i² )( (1/n) Σ_{i=1}^n ν_i² ) − ( (1/n) Σ_{i=1}^n µ_i ν_i )²    (91)

The trace of a matrix is the sum of the diagonal entries. Observe that Trace(P) = 2 because |W|² = 1.
Taking the trace of ÂA = δP, we obtain 2δ = Trace(ÂA). The cylinder center is obtained by multiplying
A(P C) = B/2 by Â, using the definitions in equation (83) and using equation (91),

P C = ( Â / Trace(ÂA) ) ( (1/n) Σ_{i=1}^n (X_i^T P X_i) P X_i ) = ( Â / Trace(ÂA) ) ( (1/n) Σ_{i=1}^n (X_i^T P X_i) X_i )    (92)

where the last equality uses ÂP = Â, a consequence of S^T P = S^T. Equation (92) is independent of U and V but
dependent on W.

7.5 An Equation for the Direction

Let the direction be parameterized as W(s) = (w0(s), w1(s), w2(s)), where s is a 2-dimensional parameter.
For example, spherical coordinates is such a parameterization: W = (cos s0 sin s1, sin s0 sin s1, cos s1) for
s0 ∈ [0, 2π) and s1 ∈ [0, π/2], where w2(s0, s1) ≥ 0. The partial derivatives of E are

∂E/∂s_k = 2 Σ_{i=1}^n (r_i² − r²) (C − X_i)^T (∂P/∂s_k) (C − X_i)    (93)

Solving ∂E/∂sk = 0 in closed form is not tractable. It is possible to generate a system of polynomial
equations in the components of W , use elimination theory to obtain a polynomial in one variable, and then
find its roots. This approach is generally tedious and not robust numerically.
Alternatively, we can skip root finding for ∂E/∂s_k = 0, instead substituting Equations (80) and (92) directly
into the error function E/n = (1/n) Σ_{i=1}^n (r_i² − r²)² to obtain a nonnegative function,

G(W) = (1/n) Σ_{i=1}^n [ X_i^T P X_i − (1/n) Σ_{j=1}^n X_j^T P X_j − 2 X_i^T ( Â / Trace(ÂA) ) ( (1/n) Σ_{j=1}^n (X_j^T P X_j) X_j ) ]²    (94)

A numerical algorithm for locating the minimum of G can be used. Or, as is shown in the sample code,
the domain for (s0 , s1 ) may be partitioned into samples at which G is evaluated. The sample producing the
minimum G-value determines a reasonable direction W. The center C and squared radius r² are inherent
in the evaluation of G, so in the end we have a fitted cylinder. The evaluations of G are expensive for a large
number of samples: equation (94) contains summations in a term that is then squared, followed by another
summation.
Some algebraic manipulation leads to encapsulating the summations, allowing us to precompute the summa-
tions and represent G(W) as a rational polynomial in the components of W. This approach increases the
performance on the CPU. It also allows an efficient implementation for massively parallel performance on
the GPU: one GPU thread per direction vector that is sampled from a hemisphere.
The projection matrix P is determined by its upper-triangular elements p = (p00 , p01 , p02 , p11 , p12 , p22 ). We
can write the following, where p is represented as a 6 × 1 vector,
X_i^T P X_i − (1/n) Σ_{j=1}^n X_j^T P X_j = p · (ξ_i − µ) = p · δ_i    (95)

The 6 × 1 vectors ξ_i, µ and δ_i are defined next (written as 6-tuples). As a 3-tuple, let X_i = (x_i, y_i, z_i),

ξ_i = ( x_i², 2x_i y_i, 2x_i z_i, y_i², 2y_i z_i, z_i² ),   µ = (1/n) Σ_{i=1}^n ξ_i,   δ_i = ξ_i − µ    (96)

We can also write

(1/n) Σ_{j=1}^n ( X_j^T P X_j ) X_j = ( (1/n) Σ_{j=1}^n X_j ξ_j^T ) p = ( (1/n) Σ_{j=1}^n X_j δ_j^T ) p    (97)

The last equality is true because Σ_{j=1}^n X_j = 0 implies Σ_{j=1}^n X_j µ^T = 0. Define
Q = Â / Trace(ÂA),   F0 = (1/n) Σ_{j=1}^n X_j X_j^T,   F1 = (1/n) Σ_{j=1}^n X_j δ_j^T,   F2 = (1/n) Σ_{j=1}^n δ_j δ_j^T    (98)

where Q and F0 are 3 × 3, F1 is 3 × 6 and F2 is 6 × 6; then


G(W) = (1/n) Σ_{i=1}^n [ p^T δ_i − 2 X_i^T Q F1 p ]²
     = (1/n) Σ_{i=1}^n [ (p^T δ_i)² − 4 (p^T δ_i)(X_i^T Q F1 p) + 4 (X_i^T Q F1 p)² ]    (99)
     = p^T F2 p − 4 p^T F1^T Q F1 p + 4 p^T F1^T Q^T F0 Q F1 p

The precomputations for the input samples {Y_i}_{i=1}^n are listed next and must be done in the order
specified. These steps are independent of the direction vectors W.

1. Compute A = (1/n) Σ_{i=1}^n Y_i.

2. Compute X_i = Y_i − A for all i.

3. Compute ξ_i = (x_i², 2x_i y_i, 2x_i z_i, y_i², 2y_i z_i, z_i²) for all i, µ = (1/n) Σ_{i=1}^n ξ_i and δ_i = ξ_i − µ for all i.

4. Compute F0 = (1/n) Σ_{i=1}^n X_i X_i^T, F1 = (1/n) Σ_{i=1}^n X_i δ_i^T, F2 = (1/n) Σ_{i=1}^n δ_i δ_i^T.

For each specified direction W, do the following steps.

1. Compute the projection matrix P = I − W W^T and the skew-symmetric matrix S.

2. Compute A = P F0 P, Â = S A S^T, ÂA and Trace(ÂA).

3. Compute Q = Â / Trace(ÂA).

4. Store the upper-triangular entries of P in p.

5. Compute α = F1 p and β = Q α.

6. Compute G = p^T F2 p − 4 α^T β + 4 β^T F0 β.

7. The corresponding center is P C = β.

8. The corresponding squared radius is r² = (1/n) Σ_{i=1}^n r_i², where r_i² = (P C − P X_i)^T (P C − P X_i). This
   factors to r² = p · µ + β^T β.

The sample application that used equation (94) directly was really slow. On an Intel Core i7-6700
CPU at 3.40 GHz, the single-threaded version for 10765 points required 129 seconds and the multithreaded
version using 8 hyperthreads required 22 seconds. The evaluation of G using the precomputed summations
is much faster. The single-threaded version required 85 milliseconds and the multithreaded version using 8
hyperthreads required 22 milliseconds.

7.6 Fitting for a Specified Direction

Although one may apply root-finding or minimization techniques to estimate the global minimum of E, as
shown previously, in practice it is possible first to obtain a good estimate for the direction W. Using this
direction, we may solve for C = P C in Equation (92) and then r² in Equation (79).
For example, suppose that the X_i are distributed approximately on a section of a cylinder so that the least-
squares line that fits the data provides a good estimate for the direction W. This vector is a unit-length
eigenvector associated with the largest eigenvalue of the covariance matrix A = Σ_{i=1}^n X_i X_i^T. We may use
a numerical eigensolver to obtain W, and then solve the aforementioned equations for the cylinder center
and squared radius.
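When the covariance estimate is adequate, a sketch of this special case reuses the Preprocess and G functions of Listing 15 in the next section; SolveEigensystem is assumed to sort eigenvalues in nondecreasing order, as in the earlier listings, and the function name is illustrative.

// Fit a cylinder using the covariance direction of maximum variance. The matrix
// F0 computed by Preprocess is the covariance matrix of the translated samples
// (up to a scale factor that does not affect the eigenvectors).
Real FitCylinderForCovarianceDirection(int n, Vector3 points[n], Real& rSqr, Vector3& C, Vector3& W)
{
    Vector3 X[n], average;
    Vector6 mu;
    Matrix3x3 F0;
    Matrix3x6 F1;
    Matrix6x6 F2;
    Preprocess(n, points, X, average, mu, F0, F1, F2);

    Real eigenvalues[3];
    Vector3 eigenvectors[3];
    SolveEigensystem(F0, eigenvalues, eigenvectors);
    W = eigenvectors[2];  // unit-length direction of maximum variance

    Real error = G(n, X, mu, F0, F1, F2, W, C, rSqr);
    C += average;  // translate the center back to the original coordinate system
    return error;
}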
The distribution can be such that the estimated direction W from the covariance matrix is not good, as is
shown in the experiments of the next section.

7.7 Pseudocode and Experiments

The simplest algorithm to implement involves the minimization of the function G in Equation (94). An
implementation is shown in Listing 15.

Listing 15. Pseudocode for preprocessing the sample points. The output X[] is the array of sample
points translated by the average. The other outputs are mu for µ, F0 for F0, F1 for F1 and F2 for F2. The
pseudocode is also given for evaluating the function G(W) and generating the corresponding P C and r².
void Preprocess(int n, Vector3 points[n], Vector3 X[n],
    Vector3& average, Vector6& mu, Matrix3x3& F0, Matrix3x6& F1, Matrix6x6& F2)
{
    average = { 0, 0, 0 };
    for (int i = 0; i < n; ++i)
    {
        average += points[i];
    }
    average /= n;
    for (int i = 0; i < n; ++i)
    {
        X[i] = points[i] - average;
    }

    Vector6 zero = { 0, 0, 0, 0, 0, 0 };
    Vector6 products[n];
    MakeZero(products);
    mu = zero;
    for (int i = 0; i < n; ++i)
    {
        products[i][0] = X[i][0] * X[i][0];
        products[i][1] = X[i][0] * X[i][1];
        products[i][2] = X[i][0] * X[i][2];
        products[i][3] = X[i][1] * X[i][1];
        products[i][4] = X[i][1] * X[i][2];
        products[i][5] = X[i][2] * X[i][2];
        mu[0] += products[i][0];
        mu[1] += 2 * products[i][1];
        mu[2] += 2 * products[i][2];
        mu[3] += products[i][3];
        mu[4] += 2 * products[i][4];
        mu[5] += products[i][5];
    }
    mu /= n;

    MakeZero(F0);
    MakeZero(F1);
    MakeZero(F2);
    for (int i = 0; i < n; ++i)
    {
        Vector6 delta;
        delta[0] = products[i][0] - mu[0];
        delta[1] = 2 * products[i][1] - mu[1];
        delta[2] = 2 * products[i][2] - mu[2];
        delta[3] = products[i][3] - mu[3];
        delta[4] = 2 * products[i][4] - mu[4];
        delta[5] = products[i][5] - mu[5];
        F0(0, 0) += products[i][0];
        F0(0, 1) += products[i][1];
        F0(0, 2) += products[i][2];
        F0(1, 1) += products[i][3];
        F0(1, 2) += products[i][4];
        F0(2, 2) += products[i][5];
        F1 += OuterProduct(X[i], delta);
        F2 += OuterProduct(delta, delta);
    }
    F0 /= n;
    F0(1, 0) = F0(0, 1);
    F0(2, 0) = F0(0, 2);
    F0(2, 1) = F0(1, 2);
    F1 /= n;
    F2 /= n;
}

Real G(int n, Vector3 X[n], Vector6 mu, Matrix3x3 F0, Matrix3x6 F1, Matrix6x6 F2, Vector3 W,
    Vector3& PC, Real& rSqr)
{
    Matrix3x3 P = Matrix3x3::Identity() - OuterProduct(W, W);  // P = I - W * W^T
    // S = {{0, -w2, w1}, {w2, 0, -w0}, {-w1, w0, 0}}, inner braces are rows
    Matrix3x3 S(0, -W[2], W[1], W[2], 0, -W[0], -W[1], W[0], 0);
    Matrix3x3 A = P * F0 * P;
    Matrix3x3 hatA = -(S * A * S);  // hatA = S * A * S^T because S^T = -S
    Matrix3x3 hatAA = hatA * A;
    Real trace = Trace(hatAA);
    Matrix3x3 Q = hatA / trace;
    Vector6 p = { P(0, 0), P(0, 1), P(0, 2), P(1, 1), P(1, 2), P(2, 2) };
    Vector3 alpha = F1 * p;
    Vector3 beta = Q * alpha;
    Real error = (Dot(p, F2 * p) - 4 * Dot(alpha, beta) + 4 * Dot(beta, F0 * beta)) / n;
    PC = beta;
    rSqr = Dot(p, mu) + Dot(beta, beta);
    return error;
}

The fitting is performed by searching a large number of directions W , as shown in Listing 16.

Listing 16. Fitting a cylinder to a set of points.


// The X[] are the points to be fit. The outputs rSqr, C, and W are the
// cylinder parameters. The function return value is the error function
// evaluated at the cylinder parameters.
Real FitCylinder(int n, Vector3 points[n], Real& rSqr, Vector3& C, Vector3& W)
{
    Vector3 X[n];
    Vector3 average;
    Vector6 mu;
    Matrix3x3 F0;
    Matrix3x6 F1;
    Matrix6x6 F2;
    Preprocess(n, points, X, average, mu, F0, F1, F2);

    // Choose imax and jmax as desired for the level of granularity you
    // want for sampling W vectors on the hemisphere.
    Real minError = infinity;
    W = Vector3::Zero();
    C = Vector3::Zero();
    rSqr = 0;
    for (int j = 0; j <= jmax; ++j)
    {
        Real phi = halfPi * j / jmax;  // in [0, pi/2]
        Real csphi = cos(phi), snphi = sin(phi);
        for (int i = 0; i < imax; ++i)
        {
            Real theta = twoPi * i / imax;  // in [0, 2*pi)
            Real cstheta = cos(theta), sntheta = sin(theta);
            Vector3 currentW(cstheta * snphi, sntheta * snphi, csphi);
            Vector3 currentC;
            Real currentRSqr;
            Real error = G(n, X, mu, F0, F1, F2, currentW, currentC, currentRSqr);
            if (error < minError)
            {
                minError = error;
                W = currentW;
                C = currentC;
                rSqr = currentRSqr;
            }
        }
    }

    // Translate the center to the original coordinate system.
    C += average;

    return minError;
}
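A hypothetical call site follows; the sample count and the fill step are placeholders, and the grid counts imax and jmax are assumed to be visible to FitCylinder as in the listing.

// Example usage of Listing 16.
Vector3 points[1024];
// ... fill points[] with samples near a cylinder ...
Real rSqr;
Vector3 C, W;
Real minError = FitCylinder(1024, points, rSqr, C, W);
Real radius = sqrt(rSqr);  // the cylinder radius r of equation (74)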

The following experiments show how well the cylinder fitting works.
A regular lattice of 64 × 65 samples was chosen for a cylinder centered at the origin, with axis direction
(0, 0, 1), with radius 1, and height 4. The samples are (cos θj , sin θj , zij ), where θj = 2πj/64 and zij =
−2 + 4i/64 for 0 ≤ i ≤ 64 and 0 ≤ j < 64. The fitted center, radius, and axis direction are the actual ones
(modulo numerical round-off errors). Figure 1 shows a rendering of the points (in black) and a wire frame
of the fitted cylinder (in blue).

Figure 1. Fitting of samples on a cylinder ring that is cut perpendicular to the cylinder axis.

A lattice of samples was chosen for the same cylinder but the samples are skewed to lie on a cut that is not
perpendicular to the cylinder axis. The samples are (cos θj , sin θj , zij ), where θj = 2πj/64 for 0 ≤ j < 64
and zij = −b + cos θj + 2bi/64 for b = 1/4 and 0 ≤ i ≤ 64. The fitted center, radius, and axis direction are
the actual ones (modulo numerical round-off errors). Figure 2 shows a rendering of the points (in black) and
a wire frame of the fitted cylinder (in blue).

Figure 2. Fitting of samples on a cylinder ring that is cut skewed relative to the cylinder axis.

In this example, if you were to compute the covariance matrix of the samples and choose the cylinder axis
direction to be the eigenvector in the direction of maximum variance, that direction is skewed relative to the
original cylinder axis. The fitted parameters are (approximately) W = (0.699, 0, 0.715), P C = (0, 0, 0) and
r² = 0.511. Figure 3 shows a rendering of the points (in black) and a wire frame of the fitted cylinder (in
blue).

Figure 3. Fitting of samples on a cylinder ring that is cut skewed relative to the cylinder axis. The direction
W was chosen to be an eigenvector corresponding to the maximum eigenvalue of the covariance matrix of
the samples.

You can see that the fitted cylinder is not a good approximation to the points.
Figure 4 shows a point cloud generated from a DIC file (data set courtesy of Carolina Lavecchia) and the
fitted cylinder. The data set has nearly 11000 points.

Figure 4. A view of a fitted point cloud.

Figure 5 shows a different view of the point cloud and cylinder.

Figure 5. Another view of a fitted point cloud, looking along the top of the cylinder to get an idea of how
well the cylinder fits the data.

7.8 Fitting a Cylinder to a Triangle Mesh

Imagine having a triangle mesh that approximates a portion of an infinite cylinder. When the triangles have
small area and have normal vectors perpendicular to the cylinder axis, the projection of the triangles onto a
plane perpendicular to the cylinder axis covers a region also of small area. If you were to project the cylinder
itself onto the plane, you get a circle (not a disk) that covers a region of zero area. This suggests locating a
direction D on the hemisphere of points (x, y, z) with z ≥ 0 for which the sum of areas of projections of the
triangles of the mesh is minimized.
Let the mesh have n points {P_i}_{i=0}^{n−1} and m triangles, each a triple (i0, i1, i2) of indices into the points. Given
a unit-length direction D, let U and V be vectors for which {U, V, D} is a right-handed orthonormal set.
For numerical robustness, compute the average A = (1/n) Σ_{i=0}^{n−1} P_i and replace P_i ← P_i − A. The points
are projected onto a plane perpendicular to D; they are (u_i, v_i) = (U · P_i, V · P_i). Given the vertices of a

projected triangle, say (u0, v0), (u1, v1) and (u2, v2), the area of the projected triangle is

A = (1/2) | (u1 − u0, v1 − v0) · (u2 − u0, v2 − v0)^⊥ |    (100)

where (u, v)^⊥ = (v, −u). The areas can be summed over all projected triangles, ignoring the overlap of
projections; that is, the sum of areas is generally larger than the area of the union of the projections.
Pseudocode for a simple search over the hemisphere of directions to locate a minimum sum of areas is shown
in Listing 17.

Listing 17. A simple search over the hemisphere of directions to locate a direction that minimizes the
sum of areas of projected triangles. The factor 1/2 is ignored, so the minimum is over twice the area.
struct Cylinder { Vector3 center, direction; Real radius, height; };
struct Circle { Vector2 center; Real radius; };

Cylinder FitMeshByCylinder(int numPoints, Vector3 points[], int numTriangles, int3 triangles[])
{
    // Translate the points so that the average is the origin (for numerical robustness).
    Vector3 localPoints[numPoints];
    Vector3 average = { 0, 0, 0 };
    for (int i = 0; i < numPoints; ++i)
    {
        average += points[i];
    }
    average /= numPoints;
    for (int i = 0; i < numPoints; ++i)
    {
        localPoints[i] = points[i] - average;
    }

    // Locate the direction D that minimizes the sum of the areas of the
    // projected triangles.
    Vector3 D;
    Real minMeasure;
    SearchForMinimum(numPoints, localPoints, numTriangles, triangles, D, minMeasure);

    // Project the points onto a plane perpendicular to D. This is the
    // setup for fitting the projected points by a circle.
    Vector3 U, V;
    ComputeOrthonormalBasis(D, U, V);
    Vector2 projections[numPoints];
    Real hmin = +infinity, hmax = -infinity;
    for (int i = 0; i < numPoints; ++i)
    {
        Real h = Dot(D, localPoints[i]);
        hmin = min(h, hmin);
        hmax = max(h, hmax);
        projections[i][0] = Dot(U, localPoints[i]);
        projections[i][1] = Dot(V, localPoints[i]);
    }

    // The file ApprCircle2.h has a class with member function FitUsingSquaredLength
    // that can be used for the fitting of the projected points by a circle.
    Circle circle;
    FitPointsByCircle(numPoints, projections, circle);

    // Inverse project the circle to obtain cylinder parameters.
    Cylinder cylinder;
    cylinder.center = average + (circle.center[0] * U + circle.center[1] * V) + ((hmax + hmin) / 2) * D;
    cylinder.direction = D;
    cylinder.radius = circle.radius;
    cylinder.height = hmax - hmin;
    return cylinder;
}

void SearchForMinimum(int numPoints, Vector3 points[], int numTriangles, int3 triangles[],
    Vector3& minDirection, Real& minMeasure)
{
    // Handle the north pole (0,0,1) separately.
    minDirection = { 0, 0, 1 };
    minMeasure = GetMeasure(minDirection, numPoints, points, numTriangles, triangles);

    // Process a regular grid of (theta, phi) angles.
    for (int j = 1; j <= numPhiSamples; ++j)
    {
        Real phi = (pi / 2) * j / numPhiSamples;  // in [0, pi/2]
        Real csphi = cos(phi), snphi = sin(phi);
        for (int i = 0; i < numThetaSamples; ++i)
        {
            Real theta = 2 * pi * i / numThetaSamples;  // in [0, 2*pi)
            Real cstheta = cos(theta), sntheta = sin(theta);
            Vector3 direction = { cstheta * snphi, sntheta * snphi, csphi };
            Real measure = GetMeasure(direction, numPoints, points, numTriangles, triangles);
            if (measure < minMeasure)
            {
                minDirection = direction;
                minMeasure = measure;
            }
        }
    }
}

Real GetMeasure(Vector3 D, int numPoints, Vector3 points[], int numTriangles, int3 triangles[])
{
    Vector3 U, V;
    ComputeOrthonormalBasis(D, U, V);
    Vector2 projections[numPoints];
    for (int i = 0; i < numPoints; ++i)
    {
        projections[i][0] = Dot(U, points[i]);
        projections[i][1] = Dot(V, points[i]);
    }

    // Add up 2*area of the triangles.
    Real measure = 0;
    for (int t = 0; t < numTriangles; ++t)
    {
        Vector2 vertex[3];
        for (int i = 0; i < 3; ++i)
        {
            vertex[i] = projections[triangles[t][i]];
        }
        Vector2 edge10 = vertex[1] - vertex[0], edge20 = vertex[2] - vertex[0];
        measure += abs(edge10[0] * edge20[1] - edge10[1] * edge20[0]);
    }
    return measure;
}

8 Fitting a Cone to 3D Points

The cone vertex is V, the unit-length axis direction is U and the cone angle is θ ∈ (0, π/2). The cone is
defined algebraically by those points X for which

U · (X − V) / |X − V| = cos(θ)    (101)

This can be written as a quadratic equation

(X − V)^T ( cos²(θ) I − U U^T )(X − V) = 0    (102)

with the implicit constraint that U · (X − V) > 0; that is, X is on the "positive" cone. Define W = U / cos(θ),
so |W| > 1 and

F(X; V, W) = (X − V)^T ( I − W W^T )(X − V) = 0    (103)

The nonlinear least-squares fitting of points {X_i}_{i=0}^{n−1} computes V and W to minimize the error function

E(V, W) = Σ_{i=0}^{n−1} F(X_i; V, W)²    (104)

This can be solved using standard iterative minimizers such as the Gauss–Newton method and the Levenberg–
Marquardt method. The partial derivatives of the F terms are needed for ∂E/∂V and ∂E/∂W in order to
compute the Jacobian matrix,

∂F/∂V = −2 ( ∆ − (W · ∆) W ),    ∂F/∂W = −2 (W · ∆) ∆    (105)

where ∆ = X − V. Implementations are found in ApprCone3.h.
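A minimal sketch of the per-sample residual and gradient that such a minimizer consumes is shown next, using equations (103) and (105); the function name and the (V, W) ordering of the 6-tuple are illustrative, not necessarily those of ApprCone3.h.

// Evaluate F_i and the 6 entries of the corresponding Jacobian row, the
// derivatives of F with respect to (V[0..2], W[0..2]) from equation (105).
void EvaluateResidualAndJacobianRow(Vector3 X, Vector3 V, Vector3 W,
    Real& F, Real jacobianRow[6])
{
    Vector3 delta = X - V;
    Real WdotDelta = Dot(W, delta);
    F = Dot(delta, delta) - WdotDelta * WdotDelta;  // equation (103)
    Vector3 dFdV = -2 * (delta - WdotDelta * W);
    Vector3 dFdW = -2 * WdotDelta * delta;
    for (int k = 0; k < 3; ++k)
    {
        jacobianRow[k] = dFdV[k];
        jacobianRow[k + 3] = dFdW[k];
    }
}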

8.1 Initial Choice for the Parameters of the Error Function

Without any application-specific knowledge about the sample points, the simplest initial choices for the
parameters are based on assuming the sample points are dense on a frustum of a cone and
then using integrals of various quantities over that frustum. The cone frustum surface is parameterized by

P(h, φ) = V + hU + h tan θ (cos φ W_0 + sin φ W_1),   h ∈ [h0, h1], φ ∈ [0, 2π)    (106)
where 0 ≤ h0 < h1 . The set {U , W 0 , W 1 } is a right-handed orthonormal basis for R3 ; that is, the vectors
are unit length, mutually perpendicular, and U = W 0 × W 1 . We want to use the sample points X i to
determine an initial choice for the cone axis direction U , the cone angle θ and the height bounds h0 and h1
of the cone frustum. The integrals involve ratio expressions that are defined by
ρ_n = ( ∫_{h0}^{h1} h^{n−1} dh ) / ( ∫_{h0}^{h1} h dh ) = ( (h1^n − h0^n)/n ) / ( (h1² − h0²)/2 )    (107)
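For example, the frustum generated later in Listing 18 has h0 = 1 and h1 = 2, for which ρ3 = ((2³ − 1³)/3)/((2² − 1²)/2) = (7/3)/(3/2) = 14/9 ≈ 1.556 and ρ4 = ((2⁴ − 1⁴)/4)/((2² − 1²)/2) = (15/4)/(3/2) = 5/2.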

Define the rotation matrix R = [U W_0 W_1] whose columns are the specified basis vectors and define Y(h, φ)
to be the 3 × 1 vector which in 3-tuple form is (h, h tan θ cos φ, h tan θ sin φ). The parameterization of the
surface is concisely P(h, φ) = V + RY(h, φ). Several averages over the frustum surface are formulated next.
For notational convenience, I will use Y with the understanding that it depends on h and φ.
The element of surface area for the cone frustum surface is

dA = | ∂P/∂h × ∂P/∂φ | dh dφ
   = | (U + tan θ (cos φ W_0 + sin φ W_1)) × (h tan θ (−sin φ W_0 + cos φ W_1)) | dh dφ
   = (h tan θ) | −sin φ U × W_0 + cos φ U × W_1 + tan θ W_0 × W_1 | dh dφ
   = h tan θ √(sin²φ + cos²φ + tan²θ) dh dφ    (108)
   = h tan θ √(1 + tan²θ) dh dφ
   = h tan θ √(1/cos²θ) dh dφ
   = (h tan θ / cos θ) dh dφ

and the surface area of the cone frustum is

A = ∫_{h0}^{h1} ∫_0^{2π} dA = ∫_{h0}^{h1} ∫_0^{2π} (h tan θ / cos θ) dh dφ = π (tan θ / cos θ)(h1² − h0²)    (109)

The average of Y over the cone frustum surface is

Ȳ = (1/A) ∫_{h0}^{h1} ∫_0^{2π} Y dA
  = (1/(π(h1² − h0²))) ∫_{h0}^{h1} ∫_0^{2π} Y h dh dφ
  = (1/(π(h1² − h0²))) ∫_{h0}^{h1} ∫_0^{2π} (1, tan θ cos φ, tan θ sin φ)^T h² dh dφ    (110)
  = ρ_3 (1, 0, 0)^T

The average of Y^T Y over the cone frustum surface is

(1/A) ∫_{h0}^{h1} ∫_0^{2π} Y^T Y dA
  = (1/(π(h1² − h0²))) ∫_{h0}^{h1} ∫_0^{2π} Y^T Y h dh dφ
  = (1/(π(h1² − h0²))) ∫_{h0}^{h1} ∫_0^{2π} (1 + tan²θ) h³ dh dφ    (111)
  = ρ_4 sec²θ

The average of Y Y^T over the cone frustum surface is

(1/A) ∫_{h0}^{h1} ∫_0^{2π} Y Y^T dA
  = (1/(π(h1² − h0²))) ∫_{h0}^{h1} ∫_0^{2π} Y Y^T h dh dφ
  = (1/(π(h1² − h0²))) ∫_{h0}^{h1} ∫_0^{2π} [ 1, tanθ cosφ, tanθ sinφ ; tanθ cosφ, tan²θ cos²φ, tan²θ sinφ cosφ ; tanθ sinφ, tan²θ sinφ cosφ, tan²θ sin²φ ] h³ dh dφ    (112)
  = ρ_4 Diag( 1, (1/2) tan²θ, (1/2) tan²θ )


The average of Y Y^T Y over the cone frustum surface is

(1/A) ∫_{h0}^{h1} ∫_0^{2π} Y Y^T Y dA
  = (1/(π(h1² − h0²))) ∫_{h0}^{h1} ∫_0^{2π} Y Y^T Y h dh dφ
  = (1/(π(h1² − h0²))) ∫_{h0}^{h1} ∫_0^{2π} (1 + tan²θ)(1, tan θ cos φ, tan θ sin φ)^T h⁴ dh dφ    (113)
  = ρ_5 sec²θ (1, 0, 0)^T

In the following, P is the surface parameterization with the understanding that it depends on h and φ. The
average of the points over the cone frustum surface is

C = (1/A) ∫_{h0}^{h1} ∫_0^{2π} P dA
  = (1/(π(h1² − h0²))) ∫_{h0}^{h1} ∫_0^{2π} ( hV + h²U + h² tan θ (cos φ W_0 + sin φ W_1) ) dh dφ    (114)
  = V + ρ_3 U

Define the difference ∆ = V − C = −ρ_3 U, in which case P − C = ∆ + RY. Define Z = P − C and define
i to be the 3 × 1 vector which as a 3-tuple is (1, 0, 0). Observe that Ri = U. The average of ZZ^T over the
cone frustum surface is

(1/A) ∫_{h0}^{h1} ∫_0^{2π} Z Z^T dA
  = (1/A) ∫_{h0}^{h1} ∫_0^{2π} ( ∆∆^T + ∆Y^T R^T + RY ∆^T + RY Y^T R^T ) dA
  = (−ρ3 U)(−ρ3 U)^T + (−ρ3 U)(ρ3 i)^T R^T + R(ρ3 i)(−ρ3 U)^T    (115)
    + R ( ρ4 Diag( 1, (1/2) tan²θ, (1/2) tan²θ ) ) R^T
  = (ρ4 − ρ3²) U U^T + (1/2) ρ4 tan²θ ( W_0 W_0^T + W_1 W_1^T )
  = R Diag( ρ4 − ρ3², (1/2) ρ4 tan²θ, (1/2) ρ4 tan²θ ) R^T


The matrix ZZ T is the covariance matrix of the cone frustum surface points and its eigendecomposition is

given by the right-hand side of equation (115). The average of ZZ^T Z over the cone frustum surface is

(1/A) ∫_{h0}^{h1} ∫_0^{2π} Z Z^T Z dA
  = (1/A) ∫_{h0}^{h1} ∫_0^{2π} ( (∆^T∆)∆ + (2∆∆^T R)Y + (Y^T Y)∆ + (∆^T∆)RY + (2RY Y^T R^T)∆ + RY Y^T Y ) dA
  = ( −ρ3³ + 2ρ3³ − ρ3 ρ4 sec²θ + ρ3³ − 2ρ3 ρ4 + ρ5 sec²θ ) U    (116)
  = ( 2ρ3(ρ3² − ρ4) + (ρ5 − ρ3 ρ4) sec²θ ) U

Equations (115) and (116) can be manipulated to obtain 3 equations in the 3 unknowns h0 , h1 and tan θ.
In this sense, if we have good estimates for C, ZZ T and ZZ T Z, we can reconstruct the cone frustum from
a dense set of samples on or near the frustum.

8.1.1 Simple Attempt to Reconstruct Height Extremes

The cone parameterization in terms of the centroid C is

P(h, φ) = C + (h − ρ3)U + h tan θ (cos φ W_0 + sin φ W_1)    (117)

The minimum and maximum of the projections onto U relative to C are ĥ0 = h0 − ρ3 and ĥ1 = h1 − ρ3,
respectively. Recall that ρ3 = ((h1³ − h0³)/3)/((h1² − h0²)/2), so we have two equations in h0 and h1 that we
can solve in terms of ĥ0 and ĥ1,

h0 = ( ĥ0² + ĥ0 ĥ1 − 2ĥ1² ) / ( 3(ĥ0 + ĥ1) ),    h1 = ( ĥ1² + ĥ1 ĥ0 − 2ĥ0² ) / ( 3(ĥ1 + ĥ0) )    (118)

Because C is the centroid, it is necessary that ĥ0 < 0 < ĥ1 . Also notice that h1 − h0 = ĥ1 − ĥ0 , and the
right-hand side is a robust estimate of h1 − h0 .
Although this is a simple mathematical problem given infinitely many points on a cone frustum surface, in
practice it has problems. Listing 18 contains code to generate a rectangular mesh of points on a cone frustum
that leads to ĥ1 being nearly equal to −ĥ0 , which causes the denominator in the h0 and h1 reconstruction
equations to be nearly zero. The reconstructed h0 and h1 are nowhere near what they should be.

Listing 18. The reconstructed h0 and h1 for a dense sample of points on the cone frustum are grossly
incorrect. The cone axis direction is estimated using equation (116) rather than as an eigenvector from
equation (115); however, both estimates are nearly identical and nearly equal to the true direction (1, 2, 3)/√14.
Vector3<double> V = { 3.0, 2.0, 1.0 };
Vector3<double> U = { 1.0, 2.0, 3.0 };
Vector3<double> basis[3];
basis[0] = U;
ComputeOrthogonalComplement(1, basis);
U = basis[0];
Vector3<double> W0 = basis[1];
Vector3<double> W1 = basis[2];

double h0 = 1.0;
double h1 = 2.0;
double theta = GTE_C_PI / 4.0;
double tantheta = std::tan(theta);

size_t const numH = 512, numR = 512;
std::vector<Vector3<double>> X(numH * numR);
for (size_t ih = 0, i = 0; ih < numH; ++ih)
{
    double h = h0 + (h1 - h0) * (double)ih / (double)(numH - 1);
    double r = h * tantheta;
    for (size_t ir = 0; ir < numR; ++ir, ++i)
    {
        double phi = GTE_C_TWO_PI * (double)ir / (double)numR;
        double csphi = std::cos(phi);
        double snphi = std::sin(phi);
        X[i] = V + h * U + r * (csphi * W0 + snphi * W1);
    }
}

Vector3<double> C{ 0.0, 0.0, 0.0 };  // the centroid
for (size_t i = 0; i < X.size(); ++i)
{
    C += X[i];
}
C /= (double)X.size();

Vector3<double> Ufit{ 0.0, 0.0, 0.0 };  // the third-order term of equation (116)
for (size_t i = 0; i < X.size(); ++i)
{
    Vector3<double> diff = X[i] - C;
    Ufit += Dot(diff, diff) * diff;
}
Ufit /= (double)X.size();
Normalize(Ufit);

double h0hat = std::numeric_limits<double>::max();
double h1hat = -h0hat;
for (size_t i = 0; i < X.size(); ++i)
{
    Vector3<double> diff = X[i] - C;
    double h = Dot(Ufit, diff);
    h0hat = std::min(h, h0hat);
    h1hat = std::max(h, h1hat);
}
double hsum = h1hat + h0hat;
double h0reconstruct = (h0hat * (h0hat + h1hat) - 2.0 * h1hat * h1hat) / (3.0 * hsum);
double h1reconstruct = (h1hat * (h0hat + h1hat) - 2.0 * h0hat * h0hat) / (3.0 * hsum);
// h0hat = -0.50000000000238787
// h1hat = 0.50000000000458455
// hsum = -8.7140517258189930e-14
// h0reconstruct = -75871822289.547775 (original was 1.0)
// h1reconstruct = -75871822288.547775 (original was 2.0)

Other equations can be derived to attempt reconstruction of h0 and h1. For example, we can compute
averages of powers of the height relative to the centroid,

(1/A) ∫_{h0}^{h1} ∫_0^{2π} (U · (P − C))² dA = (1/(π(h1² − h0²))) ∫_{h0}^{h1} ∫_0^{2π} (h − ρ3)² h dh dφ = ρ4 − ρ3²    (119)

Given a set of sample points, the equation to reconstruct the height extremes is

ρ4 − ρ3² = (1/n) Σ_{i=1}^n ( U · (X_i − C) )² = b    (120)

where the last equality defines b. We know the estimate h1 − h0 = ĥ1 − ĥ0 = a, where the last equality defines
a. We need to solve h1 − h0 = a and ρ4 − ρ3² = b. The solution is

h0 = ( (36ab − 3a³) ± √3 a² √(a² − 12b) ) / ( 6(a² − 12b) ),    h1 = ( −(36ab − 3a³) ± √3 a² √(a² − 12b) ) / ( 6(a² − 12b) )    (121)

In Listing 18, after reconstruction of h0hat and h1hat, add the code
double a = h1hat - h0hat;  // 1.0000000000069724
double b = 0.0;
for (size_t i = 0; i < X.size(); ++i)
{
    Vector3<double> diff = X[i] - C;
    double h = Dot(Ufit, diff);
    b += h * h;
}
b /= (double)X.size();  // 0.083659491193733351
double discr = a * a - 12.0 * b;  // -0.0039138943108554258

The discriminant a² − 12b is negative, so there are no real-valued solutions for h0 and h1.
Attempts to estimate h0 and h1 from the average integral of (U · (P − C))³ also failed when using the point
samples to estimate the integrals and then solving the resulting equations.

8.1.2 Attempt to Reconstruct the Cone Axis Direction

The eigendecomposition in equation (115) provides a symbolic expression involving the cone axis direction U.
In a numerical implementation, we need to identify which eigenvector output by the eigensolver corresponds
to U. The covariance matrix of the point samples is used to estimate the average of ZZ^T. We cannot expect two distinct
eigenvalues λ1 = ρ4 − ρ3² and λ2 = ρ4 tan²θ/2, the first with multiplicity 1 and the second with multiplicity
2; if we obtained such a result numerically, the selection of U would be trivial. Even theoretically we could have a
problem when there is a single eigenvalue of multiplicity 3.
If the point samples are nearly on a cone frustum, let the numerically computed eigenvalues be sorted as
λ1 ≤ λ2 ≤ λ3 . We could compute λ2 − λ1 and λ3 − λ2 and claim that the minimum difference is due to
numerical rounding errors. The conclusion is that those two eigenvalues are theoretically a single eigenvalue.
The other eigenvalue is the one associated with U . Some experiments showed that this is not a reliable way
to select U .
Instead we can use equation (116) to select U unambiguously. The integral average of ZZ^T Z is estimated by a
summation involving the point samples as shown in Listing 18. The normalized summation is the
estimate for U, named Ufit in the listing.

8.1.3 Attempt to Reconstruct the Cone Vertex

As mentioned previously, reconstructing h0 and h1 would allow us to compute ρ3 and then V = C − ρ3 U,
but the reconstruction attempts can fail numerically. A more reliable algorithm for estimating V is shown
here. The goal is to estimate ρ3 directly rather than estimating h0 and h1 and then computing ρ3 from
them.
In the following, U is the estimate of the cone axis direction obtained from equation (116). We want to
estimate a positive t for which V = C − tU. At the same time we need to estimate s = cos²θ for the cone
angle. The algorithm will use the discrete least-squares error function for a specified U . Define ∆i = X i −C,
ai = U · ∆i and bi = ∆i · ∆i . Define
 
F_i = (X_i − V)^T ( sI − U U^T )(X_i − V) = s ( t² + 2a_i t + b_i ) − (t + a_i)²    (122)

The least-squares error function is E = ( Σ_{i=1}^n F_i² )/n. The derivatives of the F_i terms are

∂F_i/∂s = t² + 2a_i t + b_i,    ∂F_i/∂t = 2(s − 1)(t + a_i)    (123)
Some algebra will show that ∂E/∂t = 0 and ∂E/∂s = 0 lead to

sp3 (t) − q3 (t) = 0, sp4 (t) − q4 (t) = 0 (124)

where p_d(t) and q_d(t) are polynomials of degree d. Specifically,

p3(t) = t³ + e0 t² + e1 t + e2
q3(t) = t³ + e0 t² + e3 t + e4
p4(t) = t⁴ + f0 t³ + f1 t² + f2 t + f3    (125)
q4(t) = t⁴ + f0 t³ + f4 t² + f5 t + f6

where

e0 = (1/n) Σ_{i=1}^n 3a_i             f0 = (1/n) Σ_{i=1}^n 4a_i
e1 = (1/n) Σ_{i=1}^n (2a_i² + b_i)    f1 = (1/n) Σ_{i=1}^n (4a_i² + 2b_i)
e2 = (1/n) Σ_{i=1}^n a_i b_i          f2 = (1/n) Σ_{i=1}^n 4a_i b_i
e3 = (1/n) Σ_{i=1}^n 3a_i²            f3 = (1/n) Σ_{i=1}^n b_i²            (126)
e4 = (1/n) Σ_{i=1}^n a_i³             f4 = (1/n) Σ_{i=1}^n (5a_i² + b_i)
                                      f5 = (1/n) Σ_{i=1}^n (2a_i³ + 2a_i b_i)
                                      f6 = (1/n) Σ_{i=1}^n a_i² b_i
Define c_rs = (1/n) Σ_{i=1}^n a_i^r b_i^s. We can solve one st-equation for s = q4(t)/p4(t), substitute into the other
st-equation and then multiply by p4(t) to obtain

0 = p3(t) q4(t) − p4(t) q3(t) = g4 t⁴ + g3 t³ + g2 t² + g1 t + g0    (127)

where

g4 = c30 − c11 + c10 (c01 − c20)
g3 = c21 − c02 + c01 (c01 + c20) + 2( c10 (c30 − c11) − c20² )
g2 = 3( c11 (c01 − c20) + c10 (c21 − c02) )    (128)
g1 = c01 c21 − 3c02 c20 + 2( c20 c21 − c11 (c30 − c11) )
g0 = c11 c21 − c02 c30

The least-squares error function is smooth and nonnegative, so there must be at least one (s, t) for which
∇E(s, t) = (0, 0). In theory, this means g(t) = 0 must have at least one real-valued root. Naturally, the

quartic root solver must guard against floating-point round-off errors to guarantee this. The GTE quartic
root solver can be executed with rational arithmetic to correctly classify the roots, after which rounding
errors will not change the classification. For each root t ≥ 0, compute s = q3 (t)/p3 (t). If s ∈ (0, 1), we
have a valid angle because s = cos2 θ. Evaluate E(s, t). Of all such valid pairs (s, t), choose the one whose
E-value is minimum. This pair is used to initialize V and cos θ.
Using floating-point arithmetic, it is unlikely but possible that no pair (s, t) is generated by the algorithm of the previous paragraph. An implementation must guard against this. Although these are most likely poor estimates, choose the initial V to be the average C of the sample points and choose the angle arbitrarily to be π/4, the midpoint of the domain (0, π/2) of valid angles.

Listing 18 can be modified by replacing all the code after the line Normalize(Ufit) with that shown in Listing 19. The new listing shows an implementation of the initial guesses for V and cos²θ.

Listing 19. Code for the initial guesses for the cone vertex and cone angle. The numbers in the comments
are those produced by the example in Listing 18.
double c10 = 0.0, c20 = 0.0, c30 = 0.0, c01 = 0.0, c02 = 0.0, c11 = 0.0, c21 = 0.0;
for (size_t i = 0; i < X.size(); ++i)
{
    Vector3<double> diff = X[i] - C;
    double ai = Dot(Ufit, diff);
    double bi = Dot(diff, diff);
    c10 += ai;
    c20 += ai * ai;
    c30 += ai * ai * ai;
    c01 += bi;
    c02 += bi * bi;
    c11 += ai * bi;
    c21 += ai * ai * bi;
}
c10 /= (double)X.size();
c20 /= (double)X.size();
c30 /= (double)X.size();
c01 /= (double)X.size();
c02 /= (double)X.size();
c11 /= (double)X.size();
c21 /= (double)X.size();

// The coefficients for polynomials p3(t) and q3(t).
double e0 = 3.0 * c10;
double e1 = 2.0 * c20 + c01;
double e2 = c11;
double e3 = 3.0 * c20;
double e4 = c30;

// The coefficients for polynomials p4(t) and q4(t).
double f0 = 4.0 * c10;
double f1 = 4.0 * c20 + 2.0 * c01;
double f2 = 4.0 * c11;
double f3 = c02;
double f4 = 5.0 * c20 + c01;
double f5 = 2.0 * c30 + 2.0 * c11;
double f6 = c21;

// The coefficients for the quartic polynomial g(t).
double g0 = c11 * c21 - c02 * c30;  // 0.053566286603418618
double g1 = c01 * c21 - 3.0 * c02 * c20 + 2.0 * (c20 * c21 - c11 * (c30 - c11));  // -0.98354780514088114
double g2 = 3.0 * (c11 * (c01 - c20) + c10 * (c21 - c02));  // 1.7570948908739785
double g3 = c21 - c02 + c01 * (c01 + c20) + 2.0 * (c10 * (c30 - c11) - c20 * c20);  // -0.37366801803251715
double g4 = c30 - c11 + c10 * (c01 - c20);  // -0.25097847358125042

// Compute the roots of g(t) = 0.
std::map<double, int> rmMap;
RootsPolynomial<double>::SolveQuartic(g0, g1, g2, g3, g4, rmMap);
// root[0] = -3.6829478517723415
// root[1] = 0.061025506205900915
// root[2] = 0.63307745500821921
// root[3] = 1.4999999999961315

// Each valid candidate is stored as { s, t, error }.
std::vector<std::array<double, 3>> info;

double s, t;
for (auto const& element : rmMap)
{
    t = element.first;
    if (t > 0.0)
    {
        // s = q3(t) / p3(t), evaluated by Horner's method.
        s = (e4 + t * (e3 + t * (e0 + t))) / (e2 + t * (e1 + t * (e0 + t)));
        if (0.0 < s && s < 1.0)
        {
            // Evaluate the least-squares error E(s, t).
            double error = 0.0;
            for (size_t i = 0; i < X.size(); ++i)
            {
                Vector3<double> diff = X[i] - C;
                double ai = Dot(Ufit, diff);
                double bi = Dot(diff, diff);
                double tpai = t + ai;
                double Fi = s * (bi + t * (2.0 * ai + t)) - tpai * tpai;
                error += Fi * Fi;
            }
            error /= (double)X.size();
            std::array<double, 3> item = { s, t, error };
            info.push_back(item);
        }
    }
}

if (info.size() > 0)
{
    // Choose the candidate with the minimum error.
    std::array<double, 3> minItem = info.front();
    for (auto const& item : info)
    {
        if (item[2] < minItem[2])
        {
            minItem = item;
        }
    }

    // minItem = { minS, minT, minError }
    Vfit = C - minItem[1] * Ufit;
    cosAngleSqrFit = minItem[0];

    // We do not need this for the minimizer, but let's reconstruct
    // the heights to see how close we get to the original ones
    // (h0 = 1, h1 = 2). Knowing the estimate h1 - h0 = 1, solve
    // minT = p3 = ((h1^3 - h0^3)/3) / ((h1^2 - h0^2)/2)
    // for h0 and h1.
    double h0 = (6.0 + std::sqrt(276.0)) / 24.0;  // 0.94221865524317294
    double h1 = 1.0 + h0;  // 1.9422186552431731

    // The original cone has p3 = 14/9 = 1.555555... and the minimizing
    // t-value is 1.4999999999961315.
}
else
{
    // No valid (s, t) pair was produced. Fall back to the sample average
    // for the vertex and to the midpoint angle pi/4, for which
    // s = cos^2(pi/4) = 1/2.
    Vfit = C;
    cosAngleSqrFit = 0.5;
}
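If the cone angle itself is needed rather than s, it is recovered from s = cos²θ. A one-line conversion, using the variable names of Listing 19:

double angleFit = std::acos(std::sqrt(cosAngleSqrFit));  // theta = arccos(sqrt(s)) for s in (0, 1)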

9 Fitting a Paraboloid to 3D Points of the Form (x, y, f (x, y))

Given a set of samples $\{(x_i, y_i, z_i)\}_{i=1}^m$ and assuming that the true values lie on a paraboloid

$$z = f(x, y) = p_1 x^2 + p_2 x y + p_3 y^2 + p_4 x + p_5 y + p_6 = P \cdot Q(x, y) \tag{129}$$

where $P = (p_1, p_2, p_3, p_4, p_5, p_6)$ and $Q(x, y) = (x^2, xy, y^2, x, y, 1)$, select P to minimize the sum of squared errors

$$E(P) = \sum_{i=1}^m (P \cdot Q_i - z_i)^2 \tag{130}$$

where $Q_i = Q(x_i, y_i)$. The minimum occurs when the gradient of E is the zero vector,

$$\nabla E = 2 \sum_{i=1}^m (P \cdot Q_i - z_i) Q_i = 0 \tag{131}$$

Some algebra converts this to a system of 6 equations in 6 unknowns:

$$\left( \sum_{i=1}^m Q_i Q_i^{\mathsf{T}} \right) P = \sum_{i=1}^m z_i Q_i \tag{132}$$

The product $Q_i Q_i^{\mathsf{T}}$ is the product of the 6 × 1 matrix $Q_i$ with the 1 × 6 matrix $Q_i^{\mathsf{T}}$, the result being a 6 × 6 matrix.

Define the 6 × 6 symmetric matrix $A = \sum_{i=1}^m Q_i Q_i^{\mathsf{T}}$ and the 6 × 1 vector $B = \sum_{i=1}^m z_i Q_i$. The choice for P is the solution to the linear system of equations AP = B. The entries of A and B indicate summations over the appropriate products of variables. For example, $s(x^3 y) = \sum_{i=1}^m x_i^3 y_i$:
$$\begin{bmatrix}
s(x^4) & s(x^3 y) & s(x^2 y^2) & s(x^3) & s(x^2 y) & s(x^2) \\
s(x^3 y) & s(x^2 y^2) & s(x y^3) & s(x^2 y) & s(x y^2) & s(xy) \\
s(x^2 y^2) & s(x y^3) & s(y^4) & s(x y^2) & s(y^3) & s(y^2) \\
s(x^3) & s(x^2 y) & s(x y^2) & s(x^2) & s(xy) & s(x) \\
s(x^2 y) & s(x y^2) & s(y^3) & s(xy) & s(y^2) & s(y) \\
s(x^2) & s(xy) & s(y^2) & s(x) & s(y) & s(1)
\end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \\ p_5 \\ p_6 \end{bmatrix}
=
\begin{bmatrix} s(z x^2) \\ s(z x y) \\ s(z y^2) \\ s(z x) \\ s(z y) \\ s(z) \end{bmatrix} \tag{133}$$

An implementation is ApprParaboloid3.h.
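ApprParaboloid3.h is the actual GTE implementation. For illustration only, here is a minimal self-contained sketch of the same normal-equations approach; the function name FitParaboloid and the use of basic Gaussian elimination (rather than GTE's linear-system solvers) are choices made for this sketch, and degenerate sample sets that make A singular are not handled.

#include <array>
#include <cmath>
#include <utility>
#include <vector>

// Fit z = p1*x^2 + p2*x*y + p3*y^2 + p4*x + p5*y + p6 to samples (x, y, z)
// by building A = sum_i Q_i Q_i^T and B = sum_i z_i Q_i for
// Q = (x^2, x*y, y^2, x, y, 1), then solving A P = B.
std::array<double, 6> FitParaboloid(std::vector<std::array<double, 3>> const& samples)
{
    double A[6][6] = {}, B[6] = {};
    for (auto const& pt : samples)
    {
        double x = pt[0], y = pt[1], z = pt[2];
        double Q[6] = { x * x, x * y, y * y, x, y, 1.0 };
        for (int r = 0; r < 6; ++r)
        {
            B[r] += z * Q[r];
            for (int c = 0; c < 6; ++c)
            {
                A[r][c] += Q[r] * Q[c];
            }
        }
    }

    // Solve A P = B by Gaussian elimination with partial pivoting.
    for (int k = 0; k < 6; ++k)
    {
        int pivot = k;
        for (int r = k + 1; r < 6; ++r)
        {
            if (std::fabs(A[r][k]) > std::fabs(A[pivot][k]))
            {
                pivot = r;
            }
        }
        for (int c = 0; c < 6; ++c)
        {
            std::swap(A[k][c], A[pivot][c]);
        }
        std::swap(B[k], B[pivot]);
        for (int r = k + 1; r < 6; ++r)
        {
            double m = A[r][k] / A[k][k];
            for (int c = k; c < 6; ++c)
            {
                A[r][c] -= m * A[k][c];
            }
            B[r] -= m * B[k];
        }
    }

    // Back-substitute for P = (p1, ..., p6).
    std::array<double, 6> P = {};
    for (int r = 5; r >= 0; --r)
    {
        double sum = B[r];
        for (int c = r + 1; c < 6; ++c)
        {
            sum -= A[r][c] * P[c];
        }
        P[r] = sum / A[r][r];
    }
    return P;
}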
