
Section 5: Curve Fitting - Least-Squares Regression

Least-Squares Polynomial Regression for Discrete Data


Consider a set of data $(x_i, y_i)$ for $i = 1, 2, \dots, m$. The behaviour of this data can be approximated using least-squares regression by employing an nth-order polynomial

$$ p_n(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n $$

where $n < m - 1$. The objective is to find the coefficients (i.e. the $a_j$'s) by minimising the least-squares error $E$.

For minimisation, set $\partial E / \partial a_j = 0$ for each $j = 0, 1, \dots, n$, where

$$ E = \sum_{i=1}^{m} \left( y_i - p_n(x_i) \right)^2 $$

Thus,

$$ \frac{\partial E}{\partial a_j} = 0 = -2 \sum_{i=1}^{m} y_i x_i^{\,j} + 2 \sum_{k=0}^{n} a_k \sum_{i=1}^{m} x_i^{\,j+k} $$

Finally, we get n+1 normal equations for n+1 unknowns (which are the $a_k$'s) as follows:

$$ \sum_{k=0}^{n} a_k \sum_{i=1}^{m} x_i^{\,j+k} = \sum_{i=1}^{m} y_i x_i^{\,j}, \qquad j = 0, 1, \dots, n $$

Expanding the above general equation:

$$
\begin{aligned}
a_0 m + a_1 \sum x_i + \dots + a_n \sum x_i^{\,n} &= \sum y_i \\
a_0 \sum x_i + a_1 \sum x_i^{\,2} + \dots + a_n \sum x_i^{\,n+1} &= \sum x_i y_i \\
&\;\;\vdots \\
a_0 \sum x_i^{\,n} + a_1 \sum x_i^{\,n+1} + \dots + a_n \sum x_i^{\,2n} &= \sum x_i^{\,n} y_i
\end{aligned}
$$

These normal equations can be written in matrix form $A\mathbf{a} = \mathbf{b}$, where $\mathbf{a} = [a_0 \; a_1 \; \dots \; a_n]^T$, $\mathbf{b} = [\sum y_i \;\; \sum x_i y_i \;\; \dots \;\; \sum x_i^{\,n} y_i]^T$ and $A$ is a symmetric and non-singular matrix. There is a unique solution on the condition that the $x_i$ are distinct.

Note that the normal equations tend to be ill-conditioned, especially for higher-order polynomial regression. The computed coefficients then become sensitive to round-off error, causing some inaccuracy. (Solve the example question in the textbook.)
Example 1: Consider the following data where $y$ is the measured variable which depends on $x$. Fit a 2nd-order polynomial $p_2(x) = a_0 + a_1 x + a_2 x^2$ to the data using least-squares regression.

x=0:5; y=[2.1 7.7 13.6 27.2 40.9 61.1];

Solution 1: You can use Matlab's function polyfit. The polyfit function uses the backslash operator \ to solve the least-squares problem (see the section General Linear Least-Squares Approximation for Discrete Data described later in this document).
>> p=polyfit(x,y,2)
p =
1.860714285714288

2.359285714285703

2.478571428571445

% p gives the coefficients of the polynomial ordered in descending powers.

Therefore, $a_0$ = p(3), $a_1$ = p(2), $a_2$ = p(1)

% Read the help files of the functions polyfit and polyval.


% To plot the data together with the fitted polynomial, define
>> xx=0:0.01:5;
>> plot(x,y,'o',xx,polyval(p,xx))

% Alternatively, you can solve this problem using the Basic Fitting tool of Matlab. You
just need to plot the data first by typing plot(x,y,'o'). A figure window then opens and
in this window, select the Tools menu to access the Basic Fitting tool.
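% As a cross-check of Example 1 (an added sketch, not part of the original notes),
% the same coefficients can be obtained by building and solving the normal equations directly:
x=0:5; y=[2.1 7.7 13.6 27.2 40.9 61.1];   % same data as Example 1
n = 2;                                     % order of the fitted polynomial
A = zeros(n+1); b = zeros(n+1,1);
for j = 0:n
    for k = 0:n
        A(j+1,k+1) = sum(x.^(j+k));        % sum of x_i^(j+k)
    end
    b(j+1) = sum(y.*x.^j);                 % sum of y_i*x_i^j
end
a = A\b                                    % a = [a0; a1; a2] (ascending powers)
% This should match polyfit's output reversed: fliplr(polyfit(x,y,2)).'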

Approximation of Continuous Functions Using Least-Squares Polynomial Regression


It is required to minimise

$$ E = \int_a^b \left( f(x) - p_n(x) \right)^2 dx $$

where $f(x)$ is a function defined in the interval $[a, b]$ and $p_n(x) = a_0 + a_1 x + \dots + a_n x^n$. Then, set $\partial E / \partial a_j = 0$ for each $j = 0, 1, \dots, n$:

$$ \frac{\partial E}{\partial a_j} = -2 \int_a^b f(x)\, x^{\,j}\, dx + 2 \sum_{k=0}^{n} a_k \int_a^b x^{\,j+k}\, dx = 0 $$

The n+1 normal equations to solve for the n+1 coefficients become

$$ \sum_{k=0}^{n} a_k \int_a^b x^{\,j+k}\, dx = \int_a^b f(x)\, x^{\,j}\, dx, \qquad j = 0, 1, \dots, n $$

When written in matrix form $A\mathbf{a} = \mathbf{b}$ (where $\mathbf{a} = [a_0 \; a_1 \; \dots \; a_n]^T$), the elements of $A$ are of the form

$$ A_{jk} = \int_a^b x^{\,j+k}\, dx $$
Example 2: Determine the quadratic polynomial (i.e. $n = 2$) that approximates $f(x) = \sin(\pi x)$ on $[0, 1]$ using least-squares regression.

Solution 2: We need to find $p_2(x) = a_0 + a_1 x + a_2 x^2$. With $[a, b] = [0, 1]$, the normal equations are

$$ \begin{bmatrix} 1 & 1/2 & 1/3 \\ 1/2 & 1/3 & 1/4 \\ 1/3 & 1/4 & 1/5 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \int_0^1 \sin(\pi x)\, dx \\ \int_0^1 x \sin(\pi x)\, dx \\ \int_0^1 x^2 \sin(\pi x)\, dx \end{bmatrix} = \begin{bmatrix} 2/\pi \\ 1/\pi \\ (\pi^2 - 4)/\pi^3 \end{bmatrix} $$

Solving this system gives

$$ a_0 = \frac{12\pi^2 - 120}{\pi^3} \approx -0.050465, \qquad a_1 = -a_2 = \frac{720 - 60\pi^2}{\pi^3} \approx 4.12251 $$
The matrix A is known as the Hilbert matrix and it is ill-conditioned, i.e. its condition number is considerably greater than 1.
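% Example 2 can be checked numerically in Matlab. The sketch below is an added
% illustration (not part of the original notes): it builds the same 3-by-3 system,
% using the built-in integral function for the right-hand side, and solves it with backslash.
f = @(x) sin(pi*x);
n = 2;  A = zeros(n+1);  b = zeros(n+1,1);
for j = 0:n
    for k = 0:n
        A(j+1,k+1) = 1/(j+k+1);            % integral of x^(j+k) over [0,1]
    end
    b(j+1) = integral(@(x) f(x).*x.^j, 0, 1);
end
a = A\b                                     % approximately [-0.050465; 4.12251; -4.12251]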

Orthogonal Functions and Least-Squares Regression


Definition: An integrable function $w$ is called a weight function on an interval $I$ if $w(x) \ge 0$ for all $x$ in $I$, on the condition that $w(x) \not\equiv 0$ on any subinterval of $I$. A weight function places different levels of importance on the approximation over certain sections of the interval.

Let $\{\phi_0, \phi_1, \dots, \phi_n\}$ be a set of linearly independent functions on the interval $I = [a, b]$ and $w$ be a weight function for $I$. Then a function $p(x)$ can be written as a linear combination of the $\phi_j$'s as follows:

$$ p(x) = \sum_{j=0}^{n} a_j \phi_j(x) $$

Let $p(x)$ approximate a function $f(x)$ with respect to $w(x)$ on $I = [a, b]$ in a least-squares sense; then the error to be minimised is written as

$$ E = \int_a^b w(x) \left[ f(x) - p(x) \right]^2 dx = \int_a^b w(x) \left[ f(x) - \sum_{j=0}^{n} a_j \phi_j(x) \right]^2 dx $$

(Remember that in least-squares polynomial regression, $w(x) = 1$ and $\phi_j(x) = x^{\,j}$ for $j = 0, 1, \dots, n$.)

For minimisation, set

$$ \frac{\partial E}{\partial a_j} = 0 = -2 \int_a^b w(x) \left[ f(x) - \sum_{k=0}^{n} a_k \phi_k(x) \right] \phi_j(x)\, dx, \qquad j = 0, 1, \dots, n $$

Then the system of normal equations is written as

$$ \sum_{k=0}^{n} a_k \int_a^b w(x)\, \phi_k(x)\, \phi_j(x)\, dx = \int_a^b w(x)\, f(x)\, \phi_j(x)\, dx, \qquad j = 0, 1, \dots, n $$

If the functions $\phi_0, \dots, \phi_n$ are chosen to satisfy the orthogonality condition

$$ \int_a^b w(x)\, \phi_k(x)\, \phi_j(x)\, dx = \begin{cases} 0, & j \neq k \\ \alpha_j > 0, & j = k \end{cases} $$

then the normal equations become simplified as follows:

$$ a_j \int_a^b w(x) \left[ \phi_j(x) \right]^2 dx = \int_a^b w(x)\, f(x)\, \phi_j(x)\, dx, \qquad j = 0, 1, \dots, n $$

Finally, the $a_j$'s are easily found as

$$ a_j = \frac{\int_a^b w(x)\, f(x)\, \phi_j(x)\, dx}{\int_a^b w(x) \left[ \phi_j(x) \right]^2 dx}, \qquad j = 0, 1, \dots, n $$

See that the procedure of least-squares regression is considerably simplified when the functions are chosen to be orthogonal. We can summarise the results as follows:

If $\{\phi_0, \phi_1, \dots, \phi_n\}$ is an orthogonal set of functions on an interval $[a, b]$ with respect to the weight function $w$, then the least-squares approximation to $f(x)$ on $[a, b]$ with respect to $w$ is

$$ p(x) = \sum_{j=0}^{n} a_j \phi_j(x), \qquad a_j = \frac{\int_a^b w(x)\, f(x)\, \phi_j(x)\, dx}{\int_a^b w(x) \left[ \phi_j(x) \right]^2 dx} = \frac{1}{\alpha_j} \int_a^b w(x)\, f(x)\, \phi_j(x)\, dx, \qquad j = 0, 1, \dots, n $$

Note that if $\alpha_j = 1$ for $j = 0, 1, \dots, n$, then the set is said to be orthonormal.

Example 3: Chebyshev polynomials are orthogonal on $(-1, 1)$ with respect to the weight function

$$ w(x) = \frac{1}{\sqrt{1 - x^2}} $$

As shown in the figure below, $w(x)$ places more emphasis at the ends of the interval and less emphasis around the centre of the interval $(-1, 1)$.

Some other examples of orthogonal polynomials are the Hermite, Jacobi, Legendre and Laguerre polynomials. It is possible to construct orthogonal polynomials on an interval $[a, b]$ with respect to a weight function $w$ using a procedure called Gram-Schmidt orthogonalization.
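% As an added illustration of the coefficient formula above (not part of the original notes),
% the sketch below computes the first three Chebyshev least-squares coefficients for an
% assumed test function f(x) = exp(x) on (-1,1), using T0 = 1, T1 = x, T2 = 2x^2 - 1.
% The substitution x = cos(t) turns each weighted integral into an ordinary integral over
% [0, pi], which avoids the endpoint singularities of the weight 1/sqrt(1-x^2).
f   = @(x) exp(x);                                        % assumed test function
phi = { @(x) ones(size(x)), @(x) x, @(x) 2*x.^2 - 1 };    % T0, T1, T2
a   = zeros(3,1);
for j = 1:3
    num  = integral(@(t) f(cos(t)).*phi{j}(cos(t)), 0, pi);  % integral of w*f*phi_j
    alfa = integral(@(t) phi{j}(cos(t)).^2,         0, pi);  % alpha_j (= pi or pi/2 here)
    a(j) = num/alfa;                                          % a_j from the formula above
end
p = @(x) a(1)*phi{1}(x) + a(2)*phi{2}(x) + a(3)*phi{3}(x);    % least-squares approximation of f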

General Linear Least-Squares Approximation for Discrete Data


The general linear least-squares problem can be expressed as

$$ y_i = a_0 z_0(x_i) + a_1 z_1(x_i) + a_2 z_2(x_i) + \dots + a_m z_m(x_i) + e_i, \qquad i = 1, 2, \dots, n $$

where $n$ is the number of data points; $(a_0, a_1, \dots, a_m)$ are the coefficients; $(z_0, z_1, \dots, z_m)$ are the $(m+1)$ basis functions; $y_i$ is the measured value of the dependent variable for the $i$th data point and $e_i$ is the discrepancy between $y_i$ and the corresponding approximate value given by the linear least-squares model. (See Section 17.4 in the textbook.)
The term linear corresponds to the model's dependence on its coefficients. Note that the basis functions can still be nonlinear. For instance, the model $f(x) = a_0 \left( 1 - e^{-a_1 x} \right)$ is a nonlinear model as it cannot be written as a linear combination of some basis functions.
Some examples of linear least-squares models are:

Linear regression: $y = a_0 + a_1 x$, thus $z_0 = 1$, $z_1 = x$
Polynomial regression: $y = a_0 + a_1 x + \dots + a_m x^m$, thus $z_0 = 1$, $z_1 = x$, ..., $z_m = x^m$
Multiple linear regression: $y = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_m x_m$, thus $z_0 = 1$, $z_1 = x_1$, ..., $z_m = x_m$

The general linear least-squares problem can be expressed in matrix form as $\mathbf{y} = Z\mathbf{a} + \mathbf{e}$, where $\mathbf{y} = [y_1 \; y_2 \; \dots \; y_n]^T$, $\mathbf{a} = [a_0 \; a_1 \; \dots \; a_m]^T$ and $\mathbf{e} = [e_1 \; e_2 \; \dots \; e_n]^T$. $Z$ is an $(n \times (m+1))$ matrix where $Z_{i,j} = z_j(x_i)$, the value of the $j$th basis function calculated at the given values of the independent variable(s) for the $i$th data point. Note that $n > m+1$, which means that the system of linear equations is typically overdetermined, i.e. there are more equations than unknowns.
The matrix Z is also called the design matrix and it is given as

$$ Z = \begin{bmatrix} z_0(x_1) & z_1(x_1) & \cdots & z_m(x_1) \\ z_0(x_2) & z_1(x_2) & \cdots & z_m(x_2) \\ \vdots & \vdots & & \vdots \\ z_0(x_n) & z_1(x_n) & \cdots & z_m(x_n) \end{bmatrix} $$

For least-squares approximation, minimise the sum of squared errors $\sum_{i=1}^{n} e_i^2$ by setting the partial derivative with respect to each $a_j$ to zero, $j = 0, 1, \dots, m$. In other words, we are minimising the length of the error vector $\mathbf{e}$, where $\mathbf{e} = \mathbf{y} - Z\mathbf{a}$. This problem is equivalent to minimising the objective function $E$, which we can easily express using the dot product operation between two vectors as follows:

$$ E = \frac{1}{2} \left| \mathbf{e} \right|^2 = \frac{1}{2} \mathbf{e}^T \mathbf{e} = \frac{1}{2} \left[ \mathbf{y} - Z\mathbf{a} \right]^T \left[ \mathbf{y} - Z\mathbf{a} \right] $$

For least-squares approximation, we set $\partial E / \partial a_j = 0$ for $j = 0, 1, \dots, m$. Therefore, insert the expression for $\mathbf{e}$ into the above equation to expand, and then take the partial derivatives of $E$ to obtain $(m+1)$ equations with $(m+1)$ unknowns. These $(m+1)$ equations are called the normal equations, which can be collected and expressed in matrix form as given below:

$$ Z^T \left( \mathbf{y} - Z\mathbf{a} \right) = \mathbf{0} $$

Note that the vector $\mathbf{0}$ is a zero column-vector of size $(m+1) \times 1$. If the basis functions are independent, then $Z^T Z$ is nonsingular and the coefficients can be found by using the matrix inverse as follows: $\mathbf{a} = (Z^T Z)^{-1} Z^T \mathbf{y}$. But remember that using the matrix inverse is less efficient and less accurate than solving the system by Gauss elimination.
If the design matrix $Z$ is rank-deficient or has more columns than rows, the matrix $Z^T Z$ becomes singular and $(Z^T Z)^{-1}$ does not exist. In such a case, we do not have a unique solution for the normal equations.

Note that the normal equations are always worse conditioned than the original overdetermined system. In order to circumvent this problem, orthogonalisation algorithms such as QR factorization (which is also used by Matlab) can be employed.
In Matlab, the least-squares solution to the problem $\mathbf{y} = Z\mathbf{a} + \mathbf{e}$ (i.e. $\mathbf{a}$) is given by a = Z\y, where Z is the design matrix. In Matlab, the backslash operator \ is the same as the function mldivide. See the help file of mldivide for the algorithms used by Matlab.
Matlab avoids the normal equations: the backslash operator utilises QR factorization when the coefficient matrix is not square. Recall that the normal equations are worse conditioned than the original overdetermined system in a typical least-squares problem; QR factorization circumvents this problem.
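% As an illustration of this idea (an added sketch, not a description of Matlab's exact
% internals), an economy-size QR factorization can be used explicitly and compared with
% the backslash solution. Assumes a design matrix Z (n x (m+1)) and data vector y (n x 1)
% are already defined.
[Q, R] = qr(Z, 0);    % Z = Q*R with R square and upper triangular
a_qr   = R\(Q'*y);    % solve R*a = Q'*y by back substitution
a_bs   = Z\y;         % backslash solution for comparison
norm(a_qr - a_bs)     % should be at round-off level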
Question: If $n = m+1$, then what happens? In that case, the number of basis functions will be equal to the number of data points and $Z$ becomes a square matrix. Then,

$$ \mathbf{a} = (Z^T Z)^{-1} Z^T \mathbf{y} = Z^{-1} (Z^T)^{-1} Z^T \mathbf{y} = Z^{-1} \mathbf{y} $$

which means $Z\mathbf{a} = \mathbf{y}$. What happens to $\mathbf{e}$ then?
Example 4: The variable $y$ is a function of two independent variables $x_1$ and $x_2$. The measured values of $y$ for several values of $x_1$ and $x_2$ are given in the table below. Fit the model $y = a_0 + a_1 x_1 + a_2 x_2$ to this data. Then the problem is $y_i = a_0 + a_1 x_{1,i} + a_2 x_{2,i} + e_i$, where $e_i$ denotes the error.

x1     x2     y
0      0      5
2      1      10
2.5    2      9
1      3      0
4      6      3
7      2      27

Solution 4: See that the basis functions are $z_0 = 1$, $z_1 = x_1$, $z_2 = x_2$. Define x1, x2 and y as column vectors in Matlab. Then, define the design matrix in Matlab as follows:

Z=[ ones(size(x1))  x1  x2 ]

Solution 4a: Type a=(Z'*Z)\(Z'*y) to find the coefficients $a_0 = 5$, $a_1 = 4$, $a_2 = -3$.

cond(Z'*Z) = 65.466

Solution 4b: Type a=Z\y to find the coefficients $a_0 = 5$, $a_1 = 4$, $a_2 = -3$.

*Prefer Solution 4b.
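% A side note on conditioning (an added observation, not from the original notes): for a
% full-rank design matrix, the 2-norm condition number of the normal-equation matrix is the
% square of that of the design matrix, cond(Z'*Z) = cond(Z)^2, which is one reason
% Solution 4b is preferred. For Example 4 this can be checked directly:
cond(Z)^2      % equals cond(Z'*Z), approximately 65.466 for this data
cond(Z'*Z)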

Example 5: Consider a data set where $t$ and $y$ are the independent and dependent variables respectively. It is required to fit a linear model of the form $y = a_0 + a_1 e^{-t} + a_2\, t e^{-t}$ to this data using least-squares regression.
The data is given as t=[0 0.3 0.8 1.1 1.6 2.3]'; y=[0.6 0.67 1.01 1.35 1.47 1.25]';
Solution 5: In Matlab, define the design matrix as Z=[ ones(size(t)) exp(-t) t.*exp(-t) ]
The coefficients of the model are then calculated as:
>> a=Z\y
a =
1.398315282466043
-0.885977444768642
0.308457854291529
>> tt=(0:0.01:2.5)'; ym=[ ones(size(tt)) exp(-tt) tt.*exp(-tt) ]*a;
%Define vector tt to plot the fitted curve (i.e. the model).
%ym is the y-values produced by the model.
%Alternatively, you could define ym as ym=a(1)+a(2)*exp(-tt)+a(3)*tt.*exp(-tt);
>> plot(t,y,'o',tt,ym), xlabel('t'), ylabel('y')
