
Chapter 14

General Linear Least Squares and Nonlinear Regression

x = [-2.5 3.0 1.7 -4.9 0.6 -0.5 4.0 -2.2 -4.3 -0.2];
y = [-20.1 -21.8 -6.0 -65.4 0.2 0.6 -41.3 -15.4 -56.1 0.5];

Linear least squares fit: y = -20.5717 + 3.6005x
Error Sr = 4201.3
Correlation r = 0.4434

Large error, poor correlation
Preferable to fit a parabola
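
The straight-line numbers quoted above can be reproduced with a few lines of MATLAB. This is a minimal sketch, not the course's Linear_LS routine (whose source is not shown); the variable names St and Sr and the formula r = sqrt((St - Sr)/St) are assumptions consistent with the values on the slide.

```matlab
% Straight-line least squares fit of the example data (sketch).
x = [-2.5 3.0 1.7 -4.9 0.6 -0.5 4.0 -2.2 -4.3 -0.2];
y = [-20.1 -21.8 -6.0 -65.4 0.2 0.6 -41.3 -15.4 -56.1 0.5];
n = length(x);

% Closed-form slope and intercept of the least squares line
a1 = (n*sum(x.*y) - sum(x)*sum(y)) / (n*sum(x.^2) - sum(x)^2);
a0 = mean(y) - a1*mean(x);

% Goodness of fit
Sr = sum((y - a0 - a1*x).^2);   % sum of squared residuals
St = sum((y - mean(y)).^2);     % total sum of squares about the mean
r  = sqrt((St - Sr)/St);        % correlation coefficient

fprintf('y = %.4f + %.4f x, Sr = %.1f, r = %.4f\n', a0, a1, Sr, r);
```

The large Sr relative to St is what the slide summarizes as "large error, poor correlation".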

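Polynomial Regression
Quadratic Least Squares

\[ y = f(x) = a_0 + a_1 x + a_2 x^2 \]

Minimize total square error

\[ S_r(a_0, a_1, a_2) = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2 \]

Set the partial derivatives to zero:

\[ \frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0 \]

\[ \frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} x_i \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0 \]

\[ \frac{\partial S_r}{\partial a_2} = -2 \sum_{i=1}^{n} x_i^2 \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0 \]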

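Quadratic Least Squares

\[
\begin{bmatrix}
n & \sum x_i & \sum x_i^2 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4
\end{bmatrix}
\begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix}
=
\begin{Bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{Bmatrix}
\]

(all sums run over i = 1, ..., n)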

Use Cholesky decomposition to solve the symmetric system,
or use the MATLAB backslash operator: z = A\r
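
As a minimal sketch of the two solution routes named above, the script below assembles the quadratic normal equations for the example data and solves them once with backslash and once with a Cholesky factor. It is an illustration only, not the Quadratic_LS routine used later in the slides; the names A, r, R and zc are assumptions.

```matlab
% Quadratic least squares via the normal equations (sketch).
x = [-2.5 3.0 1.7 -4.9 0.6 -0.5 4.0 -2.2 -4.3 -0.2];
y = [-20.1 -21.8 -6.0 -65.4 0.2 0.6 -41.3 -15.4 -56.1 0.5];
n = length(x);

% Symmetric coefficient matrix and right-hand side
A = [n         sum(x)     sum(x.^2);
     sum(x)    sum(x.^2)  sum(x.^3);
     sum(x.^2) sum(x.^3)  sum(x.^4)];
r = [sum(y); sum(x.*y); sum(x.^2.*y)];

% Route 1: MATLAB backslash
z = A\r;

% Route 2: Cholesky factor R (upper triangular, A = R'*R),
% then forward substitution with R' and back substitution with R
R  = chol(A);
zc = R\(R'\r);

disp([z zc]);   % both columns hold a0, a1, a2
```

Both routes should agree to rounding error; chol succeeds only because the normal-equation matrix is symmetric positive definite.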

Standard error for 2nd-order polynomial regression

\[ s_{y/x} = \sqrt{\frac{S_r}{n-3}} \]

where
n observations
2nd-order polynomial (3 coefficients)
(start off with n degrees of freedom, use up
m+1 for an mth-order polynomial)

function [x,y] = example2
x = [-2.5 3.0 1.7 -4.9 0.6 -0.5 4.0 -2.2 -4.3 -0.2];
y = [-20.1 -21.8 -6.0 -65.4 0.2 0.6 -41.3 -15.4 -56.1 0.5];

[x,y]=example2
z=Quadratic_LS(x,y)
      x          y     (a0+a1*x+a2*x^2)  (y-a0-a1*x-a2*x^2)
  -2.5000   -20.1000      -18.5529           -1.5471
   3.0000   -21.8000      -22.0814            0.2814
   1.7000    -6.0000       -6.3791            0.3791
  -4.9000   -65.4000      -68.6439            3.2439
   0.6000     0.2000       -0.2816            0.4816
  -0.5000     0.6000       -0.7740            1.3740
   4.0000   -41.3000      -40.4233           -0.8767
  -2.2000   -15.4000      -14.4973           -0.9027
  -4.3000   -56.1000      -53.1802           -2.9198
  -0.2000     0.5000        0.0138            0.4862
err =
   25.6043
Syx =
    1.9125        (standard error of the estimate)
r =
    0.9975        (correlation coefficient r)
z =
    0.2668
    0.7200
   -2.7231

Quadratic Least Squares: y = 0.2668 + 0.7200x - 2.7231x^2
Error Sr = 25.6043
Correlation r = 0.9975

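Cubic Least Squares

\[ f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 \]

\[ S_r = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 - a_3 x_i^3 \right)^2 \]

\[
\begin{bmatrix}
n & \sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4 & \sum x_i^5 \\
\sum x_i^3 & \sum x_i^4 & \sum x_i^5 & \sum x_i^6
\end{bmatrix}
\begin{Bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{Bmatrix}
=
\begin{Bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \\ \sum x_i^3 y_i \end{Bmatrix}
\]

(all sums run over i = 1, ..., n)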

[x,y]=example2;
z=Cubic_LS(x,y)
      x          y     p(x)=a0+a1*x+a2*x^2+a3*x^3   y-p(x)
  -2.5000   -20.1000      -19.9347                  -0.1653
   3.0000   -21.8000      -21.4751                  -0.3249
   1.7000    -6.0000       -5.0508                  -0.9492
  -4.9000   -65.4000      -67.4300                   2.0300
   0.6000     0.2000        0.5842                  -0.3842
  -0.5000     0.6000       -0.8404                   1.4404
   4.0000   -41.3000      -41.7828                   0.4828
  -2.2000   -15.4000      -15.7997                   0.3997
  -4.3000   -56.1000      -53.2914                  -2.8086
  -0.2000     0.5000        0.2206                   0.2794
err =
   15.7361
Syx =
    1.6195
r =
    0.9985        (correlation coefficient r)
z =
    0.6513
    1.5946
   -2.8078
   -0.0608

Cubic Least Squares: y = 0.6513 + 1.5946x - 2.8078x^2 - 0.0608x^3
Correlation r = 0.9985

[x,y]=example2;
z1=Linear_LS(x,y); z1        % Linear Least Squares
z1 =
  -20.5717
    3.6005
z2=Quadratic_LS(x,y); z2     % Quadratic Least Squares
z2 =
    0.2668
    0.7200
   -2.7231
z3=Cubic_LS(x,y); z3         % Cubic Least Squares
z3 =
    0.6513
    1.5946
   -2.8078
   -0.0608

x1=min(x); x2=max(x); xx=x1:(x2-x1)/100:x2;


yy1=z1(1)+z1(2)*xx;
yy2=z2(1)+z2(2)*xx+z2(3)*xx.^2;
yy3=z3(1)+z3(2)*xx+z3(3)*xx.^2+z3(4)*xx.^3;
H=plot(x,y,'r*',xx,yy1,'g',xx,yy2,'b',xx,yy3,'m');
xlabel('x'); ylabel('y');
set(H,'LineWidth',3,'MarkerSize',12);
print -djpeg075 regres4.jpg

Linear Least Squares: y = -20.5717 + 3.6005x
Quadratic: y = 0.2668 + 0.7200x - 2.7231x^2
Cubic: y = 0.6513 + 1.5946x - 2.8078x^2 - 0.0608x^3

Standard error for polynomial regression

\[ s_{y/x} = \sqrt{\frac{S_r}{n-(m+1)}} \]

where
n observations
mth-order polynomial
(start off with n degrees of freedom, use up
m+1 for an mth-order polynomial)
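
To see how the degrees-of-freedom correction plays out across the three fits, here is a minimal sketch that uses MATLAB's built-in polyfit and polyval in place of the course routines Linear_LS, Quadratic_LS and Cubic_LS (whose sources are not shown). The formula r = sqrt((St - Sr)/St) is an assumption that matches the r values printed in the transcripts.

```matlab
% Standard error s_y/x for polynomial orders m = 1, 2, 3 (sketch).
x = [-2.5 3.0 1.7 -4.9 0.6 -0.5 4.0 -2.2 -4.3 -0.2];
y = [-20.1 -21.8 -6.0 -65.4 0.2 0.6 -41.3 -15.4 -56.1 0.5];
n  = length(x);
St = sum((y - mean(y)).^2);          % total sum of squares about the mean

Sr = zeros(1,3); Syx = zeros(1,3); r = zeros(1,3);
for m = 1:3
    p      = polyfit(x, y, m);               % coefficients, highest power first
    Sr(m)  = sum((y - polyval(p, x)).^2);    % sum of squared residuals
    Syx(m) = sqrt(Sr(m)/(n - (m + 1)));      % n - (m+1) degrees of freedom
    r(m)   = sqrt((St - Sr(m))/St);          % correlation coefficient
    fprintf('m = %d: Sr = %10.4f  Syx = %8.4f  r = %.4f\n', m, Sr(m), Syx(m), r(m));
end
```

Note that polyfit returns coefficients from the highest power down, the reverse of the a0, a1, ... ordering used in the slides.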

Multiple Linear Regression

Dependence on more than one variable

\[ y = a_0 + a_1 x_1 + a_2 x_2 \]

\[ e_i = y_i - \left( a_0 + a_1 x_{1,i} + a_2 x_{2,i} \right) \]

e.g. dependence of runoff volume on soil
type and land cover,
or dependence of aerodynamic drag on
automobile shape and speed

Multiple Linear Regression

With two independent variables, get a surface


Find the best-fit plane to the data

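Multiple Linear Regression

Much like polynomial regression
Sum of squared residuals

\[ S_r = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} \right)^2 \]

\[ \frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} \right) = 0 \]

\[ \frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} x_{1,i} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} \right) = 0 \]

\[ \frac{\partial S_r}{\partial a_2} = -2 \sum_{i=1}^{n} x_{2,i} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} \right) = 0 \]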

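Rearrange the equations

\[
\begin{bmatrix}
n & \sum x_{1,i} & \sum x_{2,i} \\
\sum x_{1,i} & \sum x_{1,i}^2 & \sum x_{1,i} x_{2,i} \\
\sum x_{2,i} & \sum x_{1,i} x_{2,i} & \sum x_{2,i}^2
\end{bmatrix}
\begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix}
=
\begin{Bmatrix} \sum y_i \\ \sum x_{1,i} y_i \\ \sum x_{2,i} y_i \end{Bmatrix}
\]

(all sums run over i = 1, ..., n)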

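Very similar to polynomial regression

\[
\begin{bmatrix}
n & \sum x_i & \sum x_i^2 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4
\end{bmatrix}
\begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix}
=
\begin{Bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{Bmatrix}
\]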

Multiple Linear Regression


Once again, solve by any matrix method
Cholesky decomposition is appropriate: the coefficient matrix is symmetric and positive definite
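
A minimal sketch of the procedure for two independent variables follows. The data are synthetic and noise-free, generated from the assumed plane y = 5 + 4*x1 - 3*x2 so that the recovered coefficients can be checked by eye; they are not from the slides.

```matlab
% Multiple linear regression via the normal equations (sketch).
% Synthetic, noise-free data from the plane y = 5 + 4*x1 - 3*x2.
x1 = [0 2 2.5 1 4 7];
x2 = [0 1 2 3 6 2];
y  = 5 + 4*x1 - 3*x2;
n  = length(y);

% Symmetric, positive definite coefficient matrix and right-hand side
A = [n        sum(x1)      sum(x2);
     sum(x1)  sum(x1.^2)   sum(x1.*x2);
     sum(x2)  sum(x1.*x2)  sum(x2.^2)];
r = [sum(y); sum(x1.*y); sum(x2.*y)];

% Cholesky solve: A = R'*R, forward substitution then back substitution
R = chol(A);
a = R\(R'\r);

disp(a');   % a0, a1, a2
```

Because the data lie exactly on a plane, the fit reproduces the generating coefficients; with measured data the same code returns the best-fit plane instead.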

Very useful for fitting a power equation

\[ y = a_0 \, x_1^{a_1} x_2^{a_2} \cdots x_m^{a_m} \]

\[ \log y = \log a_0 + a_1 \log x_1 + a_2 \log x_2 + \cdots + a_m \log x_m \]
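
The log transform above turns the power equation into a multiple linear regression in log x1 and log x2. The sketch below illustrates this on synthetic, noise-free data generated from assumed values a0 = 2, a1 = 1.5, a2 = -0.5; none of these numbers come from the slides.

```matlab
% Fitting y = a0 * x1^a1 * x2^a2 by linear regression on logs (sketch).
% Synthetic data with assumed a0 = 2, a1 = 1.5, a2 = -0.5.
x1 = [1 2 3 4 5 6]';
x2 = [2 1 4 3 6 5]';
y  = 2 * x1.^1.5 .* x2.^(-0.5);

% Linear model: log10(y) = b0 + a1*log10(x1) + a2*log10(x2), with b0 = log10(a0)
Z = [ones(size(x1)) log10(x1) log10(x2)];
b = (Z'*Z)\(Z'*log10(y));      % normal equations

a0 = 10^(b(1));                % undo the log on the intercept
a1 = b(2);
a2 = b(3);
fprintf('a0 = %.4f, a1 = %.4f, a2 = %.4f\n', a0, a1, a2);
```

One caveat on this design: the fit minimizes squared error in log y, not in y itself, so with noisy data it weights relative rather than absolute deviations.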

Example: Strength of concrete depends on cure time and cement/water ratio (or water content W/C)

cure time (days)    W/C     strength (psi)
        2           0.42        2770
        4           0.55        2639
        5           0.70        2519
       16           0.53        3450
        3           0.61        2315
        7           0.67        2545
        8           0.55        2613
       27           0.66        3694
       14           0.42        3414
       20           0.58        3634

x1=[2 4 5 16 3 7 8 27 14 20];
x2=[0.42 0.55 0.7 0.53 0.61 0.67 0.55 0.66 0.42 0.58];
y=[2770 2639 2519 3450 2315 2545 2613 3694 3414 3634];
H=plot3(x1,x2,y,'ro'); grid on; set(H,'LineWidth',5);
H1=xlabel('Cure Time (days)'); set(H1,'FontSize',12)
H2=ylabel('Water Content'); set(H2,'FontSize',12)
H3=zlabel('Strength (psi)'); set(H3,'FontSize',12)

Hand Calculations

x1 = cure time (days), x2 = W/C, y = strength (psi)

  x1     x2      y      x1^2    x2^2     x1*x2     x1*y        x2*y
   2    0.42    2770       4    0.1764    0.84      5540.35     1163.473
   4    0.55    2639      16    0.3025    2.20     10557.82     1451.7
   5    0.70    2519      25    0.4900    3.50     12592.77     1762.988
  16    0.53    3450     256    0.2809    8.48     55195.7      1828.358
   3    0.61    2315       9    0.3721    1.83      6944.629    1412.075
   7    0.67    2545      49    0.4489    4.69     17815.74     1705.22
   8    0.55    2613      64    0.3025    4.40     20900.17     1436.886
  27    0.66    3694     729    0.4356   17.82     99729.22     2437.825
  14    0.42    3414     196    0.1764    5.88     47793.4      1433.802
  20    0.58    3634     400    0.3364   11.60     72683.15     2107.811
sum:
 106    5.69   29592    1748    3.3217   61.24    349752.9     16740.14

(The x1*y and x2*y columns appear to have been computed from unrounded strength values, so they differ slightly from products of the tabulated y.)

\[
\begin{bmatrix}
n & \sum x_{1,i} & \sum x_{2,i} \\
\sum x_{1,i} & \sum x_{1,i}^2 & \sum x_{1,i} x_{2,i} \\
\sum x_{2,i} & \sum x_{1,i} x_{2,i} & \sum x_{2,i}^2
\end{bmatrix}
\begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix}
=
\begin{Bmatrix} \sum y_i \\ \sum x_{1,i} y_i \\ \sum x_{2,i} y_i \end{Bmatrix}
\]

\[
\begin{bmatrix}
10 & 106 & 5.69 \\
106 & 1748 & 61.24 \\
5.69 & 61.24 & 3.32
\end{bmatrix}
\begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix}
=
\begin{Bmatrix} 29592 \\ 349753 \\ 16740 \end{Bmatrix}
\]

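Solve by Cholesky decomposition

\[
\begin{bmatrix}
10 & 106 & 5.69 \\
106 & 1748 & 61.24 \\
5.69 & 61.24 & 3.32
\end{bmatrix}
=
\begin{bmatrix}
3.16 & 0 & 0 \\
33.52 & 24.99 & 0 \\
1.80 & 0.04 & 0.29
\end{bmatrix}
\begin{bmatrix}
3.16 & 33.52 & 1.80 \\
0 & 24.99 & 0.04 \\
0 & 0 & 0.29
\end{bmatrix}
\]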

Forward and Back Substitutions

\[ \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 3358 \\ 60 \\ -1827 \end{Bmatrix} \]

strength (psi) = 3358 + 60 (cure days) - 1827 (W/C)
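
The hand calculation can be checked numerically. The sketch below takes the tabulated sums as given (with the sum of x2^2 at its four-decimal value 3.3217), factors the matrix with chol, and writes out the forward and back substitution loops explicitly. Variable names are assumptions; because the sums are rounded, the result agrees with the slide values only to the digits shown.

```matlab
% Cholesky solution of the concrete-strength normal equations (sketch).
A = [10    106    5.69;
     106   1748   61.24;
     5.69  61.24  3.3217];
r = [29592; 349752.9; 16740.14];
n = 3;

R = chol(A);      % upper triangular factor, A = R'*R
L = R';           % lower triangular factor, A = L*L'

% Forward substitution: solve L*d = r
d = zeros(n,1);
for i = 1:n
    d(i) = (r(i) - L(i,1:i-1)*d(1:i-1)) / L(i,i);
end

% Back substitution: solve L'*a = d
a = zeros(n,1);
for i = n:-1:1
    a(i) = (d(i) - R(i,i+1:n)*a(i+1:n)) / R(i,i);
end

disp(L);          % compare with the factor shown on the slide
disp(a');         % a0, a1, a2
```

The last pivot of L is small compared with the others, which is why a2 is the coefficient most sensitive to rounding in the sums.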

function [x1,x2,y] = concrete


x1=[2 4 5 16 3 7 8 27 14 20];
x2=[0.42 0.55 0.7 0.53 0.61 0.67 0.55 0.66 0.42 0.58];
y=[2770 2639 2519 3450 2315 2545 2613 3694 3414 3634];

[x1,x2,y]=concrete;
z=Multi_Linear(x1,x2,y)
    x1      x2       y     (a0+a1*x1+a2*x2)  (y-a0-a1*x1-a2*x2)
     2     0.42     2770       2711.3              58.652
     4     0.55     2639       2594.7              44.267
     5     0.7      2519       2381.1             137.94
    16     0.53     3450       3357.3              92.72
     3     0.61     2315       2424.6            -109.57
     7     0.67     2545       2556.9             -11.895
     8     0.55     2613       2836.7            -223.73
    27     0.66     3694       3785.2             -91.158
    14     0.42     3414       3437.3             -23.339
    20     0.58     3634       3507.9             126.11
Syx =
   130.92
r =
   0.97553        (correlation coefficient)
z =
     3358
   60.499
  -1827.8         (a0, a1, a2)

strength (psi) = 3358 + 60.499 (cure days) - 1827.8 (W/C)

Multiple Linear Regression

strength (psi) = 3358 + 60.499 (cure days) - 1827.8 (W/C)

xx=0:0.02:1; yy=0:0.02:1; [x,y]=meshgrid(xx,yy);


z=2*x+3*y+2;
surfc(x,y,z); grid on
axis([0 1 0 1 0 7])
xlabel('x1'); ylabel('x2'); zlabel('y')

General Linear Least Squares

Simple linear, polynomial, and multiple linear
regressions are special cases of the general linear
least squares model

\[ y = a_0 z_0 + a_1 z_1 + a_2 z_2 + \cdots + a_m z_m + e \]

Examples:

\[ y = a_0 + a_1 \cos(t) + a_2 \sin(t) \]

\[ y = a_0 x^2 + a_1 \sin x \]

Linear in the a_i, but the z_i may be highly nonlinear

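General Linear Least Squares

General equation in matrix form

\[ \{Y\} = [Z]\{A\} + \{E\} \]

where

\[
[Z] =
\begin{bmatrix}
z_{01} & z_{11} & \cdots & z_{m1} \\
z_{02} & z_{12} & \cdots & z_{m2} \\
\vdots & \vdots & & \vdots \\
z_{0n} & z_{1n} & \cdots & z_{mn}
\end{bmatrix}
\]

\[ \{Y\}^T = \lfloor y_1 \;\; y_2 \;\; \cdots \;\; y_n \rfloor \]  Dependent variables
\[ \{A\}^T = \lfloor a_0 \;\; a_1 \;\; \cdots \;\; a_m \rfloor \]  Regression coefficients
\[ \{E\}^T = \lfloor e_1 \;\; e_2 \;\; \cdots \;\; e_n \rfloor \]  Residuals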

General Linear Least Squares

As usual, take partial derivatives to
minimize the sum of squared errors Sr

\[ S_r = \sum_{i=1}^{n} \left( y_i - \sum_{j=0}^{m} a_j z_{ji} \right)^2 \]

This leads to the normal equations

\[ [Z]^T [Z] \{A\} = [Z]^T \{Y\} \]

Solve this for {A} using Cholesky or LU
decomposition, or the matrix inverse
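
A minimal sketch of the matrix formulation, using the basis z0 = 1, z1 = cos(t), z2 = sin(t) from the examples. The data are synthetic and noise-free, generated with assumed coefficients 1.7, 0.5 and -0.9, so the fit should return them to rounding error.

```matlab
% General linear least squares with a sinusoidal basis (sketch).
% Synthetic data from y = 1.7 + 0.5*cos(t) - 0.9*sin(t).
t = (0:0.5:5)';
y = 1.7 + 0.5*cos(t) - 0.9*sin(t);

% Each column of Z is one basis function evaluated at all data points
Z = [ones(size(t)) cos(t) sin(t)];

% Normal equations [Z]'[Z]{A} = [Z]'{Y}
a = (Z'*Z)\(Z'*y);

Sr = sum((y - Z*a).^2);        % sum of squared residuals
fprintf('a = [%.4f %.4f %.4f], Sr = %.2e\n', a(1), a(2), a(3), Sr);
```

Swapping in other basis functions only changes the columns of Z; the solve step stays the same, which is the point of the general formulation.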

Nonlinear Regression

Use Taylor series expansion to linearize the
original equation
Gauss-Newton method
Nonlinear function of a0, a1, ..., am

\[ y_i = f(x_i;\, a_0, a_1, \ldots, a_m) + e_i \]

where f is a nonlinear function of x
(xi, yi) is one of a set of n observations

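Nonlinear Regression

Use Taylor series for f, and truncate the
higher-order terms

\[ y_i = f(x_i;\, a_0, a_1, \ldots, a_m) + e_i \]

\[ f(x_i)_{j+1} = f(x_i)_j + \frac{\partial f(x_i)_j}{\partial a_0} \Delta a_0 + \frac{\partial f(x_i)_j}{\partial a_1} \Delta a_1 + \cdots + \frac{\partial f(x_i)_j}{\partial a_m} \Delta a_m \]

j = the initial guess
j+1 = the prediction (improved guess)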

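Nonlinear Regression

Plug the Taylor series into the original equation

\[ y_i = f(x_i)_j + \frac{\partial f(x_i)_j}{\partial a_0} \Delta a_0 + \frac{\partial f(x_i)_j}{\partial a_1} \Delta a_1 + \cdots + \frac{\partial f(x_i)_j}{\partial a_m} \Delta a_m + e_i \]

or

\[ y_i - f(x_i)_j = \frac{\partial f(x_i)_j}{\partial a_0} \Delta a_0 + \frac{\partial f(x_i)_j}{\partial a_1} \Delta a_1 + \cdots + \frac{\partial f(x_i)_j}{\partial a_m} \Delta a_m + e_i \]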

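Gauss-Newton Method

Given all n equations

\[
\begin{aligned}
y_1 - f(x_1)_j &= \frac{\partial f(x_1)_j}{\partial a_0} \Delta a_0 + \frac{\partial f(x_1)_j}{\partial a_1} \Delta a_1 + \cdots + \frac{\partial f(x_1)_j}{\partial a_m} \Delta a_m + e_1 \\
y_2 - f(x_2)_j &= \frac{\partial f(x_2)_j}{\partial a_0} \Delta a_0 + \frac{\partial f(x_2)_j}{\partial a_1} \Delta a_1 + \cdots + \frac{\partial f(x_2)_j}{\partial a_m} \Delta a_m + e_2 \\
&\;\;\vdots \\
y_n - f(x_n)_j &= \frac{\partial f(x_n)_j}{\partial a_0} \Delta a_0 + \frac{\partial f(x_n)_j}{\partial a_1} \Delta a_1 + \cdots + \frac{\partial f(x_n)_j}{\partial a_m} \Delta a_m + e_n
\end{aligned}
\]

Set up matrix equation

\[ \{D\} = [Z_j]\{\Delta A\} + \{E\} \]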

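\[ \{D\} = [Z_j]\{\Delta A\} + \{E\} \]

where (with f_i denoting f(x_i)_j)

\[
[Z_j] =
\begin{bmatrix}
\dfrac{\partial f_1}{\partial a_0} & \dfrac{\partial f_1}{\partial a_1} & \cdots & \dfrac{\partial f_1}{\partial a_m} \\
\dfrac{\partial f_2}{\partial a_0} & \dfrac{\partial f_2}{\partial a_1} & \cdots & \dfrac{\partial f_2}{\partial a_m} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_n}{\partial a_0} & \dfrac{\partial f_n}{\partial a_1} & \cdots & \dfrac{\partial f_n}{\partial a_m}
\end{bmatrix}
\]

\[
\{D\} = \begin{Bmatrix} y_1 - f(x_1)_j \\ y_2 - f(x_2)_j \\ \vdots \\ y_n - f(x_n)_j \end{Bmatrix}; \quad
\{\Delta A\} = \begin{Bmatrix} \Delta a_0 \\ \Delta a_1 \\ \vdots \\ \Delta a_m \end{Bmatrix}; \quad
\{E\} = \begin{Bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{Bmatrix}
\]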

Gauss-Newton Method

Using the same least squares approach
Minimizing the sum of squares of the residuals e

\[ [Z_j]^T [Z_j] \{\Delta A\} = [Z_j]^T \{D\} \]

Get {ΔA} from

\[ \{\Delta A\} = \left( [Z_j]^T [Z_j] \right)^{-1} [Z_j]^T \{D\} \]

Now update a0, a1, ..., am with {ΔA} and repeat
the procedure until convergence is reached

Example: Damped Sinusoidal


function [x,y] = mass_spring
x = [0.00 0.11 0.18 0.25 0.32 0.44 0.55 0.61 0.68 0.80 ...
0.92 1.01 1.12 1.22 1.35 1.45 1.60 1.67 1.76 1.83 2.00];
y = [1.03 0.78 0.62 0.22 0.05 -0.20 -0.45 -0.50 -0.45 -0.31 ...
-0.21 -0.11 0.04 0.12 0.22 0.23 0.18 0.10 0.07 -0.02 -0.10];

Model it with

\[ y = e^{-a_0 x} \cos(a_1 x) \]

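Gauss-Newton Method

\[ f(x) = e^{-a_0 x} \cos(a_1 x) \]

\[ \frac{\partial f}{\partial a_0} = -x e^{-a_0 x} \cos(a_1 x) \]

\[ \frac{\partial f}{\partial a_1} = -x e^{-a_0 x} \sin(a_1 x) \]

\[
[Z] =
\begin{bmatrix}
\dfrac{\partial f_1}{\partial a_0} & \dfrac{\partial f_1}{\partial a_1} \\
\dfrac{\partial f_2}{\partial a_0} & \dfrac{\partial f_2}{\partial a_1} \\
\vdots & \vdots \\
\dfrac{\partial f_n}{\partial a_0} & \dfrac{\partial f_n}{\partial a_1}
\end{bmatrix}
=
\begin{bmatrix}
-x_1 e^{-a_0 x_1} \cos(a_1 x_1) & -x_1 e^{-a_0 x_1} \sin(a_1 x_1) \\
-x_2 e^{-a_0 x_2} \cos(a_1 x_2) & -x_2 e^{-a_0 x_2} \sin(a_1 x_2) \\
\vdots & \vdots \\
-x_n e^{-a_0 x_n} \cos(a_1 x_n) & -x_n e^{-a_0 x_n} \sin(a_1 x_n)
\end{bmatrix}
\]

\[
\{D\} =
\begin{Bmatrix}
y_1 - e^{-a_0 x_1} \cos(a_1 x_1) \\
y_2 - e^{-a_0 x_2} \cos(a_1 x_2) \\
\vdots \\
y_n - e^{-a_0 x_n} \cos(a_1 x_n)
\end{Bmatrix}
\]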
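
Sign errors in the partial derivatives are an easy way to break a Gauss-Newton iteration, so it can help to compare the analytical columns of [Z] against central finite differences before iterating. The sketch below assumes the model f = exp(-a0*x).*cos(a1*x); the test points, the step h and the function handle names are illustrative choices, not part of the slides.

```matlab
% Finite-difference check of the analytical partial derivatives (sketch).
f    = @(a,x) exp(-a(1)*x).*cos(a(2)*x);        % model
dfa0 = @(a,x) -x.*exp(-a(1)*x).*cos(a(2)*x);    % df/da0
dfa1 = @(a,x) -x.*exp(-a(1)*x).*sin(a(2)*x);    % df/da1

x = (0:0.25:2)';     % illustrative sample points
a = [2; 3];          % same initial guess as used in the example
h = 1e-6;            % finite-difference step

% Analytical Jacobian, one column per parameter
Z = [dfa0(a,x), dfa1(a,x)];

% Central-difference Jacobian
Zfd = [(f(a + [h;0], x) - f(a - [h;0], x))/(2*h), ...
       (f(a + [0;h], x) - f(a - [0;h], x))/(2*h)];

err = max(abs(Z(:) - Zfd(:)));
fprintf('largest difference between analytical and numerical Z: %.2e\n', err);
```

A difference near rounding level suggests the derivative formulas and their signs are consistent with the model.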

Choose initial a0 = 2, a1 = 3

[x,y]=mass_spring;
a=gauss_newton(x,y)
Enter the initial guesses [a0,a1] = [2,3]
Enter the tolerance tol = 0.0001
Enter the maximum iteration number itmax = 50
n =
    21            (21 data points)
    iter      a0        a1        da0       da1
    1.0000    2.1977    5.0646    0.1977    2.0646
    2.0000    1.0264    3.9349   -1.1713   -1.1296
    3.0000    1.1757    4.3656    0.1494    0.4307
    4.0000    1.1009    4.4054   -0.0748    0.0398
    5.0000    1.1035    4.3969    0.0026   -0.0085
    6.0000    1.1030    4.3973   -0.0005    0.0003
    7.0000    1.1030    4.3972    0.0000    0.0000
Gauss-Newton method has converged
a =
    1.1030
    4.3972

a0 = 1.1030, a1 = 4.3972

\[ f(x) = e^{-1.1030x} \cos(4.3972x) \]
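
The gauss_newton routine itself is not listed in the slides. The following is a minimal sketch of what such an iteration might look like for this model, with the initial guesses, tolerance and iteration limit hard-coded instead of prompted for; the stopping test on max(abs(dA)) is an assumption.

```matlab
% Gauss-Newton iteration for y = exp(-a0*x).*cos(a1*x) (sketch).
x = [0.00 0.11 0.18 0.25 0.32 0.44 0.55 0.61 0.68 0.80 ...
     0.92 1.01 1.12 1.22 1.35 1.45 1.60 1.67 1.76 1.83 2.00]';
y = [1.03 0.78 0.62 0.22 0.05 -0.20 -0.45 -0.50 -0.45 -0.31 ...
     -0.21 -0.11 0.04 0.12 0.22 0.23 0.18 0.10 0.07 -0.02 -0.10]';

a     = [2; 3];      % initial guesses [a0; a1]
tol   = 1e-4;        % convergence tolerance on the parameter update
itmax = 50;          % maximum number of iterations

for iter = 1:itmax
    E  = exp(-a(1)*x);
    f  = E.*cos(a(2)*x);                               % model at current guess
    Z  = [-x.*E.*cos(a(2)*x), -x.*E.*sin(a(2)*x)];     % partial derivatives
    D  = y - f;                                        % residual vector
    dA = (Z'*Z)\(Z'*D);                                % normal equations
    a  = a + dA;                                       % improved guess
    fprintf('%4d %9.4f %9.4f %9.4f %9.4f\n', iter, a(1), a(2), dA(1), dA(2));
    if max(abs(dA)) < tol
        break;       % converged
    end
end
```

Plain Gauss-Newton takes the full step dA each iteration and can diverge from a poor starting point; the guesses [2, 3] used here are those from the example run.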