
Linear Regression

Asst. Prof. Dr. Amar Hasan Hameed

MAM 3206 Numerical Analysis


Autumn 2018

What is Regression?
What is regression? Given n data points (x_1, y_1), (x_2, y_2), ..., (x_n, y_n),
best fit y = f(x) to the data. The best fit is generally based on minimizing
the sum of the squares of the residuals, S_r.

The residual at a point is

    \epsilon_i = y_i - f(x_i)

The sum of the squares of the residuals is

    S_r = \sum_{i=1}^{n} (y_i - f(x_i))^2

Figure. Basic model for regression.
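As a quick illustration (not part of the original slides), here is a minimal Python sketch that evaluates S_r for an arbitrary model f; the helper name is chosen for illustration, and the four data points and the line y = 4x - 4 are the ones used in the examples that follow.

```python
# Minimal sketch: compute S_r = sum of squared residuals for a model f.

def sum_of_squared_residuals(x, y, f):
    """Return S_r = sum_i (y_i - f(x_i))^2 for data (x, y) and model f."""
    return sum((yi - f(xi)) ** 2 for xi, yi in zip(x, y))

# The four data points used in the Criterion #1 example, with the line y = 4x - 4.
x = [2.0, 3.0, 2.0, 3.0]
y = [4.0, 6.0, 6.0, 8.0]
print(sum_of_squared_residuals(x, y, lambda t: 4 * t - 4))  # 8.0
```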

Linear Regression-Criterion#1
Given n data points (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), best fit
y = a_0 + a_1 x to the data.

The residual at a typical point x_i is

    \epsilon_i = y_i - a_0 - a_1 x_i

Figure. Linear regression of y vs. x data showing residuals at a typical point, x_i.

Does minimizing \sum_{i=1}^{n} \epsilon_i work as a criterion, where
\epsilon_i = y_i - (a_0 + a_1 x_i)?
Example for Criterion#1
Example: Given the data points (2,4), (3,6), (2,6) and (3,8), best fit
the data to a straight line using Criterion #1.

Table. Data points
  x     y
  2.0   4.0
  3.0   6.0
  2.0   6.0
  3.0   8.0

Figure. Data points for y vs. x data.

Linear Regression-Criterion#1
Using y = 4x - 4 as the regression curve:

Table. Residuals at each point for the regression model y = 4x - 4
  x     y     y_predicted   ε = y - y_predicted
  2.0   4.0   4.0            0.0
  3.0   6.0   8.0           -2.0
  2.0   6.0   4.0            2.0
  3.0   8.0   8.0            0.0

    \sum_{i=1}^{4} \epsilon_i = 0

Figure. Regression curve y = 4x - 4 plotted with the y vs. x data.

Linear Regression-Criterion#1
Using y = 6 as the regression curve:

Table. Residuals at each point for y = 6
  x     y     y_predicted   ε = y - y_predicted
  2.0   4.0   6.0           -2.0
  3.0   6.0   6.0            0.0
  2.0   6.0   6.0            0.0
  3.0   8.0   6.0            2.0

    \sum_{i=1}^{4} \epsilon_i = 0

Figure. Regression curve y = 6 plotted with the y vs. x data.

Linear Regression – Criterion #1


i 1
i  0 for both regression models of y=4x-4 and y=6.

The sum of the residuals is as small as possible, that is zero,


but the regression model is not unique.

Hence the above criterion of minimizing the sum of the


residuals is a bad criterion.

Linear Regression-Criterion#2
Will minimizing \sum_{i=1}^{n} |\epsilon_i| work any better?

The residual at a typical point x_i is

    \epsilon_i = y_i - a_0 - a_1 x_i

Figure. Linear regression of y vs. x data showing residuals at a typical point, x_i.

Linear Regression-Criterion#2
Using y = 4x - 4 as the regression curve:

Table. Absolute residuals at each point for the y = 4x - 4 regression model
  x     y     y_predicted   |ε| = |y - y_predicted|
  2.0   4.0   4.0           0.0
  3.0   6.0   8.0           2.0
  2.0   6.0   4.0           2.0
  3.0   8.0   8.0           0.0

    \sum_{i=1}^{4} |\epsilon_i| = 4

Figure. Regression curve y = 4x - 4 plotted with the y vs. x data.

Linear Regression-Criterion#2
Using y = 6 as the regression curve:

Table. Absolute residuals at each point for the y = 6 model
  x     y     y_predicted   |ε| = |y - y_predicted|
  2.0   4.0   6.0           2.0
  3.0   6.0   6.0           0.0
  2.0   6.0   6.0           0.0
  3.0   8.0   6.0           2.0

    \sum_{i=1}^{4} |\epsilon_i| = 4

Figure. Regression curve y = 6 plotted with the y vs. x data.

Linear Regression-Criterion#2
\sum_{i=1}^{4} |\epsilon_i| = 4 for both regression models, y = 4x - 4 and y = 6.

The sum of the absolute residuals has been made as small as possible, that is 4,
but the regression model is not unique. Hence the above criterion of minimizing
the sum of the absolute values of the residuals is also a bad criterion.

Can you find a regression line for which \sum_{i=1}^{4} |\epsilon_i| < 4?
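The non-uniqueness can be checked directly. The following minimal Python sketch (not part of the original slides) evaluates both criteria for the two candidate lines on the four data points above:

```python
# Minimal sketch: Criterion #1 (sum of residuals) and Criterion #2
# (sum of absolute residuals) for the two candidate regression lines.
x = [2.0, 3.0, 2.0, 3.0]
y = [4.0, 6.0, 6.0, 8.0]

models = {"y = 4x - 4": lambda t: 4 * t - 4,
          "y = 6": lambda t: 6.0}

for name, f in models.items():
    residuals = [yi - f(xi) for xi, yi in zip(x, y)]
    print(name,
          "| sum of residuals =", sum(residuals),                     # 0 for both
          "| sum of |residuals| =", sum(abs(r) for r in residuals))   # 4 for both
```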

Least Squares Criterion
The least squares criterion minimizes the sum of the squares of the residuals,
and also produces a unique line.

    S_r = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2

Figure. Linear regression of y vs. x data showing residuals at a typical point, x_i.
Finding Constants of Linear Model
Minimize the sum of the squares of the residuals:

    S_r = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2

To find a_0 and a_1, we minimize S_r with respect to a_0 and a_1:

    \frac{\partial S_r}{\partial a_0} = \sum_{i=1}^{n} 2 (y_i - a_0 - a_1 x_i)(-1) = 0

    \frac{\partial S_r}{\partial a_1} = \sum_{i=1}^{n} 2 (y_i - a_0 - a_1 x_i)(-x_i) = 0

giving

    n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i

    a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i
Finding Constants of Linear Model
Solving for a_0 and a_1 directly yields

    a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
               {n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}

and

    a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}
               {n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}
        = \bar{y} - a_1 \bar{x}

Example 1
The torque T needed to turn the torsion spring of a mousetrap through an
angle θ is given below. Find the constants of the model

    T = k_1 + k_2 \theta

Table. Torque vs. angle for a torsional spring
  Angle, θ (radians)   Torque, T (N-m)
  0.698132             0.188224
  0.959931             0.209138
  1.134464             0.230052
  1.570796             0.250965
  1.919862             0.313707

Figure. Data points for torque vs. angle data.

Example 1 cont.
The following table shows the summations needed for the calculation of the
constants in the regression model.

Table. Tabulation of data for calculation of the needed summations
  θ (radians)   T (N-m)    θ² (radians²)   θT (N-m-radians)
  0.698132      0.188224   0.487388        0.131405
  0.959931      0.209138   0.921468        0.200758
  1.134464      0.230052   1.2870          0.260986
  1.570796      0.250965   2.4674          0.394215
  1.919862      0.313707   3.6859          0.602274
  Σ = 6.2831    1.1921     8.8491          1.5896

Using the equations derived for a_0 and a_1, with n = 5:

    k_2 = \frac{n \sum_{i=1}^{5} \theta_i T_i - \sum_{i=1}^{5} \theta_i \sum_{i=1}^{5} T_i}
               {n \sum_{i=1}^{5} \theta_i^2 - \left( \sum_{i=1}^{5} \theta_i \right)^2}
        = \frac{5(1.5896) - (6.2831)(1.1921)}{5(8.8491) - (6.2831)^2}
        = 9.6091 \times 10^{-2} \text{ N-m/rad}

Example 1 cont.
Use the average torque and average angle to calculate k_1:

    \bar{T} = \frac{\sum_{i=1}^{5} T_i}{n} = \frac{1.1921}{5} = 2.3842 \times 10^{-1}

    \bar{\theta} = \frac{\sum_{i=1}^{5} \theta_i}{n} = \frac{6.2831}{5} = 1.2566

Using

    k_1 = \bar{T} - k_2 \bar{\theta}
        = 2.3842 \times 10^{-1} - (9.6091 \times 10^{-2})(1.2566)
        = 1.1767 \times 10^{-1} \text{ N-m}

Example 1 Results
Using linear regression, a trend line is found from the data

Figure. Linear regression of Torque versus Angle data

Can you find the energy in the spring if it is twisted from 0 to 180 degrees?
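One way to approach this question (an added worked sketch, not part of the original slides, and note that it extrapolates the fitted line below the smallest measured angle of about 0.70 rad): the energy stored is the integral of the torque over the twist angle, with 180 degrees = π radians,

    U = \int_{0}^{\pi} (k_1 + k_2 \theta)\, d\theta
      = k_1 \pi + k_2 \frac{\pi^2}{2}
      \approx (0.11767)(3.1416) + (0.096091)\frac{(3.1416)^2}{2}
      \approx 0.84 \text{ N-m}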
Linear Regression (special case)

Given (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), best fit

    y = a_1 x

to the data.

Linear Regression (special case cont.)
y = a_1 x

    a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
               {n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}

Is this correct?

Linear Regression (special case cont.)

    \epsilon_i = y_i - a_1 x_i

Figure. Linear regression of y = a_1 x vs. x data, showing the residual at a typical point (x_i, y_i).
Linear Regression (special case cont.)

Residual at each data point

 i  yi  a1 xi
Sum of square of residuals
n
Sr    i
2

i 1

n 2

   yi  a1 xi 
i 1

Linear Regression (special case cont.)
Differentiate with respect to a_1:

    \frac{dS_r}{da_1} = \sum_{i=1}^{n} 2 (y_i - a_1 x_i)(-x_i)
                      = \sum_{i=1}^{n} \left( -2 y_i x_i + 2 a_1 x_i^2 \right)

Setting

    \frac{dS_r}{da_1} = 0

gives

    a_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}
Linear Regression (special case cont.)
Does this value of a_1 correspond to a local minimum or a local maximum?

    a_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}

    \frac{dS_r}{da_1} = \sum_{i=1}^{n} \left( -2 y_i x_i + 2 a_1 x_i^2 \right)

    \frac{d^2 S_r}{da_1^2} = \sum_{i=1}^{n} 2 x_i^2 > 0

Yes, it corresponds to a local minimum, so the least-squares estimate is

    a_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}

Example 2
To find the longitudinal modulus of a composite, the following data are
collected. Find the longitudinal modulus E using the regression model

    \sigma = E \epsilon

and find the sum of the squares of the residuals.

Table. Stress vs. strain data
  Strain (%)   Stress (MPa)
  0            0
  0.183        306
  0.36         612
  0.5324       917
  0.702        1223
  0.867        1529
  1.0244       1835
  1.1774       2140
  1.329        2446
  1.479        2752
  1.5          2767
  1.56         2896

Figure. Data points for stress vs. strain data.
Example 2 cont.
Table. Summation data for the regression model
  i    ε            σ (Pa)        ε²            εσ
  1    0.0000       0.0000        0.0000        0.0000
  2    1.8300×10⁻³  3.0600×10⁸    3.3489×10⁻⁶   5.5998×10⁵
  3    3.6000×10⁻³  6.1200×10⁸    1.2960×10⁻⁵   2.2032×10⁶
  4    5.3240×10⁻³  9.1700×10⁸    2.8345×10⁻⁵   4.8821×10⁶
  5    7.0200×10⁻³  1.2230×10⁹    4.9280×10⁻⁵   8.5855×10⁶
  6    8.6700×10⁻³  1.5290×10⁹    7.5169×10⁻⁵   1.3256×10⁷
  7    1.0244×10⁻²  1.8350×10⁹    1.0494×10⁻⁴   1.8798×10⁷
  8    1.1774×10⁻²  2.1400×10⁹    1.3863×10⁻⁴   2.5196×10⁷
  9    1.3290×10⁻²  2.4460×10⁹    1.7662×10⁻⁴   3.2507×10⁷
  10   1.4790×10⁻²  2.7520×10⁹    2.1874×10⁻⁴   4.0702×10⁷
  11   1.5000×10⁻²  2.7670×10⁹    2.2500×10⁻⁴   4.1505×10⁷
  12   1.5600×10⁻²  2.8960×10⁹    2.4336×10⁻⁴   4.5178×10⁷
  Σ                               1.2764×10⁻³   2.3337×10⁸

With \sum_{i=1}^{12} \epsilon_i^2 = 1.2764 \times 10^{-3} and
\sum_{i=1}^{12} \epsilon_i \sigma_i = 2.3337 \times 10^{8}, the longitudinal
modulus is

    E = \frac{\sum_{i=1}^{12} \epsilon_i \sigma_i}{\sum_{i=1}^{12} \epsilon_i^2}
      = \frac{2.3337 \times 10^{8}}{1.2764 \times 10^{-3}}
      = 1.8284 \times 10^{11} \text{ Pa} = 182.84 \text{ GPa}

Example 2 Results
The equation \sigma = 182.84 \times 10^{9} \epsilon describes the data.

Figure. Linear regression for stress vs. strain data

Nonlinear Regression
Some popular nonlinear regression models:

1. Exponential model:        y = a e^{bx}
2. Power model:              y = a x^b
3. Saturation growth model:  y = \frac{ax}{b + x}
4. Polynomial model:         y = a_0 + a_1 x + \ldots + a_m x^m

Nonlinear Regression
Given n data points (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), best fit
y = f(x) to the data, where f(x) is a nonlinear function of x.

Figure. Nonlinear regression model for discrete y vs. x data.

Polynomial Model
Given (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), best fit

    y = a_0 + a_1 x + \ldots + a_m x^m \qquad (m \le n - 2)

to the given data set.

Figure. Polynomial model for nonlinear regression of y vs. x data.

Polynomial Model cont.
The residual at each data point is given by

    E_i = y_i - a_0 - a_1 x_i - \ldots - a_m x_i^m

The sum of the squares of the residuals then is

    S_r = \sum_{i=1}^{n} E_i^2
        = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - \ldots - a_m x_i^m \right)^2

Polynomial Model cont.
To find the constants of the polynomial model, we set the derivatives of S_r
with respect to each a_i, i = 0, 1, ..., m, equal to zero:

    \frac{\partial S_r}{\partial a_0} = \sum_{i=1}^{n} 2 \left( y_i - a_0 - a_1 x_i - \ldots - a_m x_i^m \right)(-1) = 0

    \frac{\partial S_r}{\partial a_1} = \sum_{i=1}^{n} 2 \left( y_i - a_0 - a_1 x_i - \ldots - a_m x_i^m \right)(-x_i) = 0

    \vdots

    \frac{\partial S_r}{\partial a_m} = \sum_{i=1}^{n} 2 \left( y_i - a_0 - a_1 x_i - \ldots - a_m x_i^m \right)(-x_i^m) = 0

Polynomial Model cont.
These equations in matrix form are given by

    \begin{bmatrix}
      n                    & \sum_{i=1}^{n} x_i       & \cdots & \sum_{i=1}^{n} x_i^m     \\
      \sum_{i=1}^{n} x_i   & \sum_{i=1}^{n} x_i^2     & \cdots & \sum_{i=1}^{n} x_i^{m+1} \\
      \vdots               & \vdots                   & \ddots & \vdots                   \\
      \sum_{i=1}^{n} x_i^m & \sum_{i=1}^{n} x_i^{m+1} & \cdots & \sum_{i=1}^{n} x_i^{2m}
    \end{bmatrix}
    \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix}
    =
    \begin{bmatrix}
      \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \\ \vdots \\ \sum_{i=1}^{n} x_i^m y_i
    \end{bmatrix}

The above equations are then solved for a_0, a_1, ..., a_m.
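These normal equations can be assembled and solved numerically. A minimal Python sketch (not part of the original slides; the helper name is chosen for illustration), using NumPy:

```python
# Minimal sketch: build and solve the polynomial normal equations of degree m.
import numpy as np

def polyfit_normal_equations(x, y, m):
    """Return [a0, a1, ..., am] from the (m+1) x (m+1) normal equations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.array([[np.sum(x ** (j + k)) for k in range(m + 1)]
                  for j in range(m + 1)])                       # A[j][k] = sum x_i^(j+k)
    b = np.array([np.sum((x ** j) * y) for j in range(m + 1)])  # b[j] = sum x_i^j * y_i
    return np.linalg.solve(A, b)

# Sanity check on exact quadratic data y = 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1 + 2 * xi + 3 * xi ** 2 for xi in xs]
print(polyfit_normal_equations(xs, ys, 2))   # approximately [1. 2. 3.]
```

For higher-degree fits the normal equations can become poorly conditioned, so library routines such as numpy.polyfit are generally preferred in practice.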

Example 2-Polynomial Model
Regress the thermal expansion coefficient vs. temperature data to a
second-order polynomial.

Table. Data points for thermal expansion coefficient vs. temperature
  Temperature, T (°F)   Coefficient of thermal expansion, α (in/in/°F)
   80                   6.47×10⁻⁶
   40                   6.24×10⁻⁶
  −40                   5.72×10⁻⁶
  −120                  5.09×10⁻⁶
  −200                  4.30×10⁻⁶
  −280                  3.33×10⁻⁶
  −340                  2.45×10⁻⁶

Figure. Data points for thermal expansion coefficient vs. temperature.
Example 2-Polynomial Model cont.
We are to fit the data to the polynomial regression model

    \alpha = a_0 + a_1 T + a_2 T^2

The coefficients a_0, a_1, a_2 are found by differentiating the sum of the
squares of the residuals with respect to each variable and setting the
results equal to zero, to obtain

    \begin{bmatrix}
      n                    & \sum_{i=1}^{n} T_i   & \sum_{i=1}^{n} T_i^2 \\
      \sum_{i=1}^{n} T_i   & \sum_{i=1}^{n} T_i^2 & \sum_{i=1}^{n} T_i^3 \\
      \sum_{i=1}^{n} T_i^2 & \sum_{i=1}^{n} T_i^3 & \sum_{i=1}^{n} T_i^4
    \end{bmatrix}
    \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
    =
    \begin{bmatrix}
      \sum_{i=1}^{n} \alpha_i \\ \sum_{i=1}^{n} T_i \alpha_i \\ \sum_{i=1}^{n} T_i^2 \alpha_i
    \end{bmatrix}

Example 2-Polynomial Model cont.
The necessary summations, with n = 7, are as follows.

Table. Data points for temperature vs. α
  Temperature, T (°F)   Coefficient of thermal expansion, α (in/in/°F)
   80                   6.47×10⁻⁶
   40                   6.24×10⁻⁶
  −40                   5.72×10⁻⁶
  −120                  5.09×10⁻⁶
  −200                  4.30×10⁻⁶
  −280                  3.33×10⁻⁶
  −340                  2.45×10⁻⁶

    \sum_{i=1}^{7} T_i = -8.6000 \times 10^{2}

    \sum_{i=1}^{7} T_i^2 = 2.5800 \times 10^{5}

    \sum_{i=1}^{7} T_i^3 = -7.0472 \times 10^{7}

    \sum_{i=1}^{7} T_i^4 = 2.1363 \times 10^{10}

    \sum_{i=1}^{7} \alpha_i = 3.3600 \times 10^{-5}

    \sum_{i=1}^{7} T_i \alpha_i = -2.6978 \times 10^{-3}

    \sum_{i=1}^{7} T_i^2 \alpha_i = 8.5013 \times 10^{-1}

Example 2-Polynomial Model cont.
Using these summations, we can now calculate a_0, a_1, a_2:

    \begin{bmatrix}
      7.0000                & -8.6000 \times 10^{2} & 2.5800 \times 10^{5}  \\
      -8.6000 \times 10^{2} & 2.5800 \times 10^{5}  & -7.0472 \times 10^{7} \\
      2.5800 \times 10^{5}  & -7.0472 \times 10^{7} & 2.1363 \times 10^{10}
    \end{bmatrix}
    \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
    =
    \begin{bmatrix}
      3.3600 \times 10^{-5} \\ -2.6978 \times 10^{-3} \\ 8.5013 \times 10^{-1}
    \end{bmatrix}

Solving the above system of simultaneous linear equations gives

    \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
    =
    \begin{bmatrix}
      6.0217 \times 10^{-6} \\ 6.2782 \times 10^{-9} \\ -1.2218 \times 10^{-11}
    \end{bmatrix}

The polynomial regression model is then

    \alpha = a_0 + a_1 T + a_2 T^2
           = 6.0217 \times 10^{-6} + 6.2782 \times 10^{-9} T - 1.2218 \times 10^{-11} T^2
