
Definition of Regression

 Regression analysis is an important tool in statistics and data science. It helps us understand and measure how variables are related and how they affect one another.

 Regression analysis is essential for understanding data and making informed decisions.

 Regression analysis aims to find the best model that shows how one variable depends on others. We do this by drawing a line or curve through the data points so that it matches the actual results as closely as possible.

 We call the differences between the actual and predicted values residuals.
Definition of Regression
 Given 𝑛 data points (𝑥1, 𝑦1), (𝑥2, 𝑦2), …, (𝑥𝑛, 𝑦𝑛), we want to find the best fit to the data

$y = f(x)$

 The residual at each point is $E_i = y_i - f(x_i)$


Linear Regression
 Given 𝑛 data points (𝑥1, 𝑦1), (𝑥2, 𝑦2), …, (𝑥𝑛, 𝑦𝑛), we want to find the best fit to the data

$y = a_0 + a_1 x$

 Does minimizing $\sum_{i=1}^{n} E_i$ work as a criterion?
Linear Regression
 Example:

 Given the data points (2,4), (3,6), (2,6), (3,8), best fit the data to a straight line using the criterion of minimizing $\sum_{i=1}^{n} E_i$.

x   y
2   4
3   6
2   6
3   8
Linear Regression
 Example:

 To fit a straight line, we choose the 2 points (2,4) and (3,8):

 $4 = a_0 + a_1(2)$

 $8 = a_0 + a_1(3)$

 $y = 4x - 4$ will be the regression line

x   y   y_predicted   E = y − y_predicted
2   4   4             0
3   6   8             −2
2   6   4             2
3   8   8             0

$\sum_{i=1}^{4} E_i = 0$
Linear Regression
 Example:

 To fit a straight line, we choose the 2 points (2,6) and (3,6):

 $6 = a_0 + a_1(2)$

 $6 = a_0 + a_1(3)$

 $y = 6$ will be the regression line

x   y   y_predicted   E = y − y_predicted
2   4   6             −2
3   6   6             0
2   6   6             0
3   8   6             2

$\sum_{i=1}^{4} E_i = 0$
Linear Regression
 Example:

 4
𝑖=1 𝐸𝑖 = 0 for both regression models 𝑦 = 4𝑥 − 4 and 𝑦 = 6

 The sum of the residuals is minimized, in this case it is zero, but the regression model is

not unique

 Thus the criterion of minimizing the sum of the residuals is a bad criterion
Linear Regression
 Will minimizing $\sum_{i=1}^{n} |E_i|$ work better?
Linear Regression
 Example:

 Given the data points (2,4), (3,6), (2,6), (3,8), best fit the data to a straight line using the criterion of minimizing $\sum_{i=1}^{n} |E_i|$.

x   y
2   4
3   6
2   6
3   8
Linear Regression
 Example:

 To fit a straight line, we choose the 2 points (2,4) and (3,8):

 $4 = a_0 + a_1(2)$

 $8 = a_0 + a_1(3)$

 $y = 4x - 4$ will be the regression line

x   y   y_predicted   E = y − y_predicted
2   4   4             0
3   6   8             −2
2   6   4             2
3   8   8             0

$\sum_{i=1}^{4} |E_i| = 4$
Linear Regression
 Example:

 To fit a straight line, we choose the 2 points (2,6) and (3,6):

 $6 = a_0 + a_1(2)$

 $6 = a_0 + a_1(3)$

 $y = 6$ will be the regression line

x   y   y_predicted   E = y − y_predicted
2   4   6             −2
3   6   6             0
2   6   6             0
3   8   6             2

$\sum_{i=1}^{4} |E_i| = 4$
Linear Regression
 Example:

 4
𝑖=1 𝐸𝑖 = 4 for both regression models 𝑦 = 4𝑥 − 4 and 𝑦 = 6

 The sum of the absolute residuals has been made as small as possible, in this case it is

four, but the regression model is not unique

 Thus the criterion of minimizing the sum of the absolute residuals is also a bad criterion
Linear Regression
 The least squares criterion minimizes the sum of the squares of the residuals, and it also produces a unique line:

$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$
Linear Regression
$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$

 To find 𝑎0 and 𝑎1 we minimize $S_r$ with respect to 𝑎0 and 𝑎1:

 $\dfrac{\partial S_r}{\partial a_0} = 2\sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)(-1) = 0$

 $\dfrac{\partial S_r}{\partial a_1} = 2\sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)(-x_i) = 0$

 Or, equivalently, the normal equations:

 $n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$

 $a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$
Linear Regression
$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$

 Solving for 𝑎0 and 𝑎1:

$a_1 = \dfrac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$

$a_0 = \dfrac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} = \bar{y} - a_1 \bar{x}$
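A minimal sketch of these closed-form formulas (in Python; the slides use MATLAB, and the helper name fit_line is ours, not from the slides). Applied to the four points from the earlier examples, it returns the unique least-squares line:

```python
# Closed-form least-squares fit of y = a0 + a1*x, implementing the two
# formulas above. Plain Python; no libraries required.

def fit_line(x, y):
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a0 = sy / n - a1 * sx / n        # a0 = y_bar - a1 * x_bar
    return a0, a1

# The four points (2,4), (3,6), (2,6), (3,8) from the earlier examples:
a0, a1 = fit_line([2, 3, 2, 3], [4, 6, 6, 8])
print(a0, a1)   # -> 1.0 2.0, i.e. the unique least-squares line y = 1 + 2x
```

Unlike the $\sum E_i$ and $\sum |E_i|$ criteria, least squares singles out one line for this data set.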
Linear Regression
 Example:

 The torque 𝑇 needed to turn the torsion spring of a mousetrap through an angle 𝜃 is given below. Find the constants for the model $T = k_1 + k_2 \theta$.

Angle, θ (radians)   Torque, T (N·m)
0.698132             0.188224
0.959931             0.209138
1.134464             0.230052
1.570796             0.250965
1.919862             0.313707
Linear Regression
 Example:

 The following table shows the summations needed to calculate the constants in the regression model.

θ (radians)   T (N·m)    θ² (radians²)   Tθ (N·m·radians)
0.698132      0.188224   0.487388        0.131405
0.959931      0.209138   0.921468        0.200758
1.134464      0.230052   1.2870          0.260986
1.570796      0.250965   2.4674          0.394215
1.919862      0.313707   3.6859          0.602274
Σ = 6.2831    1.1921     8.8491          1.5896
Linear Regression
 Example:

 $k_2 = \dfrac{5\sum_{i=1}^{5} T_i \theta_i - \sum_{i=1}^{5} \theta_i \sum_{i=1}^{5} T_i}{5\sum_{i=1}^{5} \theta_i^2 - \left(\sum_{i=1}^{5} \theta_i\right)^2}$

 $k_2 = \dfrac{5(1.5896) - (6.2831)(1.1921)}{5(8.8491) - (6.2831)^2} = 9.6091 \times 10^{-2}$

 $k_1 = \bar{T} - k_2 \bar{\theta} = \dfrac{\sum_{i=1}^{5} T_i}{n} - k_2 \dfrac{\sum_{i=1}^{5} \theta_i}{n}$

 $k_1 = \dfrac{1.1921}{5} - \dfrac{(9.6091 \times 10^{-2})(6.2831)}{5} = 1.1767 \times 10^{-1}$
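As a check, a short sketch (Python; variable names are ours) reproduces these constants from the tabulated data:

```python
# Mousetrap example: fit T = k1 + k2*theta with the same closed-form
# least-squares formulas, using the data from the table above.
theta = [0.698132, 0.959931, 1.134464, 1.570796, 1.919862]   # radians
T = [0.188224, 0.209138, 0.230052, 0.250965, 0.313707]       # N-m

n = len(theta)
s_x, s_y = sum(theta), sum(T)
s_xy = sum(x * y for x, y in zip(theta, T))
s_xx = sum(x * x for x in theta)

k2 = (n * s_xy - s_x * s_y) / (n * s_xx - s_x ** 2)
k1 = s_y / n - k2 * s_x / n
print(k2, k1)   # -> approximately 9.609e-02 and 1.177e-01
```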
Linear Regression
 Given 𝑛 data points (𝑥1, 𝑦1), (𝑥2, 𝑦2), …, (𝑥𝑛, 𝑦𝑛), we want to find the best fit to the data

$y = a_1 x$ (special case with no intercept 𝑎0)

 Is $a_1 = \dfrac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$ still correct? Can it be used here?

 Residual at each point: $\varepsilon_i = y_i - a_1 x_i$

 The least squares criterion:

$S_r = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - a_1 x_i)^2$
Linear Regression
 Differentiate with respect to 𝑎1:

 $\dfrac{dS_r}{da_1} = \sum_{i=1}^{n} 2(y_i - a_1 x_i)(-x_i) = \sum_{i=1}^{n} -2(x_i y_i - a_1 x_i^2) = 0$

 $\dfrac{d^2 S_r}{da_1^2} = \sum_{i=1}^{n} 2 x_i^2 > 0 \;\Rightarrow\; a_1$ corresponds to a local minimum

 $a_1 = \dfrac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$ when there is no intercept 𝑎0, so the two-parameter formula above does not apply
Linear Regression
 Example:

 To find the longitudinal modulus of a composite, the following data are collected. Find the longitudinal modulus 𝐸 using the regression model $\sigma = E\varepsilon$, and the sum of the squares of the residuals.

Strain (%)   Stress (MPa)
0            0
0.183        306
0.36         612
0.5324       917
0.702        1223
0.867        1529
1.0244       1835
1.1774       2140
1.329        2446
1.479        2752
1.5          2767
1.56         2896
Linear Regression
 Example: the table below converts strain from percent to dimensionless form and stress to Pa, and lists the needed summations.

i    ε               σ (Pa)        ε²              εσ
1    0               0             0               0
2    1.83 × 10⁻³     3.06 × 10⁸    3.3489 × 10⁻⁶   5.5998 × 10⁵
3    3.6 × 10⁻³      6.12 × 10⁸    1.296 × 10⁻⁵    2.2031 × 10⁶
4    5.324 × 10⁻³    9.17 × 10⁸    2.8345 × 10⁻⁵   4.8821 × 10⁶
5    7.02 × 10⁻³     1.223 × 10⁹   4.928 × 10⁻⁵    8.5855 × 10⁶
6    8.67 × 10⁻³     1.529 × 10⁹   7.5169 × 10⁻⁵   1.3256 × 10⁷
7    1.0244 × 10⁻²   1.835 × 10⁹   1.0494 × 10⁻⁴   1.8798 × 10⁷
8    1.1774 × 10⁻²   2.140 × 10⁹   1.3863 × 10⁻⁴   2.5196 × 10⁷
9    1.329 × 10⁻²    2.446 × 10⁹   1.7662 × 10⁻⁴   3.2507 × 10⁷
10   1.479 × 10⁻²    2.752 × 10⁹   2.1874 × 10⁻⁴   4.0702 × 10⁷
11   1.5 × 10⁻²      2.767 × 10⁹   2.25 × 10⁻⁴     4.1505 × 10⁷
12   1.56 × 10⁻²     2.896 × 10⁹   2.4336 × 10⁻⁴   4.5178 × 10⁷
Σ                                  1.2764 × 10⁻³   2.3337 × 10⁸
Linear Regression
 Example:

$E = \dfrac{\sum_{i=1}^{n} \sigma_i \varepsilon_i}{\sum_{i=1}^{n} \varepsilon_i^2} = \dfrac{2.3337 \times 10^{8}}{1.2764 \times 10^{-3}} = 1.8284 \times 10^{11}\ \text{Pa}$

 $E = 182.84$ GPa
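A sketch of this computation (Python; the strains are the table's percentages converted to dimensionless values, the stresses converted to Pa):

```python
# No-intercept model sigma = E*epsilon: E = sum(sigma*eps) / sum(eps^2).
eps = [0, 1.83e-3, 3.6e-3, 5.324e-3, 7.02e-3, 8.67e-3,
       1.0244e-2, 1.1774e-2, 1.329e-2, 1.479e-2, 1.5e-2, 1.56e-2]
sig = [0, 3.06e8, 6.12e8, 9.17e8, 1.223e9, 1.529e9,
       1.835e9, 2.140e9, 2.446e9, 2.752e9, 2.767e9, 2.896e9]   # Pa

E = sum(s * e for s, e in zip(sig, eps)) / sum(e * e for e in eps)
print(E / 1e9)   # -> ~182.84 (GPa)
```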
NonLinear Regression
 We will study some nonlinear regression models:

 Exponential model: $y = a e^{bx}$

 Power model: $y = a x^b$

 Saturation growth model: $y = \dfrac{ax}{b+x}$

 Polynomial model: $y = a_0 + a_1 x + \cdots + a_m x^m$
NonLinear Regression: Exponential
 Given 𝑛 data points (𝑥1, 𝑦1), (𝑥2, 𝑦2), …, (𝑥𝑛, 𝑦𝑛), the best fit to the data is $y = f(x) = a e^{bx}$, where 𝑓(𝑥) is a nonlinear function of 𝑥.
NonLinear Regression: Exponential
$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - f(x_i))^2 = \sum_{i=1}^{n} (y_i - a e^{b x_i})^2$

 To find 𝑎 and 𝑏 we minimize $S_r$ with respect to 𝑎 and 𝑏:

 $\dfrac{\partial S_r}{\partial a} = 2\sum_{i=1}^{n} (y_i - a e^{b x_i})(-e^{b x_i}) = 0$

 $\dfrac{\partial S_r}{\partial b} = 2\sum_{i=1}^{n} (y_i - a e^{b x_i})(-a x_i e^{b x_i}) = 0$

 Rewriting the equations:

 $-\sum_{i=1}^{n} y_i e^{b x_i} + a\sum_{i=1}^{n} e^{2 b x_i} = 0$

 $\sum_{i=1}^{n} y_i x_i e^{b x_i} - a\sum_{i=1}^{n} x_i e^{2 b x_i} = 0$
NonLinear Regression: Exponential
$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - f(x_i))^2 = \sum_{i=1}^{n} (y_i - a e^{b x_i})^2$

 Solving for 𝑎 and 𝑏:

 From the first equation: $a = \dfrac{\sum_{i=1}^{n} y_i e^{b x_i}}{\sum_{i=1}^{n} e^{2 b x_i}}$ (we need the value of 𝑏 first in order to calculate 𝑎)

 Substituting 𝑎 into the second equation:

$\sum_{i=1}^{n} y_i x_i e^{b x_i} - \dfrac{\sum_{i=1}^{n} y_i e^{b x_i}}{\sum_{i=1}^{n} e^{2 b x_i}} \sum_{i=1}^{n} x_i e^{2 b x_i} = 0$

 The constant 𝑏 can be found through numerical methods such as the bisection method.
NonLinear Regression: Exponential
 Example:

 Many patients get concerned when a test involves injection of a radioactive material. For example, for scanning a gallbladder, a few drops of Technetium-99m isotope are used. Half of the Technetium-99m would be gone in about 6 hours. It, however, takes about 24 hours for the radiation levels to reach what we are exposed to in day-to-day activities. Below is given the relative intensity of radiation as a function of time.

t (hrs)   0       1       3       5       7       9
γ         1.000   0.891   0.708   0.562   0.447   0.355
NonLinear Regression: Exponential
 Example:

 The relative intensity is related to time by the equation $\gamma = A e^{\lambda t}$

 Find the value of the regression constants 𝐴 and 𝜆

t (hrs)   0       1       3       5       7       9
γ         1.000   0.891   0.708   0.562   0.447   0.355
NonLinear Regression: Exponential
 Example:

 $\gamma = A e^{\lambda t}$

$A = \dfrac{\sum_{i=1}^{n} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{n} e^{2\lambda t_i}}$

$f(\lambda) = \sum_{i=1}^{n} \gamma_i t_i e^{\lambda t_i} - \dfrac{\sum_{i=1}^{n} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{n} e^{2\lambda t_i}} \sum_{i=1}^{n} t_i e^{2\lambda t_i} = 0$
NonLinear Regression: Exponential
 Example:

$f(\lambda) = \sum_{i=1}^{n} \gamma_i t_i e^{\lambda t_i} - \dfrac{\sum_{i=1}^{n} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{n} e^{2\lambda t_i}} \sum_{i=1}^{n} t_i e^{2\lambda t_i} = 0$

 To find the solution we implement the problem in MATLAB (setting up the equation); an equivalent sketch is shown below.
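A minimal Python equivalent of that MATLAB setup. The bisection bracket [−0.5, −0.05] is our assumption (f changes sign on it for this data set); any standard root-finder would do:

```python
import math

t = [0, 1, 3, 5, 7, 9]                          # time (hrs)
g = [1.000, 0.891, 0.708, 0.562, 0.447, 0.355]  # relative intensity gamma

def f(lam):
    # f(lambda) from the slide: what remains of the second normal
    # equation after eliminating A.
    s1 = sum(gi * ti * math.exp(lam * ti) for gi, ti in zip(g, t))
    num = sum(gi * math.exp(lam * ti) for gi, ti in zip(g, t))
    den = sum(math.exp(2 * lam * ti) for ti in t)
    s2 = sum(ti * math.exp(2 * lam * ti) for ti in t)
    return s1 - (num / den) * s2

# Bisection, as the slides suggest; assumed bracket with a sign change.
lo, hi = -0.5, -0.05
for _ in range(60):
    mid = (lo + hi) / 2
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid
lam = (lo + hi) / 2

A = (sum(gi * math.exp(lam * ti) for gi, ti in zip(g, t))
     / sum(math.exp(2 * lam * ti) for ti in t))
print(lam, A)   # -> approximately -0.1151 and 0.9998
```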
NonLinear Regression: Exponential
 Example:

 Result from MATLAB: $\lambda = -0.1151$

 The value of 𝐴 can now be calculated:

$A = \dfrac{\sum_{i=1}^{n} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{n} e^{2\lambda t_i}} = 0.9998$

 The exponential regression model is:

$\gamma = A e^{\lambda t} = 0.9998\, e^{-0.1151 t}$

(Figure: plot of the data and the regression curve.)
NonLinear Regression: Polynomial
 Given 𝑛 data points (𝑥1, 𝑦1), (𝑥2, 𝑦2), …, (𝑥𝑛, 𝑦𝑛), the best fit to the data is the polynomial $y = a_0 + a_1 x + \cdots + a_m x^m$ (with $m \le n - 2$), which is a nonlinear function of 𝑥.
NonLinear Regression: Polynomial
$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - f(x_i))^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - \cdots - a_m x_i^m)^2$

 To find 𝑎0, 𝑎1, …, 𝑎𝑚 we minimize $S_r$ with respect to 𝑎0, 𝑎1, …, 𝑎𝑚:

 $\dfrac{\partial S_r}{\partial a_0} = 2\sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - \cdots - a_m x_i^m)(-1) = 0$

 $\dfrac{\partial S_r}{\partial a_1} = 2\sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - \cdots - a_m x_i^m)(-x_i) = 0$

 …

 $\dfrac{\partial S_r}{\partial a_m} = 2\sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - \cdots - a_m x_i^m)(-x_i^m) = 0$
NonLinear Regression: Polynomial
$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - f(x_i))^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - \cdots - a_m x_i^m)^2$

 Minimizing $S_r$ with respect to 𝑎0, 𝑎1, …, 𝑎𝑚 leads to a system of normal equations; solving this matrix system yields 𝑎0 through 𝑎𝑚:

$\begin{bmatrix} n & \sum_{i=1}^{n} x_i & \cdots & \sum_{i=1}^{n} x_i^m \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 & \cdots & \sum_{i=1}^{n} x_i^{m+1} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_{i=1}^{n} x_i^m & \sum_{i=1}^{n} x_i^{m+1} & \cdots & \sum_{i=1}^{n} x_i^{2m} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \\ \vdots \\ \sum_{i=1}^{n} x_i^m y_i \end{bmatrix}$
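A general sketch of forming and solving this system (Python/NumPy; the helper name poly_normal_fit is ours, not from the slides):

```python
import numpy as np

def poly_normal_fit(x, y, m):
    """Fit y = a0 + a1*x + ... + am*x^m by solving the normal equations above."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # A[j, k] = sum_i x_i^(j+k);  b[j] = sum_i x_i^j * y_i
    A = np.array([[np.sum(x ** (j + k)) for k in range(m + 1)]
                  for j in range(m + 1)])
    b = np.array([np.sum(x ** j * y) for j in range(m + 1)])
    return np.linalg.solve(A, b)          # [a0, a1, ..., am]
```

In practice numpy.polyfit(x, y, m) solves the same least-squares problem with better numerical conditioning; the sketch above mirrors the derivation for clarity.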
NonLinear Regression: Polynomial
 Example:

 Regress the thermal expansion coefficient vs. temperature data to a second order polynomial.

Temperature T (°F)   Coefficient of thermal expansion α (in/in/°F)
80                   6.47 × 10⁻⁶
40                   6.24 × 10⁻⁶
−40                  5.72 × 10⁻⁶
−120                 5.09 × 10⁻⁶
−200                 4.30 × 10⁻⁶
−280                 3.33 × 10⁻⁶
−340                 2.45 × 10⁻⁶
NonLinear Regression: Polynomial
 Example:

 We are to fit the data to the second order polynomial regression model: $\alpha = a_0 + a_1 T + a_2 T^2$

$\begin{bmatrix} n & \sum_{i=1}^{n} T_i & \sum_{i=1}^{n} T_i^2 \\ \sum_{i=1}^{n} T_i & \sum_{i=1}^{n} T_i^2 & \sum_{i=1}^{n} T_i^3 \\ \sum_{i=1}^{n} T_i^2 & \sum_{i=1}^{n} T_i^3 & \sum_{i=1}^{n} T_i^4 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} \alpha_i \\ \sum_{i=1}^{n} T_i \alpha_i \\ \sum_{i=1}^{n} T_i^2 \alpha_i \end{bmatrix}$
NonLinear Regression: Polynomial
 Example: the required summations over the $n = 7$ data points are

$\sum_{i=1}^{7} T_i = -8.6000 \times 10^{2}$

$\sum_{i=1}^{7} T_i^2 = 2.5800 \times 10^{5}$

$\sum_{i=1}^{7} T_i^3 = -7.0472 \times 10^{7}$

$\sum_{i=1}^{7} T_i^4 = 2.1363 \times 10^{10}$

$\sum_{i=1}^{7} \alpha_i = 3.3600 \times 10^{-5}$

$\sum_{i=1}^{7} T_i \alpha_i = -2.6978 \times 10^{-3}$

$\sum_{i=1}^{7} T_i^2 \alpha_i = 8.5013 \times 10^{-1}$
NonLinear Regression: Polynomial
 Example:

$\begin{bmatrix} 7.0000 & -8.6000 \times 10^{2} & 2.5800 \times 10^{5} \\ -8.6000 \times 10^{2} & 2.5800 \times 10^{5} & -7.0472 \times 10^{7} \\ 2.5800 \times 10^{5} & -7.0472 \times 10^{7} & 2.1363 \times 10^{10} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 3.3600 \times 10^{-5} \\ -2.6978 \times 10^{-3} \\ 8.5013 \times 10^{-1} \end{bmatrix}$

 Solving this system gives 𝑎0, 𝑎1, 𝑎2, and the fitted polynomial regression model is:

 $\alpha = a_0 + a_1 T + a_2 T^2 = 6.0217 \times 10^{-6} + 6.2782 \times 10^{-9}\, T - 1.2218 \times 10^{-11}\, T^2$
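Solving the same 3×3 system numerically (a NumPy sketch; the matrix entries are the rounded sums above, so the solution matches the slide values to rounding):

```python
import numpy as np

# Normal-equation system for the thermal-expansion data (sums from above).
A = np.array([[ 7.0000,    -8.6000e2,   2.5800e5 ],
              [-8.6000e2,   2.5800e5,  -7.0472e7 ],
              [ 2.5800e5,  -7.0472e7,   2.1363e10]])
b = np.array([3.3600e-5, -2.6978e-3, 8.5013e-1])

a0, a1, a2 = np.linalg.solve(A, b)
print(a0, a1, a2)   # -> ~6.0217e-06, ~6.2782e-09, ~-1.2218e-11
```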
NonLinear Regression: Transformation of data
 For many nonlinear models, finding the constants directly requires solving simultaneous nonlinear equations.

 For mathematical convenience, the data for such models can sometimes be transformed.

 For example, the data for an exponential model can be transformed:

 $y = a e^{bx} \Rightarrow \ln y = \ln(a e^{bx}) \Rightarrow \ln y = \ln a + \ln e^{bx} \Rightarrow \ln y = \ln a + bx$

 Let $z = \ln y$, $a_0 = \ln a$ and $a_1 = b$; this is how the exponential model is transformed to a linear one.

 Thus we will have the linear regression $z = a_0 + a_1 x$
NonLinear Regression: Transformation of data
 Using linear model regression methods:

$a_1 = \dfrac{n\sum_{i=1}^{n} z_i x_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} z_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$

 $a_0 = \bar{z} - a_1 \bar{x}$

 Once 𝑎0 and 𝑎1 are found, we go back to the original form of the model:

 $a = e^{a_0}$

 $b = a_1$
NonLinear Regression: Transformation of data
 Example:

 Many patients get concerned when a test involves injection of a radioactive material. For example, for scanning a gallbladder, a few drops of Technetium-99m isotope are used. Half of the Technetium-99m would be gone in about 6 hours. It, however, takes about 24 hours for the radiation levels to reach what we are exposed to in day-to-day activities. Below is given the relative intensity of radiation as a function of time.

t (hrs)   0       1       3       5       7       9
γ         1.000   0.891   0.708   0.562   0.447   0.355
NonLinear Regression: Transformation of data
 Example:

 The relative intensity is related to time by the equation $\gamma = A e^{\lambda t}$

 Find the value of the regression constants 𝐴 and 𝜆

t (hrs)   0       1       3       5       7       9
γ         1.000   0.891   0.708   0.562   0.447   0.355
NonLinear Regression: Transformation of data
 Example:

 $\gamma = A e^{\lambda t}$

 $\ln \gamma = \ln(A e^{\lambda t})$

 $\ln \gamma = \ln A + \lambda t$

 Taking $z = a_0 + a_1 t$ as the linear relationship between $z = \ln \gamma$ and 𝑡:

$a_1 = \dfrac{n\sum_{i=1}^{n} t_i z_i - \sum_{i=1}^{n} t_i \sum_{i=1}^{n} z_i}{n\sum_{i=1}^{n} t_i^2 - \left(\sum_{i=1}^{n} t_i\right)^2}$

 $a_0 = \bar{z} - a_1 \bar{t}$
NonLinear Regression: Transformation of data
 Example:

i   t_i      γ_i     z_i = ln γ_i   t_i z_i    t_i²
1   0        1.000   0.00000        0.00000    0.00000
2   1        0.891   −0.11541       −0.11541   1.00000
3   3        0.708   −0.34531       −1.0359    9.00000
4   5        0.562   −0.57625       −2.8813    25.0000
5   7        0.447   −0.80520       −5.6364    49.0000
6   9        0.355   −1.0356        −9.3207    81.0000
Σ   25.000           −2.8778        −18.990    165.00
NonLinear Regression: Transformation of data
 Example:

 $a_1 = \dfrac{n\sum_{i=1}^{n} t_i z_i - \sum_{i=1}^{n} t_i \sum_{i=1}^{n} z_i}{n\sum_{i=1}^{n} t_i^2 - \left(\sum_{i=1}^{n} t_i\right)^2} = \dfrac{6(-18.990) - 25(-2.8778)}{6(165.00) - (25)^2} = -0.11505$

 $a_0 = \bar{z} - a_1 \bar{t} = \dfrac{-2.8778}{6} - \dfrac{(-0.11505)(25)}{6} = -2.6150 \times 10^{-4}$

 $a_0 = \ln A \;\Rightarrow\; A = e^{a_0} = e^{-2.6150 \times 10^{-4}} = 0.99974$

 $\gamma = A e^{\lambda t} = 0.99974\, e^{-0.11505 t}$
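The entire transformed fit takes only a few lines (a Python sketch using the data above):

```python
import math

t = [0, 1, 3, 5, 7, 9]
g = [1.000, 0.891, 0.708, 0.562, 0.447, 0.355]

z = [math.log(gi) for gi in g]      # transform: z = ln(gamma)
n = len(t)
st, sz = sum(t), sum(z)
stz = sum(ti * zi for ti, zi in zip(t, z))
stt = sum(ti * ti for ti in t)

a1 = (n * stz - st * sz) / (n * stt - st ** 2)   # lambda
a0 = sz / n - a1 * st / n                        # ln(A)
print(a1, math.exp(a0))   # -> approximately -0.11505 and 0.99974
```

Note that this transformed fit minimizes the squared residuals of ln γ, not of γ itself, which is why λ here (−0.11505) differs slightly from the direct nonlinear fit (−0.1151).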
Adequacy of Linear Regression
Given some data, is a straight-line model adequate?
Adequacy of Linear Regression
 Given some data, is a straight-line model adequate?

 We should check (the steps for assessing the adequacy of a linear regression):

 Plot the data and the model

 Find the standard error of estimate

 Calculate the coefficient of determination

 Check whether the model meets the assumption of random errors

 Thus we can answer questions such as: Does the model describe the data adequately? How well does the model predict the response variable?
Adequacy of Linear Regression
 Example:

 Check the adequacy of the straight line model $\alpha = a_0 + a_1 T$ for the given data.

T (°F)   α (μin/in/°F)
−340     2.45
−260     3.58
−180     4.52
−100     5.28
−20      5.86
60       6.36
Adequacy of Linear Regression
 Example:

 1) Plot the data and the model

 $\alpha = a_0 + a_1 T$

 Least-squares regression on the six data points gives:

 $\alpha = 6.0325 + 0.0096964\, T$

T (°F)   α (μin/in/°F)
−340     2.45
−260     3.58
−180     4.52
−100     5.28
−20      5.86
60       6.36
Adequacy of Linear Regression
 Example:

 2) Standard error of estimate:

 $S_{\alpha/T} = \sqrt{\dfrac{S_r}{n-2}}$

 $S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (\alpha_i - a_0 - a_1 T_i)^2$

T_i    α_i    a_0 + a_1 T_i   α_i − a_0 − a_1 T_i
−340   2.45   2.7357          −0.28571
−260   3.58   3.5114          0.068571
−180   4.52   4.2871          0.23286
−100   5.28   5.0629          0.21714
−20    5.86   5.8386          0.021429
60     6.36   6.6143          −0.25429

 $S_r = 0.25283$, so $S_{\alpha/T} = \sqrt{0.25283/4} = 0.25141$

 $\text{Scaled residual} = \dfrac{E_i}{S_{\alpha/T}} = \dfrac{\alpha_i - a_0 - a_1 T_i}{S_{\alpha/T}}$
Adequacy of Linear Regression
 Example:

 2) Standard error of estimate:

 $S_{\alpha/T} = \sqrt{\dfrac{S_r}{n-2}} = 0.25141$

T_i    α_i    Residual   Scaled Residual
−340   2.45   −0.28571   −1.1364
−260   3.58   0.068571   0.27275
−180   4.52   0.23286    0.92622
−100   5.28   0.21714    0.86369
−20    5.86   0.021429   0.085235
60     6.36   −0.25429   −1.0115
Adequacy of Linear Regression
 Example:

 3) Find the coefficient of determination:

 $S_t = \sum_{i=1}^{n} (\alpha_i - \bar{\alpha})^2$

 $S_r = \sum_{i=1}^{n} (\alpha_i - a_0 - a_1 T_i)^2$

 $r^2 = \dfrac{S_t - S_r}{S_t}; \quad 0 \le r^2 \le 1$
Adequacy of Linear Regression
 Example:

 3) Find the coefficient of determination:

T_i    α_i    α_i − ᾱ
−340   2.45   −2.2250
−260   3.58   −1.0950
−180   4.52   0.15500
−100   5.28   0.60500
−20    5.86   1.1850
60     6.36   1.6850

 $\bar{\alpha} = 4.6750$

 $S_t = 10.783$

 $S_r = 0.25283$

 $r^2 = \dfrac{S_t - S_r}{S_t} = 0.97655$
Adequacy of Linear Regression
 Example:

 3) Find the correlation coefficient:

$r = \sqrt{\dfrac{S_t - S_r}{S_t}} = 0.98820 \;\Rightarrow\; \text{very strong relationship}$

Interpretation of |r|:
0.8 to 1.0: very strong relationship
0.6 to 0.8: strong relationship
0.4 to 0.6: moderate relationship
0.2 to 0.4: weak relationship
0.0 to 0.2: very weak or no relationship
Adequacy of Linear Regression
 When the range of the 𝑥 values in a plot of 𝑦 increases, the coefficient of determination tends to increase.

 A vertical regression line can make the coefficient of determination appear high, but this might not reflect the true relationship between the variables.

 A high coefficient of determination doesn't necessarily mean the linear model is suitable or accurate.

 Just because the coefficient of determination is high, it doesn't guarantee that the regression model will make accurate predictions.
Adequacy of Linear Regression
 Example:

 4) Check whether the model meets the assumption of random errors (inspect the scaled residuals from step 2 for any systematic pattern)

$r = \sqrt{\dfrac{S_t - S_r}{S_t}} = 0.98820$
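A sketch gathering the adequacy computations for this example (Python; the model coefficients are those found in step 1):

```python
import math

T = [-340, -260, -180, -100, -20, 60]           # temperature (deg F)
alpha = [2.45, 3.58, 4.52, 5.28, 5.86, 6.36]    # micro-in/in/deg F
a0, a1 = 6.0325, 0.0096964                      # regression model from step 1

pred = [a0 + a1 * Ti for Ti in T]
res = [ai - pi for ai, pi in zip(alpha, pred)]   # residuals E_i

Sr = sum(e * e for e in res)                     # sum of squared residuals
s_est = math.sqrt(Sr / (len(T) - 2))             # standard error of estimate
abar = sum(alpha) / len(alpha)
St = sum((ai - abar) ** 2 for ai in alpha)       # total sum of squares
r2 = (St - Sr) / St                              # coefficient of determination

print(s_est, r2)                  # -> ~0.25141, ~0.97655
print([e / s_est for e in res])   # scaled residuals, step 2
```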