
IMSE2132

Bill KP Chan
Office: HW8-14
Tel.: 3917 7059
Email: billchan@hku.hk
Example 1  Fit a simple linear regression model of the form $E(Y) = \beta_0 + \beta_1 x$ from the data in the table.

    x   0   1   2
    y   1   2   2

Solution  From the data, we may write the SSE as

$SSE = (b_0 + b_1\times 0 - 1)^2 + (b_0 + b_1\times 1 - 2)^2 + (b_0 + b_1\times 2 - 2)^2$
$\phantom{SSE} = (b_0 - 1)^2 + (b_0 + b_1 - 2)^2 + (b_0 + 2b_1 - 2)^2$

Setting $\partial SSE/\partial b_0 = 0$ and $\partial SSE/\partial b_1 = 0$ gives

$\Rightarrow \begin{cases}(b_0 - 1) + (b_0 + b_1 - 2) + (b_0 + 2b_1 - 2) = 0\\ 0 + (b_0 + b_1 - 2) + 2(b_0 + 2b_1 - 2) = 0\end{cases}$

$\Rightarrow \begin{cases}b_0(1+1+1) + b_1(0+1+2) = 1+2+2\\ b_0(0+1+2) + b_1(0+1+4) = 0+2+4\end{cases}$

$\Rightarrow \begin{pmatrix}1+1+1 & 0+1+2\\ 0+1+2 & 0+1+4\end{pmatrix}\begin{pmatrix}b_0\\ b_1\end{pmatrix} = \begin{pmatrix}1&1&1\\ 0&1&2\end{pmatrix}\begin{pmatrix}1\\ 2\\ 2\end{pmatrix}$

$\Rightarrow \begin{pmatrix}1&1&1\\ 0&1&2\end{pmatrix}\begin{pmatrix}1&0\\ 1&1\\ 1&2\end{pmatrix}\begin{pmatrix}b_0\\ b_1\end{pmatrix} = \begin{pmatrix}1&1&1\\ 0&1&2\end{pmatrix}\begin{pmatrix}1\\ 2\\ 2\end{pmatrix}$

$\Rightarrow \mathbf{X}^\mathsf{T}\mathbf{X}\mathbf{b} = \mathbf{X}^\mathsf{T}\mathbf{y}$, where $\mathbf{y} = \begin{pmatrix}1\\ 2\\ 2\end{pmatrix}$, $\mathbf{X} = \begin{pmatrix}1&0\\ 1&1\\ 1&2\end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix}b_0\\ b_1\end{pmatrix}$.
Example 1 (continued)

Solution  $\mathbf{X}^\mathsf{T}\mathbf{X}\mathbf{b} = \mathbf{X}^\mathsf{T}\mathbf{y}$, where $\mathbf{y} = \begin{pmatrix}1\\ 2\\ 2\end{pmatrix}$, $\mathbf{X} = \begin{pmatrix}1&0\\ 1&1\\ 1&2\end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix}b_0\\ b_1\end{pmatrix}$.

Note that $\mathbf{X}^\mathsf{T}\mathbf{X} = \begin{pmatrix}1&1&1\\ 0&1&2\end{pmatrix}\begin{pmatrix}1&0\\ 1&1\\ 1&2\end{pmatrix} = \begin{pmatrix}3&3\\ 3&5\end{pmatrix}$ is a square matrix.

If $\mathbf{X}^\mathsf{T}\mathbf{X}$ is invertible, then $(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}$ exists. Hence, $(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{X}\mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y} \Rightarrow$

$\mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y}$

Hence, we have $\mathbf{b} = \begin{pmatrix}b_0\\ b_1\end{pmatrix} = \begin{pmatrix}3&3\\ 3&5\end{pmatrix}^{-1}\begin{pmatrix}1&1&1\\ 0&1&2\end{pmatrix}\begin{pmatrix}1\\ 2\\ 2\end{pmatrix} = \begin{pmatrix}7/6\\ 1/2\end{pmatrix}$

i.e., the estimator of $\beta_0$ is $b_0 = \frac{7}{6} \approx 1.167$ and the estimator of $\beta_1$ is $b_1 = \frac{1}{2} = 0.5$.
Example 1 (continued)

Solution  The fitted regression line is $\hat{y} = 1.167 + 0.5x$.

[Scatterplot of the data with the fitted line y = 0.5x + 1.1667]
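For reference, the same least-squares calculation can be reproduced in Python with NumPy, one of the packages mentioned later in these notes. This is a minimal sketch of the normal-equation computation, not part of the original worked solution.

```python
import numpy as np

# Data from Example 1: x = 0, 1, 2 and y = 1, 2, 2
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])        # design matrix with a leading column of ones
y = np.array([1.0, 2.0, 2.0])

# Normal equations: b = (X^T X)^(-1) X^T y
b = np.linalg.inv(X.T @ X) @ (X.T @ y)
print(b)                          # approximately [1.1667, 0.5], i.e. b0 = 7/6 and b1 = 1/2
```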
Fit a simple linear regression model of the form $E(Y) = \beta_0 + \beta_1 x$ from the data in the table. Estimate the value of $Y$ when $x = 1.5$.

    x   1   2   3
    y   4   3   1

Step 1: Write the vector $\mathbf{y}$.
Step 2: Write the matrix $\mathbf{X}$.
Step 3: Write the matrix $\mathbf{X}^\mathsf{T}$.
Step 4: Calculate $\mathbf{X}^\mathsf{T}\mathbf{X}$. (How many rows? How many columns? Is it a square matrix?)
Step 5: Find $(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}$.
Step 6: Let $\mathbf{b} = \begin{pmatrix}b_0\\ b_1\end{pmatrix}$. Find $b_0$ and $b_1$ using $\mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y}$.
Step 7: Write the equation of the linear model ($\hat{y} = b_0 + b_1 x$).
Steps 1 to 4:

$\mathbf{y} = \begin{pmatrix}4\\ 3\\ 1\end{pmatrix}$, $\quad \mathbf{X} = \begin{pmatrix}1&1\\ 1&2\\ 1&3\end{pmatrix}$, $\quad \mathbf{X}^\mathsf{T} = \begin{pmatrix}1&1&1\\ 1&2&3\end{pmatrix}$ (the transpose of matrix $\mathbf{X}$)

$\mathbf{X}^\mathsf{T}\mathbf{X} = \begin{pmatrix}1&1&1\\ 1&2&3\end{pmatrix}\begin{pmatrix}1&1\\ 1&2\\ 1&3\end{pmatrix} = \begin{pmatrix}1{\times}1+1{\times}1+1{\times}1 & 1{\times}1+1{\times}2+1{\times}3\\ 1{\times}1+2{\times}1+3{\times}1 & 1{\times}1+2{\times}2+3{\times}3\end{pmatrix} = \begin{pmatrix}3&6\\ 6&14\end{pmatrix}$, a 2×2 square matrix.
Step 5: Find $(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}$, where $\mathbf{X}^\mathsf{T}\mathbf{X} = \begin{pmatrix}3&6\\ 6&14\end{pmatrix}$.

If $\mathbf{A}$ is an invertible 2×2 matrix such that $\mathbf{A} = \begin{pmatrix}a_{11}&a_{12}\\ a_{21}&a_{22}\end{pmatrix}$,
then $\mathbf{A}^{-1} = \dfrac{1}{\det(\mathbf{A})}\begin{pmatrix}a_{22}&-a_{12}\\ -a_{21}&a_{11}\end{pmatrix}$, where $\det(\mathbf{A}) = \begin{vmatrix}a_{11}&a_{12}\\ a_{21}&a_{22}\end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$.

$(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} = \begin{pmatrix}3&6\\ 6&14\end{pmatrix}^{-1} = \dfrac{1}{3\times 14 - 6\times 6}\begin{pmatrix}14&-6\\ -6&3\end{pmatrix} = \dfrac{1}{6}\begin{pmatrix}14&-6\\ -6&3\end{pmatrix} = \begin{pmatrix}7/3&-1\\ -1&1/2\end{pmatrix}$

Note
The inverse of $\mathbf{A}$ is denoted by $\mathbf{A}^{-1}$ and is an $n\times n$ matrix such that $\mathbf{A}^{-1}\mathbf{A} = \mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$, where $\mathbf{I}$ is the $n\times n$ identity matrix.
For $n = 2$, $\mathbf{I} = \begin{pmatrix}1&0\\ 0&1\end{pmatrix}$. For $n = 3$, $\mathbf{I} = \begin{pmatrix}1&0&0\\ 0&1&0\\ 0&0&1\end{pmatrix}$.
Note that $\mathbf{A}$ is invertible if and only if $\det(\mathbf{A}) \neq 0$.
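As a quick check of the 2×2 inverse formula, the same determinant and inverse can be computed with NumPy; a small sketch using the $\mathbf{X}^\mathsf{T}\mathbf{X}$ matrix from this exercise.

```python
import numpy as np

A = np.array([[3.0, 6.0],
              [6.0, 14.0]])       # X^T X from the exercise above

print(np.linalg.det(A))           # 6.0, nonzero, so A is invertible
print(np.linalg.inv(A))           # [[ 2.3333, -1.0], [-1.0, 0.5]], i.e. [[7/3, -1], [-1, 1/2]]
```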
Steps 6 and 7:

$(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} = \begin{pmatrix}7/3&-1\\ -1&1/2\end{pmatrix}$, $\quad \mathbf{X}^\mathsf{T} = \begin{pmatrix}1&1&1\\ 1&2&3\end{pmatrix}$, $\quad \mathbf{y} = \begin{pmatrix}4\\ 3\\ 1\end{pmatrix}$

$\mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y} = \begin{pmatrix}7/3&-1\\ -1&1/2\end{pmatrix}\begin{pmatrix}1&1&1\\ 1&2&3\end{pmatrix}\begin{pmatrix}4\\ 3\\ 1\end{pmatrix} = \begin{pmatrix}b_0\\ b_1\end{pmatrix} = \begin{pmatrix}17/3\\ -3/2\end{pmatrix}$

Hence, the linear model is $\hat{y} = \dfrac{17}{3} - \dfrac{3}{2}x$.

When $x = 1.5$, $\hat{y} = \dfrac{17}{3} - \dfrac{3}{2}\times 1.5 = \dfrac{41}{12} \approx 3.42$.

[Scatterplot of the data with the fitted line y = -1.5x + 5.6667]
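Steps 1 to 7 can also be carried out in a few lines of NumPy; a minimal sketch that reproduces the answer above.

```python
import numpy as np

# Exercise data: x = 1, 2, 3 and y = 4, 3, 1
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([4.0, 3.0, 1.0])

XtX = X.T @ X                         # Step 4: [[3, 6], [6, 14]]
b = np.linalg.inv(XtX) @ (X.T @ y)    # Steps 5 and 6: b = (X^T X)^(-1) X^T y
print(b)                              # approximately [5.6667, -1.5], i.e. b0 = 17/3, b1 = -3/2

x_new = 1.5
print(b[0] + b[1] * x_new)            # Step 7 prediction at x = 1.5: 41/12, about 3.4167
```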
If $\mathbf{A}$ is an invertible 3×3 matrix such that $\mathbf{A} = \begin{pmatrix}a_{11}&a_{12}&a_{13}\\ a_{21}&a_{22}&a_{23}\\ a_{31}&a_{32}&a_{33}\end{pmatrix}$,
then

$\mathbf{A}^{-1} = \dfrac{1}{\det(\mathbf{A})}\begin{pmatrix}\begin{vmatrix}a_{22}&a_{23}\\ a_{32}&a_{33}\end{vmatrix} & -\begin{vmatrix}a_{21}&a_{23}\\ a_{31}&a_{33}\end{vmatrix} & \begin{vmatrix}a_{21}&a_{22}\\ a_{31}&a_{32}\end{vmatrix}\\[6pt] -\begin{vmatrix}a_{12}&a_{13}\\ a_{32}&a_{33}\end{vmatrix} & \begin{vmatrix}a_{11}&a_{13}\\ a_{31}&a_{33}\end{vmatrix} & -\begin{vmatrix}a_{11}&a_{12}\\ a_{31}&a_{32}\end{vmatrix}\\[6pt] \begin{vmatrix}a_{12}&a_{13}\\ a_{22}&a_{23}\end{vmatrix} & -\begin{vmatrix}a_{11}&a_{13}\\ a_{21}&a_{23}\end{vmatrix} & \begin{vmatrix}a_{11}&a_{12}\\ a_{21}&a_{22}\end{vmatrix}\end{pmatrix}^{\mathsf{T}}$

where

$\det(\mathbf{A}) = \begin{vmatrix}a_{11}&a_{12}&a_{13}\\ a_{21}&a_{22}&a_{23}\\ a_{31}&a_{32}&a_{33}\end{vmatrix} = a_{11}\begin{vmatrix}a_{22}&a_{23}\\ a_{32}&a_{33}\end{vmatrix} - a_{12}\begin{vmatrix}a_{21}&a_{23}\\ a_{31}&a_{33}\end{vmatrix} + a_{13}\begin{vmatrix}a_{21}&a_{22}\\ a_{31}&a_{32}\end{vmatrix}$

e.g.

$\det\begin{pmatrix}2&1&2\\ 3&2&2\\ 1&2&3\end{pmatrix} = 2\begin{vmatrix}2&2\\ 2&3\end{vmatrix} - 1\begin{vmatrix}3&2\\ 1&3\end{vmatrix} + 2\begin{vmatrix}3&2\\ 1&2\end{vmatrix} = 2(2) - 1(7) + 2(4) = 5$

$\begin{pmatrix}2&1&2\\ 3&2&2\\ 1&2&3\end{pmatrix}^{-1} = \dfrac{1}{5}\begin{pmatrix}2&1&-2\\ -7&4&2\\ 4&-3&1\end{pmatrix}$
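The 3×3 example can be verified numerically; a small NumPy sketch (multiplying the inverse by the determinant recovers the matrix shown above).

```python
import numpy as np

A = np.array([[2.0, 1.0, 2.0],
              [3.0, 2.0, 2.0],
              [1.0, 2.0, 3.0]])

print(np.linalg.det(A))      # approximately 5.0
print(np.linalg.inv(A) * 5)  # [[ 2,  1, -2],
                             #  [-7,  4,  2],
                             #  [ 4, -3,  1]], i.e. 5 times the inverse above
```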
• Regression analysis is a collection of statistical techniques for studying the relationship between a response of interest $Y$ and a set of predictor variables $x_1, x_2, \ldots, x_k$, called the regressors.
• The linear regression model is an important type of regression model:
  $E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k = \mathbf{x}^\mathsf{T}\boldsymbol{\beta}$
  where $\mathbf{x} = (1, x_1, \ldots, x_k)^\mathsf{T}$ and $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_k)^\mathsf{T}$.
• The response $Y$ is a linear function of the unknown model parameters $\beta_0, \beta_1, \ldots, \beta_k$, called the regression coefficients.
• Note that for $k$ regressors $x_i$, there are $k + 1$ regression coefficients.
• Note also that the first element of the vector $\mathbf{x}$ is always 1.
• Let $y_i$ be the observed value of the response at $\mathbf{x}_i = (1, x_{i1}, \ldots, x_{ik})^\mathsf{T}$.
• Thus, $\mathbf{y} = (y_1, y_2, \ldots, y_n)^\mathsf{T}$ is the vector of observations at

  $\mathbf{X} = \begin{pmatrix}1 & x_{11} & \cdots & x_{1k}\\ 1 & x_{21} & \cdots & x_{2k}\\ \vdots & \vdots & \ddots & \vdots\\ 1 & x_{n1} & \cdots & x_{nk}\end{pmatrix} = \begin{pmatrix}\mathbf{x}_1^\mathsf{T}\\ \mathbf{x}_2^\mathsf{T}\\ \vdots\\ \mathbf{x}_n^\mathsf{T}\end{pmatrix}$

• Hence, $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i = \mathbf{x}_i^\mathsf{T}\boldsymbol{\beta} + \varepsilon_i$, where $\varepsilon_i \sim N(0, \sigma^2)$ is the random difference between $y_i$ and its expected value.
• Note that all errors $\varepsilon_i$ are assumed to be independent and normally distributed with zero mean and equal variance $\sigma^2$.
• If $\boldsymbol{\varepsilon} = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)^\mathsf{T}$, we can write $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$.
Let $\mathbf{b} = (b_0, b_1, \ldots, b_k)^\mathsf{T}$. If $\mathbf{b}$ is an estimate for $\boldsymbol{\beta}$, the method of least squares requires $\mathbf{b}$ to be chosen so that

$SSE = \lVert\mathbf{X}\mathbf{b} - \mathbf{y}\rVert^2 = (\mathbf{X}\mathbf{b} - \mathbf{y})^\mathsf{T}(\mathbf{X}\mathbf{b} - \mathbf{y}) = \sum_{i=1}^{n}(b_0 + b_1 x_{i1} + \cdots + b_k x_{ik} - y_i)^2$ is minimized.

• Then, $\partial SSE/\partial b_0 = 0,\ \partial SSE/\partial b_1 = 0,\ \ldots,\ \partial SSE/\partial b_k = 0$ gives

$\sum_{i=1}^{n}(b_0 + b_1 x_{i1} + \cdots + b_k x_{ik} - y_i) = 0 \;\Rightarrow\; \sum_{i=1}^{n}(b_0 + b_1 x_{i1} + \cdots + b_k x_{ik}) = \sum_{i=1}^{n} y_i$
$\sum_{i=1}^{n} x_{i1}(b_0 + b_1 x_{i1} + \cdots + b_k x_{ik} - y_i) = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_{i1}(b_0 + b_1 x_{i1} + \cdots + b_k x_{ik}) = \sum_{i=1}^{n} x_{i1} y_i$
$\qquad\vdots$
$\sum_{i=1}^{n} x_{ik}(b_0 + b_1 x_{i1} + \cdots + b_k x_{ik} - y_i) = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_{ik}(b_0 + b_1 x_{i1} + \cdots + b_k x_{ik}) = \sum_{i=1}^{n} x_{ik} y_i$

$\Rightarrow \begin{pmatrix}1 & 1 & \cdots & 1\\ x_{11} & x_{21} & \cdots & x_{n1}\\ \vdots & \vdots & \ddots & \vdots\\ x_{1k} & x_{2k} & \cdots & x_{nk}\end{pmatrix}\begin{pmatrix}1 & x_{11} & \cdots & x_{1k}\\ 1 & x_{21} & \cdots & x_{2k}\\ \vdots & \vdots & \ddots & \vdots\\ 1 & x_{n1} & \cdots & x_{nk}\end{pmatrix}\begin{pmatrix}b_0\\ b_1\\ \vdots\\ b_k\end{pmatrix} = \begin{pmatrix}1 & 1 & \cdots & 1\\ x_{11} & x_{21} & \cdots & x_{n1}\\ \vdots & \vdots & \ddots & \vdots\\ x_{1k} & x_{2k} & \cdots & x_{nk}\end{pmatrix}\begin{pmatrix}y_1\\ y_2\\ \vdots\\ y_n\end{pmatrix} \;\Rightarrow\; \mathbf{X}^\mathsf{T}\mathbf{X}\mathbf{b} = \mathbf{X}^\mathsf{T}\mathbf{y}$

• Suppose $\mathbf{X}^\mathsf{T}\mathbf{X}$ is invertible and $(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}$ exists. Then, we have
$(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{X}\mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y} \;\Rightarrow\; \mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y}$

• Since $E(\mathbf{b}) = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\begin{pmatrix}E(y_1)\\ \vdots\\ E(y_n)\end{pmatrix} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\begin{pmatrix}\mathbf{x}_1^\mathsf{T}\boldsymbol{\beta}\\ \vdots\\ \mathbf{x}_n^\mathsf{T}\boldsymbol{\beta}\end{pmatrix} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{X}\boldsymbol{\beta} = \boldsymbol{\beta}$,
$\mathbf{b}$ is an unbiased estimator of $\boldsymbol{\beta}$.
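A computational note: forming $(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}$ explicitly can be numerically delicate when $\mathbf{X}^\mathsf{T}\mathbf{X}$ is close to singular, so library routines usually solve the least-squares problem directly. The sketch below uses NumPy's least-squares solver; the function name fit_linear_model is illustrative only, not something defined in these notes.

```python
import numpy as np

def fit_linear_model(X, y):
    """Return the least-squares estimate b for the model y = X b + e."""
    # lstsq minimizes ||X b - y||^2 without explicitly inverting X^T X
    b, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
    return b

# Small check against the normal-equation formula b = (X^T X)^(-1) X^T y
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 2.0, 2.0])
print(fit_linear_model(X, y))              # approximately [1.1667, 0.5]
print(np.linalg.inv(X.T @ X) @ (X.T @ y))  # same result
```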
Example 2  Fit a linear regression model of the form $E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ from the data in the table below. Then, estimate $Y$ when $x_1 = 1$ and $x_2 = 1$.

    x1  0  1  0  2
    x2  0  0  1  2
    y   1  2  3  3

Solution  Given, $\mathbf{y} = \begin{pmatrix}1\\ 2\\ 3\\ 3\end{pmatrix}$ and $\mathbf{X} = \begin{pmatrix}1&0&0\\ 1&1&0\\ 1&0&1\\ 1&2&2\end{pmatrix}$.

$\therefore\ \mathbf{X}^\mathsf{T}\mathbf{X} = \begin{pmatrix}1&1&1&1\\ 0&1&0&2\\ 0&0&1&2\end{pmatrix}\begin{pmatrix}1&0&0\\ 1&1&0\\ 1&0&1\\ 1&2&2\end{pmatrix} = \begin{pmatrix}4&3&3\\ 3&5&4\\ 3&4&5\end{pmatrix}$.

Hence, $(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} = \dfrac{1}{18}\begin{pmatrix}9&-3&-3\\ -3&11&-7\\ -3&-7&11\end{pmatrix}$

$\Rightarrow\ \mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y} = \dfrac{1}{18}\begin{pmatrix}9&-3&-3\\ -3&11&-7\\ -3&-7&11\end{pmatrix}\begin{pmatrix}1&1&1&1\\ 0&1&0&2\\ 0&0&1&2\end{pmatrix}\begin{pmatrix}1\\ 2\\ 3\\ 3\end{pmatrix} = \begin{pmatrix}5/3\\ -1/9\\ 8/9\end{pmatrix}$.

Hence, $\hat{\beta}_0 = b_0 = 5/3$, $\hat{\beta}_1 = b_1 = -1/9$ and $\hat{\beta}_2 = b_2 = 8/9$, and the regression model is $\hat{y} = \dfrac{5}{3} - \dfrac{1}{9}x_1 + \dfrac{8}{9}x_2$.

Thus, when $x_1 = 1$ and $x_2 = 1$, the estimated value for $\hat{y}$ is $\dfrac{5}{3} - \dfrac{1}{9}\times 1 + \dfrac{8}{9}\times 1 = \dfrac{22}{9} \approx 2.44$.
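The same estimates can be reproduced with NumPy; a minimal sketch for this two-regressor example.

```python
import numpy as np

# Example 2 data: x1 = 0, 1, 0, 2; x2 = 0, 0, 1, 2; y = 1, 2, 3, 3
X = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 2.0, 2.0]])
y = np.array([1.0, 2.0, 3.0, 3.0])

b = np.linalg.inv(X.T @ X) @ (X.T @ y)
print(b)                              # approximately [1.6667, -0.1111, 0.8889] = [5/3, -1/9, 8/9]

x_new = np.array([1.0, 1.0, 1.0])     # intercept term, x1 = 1, x2 = 1
print(x_new @ b)                      # 22/9, about 2.4444
```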
Example 3  Fit a linear regression model of the form $E(Y) = \beta_0 + \beta_1 x$ from the data given in the following table. Estimate the value of $Y$ when $x = 2$.

    i    1    2    3    4    5     6     7     8     9
    x_i  1.5  1.8  2.4  3    3.5   3.9   4.4   4.8   5
    y_i  4.8  5.7  7    8.3  10.9  12.4  13.1  13.6  15.3

Solution  Given, $\mathbf{y} = (4.8,\ 5.7,\ 7,\ 8.3,\ 10.9,\ 12.4,\ 13.1,\ 13.6,\ 15.3)^\mathsf{T}$ and
$\mathbf{X}^\mathsf{T} = \begin{pmatrix}1&1&1&1&1&1&1&1&1\\ 1.5&1.8&2.4&3&3.5&3.9&4.4&4.8&5\end{pmatrix}$.
$\therefore\ \mathbf{X}^\mathsf{T}\mathbf{X} = \begin{pmatrix}9&30.3\\ 30.3&115.11\end{pmatrix}$

Hence, $(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} = \dfrac{1}{9\times 115.11 - 30.3\times 30.3}\begin{pmatrix}115.11&-30.3\\ -30.3&9\end{pmatrix} = \dfrac{1}{117.9}\begin{pmatrix}115.11&-30.3\\ -30.3&9\end{pmatrix} = \begin{pmatrix}0.97634&-0.257\\ -0.257&0.07634\end{pmatrix}$

Also, $\mathbf{X}^\mathsf{T}\mathbf{y} = \begin{pmatrix}91.1\\ 345.09\end{pmatrix}$.

Hence, $\mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y} = \begin{pmatrix}0.97634&-0.257\\ -0.257&0.07634\end{pmatrix}\begin{pmatrix}91.1\\ 345.09\end{pmatrix} = \begin{pmatrix}0.26\\ 2.93\end{pmatrix}$.

Thus, $\hat{\beta}_0 = b_0 = 0.26$ and $\hat{\beta}_1 = b_1 = 2.93$.
When $x = 2$, $\hat{y} = 0.26 + 2.93\times 2 = 6.12$.
Example 3 (continued)

The regression line is $\hat{y} = 0.26 + 2.93x$.

[Scatterplot of the data with the fitted line y = 2.9303x + 0.2569]
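For comparison, a short NumPy sketch that reproduces the fitted line for Example 3.

```python
import numpy as np

x = np.array([1.5, 1.8, 2.4, 3, 3.5, 3.9, 4.4, 4.8, 5])
y = np.array([4.8, 5.7, 7, 8.3, 10.9, 12.4, 13.1, 13.6, 15.3])

X = np.column_stack([np.ones_like(x), x])   # 9x2 design matrix
b = np.linalg.inv(X.T @ X) @ (X.T @ y)
print(b)                                    # approximately [0.2569, 2.9303]
print(b[0] + b[1] * 2)                      # about 6.12 when x = 2
```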
• When the number of observations or the number of regression coefficients is large, the dimensions of the matrices also increase. Using a computer to carry out the matrix operations can greatly improve both the efficiency and the accuracy of the calculation.
• There are many commercial and open-source software packages available that can help the user compute matrix operations without going into the details. Computational packages such as MINITAB, SPSS, MATLAB, Octave, R and Python (NumPy) are some examples for this purpose.

• Microsoft Excel also includes built-in functions for matrix operations in its standard spreadsheet environment. Some common matrix operations such as transpose (=TRANSPOSE(array)), inverse (=MINVERSE(array)) and multiplication (=MMULT(array1, array2)) can be found in Excel.
• The Excel file "Linear regression models (examples).xlsx" shows how the examples in these notes can be solved using Excel functions.
• Documentation for general Excel functions can be found on Microsoft's website. Please refer to
  https://support.office.com/en-us/article/Formulas-and-functions-294d9486-b332-48ed-b489-abe7d0f9eda9#ID0EAABAAA=Formulas
Notation

Transpose  $\mathbf{X}^\mathsf{T}$ is the transpose of $\mathbf{X}$.

Let $\mathbf{X} = \begin{pmatrix}1&0&0\\ 1&1&0\\ 1&0&1\\ 1&2&2\end{pmatrix}$. Note that $\mathbf{X}$ is a 4×3 matrix. The matrix and its transpose can be defined in an Excel file as follows.

Note
If $\mathbf{X}$ is an $n\times m$ matrix, then $\mathbf{X}^\mathsf{T}$ is an $m\times n$ matrix such that the element in the $i$th row and $j$th column of $\mathbf{X}^\mathsf{T}$ is equal to the element in the $j$th row and $i$th column of $\mathbf{X}$.

To define the transpose of $\mathbf{X}$:
1. Highlight a 3×4 array of cells.
2. Type the following function in the cell at the top-left corner: =TRANSPOSE(<range>)
3. Select <range> to be the array of cells of the 4×3 matrix $\mathbf{X}$.
4. Press Ctrl-Shift-Enter.
Notation

Inverse  $\mathbf{X}^{-1}$ is the inverse of $\mathbf{X}$.

Let $\mathbf{X} = \begin{pmatrix}1&0&1&0\\ 1&1&2&0\\ 1&0&1&1\\ 1&2&1&2\end{pmatrix}$. Note that if we want to find the inverse of $\mathbf{X}$, the matrix must be square and invertible, i.e., $\det(\mathbf{X}) \neq 0$.

The matrix and its inverse can be defined in an Excel file as follows.

Note
If $\mathbf{X}$ is an $n\times n$ invertible matrix such that $\mathbf{X}^{-1}$ exists, we have $\mathbf{X}^{-1}\mathbf{X} = \mathbf{X}\mathbf{X}^{-1} = \mathbf{I}$, where $\mathbf{I}$ is the $n\times n$ identity matrix.

To define the inverse of $\mathbf{X}$:
1. Highlight a 4×4 array of cells.
2. Type the following function in the cell at the top-left corner: =MINVERSE(<range>)
3. Select <range> to be the array of cells of the 4×4 matrix $\mathbf{X}$.
4. Press Ctrl-Shift-Enter.
Multiplication

Let $\mathbf{X} = \begin{pmatrix}1&1&1\\ 2&3&1\end{pmatrix}$ and $\mathbf{y} = \begin{pmatrix}1\\ 2\\ 3\end{pmatrix}$. Note that $\mathbf{X}$ is a 2×3 matrix and $\mathbf{y}$ is a 3×1 matrix.

The matrix multiplication $\mathbf{X}\mathbf{y}$ can be defined in an Excel file as follows.

Note
If $\mathbf{X}$ is an $n\times m$ matrix and $\mathbf{y}$ is an $m\times q$ matrix, then $\mathbf{X}\mathbf{y}$ is an $n\times q$ matrix. Moreover, if $a_{ij}$ is the element in the $i$th row and $j$th column of $\mathbf{X}$, $b_{jk}$ is the element in the $j$th row and $k$th column of $\mathbf{y}$, and $c_{ik}$ is the element in the $i$th row and $k$th column of $\mathbf{X}\mathbf{y}$, then $c_{ik} = \sum_{j=1}^{m} a_{ij} b_{jk}$.

To define the matrix multiplication $\mathbf{X}\mathbf{y}$:
1. Highlight a 2×1 array of cells.
2. Type the following function in the cell at the top-left corner: =MMULT(<range1>,<range2>)
3. Select <range1> to be the array of cells of the 2×3 matrix $\mathbf{X}$.
4. Select <range2> to be the array of cells of the 3×1 matrix $\mathbf{y}$.
5. Press Ctrl-Shift-Enter.
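For readers working in Python rather than Excel, the three operations illustrated in these notation pages (transpose, inverse, multiplication) have direct NumPy counterparts; a minimal sketch using the 2×3 matrix and 3×1 vector above.

```python
import numpy as np

X = np.array([[1.0, 1.0, 1.0],
              [2.0, 3.0, 1.0]])        # the 2x3 matrix X above
y = np.array([[1.0], [2.0], [3.0]])    # the 3x1 matrix y above

print(X.T)                    # counterpart of =TRANSPOSE: the 3x2 transpose of X
print(X @ y)                  # counterpart of =MMULT: the 2x1 product Xy
print(np.linalg.inv(X @ X.T)) # counterpart of =MINVERSE: inverse of the square matrix X X^T
```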
Example 4  Fit a linear regression model of the form $E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ from the data given in the following table. Estimate the value of $Y$ when $x_1 = 11$ and $x_2 = 4$.

    x1  5   7   8   9   9   10  10  12  13  15
    x2  2   3   2   5   4   3   4   3   5   4
    y   10  10  15  13  14  20  18  24  19  23

Solution  Given, $\mathbf{y} = (10,\ 10,\ 15,\ 13,\ 14,\ 20,\ 18,\ 24,\ 19,\ 23)^\mathsf{T}$ and

$\mathbf{X} = \begin{pmatrix}1&5&2\\ 1&7&3\\ 1&8&2\\ 1&9&5\\ 1&9&4\\ 1&10&3\\ 1&10&4\\ 1&12&3\\ 1&13&5\\ 1&15&4\end{pmatrix}$

Hence, using Excel,
$\mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y} = (4.59,\ 1.87,\ -1.80)^\mathsf{T}$, i.e., $\hat{\beta}_0 = 4.59$, $\hat{\beta}_1 = 1.87$, $\hat{\beta}_2 = -1.8$, or
$\hat{y} = 4.59 + 1.87 x_1 - 1.8 x_2$

Thus, when $x_1 = 11$ and $x_2 = 4$, the estimated value of $Y$, i.e., $\hat{y}$, is $4.59 + 1.87\times 11 - 1.8\times 4 = 17.96$.
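The Excel result can be cross-checked with NumPy; a minimal sketch for Example 4.

```python
import numpy as np

x1 = np.array([5, 7, 8, 9, 9, 10, 10, 12, 13, 15], dtype=float)
x2 = np.array([2, 3, 2, 5, 4, 3, 4, 3, 5, 4], dtype=float)
y = np.array([10, 10, 15, 13, 14, 20, 18, 24, 19, 23], dtype=float)

X = np.column_stack([np.ones_like(x1), x1, x2])
b = np.linalg.inv(X.T @ X) @ (X.T @ y)
print(b)                                 # approximately [4.59, 1.87, -1.80]
print(b @ np.array([1.0, 11.0, 4.0]))    # about 17.94 (17.96 with the rounded coefficients above)
```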
In a linear regression model, the model response $Y$ is a linear function of the unknown model parameters $\boldsymbol{\beta}$, not of the regressors $\mathbf{x}$. In fact, the linear regression model

$E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$

could be written as

$E(Y) = \beta_0 + \beta_1 g_1(\mathbf{x}) + \beta_2 g_2(\mathbf{x}) + \cdots + \beta_k g_k(\mathbf{x})$

where $g_1, g_2, g_3, \ldots, g_k$ represent any functions of the regressors $\mathbf{x} = (x_1, x_2, \ldots, x_k)$. For example, suppose the data have one predictor variable $x$ and we want to fit a model of the form

$E(Y) = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k$

The above linear regression model is a model for a polynomial regression problem. Before choosing a model to fit, it is often useful to plot a scatterplot diagram from the data.
Example 5  Fit a linear regression model of the form $E(Y) = \beta_0 + \beta_1 x + \beta_2 x^2$ from the data given in the following table. Estimate the value of $Y$ when $x = 2.5$.

    x  0    1    2    3    4    5    6    7    8    9
    y  9.1  7.3  3.2  4.6  4.8  2.9  5.7  7.1  8.8  10.2

$\mathbf{y} = (9.1,\ 7.3,\ 3.2,\ 4.6,\ 4.8,\ 2.9,\ 5.7,\ 7.1,\ 8.8,\ 10.2)^\mathsf{T}$, $\quad \mathbf{X} = \begin{pmatrix}1&0&0\\ 1&1&1\\ 1&2&4\\ 1&3&9\\ 1&4&16\\ 1&5&25\\ 1&6&36\\ 1&7&49\\ 1&8&64\\ 1&9&81\end{pmatrix}$

$\mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y} = (8.698,\ -2.341,\ 0.288)^\mathsf{T}$

When $x = 2.5$, $\hat{y} = 8.698 - 2.341\times 2.5 + 0.288\times 2.5^2 = 4.6455$.
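Polynomial terms are handled the same way: the columns of the design matrix are simply 1, x and x². A minimal NumPy sketch for Example 5.

```python
import numpy as np

x = np.arange(10, dtype=float)       # 0, 1, ..., 9
y = np.array([9.1, 7.3, 3.2, 4.6, 4.8, 2.9, 5.7, 7.1, 8.8, 10.2])

X = np.column_stack([np.ones_like(x), x, x**2])   # columns 1, x, x^2
b = np.linalg.inv(X.T @ X) @ (X.T @ y)
print(b)                                  # approximately [8.698, -2.341, 0.288]
print(b @ np.array([1.0, 2.5, 2.5**2]))   # about 4.65 when x = 2.5
```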
Example 6  Fit a regression model from the data given in the following table.

    x  21  23  25  27  29  32   35
    y  7   11  21  24  66  115  325

[Scatterplot of the data]

From the diagram, it can be observed that the relationship between $x$ and $y$ is nonlinear: the higher the value of $x$, the faster $y$ increases. It is reasonable to suspect that $x$ and $y$ are related by an exponential function of the form

$y = a e^{bx}$

where $a$ and $b$ are some unknown parameters.

Taking logarithms of the function, we get $\ln y = \ln a + bx$.
The model is now converted to a simple linear regression model

$z = \beta_0 + \beta_1 x$

where $z = \ln(y)$, $\beta_0 = \ln(a)$ and $\beta_1 = b$.
Example 6 (continued)  Fit a regression model of the form $z = \beta_0 + \beta_1 x$, where $z = \ln(y)$, $\beta_0 = \ln(a)$ and $\beta_1 = b$, from the data in the following table.

    x          21     23     25     27     29    32     35
    y          7      11     21     24     66    115    325
    z = ln(y)  1.946  2.398  3.045  3.178  4.19  4.745  5.784

Let $\mathbf{z} = \begin{pmatrix}1.946\\ 2.398\\ 3.045\\ 3.178\\ 4.19\\ 4.745\\ 5.784\end{pmatrix}$, $\quad \mathbf{X} = \begin{pmatrix}1&21\\ 1&23\\ 1&25\\ 1&27\\ 1&29\\ 1&32\\ 1&35\end{pmatrix}$.

$\mathbf{b} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{z} = (-3.849,\ 0.272)^\mathsf{T}$.

Thus, we have $z = \ln y = -3.849 + 0.272x$, i.e.,
$y = e^{-3.849 + 0.272x} = 0.0213\,e^{0.272x}$

[Scatterplot of z = ln(y) against x with the fitted line z = 0.272x - 3.849]
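The log transform makes the exponential fit a one-line extension of the earlier examples; a minimal NumPy sketch for Example 6.

```python
import numpy as np

x = np.array([21, 23, 25, 27, 29, 32, 35], dtype=float)
y = np.array([7, 11, 21, 24, 66, 115, 325], dtype=float)

z = np.log(y)                              # transform: z = ln(y) = ln(a) + b*x
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.inv(X.T @ X) @ (X.T @ z)
print(b)                                   # approximately [-3.849, 0.272]

a_hat = np.exp(b[0])                       # back-transform the intercept
print(a_hat)                               # about 0.0213, so y is roughly 0.0213 * exp(0.272 x)
```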
Apart from the exponential function, there are many transformations that frequently arise in engineering applications. Examples:

$y = 1/(a + bx)$  (reciprocal function)
$y = a x^{b}$  (power function)

Reciprocal function
By taking the reciprocal of both sides of the function, it can be converted to a linear relationship between $1/y$ and $x$.

Power function
By taking logarithms of both sides of the power function, the function gives a linear relationship between $\ln(y)$ and $\ln(x)$.
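As an illustration of the power-function case, taking logarithms of $y = a x^b$ gives $\ln(y) = \ln(a) + b\ln(x)$, which is again a simple linear regression of $\ln(y)$ on $\ln(x)$. The sketch below uses made-up data (not from these notes) generated roughly from $y = 2x^{1.5}$.

```python
import numpy as np

# Illustrative data only, roughly following y = 2 * x^1.5
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 5.5, 10.6, 15.8, 22.7])

# Fit ln(y) = ln(a) + b * ln(x) by least squares
X = np.column_stack([np.ones_like(x), np.log(x)])
beta = np.linalg.inv(X.T @ X) @ (X.T @ np.log(y))

a_hat, b_hat = np.exp(beta[0]), beta[1]
print(a_hat, b_hat)        # close to a = 2 and b = 1.5 for this made-up data
```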

If there is no clear indication about the functional form of the regression of $Y$ on $x$, we may assume that the underlying relationship between $x$ and $Y$ can be modelled by a smooth and continuous function. From Taylor's Theorem, the relationship can then be approximated by a polynomial of degree $k$. That is a polynomial regression problem of the form $E(Y) = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k$.
