
LINEAR REGRESSION ANALYSIS

MODULE – II
Lecture - 4

Simple Linear Regression Analysis
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur

Maximum likelihood estimation


We assume that the $\varepsilon_i$'s $(i = 1, 2, \ldots, n)$ are independent and identically distributed following a normal distribution $N(0, \sigma^2)$.

Now we use the method of maximum likelihood to estimate the parameters of the linear regression model

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad (i = 1, 2, \ldots, n).$$

Under this assumption, the observations $y_i$ $(i = 1, 2, \ldots, n)$ are independently distributed with $N(\beta_0 + \beta_1 x_i, \sigma^2)$ for all $i = 1, 2, \ldots, n$. The likelihood function of the given observations $(x_i, y_i)$ and unknown parameters $\beta_0$, $\beta_1$ and $\sigma^2$ is

$$L(x_i, y_i; \beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n}\left(\frac{1}{2\pi\sigma^2}\right)^{1/2}\exp\left[-\frac{1}{2\sigma^2}(y_i - \beta_0 - \beta_1 x_i)^2\right].$$

The maximum likelihood estimates of $\beta_0$, $\beta_1$ and $\sigma^2$ can be obtained by maximizing $L(x_i, y_i; \beta_0, \beta_1, \sigma^2)$ or, equivalently, $\ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)$ where

$$\ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2) = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2.$$

The normal equations are obtained by partially differentiating the log-likelihood with respect to $\beta_0$, $\beta_1$ and $\sigma^2$ and equating the derivatives to zero:

$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \beta_0} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i) = 0$$

$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \beta_1} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)x_i = 0$$

and

$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2 = 0.$$
The solution of these normal equations gives the maximum likelihood estimates of $\beta_0$, $\beta_1$ and $\sigma^2$ as

$$b_0 = \bar{y} - b_1\bar{x},$$

$$b_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{s_{xy}}{s_{xx}}$$

and

$$s^2 = \frac{\sum_{i=1}^{n}(y_i - b_0 - b_1 x_i)^2}{n},$$

respectively.
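As a quick numerical illustration, these closed-form estimates are easy to compute directly. The following is a minimal Python sketch; the data vectors x and y are made-up values for illustration, not from the lecture.

```python
# Minimal sketch: closed-form ML estimates in simple linear regression.
# The data below are illustrative assumptions, not from the lecture.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
n = len(x)

sxy = np.sum((x - x.mean()) * (y - y.mean()))  # s_xy
sxx = np.sum((x - x.mean()) ** 2)              # s_xx

b1 = sxy / sxx                            # ML estimate of beta_1
b0 = y.mean() - b1 * x.mean()             # ML estimate of beta_0
s2 = np.sum((y - b0 - b1 * x) ** 2) / n   # ML estimate of sigma^2 (divisor n)

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, s^2 = {s2:.4f}")
```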

It can be verified that the Hessian matrix of second-order partial derivatives of $\ln L$ with respect to $\beta_0$, $\beta_1$ and $\sigma^2$ is negative definite at $\beta_0 = b_0$, $\beta_1 = b_1$ and $\sigma^2 = s^2$, which ensures that the likelihood function is maximized at these values.

Note that the least squares and maximum likelihood estimates of $\beta_0$ and $\beta_1$ are identical when the disturbances are normally distributed. The least squares and maximum likelihood estimates of $\sigma^2$ differ. In fact, the least squares based (unbiased) estimate of $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{1}{n-2}\sum_{i=1}^{n}(y_i - b_0 - b_1 x_i)^2,$$

so the maximum likelihood estimate is related to it as

$$s^2 = \frac{n-2}{n}\,\hat{\sigma}^2.$$

Thus $b_0$ and $b_1$ are unbiased estimators of $\beta_0$ and $\beta_1$, whereas $s^2$ is a biased estimate of $\sigma^2$, though it is asymptotically unbiased. Since the maximum likelihood and least squares estimates of $\beta_0$ and $\beta_1$ coincide, their variances are the same. For the two estimators of $\sigma^2$, however,

$$MSE(s^2) < Var(\hat{\sigma}^2),$$

and since $\hat{\sigma}^2$ is unbiased, $Var(\hat{\sigma}^2) = MSE(\hat{\sigma}^2)$; that is, $s^2$ has the smaller mean squared error.



Testing of hypotheses and confidence interval estimation for the slope parameter


Now we consider the tests of hypotheses and confidence interval estimation for the slope parameter of the model under two cases, viz., when $\sigma^2$ is known and when $\sigma^2$ is unknown.

Case 1: When $\sigma^2$ is known


Consider the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ $(i = 1, 2, \ldots, n)$. It is assumed that the $\varepsilon_i$'s are independent and identically distributed and follow $N(0, \sigma^2)$.
First we develop a test for the null hypothesis related to the slope parameter

$$H_0: \beta_1 = \beta_{10}$$

where $\beta_{10}$ is some given constant.

Assuming $\sigma^2$ to be known, we know that

$$E(b_1) = \beta_1, \qquad Var(b_1) = \frac{\sigma^2}{s_{xx}},$$

and $b_1$ is a linear combination of the normally distributed $y_i$'s, so

$$b_1 \sim N\left(\beta_1, \frac{\sigma^2}{s_{xx}}\right),$$

and so the following statistic can be constructed:

$$Z_1 = \frac{b_1 - \beta_{10}}{\sqrt{\dfrac{\sigma^2}{s_{xx}}}}$$

which is distributed as $N(0, 1)$ when $H_0$ is true.
A decision rule to test $H_1: \beta_1 \neq \beta_{10}$ can be framed as follows:

Reject $H_0$ if $|Z_1| > z_{\alpha/2}$

where $z_{\alpha/2}$ is the $\alpha/2$ percentage point of the $N(0,1)$ distribution. Similarly, the decision rule for a one-sided alternative hypothesis can also be framed.

The $100(1-\alpha)\%$ confidence interval for $\beta_1$ can be obtained using the $Z_1$ statistic as follows:

$$P\left[-z_{\alpha/2} \le Z_1 \le z_{\alpha/2}\right] = 1 - \alpha$$

$$P\left[-z_{\alpha/2} \le \frac{b_1 - \beta_1}{\sqrt{\dfrac{\sigma^2}{s_{xx}}}} \le z_{\alpha/2}\right] = 1 - \alpha$$

$$P\left[b_1 - z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}} \le \beta_1 \le b_1 + z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}}\right] = 1 - \alpha.$$

So the $100(1-\alpha)\%$ confidence interval for $\beta_1$ is

$$\left(b_1 - z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}},\; b_1 + z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}}\right)$$

where $z_{\alpha/2}$ is the $\alpha/2$ percentage point of the $N(0,1)$ distribution.
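As an illustration, the test and the interval can be computed as in the following minimal Python sketch; the data, the assumed known value of $\sigma^2$, the hypothesized slope beta10 and the level alpha are all made-up values for illustration.

```python
# Minimal sketch: test of H0: beta_1 = beta10 and CI for beta_1, sigma^2 known.
# Data, sigma2, beta10 and alpha are illustrative assumptions.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
sigma2 = 0.04   # assumed known error variance
beta10 = 2.0    # hypothesized slope under H0
alpha = 0.05

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx

z1 = (b1 - beta10) / np.sqrt(sigma2 / sxx)   # ~ N(0, 1) under H0
z_crit = stats.norm.ppf(1 - alpha / 2)       # z_{alpha/2}
reject = abs(z1) > z_crit                    # two-sided decision rule

half = z_crit * np.sqrt(sigma2 / sxx)
print(f"Z1 = {z1:.3f}, reject H0: {reject}, "
      f"CI = ({b1 - half:.4f}, {b1 + half:.4f})")
```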



Case 2: When $\sigma^2$ is unknown

When $\sigma^2$ is unknown, we proceed as follows. We know that

$$\frac{SS_{res}}{\sigma^2} \sim \chi^2(n-2),$$

where $SS_{res} = \sum_{i=1}^{n}(y_i - b_0 - b_1 x_i)^2$ is the residual sum of squares, and

$$E\left(\frac{SS_{res}}{n-2}\right) = \sigma^2.$$

Further, $SS_{res}/\sigma^2$ and $b_1$ are independently distributed. This result will be proved formally later in the module on multiple linear regression. It also follows from the result that, under the normal distribution, the maximum likelihood estimates of the population mean and the population variance (the sample mean and the sample variance) are independently distributed, so $b_1$ and $s^2$ are also independently distributed.

Thus the following statistic can be constructed:

$$t_0 = \frac{b_1 - \beta_{10}}{\sqrt{\dfrac{\hat{\sigma}^2}{s_{xx}}}} = \frac{b_1 - \beta_{10}}{\sqrt{\dfrac{SS_{res}}{(n-2)\,s_{xx}}}}$$

which follows a $t$-distribution with $(n-2)$ degrees of freedom, denoted as $t_{n-2}$, when $H_0$ is true.



A decision rule to test $H_1: \beta_1 \neq \beta_{10}$ is to

reject $H_0$ if $|t_0| > t_{n-2,\alpha/2}$

where $t_{n-2,\alpha/2}$ is the $\alpha/2$ percentage point of the $t$-distribution with $(n-2)$ degrees of freedom.

Similarly, the decision rule for a one-sided alternative hypothesis can also be framed.

The $100(1-\alpha)\%$ confidence interval for $\beta_1$ can be obtained using the $t_0$ statistic as follows:

$$P\left[-t_{n-2,\alpha/2} \le t_0 \le t_{n-2,\alpha/2}\right] = 1 - \alpha$$

$$P\left[-t_{n-2,\alpha/2} \le \frac{b_1 - \beta_1}{\sqrt{\dfrac{\hat{\sigma}^2}{s_{xx}}}} \le t_{n-2,\alpha/2}\right] = 1 - \alpha$$

$$P\left[b_1 - t_{n-2,\alpha/2}\sqrt{\frac{\hat{\sigma}^2}{s_{xx}}} \le \beta_1 \le b_1 + t_{n-2,\alpha/2}\sqrt{\frac{\hat{\sigma}^2}{s_{xx}}}\right] = 1 - \alpha.$$

So the $100(1-\alpha)\%$ confidence interval for $\beta_1$ is

$$\left(b_1 - t_{n-2,\alpha/2}\sqrt{\frac{SS_{res}}{(n-2)\,s_{xx}}},\; b_1 + t_{n-2,\alpha/2}\sqrt{\frac{SS_{res}}{(n-2)\,s_{xx}}}\right).$$
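A minimal Python sketch of this t-test and interval follows; the data, the hypothesized slope beta10 and the level alpha are again made-up values for illustration.

```python
# Minimal sketch: t-test of H0: beta_1 = beta10 and CI for beta_1,
# sigma^2 unknown. Data, beta10 and alpha are illustrative assumptions.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
beta10 = 2.0
alpha = 0.05
n = len(x)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

se_b1 = np.sqrt(ss_res / ((n - 2) * sxx))     # estimated standard error of b1
t0 = (b1 - beta10) / se_b1                    # ~ t_{n-2} under H0
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
reject = abs(t0) > t_crit

print(f"t0 = {t0:.3f}, reject H0: {reject}, "
      f"CI = ({b1 - t_crit * se_b1:.4f}, {b1 + t_crit * se_b1:.4f})")
```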

Testing of hypotheses and confidence interval estimation for the intercept term

Now we consider the tests of hypotheses and confidence interval estimation for the intercept term under two cases, viz., when $\sigma^2$ is known and when $\sigma^2$ is unknown.

Case 1: When $\sigma^2$ is known

Suppose the null hypothesis under consideration is $H_0: \beta_0 = \beta_{00}$, where $\sigma^2$ is known. Using the results that $E(b_0) = \beta_0$,

$$Var(b_0) = \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)$$

and $b_0$ is a linear combination of normally distributed random variables, the following statistic can be constructed:

$$Z_0 = \frac{b_0 - \beta_{00}}{\sqrt{\sigma^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}}$$

which has a $N(0, 1)$ distribution when $H_0$ is true.

A decision rule to test $H_1: \beta_0 \neq \beta_{00}$ can be framed as follows:

Reject $H_0$ if $|Z_0| > z_{\alpha/2}$

where $z_{\alpha/2}$ is the $\alpha/2$ percentage point of the $N(0,1)$ distribution.

Similarly, the decision rule for a one-sided alternative hypothesis can also be framed.

The $100(1-\alpha)\%$ confidence interval for $\beta_0$ when $\sigma^2$ is known can be derived using the $Z_0$ statistic as follows:

$$P\left[-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}\right] = 1 - \alpha$$

$$P\left[-z_{\alpha/2} \le \frac{b_0 - \beta_0}{\sqrt{\sigma^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}} \le z_{\alpha/2}\right] = 1 - \alpha$$

$$P\left[b_0 - z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)} \le \beta_0 \le b_0 + z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right] = 1 - \alpha.$$

So the $100(1-\alpha)\%$ confidence interval for $\beta_0$ is

$$\left(b_0 - z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)},\; b_0 + z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right).$$
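A minimal Python sketch for the intercept test and interval with known $\sigma^2$; the data, sigma2, the hypothesized intercept beta00 and alpha are made-up values for illustration.

```python
# Minimal sketch: test of H0: beta_0 = beta00 and CI for beta_0, sigma^2 known.
# Data, sigma2, beta00 and alpha are illustrative assumptions.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
sigma2 = 0.04
beta00 = 0.0
alpha = 0.05
n = len(x)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()

se_b0 = np.sqrt(sigma2 * (1 / n + x.mean() ** 2 / sxx))  # sd of b0
z0 = (b0 - beta00) / se_b0                               # ~ N(0, 1) under H0
z_crit = stats.norm.ppf(1 - alpha / 2)

print(f"Z0 = {z0:.3f}, reject H0: {abs(z0) > z_crit}, "
      f"CI = ({b0 - z_crit * se_b0:.4f}, {b0 + z_crit * se_b0:.4f})")
```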


Case 2: When $\sigma^2$ is unknown

When $\sigma^2$ is unknown, the following statistic is constructed:

$$t_0 = \frac{b_0 - \beta_{00}}{\sqrt{\dfrac{SS_{res}}{n-2}\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}}$$

which follows a $t$-distribution with $(n-2)$ degrees of freedom, i.e., $t_{n-2}$, when $H_0$ is true.

A decision rule to test $H_1: \beta_0 \neq \beta_{00}$ is as follows:

Reject $H_0$ whenever $|t_0| > t_{n-2,\alpha/2}$

where $t_{n-2,\alpha/2}$ is the $\alpha/2$ percentage point of the $t$-distribution with $(n-2)$ degrees of freedom.

Similarly, the decision rule for a one-sided alternative hypothesis can also be framed.
The $100(1-\alpha)\%$ confidence interval for $\beta_0$ can be obtained as follows. Consider

$$P\left[-t_{n-2,\alpha/2} \le t_0 \le t_{n-2,\alpha/2}\right] = 1 - \alpha$$

$$P\left[-t_{n-2,\alpha/2} \le \frac{b_0 - \beta_0}{\sqrt{\dfrac{SS_{res}}{n-2}\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}} \le t_{n-2,\alpha/2}\right] = 1 - \alpha$$

$$P\left[b_0 - t_{n-2,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)} \le \beta_0 \le b_0 + t_{n-2,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right] = 1 - \alpha.$$

So the $100(1-\alpha)\%$ confidence interval for $\beta_0$ is

$$\left(b_0 - t_{n-2,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)},\; b_0 + t_{n-2,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right).$$
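A minimal Python sketch for the intercept t-test and interval; the data, the hypothesized intercept beta00 and alpha are made-up values for illustration.

```python
# Minimal sketch: t-test of H0: beta_0 = beta00 and CI for beta_0,
# sigma^2 unknown. Data, beta00 and alpha are illustrative assumptions.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
beta00 = 0.0
alpha = 0.05
n = len(x)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

se_b0 = np.sqrt(ss_res / (n - 2) * (1 / n + x.mean() ** 2 / sxx))
t0 = (b0 - beta00) / se_b0                    # ~ t_{n-2} under H0
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)

print(f"t0 = {t0:.3f}, reject H0: {abs(t0) > t_crit}, "
      f"CI = ({b0 - t_crit * se_b0:.4f}, {b0 + t_crit * se_b0:.4f})")
```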

Confidence interval for $\sigma^2$

A confidence interval for $\sigma^2$ can also be derived as follows. Since $SS_{res}/\sigma^2 \sim \chi^2_{n-2}$, consider

$$P\left[\chi^2_{n-2,\alpha/2} \le \frac{SS_{res}}{\sigma^2} \le \chi^2_{n-2,1-\alpha/2}\right] = 1 - \alpha$$

$$P\left[\frac{SS_{res}}{\chi^2_{n-2,1-\alpha/2}} \le \sigma^2 \le \frac{SS_{res}}{\chi^2_{n-2,\alpha/2}}\right] = 1 - \alpha.$$

The corresponding $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is

$$\left(\frac{SS_{res}}{\chi^2_{n-2,1-\alpha/2}},\; \frac{SS_{res}}{\chi^2_{n-2,\alpha/2}}\right).$$
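A minimal Python sketch of this interval; the data and the level alpha are made-up values for illustration.

```python
# Minimal sketch: chi-square confidence interval for sigma^2.
# Data and alpha are illustrative assumptions.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
alpha = 0.05
n = len(x)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

lower = ss_res / stats.chi2.ppf(1 - alpha / 2, df=n - 2)  # divide by upper point
upper = ss_res / stats.chi2.ppf(alpha / 2, df=n - 2)      # divide by lower point
print(f"{100 * (1 - alpha):.0f}% CI for sigma^2: ({lower:.4f}, {upper:.4f})")
```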
