
Q? How to prove that $SSE/(n-2)$ is the unbiased estimator of $\sigma^2$?

The following result from the single-variate case will be used.

Thm.: If $X_1, \ldots, X_n$ are a random sample from $N(0, \sigma^2)$, then the function $\sum (X_i - \bar{X})^2 / \sigma^2$ follows the $\chi^2$ distribution with d.f. $(n-1)$.

Thm.: The residual sum of squares $SSE = \sum \hat{\varepsilon}_i^2$ satisfies $E(SSE) = (n-2)\sigma^2$, so $SSE/(n-2)$ is the unbiased estimator of $\sigma^2$.

Pf.: Since $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$,

$$
SSE = \sum \hat{\varepsilon}_i^2
    = \sum (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2
    = \sum \big[(Y_i - \bar{Y}) - \hat{\beta}_1 (X_i - \bar{X})\big]^2 .
$$

Expanding the square and using $\hat{\beta}_1 = \sum (X_i - \bar{X})(Y_i - \bar{Y}) \big/ \sum (X_i - \bar{X})^2$,

$$
SSE = \sum (Y_i - \bar{Y})^2
      - 2\,\frac{\big[\sum (X_i - \bar{X})(Y_i - \bar{Y})\big]^2}{\sum (X_i - \bar{X})^2}
      + \frac{\big[\sum (X_i - \bar{X})(Y_i - \bar{Y})\big]^2}{\sum (X_i - \bar{X})^2}
    = \sum (Y_i - \bar{Y})^2 - \hat{\beta}_1^2 \sum (X_i - \bar{X})^2 .
$$

Substituting $Y_i - \bar{Y} = \beta_1 (X_i - \bar{X}) + (\varepsilon_i - \bar{\varepsilon})$,

$$
SSE = \sum \big[\beta_1 (X_i - \bar{X}) + (\varepsilon_i - \bar{\varepsilon})\big]^2
      - \hat{\beta}_1^2 \sum (X_i - \bar{X})^2
$$
$$
    = \beta_1^2 \sum (X_i - \bar{X})^2
      + 2\beta_1 \sum (X_i - \bar{X})(\varepsilon_i - \bar{\varepsilon})
      + \sum (\varepsilon_i - \bar{\varepsilon})^2
      - \hat{\beta}_1^2 \sum (X_i - \bar{X})^2 .
$$

Thus, we get

$$
E(SSE) = \beta_1^2 \sum (X_i - \bar{X})^2
         + 2\beta_1 \sum (X_i - \bar{X})\, E(\varepsilon_i - \bar{\varepsilon})
         + E\Big(\sum (\varepsilon_i - \bar{\varepsilon})^2\Big)
         - E(\hat{\beta}_1^2) \sum (X_i - \bar{X})^2
$$
$$
       = \beta_1^2 \sum (X_i - \bar{X})^2 + 0 + (n-1)\sigma^2
         - \Big(\beta_1^2 \sum (X_i - \bar{X})^2 + \sigma^2\Big)
       = (n-2)\,\sigma^2 ,
$$

where $E\big(\sum (\varepsilon_i - \bar{\varepsilon})^2\big) = (n-1)\sigma^2$ by the theorem above, and $E(\hat{\beta}_1^2) = V(\hat{\beta}_1) + \beta_1^2 = \sigma^2 \big/ \sum (X_i - \bar{X})^2 + \beta_1^2$.

Actually, $SSE/\sigma^2 = \sum \hat{\varepsilon}_i^2 / \sigma^2$ follows a $\chi^2$-dist. with degrees of freedom $(n-2)$.
ANOVA (Analysis of Variance) Table

To construct the ANOVA table for the simple regression model, we need to find the following components.

S.S.                                    d.f.     M.S.           E(M.S.)      F-ratio
$SSR = SST - SSE$                       $1$      $SSR/1$        ???          $\dfrac{SSR/1}{SSE/(n-2)}$
$SSE = \sum \hat{\varepsilon}_i^2$      $n-2$    $SSE/(n-2)$    $\sigma^2$
$SST = \sum (Y_i - \bar{Y})^2$          $n-1$

(1) S.S. : sources of the sum of squares.

There are three sums of squares in this table: SSR, SSE, and SST.

SSR (sum of squares for regression): $SSR = SST - SSE$.

SSE (sum of squares for errors): $SSE = \sum \hat{\varepsilon}_i^2$.

SST (total sum of squares): $SST = \sum (Y_i - \bar{Y})^2$.

In regression, each sum of squares follows a $\chi^2$-distribution.

(2) d.f. : degrees of freedom for each sum of squares.

The computation of the degrees of freedom is based on the distribution of the corresponding sum of squares.


(3) M.S. : Mean squares

M.S. is the ratio between a S.S. and its corresponding degrees of freedom.

(4) E(M.S.) : Expectation of the mean squares.

R.K.: $SSE/(n-2)$ is an unbiased estimator for $\sigma^2$.

(5) F-ratio :

The F-ratio actually follows the F-distribution with degrees of freedom $1$ and $(n-2)$ (under the null hypothesis $\beta_1 = 0$).

The distribution of each sum of squares will be easier to discuss after we have learned how to handle the regression model in matrix form, which will be introduced later in class.
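As a concrete sketch of how the table's entries are computed (the data and variable names below are my own illustration, not from the notes):

```python
import numpy as np

# Hypothetical data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
n = len(x)

# Least-squares fit.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

sst = np.sum((y - y.mean()) ** 2)       # total sum of squares,      d.f. n - 1
sse = np.sum((y - b0 - b1 * x) ** 2)    # error sum of squares,      d.f. n - 2
ssr = sst - sse                         # regression sum of squares, d.f. 1

msr = ssr / 1
mse = sse / (n - 2)                     # unbiased estimator of sigma^2
f_ratio = msr / mse

print(ssr, sse, sst, f_ratio)
```

By construction the decomposition $SST = SSR + SSE$ holds exactly, and the F-ratio is the last column of the table.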

Def. (of F-dist.):

Let $U \sim \chi^2_{v_1}$ and $V \sim \chi^2_{v_2}$. If $U$ and $V$ are independent of each other, then

$$ F = \frac{U / v_1}{V / v_2} \sim F(v_1, v_2), $$

where $F(v_1, v_2)$ is the F-dist. with degrees of freedom $v_1$ and $v_2$.
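The definition can be illustrated by simulation (a sketch of my own, with assumed degrees of freedom): generating $U$ and $V$ as independent chi-square variates and forming the ratio reproduces a known property of $F(v_1, v_2)$, namely that its mean is $v_2/(v_2 - 2)$ for $v_2 > 2$:

```python
import numpy as np

# Build F(v1, v2) variates from two independent chi-square variates
# U ~ chi2(v1) and V ~ chi2(v2) via F = (U / v1) / (V / v2).
rng = np.random.default_rng(1)
v1, v2 = 1, 18                          # assumed degrees of freedom

U = rng.chisquare(v1, size=200000)
V = rng.chisquare(v2, size=200000)
F = (U / v1) / (V / v2)

# For v2 > 2, the F(v1, v2) distribution has mean v2 / (v2 - 2).
print(F.mean())                         # near 18 / 16 = 1.125
```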
Interpretations for the parameters

(1) The interpretation for the slope $\beta_1$ is as follows. If the independent variable $X$ increases one unit, then the dependent variable $Y$ will increase $\beta_1$ units. This is because

$$ \Delta Y = [\beta_0 + \beta_1 (X + 1)] - [\beta_0 + \beta_1 X] = \beta_1 . $$

(2) The interpretation for the intercept $\beta_0$ is the response of the dependent variable $Y$ when we keep the independent variable $X$ at the zero level. That is,

$$ Y = \beta_0 + \beta_1 (0) = \beta_0 . $$

Distributions for $\hat{\beta}_0$ and $\hat{\beta}_1$

Under the Normal assumption, we can get the distributions for $\hat{\beta}_0$ and $\hat{\beta}_1$, which are still Normal. The results are shown in the following theorems.

One thing to notice is that they are NOT independent of each other.
Thm.1: Under the normal assumption, the estimator $\hat{\beta}_1$ follows $N\Big(\beta_1, \dfrac{\sigma^2}{\sum (X_i - \bar{X})^2}\Big)$.

Pf.: By def.,

$$
\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}
             = \frac{\sum (X_i - \bar{X})\big[\beta_1 (X_i - \bar{X}) + (\varepsilon_i - \bar{\varepsilon})\big]}{\sum (X_i - \bar{X})^2}
$$
$$
             = \beta_1 + \frac{\sum (X_i - \bar{X})(\varepsilon_i - \bar{\varepsilon})}{\sum (X_i - \bar{X})^2}
             = \beta_1 + \sum C_i (\varepsilon_i - \bar{\varepsilon}) ,
$$

where $C_i = \dfrac{(X_i - \bar{X})}{\sum (X_i - \bar{X})^2}$.

It is clear that

$$
E(\hat{\beta}_1) = E\Big(\beta_1 + \sum C_i (\varepsilon_i - \bar{\varepsilon})\Big)
                 = \beta_1 + \sum C_i\, E(\varepsilon_i - \bar{\varepsilon})
                 = \beta_1 ,
$$

and the variance is

$$
V(\hat{\beta}_1) = V\Big(\beta_1 + \sum C_i (\varepsilon_i - \bar{\varepsilon})\Big)
                 = V\Big[\sum C_i (\varepsilon_i - \bar{\varepsilon})\Big]
$$
$$
= \sum C_i^2\, V(\varepsilon_i - \bar{\varepsilon})
  + 2 \sum_{i<j} C_i C_j \operatorname{cov}\big[(\varepsilon_i - \bar{\varepsilon}), (\varepsilon_j - \bar{\varepsilon})\big]
$$
$$
= \frac{n-1}{n}\,\sigma^2 \sum C_i^2
  + 2 \sum_{i<j} C_i C_j \Big(-\frac{1}{n}\Big)\sigma^2
$$
$$
= \frac{n-1}{n}\,\sigma^2\, \frac{1}{\sum (X_i - \bar{X})^2}
  - \frac{\sigma^2}{n}\, \frac{2\sum_{i<j} (X_i - \bar{X})(X_j - \bar{X})}{\big(\sum (X_i - \bar{X})^2\big)^2} .
$$

Since $\sum (X_i - \bar{X}) = 0$ implies $2\sum_{i<j} (X_i - \bar{X})(X_j - \bar{X}) = -\sum (X_i - \bar{X})^2$, this becomes

$$
= \frac{n-1}{n}\,\sigma^2\, \frac{1}{\sum (X_i - \bar{X})^2}
  + \frac{1}{n}\,\sigma^2\, \frac{1}{\sum (X_i - \bar{X})^2}
= \frac{\sigma^2}{\sum (X_i - \bar{X})^2} .
$$

Thm.2: Under the normal assumption, the estimator $\hat{\beta}_0$ follows

$$ N\Big(\beta_0, \frac{\sigma^2 \sum X_i^2}{n \sum (X_i - \bar{X})^2}\Big). $$

Pf.: Try this at home.

Q? What is the covariance between the estimators of $\beta_0$ and $\beta_1$?

As we mentioned before, the distributions discussed are useful when we perform the following testing problems.


Testing hypotheses for the parameters

Sometimes, if we are interested in testing problems like $H_0: \beta_0 = c$ and $H_0: \beta_1 = c$, then the above distributions will help us to perform the hypothesis tests. The procedure is discussed next.

(1) How to test $H_0: \beta_1 = c$ v.s. $H_1: \beta_1 \neq c$?

For such a test we will use the fact that

$$ \hat{\beta}_1 \sim N\Big(\beta_1, \frac{\sigma^2}{\sum (X_i - \bar{X})^2}\Big) $$

and thus, under $H_0$,

$$ \frac{\hat{\beta}_1 - c}{\sqrt{V(\hat{\beta}_1)}} \sim N(0, 1). $$

We will reject $H_0: \beta_1 = c$ if the value $\Big|\dfrac{\hat{\beta}_1 - c}{\sqrt{V(\hat{\beta}_1)}}\Big|$ is greater than the value $z_{\alpha/2}$ at a given $\alpha$, and accept the hypothesis otherwise.
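The rejection rule above can be sketched as a small function (the function name and the example numbers are my own illustration, assuming a known variance as in these notes; the critical value 1.96 corresponds to $\alpha = 0.05$):

```python
import math

# Two-sided test of H0: beta1 = c vs H1: beta1 != c, assuming the
# variance of beta1_hat is known.
def z_test_slope(b1_hat, c, var_b1_hat, z_alpha_over_2=1.96):
    """Return (z statistic, reject H0?)."""
    z = (b1_hat - c) / math.sqrt(var_b1_hat)
    return z, abs(z) > z_alpha_over_2

# Illustrative numbers: estimated slope 0.98, hypothesized slope 0,
# variance sigma^2 / sum((X_i - X_bar)^2) = 0.0277 / 17.5.
z, reject = z_test_slope(0.98, 0.0, 0.0277 / 17.5)
print(z, reject)    # a very large z, so H0 is rejected
```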

(2) How to test $H_0: \beta_0 = c$ v.s. $H_1: \beta_0 \neq c$?

You might also like