Regression with a
Single Regressor:
Hypothesis Tests and
Confidence Intervals
$$\hat\beta_1 \sim N\left(\beta_1,\ \frac{\sigma_v^2}{n(\sigma_X^2)^2}\right), \quad \text{where } v_i = (X_i - \mu_X)u_i$$
SE(β̂1) = 0.52
• The two-sided critical value at the 1% significance level is 2.58, so we reject the null at
the 1% significance level.
• Alternatively, we can compute the p-value…
Estimation:
• OLS estimators β̂0 and β̂1
• β̂0 and β̂1 have approximately normal sampling distributions in
large samples
Testing:
• H0: β1 = β1,0 vs. H1: β1 ≠ β1,0 (β1,0 is the value of β1 under H0)
• t = (β̂1 – β1,0)/SE(β̂1)
• p-value = area under the standard normal outside ±|t^act| (large n)
Confidence Intervals:
• 95% confidence interval for β1 is {β̂1 ± 1.96×SE(β̂1)}
• This is the set of values of β1 that are not rejected at the 5% level
• The 95% CI contains the true β1 in 95% of all samples.
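The testing and confidence-interval recipe above can be sketched in a few lines of Python. This is a minimal illustration using the large-sample normal approximation; the function name `slope_inference` and the input values are hypothetical and not taken from the slides.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def slope_inference(beta_hat: float, se: float, beta_null: float = 0.0):
    """Large-sample t-statistic, two-sided p-value, and 95% CI for a slope."""
    t_act = (beta_hat - beta_null) / se
    # Area under the standard normal outside +/- |t_act|
    p_value = 2.0 * normal_cdf(-abs(t_act))
    ci_95 = (beta_hat - 1.96 * se, beta_hat + 1.96 * se)
    return t_act, p_value, ci_95

# Hypothetical estimate and standard error, chosen only for illustration
t_act, p_value, ci_95 = slope_inference(beta_hat=1.5, se=0.5, beta_null=0.0)
print(t_act, p_value, ci_95)
```

Because the null value 0 lies outside the 95% interval in this example, the test also rejects at the 5% level, which is the duality stated in the bullets.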
When Xi = 0, Yi = β0 + ui
• the mean of Yi is β0
• that is, E(Yi|Xi=0) = β0
When Xi = 1, Yi = β0 + β1 + ui
• the mean of Yi is β0 + β1
• that is, E(Yi|Xi=1) = β0 + β1
so:
β1 = E(Yi|Xi=1) – E(Yi|Xi=0)
= population difference in group means
Yi = β0 + β1Xi + ui
• β0 = mean of Y when X = 0
• β0 + β1 = mean of Y when X = 1
• β1 = difference in group means, X =1 minus X = 0
• SE(β̂1) has the usual interpretation
• t-statistics, confidence intervals constructed as usual
• This is another way (an easy way) to do difference-in-means
analysis
• The regression formulation is especially useful when we have
additional regressors (as we will very soon)
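A small numerical check of the binary-regressor interpretation, as a sketch on made-up data: the OLS intercept should equal the mean of Y in the X = 0 group, and the OLS slope should equal the difference in group means. The helper `ols` is a hypothetical name introduced for this illustration.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up data: x is a 0/1 group indicator
x = [0, 0, 0, 1, 1, 1, 1]
y = [10.0, 12.0, 11.0, 15.0, 14.0, 16.0, 17.0]

b0, b1 = ols(x, y)
mean0 = sum(yi for xi, yi in zip(x, y) if xi == 0) / x.count(0)
mean1 = sum(yi for xi, yi in zip(x, y) if xi == 1) / x.count(1)

print(b0, mean0)          # intercept vs. mean of Y when X = 0
print(b1, mean1 - mean0)  # slope vs. difference in group means
```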
1. What…?
2. Consequences of homoskedasticity
3. Implication for computing standard errors
Heteroskedastic or homoskedastic?
© Pearson Education Limited 2015
The class size data:
Heteroskedastic or homoskedastic?
So far we have (without saying so)
assumed that u might be heteroskedastic.
• If the errors are in fact homoskedastic, you can prove that OLS has the
lowest variance among estimators that are linear in Y, a result called the
Gauss-Markov theorem that we will return to shortly.
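• The formula for the variance of β̂1 and the OLS standard
error simplifies: if var(ui|Xi=x) = σu², then

$$\begin{aligned}
\operatorname{var}(\hat\beta_1) &= \frac{\operatorname{var}[(X_i - \mu_X)u_i]}{n(\sigma_X^2)^2} && \text{(general formula)} \\
&= \frac{\sigma_u^2}{n\sigma_X^2} && \text{(simplification if } u \text{ is homoskedastic)}
\end{aligned}$$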
Some people (e.g. Excel programmers) find the
homoskedasticity-only formula simpler – but it is
wrong unless the errors really are homoskedastic.
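To make the contrast concrete, here is a hedged sketch that computes both standard errors for the slope from the same sample. It replaces the population quantities in the general and homoskedasticity-only formulas with sample analogues; the n − 2 degrees-of-freedom adjustment is an assumption borrowed from common textbook practice, and the function name and data are made up for illustration.

```python
import math

def ols_standard_errors(x, y):
    """Slope estimate with robust and homoskedasticity-only standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    b1 = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # General formula: sample analogue of var[(X - mu_X) u] / (n (sigma_X^2)^2)
    var_v = sum((d * u) ** 2 for d, u in zip(dx, resid)) / (n - 2)  # assumed d.o.f. adjustment
    var_robust = var_v / (n * (sxx / n) ** 2)

    # Homoskedasticity-only: sample analogue of sigma_u^2 / (n sigma_X^2)
    s2_u = sum(u * u for u in resid) / (n - 2)
    var_homo = s2_u / sxx

    return b1, math.sqrt(var_robust), math.sqrt(var_homo)

# Made-up data for illustration only
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.3, 7.6, 10.9, 11.2, 15.5, 14.4]

b1, se_robust, se_homo = ols_standard_errors(x, y)
print(b1, se_robust, se_homo)
```

The two standard errors generally differ in a given sample; only the first remains valid when the errors are heteroskedastic.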
$$\hat\beta_1 - \beta_1 = \frac{\sum_{i=1}^{n}(X_i - \bar X)u_i}{\sum_{i=1}^{n}(X_i - \bar X)^2} = \frac{1}{n}\sum_{i=1}^{n} w_i u_i, \quad \text{where } w_i = \frac{X_i - \bar X}{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar X)^2}.$$
What is the distribution of a weighted average of normals?
Under assumptions 1 – 5:
$$\hat\beta_1 - \beta_1 \sim N\left(0,\ \frac{1}{n^2}\sum_{i=1}^{n} w_i^2\,\sigma_u^2\right) \qquad (*)$$
Substituting wi into (*) yields the homoskedasticity-only
variance formula.
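The substitution step can be written out as follows. This is a sketch of the algebra, conditional on X1, …, Xn, using the definition of wi as (Xi – X̄) divided by (1/n) times the sum of squared deviations of X:

```latex
\begin{aligned}
\frac{1}{n^2}\sum_{i=1}^{n} w_i^2\,\sigma_u^2
  &= \frac{\sigma_u^2}{n^2}\cdot
     \frac{\sum_{i=1}^{n}(X_i-\bar X)^2}
          {\left[\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar X)^2\right]^2} \\
  &= \frac{\sigma_u^2}{\sum_{i=1}^{n}(X_i-\bar X)^2}
\end{aligned}
```

In large samples (1/n) times the sum of squared deviations is close to the variance of X, so this expression is approximately σu² divided by n times the variance of X, the homoskedasticity-only simplification.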
In addition, under assumptions 1 – 5, under the null
hypothesis the homoskedasticity-only t statistic has a Student t
distribution with n – 2 degrees of freedom.