Regression (Inference)
Ch04 Page 1
OLS: Sampling Distribution
Assumption MLR.6 (Normality)
The population error $u$ is independent of the explanatory variables $x_1, \dots, x_k$ and is normally distributed with zero mean and variance $\sigma^2$:
$u \sim \text{Normal}(0, \sigma^2)$.
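Under the CLM assumptions MLR.1 through MLR.6, conditional on the sample values of the independent variables,
$\hat{\beta}_l \sim \text{Normal}(\beta_l, Var(\hat{\beta}_l))$.
Therefore,
$(\hat{\beta}_l - \beta_l) / sd(\hat{\beta}_l) \sim \text{Normal}(0, 1)$.
Recall that under MLR.1-MLR.5 one has
$Var(\hat{\beta}_l) = \sigma^2 / [SST_l (1 - R_l^2)]$,
where $SST_l$ is the total sample variation in $x_l$ and $R_l^2$ is the R-squared from regressing $x_l$ on the other explanatory variables.
⛔ The error variance, $\sigma^2$, is unknown and therefore needs to be estimated. ⛔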
Ch04 Page 2
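Under the CLM assumptions MLR.1 through MLR.6,
$(\hat{\beta}_l - \beta_l) / se(\hat{\beta}_l) \sim t_{n-k-1} = t_{df}$,
where $k+1$ is the number of unknown parameters in the population model $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$ ($k$ slope parameters and the intercept $\beta_0$) and $n-k-1$ is the degrees of freedom ($df$).
The $t$ Distribution
Let $Z \sim \text{Normal}(0, 1)$ and let $X \sim \chi^2_m$ have a chi-square distribution with $m$ degrees of freedom. Further assume that $Z$ and $X$ are independent. Define $T = Z / \sqrt{X / m}$; then $T$ has a $t$ distribution with $m$ degrees of freedom.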
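The construction of a $t$ random variable from an independent standard normal and chi-square variable can be checked by simulation. The sketch below is only an illustration: the function names, the choice of 10 degrees of freedom, and the number of draws are assumptions made for the example, not part of the notes. It compares the simulated mean and variance with the known values 0 and $m/(m-2)$.

```python
import math
import random

def draw_t(m, rng):
    """One draw of T = Z / sqrt(X / m), with Z ~ N(0,1) and X ~ chi-square(m)."""
    z = rng.gauss(0.0, 1.0)
    # A chi-square(m) variable is a sum of m squared independent N(0,1) draws.
    x = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))
    return z / math.sqrt(x / m)

def simulate_t(m, n_draws=20000, seed=0):
    """Return n_draws independent draws of T (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [draw_t(m, rng) for _ in range(n_draws)]

def sample_mean_and_variance(draws):
    """Sample mean and unbiased sample variance of a list of numbers."""
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
    return mean, var

if __name__ == "__main__":
    m = 10  # illustrative degrees of freedom (an assumption for this sketch)
    mean, var = sample_mean_and_variance(simulate_t(m))
    # Theory: E[T] = 0 and Var(T) = m / (m - 2) for m > 2.
    print(f"mean = {mean:.3f}, variance = {var:.3f}, theory = {m / (m - 2):.3f}")
```

The simulated variance exceeds 1, which reflects the heavier tails of the $t$ distribution relative to the standard normal when the degrees of freedom are small.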
Ch04 Page 3
Hypothesis Testing: Single Parameter
1. H0: β_l=0
2. H0: β_l=a_l
3. p-Values (t Test)
4. Confidence Intervals
Ch04 Page 4
$H_0: \beta_l = 0$
Recall the population model can be written as
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$.
Under the null hypothesis, the population parameter $\beta_l$ is equal to zero, i.e., after controlling for the other independent variables, there is no effect of $x_l$ on $y$.
The $t$ statistic $t_{\hat{\beta}_l} = \hat{\beta}_l / se(\hat{\beta}_l)$ will be used to test the above null hypothesis. The farther the estimated coefficient is from zero, the less likely it is that the null hypothesis holds true. Therefore, we will reject the null hypothesis if the $t$ ratio is "sufficiently large", i.e., far away from zero. But what does "far" away from zero mean? This is determined by a critical value $c$.
The significance level (or "level") of the test is the probability of rejecting $H_0$ when it is in fact true: $\alpha = P(\text{reject } H_0 \mid H_0 \text{ true})$.
One-Sided Alternative: $H_1: \beta_l > 0$
Suppose we have decided on a 5% significance level, as this is the most popular choice. Thus, we are willing to mistakenly reject $H_0$ when it is true 5% of the time.
➡ While $t_{\hat{\beta}_l}$ has a $t$ distribution under $H_0$, so that it has zero mean, under the alternative $\beta_l > 0$ the expected value of $t_{\hat{\beta}_l}$ is positive.
Thus, we are looking for a "sufficiently large" positive value of $t_{\hat{\beta}_l}$ in order to reject $H_0: \beta_l = 0$ in favor of $H_1: \beta_l > 0$. Negative values of $t_{\hat{\beta}_l}$ provide no evidence in favor of $H_1$.
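Test whether, after controlling for education and tenure, higher work experience leads to higher hourly wages, i.e.,
$H_0: \beta_l = 0$ against $H_1: \beta_l > 0$, where $\beta_l$ is the coefficient on work experience.
Degrees of freedom: 522 = 526 - (3 + 1)
5% Critical Value: 1.645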
Ch04 Page 5
1% Critical Value: 2.326
$t$ statistic:
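The one-sided rejection rule can be sketched in a few lines. The critical values 1.645 and 2.326 quoted in the example are the standard normal values, which closely approximate the $t$ distribution when the degrees of freedom are as large as 522. The function names and the $t$ value used below are hypothetical and chosen only for illustration; they are not the estimates from the wage example.

```python
from statistics import NormalDist

def one_sided_critical_value(alpha):
    """Right-tail critical value c with P(Z > c) = alpha.

    Uses the standard normal as a large-df approximation to the t distribution.
    """
    return NormalDist().inv_cdf(1.0 - alpha)

def reject_one_sided(t_stat, alpha, alternative):
    """Decision rule for H0: beta_l = 0 against a one-sided alternative.

    alternative="greater": reject if t > c.
    alternative="less":    reject if t < -c.
    """
    c = one_sided_critical_value(alpha)
    if alternative == "greater":
        return t_stat > c
    if alternative == "less":
        return t_stat < -c
    raise ValueError("alternative must be 'greater' or 'less'")

if __name__ == "__main__":
    print(round(one_sided_critical_value(0.05), 3))  # 1.645
    print(round(one_sided_critical_value(0.01), 3))  # 2.326
    t_hypothetical = 2.0  # made-up t statistic, for illustration only
    print(reject_one_sided(t_hypothetical, 0.05, "greater"))
    print(reject_one_sided(t_hypothetical, 0.01, "greater"))
```

With a small sample, the normal quantile should be replaced by the corresponding quantile of the $t$ distribution with $n-k-1$ degrees of freedom.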
One-Sided Alternative: $H_1: \beta_l < 0$
Suppose we have decided on a 5% significance level, as this is the most popular choice. Thus, we are willing to mistakenly reject $H_0$ when it is true 5% of the time.
➡ While $t_{\hat{\beta}_l}$ has a $t$ distribution under $H_0$, so that it has zero mean, under the alternative $\beta_l < 0$ the expected value of $t_{\hat{\beta}_l}$ is negative.
Thus, we are looking for a "sufficiently large" negative value of $t_{\hat{\beta}_l}$ in order to reject $H_0: \beta_l = 0$ in favor of $H_1: \beta_l < 0$. Positive values of $t_{\hat{\beta}_l}$ provide no evidence in favor of $H_1$.
Test whether, after controlling for annual teacher compensation and the number of staff per one thousand students, lower student enrollment leads to higher standardized tenth-grade math test scores, i.e.,
$H_0: \beta_l = 0$ against $H_1: \beta_l < 0$, where $\beta_l$ is the coefficient on student enrollment.
Degrees of freedom: 404 = 408 - (3 + 1)
5% Critical Value: -1.645
1% Critical Value: -2.326
$t$ statistic:
Ch04 Page 6
In this example, the variable of interest, or equivalently its estimated coefficient $\hat{\beta}_l$, is statistically insignificant.
Two-Sided Alternative: $H_1: \beta_l \neq 0$
Suppose we have decided on a 5% significance level, as this is the most popular choice. Thus, we are willing to mistakenly reject $H_0$ when it is true 5% of the time.
➡ While $t_{\hat{\beta}_l}$ has a $t$ distribution under $H_0$, so that it has zero mean, under the alternative $\beta_l \neq 0$ the expected value of $t_{\hat{\beta}_l}$ can be either negative or positive.
Thus, we are looking for a value of $t_{\hat{\beta}_l}$ that is "sufficiently large" in absolute value in order to reject $H_0: \beta_l = 0$ in favor of $H_1: \beta_l \neq 0$. Close-to-zero values of $t_{\hat{\beta}_l}$ provide no evidence in favor of $H_1$.
For a two-tailed test, $c$ is chosen to make the area in each tail of the $t$ distribution equal 2.5%. In other words, $c$ is the 97.5th percentile in the $t$ distribution with $n-k-1$ degrees of freedom.
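Each coefficient in this example is tested as $H_0: \beta_l = 0$ against $H_1: \beta_l \neq 0$.
Degrees of freedom: 137 = 141 - (3 + 1)
5% Critical Value: 1.960
1% Critical Value: 2.576
$t$ statistic: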
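The two-sided rule rejects when the $t$ statistic is large in absolute value, with the critical value taken from the 97.5th percentile for a 5% test. The sketch below uses the standard normal quantile as a large-sample approximation (it reproduces the 1.960 and 2.576 quoted above); the function names and the $t$ values passed in are hypothetical illustrations.

```python
from statistics import NormalDist

def two_sided_critical_value(alpha):
    """Critical value c with area alpha/2 in each tail (normal approximation)."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def reject_two_sided(t_stat, alpha):
    """Reject H0: beta_l = 0 in favor of H1: beta_l != 0 when |t| > c."""
    return abs(t_stat) > two_sided_critical_value(alpha)

if __name__ == "__main__":
    print(round(two_sided_critical_value(0.05), 3))  # 1.96
    print(round(two_sided_critical_value(0.01), 3))  # 2.576
    # Made-up t statistics, for illustration only.
    for t in (-2.2, 1.0):
        print(t, reject_two_sided(t, 0.05), reject_two_sided(t, 0.01))
```

Because the rule depends only on the absolute value of the statistic, a negative and a positive $t$ of the same magnitude lead to the same decision.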
Ch04 Page 7
Ch04 Page 8
$H_0: \beta_l = a_l$
Generally, if the null hypothesis is $H_0: \beta_l = a_l$, the $t$ statistic is
$t = (\hat{\beta}_l - a_l) / se(\hat{\beta}_l)$.
This measures how many estimated standard deviations $\hat{\beta}_l$ is away from the hypothesized value $a_l$.
Besides $a_l = 0$, common choices in economics are $a_l = 1$ or $a_l = -1$.
We can use the general $t$ statistic to test against one-sided or two-sided alternatives.
We find the critical value $c$ for one-sided or two-sided alternatives exactly as before!
➡ The difference is in how we compute the $t$ statistic, not in how we obtain the appropriate $c$. ⬅
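As a small illustration of the general statistic, the sketch below computes $(\hat{\beta}_l - a_l)/se(\hat{\beta}_l)$ for a user-supplied hypothesized value. The function name and the numbers are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def t_statistic(beta_hat, se, a=0.0):
    """General t statistic for H0: beta_l = a, i.e. (estimate - a) / standard error."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return (beta_hat - a) / se

if __name__ == "__main__":
    # Made-up estimate and standard error, for illustration only.
    beta_hat, se = 1.5, 0.25
    print(t_statistic(beta_hat, se))         # tests H0: beta_l = 0
    print(t_statistic(beta_hat, se, a=1.0))  # tests H0: beta_l = 1
```

The same estimate can be far from zero yet close to another hypothesized value, which is why the statistic must be recomputed whenever $a_l$ changes.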
$H_0: \beta_l = a_l$ against $H_1: \beta_l > a_l$.
Degrees of freedom: 95 = 97 - (1 + 1)
5% Critical Value: 1.662
1% Critical Value: 2.368
$t$ statistic:
Ch04 Page 9
$p$-Values ($t$ Test)
Definition
Given the observed value of the $t$ statistic, what is the smallest significance level at which the null hypothesis would be rejected? This level is known as the $p$-value for the test.
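A $p$-value can be computed directly from the $t$ distribution. The sketch below is one possible approach, assuming no statistics library is available: it integrates the $t$ density numerically with Simpson's rule. The function names, the number of integration steps, and the example inputs are assumptions made for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for T ~ t_df, using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3           # P(0 < T < |t|)
    upper = 0.5 - area             # P(T > |t|), by symmetry around zero
    return upper if t >= 0 else 1.0 - upper

def p_value(t, df, alternative="two-sided"):
    """p-value of an observed t statistic for the stated alternative."""
    if alternative == "greater":
        return t_upper_tail(t, df)
    if alternative == "less":
        return 1.0 - t_upper_tail(t, df)
    return 2.0 * t_upper_tail(abs(t), df)

if __name__ == "__main__":
    # Made-up t statistic and degrees of freedom, for illustration only.
    print(p_value(1.85, 40))             # two-sided
    print(p_value(1.85, 40, "greater"))  # one-sided, right tail
```

The null hypothesis is rejected at significance level $\alpha$ exactly when the $p$-value is below $\alpha$, so the one-sided $p$-value here is half the two-sided one.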
Ch04 Page 10
Guidelines for discussing economic and statistical significance
If a variable is statistically significant, discuss the magnitude of the coefficient to get an idea of its economic or practical importance.
The fact that a coefficient is statistically significant does not necessarily mean it is economically or practically significant!
If a variable is statistically and economically important but has the "wrong" sign, the regression model might be misspecified.
If a variable is statistically insignificant at the usual levels (10%, 5%, or 1%), one may think of dropping it from the regression.
If the sample size is small, effects might be imprecisely estimated, so the case for dropping insignificant variables is less strong.
Ch04 Page 11
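Confidence Intervals
Definition
Confidence intervals are also called interval estimates because they provide a range of likely values for the population parameter, $\beta_l$, and not just a point estimate $\hat{\beta}_l$.
Let $c$ be the critical value associated with a two-sided test at the 5% significance level (95% confidence level); then by construction
$P(-c \le (\hat{\beta}_l - \beta_l) / se(\hat{\beta}_l) \le c) = 0.95$,
$P(-c \cdot se(\hat{\beta}_l) \le \hat{\beta}_l - \beta_l \le c \cdot se(\hat{\beta}_l)) = 0.95$,
$P(-\hat{\beta}_l - c \cdot se(\hat{\beta}_l) \le -\beta_l \le -\hat{\beta}_l + c \cdot se(\hat{\beta}_l)) = 0.95$,
$P(\hat{\beta}_l - c \cdot se(\hat{\beta}_l) \le \beta_l \le \hat{\beta}_l + c \cdot se(\hat{\beta}_l)) = 0.95$.
Lower Bound: $\hat{\beta}_l - c \cdot se(\hat{\beta}_l)$
Upper Bound: $\hat{\beta}_l + c \cdot se(\hat{\beta}_l)$
For very large sample sizes one can work with the following: $\hat{\beta}_l \pm 1.96 \cdot se(\hat{\beta}_l)$.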
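The interval bounds, and their link to the two-sided test, can be sketched as follows. This is a minimal illustration that assumes the large-sample case, so the critical value is a standard normal quantile; the function names and the numbers are hypothetical.

```python
from statistics import NormalDist

def confidence_interval(beta_hat, se, level=0.95):
    """Large-sample confidence interval: beta_hat +/- c * se, c a normal quantile."""
    c = NormalDist().inv_cdf(0.5 + level / 2.0)
    return beta_hat - c * se, beta_hat + c * se

def rejects_two_sided(beta_hat, se, a, level=0.95):
    """H0: beta_l = a is rejected at level 1 - level iff a lies outside the interval."""
    lower, upper = confidence_interval(beta_hat, se, level)
    return not (lower <= a <= upper)

if __name__ == "__main__":
    # Made-up estimate and standard error, for illustration only.
    lower, upper = confidence_interval(0.5, 0.1)
    print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
    print(rejects_two_sided(0.5, 0.1, a=0.0))
    print(rejects_two_sided(0.5, 0.1, a=0.4))
```

Any hypothesized value inside the 95% interval would not be rejected by a two-sided test at the 5% level, and any value outside it would be.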
Ch04 Page 12
Hypothesis Testing: Single Linear Combination of Parameters
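Consider the multiple linear regression model:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + u$.
To test whether two coefficients are equal, write the hypotheses as
$H_0: \beta_1 = \beta_2$, ➡ $H_0: \beta_1 - \beta_2 = 0$, ➡ $H_0: \theta = 0$,
$H_1: \beta_1 \neq \beta_2$. ➡ $H_1: \beta_1 - \beta_2 \neq 0$. ➡ $H_1: \theta \neq 0$.
where $\theta = \beta_1 - \beta_2$.
Now notice that $\beta_1 = \theta + \beta_2$, and therefore we can write our original model as
$y = \beta_0 + (\theta + \beta_2) x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + u$
$= \beta_0 + \theta x_1 + \beta_2 (x_1 + x_2) + \beta_3 x_3 + \dots + \beta_k x_k + u$.
Therefore one can test $H_0: \theta = 0$ by simply running a modified regression of $y$ on $x_1$, $(x_1 + x_2)$, $x_3$, ..., $x_k$, and then doing a simple $t$ test on the coefficient multiplying $x_1$ in this augmented regression.
In other words, testing $H_0: \beta_1 = \beta_2$ against $H_1: \beta_1 \neq \beta_2$ in the original model is equivalent to testing $H_0: \theta = 0$ against $H_1: \theta \neq 0$ in the modified regression.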
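The reparameterization can be verified numerically: the coefficient on $x_1$ in the modified regression equals the difference of the first two slope estimates from the original regression. The sketch below uses simulated data and a small hand-written OLS solver so that it needs no external library; the data-generating values, sample size, and function names are all assumptions made for the illustration.

```python
import random

def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(k):
            if row != col:
                f = A[row][col] / A[col][col]
                A[row] = [a - f * b for a, b in zip(A[row], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def simulate(n=200, seed=0):
    """Simulated data; the coefficient values are arbitrary illustrations."""
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        x1, x2, x3 = (rng.gauss(0, 1) for _ in range(3))
        rows.append((x1, x2, x3))
        y.append(1.0 + 0.8 * x1 + 0.5 * x2 - 0.3 * x3 + rng.gauss(0, 1))
    return rows, y

rows, y = simulate()
# Original regression: y on 1, x1, x2, x3.
b = ols([[1.0, x1, x2, x3] for x1, x2, x3 in rows], y)
# Modified regression: y on 1, x1, (x1 + x2), x3.
g = ols([[1.0, x1, x1 + x2, x3] for x1, x2, x3 in rows], y)
theta_hat = g[1]  # coefficient on x1 in the modified regression

if __name__ == "__main__":
    print("beta1_hat - beta2_hat:", b[1] - b[2])
    print("theta_hat            :", theta_hat)
```

The practical benefit is that the standard error of $\hat{\theta}$ is reported directly by the modified regression, so the $t$ test needs no separate calculation of the covariance between the two original estimates.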
Ch04 Page 13
Although this is correct
,
Ch04 Page 14
Hypothesis Testing: Multiple Linear Restrictions
1. Testing Exclusion Restrictions
2. p-Values (F Test)
3. General Linear Restrictions
Ch04 Page 15
Testing Exclusion Restrictions
Recall the population model can be written as
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$.
We are interested in testing the multiple null hypothesis
$H_0: \beta_{k-q+1} = 0, \dots, \beta_k = 0$,
against the alternative $H_1$: $H_0$ is not true.
Since setting these parameters equal to zero effectively excludes the regressors $x_{k-q+1}, \dots, x_k$ from the model, we call these "exclusion restrictions."
The $F$ statistic
$F = \frac{(SSR_r - SSR_{ur}) / q}{SSR_{ur} / (n-k-1)}$,
where $SSR_r$ and $SSR_{ur}$ are the sums of squared residuals from the restricted and unrestricted regressions, will be used to test the above null hypothesis. The farther the estimated coefficients are from zero, the less likely it is that the null hypothesis holds true. Therefore, we will reject the null hypothesis if the $F$ ratio is "sufficiently large", i.e., substantially larger than zero. But what does "substantially larger" than zero mean? This is determined by a critical value $c$.
The significance level (or "level") of the test is the probability of rejecting $H_0$ when it is in fact true: $\alpha = P(\text{reject } H_0 \mid H_0 \text{ true})$.
It can be shown that under $H_0$, and assuming that the MLR.1-MLR.6 (CLM) assumptions hold, the $F$ statistic is distributed as an $F$ random variable with $q$ degrees of freedom in the numerator and $n-k-1$ degrees of freedom in the denominator, i.e.,
$F \sim F_{q, n-k-1}$.
The $F$ Distribution
Let $X_1 \sim \chi^2_{k_1}$ and $X_2 \sim \chi^2_{k_2}$ and assume that $X_1$ and $X_2$ are independent. Then $(X_1 / k_1) / (X_2 / k_2)$ has an $F$ distribution with $(k_1, k_2)$ degrees of freedom.
The integer $k_1$ is called the numerator degrees of freedom, and $k_2$ is called the denominator degrees of freedom.
$H_0: \beta_3 = 0, \beta_4 = 0, \beta_5 = 0$,
$H_1$: $H_0$ is not true.
Ch04 Page 16
Unrestricted degrees of freedom ($df_{ur}$): 347 = 353 - (5 + 1)
Restricted degrees of freedom ($df_r$): 350 = 353 - (5 + 1 - 3)
1% Critical Value: 3.78
$F$ statistic:
Restricted: ,
Unrestricted: .
✔ In our example:
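Because $SSR_r = SST(1 - R_r^2)$ and $SSR_{ur} = SST(1 - R_{ur}^2)$, where SSR denotes a sum of squared residuals and SST the total sum of squares, one can re-write the $F$ statistic in terms of the restricted ($R_r^2$) and unrestricted ($R_{ur}^2$) R-squareds:
$F = \frac{(R_{ur}^2 - R_r^2) / q}{(1 - R_{ur}^2) / (n-k-1)}$,
where $q$ is the number of restrictions and $n-k-1$ is the unrestricted degrees of freedom.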
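The equivalence of the sum-of-squared-residuals form and the R-squared form of the $F$ statistic can be checked with a short calculation. In the sketch below the sums of squares are made-up numbers (not the values from the example), and the function names are hypothetical; only $q = 3$ and the 347 unrestricted degrees of freedom are taken from the example above.

```python
def f_stat_from_ssr(ssr_r, ssr_ur, q, df_ur):
    """F = [(SSR_r - SSR_ur) / q] / [SSR_ur / df_ur]."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

def f_stat_from_r2(r2_r, r2_ur, q, df_ur):
    """F = [(R2_ur - R2_r) / q] / [(1 - R2_ur) / df_ur]."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / df_ur)

if __name__ == "__main__":
    # Made-up sums of squares, for illustration only.
    sst, ssr_ur, ssr_r = 100.0, 40.0, 50.0
    q, df_ur = 3, 347
    r2_ur = 1.0 - ssr_ur / sst
    r2_r = 1.0 - ssr_r / sst
    print(f_stat_from_ssr(ssr_r, ssr_ur, q, df_ur))
    print(f_stat_from_r2(r2_r, r2_ur, q, df_ur))
```

Both forms agree because the restricted and unrestricted regressions share the same dependent variable and therefore the same SST; the R-squared form should not be used when the dependent variable differs between the two regressions.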
Ch04 Page 17
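$p$-Values ($F$ Test)
Definition
It is the probability of observing a value of the $F$ statistic at least as large as we did, given that the null hypothesis is true, i.e.,
$p\text{-value} = P(\mathcal{F} > F)$, where $\mathcal{F}$ denotes an $F$ random variable with $(q, n-k-1)$ degrees of freedom and $F$ is the observed value of the test statistic.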
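The $p$-value of an $F$ test can be approximated by simulation, using the fact that an $F$ random variable is a ratio of independent chi-square variables, each divided by its degrees of freedom. The sketch below is a Monte Carlo illustration, not an exact calculation; the function names, the number of draws, and the seed are assumptions made for the example.

```python
import random

def draw_f(k1, k2, rng):
    """One draw of (X1 / k1) / (X2 / k2) with independent chi-square X1, X2."""
    # A chi-square(k) variable is a Gamma variable with shape k/2 and scale 2.
    x1 = rng.gammavariate(k1 / 2.0, 2.0)
    x2 = rng.gammavariate(k2 / 2.0, 2.0)
    return (x1 / k1) / (x2 / k2)

def f_p_value_mc(f_obs, k1, k2, n_draws=50000, seed=0):
    """Monte Carlo estimate of P(F_{k1,k2} > f_obs)."""
    rng = random.Random(seed)
    exceed = sum(draw_f(k1, k2, rng) > f_obs for _ in range(n_draws))
    return exceed / n_draws

if __name__ == "__main__":
    # Probability of exceeding the 1% critical value quoted in the example,
    # with q = 3 numerator and 347 denominator degrees of freedom.
    print(f_p_value_mc(3.78, 3, 347))
```

The estimate should come out close to 0.01, with simulation noise that shrinks as the number of draws grows; a statistics library with an exact $F$ distribution function would normally be preferred in practice.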
Ch04 Page 18
General Linear Restrictions
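Consider the following model
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$,
and a null hypothesis $H_0$ that imposes linear restrictions on the parameters, which need not be exclusion restrictions. There are $q$ restrictions.
$H_1$: $H_0$ is not true.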
Ch04 Page 19