Introduction
• So far we have seen finite sample, small sample, or exact properties of the OLS
estimators in the population model
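• The unbiasedness of OLS under the first four Gauss-Markov assumptions is a
finite sample property because it holds for any sample size n. Efficiency (OLS
being the best linear unbiased estimator) is likewise a finite sample property,
but it requires the full set of five Gauss-Markov assumptions.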
• Under the added assumption that the error term is normally distributed and
independent of the explanatory variables, the OLS estimators have normal
sampling distributions, which leads directly to the t and F distributions for
the t and F statistics.
OLS Asymptotic
Chapter Two Dummy Variable Regression Model
• Fortunately, under the first five Gauss Markov assumptions, OLS has
satisfactory large sample properties.
• Consistency
• Unbiasedness of estimators, although important, cannot always be achieved.
• Although not all useful estimators are unbiased, virtually all economists agree
that consistency is a minimal requirement for an estimator.
• In effect, this means that we can make our estimator arbitrarily close to
$\beta_j$ if we can collect as much data as we want.
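The idea of consistency can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the population model (intercept 1, slope 2), the uniform regressor, and the deliberately non-normal uniform errors are all hypothetical choices, not taken from the slides. It draws repeated samples and reports how the spread of the OLS slope estimates changes with the sample size.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from the simple OLS regression of y on x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(n, reps, beta0=1.0, beta1=2.0, seed=0):
    """Draw `reps` samples of size n and return the OLS slope from each."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.uniform(0.0, 10.0) for _ in range(n)]
        # Hypothetical errors: uniform on [-3, 3], mean zero and not normal.
        y = [beta0 + beta1 * xi + rng.uniform(-3.0, 3.0) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


# Standard deviation of the slope estimates for each sample size.
spread_by_n = {
    n: statistics.pstdev(simulate_slopes(n, reps=200)) for n in (20, 200, 1000)
}
mean_slope_large_n = statistics.fmean(simulate_slopes(1000, reps=200))

for n, spread in spread_by_n.items():
    print(n, round(spread, 4))
```

The estimates should cluster more tightly around the assumed slope of 2 as n grows, which is what the sampling distributions for increasing sample sizes are meant to convey.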
• Figure: sampling distributions of $\hat{\beta}_1$ for sample sizes $n_1 < n_2 < n_3$.
• Simply knowing that the estimator is getting closer to the population value as
the sample size grows does not allow us to test hypotheses about the parameters.
• Under the normality assumption, the sampling distributions of the OLS
estimators are normal. This result is the basis for deriving the t and F
distributions that we use so often in applied econometrics.
• If the errors $u_1, u_2, \ldots, u_n$ are random draws from some distribution
other than the normal, the t statistics will not have t distributions and the
F statistics will not have F distributions.
• For a normal distribution, more than 95% of the area under the density lies
within two standard deviations of the mean.
• Example: estimating a model that explains the number of arrests of young men
during a particular year (narr86).
• Normality plays no role in the unbiasedness of OLS, nor does it affect the
conclusion that OLS is the best linear unbiased estimator under the Gauss-
Markov assumptions.
• Does this mean that we must abandon the t statistics for determining which
variables are statistically significant, or the F statistics for testing
exclusion restrictions or other multiple hypotheses?
• Even though the $y_i$ are not from a normal distribution, we can use the
central limit theorem to conclude that the OLS estimators satisfy asymptotic
normality, which means they are approximately normally distributed in large
enough sample sizes.
• We would like to test whether, say, the last q of these variables all have zero
population parameters. With k independent variables, the null hypothesis is
$H_0: \beta_{k-q+1} = 0, \ldots, \beta_k = 0$
• As with F testing, the alternative to the null hypothesis is that at least one of the
parameters is different from zero.
• However, it turns out that, to get a usable test statistic, we must include all of
the independent variables in the auxiliary regression, not only the q variables
being tested.
• If the null hypothesis is true, the R-squared from the auxiliary regression
should be close to zero.
• The question is, how to determine when the statistic is large enough to reject
the null hypothesis at a chosen significance level.
• Under the null hypothesis, the sample size multiplied by the usual R-squared
from the auxiliary regression is distributed asymptotically as a chi-square
random variable with q degrees of freedom.
(ii) Regress the residuals from the restricted model, $\tilde{u}$, on all of the
independent variables and obtain the R-squared, say, $R^2_u$.
(iii) Compute $LM = n R^2_u$.
Example: CRIME1.RAW contains data on arrests during the year 1986 and other
information on 2,725 men born in either 1960 or 1961 in California.
• Each man in the sample was arrested at least once prior to 1986.
• The variable narr86 is the number of times the man was arrested during 1986:
it is zero for most men in the sample (72.29%), and it varies from 0 to 12.
[Figure: histogram of narr86 (vertical axis: density; horizontal axis: narr86, 0 to 15)]
• where
– narr86 = the number of times a man was arrested during 1986.
– tottime = total time the man has spent in prison prior to 1986 since reaching the age of 18.
– qemp86 = number of quarters in 1986 during which the man was legally employed.
• We use the LM statistic to test the null hypothesis that avgsen and tottime have no effect on narr86
once the other factors have been controlled for:
$LM = n R^2_u = 2{,}725 \times 0.0015 = 4.0875$
Chi-square critical values with two degrees of freedom: 9.21034 (1%), 5.991465 (5%), 4.605 (10%).
• The 10% critical value in a chi-square distribution with two degrees of freedom is about 4.605.
Because 4.0875 < 4.605, we fail to reject the null hypothesis that $\beta_{avgsen} = 0, \beta_{tottime} = 0$ at the 10% level.
• The p-value is $P(\chi^2_2 > 4.0875) = 0.129542$.
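The arithmetic of this LM test can be checked with a short sketch. It relies on a convenient fact that applies only when there are two degrees of freedom: the chi-square upper-tail probability is exp(-x/2), so the critical value is -2 ln(alpha). The function names are illustrative, and for other values of q a statistics library would be needed instead.

```python
import math


def lm_statistic(n, r2_aux):
    """LM = n times the R-squared from the auxiliary regression."""
    return n * r2_aux


def chi2_df2_sf(x):
    """P(X > x) for a chi-square with 2 df; the closed form holds only for df = 2."""
    return math.exp(-x / 2.0)


def chi2_df2_critical(alpha):
    """Critical value c such that P(X > c) = alpha, again only for df = 2."""
    return -2.0 * math.log(alpha)


lm = lm_statistic(2725, 0.0015)       # values reported in the example
p_value = chi2_df2_sf(lm)
critical = {alpha: chi2_df2_critical(alpha) for alpha in (0.10, 0.05, 0.01)}
reject_at_10 = lm > critical[0.10]    # reject H0 only if LM exceeds the critical value

print(lm, p_value, critical, reject_at_10)
```

Because the LM statistic falls below even the 10% critical value, the decision rule returns no rejection, in line with the conclusion stated above.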
• Consider the model $wage = \beta_0 + \delta_0 female + \beta_1 educ + u$. Because
female = 1 when the person is female, and female = 0 when the person is male,
the parameter $\delta_0$ is the difference in hourly wage between females and
males, given the same amount of education (and the same error term u).
$\delta_0 = E(wage \mid female = 1, educ) - E(wage \mid female = 0, educ)$
• The key here is that the level of education is the same in both
expectations; the difference, $\delta_0$, is due to gender only.
• Why do we not also include in the model a dummy variable, say male, which is
one for males and zero for females? This would be redundant.
• Including dummy variables for both genders is the simplest example of the so-
called dummy variable trap, which arises when too many dummy variables
describe a given number of groups.
• We have chosen males to be the base group or benchmark group, that is, the
group against which comparisons are made.
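The dummy variable trap can be seen numerically. The following sketch uses a made-up sample of six people (the female and educ values are hypothetical) and compares two design matrices: one with an intercept plus both gender dummies, and one with an intercept, female, and educ. When female + male reproduces the intercept column, the matrix X'X has a zero determinant, so the OLS normal equations have no unique solution.

```python
def gram_matrix(rows):
    """Return X'X for a design matrix given as a list of rows."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


female = [1, 0, 1, 1, 0, 0]        # hypothetical sample of six people
male = [1 - f for f in female]     # the redundant second dummy
educ = [12, 16, 10, 14, 12, 18]    # hypothetical years of schooling

# Columns: intercept, female, male. Here female + male equals the intercept column.
trap_design = [[1, f, m] for f, m in zip(female, male)]
# Columns: intercept, female, educ. Males are the omitted base group.
base_group_design = [[1, f, e] for f, e in zip(female, educ)]

det_trap = det3(gram_matrix(trap_design))
det_base_group = det3(gram_matrix(base_group_design))

print(det_trap, det_base_group)
```

Dropping one dummy, here male, removes the exact linear dependence and leaves males as the base group.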
• The negative intercept—the intercept for men, in this case—is not very
meaningful because no one has zero values for all of educ, exper, and tenure in
the sample.
From the above regression, the sample average wage for men equals 7.099,
while for women it equals 7.099 − 2.512 = 4.587.
• What does the coefficient on colonial mean? For given levels of lotsize, sqrft,
and bdrms, the difference in $\widehat{\log(price)}$ between a house of colonial
style and that of another style is .054.
• This means that a colonial-style house is predicted to sell for about 5.4%
more, holding other factors fixed as compared to its counterpart.
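$\Delta \widehat{\log(price)} = 0.054\, \Delta colonial$
$\%\Delta \widehat{price} = 100\,[\exp(0.054\, \Delta colonial) - 1] = 5.5485$ when $\Delta colonial = 1$
So, on average, the price of a colonial-style house is 5.5485% higher than that
of its non-colonial counterpart.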
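The gap between the approximate figure (about 5.4%) and the exact figure (5.5485%) comes from converting a log difference into a percentage change. A minimal sketch of that conversion, with an illustrative function name:

```python
import math


def exact_pct_change(coef):
    """Exact percentage change in y implied by a dummy coefficient in a log(y) model."""
    return 100.0 * (math.exp(coef) - 1.0)


approx_pct = 100.0 * 0.054            # the usual approximation, 100 times the coefficient
exact_pct = exact_pct_change(0.054)   # exact change, 100 * [exp(coefficient) - 1]

print(round(approx_pct, 4), round(exact_pct, 4))
```

For small coefficients the two numbers are close; the difference grows as the coefficient gets larger in absolute value.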
• When we estimate this model, the coefficient on married (with standard error
in parentheses) is .053 (.041), and the coefficient on female becomes −.290
(.036).
• We defined four categories based on marital status and gender: single men,
married men, single women, and married women.
• In fact, we can recast that model by adding an interaction term between female
and married to the model where female and married appear separately.
• This model also allows us to obtain the estimated wage differential among all
four groups, but here we must be careful to plug in the correct combination of
zeros and ones.
• Setting female = 0 and married = 0 corresponds to the group single men, which
is the base group, since this eliminates female, married, and female·married.
• We can find the intercept for married men by setting female = 0 and married
= 1 in the model, for married women by setting female = 1 and married = 1, and
so on.
[Equation: estimated log(wage) regression including female, married, and
female·married, with standard errors, n, and R-squared; numerical values not
recoverable.]
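Plugging in the right combination of zeros and ones can be sketched as follows. The three coefficient values below are hypothetical placeholders chosen only for illustration, not the estimates from the regression, and the function name is likewise an assumption. Each result is the intercept shift relative to the base group, single men.

```python
def intercept_shift(female, married, d_female, d_married, d_interaction):
    """Shift in the intercept relative to the base group (female = 0, married = 0)."""
    return d_female * female + d_married * married + d_interaction * female * married


# Hypothetical coefficients on female, married, and female*married.
coefs = {"d_female": -0.10, "d_married": 0.20, "d_interaction": -0.30}

# (female, married) combinations for the four groups.
groups = {
    "single men": (0, 0),
    "married men": (0, 1),
    "single women": (1, 0),
    "married women": (1, 1),
}

shifts = {name: intercept_shift(f, m, **coefs) for name, (f, m) in groups.items()}

for name, shift in shifts.items():
    print(name, round(shift, 3))
```

Note that the interaction coefficient enters only for married women, the one group for which both dummies equal one.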
• Dummy variables can also be interacted with explanatory variables that are not
dummy variables to allow for a difference in slopes.
• Continuing with the wage example, suppose that we wish to test whether the
return to education is the same for men and women, allowing for a constant
wage differential between men and women (a differential for which we have
already found evidence).
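• What kind of model allows for different returns to education? Consider the model
$\log(wage) = \beta_0 + \delta_0 female + \beta_1 educ + \delta_1 female \cdot educ + u$
An important hypothesis is that the return to education is the same for women
and men: $H_0: \delta_1 = 0$.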
• We are also interested in the hypothesis that average wages are identical for
men and women who have the same levels of education.
• This means that $\delta_0$ (the coefficient on female) and $\delta_1$ (the
coefficient on the female·educ interaction) must both be zero under the null
hypothesis.
• Several financial companies rate the quality of debt for local governments,
where the ratings depend on things like probability of default. (Local
governments prefer lower interest rates in order to reduce their costs of
borrowing.)
• For simplicity, suppose that rankings range from zero to four, with zero being
the worst credit rating and four being the best.
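• Then $\beta_1$, the coefficient on CR, is the percentage point change in MBR
when CR increases by one unit, holding other factors fixed.
• Define a dummy variable for each rating above zero: $CR_1 = 1$ if $CR = 1$, and
$CR_1 = 0$ otherwise; and similarly for $CR_2$, $CR_3$, and $CR_4$.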
• Effectively, we take the single credit rating and turn it into five categories.
Then, we can estimate the model.
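Turning a single ordinal rating into a set of dummies can be sketched as below. The function name and the dictionary keys are illustrative assumptions; the rating scale of zero to four and the choice of zero as the omitted base category follow the description above.

```python
def rating_dummies(cr, levels=(0, 1, 2, 3, 4), base=0):
    """Return 0/1 indicators for a credit rating, omitting the base level."""
    if cr not in levels:
        raise ValueError(f"unexpected credit rating: {cr}")
    return {f"CR{level}": int(cr == level) for level in levels if level != base}


# One row of dummies per observation; a rating of 0 gives all zeros (base group).
rows = [rating_dummies(cr) for cr in (0, 2, 4)]

for row in rows:
    print(row)
```

Five categories need only four dummies; including a fifth alongside the intercept would recreate the dummy variable trap.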
Instead, assume that we group the ranks into the following six meaningful categories:
top 10, ranks 11 to 25, ranks 26 to 40, ranks 41 to 60, ranks 61 to 100, and the
last-ranked group.
Alternatively, we can generate a single categorical variable in place of the six
dummies and run the regression as follows:
reg lsalary lsat gpa llibvol lcost i.scrank
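The grouping of ranks into six categories can be sketched as a simple lookup. The category labels are hypothetical names, and the sketch assumes that the last category collects every rank above 100.

```python
import bisect

# Upper bound of each of the first five categories.
UPPER_BOUNDS = [10, 25, 40, 60, 100]
# Hypothetical labels; the last one is assumed to cover all ranks above 100.
LABELS = ["top10", "r11_25", "r26_40", "r41_60", "r61_100", "r101_plus"]


def rank_category(rank):
    """Map a rank (1 = best) to one of the six rank categories."""
    if rank < 1:
        raise ValueError(f"rank must be at least 1, got {rank}")
    # bisect_left counts how many upper bounds are strictly below the rank.
    return LABELS[bisect.bisect_left(UPPER_BOUNDS, rank)]


categories = {rank: rank_category(rank) for rank in (1, 10, 11, 25, 60, 100, 101)}

for rank, label in categories.items():
    print(rank, label)
```

A variable built this way plays the role of the categorical variable in the regression command, where one category is omitted as the base group.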