
II. Asymptotic OLS and the Dummy Variable Regression Model
Introduction
OLS Asymptotic
Chapter Two Dummy Variable Regression Model

Introduction
• So far, we have seen the finite sample (small sample, or exact) properties of the OLS
estimators in the population model.

• The unbiasedness of OLS under the first four Gauss-Markov assumptions, and its
efficiency under all five, are finite sample properties because they hold for any sample
size n.

• Under the assumptions that the error term is normally distributed and independent of
the explanatory variables, the OLS estimators have normal sampling distributions,
which led directly to the t and F distributions for t and F statistics.

• If the error is not normally distributed, the distribution of a t statistic is not
exactly t, and an F statistic does not have an exact F distribution for any sample
size.

• Asymptotic (large sample) properties of estimators and test statistics are not
defined for a particular sample size; rather, they are defined as the sample size
grows without bound.

• Fortunately, under the first five Gauss-Markov assumptions, OLS has
satisfactory large sample properties.

• One practically important finding is that even without the normality
assumption, t and F statistics have approximately t and F distributions, at least
in large sample sizes.

• Consistency
• Unbiasedness of estimators, although important, cannot always be achieved.

• Although not all useful estimators are unbiased, virtually all economists agree
that consistency is a minimal requirement for an estimator.

• If your estimator of a particular population parameter is not consistent, then
you are wasting your time.

• If β̂_j is a consistent estimator of β_j, then the distribution of β̂_j becomes more
and more tightly concentrated around β_j as the sample size grows.

• As n tends to infinity, the distribution of β̂_j collapses to the single point β_j.

• In effect, this means that we can make our estimator arbitrarily close to β_j if we
can collect as much data as we want.
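The collapsing sampling distribution can be seen in a small simulation. This is a minimal sketch, not from the text: the model, parameter values, and sample sizes below are illustrative, and the slope is estimated by simple OLS.

```python
import numpy as np

# Sketch of consistency: the OLS slope estimate tightens around the true
# value as n grows. Model and parameter values are illustrative.
rng = np.random.default_rng(42)
beta0, beta1 = 1.0, 0.5  # assumed true population parameters

def ols_slope(n):
    x = rng.normal(size=n)
    u = rng.uniform(-1.0, 1.0, size=n)  # non-normal errors: consistency still holds
    y = beta0 + beta1 * x + u
    xd = x - x.mean()
    return float(xd @ y / (xd @ xd))

# Standard deviation of the slope estimate across 200 replications
spread = {n: np.std([ols_slope(n) for _ in range(200)]) for n in (50, 500, 5_000)}
print(spread)  # the spread shrinks as n grows
```

Each replication draws a fresh sample; the shrinking standard deviation across replications is exactly the "more and more tightly concentrated" behavior described above.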

• [Figure: sampling distributions of β̂_1 for sample sizes n_1 < n_2 < n_3; the
distribution tightens around β_1 as n grows.]

• Asymptotic Normality and Large Sample Inference

• Consistency of an estimator is an important property, but it alone does not allow
us to perform statistical inference.

• Simply knowing that the estimator is getting closer to the population value as
the sample size grows does not allow us to test hypotheses about the parameters.

• For testing, we need the sampling distribution of the OLS estimators.

• Under the CLM assumptions, the sampling distributions of the β̂_j's are normal.

• This result is the basis for deriving the t and F distributions that we use so
often in applied econometrics.

• If the errors u_1, u_2, ..., u_n are random draws from some distribution other
than the normal, the β̂_j will not be normally distributed.


• Then the t statistics will not have t distributions, and the F statistics will not have F
distributions.

• Recall that asm6 (normality) is equivalent to saying that the distribution of y given
x_1, x_2, ..., x_k is normal.

• Because y is observed and u is not, in a particular application it is much easier
to think about whether the distribution of y is likely to be normal.

• A normally distributed random variable is symmetrically distributed about its
mean.

• More than 95% of the area under the distribution is within two standard
deviations of the mean.

• Example
• Consider estimating a model explaining the number of arrests of young men
during a particular year (narr86).

• Most men are not arrested during the year, and the vast majority are arrested
at most once, so narr86 is far from normally distributed.

• Normality plays no role in the unbiasedness of OLS, nor does it affect the
conclusion that OLS is the best linear unbiased estimator under the Gauss-
Markov assumptions.

• But exact inference based on t and F statistics requires asm6 (normality).

• Does this mean that we must abandon the t statistics for determining which
variables are statistically significant, or the F statistics for testing exclusion
restrictions or other multiple hypotheses?

• Even though the y_i are not from a normal distribution, we can use the central
limit theorem to conclude that the OLS estimators satisfy asymptotic
normality, which means they are approximately normally distributed in large
enough sample sizes.
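A small simulation can illustrate this claim. The sketch below is illustrative (the model, the chi-square error distribution, and the sample size are my assumptions, not from the text): it draws strongly skewed errors, computes the simple-regression t statistic many times, and checks how often it falls within the usual normal 95% bounds.

```python
import numpy as np

# Sketch: even with skewed (chi-square) errors, the OLS t statistic is
# approximately standard normal in large samples. Setup is illustrative.
rng = np.random.default_rng(0)
beta0, beta1, n = 1.0, 0.5, 500

def t_stat():
    x = rng.normal(size=n)
    u = rng.chisquare(1, size=n) - 1.0  # mean-zero but strongly skewed errors
    y = beta0 + beta1 * x + u
    xd = x - x.mean()
    b1 = xd @ y / (xd @ xd)
    b0 = y.mean() - b1 * x.mean()
    resid = y - b0 - b1 * x
    se = np.sqrt(resid @ resid / (n - 2) / (xd @ xd))
    return (b1 - beta1) / se

tstats = np.array([t_stat() for _ in range(1000)])
coverage = np.mean(np.abs(tstats) < 1.96)  # should be close to 0.95
print(coverage)
```

Despite the errors being far from normal, the fraction of t statistics inside ±1.96 is close to the nominal 95%, which is the practical content of asymptotic normality.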

• Other Large Sample Tests: the Lagrange Multiplier Statistic

• There is usually little reason to go beyond the usual t and F statistics.

• Sometimes, however, it is useful to have another way to test multiple exclusion
restrictions, using the Lagrange multiplier (LM) statistic.

• The form of the LM statistic relies on the Gauss-Markov assumptions, the
same assumptions that justify the F statistic in large samples. We do not need
the normality assumption.

• We would like to test whether, say, the last q of the variables all have zero
population parameters: the null hypothesis is H0: β_{k−q+1} = 0, ..., β_k = 0.

• As with F testing, the alternative to the null hypothesis is that at least one of the
parameters is different from zero.

• The LM statistic requires estimation of the restricted model only.

• In particular, let ũ denote the residuals from the restricted model.

• If the omitted variables x_{k−q+1} through x_k truly have zero population
coefficients, then, at least approximately, ũ should be uncorrelated with each
of these variables in the sample.

• This suggests running a regression of these residuals, ũ, on those independent
variables excluded under H0, which is almost what the LM test does.

• However, it turns out that, to get a usable test statistic, we must include all of
the independent variables in the regression.

• That is, we run ũ on x_1, x_2, ..., x_k; this is called an auxiliary regression.

• If the null hypothesis is true, the R-squared from the auxiliary regression
should be close to zero.

• The question is how to determine when the statistic is large enough to reject
the null hypothesis at a chosen significance level.

• Under the null hypothesis, the sample size multiplied by the usual R-squared
from the auxiliary regression, LM = nR²_ũ, is distributed asymptotically as a chi-square
random variable with q degrees of freedom.

• The Lagrange Multiplier Statistic for q Exclusion Restrictions:

i) Regress y on the restricted set of independent variables and save the
residuals, ũ.

ii) Regress ũ on all of the independent variables and obtain the R-squared,
say, R²_ũ.

iii) Compute LM = nR²_ũ.

iv) Compare LM to the appropriate critical value, c, in a χ²_q distribution; if LM >
c, the null hypothesis is rejected.

Example: CRIME1.RAW contains data on arrests during the year 1986 and other
information on 2,725 men born in either 1960 or 1961 in California.

• Each man in the sample was arrested at least once prior to 1986.

• The variable narr86 is the number of times the man was arrested during 1986:
it is zero for most men in the sample (72.29%), and it varies from 0 to 12.

[Figure: kernel density estimate of narr86 (kernel = Epanechnikov, bandwidth =
0.1371); the density is heavily concentrated at zero.]

• The model is narr86 = β0 + β1 pcnv + β2 avgsen + β3 tottime + β4 ptime86 + β5 qemp86 + u, where
– narr86 = the number of times a man was arrested.

– pcnv = the proportion of prior arrests leading to conviction.

– avgsen = average sentence served from past convictions.

– tottime = total time the man has spent in prison prior to 1986 since reaching the age of 18.

– ptime86 = months spent in prison in 1986.

– qemp86 = number of quarters in 1986 during which the man was legally employed.

• We use the LM statistic to test the null hypothesis that avgsen and tottime have no effect on narr86
once the other factors have been controlled for: LM = nR²_ũ = 2,725 × 0.0015 = 4.0875. The
chi-square critical values with two degrees of freedom are χ²_{0.01} = 9.21034, χ²_{0.05} = 5.991465,
and χ²_{0.10} = 4.605.
• Since 4.0875 is below even the 10% critical value of 4.605, we fail to reject the null hypothesis
that β_avgsen = 0, β_tottime = 0 at the 10% level.
• The p-value is P(χ²_2 > 4.0875) = 0.129542.

• The corresponding F test agrees: test (tottime avgsen) F(2, 2719) = 2.03, Prob > F = 0.1310.
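The four-step LM procedure is easy to code directly. This is a minimal sketch on simulated data under H0 (the data, variable layout, and true coefficients are my assumptions, not the CRIME1 data); the 5% critical value 5.991465 for q = 2 is the one quoted above.

```python
import numpy as np

# Sketch of the LM (n * R-squared) test for q = 2 exclusion restrictions,
# on simulated data (illustrative, not CRIME1).
rng = np.random.default_rng(1)
n = 2_725
x = rng.normal(size=(n, 3))  # x[:, 1:] are the q = 2 regressors excluded under H0
y = 1.0 + 0.5 * x[:, 0] + rng.normal(size=n)  # H0 true: last two have zero effect

def fit(X, y):
    """OLS residuals and R-squared for a given design matrix."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    return resid, r2

const = np.ones((n, 1))
u_tilde, _ = fit(np.hstack([const, x[:, :1]]), y)  # i) restricted model residuals
_, r2_aux = fit(np.hstack([const, x]), u_tilde)    # ii) auxiliary regression
lm = n * r2_aux                                    # iii) LM = n * R^2
print(lm, lm > 5.991465)  # iv) compare to the 5% chi-square(2) critical value
```

Because the data are generated with H0 true, the auxiliary R-squared is tiny and LM typically falls well below the critical value, mirroring the fail-to-reject outcome in the CRIME1 example.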



• Dummy Variable Regression Model

• Qualitative factors often come in the form of binary information.

• The relevant information can be captured by defining a binary variable.

• Binary variables are most commonly called dummy variables.

• With only a single dummy explanatory variable, we just add it as an
independent variable in the equation.

• Suppose only two observed factors affect wage: gender and education, so the
model is wage = β0 + δ0 female + β1 educ + u.

• Because female = 1 when the person is female and female = 0 when the person
is male, the parameter δ0 is the difference in hourly wage between females and
males, given the same amount of education (and the same error term u).

• If δ0 < 0, then, for the same level of other factors, women earn less than
men on average.

• In terms of expectations, if we assume the zero conditional mean assumption
E(u | female, educ) = 0, then δ0 = E(wage | female = 1, educ) − E(wage | female = 0, educ).

• The key here is that the level of education is the same in both
expectations; the difference, δ0, is due to gender only.

• Why do we not also include a dummy variable, say male, which is one for
males and zero for females? This would be redundant.

• Using two dummy variables would introduce perfect collinearity, because
female + male = 1, so male is an exact linear function of female and the constant.

• Including dummy variables for both genders is the simplest example of the so-
called dummy variable trap, which arises when too many dummy variables
describe a given number of groups.
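The trap can be demonstrated numerically: with a constant in the model, adding both dummies makes the design matrix rank-deficient. This is a minimal sketch with made-up data (sample size, variables, and values are illustrative).

```python
import numpy as np

# Sketch of the dummy variable trap: with an intercept, including both
# female and male dummies makes the design matrix rank-deficient,
# because female + male = 1 = the constant column. Data are illustrative.
rng = np.random.default_rng(2)
n = 100
female = rng.integers(0, 2, size=n).astype(float)
male = 1.0 - female
educ = rng.normal(12, 2, size=n)

X_trap = np.column_stack([np.ones(n), female, male, educ])  # 4 columns
X_ok = np.column_stack([np.ones(n), female, educ])          # drop one dummy

print(np.linalg.matrix_rank(X_trap))  # 3 < 4: perfectly collinear
print(np.linalg.matrix_rank(X_ok))    # 3 of 3: full column rank
```

Dropping one dummy (choosing a base group) restores full column rank, which is exactly why one category must always be left out.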

• We have chosen males to be the base group or benchmark group, that is, the
group against which comparisons are made.

• Nothing much changes when more explanatory variables are involved.



• The negative intercept (the intercept for men, in this case) is not very
meaningful because no one in the sample has zero values for all of educ, exper, and
tenure.

• From the above regression, the sample average wage for males equals 7.099,
while for females it equals 7.099 − 2.512 = 4.587.

• A common specification in applied work has the dependent variable appearing
in logarithmic form, with one or more dummy variables appearing as
independent variables.

• How do we interpret the dummy variable coefficients in this case?

• Not surprisingly, the coefficients have a percentage interpretation.

• Using the data in HPRICE1.RAW, we obtain the equation

log(price)^ = −1.35 + 0.1678 log(lotsize) + 0.707 log(sqrft) + 0.027 bdrms + 0.054 colonial
              (0.651) (0.038)              (0.093)            (0.029)       (0.045)
n = 88, R² = 0.649

• colonial is a binary (dummy) variable equal to one if the house is of the
colonial style.

• What does the coefficient on colonial mean? For given levels of lotsize, sqrft,
and bdrms, the difference in log(price) between a house of colonial style and a
house of another style is 0.054.

• This means that a colonial-style house is predicted to sell for about 5.4%
more than its counterpart, holding other factors fixed.

However, the exact percentage difference can be computed. Since Δlog(price) = 0.054
when colonial changes from 0 to 1,

%Δprice = 100 · [exp(0.054) − 1] ≈ 5.5485.

So, on average, prices of colonial-style houses are about 5.5485% higher than their
counterparts (non-colonial houses).
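The exact-versus-approximate calculation is one line of arithmetic; the sketch below just computes both figures from the 0.054 coefficient quoted above.

```python
import math

# Exact vs. approximate percentage interpretation of a dummy coefficient
# in a log(price) equation, using the colonial coefficient from the text.
coef = 0.054
approx_pct = 100 * coef                   # approximate: 5.4%
exact_pct = 100 * (math.exp(coef) - 1.0)  # exact: about 5.5485%
print(approx_pct, round(exact_pct, 4))
```

For small coefficients the two answers are close; the gap widens as the coefficient grows, which is why the exact formula matters for large dummy effects.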

Interactions Involving Dummy Variables

• Just as variables with quantitative meaning can be interacted in regression
models, so can dummy variables.
• We can use several dummy independent variables in the same equation; for
example, we can add married to the wage equation.

• The coefficient on married gives the (approximate) proportional differential in
wages between those who are and are not married, holding gender, educ,
exper, and tenure fixed.


• When we estimate this model, the coefficient on married (with standard error
in parentheses) is .053 (.041), and the coefficient on female becomes −.290
(.036).

• Thus, the "marriage premium" is estimated to be about 5.3%, but it is not
statistically different from zero (t = 1.29).

• An important limitation of this model is that the marriage premium is
assumed to be the same for men and women; this is relaxed in the following
example.

• We define four categories based on marital status and gender: single male,
married male, single female, married female.

• In fact, we can recast that model by adding an interaction term between female
and married to the model in which female and married appear separately.

• This example shows explicitly that there is a statistically significant
interaction between gender and marital status.

• This model also allows us to obtain the estimated wage differential among all
four groups, but here we must be careful to plug in the correct combination of
zeros and ones.

• Setting female = 0 and married = 0 corresponds to the base group, single men,
since this eliminates female, married, and female·married.

• We can find the intercept for married men by setting female = 0 and married = 1
in the model; for married women by setting female = 1 and married = 1; and so on.

log(wage)^ = .321 − .110 female + .213 married − .301 female·married
             (.100) (.056)        (.055)         (.072)
           + .079 educ + .027 exper − .00054 exper² + .029 tenure − .00053 tenure²
             (.007)      (.005)       (.00011)        (.007)        (.00023)
n = 526, R² = .461
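The plugging-in of zeros and ones can be sketched as a small calculation. The coefficients below (female −.110, married .213, interaction −.301) are assumed textbook-style estimates used only to illustrate the bookkeeping; all differentials are relative to single men, the base group.

```python
# Sketch: predicted log(wage) differentials for the four gender/marital
# groups, plugging 0/1 combinations into an interaction model. The
# coefficients are assumed illustrative estimates.
b_female, b_married, b_interact = -0.110, 0.213, -0.301

def group_effect(female, married):
    """Log-wage differential relative to single men (the base group)."""
    return b_female * female + b_married * married + b_interact * female * married

effects = {
    "single men": group_effect(0, 0),     # 0.0 (base group)
    "married men": group_effect(0, 1),    # +0.213
    "single women": group_effect(1, 0),   # -0.110
    "married women": group_effect(1, 1),  # -0.110 + 0.213 - 0.301 = -0.198
}
print(effects)
```

The married-women differential is not simply the sum of the female and married effects; the interaction term shifts it, which is the whole point of including female·married.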

• Interacting dummy variables with explanatory variables that are not dummy
variables allows for a difference in slopes.

• Continuing with the wage example, suppose that we wish to test whether the
return to education is the same for men and women, allowing for a constant
wage differential between men and women (a differential for which we have
already found evidence).

• For simplicity, we include only education and gender in the model.

• What kind of model allows for different returns to education? Consider the model
log(wage) = β0 + δ0 female + β1 educ + δ1 female·educ + u.

• An important hypothesis is that the return to education is the same for women
and men.

• In terms of our previous model, this is stated as H0: δ1 = 0, which means that
the slope of log(wage) with respect to educ is the same for men and women.

• We are also interested in the hypothesis that average wages are identical for
men and women who have the same levels of education.

• This means that δ0 and δ1 must both be zero under the null hypothesis: H0: δ0 = 0, δ1 = 0.
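A joint hypothesis like H0: δ0 = 0, δ1 = 0 is tested with an F statistic comparing restricted and unrestricted sums of squared residuals. The sketch below uses simulated data in which H0 is true; the sample size, coefficients, and variable names are my assumptions, not estimates from the text.

```python
import numpy as np

# Sketch of the F test for H0: delta0 = 0 and delta1 = 0 (no gender
# intercept or slope difference), on simulated data. Illustrative only.
rng = np.random.default_rng(3)
n, q = 526, 2
female = rng.integers(0, 2, size=n).astype(float)
educ = rng.normal(12.5, 2.5, size=n)
lwage = 0.4 + 0.08 * educ + rng.normal(scale=0.4, size=n)  # H0 true here

def ssr(X, y):
    """Sum of squared residuals from OLS of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return r @ r

const = np.ones(n)
X_ur = np.column_stack([const, female, educ, female * educ])  # unrestricted
X_r = np.column_stack([const, educ])                          # restricted
k = X_ur.shape[1] - 1
F = ((ssr(X_r, lwage) - ssr(X_ur, lwage)) / q) / (ssr(X_ur, lwage) / (n - k - 1))
print(F)  # under H0, approximately F(2, 522); compare to its critical value
```

Dropping both female and female·educ imposes the two restrictions at once, so q = 2 and the statistic has (approximately) an F(2, n − k − 1) distribution under H0.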

• Incorporating Ordinal Information by Using Dummy Variables


• Suppose that we would like to estimate the effect of city credit ratings on the
municipal bond interest rate (MBR).

• Several financial companies rate the quality of debt for local governments,
where the ratings depend on things like probability of default. (Local
governments prefer lower interest rates in order to reduce their costs of
borrowing.)

• For simplicity, suppose that rankings range from zero to four, with zero being
the worst credit rating and four being the best.

• This is an example of an ordinal variable. Call this variable CR for concreteness.



• How do we incorporate the variable CR into a model to explain MBR?

• One possibility is to just include CR as we would include any other explanatory
variable: MBR = β0 + β1 CR + other factors.

• Then β1 is the percentage point change in MBR when CR increases by one unit,
holding other factors fixed.

• Unfortunately, it is rather hard to interpret a one-unit increase in CR.

• A better approach is to define dummy variables for each value of CR.

• Thus, let CR1 = 1 if CR = 1, and CR1 = 0 otherwise;

• CR2 = 1 if CR = 2, and CR2 = 0 otherwise;

• CR3 = 1 if CR = 3, and CR3 = 0 otherwise; and so on.


• Effectively, we take the single credit rating and turn it into five categories.
Then, we can estimate the model.

Consider an example of the salary structure of law schools using the lawsch85 data.

Instead of using the rank directly, assume that we categorize the rank into the following
six meaningful categories: top 10, ranks 11 to 25, ranks 26 to 40, ranks 41 to 60, ranks
61 to 100, and the remaining ranks.

So, we can generate five dummy variables (taking the last category as the benchmark)
and run the regression:
reg lsalary lsat gpa llibvol lcost top10 r11_r25 r26_r40 r41_r60 r61_r100

Or, we can generate a single categorical variable instead of the five dummies above and
run the regression as follows:
reg lsalary lsat gpa llibvol lcost i.scrank
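The dummy construction itself is mechanical and can be sketched outside Stata as well. The rank codes below are made up for illustration; the point is just that an ordinal variable with six categories becomes five indicator columns once a benchmark is chosen.

```python
import numpy as np

# Sketch: turning an ordinal rank category (coded 1..6) into indicator
# columns, mirroring the six law-school rank groups; category 6 (the
# last-rank group) is left out as the benchmark. Ranks are made up.
rng = np.random.default_rng(4)
scrank = rng.integers(1, 7, size=10)  # ordinal codes 1..6, illustrative

dummies = np.column_stack([(scrank == c).astype(float) for c in range(1, 6)])
print(scrank)
print(dummies)  # one 0/1 column per non-benchmark category
```

Each row has at most one 1 (and all zeros for the benchmark category), which is exactly what Stata's `i.` factor-variable notation generates internally.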
