SI 2021
Week 4, 28 September
Dr Syed K Abbas
Office Location: BS304
Email: Syed.Abbas@xjtlu.edu.cn
Plan
Last week, we looked at the Gauss–Markov assumptions and goodness of fit. We start with a recap question from that material:
• What is the marginal effect of experience on wage for a worker with 1 year and with 5 years of experience?
• Take the derivative with respect to experience and plug in the values 1 and 5 for experience. This yields 0.0321 and 0.0281.
• For workers with 1 and 5 years of experience, the marginal effects are thus estimated to be approximately 3.2% and 2.8%, respectively. You can observe that an extra year of experience increases wages at a decreasing rate.
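The calculation above can be sketched numerically. The quadratic-term coefficients below (0.0331 on experience, −0.0005 on experience squared) are hypothetical values, chosen only so that the two marginal effects match the figures quoted above:

```python
# Marginal effect in a quadratic specification:
#   log(wage) = b1 + b2*exper + b3*exper^2
#   => d E[log(wage)] / d exper = b2 + 2*b3*exper
# Hypothetical coefficients, chosen to reproduce the figures in the text.
b2, b3 = 0.0331, -0.0005

def marginal_effect(exper):
    """Derivative of the fitted equation with respect to experience."""
    return b2 + 2 * b3 * exper

print(round(marginal_effect(1), 4))  # 0.0321 (about 3.2%)
print(round(marginal_effect(5), 4))  # 0.0281 (about 2.8%)
```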
Interpreting the linear model
• The linear model $y_i = x_i'\beta + \varepsilon_i$ (3.1) has no meaning unless we make some assumptions about $\varepsilon_i$
• $\varepsilon_i$ has expectation zero and the $x_i$ are given. Commonly, we state that $E(\varepsilon_i \mid x_i) = 0$ (3.2)
• Under this assumption, the regression model describes the expected value of $y$ given $x$, that is, $E(y_i \mid x_i) = x_i'\beta$
• It answers: if we know $x$, what do we expect $y$ to be?
• The coefficient $\beta_k$ measures the expected change in $y_i$ if $x_{ik}$ changes by one unit while all other variables in $x_i$ do not change. That is: $\frac{\partial E(y_i \mid x_i)}{\partial x_{ik}} = \beta_k$
• The statement 'other variables in 𝒙! do not change' is a ceteris paribus
condition (other things equal). In multiple regressions single coefficients can
only be interpreted this way; strictly speaking we can only interpret 𝛽# if we
know which other variables are included
Ceteris paribus – other things equal
• If we are interested in the relationship between $y_i$ and $x_{ik}$, the other variables in $x_i$ act as control variables.
• For example, what is the impact of an earnings announcement upon a firm’s stock
price controlling for overall market movements?
• Sometimes, ceteris paribus is hard to maintain.
• For example, what is the impact of age upon a person’s wage, keeping years of
experience fixed?
• Sometimes, ceteris paribus is impossible, for example if the model includes both age
and age-squared: the effect of age cannot be measured keeping age squared constant.
• Example: the model includes $\beta_2\, age_i + \beta_3\, age_i^2$
• In this case, we can interpret the derivative: the marginal effect of changing age (ceteris paribus) is $\frac{\partial E(y_i \mid x_i)}{\partial age_i} = \beta_2 + 2\beta_3\, age_i$; it is the marginal effect of changing age if the other variables in $x_i$ are constant.
Elasticities
• Often, researchers are interested in elasticities
• Elasticities can be estimated directly from a linear model formulated in natural logs (excluding dummy variables): $\log y_i = (\log x_i)'\gamma + \nu_i$,
where $\log x_i$ is shorthand for a vector with elements $(1, \log x_{i2}, \ldots, \log x_{iK})'$. This is called a loglinear model; it is assumed that $E(\nu_i \mid \log x_i) = 0$
• In a loglinear model, the elasticity is $\frac{\partial E(y_i \mid x_i)}{\partial x_{ik}} \cdot \frac{x_{ik}}{E(y_i \mid x_i)} \approx \frac{\partial E(\log y_i \mid \log x_i)}{\partial \log x_{ik}} = \gamma_k$
• In a linear model, the elasticity is $\frac{\partial E(y_i \mid x_i)}{\partial x_{ik}} \cdot \frac{x_{ik}}{E(y_i \mid x_i)} = \beta_k \frac{x_{ik}}{x_i'\beta}$
• Thus, in a linear model elasticities are not constant but vary with $x_i$, while in a loglinear model we have constant elasticities.
Elasticities
• Thus:
• In a linear model, elasticity is coefficient times something; if you want to report
elasticity: specify where, e.g. at mean or median
• In a loglinear model, elasticity is the coefficient itself
• The choice of functional form is dictated by convenience of economic
interpretation in most cases
• In some cases a loglinear model may be preferred, e.g. when explaining $\log y_i$ rather than $y_i$ helps to reduce heteroskedasticity problems
• If a dummy variable is included in a loglinear model, its coefficient measures the expected relative change in $y_i$ due to an absolute (unit) change in $x_{ik}$
• It is possible to include some explanatory variables in logs and some in levels
• The interpretation of a coefficient for a level variable is then: the relative change in $y_i$ resulting from an absolute change in $x_{ik}$; a semi-elasticity
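A small numerical sketch of the contrast (all coefficients and data values are invented for illustration): in the linear model the elasticity $\beta_k x_{ik}/(x_i'\beta)$ differs across observations, so it must be reported at a stated point such as the sample mean.

```python
import numpy as np

# Hypothetical fitted linear model y = 2 + 0.5*x, with illustrative x values
beta = np.array([2.0, 0.5])
X = np.column_stack([np.ones(5), np.array([1.0, 2.0, 4.0, 8.0, 10.0])])

# Linear-model elasticity at each observation: beta_k * x_ik / (x_i' beta)
elasticities = beta[1] * X[:, 1] / (X @ beta)
print(elasticities)  # varies from observation to observation

# Evaluated at the sample mean of the regressors (a common reporting choice)
x_bar = X.mean(axis=0)
print(beta[1] * x_bar[1] / (x_bar @ beta))
```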
Let us look at some examples here.
Asymptotic properties of OLS
• If the assumptions (A2) and (A5) are violated, the properties of the OLS
estimator may differ from those reported above.
• In many cases, the exact properties are unknown.
• We employ asymptotic theory, which asks what happens if, hypothetically, the sample size grows infinitely large ($N \to \infty$)
• (A6) $\frac{1}{N}\sum_{i=1}^{N} x_i x_i'$ converges to a finite nonsingular matrix $\Sigma_{xx}$.
(Note that if $A^{-1}$ does not exist, $A$ is called a singular matrix.)
• (A7) $E(x_i \varepsilon_i) = 0$
$$\hat{V}\{b\} = s^2 (X'X)^{-1} \qquad (2.36)$$
where $s^2 = \frac{\sum_{i=1}^{N} e_i^2}{N - K}$ and $(X'X)^{-1} = \left( \sum_{i=1}^{N} x_i x_i' \right)^{-1}$, and
$$V\{b_2\} = \frac{s^2}{1 - r_{23}^2} \cdot \frac{1}{N} \left[ \frac{1}{N} \sum_{i=1}^{N} (x_{i2} - \bar{x}_2)^2 \right]^{-1} \qquad (2.37)$$
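A sketch of how the variance estimator (2.36) is computed in practice, on simulated data with hypothetical true coefficients (1, 2):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 200, 2
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=N)  # true beta = (1, 2)

b = np.linalg.solve(X.T @ X, X.T @ y)   # OLS estimate
e = y - X @ b                           # residuals
s2 = (e @ e) / (N - K)                  # s^2 = sum(e_i^2) / (N - K)
V_hat = s2 * np.linalg.inv(X.T @ X)     # Vhat{b} = s^2 (X'X)^{-1}, eq. (2.36)
se = np.sqrt(np.diag(V_hat))            # standard errors of b
print(b)
print(se)
```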
Asymptotic properties of OLS
• We use this to approximate the properties of our estimator in a given
sample. (In reality, sample sizes rarely grow.)
• Under assumptions (A6) and (A7) [which are weaker than the Gauss-
Markov assumptions (A1)-(A4)]: 𝑝𝑙𝑖𝑚 𝒃 = 𝜷
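A simulated sketch of consistency (the data-generating process, with true slope 1.5, is invented): the OLS estimate settles on the true value as $N$ grows.

```python
import numpy as np

rng = np.random.default_rng(42)
beta_true = 1.5

def ols_slope(N):
    x = rng.normal(size=N)
    y = beta_true * x + rng.normal(size=N)  # E[eps | x] = 0 by construction
    return (x @ y) / (x @ x)                # OLS slope (no intercept)

for N in (50, 5_000, 500_000):
    print(N, ols_slope(N))                  # drifts toward 1.5 as N grows
```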
Asymptotic properties of OLS
• For testing purposes, asymptotic distributions are important; it can be shown that under (A6) plus the Gauss–Markov assumptions (A1)–(A4): $\sqrt{N}(b - \beta) \to \mathcal{N}(0, \sigma^2 \Sigma_{xx}^{-1})$, where '$\to$' means 'is asymptotically distributed as'. The estimator is consistent and asymptotically normal (CAN).
• Note that $s^2 = \frac{\sum_{i=1}^{N} e_i^2}{N - K}$, $(X'X)^{-1} = \left( \sum_{i=1}^{N} x_i x_i' \right)^{-1}$, and
$V\{b_2\} = \frac{s^2}{1 - r_{23}^2} \cdot \frac{1}{N} \left[ \frac{1}{N} \sum_{i=1}^{N} (x_{i2} - \bar{x}_2)^2 \right]^{-1}$ (2.37)
Multicollinearity
• In general, there is nothing wrong with including variables in the
model that are correlated, for example
• experience and schooling
• age and experience
• inflation rate and nominal interest rate
• However, when correlations are high, it becomes hard to identify the
individual impact of each of the variables.
• Multicollinearity is used to describe the situation when an exact or
approximate linear relationship exists between the explanatory
variables (regressors).
• It is the problem when an approximate linear relationship among the
explanatory variables leads to unreliable regression estimates.
Multicollinearity
• The signs of multicollinearity are:
- High standard errors (low t-values)
- Strange signs or magnitudes of coefficients
• This could lead to misleading conclusions.
• The variance of $b_k$ is inflated if $x_k$ can be approximated by the other explanatory variables.
• The Variance Inflation Factor (VIF) can be used to detect multicollinearity: $VIF(b_k) = \frac{1}{1 - R_k^2}$, where $R_k^2$ is the squared multiple correlation coefficient between the $k$-th explanatory variable and the other explanatory variables; a VIF of 10 or more is usually considered 'high'
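A minimal VIF computation using only numpy (the data are simulated so that one regressor is nearly a linear function of the other, which drives the VIF far above 10):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500
x2 = rng.normal(size=N)
x3 = 0.95 * x2 + 0.1 * rng.normal(size=N)  # x3 is almost collinear with x2

def vif(target, others):
    """Regress target on a constant plus the other regressors; VIF = 1/(1 - R^2)."""
    Z = np.column_stack([np.ones(len(target))] + others)
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ coef
    tss = ((target - target.mean()) ** 2).sum()
    r2 = 1 - (resid @ resid) / tss
    return 1 / (1 - r2)

print(vif(x3, [x2]))  # far above the usual threshold of 10
```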
Multicollinearity
Exact multicollinearity arises when an exact linear relationship exists between the
explanatory variables. For example:
male = 1 – female
The natural solution is to drop one explanatory variable (or more than one, if necessary).
Some programs (e.g. Stata) do this automatically; other programs (e.g. EViews) give an error message ["near singular matrix"].
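A sketch of exact multicollinearity using the male/female example: including a constant plus both dummies makes the regressor matrix rank-deficient, so $X'X$ cannot be inverted.

```python
import numpy as np

female = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
male = 1.0 - female                          # exact linear relationship
X = np.column_stack([np.ones(5), female, male])

# The three columns satisfy constant = female + male, so the rank is 2, not 3,
# and X'X is singular.
print(np.linalg.matrix_rank(X))
```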
Missing observations, outliers, and prediction
Outliers
• In calculating the OLS estimator, some observations may have a
disproportional impact.
• An outlier is an observation that deviates markedly from the rest of the sample.
• It could be due to mistakes or problems in the data.
• In an illustrative scatter plot (not shown here), the estimated slope coefficient when the outlier is included is 0.52 (with a standard error of 0.18), and the $R^2$ is only 0.18.
• When the outlier is dropped, the estimated slope coefficient increases to 0.94 (with a standard error of 0.06), and the $R^2$ increases to 0.86.
• Approaches:
• investigate sensitivity of results
• test for the presence of outliers
• use robust estimation methods (LAD = least absolute deviation, estimates
conditional median rather than mean)
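A simulated sketch of this sensitivity (data and outlier are invented): a single aberrant observation can drag the OLS slope well away from the value fitted on the clean data.

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0.0, 10.0, 30)
y = x + 0.2 * rng.normal(size=30)   # clean data, true slope 1
y_out = y.copy()
y_out[-1] = -20.0                   # one gross outlier

def ols_slope(x, y):
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

print(ols_slope(x, y))      # close to 1
print(ols_slope(x, y_out))  # pulled far below 1 by the single outlier
```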
Missing observations, outliers, and prediction
Missing observations
• Missing observations are common, in particular in microeconomic data
• They need to be properly indicated in the dataset, so that the software does not treat them as 'zero'
• OLS estimator may be subject to sample selection bias
• One approach that uses the complete sample is to replace missing values by the sample average and augment the model with a missing-data indicator: but the estimates could still be biased
• Another approach is hot-deck imputation: missing values are replaced by random draws from the available observed values; but the missing observations could be non-random, so this is not advised
Missing observations, outliers, and prediction
Prediction
One of the goals for the econometrician is to make predictions, after having produced
the coefficient estimates and corresponding standard errors.
This means, we are interested in predicting the value of the dependent variable at a
given value for the explanatory variables.
The unbiased predictor $\hat{y}_i$ can be computed using the estimated coefficients $b$ for a given value $x_i$ of the regressors: $\hat{y}_i = x_i'b$; this is the predicted value of the dependent variable at $x_i$ (use the 'predict' command in Stata).
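A minimal sketch of the prediction step $\hat{y} = x_0'b$ (the data are simulated and the true coefficients 2 and 0.5 are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100
X = np.column_stack([np.ones(N), rng.uniform(0.0, 10.0, size=N)])
y = X @ np.array([2.0, 0.5]) + rng.normal(size=N)

b = np.linalg.solve(X.T @ X, X.T @ y)  # OLS coefficients
x0 = np.array([1.0, 4.0])              # predict at x = 4 (with constant term)
print(x0 @ b)                          # roughly 2 + 0.5*4 = 4
```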
6.3 Model selection and misspecification
Model Specification
In any econometric investigation, choice of the model is one of the first steps.
When an irrelevant variable is included, i.e. we estimate model 1 while model 2 is the truth, this is less of a problem. The main disadvantage is that we now include irrelevant information when estimating $\beta$; this increases the variance, meaning $\beta$ is estimated less accurately; in this case it is better to estimate $\beta$ from the restricted model 2.
Model selection and misspecification
Thus; including irrelevant variables increases the variance of the estimators for the
other model parameters.
It is also possible that a chosen model may have important variables omitted.
Our economic principles may have overlooked a variable, or lack of data may lead us to drop a variable even when it is
prescribed by economic theory.
On the other hand, including too few variables carries the danger of biased estimates.
• Omitting WEDU from Eq. 6.21 leads us to overstate the effect of an extra year of education for the husband by about $2,000
• Omission of a relevant variable (defined as one whose coefficient is nonzero) leads to an estimator that is biased.
• This bias is known as omitted-variable bias.
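A simulated sketch of omitted-variable bias (the data-generating process is invented): when the relevant regressor x3 is dropped, the coefficient on x2 absorbs part of its effect.

```python
import numpy as np

rng = np.random.default_rng(11)
N = 100_000
x2 = rng.normal(size=N)
x3 = 0.8 * x2 + rng.normal(size=N)          # relevant and correlated with x2
y = 1.0 + 1.0 * x2 + 1.0 * x3 + rng.normal(size=N)

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

full = ols(np.column_stack([np.ones(N), x2, x3]), y)   # correct model
short = ols(np.column_stack([np.ones(N), x2]), y)      # x3 omitted

print(full[1])   # close to the true value 1
print(short[1])  # close to 1 + 1*0.8 = 1.8: biased upward
```

The size of the bias equals the omitted coefficient times the slope from regressing x3 on x2, here 1 × 0.8.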
6.3 Model selection and misspecification
Model Specification
$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + e$ (Eq. 6.22)
Notice that the coefficient estimates for HEDU and WEDU have not
changed a great deal. This outcome occurs because KL6 is not
highly correlated with the education variables.
6.3 Model selection and misspecification
Model Specification
You may think that a good strategy is to include as many variables as possible in
your model.
Doing so will not only complicate your model unnecessarily but may also
inflate the variances of your estimates because of the presence of irrelevant
variables.
6.3 Model Specification
The inclusion of irrelevant variables has reduced the precision of the estimated
coefficients for other variables in the equation.
Model selection and misspecification
1. Choose variables and a functional form on the basis of your theoretical and general understanding of the relationship.
2. If an estimated equation has coefficients with unexpected signs or unrealistic magnitudes, these could be caused by a misspecification such as the omission of an important variable.
3. One method for assessing whether a variable or a group of variables should be included in an equation is to perform significance tests.
4. The adequacy of a model can be tested using a general specification test known as RESET (coming soon).
Some tests for model selection
• These tests help to detect whether we have a model that violates the assumptions of the multiple regression model
We will extend this topic next week and look at general specification tests such as the RESET test (Chapter 3).