
Introduction to Econometrics

 Econometrics is used in all applied fields of economics to test economic theories, to conduct policy analysis, and to
forecast economic time series
 An econometric model is derived from a formal economic model or from informal economic reasoning
and intuition
 The goal is to estimate the parameters in the model and to test hypotheses about these
parameters
 The data structures include cross-sectional, time series, pooled cross-sectional, and panel data
 Time series and panel data require special treatment because of correlation across time and because of
trend and seasonal components
 Because of the non-experimental nature of most data in the social sciences, uncovering causal
relationships is very challenging.
 The Gauss-Markov assumptions are:
 Linearity in parameters
 Random sampling
 Sample variation in the explanatory variables
 Zero conditional mean
 Homoskedasticity
 Under these assumptions, OLS is the best linear unbiased estimator (BLUE)
 A larger error variance increases the variance of the slope estimator, while greater sample variation in the
independent variable decreases it (see the sketch below)
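
A minimal sketch with simulated data (all names and numbers are illustrative) of the simple OLS slope and the usual estimate of its variance, showing how the error variance and the sample variation in x enter:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=n)   # true slope = 2

x_bar, y_bar = x.mean(), y.mean()
sst_x = np.sum((x - x_bar) ** 2)                    # sample variation in x
b1 = np.sum((x - x_bar) * (y - y_bar)) / sst_x      # OLS slope
b0 = y_bar - b1 * x_bar                             # OLS intercept

resid = y - b0 - b1 * x
sigma2_hat = np.sum(resid ** 2) / (n - 2)           # unbiased estimate of the error variance
var_b1 = sigma2_hat / sst_x                         # larger sigma2 -> larger variance;
                                                    # larger SST_x -> smaller variance
print(b1, np.sqrt(var_b1))
```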
Multiple Regression Model

 The multiple regression model (MRM) allows us to effectively hold other factors fixed while examining the effect of a
particular independent variable on the dependent variable. It explicitly allows the
independent variables to be correlated with one another
 It can be used to model nonlinear relationships by appropriately choosing the dependent
and independent variables
 OLS can be applied to the MRM to estimate the parameters. Each slope parameter measures the
partial effect of the corresponding independent variable on the dependent variable, holding
all other independent variables fixed (a sketch follows)
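
A minimal sketch with simulated data (the wage, educ, and exper names are illustrative, not taken from the notes) of a multiple regression estimated by OLS with statsmodels:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500
educ = rng.normal(12, 2, n)
exper = rng.normal(10, 5, n)
wage = 1 + 0.5 * educ + 0.2 * exper + rng.normal(size=n)
df = pd.DataFrame({"wage": wage, "educ": educ, "exper": exper})

res = smf.ols("wage ~ educ + exper", data=df).fit()
print(res.params)   # each slope is a partial effect, holding the other regressor fixed
print(res.bse)      # standard errors
```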
 Including an irrelevant variable in the model has no effect on the unbiasedness of the
intercept and the other slope estimators. However, it can increase the variances of the
estimators because of multicollinearity.
 Omitting a relevant variable causes OLS to be biased.
 As the collinearity between two independent variables approaches one, the variance of the
estimated parameters becomes unbounded.
 Assuming the error term is normally distributed, the OLS estimators are normally distributed
 The t statistic is used to test hypotheses about a single parameter against one- or two-sided
alternatives. The p-value, the smallest significance level at which the null hypothesis can be
rejected, lets us test hypotheses at any significance level
 The F statistic is used to test multiple exclusion restrictions. The F statistic for the overall
significance of a regression tests the null hypothesis that all slope parameters are zero (see the sketch below)
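
Continuing the illustrative wage example above (it reuses df and the fitted model), a sketch of how t statistics, p-values, and F tests are read off a statsmodels result:

```python
import statsmodels.formula.api as smf

res = smf.ols("wage ~ educ + exper", data=df).fit()
print(res.tvalues)                              # t statistic for each coefficient
print(res.pvalues)                              # smallest significance level at which H0: beta_j = 0 is rejected
print(res.fvalue, res.f_pvalue)                 # overall F test: all slope parameters are zero
print(res.f_test("(educ = 0), (exper = 0)"))    # F test of joint exclusion restrictions
```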
 The LM statistic can be used to test exclusion restrictions in large samples
 The first four GM assumptions imply that OLS is consistent.
 In large samples, OLS inference can be applied even when the dependent variable is far from
normally distributed.
 A change in the units of measurement of an independent variable changes the OLS coefficient in the
expected manner: if x is multiplied by c, its coefficient is divided by c. If the dependent variable is
multiplied by c, all OLS coefficients are multiplied by c.
 In a standardized OLS regression, Beta coefficients measure the effects of the independent variables on the
dependent variable in standard deviation units
 When a logarithmic transformation is used, the coefficients have a percentage-change interpretation, and the
units of measurement have no effect.
 Logs are typically used for positive variables with a lot of variation. Logs are not used for variables
measured in years or as percentages.
 Models with log(y) as the dependent variable often more closely satisfy the classical linear
model assumptions: the model has a better chance of being linear, homoskedasticity is more likely to hold,
and normality is more plausible.
 In many cases, taking logs greatly reduces the variation of a variable, making OLS estimates
less prone to outlier influence. However, when y is a fraction close to zero for many
observations, log(y) can have much more variability than y.
 For large changes in an explanatory variable, we can compute a more accurate estimate of
the percentage-change effect (see the sketch below)
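
A sketch of the exact versus approximate percentage-change calculation in a log(y) model, using an assumed coefficient value:

```python
import numpy as np

b1 = 0.30          # assumed coefficient on x in a log(y) regression
dx = 2.0           # a "large" change in x
approx = 100 * b1 * dx                 # approximate percentage change in y
exact = 100 * (np.exp(b1 * dx) - 1)    # exact percentage change in y
print(approx, exact)                   # 60.0 vs. about 82.2
```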
 When we want to take logs but some observations have y = 0, log(1 + y) can be used
 A quadratic in an explanatory variable allows for an increasing or diminishing effect. The
turning point is easily calculated (see below).
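
A sketch of the turning-point calculation for a quadratic, with assumed coefficient estimates:

```python
# For y = b0 + b1*x + b2*x**2, the effect of x changes sign at x* = -b1 / (2*b2).
b1, b2 = 0.30, -0.006        # assumed estimates; b2 < 0 gives a diminishing effect
x_star = -b1 / (2 * b2)
print(x_star)                # turning point, here 25
```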
 Interaction terms allow the partial effect of an explanatory variable, say x1, to depend on the level of
another variable, say x2, and vice versa.
 Interpreting models with interactions can be tricky. The coefficient on x1, say b1, measures the
partial effect of x1 on y when x2 = 0, which may be impossible or uninteresting. Centering x1 and x2
around interesting values before constructing the interaction term typically leads to an equation whose
coefficients are easier to interpret
 The adjusted R-squared penalizes the number of regressors and can fall when an independent variable is added.
It is used for choosing between non-nested models with different numbers of explanatory variables.
 Residual analysis can be used to determine whether particular members of the sample have predicted
values that are well above or well below the actual outcomes.
 Dummy variable regression is used to capture the effect of qualitative information. If there are g
groups, then g − 1 dummy variables are included in the model. The coefficients on the dummy variables are
interpreted relative to the base or benchmark group (see the sketch below).
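
A minimal sketch of dummy variable regression on simulated data (the group names are illustrative); C() in the statsmodels formula creates the g − 1 dummies and omits the base group:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 300
educ = rng.normal(12, 2, n)
group = rng.choice(["north", "south", "west"], size=n)     # 3 groups -> 2 dummies
wage = 1 + 0.5 * educ + 0.8 * (group == "south") + rng.normal(size=n)
dfg = pd.DataFrame({"wage": wage, "educ": educ, "group": group})

res_g = smf.ols("wage ~ educ + C(group)", data=dfg).fit()
print(res_g.params)   # C(group)[T.south] and C(group)[T.west] are measured relative to the base group "north"
```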
 For ordinal variables, a set of dummy variables representing the different outcomes of the ordinal
variable is used.
 Dummy variables can be interacted with quantitative variables to allow slope differences across
different groups.
 The Chow test can be used to detect whether there are any differences across groups. Allowing for
intercept differences, a standard F test can be used to test whether the slopes for two different
groups are the same (see the sketch below).
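
A minimal sketch of a Chow test on simulated data: compare the pooled SSR with the sum of the SSRs from separate regressions for the two groups.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=n)
g = rng.integers(0, 2, size=n)                    # group indicator
y = 1 + 0.5 * x + 0.4 * g * x + rng.normal(size=n)
dat = pd.DataFrame({"y": y, "x": x, "g": g})

ssr_pooled = smf.ols("y ~ x", data=dat).fit().ssr
ssr_1 = smf.ols("y ~ x", data=dat[dat.g == 0]).fit().ssr
ssr_2 = smf.ols("y ~ x", data=dat[dat.g == 1]).fit().ssr

k1 = 2                                            # parameters per group (intercept + slope)
F = ((ssr_pooled - (ssr_1 + ssr_2)) / k1) / ((ssr_1 + ssr_2) / (n - 2 * k1))
p = stats.f.sf(F, k1, n - 2 * k1)
print(F, p)                                       # small p-value: the two groups differ
```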
 The linear probability model, which is simply estimated by OLS, allows us to explain a binary
response using regression analysis. The OLS estimates are interpreted as changes in the probability of
success (y = 1). Its drawbacks are heteroskedasticity and predicted probabilities that can fall outside the unit interval.
 Heteroskedasticity does not cause bias or inconsistency in the OLS estimators, but the usual
standard errors and test statistics are no longer valid. Heteroskedasticity-robust standard errors can
be computed.
 The Breusch-Pagan test and the White test are used to test for heteroskedasticity (see the sketch below).
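
A sketch of the Breusch-Pagan test using statsmodels, applied to the fitted model res from the illustrative wage example above:

```python
from statsmodels.stats.diagnostic import het_breuschpagan

# res is assumed to be a fitted OLS results object as in the earlier sketches
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, res.model.exog)
print(lm_pvalue, f_pvalue)   # small p-values point to heteroskedasticity
```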
 OLS is no longer the best linear unbiased estimator in the presence of heteroscedasticity.
 When the form of heteroskedasticity is known, generalized least squares (GLS) estimation can be used.
This leads to weighted least squares as a means of obtaining the BLUE estimator.
 The test statistics from the WLS estimation are either exactly valid when the error term is
normally distributed or asymptotically valid under non-normality.
 More commonly, we must estimate a model for the heteroskedasticity before applying WLS. The
resulting feasible GLS estimator is no longer unbiased, but it is consistent and asymptotically
efficient, and its test statistics are asymptotically valid (see the sketch below).
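
A sketch of the feasible GLS procedure on the illustrative wage example from above: estimate the variance function from the log of the squared residuals, then apply WLS with the inverse of the fitted variance as weights.

```python
import numpy as np
import statsmodels.formula.api as smf

# df is the illustrative wage/educ/exper data frame from the earlier sketch
res = smf.ols("wage ~ educ + exper", data=df).fit()
df["log_u2"] = np.log(res.resid ** 2)                      # log of squared OLS residuals
aux = smf.ols("log_u2 ~ educ + exper", data=df).fit()      # model for the heteroskedasticity
h_hat = np.exp(aux.fittedvalues)                           # fitted variance function

fgls = smf.wls("wage ~ educ + exper", data=df, weights=1.0 / h_hat).fit()
print(fgls.params)
```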
 A misspecified functional form makes the estimated equation difficult to interpret.
 An incorrect functional form can be detected by adding quadratics, computing RESET (see the sketch below), or
testing against a non-nested alternative model using the Davidson-MacKinnon test.
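
A sketch of RESET computed by hand on the illustrative wage example: add powers of the fitted values and test their joint significance.

```python
import statsmodels.formula.api as smf

# res and df reuse the illustrative wage/educ/exper setup from the earlier sketches
df["yhat2"] = res.fittedvalues ** 2
df["yhat3"] = res.fittedvalues ** 3
reset = smf.ols("wage ~ educ + exper + yhat2 + yhat3", data=df).fit()
print(reset.f_test("(yhat2 = 0), (yhat3 = 0)"))   # rejection suggests functional form problems
```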
 Using a proxy variable is one way of addressing an omitted variable problem. One possibility is to
use the dependent variable from a prior year as a proxy.
 Measurement error in the dependent variable does not bias OLS, although it does increase the
error variance. Under the classical errors-in-variables (CEV) assumptions for an
independent variable, the OLS estimator of the coefficient on the mismeasured variable is
biased towards zero (attenuation bias).
 Non-random samples from an underlying population can lead to biases in OLS.
 When sample selection is correlated with the error term u, OLS is generally biased and
inconsistent.
 Exogenous sample selection, which is either based on the explanatory variables or is
otherwise independent of u, does not cause problems for OLS
 Outliers in data sets can have large impacts on the OLS estimates, especially in small
samples. Least Absolute Deviation estimation is an alternative to OLS that is less sensitive
to outliers and that delivers consistent estimates.
 For time series data, the assumption of no serial correlation in the errors is added.
 Trends and seasonality can be easily handled in a multiple regression framework by
including a time trend and seasonal dummy variables in the regression equation
 OLS can be justified using asymptotic analysis, provided time series processes are
stationary and weakly dependent.
 Processes with deterministic trends that are weakly dependent can be used directly in
regression analysis, provided time trends are included in the model.
 When a time series is highly persistent, we generally need to use first differences of the variables
 When models have complete dynamics (no further lags of any variable are needed), the errors
will be serially uncorrelated
 The presence of serial correlation causes the usual OLS standard errors and test statistics to be
misleading. Typically, the OLS standard errors understate the true uncertainty in the
parameter estimates.
 A popular model of serial correlation is the AR(1) model. An asymptotically valid t statistic
is obtained by regressing the OLS residuals on the lagged residuals, assuming the
regressors are strictly exogenous and a homoskedasticity assumption holds (see the sketch below).
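
A minimal sketch of the AR(1) serial correlation test on a simulated residual series: regress the residuals on their first lag and look at the t statistic.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
T = 200
e = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + e[t]          # simulated AR(1) "residuals" for illustration

resid_df = pd.DataFrame({"u": u, "u_lag": np.r_[np.nan, u[:-1]]}).dropna()
ar1 = smf.ols("u ~ u_lag", data=resid_df).fit()
print(ar1.tvalues["u_lag"], ar1.pvalues["u_lag"])   # significant -> AR(1) serial correlation
```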
 For models with a lagged dependent variable or other non-strictly exogenous regressors,
the standard t test on the lagged residual is still valid, provided all independent variables are included as
regressors along with the lagged residual. An F or an LM statistic can be used to test for higher-order serial
correlation.
 In models with strictly exogenous regressors, we can use a feasible GLS procedure, such as
Cochrane-Orcutt or Prais-Winsten, to correct for AR(1) serial correlation. The FGLS
estimates are obtained from OLS on quasi-differenced variables.
 Another way to deal with serial correlation, especially when the strict exogeneity
assumption might fail, is to use OLS but to compute serial correlation-robust standard errors
(which are also robust to heteroskedasticity), as in the sketch below. Heteroskedasticity itself can be detected and
addressed in the same way as mentioned before.
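
A sketch of serial correlation-robust (Newey-West / HAC) standard errors with statsmodels, on a simulated time series regression:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
T = 200
x = rng.normal(size=T)
e = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + e[t]          # serially correlated errors
y = 1 + 0.8 * x + u

res_hac = sm.OLS(y, sm.add_constant(x)).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print(res_hac.bse)                        # HAC (Newey-West) standard errors
```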
 OLS using pooled data is the leading method of estimation, and the usual inference procedures
are available, including corrections for heteroskedasticity. Different time intercepts can also be
allowed.
 We might also interact time dummies with certain key variables to see how they have changed
over time.
 Panel data sets are most useful when controlling for time-constant unobserved features – of
people, firms, cities, and so on – which we think might be correlated with the explanatory
variables in the model.
 For two-period panel data, standard OLS analysis on the differenced data can be used, and the usual
inference procedures are asymptotically valid under homoskedasticity.
 For more than two time periods, pooled OLS can be applied to the differenced data; we assume homoskedasticity
and serially uncorrelated differenced errors in order to apply the usual t and F statistics.
 Compared with first differencing, the fixed effects estimator is more efficient when the
idiosyncratic errors are serially uncorrelated (as well as homoskedastic); a sketch of the within transformation follows.
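
A minimal sketch of the fixed effects (within) estimator on a simulated panel: demean y and x within each unit so the time-constant unobserved effect drops out.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(5)
ids = np.repeat(np.arange(50), 4)                     # 50 units, 4 periods
a_i = np.repeat(rng.normal(size=50), 4)               # unobserved time-constant effect
x = a_i + rng.normal(size=200)                        # x correlated with a_i
y = 1 + 0.7 * x + a_i + rng.normal(size=200)
panel = pd.DataFrame({"id": ids, "y": y, "x": x})

# Within transformation: subtract unit-specific means, which removes a_i
demeaned = panel[["y", "x"]] - panel.groupby("id")[["y", "x"]].transform("mean")
fe = sm.OLS(demeaned["y"], demeaned[["x"]]).fit()      # no constant after demeaning
print(fe.params)                                       # close to the true 0.7
# (standard errors here ignore the degrees-of-freedom correction for the within transformation)
```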
 Fixed effects methods apply immediately to unbalanced panels, but we must assume that the
reason some time periods are missing is not systematically related to the idiosyncratic errors.
 The random effects estimator is appropriate when the unobserved effect is thought to be
uncorrelated with all the explanatory variables. In this case a_i can be left in the error term, and
the resulting serial correlation over time can be handled by generalized least squares estimation.
 Random effects estimation corrects the serial correlation problem and produces asymptotically
efficient estimators, provided the unobserved effect has zero mean given values of the
explanatory variable in all time periods.
 Feasible GLS can be obtained by a pooled regression on quasi-demeaned data.
 The method of instrumental variables is a way to estimate the parameters in a linear model
consistently when one or more explanatory variables are endogenous.
 An instrumental variable must be (a) exogenous, that is, uncorrelated with the error term of the
structural equation, and (b) partially correlated with the endogenous explanatory variable
 The method of two stage least squares (2SLS), which allows for more instrumental variables than
explanatory variables, is used routinely in the empirical social sciences (see the sketch below)
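
A minimal sketch of 2SLS computed in two explicit stages on simulated data. Note that the second-stage OLS standard errors shown here are not the correct 2SLS standard errors; packaged IV routines handle that.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 1000
z = rng.normal(size=n)                         # instrument: exogenous and relevant
v = rng.normal(size=n)
x = 0.8 * z + v                                # endogenous regressor
u = 0.5 * v + rng.normal(size=n)               # error correlated with x through v
y = 1 + 2.0 * x + u

first = sm.OLS(x, sm.add_constant(z)).fit()    # first stage: x on z
x_hat = first.fittedvalues
second = sm.OLS(y, sm.add_constant(x_hat)).fit()   # second stage: y on fitted x
print(second.params)                           # slope close to 2, unlike plain OLS of y on x
```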
 When instruments are poor, then 2SLS can be worse than OLS.
 When we have valid instrumental variables, we can test whether an explanatory variable is
endogenous.
 Heteroskedasticity and serial correlation can be tested for and dealt with using methods
similar to those for models with exogenous explanatory variables.
 Given a full Simultaneous Equations System, we are able to determine which equations
can be estimated (identified)
 OLS estimation of an equation that contains an endogenous explanatory variable generally
produces biased and inconsistent estimators.
 2SLS can be used to estimate any identified equation in a system.
 SEMs can be applied to time series data as well. As with OLS, we must be aware of
trending and integrated processes when applying 2SLS
 SEM applications with panel data are very powerful, as they allow us to control for
unobserved heterogeneity while dealing with simultaneity.
 Logit and probit models are used for binary response variables. Their advantages are that the fitted probabilities lie between
zero and one and that the partial effects diminish rather than remaining constant. However, their coefficients are harder to interpret (see the sketch below).
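
A minimal sketch of logit and probit estimation on simulated binary data with statsmodels; get_margeff() reports average partial effects, which are easier to interpret than the raw coefficients.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 1000
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-(0.5 + 1.2 * x)))       # true response probability
y = rng.binomial(1, p)
X = sm.add_constant(x)

logit = sm.Logit(y, X).fit(disp=0)
probit = sm.Probit(y, X).fit(disp=0)
print(logit.get_margeff().summary())          # average partial effect of x
print(probit.get_margeff().summary())
```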
 The Tobit model is applicable to nonnegative outcomes that pile up at zero but also take on a broad range of
positive values. As with logit and probit, the expected values of y given x depend on x and β in nonlinear ways.
 When the dependent variable is a count variable, a Poisson regression model is appropriate. The parameters are interpreted as
semi-elasticities or elasticities, depending on whether x is in level or log form. The standard errors should be adjusted for over- or under-dispersion.
 Censored and truncated regression models handle specific kinds of missing data problems. In a censored regression model, the
dependent variable is censored above or below a threshold.
 A truncated regression model arises when a part of the population is excluded entirely, a special case of sample
selection problem.
 Exogenous sample selection does not affect the consistency of OLS when it is applied to the selected subsample, but endogenous
sample selection does.
 One can test and correct for sample selection bias for the general problem of incidental truncation.
 In the case of time series, if a series has a unit root, the usual large-sample normal approximations are no longer valid. Also, a unit
root process has the property that an innovation has a long-lasting effect.
 The Dickey-Fuller t test and the augmented Dickey-Fuller (ADF) test are used for testing for unit roots.
 We can allow for a linear time trend when testing for unit roots (see the sketch below).
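
A sketch of the augmented Dickey-Fuller test with statsmodels on a simulated random walk; regression="ct" adds a linear time trend.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(8)
yt = np.cumsum(rng.normal(size=300))                    # random walk: has a unit root
stat, pvalue, usedlag, nobs, crit, icbest = adfuller(yt, regression="c")
print(stat, pvalue)                                     # large p-value: cannot reject a unit root
stat_t, pvalue_t, *_ = adfuller(yt, regression="ct")    # allow for a linear time trend
print(stat_t, pvalue_t)
```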
 When two non-stationary series are regressed on each other, there is a serious concern about spurious regression.
 When series are cointegrated, a regression involving I(1) variables is not spurious: a linear combination of the two I(1) variables is I(0).
A Dickey-Fuller test on the residuals can be used to test for cointegration.
 Cointegration between y_t and x_t implies that error correction terms may appear in a model relating Δy_t to Δx_t; the error correction
terms are lags of y_t − βx_t, where β is the cointegrating parameter.
 Univariate models such as ARIMA can be used for forecasting time series. Dynamic regression models, including autoregressions and
vector autoregression (VAR) models, are used routinely.
 Forecasting trending and I(1) series requires special care. Processes with deterministic trends can be forecast by including a time
trend in the regression model, possibly with lags of the variables
 A typical approach to forecasting an I(1) process is to forecast the change in the process and then add that forecast
to the current level of the variable (see the sketch below).
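
A minimal sketch of forecasting an I(1) series on simulated data: model the first difference, forecast the change, and add it back to the last observed level.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
y = np.cumsum(0.2 + rng.normal(size=300))        # I(1) series with drift
dy = np.diff(y)

X = sm.add_constant(dy[:-1])                     # regress the change on its own lag
res = sm.OLS(dy[1:], X).fit()
dy_hat = res.params[0] + res.params[1] * dy[-1]  # one-step-ahead forecast of the change
y_hat = y[-1] + dy_hat                           # forecast of the level
print(y_hat)
```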
Thank You
