

Ecotrix

B.A. Economics (Hons.) (University of Delhi)




A cross-sectional data set consists of a sample of individuals, households, firms, cities, states, countries, or a variety of other units, taken at a given point in time. A time series data set consists of observations on a variable or several variables over time. Some data sets have both cross-sectional and time series features. For example, suppose that two cross-sectional household surveys are taken in the United States, one in 1985 and one in 1990. In 1985, a random sample of households is surveyed for variables such as income, savings, family size, and so on. In 1990, a new random sample of households is taken using the same survey questions. To increase our sample size, we can form a pooled cross section by combining the two years. A panel data (or longitudinal data) set consists of a time series for each cross-sectional member in the data set. The key feature of panel data that distinguishes them from a pooled cross section is that the same cross-sectional units (individuals, firms, or counties in the preceding examples) are followed over a given time period.
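A minimal sketch (toy numbers, purely illustrative, with hypothetical household ids) contrasting the two structures in pandas: a pooled cross section draws different households in each year, while a panel follows the same households over time.

import pandas as pd

# Pooled cross section: different households sampled in 1985 and 1990
survey_1985 = pd.DataFrame({"year": 1985, "household": [1, 2, 3], "income": [21, 34, 18]})
survey_1990 = pd.DataFrame({"year": 1990, "household": [4, 5, 6], "income": [30, 41, 25]})
pooled = pd.concat([survey_1985, survey_1990], ignore_index=True)

# Panel (longitudinal) data: the SAME households observed in both years
panel = pd.DataFrame({
    "household": [1, 1, 2, 2, 3, 3],
    "year":      [1985, 1990, 1985, 1990, 1985, 1990],
    "income":    [21, 27, 34, 39, 18, 22],
})
print(pooled)
print(panel)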

Always keep in mind that regression does not necessarily imply causation. Causality
must be justified, or inferred, from the theory that underlies the phenomenon that is
tested empirically.
Regression analysis may have one of the following objectives:
1. To estimate the mean, or average, value of the dependent variable, given the values
of the independent variables.
2. To test hypotheses about the nature of the dependence—hypotheses suggested by
the underlying economic theory.
3. To predict, or forecast, the mean value of the dependent variable, given the value(s)
of the independent variable(s) beyond the sample range.
4. One or more of the preceding objectives combined.

The population regression line (PRL) gives the average, or mean, value of the dependent variable corresponding to each value of the independent variable in the population as a whole.

THE NATURE OF THE STOCHASTIC ERROR TERM


1. The error term may represent the influence of those variables that are not explicitly
included in the model.
2. Human behavior, after all, is not totally predictable or rational. Thus, u may reflect this
inherent
randomness in human behavior.
3. u may also represent errors of measurement.
4. The principle of Ockham’s razor—that descriptions be kept as simple as possible until
proved inadequate—would suggest that we keep our regression model as simple as
possible.

ei = the estimator of ui. We call ei the residual term, or simply the residual.
Conceptually, it is analogous
to ui and can be regarded as the estimator of the latter. It is introduced in the SRF for the
same reasons as ui was introduced in the PRF. Simply stated, ei represents the difference
between the actual Y values and their estimated values from the sample regression function (SRF).

r has the same sign as the slope coefficient.

The lower the p value, the greater the evidence against the null hypothesis; the higher the p value, the lower the chance of rejecting the null hypothesis.

The Jarque-Bera (JB) statistic, JB = n[S^2/6 + (K - 3)^2/24] (S = skewness, K = kurtosis of the residuals), follows the chi-square distribution with 2 d.f. asymptotically.
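A small numerical sketch (simulated residuals, assumed purely for illustration) of the JB normality test:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
e = rng.normal(size=200)              # stand-in for regression residuals

n = len(e)
S = stats.skew(e)                     # sample skewness
K = stats.kurtosis(e, fisher=False)   # sample kurtosis (normal distribution = 3)
jb = n * (S**2 / 6 + (K - 3)**2 / 24)
p_value = 1 - stats.chi2.cdf(jb, df=2)   # asymptotic chi-square with 2 d.f.
print(jb, p_value)                        # low p-value => reject normality of the errors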


 Correlation analysis measures the degree of linear association between two variables.
 Stochastic variables have probability distributions.
 Linearity means linear in the parameters.
 No multicollinearity.
 Random sampling.
 Fixed X values, or X values independent of the error term.
 Zero mean of the disturbance, which also means no specification bias.
 Homoscedasticity.
 No serial or auto correlation.
 n > number of parameters.
 Var(X) > 0; no outliers.
 Standard error of the regression (sigma cap) = sqrt( sum of ei^2 / (n - 2) ).
 Cov(b1, b2) = - X bar * var(b2).
 For linearity we need nonstochastic X; for unbiasedness we need a zero-mean error term, random sampling, no multicollinearity, and linearity in the parameters.
 Var(b2) = sigma^2 / sum of xi^2 (xi in deviations from the mean); see the sketch after this list.
 For minimum variance we need no autocorrelation and homoscedasticity, in addition to linearity and unbiasedness.
 As n increases, var(b2) tends to zero, which means consistency.
 If the ui are not normal, the estimators are still BLUE.
 For a joint test, set up H0: R^2 = 0 and use the F statistic.
 bxy * byx = r^2.
 If bxy and byx are both positive, their arithmetic mean is >= r (and hence >= r^2), since AM >= GM.
 The regressions of Y on X and of X on Y intersect at the means of X and Y.
 If r = 0, the two regression lines (Y on X and X on Y) are perpendicular to each other.
 If r = 1 or -1, the two regression lines coincide.
 R1.23 is the multiple correlation coefficient of variable 1 on variables 2 and 3 jointly. The multiple correlation coefficient is always greater than or equal to any total (simple) correlation coefficient (r12, r13, r23).
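A minimal numpy sketch (simulated data, assumed purely for illustration) verifying the two-variable OLS formulas listed above:

import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(1, 10, size=50)
Y = 2.0 + 0.5 * X + rng.normal(scale=1.0, size=50)   # hypothetical data-generating process

n = len(Y)
x = X - X.mean()                      # deviations from the mean
y = Y - Y.mean()

b2 = (x * y).sum() / (x**2).sum()     # slope
b1 = Y.mean() - b2 * X.mean()         # intercept
e = Y - (b1 + b2 * X)                 # residuals

sigma2_hat = (e**2).sum() / (n - 2)          # estimated error variance
se_reg = np.sqrt(sigma2_hat)                 # standard error of the regression
var_b2 = sigma2_hat / (x**2).sum()           # Var(b2) = sigma^2 / sum xi^2
cov_b1_b2 = -X.mean() * var_b2               # Cov(b1, b2) = -X bar * Var(b2)
r2 = 1 - (e**2).sum() / (y**2).sum()         # coefficient of determination

print(b1, b2, se_reg, var_b2, cov_b1_b2, r2)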


 An important property of R2 is that the larger the number of explanatory variables in a model, the higher the R2 will be. It would then seem that if we want to explain a substantial amount of the variation in a dependent variable, we merely have to go on adding more explanatory variables! Comparing the R2 values of two models with the same dependent variable but with differing numbers of explanatory variables is essentially like comparing apples and oranges.
 The common practice is to add variables as long as the adjusted R2 increases (even though its numerical value may be smaller than the unadjusted R2). But when does adjusted R2 increase? It can be shown that adjusted R2 will increase if the |t| value of the coefficient of the added variable is larger than 1, where the t value is computed under the null hypothesis that the population value of that coefficient is zero.
 The adjusted and unadjusted R2s are identical only when the unadjusted R2 is equal to 1.
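As a quick check of the relation between the two measures (k = number of estimated parameters including the intercept; the numbers below are made up for illustration):

def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k)."""
    return 1 - (1 - r2) * (n - 1) / (n - k)

print(adjusted_r2(r2=0.90, n=50, k=3))   # about 0.896, slightly below the unadjusted 0.90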

 The d.f. of the total sum of squares (TSS) are always n-1 regardless of the number of
explanatory variables included in the model.
 The assumptions made by the classical linear regression model (CLRM) are not necessary to compute the OLS estimators.
 It is not true that, in the two-variable PRF, b2 is likely to be a more accurate estimate of B2 if the disturbances ui follow the normal distribution.
 The PRF gives the mean value of the dependent variable corresponding to each value of the independent variable. A linear regression model means a model linear in the parameters.
 To compare two models on the basis of goodness of fit, the dependent variable and the sample size n should be the same.
 A regression on standardized variables tells us the relative effect of each independent variable on the dependent variable directly through its slope coefficient.
 If the correlation between Y and X2 is zero, the partial correlation between Y and X2 holding X3 constant may still be nonzero.
 Polynomial regressions do not violate the no-multicollinearity assumption; the total cost function is an example.
 The precision of the OLS estimators is measured by their standard errors, and goodness of fit by r2.
 OLS estimators have minimum variance in the entire class of unbiased estimators, whether linear or not, but for this result normality (of ui) is necessary.
 Under normality, the maximum likelihood and OLS estimators of the regression coefficients coincide.
 The interpretation of a 95% CI is that, in 95 out of 100 cases, intervals constructed in this way will contain the true B. We cannot say that the probability is 95% that a particular computed interval contains the true B, because that interval is fixed and no longer random; the probability that B lies in it is either 0 or 1. This interpretation applies to confidence intervals for regression coefficients.
 To compare the R2 values of two models, the dependent variable must be in the same form.


 For the linear-in-variables (LIV) model, the slope coefficient is constant but the elasticity coefficient is variable, whereas for the log-log model the elasticity is constant but the slope coefficient is variable.
 Because the elasticity coefficient of the linear model changes from point to point while that of the log-linear model is the same at all points, the elasticity coefficient of the linear model is often computed at the sample mean values of X and Y to obtain a measure of average elasticity. If we add the elasticity coefficients (say, of a log-log production function), we obtain an economically important parameter, the returns to scale parameter, which gives the response of output to a proportional change in inputs. If the sum of the two elasticity coefficients is 1, we have constant returns to scale (doubling the inputs simultaneously doubles the output); if it is greater than 1, we have increasing returns to scale (doubling the inputs more than doubles the output); if it is less than 1, we have decreasing returns to scale (doubling the inputs less than doubles the output). A short sketch follows this bullet.
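A minimal sketch (simulated Cobb-Douglas style data, all numbers assumed for illustration) of a log-log regression whose slope coefficients are the input elasticities; their sum is the returns-to-scale parameter:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
L = rng.uniform(1, 100, 200)                       # labour input (hypothetical)
K = rng.uniform(1, 100, 200)                       # capital input (hypothetical)
Q = 3.0 * L**0.6 * K**0.4 * np.exp(rng.normal(scale=0.05, size=200))

X = sm.add_constant(np.column_stack([np.log(L), np.log(K)]))
res = sm.OLS(np.log(Q), X).fit()

elasticities = res.params[1:]            # elasticity of output w.r.t. each input
print(elasticities, elasticities.sum())  # sum close to 1 => roughly constant returns to scale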

 LOG-LIN MODEL is the growth model: ln Y = B1 + B2*t + u, where B2 is the instantaneous rate of growth (the compound rate of growth is e^B2 - 1).
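A small sketch (hypothetical series, for illustration only) estimating the growth model and converting the instantaneous rate into the compound rate:

import numpy as np
import statsmodels.api as sm

t = np.arange(1, 21)                               # time trend (e.g. years)
Y = 100 * np.exp(0.05 * t) * np.exp(np.random.default_rng(3).normal(scale=0.01, size=20))

res = sm.OLS(np.log(Y), sm.add_constant(t)).fit()
b2 = res.params[1]
print("instantaneous growth rate:", b2)            # about 0.05, i.e. 5% per period
print("compound growth rate:", np.exp(b2) - 1)     # e^b2 - 1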

 Reciprocal models (figure): (a) the average fixed cost (AFC) curve, (b) the Engel expenditure curve, (c) the Phillips curve.
 This model is nonlinear in X because X enters inversely (reciprocally), but it is a linear regression model because the parameters are linear. The salient feature of this model is that as X increases indefinitely, the term 1/X approaches zero and Y approaches the limiting or asymptotic value B1. Therefore, models of this kind have built into them an asymptote, or limit value, that the dependent variable will take as the value of the X variable increases indefinitely.
 In Figure (a) if we let Y stand for the average fixed cost (AFC) of production, that
is, the total fixed cost divided by the output, and X for the output, then as
economic theory shows, AFC declines continuously as the output increases
(because the fixed cost is spread over a larger number of units) and eventually
becomes asymptotic at level B1.
 An important application of Figure (b) is the Engel expenditure curve which
relates a consumer’s expenditure on a commodity to his or her total expenditure
or income. If Y denotes expenditure on a commodity and X the total income, then
certain commodities have these features: (1) There is some critical or threshold
level of income below which the commodity is not purchased (e.g., an
automobile). this threshold level of income is at the level −(B2/B1). (2) There is a
satiety level of consumption beyond which the consumer will not go no matter
how high the income (even millionaires do not generally own more than two or
three cars at a time). This level is nothing but the asymptote B1. For such
commodities, the reciprocal model of this figure is the most appropriate.
 Figure (c) is the celebrated Phillips curve of macroeconomics. Based on British data on the percent rate of change of money wages (Y) and the unemployment rate (X) in percent, Phillips obtained a curve similar to Figure (c). As this figure shows, there is asymmetry in the response of wage changes to the level of unemployment: wages rise faster for a unit change in unemployment when the unemployment rate is below UN (which economists call the natural rate of unemployment) than they fall for an equivalent change when the unemployment rate is above the natural level, with B1 indicating the asymptotic floor for wage changes. This particular feature of the Phillips curve may be due to institutional factors, such as union bargaining power, minimum wages, or unemployment insurance.

 POLYNOMIAL REGRESSION MODELS: used, for example, for the total cost function (a cubic polynomial in output).

 Regression through the origin: examples are Okun's law and the CAPM.


 First, in the model without the intercept, we use raw sums of squares and cross
products, whereas in the intercept-present model, we use mean-adjusted sums of
squares and cross products.
 Second, the d.f. in computing sigma cap sq is n-1 now rather than n-2 , since we
have only one unknown.
 Third, the conventionally computed r2 formula we have used thus far explicitly
assumes that the model has an intercept term. Therefore, you should not use that
formula. If you use it, sometimes you will get nonsensical results because the
computed r2 may turn out to be negative.
 Finally, for the model that includes the intercept, the sum of the estimated
residuals, is always zero, but this need not be the case for a model without the
intercept term. For all these reasons, one may use the zero-intercept model only if
there is strong theoretical reason for it, as in Okun’s law or some areas of
economics and finance.


 If the X values are multiplied by w and the Y values by v, then b2 is scaled by the factor v/w, b1 by the factor v, and sigma cap squared by v^2. Keep the following in mind while scaling: First, the r2 and the t values are unchanged. Second, when Y and X are measured in the same units of measurement, the slope coefficient and its standard error remain the same, although the intercept value and its standard error are different. Third, when the Y and X variables are measured in different units of measurement, the slope coefficients are different, but the interpretation does not change.
 Standardized regression is regression through the origin.


 The interpretation is that if the (standardized) regressor increases by one standard deviation, the average value of the (standardized) regressand increases by B2* standard deviation units. The advantage of using standardized variables is that standardization puts all variables on an equal footing, because all standardized variables have zero means and unit variances. This helps us find which regressor has more effect on the regressand by looking at the coefficients: the higher the (absolute) coefficient, the bigger the effect. A short sketch follows.
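A minimal sketch (simulated data, assumed for illustration) comparing ordinary and standardized (beta) coefficients; with z-scored variables the regression passes through the origin and the slopes are directly comparable across regressors:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x2 = rng.normal(size=300)
x3 = rng.normal(scale=10.0, size=300)                 # deliberately on a different scale
y = 1.0 + 2.0 * x2 + 0.1 * x3 + rng.normal(size=300)

def z(v):
    return (v - v.mean()) / v.std()

raw = sm.OLS(y, sm.add_constant(np.column_stack([x2, x3]))).fit()
std = sm.OLS(z(y), np.column_stack([z(x2), z(x3)])).fit()   # no intercept needed

print(raw.params)   # intercept, b2, b3 in original units
print(std.params)   # beta coefficients, comparable across regressors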


 The log-inverse model, ln Y = B1 - B2(1/X) + u, is based on the microeconomic returns-to-a-factor diagram: Y first increases at an increasing rate and then at a decreasing rate.
 In the log-log model the (multiplicative) error term ui is assumed to follow the lognormal distribution, with mean e^(sigma^2/2) and variance e^(sigma^2) * [e^(sigma^2) - 1].

 Restricted least squares: e.g., impose H0: B2 = B3 = 0 and compare the restricted and unrestricted regressions with an F test.

 Interpretation of R2 (or ESS): the proportion (percentage) of the variation in the dependent variable explained by variation in the independent variable(s), i.e., by the regression.
 Regression models that contain only dummy explanatory variables are called analysis-of-variance (ANOVA) models. Since dummy variables generally take on values of 1 or 0, they are nonstochastic; that is, their values are fixed. And since we have assumed all along that our X variables are fixed in repeated sampling, the fact that one or more of these X variables are dummies does not create any special problems insofar as estimation of the model is concerned. In short, dummy explanatory variables do not pose any new estimation problems, and we can use the customary OLS method to estimate the parameters of models that contain dummy explanatory variables. A regression on an intercept and a dummy variable is a simple way of finding out whether the mean values of two groups differ. If the dummy coefficient B2 is statistically significant (at the chosen level of significance), we say that the two means are statistically different; if it is not statistically significant, we say that the two means are not statistically different.
 The general rule is: If a model has the common intercept, B1, and if a qualitative
variable has m categories, introduce only (m - 1) dummy variables. If this rule is
not followed, we will fall into what is known as the dummy variable trap, that is,
the situation of perfect collinearity or multicollinearity, if there is more than
one perfect relationship among the variables. Another way to resolve the perfect
collinearity problem is to keep as many dummies as the number of categories but
to drop the common intercept term, B1, from the model; that is, run the
regression through the origin.
 Regression models containing a combination of quantitative and qualitative
variables are called analysis-of-covariance (ANCOVA) models.

 With an intercept (additive) dummy only, the two regression lines differ in their intercepts but their slopes are the same; in other words, the two regression lines are parallel.

 With interaction dummies, find the reference category first and then compare the other categories against it.
 number of dummies for each qualitative variable is one less than the number of
categories of that variable.
 Comparing two or more regressions in this way is popularly known as the Chow test; it is basically used for detecting structural change. A short sketch follows.
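A minimal sketch (simulated two-period data, assumed for illustration) of the Chow test: F = [(RSS_pooled - (RSS1 + RSS2))/k] / [(RSS1 + RSS2)/(n1 + n2 - 2k)], with k parameters per regression:

import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(5)
x1 = rng.uniform(0, 10, 40)
y1 = 1.0 + 0.5 * x1 + rng.normal(size=40)    # period 1
x2 = rng.uniform(0, 10, 40)
y2 = 3.0 + 1.2 * x2 + rng.normal(size=40)    # period 2 (intercept and slope shifted)

def rss(y, x):
    return sm.OLS(y, sm.add_constant(x)).fit().ssr   # residual sum of squares

rss1, rss2 = rss(y1, x1), rss(y2, x2)
rss_pooled = rss(np.concatenate([y1, y2]), np.concatenate([x1, x2]))

k = 2                                        # intercept and slope
n1, n2 = len(y1), len(y2)
F = ((rss_pooled - (rss1 + rss2)) / k) / ((rss1 + rss2) / (n1 + n2 - 2 * k))
p = 1 - stats.f.cdf(F, k, n1 + n2 - 2 * k)
print(F, p)                                  # small p => evidence of structural change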

 WHAT HAPPENS IF THE DEPENDENT VARIABLE IS ALSO A DUMMY VARIABLE? THE LINEAR PROBABILITY MODEL (LPM). We now interpret the slope coefficient B2 as the change in the probability that Y = 1 when X changes by a unit. The estimated Yi value is the predicted probability that Y equals 1, and b2 is an estimate of B2.
 There are some problems with this model. First, although Y takes a value of 0 or 1,
there is no guarantee that the estimated Y values will necessarily lie between 0
and 1. In an application, some can turn out to be negative and some can exceed
1. Second, since Y is binary, the error term is also binary. This means that we
cannot assume that ui follows a normal distribution. Rather, it follows the
binomial probability distribution. Third, it can be shown that the error term is
heteroscedastic; so far we are working under the assumption that the error term
is homoscedastic. Fourth, since Y takes only two values, 0 and 1, the
conventionally computed R2 value is not particularly meaningful. Of course, not all
these problems are insurmountable. For example, we know that if the sample size
is reasonably large, the binomial distribution converges to the normal distribution.
we can find ways to get around the heteroscedasticity problem. So the problem
that remains is that some of the estimated Y values can be negative and some
can exceed 1. In practice, if an estimated Y value is negative it is taken as zero,
and if it exceeds 1, it is taken as 1. This may be convenient in practice if we do
not have too many negative values or too many values that exceed 1. But the
major problem with LPM is that it assumes the probability changes linearly with
the X value; that is, the incremental effect of X remains constant throughout (a short LPM sketch follows this list).
 In the model Yi = B1 + B2Di + ui, letting Di take the values of (0, 2) instead of (0, 1) will
halve the value of B2 and its standard error but not change the t value.
 When dummy variables are used, ordinary least squares (OLS) estimators are still
unbiased.
 Spline function / piecewise linear regression: Y = a + bX + c(X - X*)D + u, where D = 1 if X > X* and D = 0 if X <= X*.
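A minimal LPM sketch (simulated binary outcome, all numbers assumed): fit by OLS, read the slope as a change in probability, and clip fitted values to [0, 1] as suggested above.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
income = rng.uniform(0, 100, 500)
owns_house = (rng.uniform(size=500) < 0.01 * income).astype(float)   # binary Y (0/1)

res = sm.OLS(owns_house, sm.add_constant(income)).fit()
print(res.params)                          # b2 ~ change in P(Y = 1) per unit of income

p_hat = res.fittedvalues
p_hat_clipped = np.clip(p_hat, 0.0, 1.0)   # negative fits -> 0, fits above 1 -> 1
print(p_hat.min(), p_hat.max(), p_hat_clipped.min(), p_hat_clipped.max())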


 Semilogarithmic regression with a dummy: ln Y = a + bD + u; if D = 0, ln Y = a, and if D = 1, ln Y = a + b. The antilog of a gives the median (not the mean) of Y for the reference category, so the interpretation changes here from mean to median.
 In a log-log model the dummy is assigned the values 1 and 10 rather than 0 and 1 (the log of zero is not defined).
 in cases of perfect linear relationship or perfect multicollinearity among
explanatory variables, we cannot obtain unique estimates of all parameters. And
since we cannot obtain their unique estimates, we cannot draw any statistical
inferences (i.e., hypothesis testing) about them from a given sample.
 THEORETICAL CONSEQUENCES OF MULTICOLLINEARITY:
It is interesting that so long as collinearity is not perfect, the OLS estimators still remain BLUE, even though one or more of the partial regression coefficients in a multiple regression can be individually statistically insignificant. It is true that even in the presence of near collinearity the OLS estimators are unbiased, but remember that unbiasedness is a repeated-sampling property, and minimum variance does not mean the numerical value of the variance will be small. Multicollinearity is essentially a sample (regression) phenomenon in the sense that even if the X variables are not linearly related in the population, they may be so related in the particular sample at hand.
 PRACTICAL CONSEQUENCES OF MULTICOLLINEARITY:
 Large variances and standard errors of OLS estimators
 Wider confidence intervals
 Insignificant t ratios
 A high R2 value but few significant t ratios
 OLS estimators and their standard errors become very sensitive to small
changes in the data; that is, they tend to be unstable.
 Wrong signs for regression coefficients.
 Difficulty in assessing the individual contributions of explanatory variables
to the explained sum of squares (ESS) or R2.
 If the goal of the study is to use the model to predict or forecast the future mean
value of the dependent variable, collinearity per se may not be bad. On the other
hand, if the objective of the study is not only prediction but also reliable
estimation of the individual parameters of the chosen model, then serious
collinearity may be bad, because we have seen that it leads to large standard
errors of the estimators.

 DETECTION OF MULTICOLLINEARITY
 High R2 but few significant t ratios
 High pairwise correlations among explanatory variables
 Examination of partial correlations.
 Subsidiary, or auxiliary, regressions. If this regression shows that we have
a problem of multicollinearity because, say, the R2 is high but very few X
coefficients are individually statistically significant, we then look for the
“culprit,” the variable(s) that may be a perfect or near perfect linear
combination of the other X’s. We proceed as follows: (a) Regress X2 on the
remaining X’s and obtain the coefficient of determination, (b) Regress X3
on the remaining X’s and obtain its coefficient of determination, Continue
this procedure for the remaining X variables in the model. For example, we may want to test the hypothesis that X2 is not collinear with the remaining X's.

 The variance inflation factor (VIF). If the R2 of the auxiliary regression is zero, that is, there is no collinearity, the VIF is 1. A high auxiliary R2 is neither necessary nor sufficient to get high standard errors, so multicollinearity by itself need not cause high standard errors. Suppose the R2 in an auxiliary regression is very high (but less than 1), suggesting a high degree of collinearity. The variance of, say, b2 depends not only on the VIF but also on the variance of ui and on the variation in X2. Thus it is quite possible that an auxiliary R2 is very high, say 0.91, but that either the variance of ui is low or the variation in X2 is high, or both, so that the variance of b2 can still be low and the t ratio high. In other words, a high R2 can be counterbalanced by a low error variance or a high variation in X2, or both. Of course, the terms high and low are used in a relative sense. (Tolerance = 1/VIF.) A small sketch computing VIFs follows.
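A minimal sketch (simulated collinear regressors, all numbers assumed) computing each VIF from its auxiliary regression, VIF_j = 1 / (1 - R_j^2):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
x2 = rng.normal(size=100)
x3 = 0.9 * x2 + rng.normal(scale=0.3, size=100)    # deliberately collinear with x2
x4 = rng.normal(size=100)
X = np.column_stack([x2, x3, x4])

for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    r2_aux = sm.OLS(X[:, j], sm.add_constant(others)).fit().rsquared
    vif = 1.0 / (1.0 - r2_aux)
    print(f"X{j + 2}: auxiliary R^2 = {r2_aux:.3f}, VIF = {vif:.2f}, tolerance = {1 / vif:.3f}")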

 REMEDIAL MEASURES
 Dropping a variable(s) from the model: a sure remedy, but the least preferred.
 Acquiring additional data or a new sample: if the sample size increases, the variation in X3 will generally increase, as a result of which the variance of b3 will tend to decrease, and with it the standard error of b3.
 Prior Information about Some Parameters
 Rethinking the Model
 Transformation of Variables
 multicollinearity is often encountered in times series data because economic
variables tend to move with the business cycle. Here information from cross-
sectional studies might be used to estimate one or more parameters in the
models based on time series data.
 Perfect collinearity between variables X1, X2, X3 means a*X1 + b*X2 + c*X3 = 0, where a, b, c are not all equal to zero simultaneously.
 Micronumerosity means n is slightly greater than k
 heteroscedasticity is usually found in crosssectional data and not in time
series data.1 In cross-sectional data we generally deal with members of a
population at a given point in time, such as individual consumers or their
families; firms; industries; or geographical subdivisions, such as a state,
county, or city. Moreover, these members may be of different sizes, such as
small, medium, or large firms, or low, medium, or high income. In other words,
there may be some scale effect.
 Heteroscedasticity may also result from outliers in the sample, omission of relevant variables, skewness in the distribution of the regressors, a wrong functional form, or an incorrect transformation of variables.
 CONSEQUENCES OF HETEROSCEDASTICITY
 OLS estimators are still linear and unbiased, but they no longer have minimum variance; that is, they are no longer efficient. In short, OLS estimators are no longer BLUE in small as well as in large samples (i.e., asymptotically), although they may still be consistent.
 The usual formulas to estimate the variances of OLS estimators are
generally biased. A priori we cannot tell whether the bias will be positive
(upward bias) or negative (downward bias). A positive bias occurs if OLS
overestimates the true variances of estimators, and a negative bias occurs
if OLS underestimates the true variances of estimators. The bias arises because the conventional estimator of the error variance is no longer unbiased for the true error variance.
 in the presence of heteroscedasticity, the usual hypothesis-testing routine
is not reliable, raising the possibility of drawing misleading conclusions.
 DETECTION OF HETEROSCEDASTICITY


 Graphical examination of residuals: if the residual plots look like panels (b) to (e) of the accompanying figure (i.e., show a systematic pattern), there is a possibility that heteroscedasticity is present in the data.
 heteroscedasticity is generally expected if small-, medium-, and large-sized
firms are sampled together. Similarly, in a cross-sectional study of the average
cost of production in relation to the output, heteroscedasticity is likely to be
found if small-, medium-, and large-sized firms are included in the sample.
 Park Test
It is a two-stage procedure. Instead of regressing the log of the squared residuals on the log of the X variable(s), you can also regress the squared residuals on the X variable, especially if some of the X values are negative. The Park test involves the following steps:
1. Run the original regression despite the heteroscedasticity problem, if any.
2. From this regression, obtain the residuals ei, square them, and take their logs.
3. Regress ln ei^2 on an explanatory variable of the original model; if there is more than one explanatory variable, run the regression against each X variable. Alternatively, run it against Y cap, the estimated Y.
4. Test the null hypothesis that B2 = 0, that is, that there is no heteroscedasticity. If a statistically significant relationship exists between ln ei^2 and ln Xi, the null hypothesis of no heteroscedasticity can be rejected.
5. If the null hypothesis is not rejected, then B1 in the Park regression can be interpreted as giving the value of the common, or homoscedastic, variance.
A weakness of the test is that the error term vi of the Park regression may itself be heteroscedastic.
 White's General Heteroscedasticity Test
White's test proceeds as follows:
1. First estimate the regression by OLS and obtain the residuals ei.
2. Then run the auxiliary regression of ei^2 on the original regressors, their squares, and their cross products; vi is the residual term of this auxiliary regression.
3. Obtain the R2 value from the auxiliary regression. Under the null hypothesis of no heteroscedasticity, n*R2 asymptotically follows the chi-square distribution with d.f. equal to the number of regressors (excluding the constant) in the auxiliary regression.
This test can also be used as a test of specification bias. (A sketch using statsmodels appears after this list of tests.)
 Glejser Test
Glejser regresses the absolute value of the residuals on the X variable (in various functional forms) and has maintained that in large samples such models are fairly good at detecting heteroscedasticity; therefore Glejser's test can be used as a diagnostic tool in large samples. The null hypothesis in each case is that there is no heteroscedasticity, that is, B2 = 0. If this hypothesis is rejected, there is probably evidence of heteroscedasticity.

 Goldfeld-Quandt test
Order the observations by Xi, omit the c central observations, and split the remaining (n - c) observations into two groups. Fit the regression separately to each group, find the RSS of both, and form lambda = RSS2 / RSS1, which follows the F distribution with (n - c - 2k)/2 degrees of freedom in both the numerator and the denominator.
Note: There is no general test of heteroscedasticity that is free of any assumption about which variable the error term is correlated with.
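A minimal sketch (simulated heteroscedastic data, assumptions in comments) of White's general test via statsmodels; het_white runs the auxiliary regression on the regressors, their squares, and cross products and reports n*R^2 against the chi-square distribution:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(8)
x = rng.uniform(1, 10, 200)
u = rng.normal(scale=0.5 * x, size=200)        # error s.d. grows with x => heteroscedastic
y = 2.0 + 0.8 * x + u

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(res.resid, X)
print(lm_stat, lm_pvalue)                       # small p-value => reject homoscedasticity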

 REMEDIAL MEASURES
 When sigma_i^2 is known: the method of weighted least squares (WLS); the estimators so obtained are still BLUE.
 When the true sigma_i^2 is unknown:
Case 1: The error variance is proportional to Xi (the square-root transformation, i.e., divide the model through by sqrt(Xi)). Note that to estimate the transformed model we must use the regression-through-the-origin estimating procedure.
Case 2: The error variance is proportional to Xi^2 (divide the model through by Xi).
 Respecification of the model: instead of speculating about sigma_i^2, sometimes a respecification of the PRF (choosing a different functional form) can reduce heteroscedasticity.
 However, if the sample size is reasonably large, we can use White's procedure to obtain heteroscedasticity-corrected, "robust," standard errors. A small sketch follows.
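Two hedged sketches (same kind of simulated data as above; the assumed error-variance form is noted in comments): WLS when the error variance is taken to be proportional to X^2, and White (HC) robust standard errors when the form of heteroscedasticity is left unspecified.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
x = rng.uniform(1, 10, 200)
y = 2.0 + 0.8 * x + rng.normal(scale=0.5 * x, size=200)   # var(u) assumed proportional to x^2
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()
wls = sm.WLS(y, X, weights=1.0 / x**2).fit()    # weights = 1 / sigma_i^2 (assumed ~ x^2)
robust = sm.OLS(y, X).fit(cov_type="HC1")       # White heteroscedasticity-robust SEs

print(ols.bse)      # conventional (unreliable) standard errors
print(wls.bse)      # WLS standard errors
print(robust.bse)   # robust standard errors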
 The term autocorrelation can be defined as "correlation between members of observations ordered in time (as in time series data) or space (as in cross-sectional data)." Just as heteroscedasticity is generally associated with cross-sectional data, autocorrelation is usually associated with time series data (i.e., data ordered in temporal sequence), although, as the preceding definition suggests, autocorrelation can occur in cross-sectional data also, in which case it is called spatial correlation (i.e., correlation in space rather than in time).


Figures (a) to (d) show a distinct pattern among the u's, while Figure (e) shows no systematic pattern.
 There are several reasons for autocorrelation
 Inertia: it takes time to adopt changes.
 Model Specification Error(s): by incorrect specification of a model we mean either that some important variables that should be included in the model are not included (the case of underspecification) or that the model has the wrong functional form (a linear-in-variables (LIV) model is fitted whereas a log-linear model should have been fitted). If such model specification errors occur, the residuals from the incorrect model will exhibit a systematic pattern.
 The Cobweb Phenomenon
 Data Manipulation
 autocorrelation can be positive as well as negative, although economic time
series generally exhibit positive autocorrelation

 CONSEQUENCES OF AUTOCORRELATION:
 The least squares estimators are still linear and unbiased. But they are not
efficient; that is, they do not have minimum variance. The estimated
variances of OLS estimators are biased. Sometimes the usual formulas to compute the variances and standard errors of OLS estimators seriously underestimate the true variances and standard errors, thereby inflating t values. This gives the appearance that a particular coefficient is statistically significantly different from zero, whereas in fact that might not be the case.
 Therefore, the usual t and F tests are not generally reliable. The usual formula to compute the error variance, namely sigma cap squared = RSS/d.f., is a biased estimator of the true sigma squared, and in some cases it is likely to underestimate the latter. As a consequence, the conventionally computed R2 may be an unreliable measure of the true R2 (it may overestimate it).
 The conventionally computed variances and standard errors of forecast
may also be inefficient

 DETECTING AUTOCORRELATION
 The Graphical Method
 The Durbin-Watson d Test
It is very important to note the assumptions underlying the d statistic:
1. The regression model includes an intercept term; therefore the test cannot be used to determine autocorrelation in models of regression through the origin. The ui are normally distributed.
2. The X variables are nonstochastic; that is, their values are fixed in repeated sampling. There are no missing observations in the time series.
3. The disturbances ut are generated by the mechanism ut = p*u(t-1) + vt, which states that the value of the disturbance, or error, term at time t depends on its value in time period (t - 1) and a purely random term vt; the extent of the dependence on the past value is measured by p (rho), called the coefficient of autocorrelation. This mechanism is known as the Markov first-order autoregressive scheme, or simply the AR(1) scheme.
4. The regression does not contain lagged value(s) of the dependent variable as explanatory variables. In other words, the test is not applicable to models such as Yt = B1 + B2*Xt + B3*Y(t-1) + ut, where Y(t-1) is the one-period lagged value of the dependent variable Y. Models of this kind are known as autoregressive models: a regression of a variable on itself lagged as one of the explanatory variables. (A small d-statistic sketch follows.)
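A minimal sketch (simulated AR(1) errors, all numbers assumed) computing d = sum((e_t - e_{t-1})^2) / sum(e_t^2) and the rough rule p cap = 1 - d/2:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(10)
n = 100
x = rng.uniform(0, 10, n)
u = np.zeros(n)
for t in range(1, n):                       # AR(1) errors: u_t = 0.7*u_{t-1} + v_t
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
e = res.resid
d = np.sum(np.diff(e)**2) / np.sum(e**2)
print(d, durbin_watson(e))                  # the two calculations agree
print("rho estimated from d:", 1 - d / 2)   # rough estimate of the AR(1) coefficient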

 The Breusch-Godfrey (BG) Test:


This test is general in that it allows for (1) stochastic regressors, such as the
lagged values of the dependent variables, (2) higher-order autoregressive
schemes, such as AR(1), AR(2), etc., and (3) simple or higher-order moving
averages of the purely random error terms.


 Run the original regression and obtain the residuals et. Now run the auxiliary regression: regress the residual at time t on the original regressors, including the intercept, and on the lagged values of the residuals up to time (t - p). Obtain the R2 of this auxiliary regression. Under the null hypothesis that all the coefficients of the lagged residual terms are simultaneously equal to zero, it can be shown that, in large samples, the product of the sample size and this R2 follows the chi-square distribution with degrees of freedom equal to the number of lagged residual terms (p). In the econometrics literature the BG test is known as the Lagrange multiplier (LM) test. Keep in mind that the BG test is of general applicability, whereas the Durbin-Watson test assumes only first-order serial correlation. (A statsmodels sketch follows.)
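A minimal sketch (simulated AR(1) data, assumed for illustration) of the BG/LM test via statsmodels:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(11)
n = 100
x = rng.uniform(0, 10, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + rng.normal()    # AR(1) errors
y = 1.0 + 0.5 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(res, nlags=2)
print(lm_stat, lm_pvalue)                   # small p-value => serial correlation present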

 HOW TO ESTIMATE p:
 p estimated from the Durbin-Watson d statistic: p cap is approximately 1 - d/2.
 p estimated by OLS: regress the residual et on its own lagged value e(t-1).
 In applied econometrics one assumption that has been used extensively is that p = 1; that is, the error terms are perfectly positively autocorrelated, which may be true of some economic time series. In that case the generalized difference equation reduces to the first-difference method. Note that the first-difference model has no intercept.
 REMEDIAL MEASURES:


Once p is estimated, apply OLS to the transformed variables Y*t = Yt - p*Y(t-1) and X*t = Xt - p*X(t-1) (the generalized difference equations); the estimators thus obtained have the desirable BLUE property. In this differencing procedure we lose one observation, because the first sample observation has no antecedent. To avoid this loss of one observation, the first observation of Y and X is transformed as Y*1 = sqrt(1 - p^2)*Y1 and X*1 = sqrt(1 - p^2)*X1. This is known as the Prais-Winsten transformation. (A small sketch follows.)
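A minimal sketch (simulated AR(1) data as above; the residual-based rho estimate is one simple choice) of the generalized-difference / Prais-Winsten transformation:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)
n = 100
x = rng.uniform(0, 10, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + u

e = sm.OLS(y, sm.add_constant(x)).fit().resid
rho = sm.OLS(e[1:], e[:-1]).fit().params[0]        # rho from regression of e_t on e_{t-1}

# Generalized differences for t = 2..n, plus the Prais-Winsten first observation
y_star = np.r_[np.sqrt(1 - rho**2) * y[0], y[1:] - rho * y[:-1]]
x_star = np.r_[np.sqrt(1 - rho**2) * x[0], x[1:] - rho * x[:-1]]
const_star = np.r_[np.sqrt(1 - rho**2), np.full(n - 1, 1 - rho)]   # transformed intercept column

res_gls = sm.OLS(y_star, np.column_stack([const_star, x_star])).fit()
print(rho, res_gls.params)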

 THE ATTRIBUTES OF A GOOD MODEL


1. the principle of parsimony, suggests that a model be kept as simple as
possible
2. the estimated parameters must have unique values
3. adjusted R2 is as high as possible
4. Theoretical Consistency
5. Predictive Power
 Specification errors (we know the true model but do not estimate it correctly):
1. Omission of a relevant variable(s).
2. Inclusion of an unnecessary variable(s).
3. Adopting the wrong functional form.
4. Errors of measurement.

 Omission of a relevant variable(s).
TRUE MODEL: Yi = B1 + B2*X2i + B3*X3i + ui
ESTIMATED MODEL: Yi = A1 + A2*X2i + vi
CONSEQUENCES OF OMISSION:
1. If X3 is correlated with X2, a1 and a2 are biased and inconsistent.
2. If X2 and X3 are uncorrelated, a2 is unbiased (and consistent as well), but a1 still remains biased unless the mean of X3 is zero.
3. The error variance is incorrectly estimated (biased).
4. The variance of a2 estimated from the misspecified model is a biased estimator of the variance of b2 from the true model. Even when X2 and X3 are uncorrelated, this variance remains biased. As a result, the usual confidence-interval and hypothesis-testing procedures are unreliable.
5. The bias in a2 equals B3*b32, where b32 is the slope coefficient in the regression of the omitted X3 on the included X2: if this product is positive the bias is upward, if negative downward, and if b32 = 0 there is no bias in a2. (A small simulation sketch follows.)
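A small Monte Carlo sketch (all numbers assumed) illustrating omitted-variable bias: the mean of a2 across replications drifts from B2 toward B2 + B3*b32 when the correlated X3 is left out.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(13)
B1, B2, B3 = 1.0, 2.0, 1.5
n, reps = 200, 500

x2 = rng.normal(size=n)
x3 = 0.6 * x2 + rng.normal(scale=0.8, size=n)          # X3 correlated with X2
b32 = sm.OLS(x3, sm.add_constant(x2)).fit().params[1]  # slope of X3 on X2

a2_draws = []
for _ in range(reps):
    y = B1 + B2 * x2 + B3 * x3 + rng.normal(size=n)
    a2 = sm.OLS(y, sm.add_constant(x2)).fit().params[1]   # X3 omitted
    a2_draws.append(a2)

print("mean of a2:", np.mean(a2_draws))
print("B2 + B3*b32:", B2 + B3 * b32)                   # the biased value a2 centres on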

 Inclusion of an irrelevant (unnecessary) variable(s).
TRUE MODEL: Yi = B1 + B2*X2i + ui
ESTIMATED MODEL: Yi = A1 + A2*X2i + A3*X3i + vi
CONSEQUENCES OF INCLUSION:
1. The OLS estimators of the “incorrect” model are unbiased (as well as
consistent).
2. Error variance is correctly estimated. As a result, the usual confidence
interval and hypothesis-testing procedures are reliable and valid.
3. However, the a’s estimated from the regression are inefficient— their
variances will be generally larger than those of the b’s estimated from
the true model In short, the OLS estimators are LUE (linear unbiased
estimators) but not BLUE.

 Adopting the wrong functional form.


Without going into theoretical fine points, if we choose the wrong functional
form, the estimated coefficients may be biased estimates of the true
coefficients

 Errors of measurement
If the dependent variable has measurement error, the estimators are still unbiased but have larger variances (they are inefficient); the estimated variances are themselves unbiased.
If an independent variable has measurement error, the estimators are biased and inconsistent, and the problem does not disappear as the sample size grows.

 Mis-specification errors (we do not know the true model to begin with):

o Stochastic explanatory variables: if the stochastic error term is not independent of the explanatory variable(s), the OLS estimators are biased and inconsistent, even if the sample size is large. If the explanatory variables are stochastic but independent of the error term, the OLS estimators are unbiased (though they may not be efficient).

o The error term is not normally distributed: the OLS estimators are still BLUE (normality is not needed for the Gauss-Markov result).
 Linearity of a model means the model is linear in the parameters. Some models are nonlinear; we can classify them as intrinsically linear (a transformation of the nonlinear model into a linear one is possible) and intrinsically nonlinear (no such transformation is possible). For intrinsically nonlinear models, parameter estimation by OLS is not possible at all.

