
Q – No

The Significance of the Stochastic Disturbance Term


The disturbance term ui is a surrogate for all those variables that are omitted from the model but that collectively affect Y.
Why not introduce these variables into the model explicitly instead of replacing them with ui?
There are many reasons for using ui:
- Vagueness of theory
When we are unsure about the other variables affecting the dependent variable (Y) in a given theory, ui may be used as a substitute for all the variables omitted from the model.

- Unavailability of data
There may be a shortage of quantitative information about these variables.
Example:
Information on family wealth is generally not available.
- Core variables versus peripheral variables
Suppose that besides income X1, the number of children per family X2, sex X3, religion X4, education X5, and geographical region X6 also affect consumption expenditure. The combined influence of these peripheral variables may be too small or too unsystematic to be worth introducing into the model explicitly, so their joint effect can be treated as a random variable ui.
- Intrinsic randomness in human behavior
There is some "intrinsic" randomness in individual Y values that cannot be explained even if we succeed in introducing all the relevant variables explicitly. The ui's may very well reflect this intrinsic randomness.

- Poor proxy variables
There may be errors of measurement.
Example:
Friedman regards permanent consumption as a function of permanent income. Since data on these variables are not directly observable, we use proxy variables: current consumption and current income. This introduces errors of measurement, and ui represents these errors.
- Principle of parsimony
When the theory does not strongly suggest what other variables should be included, instead of introducing them we add a random variable ui just to keep the model simple.
- Wrong functional form
The relationship between the dependent and independent variables may be specified in the wrong functional form. In a two-variable model the appropriate form is easy to judge from a scatter diagram, but in a multiple regression model it is not possible to visualize.
Q – No
The assumptions underlying the method of least squares
- Linear regression model
The model must be linear in the parameters, though not necessarily in the variables.
- X (independent variable) values are fixed in repeated sampling
The values taken by the regressor X are considered fixed in repeated samples.
- Zero mean value of the disturbance ui
Given the value of X, the mean value of the random disturbance term ui is zero.
- Homoscedasticity, or equal variance of ui
Given the value of X, the variance of ui is the same for all observations; the conditional variances of ui are identical.
- No autocorrelation between the disturbances
Given any two X values, Xi and Xj (i ≠ j), the correlation between any two disturbances ui and uj (i ≠ j) is zero. In simple words, the disturbances ui and uj are uncorrelated.
- Zero covariance between ui and Xi
Cov(ui, Xi) = E(ui Xi) = 0
The disturbance u and the independent variable X are uncorrelated.
- The number of observations n must be greater than the number of parameters to be estimated; in other words, n must be greater than the number of independent variables.
- The nature of the X variables
There must be variability in the X values; they must not all be the same.
- The regression model is correctly specified.
- There is no perfect multicollinearity among the X variables.
A simulation sketch of data satisfying these assumptions is given below.
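Below is a minimal Python sketch (assuming numpy and statsmodels are available; all names and numbers are my own illustration, not from the notes) that simulates data satisfying the classical assumptions and estimates the two-variable model by OLS.

# Illustrative sketch only: simulate data that satisfies the classical assumptions
# and estimate the two-variable model by OLS. All names and numbers are assumed.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

n = 100                                     # n greater than the number of parameters (2)
X = np.linspace(1, 10, n)                   # X values fixed and not all the same
u = rng.normal(loc=0.0, scale=2.0, size=n)  # zero mean, constant variance, independent draws
Y = 3.0 + 1.5 * X + u                       # linear in the parameters B1 and B2

results = sm.OLS(Y, sm.add_constant(X)).fit()
print(results.params)                       # estimates of B1 (intercept) and B2 (slope)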
Q – No
The normality assumption for ui
- CNLRM (the classical normal linear regression model) assumes that each ui is distributed normally with
Mean: E(ui) = 0
Variance: E(ui²) = σ²
Covariance: Cov(ui, uj) = E(ui uj) = 0 for i ≠ j
For two normally distributed variables, zero covariance or correlation means independence, so ui and uj are not only uncorrelated but also independently distributed.

- Why the normality assumption?

_ With few exceptions, the distribution of the sum of a large number of independent and identically distributed random variables tends to a normal distribution as the number of such variables increases indefinitely (the central limit theorem; see the sketch after this list).

_ Even if the number of variables is not very large, their sum may still be normally distributed.

_ Under the normality assumption for ui, the OLS (ordinary least squares) estimators of B1 and B2 are also normally distributed.

_ The normal distribution is a comparatively simple distribution involving only two parameters (the mean and the variance).
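A small simulation sketch of the central-limit point above (assuming numpy and scipy are available; purely illustrative, not from the notes):

# Illustrative sketch only: the standardized sum of many i.i.d. (uniform) variables
# looks approximately normal, which is the central-limit motivation for assuming
# normal disturbances. All numbers are assumed.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sums = rng.uniform(0, 1, size=(10_000, 50)).sum(axis=1)   # each sum combines 50 i.i.d. variables
z = (sums - sums.mean()) / sums.std()                     # standardize the sums

# D'Agostino-Pearson normality test: a large p-value is consistent with normality.
print(stats.normaltest(z))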
Q – No
Dummy variables
Sometimes we have non-numeric or qualitative variables in regression analysis, like color or gender.
In such cases we transform them into dichotomous ("yes or no") variables.
We assign dummy variables so that these qualitative variables can be used in the regression.
Dummy variables are not real measurements; they are just dummies used to represent qualitative variables.
A dummy variable recodes a binomial categorical variable in a regression by taking the values 0 or 1.
Dummy categories are mutually exclusive, meaning that only one option can be selected.
Example:
Are you married or unmarried?
Yes or No
We use 1 for Yes and 0 for No as the dummy values, as in the sketch below.
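A minimal pandas sketch (assumed data and column names, purely illustrative) of this recoding:

# Illustrative sketch only (assumed data): recoding a qualitative "married / unmarried"
# answer as a 0/1 dummy so it can enter a regression.
import pandas as pd

df = pd.DataFrame({"marital_status": ["married", "unmarried", "married", "unmarried"]})

# Manual recoding: 1 for Yes (married), 0 for No (unmarried).
df["married"] = (df["marital_status"] == "married").astype(int)

# pandas can also build the dummy column directly.
dummies = pd.get_dummies(df["marital_status"], drop_first=True)
print(df)
print(dummies)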
Q – No
Detection of Heteroscedasticity
There are two ways to detect heteroscedasticity, informal and formal: the informal way uses graphs and is called the graphical method, and the formal way uses formal tests.
- Graphical method
- Formal tests:
Goldfeld-Quandt test
White's test
- The GQ Test
The Goldfeld-Quandt test is carried out as follows (a statsmodels sketch follows this list):
1. First, split the total sample of length T into two sub-samples of lengths T1 and T2. The regression model is estimated on each sub-sample and the two residual variances are calculated.
2. The null hypothesis is that the variances of the disturbances are equal.
3. The test statistic, denoted GQ, is simply the ratio of the two residual variances, with the larger of the two variances placed in the numerator: GQ = s1²/s2².
4. The test statistic is distributed as F(T1 - k, T2 - k) under the null of homoscedasticity.
5. A problem with the test is that the choice of where to split the sample is usually arbitrary and may affect the outcome of the test.
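A minimal sketch (assumed data; the heteroscedasticity is built in deliberately) using the statsmodels implementation of this test:

# Illustrative sketch only (assumed data): the Goldfeld-Quandt test via statsmodels.
# het_goldfeldquandt splits the sample, fits OLS on each part and returns the F statistic
# (ratio of the two residual variances) and its p-value under the null of equal variances.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

rng = np.random.default_rng(0)
n = 200
x = np.sort(rng.uniform(1, 10, n))
u = rng.normal(0.0, 0.5 * x)            # disturbance variance grows with x: heteroscedastic
y = 2.0 + 0.5 * x + u

X = sm.add_constant(x)
f_stat, p_value, _ = het_goldfeldquandt(y, X)
print(f_stat, p_value)                  # small p-value: reject homoscedasticity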

- White's Test (a statsmodels sketch follows this list)
1. Assume that the regression we carried out is
yt = B1 + B2x2t + B3x3t + ut
and we want to test whether Var(ut) = σ². We estimate the model and obtain the residuals ût.
2. Then run the auxiliary regression
ût² = a1 + a2x2t + a3x3t + a4x2t² + a5x3t² + a6x2t x3t + vt
3. Obtain R² from the auxiliary regression and multiply it by the number of observations T: TR² ~ χ²(m), where m is the number of regressors in the auxiliary regression (excluding the constant).
4. If the χ² statistic from step 3 is greater than the corresponding critical value from the statistical table, reject the null hypothesis that the disturbances are homoscedastic.
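A minimal sketch (assumed data; heteroscedasticity built in deliberately) using the statsmodels implementation of White's test:

# Illustrative sketch only (assumed data): White's test via statsmodels. het_white runs
# the auxiliary regression of the squared residuals on the regressors, their squares and
# cross products, and returns the LM statistic T*R^2 with its chi-square p-value.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(0)
T = 200
x2 = rng.uniform(1, 10, T)
x3 = rng.uniform(1, 10, T)
u = rng.normal(0.0, 1.0 + 0.3 * x2)     # heteroscedastic disturbances by construction
y = 1.0 + 0.8 * x2 - 0.5 * x3 + u

X = sm.add_constant(np.column_stack([x2, x3]))
results = sm.OLS(y, X).fit()

lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(results.resid, X)
print(lm_stat, lm_pvalue)               # small p-value: reject homoscedasticity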
Q – No
Detecting Autocorrelation
There are, in general, two approaches: formal and informal.
The informal approach uses graphs and is called the graphical method.
The formal approach uses formal tests such as:
- The Durbin-Watson test
- The Breusch-Godfrey test
- The runs test
The Durbin-Watson Test
The following assumptions should be satisfied:
- The regression model includes a constant
- Autocorrelation is assumed to be of first order only
- The equation does not include a lagged dependent variable as an explanatory variable
Procedure of the Durbin-Watson test (a statsmodels sketch follows the decision rules below)
1. Estimate the model by OLS and obtain the residuals.
2. Calculate the DW statistic.
3. Construct the table with the calculated DW statistic and the dL, dU, 4 - dU and 4 - dL critical values.
4. Conclude.
Decision rules
- The d value always lies between 0 and 4.
- The closer d is to zero, the greater the evidence of positive autocorrelation.
- The closer d is to 4, the greater the evidence of negative autocorrelation.
- If d is about 2, there is no first-order autocorrelation.
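A minimal sketch (assumed data; the AR(1) disturbances are built in deliberately) computing the DW statistic with statsmodels:

# Illustrative sketch only (assumed data): the Durbin-Watson statistic via statsmodels,
# computed from the OLS residuals and read against the 0-to-4 decision rules above.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)

u = np.zeros(n)
for t in range(1, n):                   # AR(1) disturbances: positive autocorrelation built in
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

results = sm.OLS(y, sm.add_constant(x)).fit()   # constant included, no lagged dependent variable
print(durbin_watson(results.resid))             # well below 2 suggests positive autocorrelation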
The Breusch-Godfrey Test
It is a more general test that resolves the drawbacks of the Durbin-Watson test (a statsmodels sketch follows this list).
1. Estimate the model by OLS and obtain the residuals.
2. Run the full LM (Lagrange multiplier) auxiliary regression, with the number of residual lags determined by the assumed order of autocorrelation.
3. Compute the LM statistic = (n - p)R² from the LM regression and compare it with the chi-square critical value.
4. Conclude.
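A minimal sketch (assumed data, autocorrelation built in deliberately) using the statsmodels implementation:

# Illustrative sketch only (assumed data): the Breusch-Godfrey test via statsmodels.
# acorr_breusch_godfrey regresses the OLS residuals on the regressors and their own lags
# (nlags sets the assumed order) and returns the LM statistic with its chi-square p-value.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
u = np.zeros(n)
for t in range(1, n):                   # AR(1) disturbances by construction
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

results = sm.OLS(y, sm.add_constant(x)).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(results, nlags=2)
print(lm_stat, lm_pvalue)               # small p-value: reject the null of no autocorrelation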
Remedial Measures
- First-difference transformation
Change the functional form.
- Generalized transformation
Generalized least squares (GLS) method:
Estimate the value of ρ through a regression of the residuals on the lagged residuals and use that value to run the transformed regression.
- Newey-West (for large samples)
Generates HAC (heteroscedasticity- and autocorrelation-consistent) standard errors, as in the sketch below.
- Model evaluation
- In some situations we can continue to use the OLS method.
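A minimal sketch (assumed data; the maxlags choice is an assumption) of requesting Newey-West standard errors in statsmodels:

# Illustrative sketch only (assumed data): Newey-West / HAC standard errors can be
# requested when fitting OLS in statsmodels. The coefficient estimates are unchanged;
# only the standard errors are made robust to heteroscedasticity and autocorrelation.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

X = sm.add_constant(x)
ols_fit = sm.OLS(y, X).fit()                                          # ordinary standard errors
hac_fit = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})   # Newey-West standard errors
print(ols_fit.bse)
print(hac_fit.bse)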
Q – No
Methodology of Econometrics
- Statement of economic theory
Keynes postulated that the marginal propensity to consume (MPC), the rate of change of consumption for a unit change in income, is greater than zero but less than 1.
- Specification of the mathematical model of consumption
Although Keynes postulated a positive relationship between consumption and income, he did not specify the precise functional form of the relationship between the two:
Y = B1 + B2X, 0 < B2 < 1
- Specification of the econometric model of consumption
The mathematical model assumes an exact relationship between consumption and income, but relationships between economic variables are generally inexact, so a random disturbance term is added to the model.
- Obtaining data
To estimate the econometric model, that is, to obtain the numerical values of B1 and B2, we need data.
- Estimation of the econometric model
Our task is to estimate the parameters of the consumption function; the numerical estimates of the parameters give empirical content to the consumption function (see the sketch after this list).
- Hypothesis testing
Assuming that the fitted model is a good approximation of reality, we develop criteria to find out whether the estimates obtained agree with the expectations of the theory being tested.
- Forecasting or prediction
If the chosen model does not refute the hypothesis, we may use it to predict the future value of the dependent variable Y on the basis of the expected future value of the explanatory variable X.
- Use of the model for policy purposes
The estimated model can be used for policy: through an appropriate mix of fiscal and monetary policy, the government can manipulate the control variable X to produce a desired level of the target variable Y.
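A minimal sketch (hypothetical income and consumption figures, invented for illustration only) of the estimation and hypothesis-testing steps for the consumption function:

# Illustrative sketch only (hypothetical data): estimating the consumption function
# Y = B1 + B2*X by OLS and checking that the estimated MPC (B2) lies between 0 and 1,
# as the theory postulates.
import numpy as np
import statsmodels.api as sm

income = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], dtype=float)
consumption = np.array([70, 85, 98, 112, 125, 138, 152, 165, 178, 191], dtype=float)

results = sm.OLS(consumption, sm.add_constant(income)).fit()
b1, b2 = results.params
print(b1, b2)          # b2 is the estimated marginal propensity to consume
print(0 < b2 < 1)      # does the estimate agree with the theory's expectation?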
