
BFI Program
ECONOMETRICS I
National Economics University
2020
MA Hai Duong
1

MORE ON LINEAR REGRESSION ANALYSIS

2
RECAP
The multiple regression model can be written in matrix form as
$y = X\beta + u$
The OLS procedure finds the linear combination of the columns of $X$ that is closest to the vector $y$, i.e. the length of its error vector (the OLS residuals) is the shortest.
This implies that the OLS residual vector is perpendicular to all columns of $X$, i.e.,
$X'\hat{u} = 0,$
which leads to the OLS formula
$\hat{\beta} = (X'X)^{-1}X'y$

3
RECAP
A consequence of the orthogonality of the residuals and the columns of $X$ is that
$SST = SSE + SSR$, or
$\sum_{i}(y_i - \bar y)^2 = \sum_{i}(\hat y_i - \bar y)^2 + \sum_{i}\hat u_i^2$
This leads to the definition of the coefficient of determination
$R^2 = \dfrac{SSE}{SST} = 1 - \dfrac{SSR}{SST},$
which is a measure of goodness of fit.
The OLS formula gives us fitted values that are closest to the actual values in the sense that the length of the residual vector is the smallest possible.
But why should this be a good estimate of the unknown parameters of the conditional expectation function?
We ask whether this formula produces the best information we can get from our data about the unknown parameters $\beta$.
4
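
A minimal numpy sketch of these two recap slides (not part of the original deck; the simulated data, seed and parameter values are all made up for illustration) checks the OLS formula, the orthogonality $X'\hat u = 0$, and the $R^2$ definition:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 regressors
    beta = np.array([1.0, 0.5, -2.0])                           # illustrative true parameters
    y = X @ beta + rng.normal(size=n)

    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # the OLS formula (X'X)^{-1} X'y
    u_hat = y - X @ beta_hat                       # OLS residuals

    print(X.T @ u_hat)                 # ~ 0: residuals are orthogonal to every column of X
    SSR = u_hat @ u_hat
    SST = ((y - y.mean()) ** 2).sum()
    print(1 - SSR / SST)               # R-squared = 1 - SSR/SST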
LECTURE OUTLINE
Interpretation of the OLS estimates in multiple regression (textbook reference 3-2a to 3-2e)
Properties of good estimators
Properties of the OLS estimator
1. OLS is unbiased: the expected value of $\hat\beta$ (textbook reference 3-3 and Appendix E-2)
2. The variance of $\hat\beta$ (textbook reference 3-4 and Appendix E-2)
3. OLS is BLUE (Gauss-Markov Theorem): the efficiency of $\hat\beta$ (textbook reference 3-5 and Appendix E-2)
Units of measurement: do the results qualitatively change if we change the units of measurement? (textbook reference 2-4a and 6-1, excluding 6-1a)

5
INTERPRETATION OF OLS ESTIMATES
Recall the example from the third lecture: the causal effect of education on wage.
The wage equation:
$wage = \beta_0 + \beta_1\,educ + \beta_2\,IQ + u$
The OLS estimate $\hat\beta_1 = 42.06$ shows that for two people with the same IQ, the one with one more year of education is predicted to earn $42.06 more in monthly wages. (The estimation results, produced by EViews 8.0, are on the next slide.)

6
OLS IN ACTION
[EViews 8.0 estimation output for the wage equation]
7
EFFECT OF RESCALING
The upshot: changing units of measurement does not create any extra information, so it only changes the OLS results in predictable and non-substantive ways.
Scaling $x_j \to c\,x_j$: any change of unit of measurement that involves multiplying by a constant (such as changing dollars to cents or pounds to kilograms) does not change the column space of $X$, so it will not change $\hat y$. Therefore, it must be that the coefficient of $x_j$ changes such that $\hat\beta_j x_j$ stays the same:
$\tilde\beta_j = \hat\beta_j / c,$
where $\tilde\beta_j$ is the coefficient on the rescaled variable $c\,x_j$.
8
EFFECT OF RESCALING
Change of units of the form $x_j \to a + c\,x_j$: any change of unit of measurement that involves multiplying by a constant and adding or subtracting a constant (such as changing from °C to °F) does not change the column space of $X$ either, so it will not change $\hat y$. Therefore, it must be that the intercept and the coefficient of $x_j$ change in such a way that $\hat y$ stays the same:
$\tilde\beta_j = \hat\beta_j / c, \qquad \tilde\beta_0 = \hat\beta_0 - (a/c)\,\hat\beta_j,$
where $\tilde\beta_0$ and $\tilde\beta_j$ are the coefficients after the change of units.
In both cases, since $\hat y$ does not change, the residuals and SSR all stay the same, so $R^2$ will not change.

9
EFFECT OF RESCALING
Changing the units of the dependent variable $y \to c\,y$: this changes the length of $y$, but the column space of $X$ is still the same. So $\hat y$ will be multiplied by $c$, and since both $y$ and $\hat y$ are multiplied by $c$, the residuals will be multiplied by $c$ as well:
$\tilde\beta = c\,\hat\beta,$
where $\tilde\beta$ is the OLS estimate in the regression of $c\,y$ on $X$.
In this case, SST, SSE and SSR all change, but all will be multiplied by ... (question for you), so again $R^2$ will not change. (A numerical sketch of all three rescaling cases follows below.)

10
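
A small numerical check of all three rescaling results (a sketch with simulated data; the constants $c = 100$ and $a = 32$ and all variable names are illustrative, not from the lecture):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200
    x = rng.normal(10.0, 2.0, size=n)
    y = 3.0 + 0.5 * x + rng.normal(size=n)

    def ols(X, y):
        return np.linalg.solve(X.T @ X, X.T @ y)

    X = np.column_stack([np.ones(n), x])
    b = ols(X, y)
    c, a = 100.0, 32.0

    b_scale = ols(np.column_stack([np.ones(n), c * x]), y)       # x -> c*x
    b_affine = ols(np.column_stack([np.ones(n), a + c * x]), y)  # x -> a + c*x
    b_yscale = ols(X, c * y)                                     # y -> c*y

    print(b[1], c * b_scale[1], c * b_affine[1])  # slope on x recovered in all cases
    print(c * b, b_yscale)                        # y-rescaling multiplies all coefficients by c

    def r2(X, y):
        u = y - X @ ols(X, y)
        return 1 - (u @ u) / ((y - y.mean()) ** 2).sum()

    print(r2(X, y), r2(X, c * y))                 # R-squared unchanged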
ESTIMATORS AND THEIR UNBIASEDNESS
An estimator is a formula that combines sample information and produces an estimate of the parameters of interest.
Example: the sample average $\bar y$ is an estimator of the population mean $\mu$.
Example: $\hat\beta = (X'X)^{-1}X'y$ is an estimator of the parameter vector $\beta$ in the multiple regression model.
Since estimators are functions of the sample observations, they are random variables.
While we do not know the values of the population parameters, we can use the power of mathematics to investigate whether the expected value of an estimator is indeed the parameter of interest.
Definition: An estimator is an unbiased estimator of a parameter of interest if its expected value is the parameter of interest.
11
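
For instance, the unbiasedness of the sample average can be illustrated by simulation (a minimal sketch; the population mean 5.0 and the other numbers are made up):

    import numpy as np

    rng = np.random.default_rng(2)
    mu = 5.0                          # the (normally unknown) population mean
    means = [rng.normal(mu, 3.0, size=50).mean() for _ in range(10_000)]
    print(np.mean(means))             # ~ 5.0: the sample average is an unbiased estimator of mu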
THE EXPECTED VALUE OF THE OLS ESTIMATOR
Under the following assumptions:
MLR.1 Linear in parameters: $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$  (E.1 Linear in parameters: $y = X\beta + u$)
MLR.2 Random sampling: we have a random sample of $n$ observations
MLR.3 No perfect collinearity: none of the $x_j$ is a constant and there are no exact linear relationships among them  (E.2 No perfect collinearity: $X$ has rank $k + 1$)
MLR.4 Zero conditional mean: $E(u \mid x_1, \dots, x_k) = 0$  (E.3 Zero conditional mean: $E(u \mid X) = 0$)
12
UNBIASED ESTIMATOR
We use an important property of conditional expectations to prove that the OLS estimator is unbiased: if $E(u \mid X) = 0$, then $E(g(X)\,u \mid X) = 0$ for any function $g$.
For example: $E(X'u \mid X) = 0$, $E((X'X)^{-1}X'u \mid X) = 0$, etc.
Now let's show that $E(\hat\beta \mid X) = \beta$.
The assumption of no perfect collinearity (E.2) is immediately required so that $(X'X)^{-1}$ exists.
Step 1: Using the population model (Assumption E.1), substitute for $y$ in the estimator's formula and simplify:
$\hat\beta = (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + u) = \beta + (X'X)^{-1}X'u$

13
UNBIASED ESTIMATOR
Step 2: Take expectations conditional on $X$:
$E(\hat\beta \mid X) = \beta + (X'X)^{-1}X'\,E(u \mid X) = \beta,$
using Assumption E.3, since $E(u \mid X) = 0$.
Note that all of the assumptions were needed and used in this proof.
Are these assumptions too strong?

14
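
The proof can be mirrored by a Monte Carlo sketch that holds $X$ fixed and redraws $u$ with $E(u \mid X) = 0$ (all sample sizes and parameter values here are made up for illustration):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 50
    beta = np.array([1.0, 2.0, -1.5])                            # illustrative true parameters
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # X held fixed across draws

    draws = []
    for _ in range(5_000):
        u = rng.normal(size=n)                        # E(u|X) = 0 by construction
        y = X @ beta + u                              # Assumption E.1
        draws.append(np.linalg.solve(X.T @ X, X.T @ y))

    print(np.mean(draws, axis=0))                     # ~ [1.0, 2.0, -1.5]: E(beta_hat|X) = beta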
UNBIASED ESTIMATOR
Linearity in parameters is not too strong, and does not exclude non-linear relationships between $y$ and the $x$'s (more on this later).
Random sampling is not too strong for cross-sectional data if participation in the sample is not voluntary. Randomness is obviously not correct for time series data.
Perfect multicollinearity is quite unlikely unless we have done something silly, like using income in dollars and income in cents in the list of independent variables, or we have fallen into the "dummy variable trap" (more on this later).
Zero conditional mean is not a problem for predictive analytics, because for the best prediction we always want our best estimate of $E(y \mid x)$ for a set of observed predictors.

15
UNBIASED ESTIMATOR
Zero conditional mean can be a problem for prescriptive analytics (causal analysis) when we want to establish the causal effect of one of the variables, say $x_1$ on $y$, controlling for an attribute that we cannot measure. E.g., the causal effect of education on wage, keeping ability constant:
We want to estimate $\beta_1$ in:
$wage = \beta_0 + \beta_1\,educ + \beta_2\,abil + u$
With $abil$ omitted, the regression is: $wage = \beta_0 + \beta_1\,educ + v$, where $v = \beta_2\,abil + u$
If we run this regression, $E(v \mid educ) \neq 0$ because ability and education are correlated, so
$\hat\beta_1$ is a biased estimator of $\beta_1$.
This is referred to as "omitted variable bias".
One solution is to add a measurable proxy for ability, such as IQ.

16
OMITTED VARIABLE BIAS
Our estimate from the short regression of $y$ on $x_1$ alone is:
$\tilde\beta_1 = \dfrac{\sum_{i=1}^{n}(x_{i1} - \bar x_1)\,y_i}{\sum_{i=1}^{n}(x_{i1} - \bar x_1)^2}$
Substitute the true model $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i$ for $y_i$ and simplify:
$\tilde\beta_1 = \beta_1 + \beta_2\,\tilde\delta_1 + \dfrac{\sum_{i}(x_{i1} - \bar x_1)\,u_i}{\sum_{i}(x_{i1} - \bar x_1)^2},$
where $\tilde\delta_1$ is the slope from regressing $x_2$ on $x_1$. Under (E.3) the expectation of the last term is zero, so
$E(\tilde\beta_1) = \beta_1 + \beta_2\,\tilde\delta_1$

17
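
A simulation sketch of this result (the coefficients, the 0.8 correlation structure, and the sample size are all illustrative assumptions, not from the slides):

    import numpy as np

    rng = np.random.default_rng(4)
    n = 100_000
    beta0, beta1, beta2 = 1.0, 0.5, 2.0               # illustrative true parameters

    x2 = rng.normal(size=n)                           # the omitted variable ("ability")
    x1 = 0.8 * x2 + rng.normal(size=n)                # x1 ("educ") positively correlated with x2
    y = beta0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)

    X1 = np.column_stack([np.ones(n), x1])
    b_short = np.linalg.solve(X1.T @ X1, X1.T @ y)    # short regression: x2 omitted
    delta1 = np.linalg.solve(X1.T @ X1, X1.T @ x2)[1] # slope from regressing x2 on x1

    print(b_short[1])                  # far above beta1 = 0.5
    print(beta1 + beta2 * delta1)      # matches: bias = beta2 * delta1 > 0 here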
SUMMARY OF BIAS IN $\tilde\beta_1$ WHEN $x_2$ IS OMITTED
Summary of the bias in $\tilde\beta_1$ when $x_2$ is omitted from the estimating equation:

                     Corr(x1, x2) > 0     Corr(x1, x2) < 0
    beta2 > 0        Positive bias        Negative bias
    beta2 < 0        Negative bias        Positive bias

E.g., if ability raises wages ($\beta_2 > 0$) and is positively correlated with education, the estimated return to education is biased upward.
18
UNBIASED ESTIMATOR
Zero conditional mean can be quite restrictive in time series analysis as well.
It implies that the error term in any time period is uncorrelated with each of the regressors in all time periods: past, present and future. (See the first lecture.)
Assumption MLR.4 is violated when the regression model contains a lag of the dependent variable as a regressor, e.g. when we want to predict this quarter's GDP using the GDP outcomes of the past 4 quarters.
In this case, the OLS estimators of the regression parameters are biased. (A small simulation follows below.)

19
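
A rough illustration of this bias (not from the slides; the AR(1) coefficient 0.9 and the sample length 25 are made-up values), regressing $y_t$ on its own lag in short simulated series:

    import numpy as np

    rng = np.random.default_rng(5)
    rho, T = 0.9, 25                      # illustrative AR(1) coefficient, short sample
    est = []
    for _ in range(20_000):
        y = np.zeros(T)
        for t in range(1, T):
            y[t] = rho * y[t - 1] + rng.normal()
        x, z = y[:-1], y[1:]              # regress y_t on y_{t-1} (no intercept)
        est.append((x @ z) / (x @ x))

    print(np.mean(est))                   # noticeably below 0.9: OLS is biased in short samples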
THE VARIANCE OF THE OLS ESTIMATOR
Coming up with unbiased estimators is not too hard. But how the estimates they produce are dispersed around the mean determines how precise they are.
To study the variance of $\hat\beta$, we need to learn about the variances of a vector of random variables (the variance-covariance matrix).

20
AN EXTRA ASSUMPTION
Multiple regression model:
MLR.1 Linear in parameters: $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$  (E.1 Linear in parameters: $y = X\beta + u$)
MLR.2 Random sampling: we have a random sample of $n$ observations
MLR.3 No perfect collinearity: none of the $x_j$ is a constant and there are no exact linear relationships among them  (E.2 No perfect collinearity: $X$ has rank $k + 1$)
MLR.4 Zero conditional mean: $E(u \mid x_1, \dots, x_k) = 0$  (E.3 Zero conditional mean: $E(u \mid X) = 0$)
MLR.5 Homoskedasticity: $Var(u \mid x_1, \dots, x_k) = \sigma^2$  (E.4 Homoskedasticity + randomness: $Var(u \mid X) = \sigma^2 I_n$)
21
VARIANCE - COVARIANCE MATRIX
Under these assumptions, the variance-covariance matrix of the OLS estimator conditional on $X$ is
$Var(\hat\beta \mid X) = \sigma^2 (X'X)^{-1}$
(Theorem E.2 in Appendix E, pages 724-725)

22
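
A numerical check of this theorem (a sketch with made-up sample size, error standard deviation and parameters), comparing the formula with the Monte Carlo covariance of $\hat\beta$ across repeated samples with $X$ fixed:

    import numpy as np

    rng = np.random.default_rng(6)
    n, sigma = 40, 2.0
    X = np.column_stack([np.ones(n), rng.normal(size=n)])   # X held fixed
    beta = np.array([1.0, 0.5])

    theory = sigma**2 * np.linalg.inv(X.T @ X)              # sigma^2 (X'X)^{-1}

    draws = np.array([
        np.linalg.solve(X.T @ X, X.T @ (X @ beta + sigma * rng.normal(size=n)))
        for _ in range(50_000)
    ])
    print(theory)
    print(np.cov(draws.T))                                  # ~ theory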
VARIANCE - COVARIANCE MATRIX
We can immediately see that, given $X$, the OLS estimator will be more precise (i.e. its variance will be smaller) the smaller $\sigma^2$ is.
It can also be seen (though it is not as obvious) that as we add observations to the sample, the variance of the estimator decreases, which also makes sense.
Gauss-Markov Theorem: under Assumptions E.1 to E.4 (or MLR.1 to MLR.5), $\hat\beta$ is the best linear unbiased estimator (B.L.U.E.) of $\beta$.
This means that there is no other estimator that can be written as a linear combination of the elements of $y$ that is unbiased and has a lower variance than $\hat\beta$.

23
ESTIMATING THE ERROR VARIANCE
We can calculate $(X'X)^{-1}$, and we showed that
$Var(\hat\beta \mid X) = \sigma^2 (X'X)^{-1},$
but we cannot compute this because we do not know $\sigma^2$.
An unbiased estimator of $\sigma^2$ is
$\hat\sigma^2 = \dfrac{\hat u'\hat u}{n - k - 1} = \dfrac{SSR}{n - k - 1}$
(Theorem E.4 in Appendix E, page 726)
The square root of $\hat\sigma^2$, $\hat\sigma$, is reported by EViews in the regression output under the name standard error of the regression, and it is a measure of how good the fit is (the smaller, the better).

24
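
A simulation sketch of the unbiasedness of $\hat\sigma^2$ (all numbers, including $\sigma = 1.5$ so that $\sigma^2 = 2.25$, are illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(7)
    n, k, sigma = 60, 2, 1.5
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
    beta = np.array([1.0, -0.5, 2.0])

    s2 = []
    for _ in range(20_000):
        y = X @ beta + sigma * rng.normal(size=n)
        u_hat = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
        s2.append(u_hat @ u_hat / (n - k - 1))    # sigma_hat^2 = SSR / (n - k - 1)

    print(np.mean(s2), sigma**2)                  # both ~ 2.25: the estimator is unbiased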
STANDARD ERROR OF THE REGRESSION IN EVIEWS
[EViews regression output showing the standard error of the regression]
25
OLS REPORT
Example: predicting wage with education and experience (EViews report on the previous slide).
We report these results as:

26
SUMMARY
Under the assumptions that:
1. the population model is linear in parameters;
2. the conditional expectation of the true errors given all explanatory variables is zero;
3. the sample is randomly selected; and
4. the explanatory variables are not perfectly collinear,
the OLS estimator is an unbiased estimator of $\beta$.
If we add homoskedasticity (no conditional heteroskedasticity) to the above assumptions, then OLS is the B.L.U.E. for $\beta$.

27
SUMMARY
Interpretation of OLS estimates in multiple regression: it is very important to interpret the coefficients using the context, and very important to remember that each estimate measures the partial effect of its corresponding regressor, all other regressors staying constant.
Linear scaling: a linear transformation of the dependent or independent variables changes the outcomes of linear regression in exactly predictable ways, and all of these outcomes are substantively the same. Hence, we should use the units of measurement that make the presentation of results most understandable, without worrying that changing the units of the data may affect our results.

28
DIGRESSION: THE VARIANCE-COVARIANCE MATRIX
Consider $n$ random variables $v_1, \dots, v_n$ with means $\mu_1, \dots, \mu_n$. We can write:
$v = (v_1, \dots, v_n)'$ and $\mu = E(v) = (\mu_1, \dots, \mu_n)'$
Denote the variance of $v_i$ by $\sigma_{ii} = E[(v_i - \mu_i)^2]$ for $i = 1, \dots, n$
Denote the covariance of $v_i$ and $v_j$ by $\sigma_{ij} = E[(v_i - \mu_i)(v_j - \mu_j)]$ for $i \neq j$,
and note that $\sigma_{ij} = \sigma_{ji}$.

29
THE VARIANCE-COVARIANCE MATRIX
We define the variance-covariance matrix of $v$ as
$Var(v) = E[(v - \mu)(v - \mu)'] = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn} \end{pmatrix}$
All elements on the main diagonal of this matrix are variances; the others are covariances.
Note that this matrix is symmetric, since $\sigma_{ij} = \sigma_{ji}$.
An important property: if $A$ is an $n \times m$ matrix of constants, then
$Var(A'v) = A'\,Var(v)\,A$

30
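
A quick numerical check of the property $Var(A'v) = A'\,Var(v)\,A$ (a sketch; the matrix $A$ and the covariance matrix $V$ below are arbitrary made-up examples):

    import numpy as np

    rng = np.random.default_rng(8)
    A = rng.normal(size=(3, 2))                  # a 3 x 2 matrix of constants
    V = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])              # Var(v): symmetric, positive definite

    L = np.linalg.cholesky(V)
    v = L @ rng.normal(size=(3, 100_000))        # draws of v with mean zero and Var(v) = V
    w = A.T @ v                                  # w = A'v

    print(np.cov(w))                             # ~ A' V A
    print(A.T @ V @ A)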
TEXTBOOK EXERCISES
Problems 1, 2 (page 93)
Problems 3, 5 (page 94)
Problems 6, 7, 8 (page 95)
Problem 12 (page 96)

31
COMPUTER EXERCISES
C1 (page 97)
C2, C3, C4 (page 98)
C10 (page 100)
C12 (page 101)

32
THANK YOU FOR YOUR ATTENDANCE - Q & A

33
