Ch03 - Multiple Regression Analysis

Multiple Regression Analysis
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
1. Estimation
Economics 20 - Prof. Anderson 1

Parallels with Simple Regression
$\beta_0$ is still the intercept
$\beta_1$ to $\beta_k$ are all called slope parameters
u is still the error term (or disturbance)
Still need to make a zero conditional mean
assumption, so now assume that
$E(u \mid x_1, x_2, \dots, x_k) = 0$
Still minimizing the sum of squared
residuals, so have k+1 first order conditions
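
The k+1 first order conditions can be written as the normal equations $(Z'Z)b = Z'y$, where $Z$ is the regressor matrix with a column of ones. The sketch below solves them on simulated data; the sample size, the coefficient values, and all variable names are assumptions made for illustration, not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3                                  # hypothetical sample size and number of regressors
X = rng.normal(size=(n, k))                    # simulated x1..xk
beta_true = np.array([1.0, 2.0, -1.0, 0.5])    # hypothetical beta_0..beta_k
u = rng.normal(size=n)                         # error term
Z = np.column_stack([np.ones(n), X])           # prepend the intercept column
y = Z @ beta_true + u

# First order conditions: Z'(y - Z b) = 0, i.e. (Z'Z) b = Z'y
beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)

# At the solution, the residuals are orthogonal to every column of Z
resid = y - Z @ beta_hat
foc = Z.T @ resid                              # k+1 values, all numerically zero
print(beta_hat)
print(foc)
```

Each entry of `foc` corresponds to one first order condition; the first says the residuals sum to zero, the rest say they are uncorrelated with each regressor in the sample.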
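Interpreting Multiple Regression
$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2 + \dots + \hat\beta_k x_k$, so
$\Delta\hat{y} = \hat\beta_1 \Delta x_1 + \hat\beta_2 \Delta x_2 + \dots + \hat\beta_k \Delta x_k$,
so holding $x_2, \dots, x_k$ fixed implies that
$\Delta\hat{y} = \hat\beta_1 \Delta x_1$, that is, each $\beta$ has
a ceteris paribus interpretation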

A “Partialling Out” Interpretation
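Consider the case where $k = 2$, i.e.
$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$, then
$\hat\beta_1 = \left( \sum_{i=1}^{n} \hat{r}_{i1} y_i \right) / \left( \sum_{i=1}^{n} \hat{r}_{i1}^2 \right)$, where $\hat{r}_{i1}$ are
the residuals from the estimated
regression $\hat{x}_1 = \hat\gamma_0 + \hat\gamma_2 x_2$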
“Partialling Out” continued
Previous equation implies that regressing y
on x1 and x2 gives same effect of x1 as
regressing y on residuals from a regression of
x1 on x2
This means only the part of $x_{i1}$ that is
uncorrelated with $x_{i2}$ is being related to $y_i$, so
we're estimating the effect of $x_1$ on $y$ after $x_2$
has been "partialled out"
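
A minimal numerical check of the partialling-out result, on simulated data. The data-generating values and the helper `ols` are assumptions for illustration only.

```python
import numpy as np

def ols(Z, y):
    """Least-squares coefficients for y regressed on the columns of Z."""
    return np.linalg.lstsq(Z, y, rcond=None)[0]

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)             # x1 is correlated with x2
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)
ones = np.ones(n)

# Multiple regression of y on x1 and x2
b_full = ols(np.column_stack([ones, x1, x2]), y)

# Step 1: regress x1 on x2 and keep the residuals r1
Z2 = np.column_stack([ones, x2])
r1 = x1 - Z2 @ ols(Z2, x1)

# Step 2: slope from relating y to r1 (sum of r1*y over sum of r1^2)
b1_partial = (r1 @ y) / (r1 @ r1)

print(b_full[1], b1_partial)                   # the two estimates coincide
```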
Simple vs Multiple Reg Estimate
Compare the simple regression $\tilde{y} = \tilde\beta_0 + \tilde\beta_1 x_1$
with the multiple regression $\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$
Generally, $\tilde\beta_1 \neq \hat\beta_1$ unless:
$\hat\beta_2 = 0$ (i.e. no partial effect of $x_2$) OR
$x_1$ and $x_2$ are uncorrelated in the sample
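
Both exceptions follow from a standard in-sample identity: $\tilde\beta_1 = \hat\beta_1 + \hat\beta_2 \tilde\delta_1$, where $\tilde\delta_1$ is the slope from regressing $x_2$ on $x_1$. The sketch below checks the identity on simulated data; the coefficients and the helper `ols` are illustrative assumptions.

```python
import numpy as np

def ols(Z, y):
    """Least-squares coefficients for y regressed on the columns of Z."""
    return np.linalg.lstsq(Z, y, rcond=None)[0]

rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
ones = np.ones(n)

b_tilde = ols(np.column_stack([ones, x1]), y)        # simple regression of y on x1
b_hat = ols(np.column_stack([ones, x1, x2]), y)      # multiple regression of y on x1, x2
d_tilde = ols(np.column_stack([ones, x1]), x2)       # regression of x2 on x1

# Gap between the simple slope and (multiple slope + beta2_hat * delta1_tilde)
gap = b_tilde[1] - (b_hat[1] + b_hat[2] * d_tilde[1])
print(b_tilde[1], b_hat[1], gap)
```

The two slopes agree exactly when either `b_hat[2]` or `d_tilde[1]` is zero, which are the two cases listed above.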

Goodness-of-Fit
We can think of each observation as being made
up of an explained part and an unexplained part,
$y_i = \hat{y}_i + \hat{u}_i$. We then define the following:
$\sum_{i=1}^{n} (y_i - \bar{y})^2$ is the total sum of squares (SST)
$\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$ is the explained sum of squares (SSE)
$\sum_{i=1}^{n} \hat{u}_i^2$ is the residual sum of squares (SSR)
Then SST = SSE + SSR

Goodness-of-Fit (continued)
How do we think about how well our
sample regression line fits our sample data?
Can compute the fraction of the total sum
of squares (SST) that is explained by the
model, call this the R-squared of regression
$R^2 = SSE/SST = 1 - SSR/SST$
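
The decomposition and the two equivalent forms of $R^2$ can be verified directly. This is a sketch on simulated data; the coefficients and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 0.5 + 1.0 * x1 - 2.0 * x2 + rng.normal(size=n)

Z = np.column_stack([np.ones(n), x1, x2])      # regressors with intercept
beta_hat = np.linalg.lstsq(Z, y, rcond=None)[0]
y_hat = Z @ beta_hat                           # explained part
u_hat = y - y_hat                              # unexplained part (residuals)

sst = np.sum((y - y.mean()) ** 2)              # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)          # explained sum of squares
ssr = np.sum(u_hat ** 2)                       # residual sum of squares

r2 = sse / sst
print(sst, sse + ssr)                          # equal up to rounding
print(r2, 1 - ssr / sst)                       # equal up to rounding
```

The identity SST = SSE + SSR relies on the regression including an intercept.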

Goodness-of-Fit (continued)
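We can also think of $R^2$ as being equal to
the squared correlation coefficient between
the actual $y_i$ and the fitted values $\hat{y}_i$
$R^2 = \frac{\left( \sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}}) \right)^2}{\left( \sum_{i=1}^{n} (y_i - \bar{y})^2 \right) \left( \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 \right)}$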
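
A short check that $R^2$ matches the squared sample correlation between $y_i$ and $\hat{y}_i$. The simulated data and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 120
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2.0 - 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)

Z = np.column_stack([np.ones(n), x1, x2])
y_hat = Z @ np.linalg.lstsq(Z, y, rcond=None)[0]

# R-squared from the sums of squares
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Squared sample correlation between actual and fitted values
corr = np.corrcoef(y, y_hat)[0, 1]
print(r2, corr ** 2)                           # equal up to rounding
```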

More about R-squared
$R^2$ can never decrease when another
independent variable is added to a
regression, and usually will increase
Because $R^2$ will usually increase with the
number of independent variables, it is not a
good way to compare models with different numbers of independent variables
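
To illustrate, the sketch below adds a regressor of pure noise, unrelated to $y$ by construction, and compares $R^2$ before and after. The data and the helper `r_squared` are assumptions for illustration.

```python
import numpy as np

def r_squared(Z, y):
    """R-squared from regressing y on the columns of Z (Z includes an intercept)."""
    beta_hat = np.linalg.lstsq(Z, y, rcond=None)[0]
    ssr = np.sum((y - Z @ beta_hat) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - ssr / sst

rng = np.random.default_rng(8)
n = 80
x1 = rng.normal(size=n)
noise = rng.normal(size=n)                     # irrelevant variable, independent of y
y = 1.0 + 2.0 * x1 + rng.normal(size=n)
ones = np.ones(n)

r2_small = r_squared(np.column_stack([ones, x1]), y)
r2_big = r_squared(np.column_stack([ones, x1, noise]), y)
print(r2_small, r2_big)                        # r2_big is never below r2_small
```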

Assumptions for Unbiasedness
Population model is linear in parameters: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
We can use a random sample of size n, $\{(x_{i1}, x_{i2}, \dots, x_{ik}, y_i): i = 1, 2, \dots, n\}$, from the population model, so
that the sample model is $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + u_i$
$E(u \mid x_1, x_2, \dots, x_k) = 0$, implying that all of the
explanatory variables are exogenous
None of the x's is constant, and there are no exact
linear relationships among them

Too Many or Too Few Variables
What happens if we include variables in
our specification that don't belong?
There is no effect on the unbiasedness of our
parameter estimates: OLS remains unbiased
(though the variances of the estimates can increase)
What if we exclude a variable from our
specification that does belong?
OLS will usually be biased

Omitted Variable Bias
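Suppose the true model is given as
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$, but we
estimate $\tilde{y} = \tilde\beta_0 + \tilde\beta_1 x_1 + \tilde{u}$, then
$\tilde\beta_1 = \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1) y_i}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}$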

Omitted Variable Bias (cont)
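Recall the true model, so that
$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i$, so the
numerator becomes
$\sum (x_{i1} - \bar{x}_1)(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i)$
$= \beta_1 \sum (x_{i1} - \bar{x}_1)^2 + \beta_2 \sum (x_{i1} - \bar{x}_1) x_{i2} + \sum (x_{i1} - \bar{x}_1) u_i$
(the $\beta_0$ term drops out because $\sum (x_{i1} - \bar{x}_1) = 0$, and $\sum (x_{i1} - \bar{x}_1) x_{i1} = \sum (x_{i1} - \bar{x}_1)^2$)

Dividing by $\sum (x_{i1} - \bar{x}_1)^2$,
$\tilde\beta_1 = \beta_1 + \beta_2 \frac{\sum (x_{i1} - \bar{x}_1) x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2} + \frac{\sum (x_{i1} - \bar{x}_1) u_i}{\sum (x_{i1} - \bar{x}_1)^2}$
since the errors $u_i$ have zero mean conditional on the x's, taking expectations we have
$E(\tilde\beta_1) = \beta_1 + \beta_2 \frac{\sum (x_{i1} - \bar{x}_1) x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2}$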

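Consider the regression of $x_2$ on $x_1$
$\tilde{x}_2 = \tilde\delta_0 + \tilde\delta_1 x_1$, then $\tilde\delta_1 = \frac{\sum (x_{i1} - \bar{x}_1) x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2}$
so $E(\tilde\beta_1) = \beta_1 + \beta_2 \tilde\delta_1$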
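
The bias formula $E(\tilde\beta_1) = \beta_1 + \beta_2 \tilde\delta_1$ can be illustrated with a small Monte Carlo experiment that holds the x's fixed and redraws the errors. Everything numerical here (sample size, number of replications, coefficient values) is an assumption chosen for the sketch.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 200, 5000                            # hypothetical sample size and replications
beta0, beta1, beta2 = 1.0, 2.0, 3.0            # hypothetical true parameters

# Fixed regressors: x2 is positively correlated with x1
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)

d1 = x1 - x1.mean()                            # deviations of x1 from its mean
sst1 = d1 @ d1
delta1 = (d1 @ x2) / sst1                      # slope from regressing x2 on x1

# Redraw the errors reps times; each row of Y is one simulated sample of y
U = rng.normal(size=(reps, n))
Y = beta0 + beta1 * x1 + beta2 * x2 + U

# Simple-regression slope (x2 omitted) for every replication
b1_tilde = (Y @ d1) / sst1

mc_mean = b1_tilde.mean()
predicted = beta1 + beta2 * delta1
print(mc_mean, predicted)                      # close to each other, both above beta1
```

With $\beta_2 > 0$ and a positive sample correlation between $x_1$ and $x_2$, the average of the short-regression slopes sits above $\beta_1$, i.e. the bias is positive.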

Summary of Direction of Bias
                 Corr(x1, x2) > 0    Corr(x1, x2) < 0
$\beta_2 > 0$    Positive bias       Negative bias
$\beta_2 < 0$    Negative bias       Positive bias

Omitted Variable Bias Summary
Two cases where bias is equal to zero
$\beta_2 = 0$, that is $x_2$ doesn't really belong in model
$x_1$ and $x_2$ are uncorrelated in the sample
If the correlation between $x_2$ and $x_1$ and the correlation between $x_2$ and $y$
have the same sign, the bias will be positive
If the two correlations have
opposite signs, the bias will be negative
The More General Case
Technically, can only sign the bias for the
more general case if all of the included x's
are uncorrelated
Typically, then, we work through the bias
assuming the x's are uncorrelated, as a
useful guide even if this assumption is not
strictly true

Variance of the OLS Estimators
Now we know that the sampling distribution
of our estimate is centered around the true
parameter
Want to think about how spread out this
distribution is
Much easier to think about this variance under
an additional assumption, so
Assume $\mathrm{Var}(u \mid x_1, x_2, \dots, x_k) = \sigma^2$
(Homoskedasticity)
Variance of OLS (cont)
Let $\mathbf{x}$ stand for $(x_1, x_2, \dots, x_k)$
Assuming that $\mathrm{Var}(u \mid \mathbf{x}) = \sigma^2$ also implies
that $\mathrm{Var}(y \mid \mathbf{x}) = \sigma^2$
The 4 assumptions for unbiasedness, plus
this homoskedasticity assumption, are
known as the Gauss-Markov assumptions

Variance of OLS (cont)
Given the Gauss-Markov assumptions,
$\mathrm{Var}(\hat\beta_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)}$, where
$SST_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$ and $R_j^2$ is the $R^2$
from regressing $x_j$ on all other x's
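
As a cross-check, the formula above can be compared with the diagonal of $\sigma^2 (Z'Z)^{-1}$, the standard matrix expression for the OLS variances (brought in here only for verification; it is not on the slides). The value of `sigma2`, the simulated regressors, and the helper `var_beta_j` are assumptions for illustration.

```python
import numpy as np

def var_beta_j(X, j, sigma2):
    """sigma^2 / (SST_j * (1 - R_j^2)) for column j of X (X has no intercept column)."""
    xj = X[:, j]
    # Regress x_j on an intercept and all the other x's to get R_j^2
    others = np.column_stack([np.ones(len(xj)), np.delete(X, j, axis=1)])
    fitted = others @ np.linalg.lstsq(others, xj, rcond=None)[0]
    sst_j = np.sum((xj - xj.mean()) ** 2)
    r2_j = np.sum((fitted - xj.mean()) ** 2) / sst_j
    return sigma2 / (sst_j * (1 - r2_j))

rng = np.random.default_rng(5)
n = 400
sigma2 = 4.0                                   # error variance, assumed known here
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)             # correlated with x1, so R_1^2 and R_2^2 are large
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
Z = np.column_stack([np.ones(n), X])

var_formula = np.array([var_beta_j(X, j, sigma2) for j in range(X.shape[1])])
var_matrix = sigma2 * np.diag(np.linalg.inv(Z.T @ Z))[1:]   # drop the intercept entry
print(var_formula)
print(var_matrix)
```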

Components of OLS Variances
The error variance: a larger $\sigma^2$ implies a
larger variance for the OLS estimators
The total sample variation: a larger $SST_j$
implies a smaller variance for the estimators
Linear relationships among the
independent variables: a larger $R_j^2$ implies a
larger variance for the estimators

Misspecified Models
Consider again the misspecified model
$\tilde{y} = \tilde\beta_0 + \tilde\beta_1 x_1$, so that $\mathrm{Var}(\tilde\beta_1) = \frac{\sigma^2}{SST_1}$
Thus, $\mathrm{Var}(\tilde\beta_1) < \mathrm{Var}(\hat\beta_1)$ unless $x_1$ and
$x_2$ are uncorrelated, then they're the same
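
The comparison can be made explicit by taking the ratio of the two variances for the two-regressor case, where $R_1^2$ is the $R^2$ from regressing $x_1$ on $x_2$. This is a worked restatement of the formulas already given, not an additional result:

```latex
\mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{SST_1\,(1 - R_1^2)},
\qquad
\mathrm{Var}(\tilde\beta_1) = \frac{\sigma^2}{SST_1}
\quad\Longrightarrow\quad
\frac{\mathrm{Var}(\tilde\beta_1)}{\mathrm{Var}(\hat\beta_1)} = 1 - R_1^2 \le 1 .
```

The ratio equals 1 only when $R_1^2 = 0$, that is, when $x_1$ and $x_2$ are uncorrelated in the sample.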

Misspecified Models (cont)
While the variance of the estimator is
smaller for the misspecified model, unless
$\beta_2 = 0$ the misspecified model is biased
As the sample size grows, the variance of
each estimator shrinks to zero, making the
variance difference less important

Estimating the Error Variance
We don't know what the error variance, $\sigma^2$,
is, because we don't observe the errors, $u_i$
What we observe are the residuals, $\hat{u}_i$
We can use the residuals to form an
estimate of the error variance

Error Variance Estimate (cont)
$\hat\sigma^2 = \left( \sum_{i=1}^{n} \hat{u}_i^2 \right) / (n - k - 1) \equiv SSR / df$
thus, $\mathrm{se}(\hat\beta_j) = \hat\sigma / [SST_j (1 - R_j^2)]^{1/2}$
df = n - (k + 1), or df = n - k - 1
df (i.e. degrees of freedom) is the (number
of observations) - (number of estimated
parameters)
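
A sketch of the computation for $k = 2$ on simulated data: estimate $\hat\sigma^2$ from the residuals, then form $\mathrm{se}(\hat\beta_1)$ from $SST_1$ and $R_1^2$. The data-generating values are assumptions, and the final comparison with $\hat\sigma^2 (Z'Z)^{-1}$ is a standard cross-check rather than something taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 150, 2                                  # hypothetical sample size, two regressors
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=1.5, size=n)

Z = np.column_stack([np.ones(n), x1, x2])
beta_hat = np.linalg.lstsq(Z, y, rcond=None)[0]
u_hat = y - Z @ beta_hat                       # residuals

df = n - k - 1                                 # observations minus estimated parameters
sigma2_hat = (u_hat @ u_hat) / df              # SSR / df

# R_1^2: regress x1 on an intercept and x2
others = np.column_stack([np.ones(n), x2])
x1_fit = others @ np.linalg.lstsq(others, x1, rcond=None)[0]
sst_1 = np.sum((x1 - x1.mean()) ** 2)
r2_1 = 1 - np.sum((x1 - x1_fit) ** 2) / sst_1

se_b1 = np.sqrt(sigma2_hat) / np.sqrt(sst_1 * (1 - r2_1))

# Cross-check against the matrix form of the standard errors
se_matrix = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(Z.T @ Z)))
print(df, sigma2_hat, se_b1, se_matrix[1])
```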
The Gauss-Markov Theorem
Given our 5 Gauss-Markov Assumptions it
can be shown that OLS is “BLUE”
Best
Linear
Unbiased
Estimator
Thus, if the assumptions hold, use OLS
