
INTRODUCTION

LECTURE 3: Multiple Regression

CHAPTER 7
Multiple Regression: The Problem of Estimation
NOTATION

Y = β1 + β2X2 + β3X3 + … + βkXk + u
Explanation of the partial coefficients

Y = β1 + β2X2 + β3X3 + u (suppose this is the true model)

∂Y/∂X2 = β2 : measures the change in the mean value of Y per unit change in X2, holding X3 constant; that is, the "direct" or "net" effect of a unit change in X2 on the mean value of Y.

∂Y/∂X3 = β3 : holding X2 constant, the direct effect of a unit change in X3 on the mean value of Y.

To assess the true contribution of X2 to the change in Y, we control for the influence of X3.
Derive the OLS estimators of the multiple regression

Y = β̂1 + β̂2X2 + β̂3X3 + û

û = Y − β̂1 − β̂2X2 − β̂3X3

OLS minimizes the residual sum of squares (Σû²):

min RSS = min Σû² = min Σ(Y − β̂1 − β̂2X2 − β̂3X3)²

∂RSS/∂β̂1 = 2Σ(Y − β̂1 − β̂2X2 − β̂3X3)(−1) = 0
∂RSS/∂β̂2 = 2Σ(Y − β̂1 − β̂2X2 − β̂3X3)(−X2) = 0
∂RSS/∂β̂3 = 2Σ(Y − β̂1 − β̂2X2 − β̂3X3)(−X3) = 0
Rearranging the three equations:

nβ̂1 + β̂2ΣX2 + β̂3ΣX3 = ΣY
β̂1ΣX2 + β̂2ΣX2² + β̂3ΣX2X3 = ΣX2Y
β̂1ΣX3 + β̂2ΣX2X3 + β̂3ΣX3² = ΣX3Y

Rewrite in matrix form:

| n     ΣX2    ΣX3   | | β̂1 |   | ΣY   |
| ΣX2   ΣX2²   ΣX2X3 | | β̂2 | = | ΣX2Y |
| ΣX3   ΣX2X3  ΣX3²  | | β̂3 |   | ΣX3Y |

(The upper-left 2×2 block corresponds to the two-variable case; the full 3×3 system is the three-variable case.)

(X′X)β̂ = X′Y   (matrix notation)
Cramer's rule (in deviation form, where lower-case letters denote deviations from sample means):

β̂2 = [(Σyx2)(Σx3²) − (Σyx3)(Σx2x3)] / [(Σx2²)(Σx3²) − (Σx2x3)²]

β̂3 = [(Σyx3)(Σx2²) − (Σyx2)(Σx2x3)] / [(Σx2²)(Σx3²) − (Σx2x3)²]

β̂1 = Ȳ − β̂2X̄2 − β̂3X̄3

(Each slope estimator is the ratio of two determinants: the denominator is |X′X|, and the numerator replaces the corresponding column of X′X by X′Y.)
or in matrix form:

(X′X)β̂ = X′Y
(3×3)(3×1) = (3×1)

==> β̂ = (X′X)⁻¹(X′Y)
    (3×1) = (3×3)(3×1)

Var-cov(β̂) = σ̂u²(X′X)⁻¹, where σ̂u² = Σû²/(n − 3)

Variance-covariance matrix:

              | Var(β̂1)      Cov(β̂1,β̂2)  Cov(β̂1,β̂3) |
Var-cov(β̂) = | Cov(β̂2,β̂1)  Var(β̂2)      Cov(β̂2,β̂3) | = σ̂u²(X′X)⁻¹
              | Cov(β̂3,β̂1)  Cov(β̂3,β̂2)  Var(β̂3)     |

            | n     ΣX2    ΣX3   |⁻¹
      = σ̂u² | ΣX2   ΣX2²   ΣX2X3 |
            | ΣX3   ΣX3X2  ΣX3²  |

and in general σ̂u² = Σû²/(n − k), where k (= 3 here) is the number of estimated parameters, including the constant term.
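As a quick numerical sketch of these matrix formulas (the data and variable names are synthetic and illustrative, not the lecture's example):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X2 = rng.uniform(0, 10, n)
X3 = rng.uniform(0, 5, n)
Y = 1.0 + 2.0 * X2 + 3.0 * X3 + rng.normal(0, 1, n)  # true betas (1, 2, 3)

X = np.column_stack([np.ones(n), X2, X3])  # n x 3 design matrix
k = X.shape[1]

XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ Y)   # solves (X'X) beta = X'Y

u_hat = Y - X @ beta_hat                   # residuals
sigma2_hat = (u_hat @ u_hat) / (n - k)     # sigma_u^2 = sum(u^2) / (n - 3)
var_cov = sigma2_hat * np.linalg.inv(XtX)  # Var-cov(beta_hat)

print("beta_hat:", beta_hat)
print("se:", np.sqrt(np.diag(var_cov)))
```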
Scalar forms of the Var and SE of the OLS estimators

var(β̂2) = σ² / [Σx2i²(1 − r23²)],   se(β̂2) = √var(β̂2)

var(β̂3) = σ² / [Σx3i²(1 − r23²)],   se(β̂3) = √var(β̂3)

cov(β̂2, β̂3) = −r23σ² / [(1 − r23²)√(Σx2i²)√(Σx3i²)]

where r23² = (Σx2i x3i)² / (Σx2i² Σx3i²)
Properties of the multiple OLS estimators

1. The regression line (surface) passes through the means Ȳ, X̄2, X̄3,
   i.e., β̂1 = Ȳ − β̂2X̄2 − β̂3X̄3 (the model is linear in parameters)
   ==> Ȳ = β̂1 + β̂2X̄2 + β̂3X̄3 (regression through the means)
2. Ŷi = Ȳ + β̂2x2i + β̂3x3i, or in deviation form ŷ = β̂2x2 + β̂3x3.
   The estimators are unbiased: E(β̂) = β.
3. Σû = 0 (zero mean of the errors)
4. ΣûX2 = ΣûX3 = … = ΣûXk = 0, with constant Var(ui) = σ²
5. ΣûŶ = 0 (random sample)
6. As X2 and X3 become closely related, var(β̂2) and var(β̂3) become large, approaching infinity; the true values of β2 and β3 are then difficult to know.

All the normality assumptions of the two-variable regression also apply to the multiple-variable regression. One additional assumption is needed:
no exact linear relationship among the independent variables
(no perfect collinearity, i.e., Xk ≠ λXj).

7. The greater the variation in the sample values of X2 or X3, the smaller the variances of β̂2 and β̂3, and the more precise the estimates.
8. The estimators are BLUE (Gauss-Markov theorem).
The adjusted R² (R̄²) as one indicator of overall fit

R² = ESS/TSS = 1 − RSS/TSS = 1 − Σû²/Σy²

R̄² = 1 − [Σû²/(n−k)] / [Σy²/(n−1)]

R̄² = 1 − σ̂²/SY²

R̄² = 1 − (Σû²/Σy²) × [(n−1)/(n−k)]

R̄² = 1 − (1−R²)(n−1)/(n−k)

k : # of independent variables plus the constant term.
n : # of observations.

R̄² ≤ R², and 0 < R² < 1, but the adjusted R̄² can be negative: R̄² ≤ 0 is possible.

Note: Don't misuse the adjusted R²; see Gujarati (2003), p. 222.
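These two statistics are a few lines of code; a minimal sketch (function name illustrative):

```python
import numpy as np

def r2_and_adj_r2(y, u_hat, k):
    """R^2 = 1 - RSS/TSS and adjusted R^2 = 1 - (1 - R^2)(n-1)/(n-k),
    with k counting the constant term as well."""
    n = len(y)
    rss = np.sum(u_hat ** 2)
    tss = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 - rss / tss
    return r2, 1.0 - (1.0 - r2) * (n - 1) / (n - k)
```

Feeding it the residuals of any fitted model reproduces both statistics; note the adjusted value can indeed come out negative when R² is small and k is large.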
Some notes on R² and adjusted R²

• R² always increases when we add independent variables, so we use R̄² to decide whether or not to add one more variable. The rule that can be applied is: add the variable if R̄² increases and the t-test for the new variable is significant.
• We compare the R² of two models to select the better model only if the two models have the same sample size n and the same form of dependent variable.
• The game of maximizing R̄²: in regression analysis we should be more concerned with the logical or theoretical relevance of X to Y and its significance. It is not necessary to get a very high R² or adjusted R². If they are high, good; but if they are low, it does not mean the model is bad.
Applying elasticities

ε = (percentage change in Y)/(percentage change in X) = (ΔY/Y)/(ΔX/X) = (ΔY/ΔX)(X/Y)

Y = β1 + β2X
ΔY/ΔX = β2
ε = (ΔY/ΔX)(X/Y) = β2(X/Y)

Estimating elasticities

ε̂ = b2(X̄/Ȳ)

Ŷt = b1 + b2Xt = 4 + 1.5Xt
X̄ = 8 = average number of years of experience
Ȳ = $10 = average wage rate

ε̂ = b2(X̄/Ȳ) = 1.5 × (8/10) = 1.2
Log-log models: nonlinearities

ln Y = β1 + β2 ln X

∂ln Y/∂X = β2 ∂ln X/∂X
(1/Y)(∂Y/∂X) = β2(1/X)
(∂Y/∂X)(X/Y) = β2

Elasticity of Y with respect to X:

ε = (∂Y/∂X)(X/Y) = β2
Summary

Model     Equation               dY/dX        Elasticity (dY/dX)(X/Y)
Linear    Y = β1 + β2X           β2           β2(X/Y)
Log-log   ln Y = β1 + β2 ln X    β2(Y/X)      β2

(For the log-log model, d ln Y/d ln X = β2 ==> dY/dX = β2(Y/X).)
Summary (cont.)

Model       Equation               dY/dX         Elasticity (dY/dX)(X/Y)
Log-lin     ln Y = β1 + β2X        β2Y           β2X
Lin-log     Y = β1 + β2 ln X       β2(1/X)       β2(1/Y)
Reciprocal  Y = β1 + β2(1/X)       −β2(1/X²)     −β2(1/XY)

(Log-lin: d ln Y/dX = β2 ==> dY/dX = β2Y.
 Lin-log: dY/d ln X = β2 ==> dY/dX = β2/X.
 Reciprocal: dY/d(1/X) = β2 ==> dY/dX = −β2/X².)
Application of functional-form regressions

1. Cobb-Douglas production function:

Y = β1 L^β2 K^β3 e^u

Transforming: ln Y = ln β1 + β2 ln L + β3 ln K + u
==> ln Y = β′1 + β2 ln L + β3 ln K + u

d ln Y / d ln L = β2 : elasticity of output w.r.t. the labor input
d ln Y / d ln K = β3 : elasticity of output w.r.t. the capital input

β2 + β3 >, =, < 1 gives information about returns to scale.
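A sketch of estimating the Cobb-Douglas elasticities by OLS on the log form, with synthetic data whose true elasticities are 0.6 and 0.4 (illustrative, not the lecture's dataset):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
L = rng.uniform(1, 100, n)                 # labor input (synthetic)
K = rng.uniform(1, 100, n)                 # capital input (synthetic)
Y = 2.0 * L ** 0.6 * K ** 0.4 * np.exp(rng.normal(0, 0.05, n))

# ln Y = beta1' + beta2 ln L + beta3 ln K + u
X = np.column_stack([np.ones(n), np.log(L), np.log(K)])
beta_hat, *_ = np.linalg.lstsq(X, np.log(Y), rcond=None)

b2, b3 = beta_hat[1], beta_hat[2]
print(f"labor elasticity {b2:.3f}, capital elasticity {b3:.3f}")
print(f"returns to scale: {b2 + b3:.3f} (>1 increasing, =1 constant, <1 decreasing)")
```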
2. Polynomial regression model:
Marginal cost function or total cost function, e.g.

Y = β1 + β2X + β3X² + u (MC)

[Figure: marginal cost (MC) curve, costs vs. output]

or

Y = β1 + β2X + β3X² + β4X³ + u (TC)

[Figure: total cost (TC) curve, costs vs. output]
Chapter 8

Multiple Regression Analysis: The Problem of Inference
Hypothesis testing in multiple regression:
1. Testing an individual partial coefficient
2. Testing the overall significance of all coefficients
3. Testing restrictions on variables (add or drop): βk = 0?
4. Testing partial coefficients under some restrictions,
   such as β2 + β3 = 1; or β2 = β3 (i.e., β2 − β3 = 0); etc.
5. Testing the functional form of the regression model
6. Testing the stability of the estimated regression model
   -- over time
   -- across different cross-sections
1. Individual partial coefficient test

(1) Holding X3 constant: does X2 have an effect on Y?

H0 : β2 = 0 (∂Y/∂X2 = β2 = 0?)
H1 : β2 ≠ 0

t = (β̂2 − 0)/se(β̂2) = 0.726/0.048 = 14.906

Compare with the critical value tc(0.025, 12) = 2.179.

Since t > tc ==> reject H0.

Answer: Yes, β̂2 is statistically significant, i.e., significantly different from zero.
1. Individual partial coefficient test (cont.)

(2) Holding X2 constant: does X3 have an effect on Y?

H0 : β3 = 0 (∂Y/∂X3 = β3 = 0?)
H1 : β3 ≠ 0

t = (β̂3 − 0)/se(β̂3) = (2.736 − 0)/0.848 = 3.226

Critical value: tc(0.025, 12) = 2.179.

Since |t| > |tc| ==> reject H0.

Answer: Yes, β̂3 is statistically significant, i.e., significantly different from zero.
2. Testing the overall significance of the multiple regression

Three-variable case: Y = β1 + β2X2 + β3X3 + u

H0 : β2 = 0, β3 = 0 (all slope coefficients are zero: no variable has an effect)
H1 : β2 ≠ 0 or β3 ≠ 0 (at least one variable has an effect)

1. Compute and obtain the F-statistic.
2. Look up the critical value Fc(α, k−1, n−k).
3. Compare F and Fc; if F > Fc ==> reject H0.
Analysis of variance: since y = ŷ + û,
==> Σy² = Σŷ² + Σû², i.e., TSS = ESS + RSS

ANOVA table:

Source of variation       Sum of squares (SS)   df     Mean sum of squares (MSS)
Due to regression (ESS)   Σŷ²                   k−1    Σŷ²/(k−1)
Due to residuals (RSS)    Σû²                   n−k    Σû²/(n−k) = σ̂u²
Total variation (TSS)     Σy²                   n−1

Note: k is the total number of parameters, including the intercept term.

F = (MSS of ESS)/(MSS of RSS) = [ESS/(k−1)] / [RSS/(n−k)] = [Σŷ²/(k−1)] / [Σû²/(n−k)]

H0 : β2 = … = βk = 0
H1 : at least one βj ≠ 0

If F > Fc(α, k−1, n−k) ==> reject H0.
Three- 2x2 + ^3x3 + u^
y= ^
variable
 y2 = ^2  x2 y + ^
3  x3 y +  u^2
case
TSS = ESS + RSS

ANOVA TABLE
Source of variation SS df(k=3) MSS
ESS ^ x y + ^ x y
 3-1 ESS/3-1
2 2 3 3

RSS ^2
u n-3 RSS/n-3
(n-k)
TSS y2 n-1
ESS / k-1 (^ x y + ^ x y) / 3-1
2 2 3 3
F-Statistic = = ^2 / n-3
RSS / n-k u 29
An important relationship between R² and F:

F = [ESS/(k−1)] / [RSS/(n−k)]
  = [ESS/RSS] × [(n−k)/(k−1)]
  = [ESS/(TSS − ESS)] × [(n−k)/(k−1)]
  = [(ESS/TSS)/(1 − ESS/TSS)] × [(n−k)/(k−1)]
  = [R²/(1−R²)] × [(n−k)/(k−1)]

i.e., F = [R²/(k−1)] / [(1−R²)/(n−k)], and conversely R² = (k−1)F / [(k−1)F + (n−k)].

For the three-variable case: F = (R²/2) / [(1−R²)/(n−3)].
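As a quick numerical check of this identity, using the R² that appears in the example a few slides below:

```python
# Check: F = [R^2/(k-1)] / [(1-R^2)/(n-k)] with R^2 = 0.971088, k = 4, n = 20
R2, k, n = 0.971088, 4, 20
F = (R2 / (k - 1)) / ((1 - R2) / (n - k))
print(round(F, 2))  # 179.13, matching the F* reported on the following slides
```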
R² and the adjusted R² (R̄²)

R² = ESS/TSS = 1 − RSS/TSS = 1 − Σû²/Σy²

R̄² = 1 − [Σû²/(n−k)] / [Σy²/(n−1)]

R̄² = 1 − σ̂²/SY²

R̄² = 1 − (Σû²/Σy²) × [(n−1)/(n−k)]

R̄² = 1 − (1−R²)(n−1)/(n−k)

k : # of independent variables, including the constant term.
n : # of observations.

R̄² ≤ R², and 0 < R² < 1, but the adjusted R̄² can be negative: R̄² ≤ 0 is possible.
Overall significance test:

H0 : β2 = β3 = β4 = 0
H1 : at least one coefficient is not zero (β2 ≠ 0, or β3 ≠ 0, or β4 ≠ 0)

F* = [R²/(k−1)] / [(1−R²)/(n−k)]
   = (0.9710/3) / [(1−0.9710)/16]
   = 179.13

Fc(0.05, 4−1, 20−4) = 3.24   (k−1 = 3, n−k = 16)

Since F* > Fc ==> reject H0.
Construct the ANOVA table (8.4) (information from EViews):

Source of variation   SS                                  df      MSS
Due to regression     R²Σy² = (0.971088)(28.97771)²×19    k−1     R²Σy²/(k−1)
(ESS)                 = 15493.171                         = 3     = 5164.3903
Due to residuals      (1−R²)Σy² = Σû²                     n−k     (1−R²)Σy²/(n−k)
(RSS)                 = (0.028912)(28.97771)²×19          = 16    = 28.8288
                      = 461.2621
Total (TSS)           Σy² = (28.97771)²×19                n−1
                      = 15954.446                         = 19

Since (σy)² = Var(Y) = Σy²/(n−1) ==> (n−1)(σy)² = Σy².

F* = (MSS of regression)/(MSS of residuals) = 5164.3903/28.8288 = 179.1339
Example: Gujarati (2003), Table 6.4, p. 185

H0 : β2 = β3 = 0

F* = [ESS/(k−1)] / [RSS/(n−k)] = [R²/(k−1)] / [(1−R²)/(n−k)]
   = (0.707665/2) / [(1−0.707665)/61]

F* = 73.832

Fc(0.05, 3−1, 64−3) = 3.15   (k−1 = 2, n−k = 61)

Since F* > Fc ==> reject H0.
Construct the ANOVA table (8.4) (information from EViews):

Source of variation   SS                                  df      MSS
Due to regression     R²Σy² = (0.707665)(75.97807)²×64    k−1     R²Σy²/(k−1)
(ESS)                 = 261447.33                         = 2     = 130723.67
Due to residuals      (1−R²)Σy² = Σû²                     n−k     (1−R²)Σy²/(n−k)
(RSS)                 = (0.292335)(75.97807)²×64          = 61    = 1770.547
                      = 108003.37
Total (TSS)           Σy² = (75.97807)²×64                n−1
                      = 369450.7                          = 63

Since (σy)² = Var(Y) = Σy²/(n−1) ==> (n−1)(σy)² = Σy².

F* = (MSS of regression)/(MSS of residuals) = 130723.67/1770.547 = 73.832
Y = 1 + 2 X2 + 3 X3 + u

H0 : 2 = 0, 3= 0,
H1 : 2  0 ; 3  0

Compare F* and F , checks the F-table:


c Fc0.01, 2, 61 = 4.98
Fc0.05, 2, 61 = 3.15
Decision Rule:
Since F*= .73.832 > Fc = 4.98 (3.15) ==> reject Ho

Answer : The overall estimators are statistically significant


different from zero.

36
3. Testing an additional variable in the regression model

Old model:
Y = β1 + β2X2 + u1
Obtain R²old, RSSold and/or ESSold.

Now consider a new variable X3: is it relevant to add it or not?

New model:
Y = β1 + β2X2 + β3X3 + u2
Obtain R²new, RSSnew and ESSnew.

H0 : β3 = 0, adding X3 is not relevant
H1 : β3 ≠ 0, adding X3 is relevant
Steps for testing whether X3 makes an incremental contribution to the explanatory power of the new model:

1. Compute the F-statistic:

F* = [(RSSold − RSSnew) / (# of additional variables, here J = 1)]
     / [RSSnew / (n − # of parameters in the new model, here n−3)]

2. Compare F* with Fc(α, 1, n−3).
3. Decision rule: if F* > Fc ==> reject H0 : β3 = 0,
   which means X3 is a relevant variable to be added to the model.

F* can also be calculated as

F = [(R²new − R²old) / (# of new regressors added or dropped)]
    / [(1 − R²new) / (n−k in the new model)]
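A sketch of this incremental F test given the two RSS values (function name and defaults are illustrative):

```python
from scipy import stats

def incremental_f(rss_old, rss_new, n, k_new, n_added=1, alpha=0.05):
    """F test for added variables: H0: the added coefficients are zero.
    k_new counts the parameters of the new model, so the denominator
    df is n - k_new."""
    df2 = n - k_new
    F = ((rss_old - rss_new) / n_added) / (rss_new / df2)
    Fc = stats.f.ppf(1 - alpha, n_added, df2)
    return F, Fc, F > Fc  # True in the last slot means: reject H0
```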
Adding an irrelevant variable (Studenmund, p. 166):

Old model:
RSSold = 160.5929, R²old = 0.986828
New model. H0 : adding X3 (R) is not suitable, β3 = 0

Fc(0.05, 1, 39) = 4.17

F* = [(R²new − R²old) / (# of new regressors)] / [(1 − R²new) / (n−k in the new model)]
   = [(0.9872 − 0.9868)/1] / [(1 − 0.9872)/39]
   = 1.218

Since F* < Fc ==> do not reject H0.
Adding a relevant variable X3 (Studenmund, p. 166):

Old model:
Y = β0 + β1X1 + β2X2 + u

Next, new model:
Y = β0 + β1X1 + β2X2 + β3X3 + u′

H0 : adding X3 (the YD variable) is not suitable, β3 = 0
Adding a relevant variable X3 (YD) (Studenmund, p. 166):

New model:

F* = [(R²new − R²old) / (# of new regressors)] / [(1 − R²new) / (n−k)]
   = [(0.9868 − 0.9203)/1] / [(1 − 0.9868)/(44 − 4)]
   = (0.0665 × 40)/0.0132
   = 201.5

Fc(0.05, 1, 40) = 4.08. Since F* > Fc ==> reject H0.
Adding variables, general discussion:

old (restricted, βl = βm = 0): Yi = β1 + β2X2 + β3X3 + … + βkXk + ui
new (unrestricted): Yi = β1 + β2X2 + β3X3 + … + βkXk + βlXl + βmXm + ui

Test whether the two additional variables Xl and Xm are relevant:

F-test: H0 : βl = 0, βm = 0
        H1 : βl ≠ 0 or βm ≠ 0

F* = [(R²new − R²old) / (# of added regressors)] / [(1 − R²new) / (n−m)]

where m is the total number of parameters in the new model. If F* > Fc ==> reject H0.
Dropping variables, general discussion:

old (unrestricted): Yi = β1 + β2X2 + β3X3 + … + βkXk + βlXl + βmXm + ui
new (restricted, βl = βm = 0): Yi = β1 + β2X2 + β3X3 + … + βkXk + ui

Test whether the two dropped variables Xl and Xm are relevant:

F-test: H0 : βl = 0, βm = 0
        H1 : βl ≠ 0 or βm ≠ 0

F* = [(R²old − R²new) / (# of dropped regressors)] / [(1 − R²old) / (n−k)]

If F* > Fc ==> reject H0.
Justifying whether to keep (or drop) Xl and Xm in the model:

1. Theory: check the signs of β̂l and β̂m. Do the variables theoretically explain the dependent variable?
2. Overall fit: does R² increase or not? Does the F*-statistic increase or not?
3. Check the t-statistics of Xl and Xm: are t*l, t*m > 1.96 (5% level of significance)?
4. Bias: check the t-statistics of the other variables X2, …, Xk. Have they changed significantly or not?
Using R² instead of ESS or RSS in the F-test

Restriction test:

F = [(R²UR − R²R) / J] / [(1 − R²UR) / (n−k)]
  = [(R²UR − R²R) / (dfR − dfUR)] / [(1 − R²UR) / dfUR]

J : # of restrictions
k : # of parameters (including the intercept) in the unrestricted model

Note: we can use R² for the restriction test only if the two regression models have the same form of dependent variable.
Example 8.4: The demand for chicken (Gujarati (2003), p. 272)

Old model, or restricted model. [EViews output]

New model, or unrestricted model. [EViews output]

H0: no joint effect of X4 and X5, i.e., β4 = β5 = 0
Wald test (likelihood ratio test)

Adding variables:

(1) H0: no joint effect of X4 and X5, i.e., β4 = β5 = 0

F* = [(R²new − R²old) / (# added)] / [(1 − R²new) / (n−k)]
   = [(0.9823 − 0.9801)/2] / [(1 − 0.9823)/(23 − 5)]
   = 0.0011/0.000983
   = 1.119

Fc(0.05, 2, 18) = 3.55

Since F* < Fc ==> do not reject H0.
Dropping a variable:

(2) H0: no effect of X5, i.e., β5 = 0

Since t*(β̂5) < tc ==> do not reject H0.

F* = [(R²UR − R²R) / J] / [(1 − R²UR) / (n−k)]
   = [(0.982313 − 0.981509)/1] / [(1 − 0.982313)/(23 − 4)]
   = 0.864 < Fc(0.05, 1, 19) = 4.38 ==> do not reject H0
4. Testing partial coefficients under some restrictions:

Y = β1 X2^β2 X3^β3 e^u

Restricted least squares: constant returns to scale means

β2 + β3 = 1, i.e., β2 = 1 − β3 or β3 = 1 − β2

ln Y = β1 + β2 ln X2 + β3 ln X3 + u   (unrestricted model)

==> ln Y = β1 + (1 − β3) ln X2 + β3 ln X3 + u
==> ln Y = β1 + ln X2 + β3(ln X3 − ln X2) + u
==> (ln Y − ln X2) = β1 + β3(ln X3 − ln X2) + u
==> ln(Y/X2) = β′1 + β′3 ln(X3/X2) + u′   (restricted model)
    i.e., Y* = β′1 + β′3X* + u′

OR

==> ln Y = β1 + β2 ln X2 + (1 − β2) ln X3 + u
==> ln Y = β1 + β2 ln X2 + ln X3 − β2 ln X3 + u
==> (ln Y − ln X3) = β1 + β2(ln X2 − ln X3) + u
==> ln(Y/X3) = β″1 + β″2 ln(X2/X3) + u″   (restricted model)
    i.e., Y** = β″1 + β″2X** + u″
Unrestricted equation: ln Y = β1 + β2 ln X2 + β3 ln X3 + u
Restricted equation: ln(Y/X2) = β′1 + β′3 ln(X3/X2) + u′

H0 : β2 + β3 = 1

F = [(RSSR − RSSUR) / m] / [RSSUR / (n−k)]
    m : # of restrictions in the restricted model
    k : # of parameters in the unrestricted model

RSSUR = 0.013604, RSSR = 0.016629
F* = 3.75, Fc(0.05, 1, 17) = 4.45
Unrestricted equation: ln Y = β1 + β2 ln X2 + β3 ln X3 + u
Restricted equation: ln(Y/X3) = β″1 + β″2 ln(X2/X3) + u″

H0 : β2 + β3 = 1

F = [(RSSR − RSSUR) / m] / [RSSUR / (n−k)]

RSSUR = 0.013604, RSSR = 0.016629
F* = 3.75 < Fc ==> do not reject H0
Test for a restriction on parameters: the t-test approach

Unrestricted model: Y = β1 + β2X2 + β3X3 + u
Restriction: β2 = β3 (or β2 − β3 = δ = 0, i.e., β2 = δ + β3 with δ = 0)

Compute t = (β̂2 − β̂3) / se(β̂2 − β̂3), compare it with tc, and follow the usual t-test decision rule, where

se(β̂2 − β̂3) = √[var(β̂2) + var(β̂3) − 2cov(β̂2, β̂3)]

Or use the F-test: rewrite the equation as

==> Y = β1 + (δ + β3)X2 + β3X3 + u′
==> Y = β1 + δX2 + β3X2 + β3X3 + u′
==> Y = β1 + δX2 + β3(X2 + X3) + u′

Restricted model: Y = β1 + δX2 + β3X*3 + u′, where X*3 = X2 + X3.
Simply use the t-value to test whether δ is zero or not.
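A minimal sketch of that t statistic, given an estimated coefficient vector and its variance-covariance matrix (function name and index arguments are illustrative):

```python
import numpy as np

def t_equal_coefs(beta_hat, var_cov, i=1, j=2):
    """t = (b_i - b_j) / se(b_i - b_j), with
    se = sqrt(var(b_i) + var(b_j) - 2 cov(b_i, b_j))."""
    se = np.sqrt(var_cov[i, i] + var_cov[j, j] - 2.0 * var_cov[i, j])
    return (beta_hat[i] - beta_hat[j]) / se
```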
Example:

Unrestricted model vs. restricted model. [EViews output]

H0 : β3 − β4 = 0, i.e., H0 : δ = 0
H1 : δ ≠ 0
tc(0.05, 931) = 1.96

F* = [(R²UR − R²R) / m] / [(1 − R²UR) / (n−k)] = 0 ==> do not reject H0

Fc(0.05, 4, 931) = 2.37
5. Test for functional form:
the MWD test (MacKinnon, White, Davidson)
(Optional reading)
The MWD test (MacKinnon, White, Davidson) for the functional form (Gujarati (2003), p. 280)

H0: linear model
H1: log-linear (log-log) model

1. Run OLS on the linear model and obtain Ŷ:
   Ŷ = β̂1 + β̂2X2 + β̂3X3
2. Run OLS on the log-log model and obtain l̂nY:
   l̂nY = β̂1 + β̂2 ln X2 + β̂3 ln X3
3. Compute Z1 = ln(Ŷ) − l̂nY.
4. Run OLS on the linear model adding Z1:
   Y = β̂′1 + β̂′2X2 + β̂′3X3 + β̂′4Z1
   and check the t-statistic of β̂′4:
   If t*(β̂′4) > tc ==> reject H0 : linear model
   If t*(β̂′4) < tc ==> do not reject H0 : linear model
The MWD test for the functional form (cont.)

5. Compute Z2 = antilog(l̂nY) − Ŷ.
6. Run OLS on the log-log model adding Z2:
   lnY = β̂″1 + β̂″2 ln X2 + β̂″3 ln X3 + β̂″4Z2
   and check the t-statistic of β̂″4:
   If t*(β̂″4) > tc ==> reject H0 : log-log model
   If t*(β̂″4) < tc ==> do not reject H0 : log-log model
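A sketch implementing MWD steps 1-6 with plain numpy (the function name and the synthetic data are illustrative; it assumes all Y values and the fitted Ŷ are positive so the logs exist):

```python
import numpy as np

def mwd_test(Y, X_cols):
    """Returns the t values of Z1 (step 4, H0: linear) and Z2 (step 6,
    H0: log-log). X_cols is a list of regressor arrays; all Y, X and
    fitted Y-hat values must be positive so the logs exist."""
    n = len(Y)
    ones = np.ones(n)

    def ols(y, X):
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        u = y - X @ b
        s2 = (u @ u) / (n - X.shape[1])
        se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
        return b, se

    X_lin = np.column_stack([ones] + X_cols)
    X_log = np.column_stack([ones] + [np.log(x) for x in X_cols])

    b_lin, _ = ols(Y, X_lin)              # step 1: fitted Y-hat
    Y_hat = X_lin @ b_lin
    b_log, _ = ols(np.log(Y), X_log)      # step 2: fitted lnY-hat
    lnY_hat = X_log @ b_log

    Z1 = np.log(Y_hat) - lnY_hat          # step 3
    b1, se1 = ols(Y, np.column_stack([X_lin, Z1]))           # step 4
    Z2 = np.exp(lnY_hat) - Y_hat          # step 5: antilog minus Y-hat
    b2, se2 = ols(np.log(Y), np.column_stack([X_log, Z2]))   # step 6
    return b1[-1] / se1[-1], b2[-1] / se2[-1]

# illustrative call on synthetic data generated from a log-log model:
rng = np.random.default_rng(3)
X2, X3 = rng.uniform(1, 10, 50), rng.uniform(1, 10, 50)
Y = np.exp(0.5 + 0.7 * np.log(X2) + 0.3 * np.log(X3) + rng.normal(0, 0.1, 50))
print(mwd_test(Y, [X2, X3]))
```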
MWD test: testing the functional form of the regression

Example 8.5

Step 1: Run the linear model and obtain Ŷ. [EViews output]

CV1 = σ̂/Ȳ = 1583.279/24735.33 = 0.064
Step 2: Run the log-log model and obtain l̂nY (the fitted, or estimated, values). [EViews output]

CV2 = σ̂/Ȳ = 0.07481/10.09653 = 0.0074
Step 4: H0 : the true model is linear. [EViews output]

tc(0.05, 11) = 1.796
tc(0.10, 11) = 1.363

t* < tc at 5% ==> do not reject H0
t* > tc at 10% ==> reject H0
Step 6: H0 : the true model is the log-log model. [EViews output]

tc(0.025, 11) = 2.201
tc(0.05, 11) = 1.796
tc(0.10, 11) = 1.363

Since t* < tc ==> do not reject H0.

Comparing the coefficients of variation: C.V.1/C.V.2 = 0.064/0.0074.
An alternative criterion for comparing two different functional models: the coefficient of variation

C.V. = σ̂/Ȳ

It measures the average error of the sample regression function relative to the mean of Y. With it, linear, log-linear, and log-log equations can be meaningfully compared:

the smaller a model's C.V., the more preferred the equation (functional form).
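In code form the criterion is one line; the figures in the usage line echo step 1 of the example above:

```python
def coeff_variation(sigma_hat, y_bar):
    """C.V. = sigma_hat / Y-bar, in the model's own units."""
    return sigma_hat / y_bar

print(coeff_variation(1583.279, 24735.33))  # ~0.064, as in step 1 above
```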
Comparing two models with different functional forms:

Model 1: linear model. Model 2: log-log model. [EViews output]

Coefficient of variation (C.V.):
  model 1: σ̂/Ȳ = 2.1225/89.612 = 0.0236
  model 2: σ̂/Ȳ = 0.0217/4.4891 = 0.0048

The ratio 0.0236/0.0048 = 4.916 means that model 2 is better.
6. The "Chow test": a structural-stability test

H0 : no structural change
H1 : there is a structural change

Procedure:

1. Divide the sample of N observations into two groups:
   - group 1, consisting of the first n1 obs.;
   - group 2, consisting of the remaining n2 = N − n1 obs.
67
2. Run OLS on two sub-sample groups separately and
obtain the RSS1, and RSS2

3. Run OLS on the whole sample (N) and obtain the


restricted RSSR

(RSSR - RSS1 - RSS2) / k


4. Compute F = *

(RSS1 + RSS2) / N-2k

5. Compute F* and Fc, k, N-2k


If F* > Fc ==> reject H0
It means that there is a structural change in the sample.

68
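A sketch of steps 2-5 as a function (function name illustrative; X1, X2 are the sub-sample design matrices including the constant column):

```python
import numpy as np
from scipy import stats

def chow_test(y1, X1, y2, X2, alpha=0.05):
    """RSS1, RSS2 from the sub-samples, RSS_R from the pooled sample,
    then F* with (k, N - 2k) degrees of freedom."""
    def rss(y, X):
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        return np.sum((y - X @ b) ** 2)

    k = X1.shape[1]
    N = len(y1) + len(y2)
    rss_r = rss(np.concatenate([y1, y2]), np.vstack([X1, X2]))
    rss_1, rss_2 = rss(y1, X1), rss(y2, X2)
    F = ((rss_r - rss_1 - rss_2) / k) / ((rss_1 + rss_2) / (N - 2 * k))
    Fc = stats.f.ppf(1 - alpha, k, N - 2 * k)
    return F, Fc, F > Fc  # True: reject H0, structural change

# the savings example below in these terms:
# F* = ((23248.3 - 1785.03 - 10005.2)/2) / ((1785.03 + 10005.2)/22) ~ 10.69
```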
Structural stability: Chow test

[Figure: scatter plot of income and savings]

Whole sample (under H0 : Var(u1) = Var(u2) = σ²): gives RSSR. [EViews output]

Sub-sample n1: Y = β1 + β2X + u1, gives RSS1. [EViews output]

Sub-sample n2: Y = β1 + β2X + u2, gives RSS2. [EViews output]
Empirical results:

Dep. variable   Constant   Indep. var. X   R²       SEE     RSS       n
Y (70-95)       62.4226    0.0376          0.7672   31.12   23248.3   26
                (4.89)     (8.89)
Y (70-81)       1.0161     0.0803          0.9021   13.36   1785.03   12
                (0.08)     (9.60)
Y (82-95)       153.494    0.0148          0.2071   28.87   10005.2   14
                (4.69)     (1.77)

F* = [(RSSR − RSS1 − RSS2) / k] / [(RSS1 + RSS2) / (N − 2k)]
   = [(23248.3 − 1785.03 − 10005.2)/2] / [(1785.03 + 10005.2)/22]
   = 10.69

Fc(0.01) = 5.72; Fc(0.05) = 3.44; Fc(0.10) = 2.56

Conclusion: F* > Fc ==> reject H0
Prediction in multiple regression

• The procedure is exactly the same as prediction in the two-variable model.
• Using matrix notation, the formula is given in Appendix C.9, on page 861 of the book.
• We again have both "mean prediction" and "individual prediction".
• The software will do it for you.
THE END
