
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL

. reg EARNINGS S EXP

      Source |       SS       df       MS              Number of obs =     540
-------------+------------------------------           F(  2,   537) =   67.54
       Model |  22513.6473     2  11256.8237           Prob > F      =  0.0000
    Residual |  89496.5838   537  166.660305           R-squared     =  0.2010
-------------+------------------------------           Adj R-squared =  0.1980
       Total |  112010.231   539  207.811189           Root MSE      =   12.91

------------------------------------------------------------------------------
    EARNINGS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S |   2.678125   .2336497    11.46   0.000     2.219146    3.137105
         EXP |   .5624326   .1285136     4.38   0.000     .3099816    .8148837
       _cons |  -26.48501    4.27251    -6.20   0.000    -34.87789   -18.09213
------------------------------------------------------------------------------

$\widehat{EARNINGS} = -26.49 + 2.68\,S + 0.56\,EXP$

The output above shows the result of regressing EARNINGS, hourly earnings in dollars, on
S, years of schooling, and EXP, years of work experience.

[Figure: scatter diagram of hourly earnings ($) against years of schooling (highest grade completed).]

Suppose that you were particularly interested in the relationship between EARNINGS and S
and wished to represent it graphically, using the sample data.


A simple plot would be misleading.

. cor S EXP
(obs=540)

             |        S      EXP
-------------+------------------
           S |   1.0000
         EXP |  -0.2179   1.0000

Schooling is negatively correlated with work experience. The plot fails to take account of
this, and as a consequence the regression line underestimates the impact of schooling on
earnings.

We will investigate the distortion mathematically when we come to omitted variable bias.


To eliminate the distortion, you purge both EARNINGS and S of their components related to
EXP and then draw a scatter diagram using the purged variables.

. reg EARNINGS EXP

      Source |       SS       df       MS              Number of obs =     540
-------------+------------------------------           F(  1,   538) =    2.98
       Model |  617.717488     1  617.717488           Prob > F      =  0.0847
    Residual |  111392.514   538  207.049282           R-squared     =  0.0055
-------------+------------------------------           Adj R-squared =  0.0037
       Total |  112010.231   539  207.811189           Root MSE      =  14.389

------------------------------------------------------------------------------
    EARNINGS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         EXP |   .2414715   .1398002     1.73   0.085    -.0331497    .5160927
       _cons |   15.55527   2.442468     6.37   0.000     10.75732    20.35321
------------------------------------------------------------------------------

. predict EEARN, resid

We start by regressing EARNINGS on EXP, as shown above. The residuals are the part of
EARNINGS that is not related to EXP. With the ‘resid’ option, Stata's ‘predict’ command
saves the residuals from the most recent regression. We name them EEARN.

. reg S EXP

      Source |       SS       df       MS              Number of obs =     540
-------------+------------------------------           F(  1,   538) =   26.82
       Model |  152.160205     1  152.160205           Prob > F      =  0.0000
    Residual |  3052.82313   538  5.67439243           R-squared     =  0.0475
-------------+------------------------------           Adj R-squared =  0.0457
       Total |  3204.98333   539  5.94616574           Root MSE      =  2.3821

------------------------------------------------------------------------------
           S |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         EXP |  -.1198454   .0231436    -5.18   0.000    -.1653083   -.0743826
       _cons |   15.69765   .4043447    38.82   0.000     14.90337    16.49194
------------------------------------------------------------------------------

. predict ES, resid

We do the same with S. We regress it on EXP and save the residuals as ES.

[Figure: scatter diagram of EEARN (vertical axis) against ES (horizontal axis), with the trend line shown in red.]

Now we plot EEARN on ES and the scatter is a faithful representation of the relationship,
both in terms of the slope of the trend line (the red line) and in terms of the variation about
that line.

[Figure: the same scatter diagram of EEARN against ES, with the trend line from the scatter diagram that did not control for EXP added as a black dashed line.]

As you would expect, the trend line is steeper than in the scatter diagram that did not
control for EXP (reproduced here as the black dashed line).
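
The size of the gap between the two slopes can be gauged from figures already reported in the Stata output. The sketch below is only a rough check. It assumes a standard OLS identity that the slides do not state: the slope in the simple regression of EARNINGS on S equals the multiple regression coefficient of S plus the coefficient of EXP multiplied by the slope from regressing EXP on S. The variable names are illustrative.

```python
# Figures copied from the Stata output in this slideshow.
b_s = 2.678125               # coefficient of S in: reg EARNINGS S EXP
b_exp = 0.5624326            # coefficient of EXP in: reg EARNINGS S EXP
slope_s_on_exp = -0.1198454  # coefficient of EXP in: reg S EXP
ss_model = 152.160205        # Model SS in: reg S EXP
ss_total = 3204.98333        # Total SS in: reg S EXP

# Squared sample correlation between S and EXP (R-squared of reg S EXP).
r_squared = ss_model / ss_total

# The slopes of S on EXP and of EXP on S multiply to r_squared,
# so the slope of EXP on S can be backed out without the raw data.
slope_exp_on_s = r_squared / slope_s_on_exp

# Assumed identity: simple slope = b_s + b_exp * (slope of EXP on S).
implied_simple_slope = b_s + b_exp * slope_exp_on_s

print(round(implied_simple_slope, 3))  # about 2.455, below the 2.678 obtained when controlling for EXP
```

Because EXP has a positive coefficient and is negatively related to S, the second term is negative, which is why the uncontrolled line is flatter.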

. reg EEARN ES

      Source |       SS       df       MS              Number of obs =     540
-------------+------------------------------           F(  1,   538) =  131.63
       Model |  21895.9298     1  21895.9298           Prob > F      =  0.0000
    Residual |  89496.5833   538  166.350527           R-squared     =  0.1966
-------------+------------------------------           Adj R-squared =  0.1951
       Total |  111392.513   539  206.665145           Root MSE      =  12.898

------------------------------------------------------------------------------
       EEARN |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ES |   2.678125   .2334325    11.47   0.000     2.219574    3.136676
       _cons |   8.10e-09   .5550284     0.00   1.000    -1.090288    1.090288
------------------------------------------------------------------------------

Here is the regression of EEARN on ES.

From multiple regression:

. reg EARNINGS S EXP
------------------------------------------------------------------------------
    EARNINGS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S |   2.678125   .2336497    11.46   0.000     2.219146    3.137105
         EXP |   .5624326   .1285136     4.38   0.000     .3099816    .8148837
       _cons |  -26.48501    4.27251    -6.20   0.000    -34.87789   -18.09213
------------------------------------------------------------------------------

A mathematical proof that the technique works requires matrix algebra. We will content
ourselves with verifying that the estimate of the slope coefficient is the same as in the
multiple regression.
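
The same check can be repeated on other data. The sketch below is a minimal illustration in Python rather than Stata, and it uses synthetic data, not the sample behind the output above. The data-generating numbers and the helper names (`ols_simple`, `residuals`, `slope_on_x1`) are assumptions made up for this example. It purges both variables of EXP, regresses one set of residuals on the other, and compares the slope with the one from the two-regressor formula.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) from an OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Residuals from regressing y on x: the part of y not related to x."""
    a, b = ols_simple(x, y)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


def slope_on_x1(x1, x2, y):
    """Coefficient of x1 in an OLS regression of y on x1 and x2 (with intercept)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    return (s1y * s22 - s2y * s12) / (s11 * s22 - s12 ** 2)


# Synthetic data (hypothetical parameters, chosen only to resemble the setting).
random.seed(0)
exp_ = [random.uniform(0, 30) for _ in range(540)]
s = [16 - 0.12 * e + random.gauss(0, 2.4) for e in exp_]
earn = [-26 + 2.7 * si + 0.56 * ei + random.gauss(0, 13)
        for si, ei in zip(s, exp_)]

eearn = residuals(exp_, earn)   # EARNINGS purged of EXP
es = residuals(exp_, s)         # S purged of EXP

_, slope_resid = ols_simple(es, eearn)     # slope in the residuals regression
slope_multi = slope_on_x1(s, exp_, earn)   # coefficient of S in the multiple regression

print(slope_resid, slope_multi)  # the two values agree up to rounding error
```

The agreement does not depend on the particular numbers used to generate the data. It is a property of least squares, which is what the matrix algebra proof establishes.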

Finally, a small and not very important technical point. You may have noticed that the
standard error and t statistic do not quite match. The reason for this is that the number of
degrees of freedom is overstated by 1 in the residuals regression.

That regression has not made allowance for the fact that we have already used up 1 degree
of freedom in removing EXP from the model.
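
The discrepancy can be reconciled by hand. The sketch below takes the figures from the two outputs and rescales the standard error from the residuals regression by the square root of the ratio of the two degrees of freedom (538 reported, 537 correct). The rescaling rule is an assumption of this sketch. It follows from both regressions having the same residual sum of squares, so their standard errors differ only through the divisor used for the error variance.

```python
import math

# Figures copied from the Stata output in this slideshow.
coef = 2.678125        # slope of ES, identical to the coefficient of S
se_resid = 0.2334325   # standard error of ES in: reg EEARN ES
df_reported = 538      # degrees of freedom used by reg EEARN ES (540 - 2)
df_correct = 537       # degrees of freedom in reg EARNINGS S EXP (540 - 3)

# Same residual sum of squares, different divisor: rescale the standard error.
se_corrected = se_resid * math.sqrt(df_reported / df_correct)
t_corrected = coef / se_corrected

print(round(se_corrected, 7), round(t_corrected, 2))  # 0.2336497 11.46
```

The corrected values match the standard error (.2336497) and t statistic (11.46) reported for S in the multiple regression.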

Copyright Christopher Dougherty 2012.

These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.

The content of this slideshow comes from Section 3.2 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.

Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.10.28
