
Introduction to Econometrics

Eco-20042
Lecture 4

e20042lec4-BivarEx.docx 1/15
This time:

Look at a numerical example of bivariate linear regression

Extend the analysis to consider more explanatory variables

- Coefficient estimates and interpretation
- Tests of hypothesis on coefficients
- Goodness of fit measures and tests





An Example of Applied Regression Analysis

Estimating a supply function

$$Y_i = B_1 + B_2 X_i + u_i \qquad \text{(PRF)}$$

$Y_i$ is the quantity produced (tons)
$X_i$ is the price of the good (per ton)

The data are given in Table 1.


We use this example to calculate the estimated regression line, carry out formal
hypothesis tests on the coefficients, see how well the model fits the data, and
predict the quantity supplied at a particular price.



Table 1: Data for Supply Function


   n     Y     X     x     y    xy   x^2   y^2    yhat        e      e^2
   1    69     9     0     6     0     0    36   63.00     6.00    36.00
   2    76    12     3    13    39     9   169   72.75     3.25    10.56
   3    52     6    -3   -11    33     9   121   53.25    -1.25     1.56
   4    56    10     1    -7    -7     1    49   66.25   -10.25   105.06
   5    57     9     0    -6     0     0    36   63.00    -6.00    36.00
   6    77    10     1    14    14     1   196   66.25    10.75   115.56
   7    58     7    -2    -5    10     4    25   56.50     1.50     2.25
   8    55     8    -1    -8     8     1    64   59.75    -4.75    22.56
   9    67    12     3     4    12     9    16   72.75    -5.75    33.06
  10    53     6    -3   -10    30     9   100   53.25    -0.25     0.06
  11    72    11     2     9    18     4    81   69.50     2.50     6.25
  12    64     8    -1     1    -1     1     1   59.75     4.25    18.06
 sum   756   108     0     0   156    48   894  756.00     0.00   387.00
mean    63     9

n = 12. Lower-case x and y are deviations of X and Y from their means, yhat is the
fitted value and e is the residual.


$b_1 = 33.75$

$b_2 = 3.25$


Estimating the Coefficients:

using deviations of the variables from their means

$$b_2 = \frac{\sum xy}{\sum x^2} = \frac{156}{48} = 3.25$$

$$b_1 = \bar{Y} - b_2 \bar{X} = 63 - (3.25)(9) = 33.75$$
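The deviation-form calculation can be checked with a few lines of Python. This is only a sketch using the Table 1 data; the variable names are illustrative and not part of the lecture.

```python
# Data from Table 1: quantity supplied (Y, tons) and price (X, per ton).
Y = [69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64]
X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]

n = len(Y)
Y_bar = sum(Y) / n
X_bar = sum(X) / n

# Deviations of each variable from its mean (lower-case x and y in Table 1).
x = [Xi - X_bar for Xi in X]
y = [Yi - Y_bar for Yi in Y]

sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

b2 = sum_xy / sum_x2       # slope
b1 = Y_bar - b2 * X_bar    # intercept

print(f"sum xy = {sum_xy}, sum x^2 = {sum_x2}")
print(f"b2 = {b2}, b1 = {b1}")
```

The sums and coefficients should agree with the hand calculation above.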










Stata Output:
reg Y X

Source | SS df MS Number of obs = 12
-------------+------------------------------ F( 1, 10) = 13.10
Model | 507 1 507 Prob > F = 0.0047
Residual | 387 10 38.7 R-squared = 0.5671
-------------+------------------------------ Adj R-squared = 0.5238
Total | 894 11 81.2727273 Root MSE = 6.2209

------------------------------------------------------------------------------
Y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
X | 3.25 .8979142 3.62 0.005 1.249322 5.250678
_cons | 33.75 8.27836 4.08 0.002 15.30466 52.19534
------------------------------------------------------------------------------

Given $b_1$ and $b_2$ we may calculate the predicted value $\hat{Y}_i$ for each $X_i$
value. These are shown in the yhat column of Table 1, together with the residuals in
the e column, and the fitted line is plotted in Figure 1 below.



The regression may be written as

$$Y_i = 33.75 + 3.25 X_i + e_i \qquad \text{or} \qquad \hat{Y}_i = 33.75 + 3.25 X_i$$

$b_2 = 3.25$ says that a one-unit increase in the price per ton raises predicted
output by 3.25 tons.




Figure 1: Quantity Supplied. Observed Y and the linear prediction plotted against X.

The Fit of the Regression

How well the regression line fits the data is shown by the value of $R^2$, which
is calculated as:

$$R^2 = 1 - \frac{\sum e^2}{\sum y^2}$$

From Table 1 we see that $\sum e^2 = 387$ and $\sum y^2 = 894$, hence

$$R^2 = 1 - \frac{\sum e^2}{\sum y^2} = 1 - \frac{387}{894} = 0.567$$
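As a cross-check on the fit calculation, the sketch below recomputes the residual, explained and total sums of squares from the data and the estimated coefficients. The names `rss`, `ess` and `tss` are labels chosen here for illustration; they correspond to the Residual, Model and Total rows of the Stata output.

```python
# Data from Table 1 and the estimated coefficients.
Y = [69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64]
X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]
b1, b2 = 33.75, 3.25

Y_bar = sum(Y) / len(Y)
fitted = [b1 + b2 * Xi for Xi in X]
resid = [Yi - Fi for Yi, Fi in zip(Y, fitted)]

tss = sum((Yi - Y_bar) ** 2 for Yi in Y)       # sum of y^2 (total)
rss = sum(e ** 2 for e in resid)               # sum of e^2 (residual)
ess = sum((Fi - Y_bar) ** 2 for Fi in fitted)  # explained (model)

r2 = 1 - rss / tss

print(f"TSS = {tss}, RSS = {rss}, ESS = {ess}")
print(f"R^2 = {r2:.4f}")
```

Because the regression includes an intercept, the explained and residual sums of squares add up to the total, so $R^2$ can equivalently be read as the explained share, 507/894.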




Testing the significance of parameter estimates

$b_1$ and $b_2$ are sample estimates of the true parameters $B_1$ and $B_2$, so we must
test their statistical reliability. In order to apply the standard tests of
significance we must, among other things, know the mean and variance of the estimates.

Mean of $b_1$: $E(b_1) = B_1$

Variance of $b_1$: $\operatorname{var}(b_1) = \sigma^2 \dfrac{\sum X^2}{n \sum x^2}$

Mean of $b_2$: $E(b_2) = B_2$

Variance of $b_2$: $\operatorname{var}(b_2) = \sigma^2 \dfrac{1}{\sum x^2}$




where $\sigma^2$ denotes the variance of $u$ and is estimated as:

$$\hat{\sigma}^2 = \frac{\sum e^2}{n - 2}$$



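The significance of the individual estimated coefficients can be tested using
Student's t test. The test statistic is calculated as:

$$t = \frac{b_i - B_i^*}{\sqrt{\operatorname{var}(b_i)}}$$

where $B_i^*$ is the value of $B_i$ under the null hypothesis.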


In the case of our equation we have:

$$\hat{\sigma}^2 = \frac{\sum e^2}{n - 2} = \frac{387}{12 - 2} = 38.7$$



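$$\operatorname{var}(b_1) = \hat{\sigma}^2 \frac{\sum X^2}{n \sum x^2} = 38.7 \cdot \frac{1020}{(12)(48)} = 68.53$$

$$\operatorname{var}(b_2) = \hat{\sigma}^2 \frac{1}{\sum x^2} = 38.7 \cdot \frac{1}{48} = 0.806$$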
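The variance calculations can be reproduced as follows. This is a sketch, with the residual sum of squares (387) taken from Table 1 rather than recomputed. The square roots of the two variances should match the Std. Err. column of the Stata output (8.27836 and .8979142).

```python
import math

# Prices from Table 1; rss is the sum of squared residuals from the same table.
X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]
rss = 387.0

n = len(X)
X_bar = sum(X) / n
sum_X2 = sum(Xi ** 2 for Xi in X)            # raw sum of squares of X
sum_x2 = sum((Xi - X_bar) ** 2 for Xi in X)  # sum of squared deviations

sigma2_hat = rss / (n - 2)                    # estimated error variance
var_b1 = sigma2_hat * sum_X2 / (n * sum_x2)   # variance of the intercept
var_b2 = sigma2_hat / sum_x2                  # variance of the slope

se_b1 = math.sqrt(var_b1)
se_b2 = math.sqrt(var_b2)

print(f"sum X^2 = {sum_X2}, sigma^2 hat = {sigma2_hat:.1f}")
print(f"var(b1) = {var_b1:.2f}, se(b1) = {se_b1:.5f}")
print(f"var(b2) = {var_b2:.3f}, se(b2) = {se_b2:.7f}")
```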


A standard test is to test whether each estimated coefficient is significantly
different from zero.
In the case of the intercept term, the hypotheses are therefore

$H_0: B_1 = 0$
$H_1: B_1 \neq 0$

The test statistic is calculated as:

$$t = \frac{33.75 - 0}{\sqrt{68.53}} = 4.08$$


Compare $t$ with the critical value of $t$ obtained from the Student's t tables with
$n - 2$ degrees of freedom.

For a two-tailed test and 10 degrees of freedom,
the 5% critical value is $t^*_{0.025} = 2.228$.

$t > t^*$, i.e. $4.08 > 2.228$, so we reject the null in favour of the
alternative. The intercept is therefore significant.

Repeating the exercise for $b_2$ we obtain

$$t = \frac{3.25 - 0}{\sqrt{0.806}} = 3.62$$

$t^*_{0.025} = 2.228 < 3.62$, so we conclude that the slope coefficient is significantly
different from zero.
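The two t tests can be sketched in Python as below. The critical value 2.228 is the table value quoted in the text and is hard-coded rather than computed. The helper `t_stat` is a hypothetical name introduced for illustration.

```python
import math

T_CRIT = 2.228  # two-tailed 5% critical value, 10 degrees of freedom (from tables)

def t_stat(estimate, variance, null_value=0.0):
    """t statistic for H0: coefficient equals null_value."""
    return (estimate - null_value) / math.sqrt(variance)

# Estimates and (unrounded) variances from the worked example.
t_b1 = t_stat(33.75, 68.53125)
t_b2 = t_stat(3.25, 0.80625)

for name, t in (("b1", t_b1), ("b2", t_b2)):
    decision = "reject H0" if abs(t) > T_CRIT else "do not reject H0"
    print(f"{name}: t = {t:.2f} -> {decision}")
```

The comparison uses the absolute value of t because the alternative hypothesis is two-sided.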



Confidence intervals around the estimated coefficients

i.e. $b_2 \pm t^*_{0.025} \sqrt{\operatorname{var}(b_2)}$

$t^*_{0.025} = 2.228$

$\operatorname{se}(b_2) = \sqrt{\operatorname{var}(b_2)} = 0.898$

so $b_2 \pm (2.228)(0.898) = 3.25 \pm 2.00$

We are therefore 95% confident that the true population parameter $B_2$ lies in
the interval [1.25, 5.25]. Intervals constructed in this way contain $B_2$ in 95% of
repeated samples.
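A short sketch of the interval calculation, applied to both coefficients. The function name `conf_int` is hypothetical, and the critical value is again the table value 2.228. The results should agree with the [95% Conf. Interval] columns of the Stata output up to rounding of the critical value.

```python
import math

T_CRIT = 2.228  # two-tailed 5% critical value, 10 degrees of freedom (from tables)

def conf_int(estimate, variance, t_crit=T_CRIT):
    """Return (lower, upper) bounds: estimate +/- t_crit * standard error."""
    half_width = t_crit * math.sqrt(variance)
    return estimate - half_width, estimate + half_width

ci_b2 = conf_int(3.25, 0.80625)    # slope
ci_b1 = conf_int(33.75, 68.53125)  # intercept

print(f"b2: [{ci_b2[0]:.2f}, {ci_b2[1]:.2f}]")
print(f"b1: [{ci_b1[0]:.2f}, {ci_b1[1]:.2f}]")
```

Neither interval contains zero, which is the same conclusion as the two t tests at the 5% level.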





Joint significance

In addition to the tests of individual coefficient significance, there is a test of the
joint significance of all slope coefficients in the regression. The hypotheses for
this test are:

$H_0: R^2 = 0$
$H_1: R^2 > 0$

The test statistic is calculated as

$$F = \left( \frac{R^2}{1 - R^2} \right) \left( \frac{n - 2}{2 - 1} \right)$$

or

$$F = \left( \frac{0.567}{1 - 0.567} \right) \left( \frac{12 - 2}{2 - 1} \right) = 13.09$$


Compare with the critical value of $F$, $F^*_{(1,10)}$, given in tables.

At the 5% level we have $F^*_{(1,10)} = 4.96$.

$F > F^*_{(1,10)}$, i.e. $13.09 > 4.96$,

so we reject the null hypothesis that $R^2 = 0$.
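The F statistic can be computed either from $R^2$ or directly from the sums of squares, as this sketch shows. With unrounded inputs both routes give about 13.10, the value in the Stata output. The 13.09 in the hand calculation comes from rounding $R^2$ to 0.567. In a bivariate regression the F statistic also equals the square of the slope's t statistic, which the last lines check.

```python
n, k = 12, 2              # observations and estimated coefficients
ess, rss = 507.0, 387.0   # Model and Residual sums of squares (Stata output)
sum_x2 = 48.0             # sum of squared deviations of X (Table 1)
b2 = 3.25                 # estimated slope

r2 = ess / (ess + rss)

f_from_r2 = (r2 / (1 - r2)) * ((n - k) / (k - 1))
f_from_ss = (ess / (k - 1)) / (rss / (n - k))

# Slope t statistic, for comparison with F.
sigma2_hat = rss / (n - k)
t_b2 = b2 / (sigma2_hat / sum_x2) ** 0.5

print(f"F from R^2 = {f_from_r2:.2f}")
print(f"F from SS  = {f_from_ss:.2f}")
print(f"t(b2)^2    = {t_b2 ** 2:.2f}")
```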


Prediction

Suppose for example that the producer wanted to know how much he/she would
sell when the price of the good was 20 per ton. The predicted value may be read
off from the regression as:

$$\hat{Y}_0 = 33.75 + 3.25(20) = 98.75 \text{ tons}$$


However, this value only represents a point estimate drawn from a distribution of
possible predictions, which depends upon the statistical properties of the $b_1$, $b_2$
estimates. In order to see how reliable the forecast is we should therefore calculate
the confidence interval around the prediction. The confidence interval is given by:

$$\hat{Y}_0 \pm t_{0.025} \sqrt{\hat{\sigma}^2 \left[ \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum (X - \bar{X})^2} \right]}$$

From our model we have:

$$98.75 \pm 2.228 \sqrt{(38.7) \left( \frac{1}{12} + \frac{(20 - 9)^2}{48} \right)} = 98.75 \pm 2.228 \times 10.04 = 98.75 \pm 22.37$$

We are therefore 95% confident that expected output at this price lies in the interval
[76.38, 121.12]. Depending upon your requirements, this range may prove too uncertain
for you to attempt raising the price.
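The interval can be reproduced with the sketch below. It first evaluates the bracket with the $1/n$ and $(X_0 - \bar{X})^2 / \sum (X - \bar{X})^2$ terms only, which gives a standard error of about 10.04 and the interval [76.38, 121.12]. As an assumption going beyond the text, it also computes the wider interval for a single realised output, which adds 1 inside the bracket and gives a standard error of about 11.81. All variable names are illustrative.

```python
import math

# Quantities from the worked example.
n, X_bar, sum_x2 = 12, 9.0, 48.0
sigma2_hat = 38.7
b1, b2 = 33.75, 3.25
T_CRIT = 2.228  # two-tailed 5% critical value, 10 degrees of freedom (from tables)

X0 = 20.0
Y0_hat = b1 + b2 * X0

# Bracketed term of the interval formula.
bracket = 1 / n + (X0 - X_bar) ** 2 / sum_x2

# Standard error of the predicted mean output at X0.
se_mean = math.sqrt(sigma2_hat * bracket)
mean_interval = (Y0_hat - T_CRIT * se_mean, Y0_hat + T_CRIT * se_mean)

# Assumption beyond the text: standard error for an individual outcome,
# which also includes the variance of the error term itself.
se_indiv = math.sqrt(sigma2_hat * (1 + bracket))
indiv_interval = (Y0_hat - T_CRIT * se_indiv, Y0_hat + T_CRIT * se_indiv)

print(f"Y0_hat = {Y0_hat}")
print(f"mean:       se = {se_mean:.2f}, interval = [{mean_interval[0]:.2f}, {mean_interval[1]:.2f}]")
print(f"individual: se = {se_indiv:.2f}, interval = [{indiv_interval[0]:.2f}, {indiv_interval[1]:.2f}]")
```

Note that a price of 20 lies well outside the sample range of 6 to 12, which is why the $(X_0 - \bar{X})^2$ term dominates and both intervals are wide.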