Chapter FOUR
THE CLASSICAL REGRESSION ANALYSIS
[The Multiple Linear Regression Model]
4.1. Introduction
In simple regression we study the relationship between a dependent variable and a single
explanatory (independent) variable. But it is rarely the case that economic relationships
involve just two variables. Rather, a dependent variable Y can depend on a whole series of
explanatory variables or regressors. For instance, in demand studies we study the relationship
between the quantity demanded of a good and the price of the good, the prices of substitute goods
and the consumer's income. The model we assume is:
Yi = β0 + β1P1 + β2P2 + β3Xi + ui -------------------- (4.1)
where Yi is quantity demanded, P1 is the price of the good, P2 is the price of substitute goods,
Xi is the consumer's income, the β's are unknown parameters and ui is the disturbance.
Equation (4.1) is a multiple regression with three explanatory variables. In general, for k
explanatory variables we can write the model as follows:
Yi = β0 + β1X1i + β2X2i + β3X3i + ......... + βkXki + ui ------- (4.2)
where the Xki (i = 1, 2, 3, ......, n) are explanatory variables, Yi is the dependent variable,
βj (j = 0, 1, 2, ...., k) are unknown parameters and ui is the disturbance term. The
disturbance term is of similar nature to that in simple regression, reflecting:
- the basic random nature of human responses
- errors of aggregation
- errors of measurement
- errors in specification of the mathematical form of the model
i.e. E(ui uj) = 0 for i ≠ j
6. Independence of ui and Xi: Every disturbance term ui is independent of the explanatory variables.
The model:
Y = β0 + β 1 X 1 + β 2 X 2 +U i ……………………………………(4.3)
is multiple regression with two explanatory variables. The expected value of the above model
is called population regression equation i.e.
E(Y )=β 0 + β1 X 1 + β 2 X 2 , Since E(U i )=0 . …………………................(4.4)
where the βi are the population parameters: β0 is referred to as the intercept, and β1 and β2 are
also sometimes known as the regression slopes of the regression. Note that β2, for example,
measures the effect on E(Y) of a unit change in X2 when X1 is held constant.
Since the population regression equation is unknown to any investigator, it has to be estimated
from sample data. Let us suppose that the sample data has been used to estimate the
population regression equation. We leave the method of estimation unspecified for the
present and merely assume that equation (4.4) has been estimated by the sample regression
equation, which we write as:
Y^ = β^ 0 + β^ 1 X 1 + β^ 2 X 2 ……………………………………………….(4.5)
where the β̂j are estimates of the βj and Ŷ is known as the predicted value of Y.
To obtain expressions for the least squares estimators, we partially differentiate Σei² with
respect to β̂0, β̂1 and β̂2 and set the partial derivatives equal to zero.
∂[Σei²]/∂β̂0 = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i) = 0 …………………………(4.8)
∂[Σei²]/∂β̂1 = −2ΣX1i(Yi − β̂0 − β̂1X1i − β̂2X2i) = 0 …………………….. (4.9)
∂[Σei²]/∂β̂2 = −2ΣX2i(Yi − β̂0 − β̂1X1i − β̂2X2i) = 0 ………… ………..(4.10)
Summing from 1 to n, the multiple regression equation produces three Normal Equations:
ΣYi = nβ̂0 + β̂1ΣX1i + β̂2ΣX2i …………………………………….(4.11)
ΣX1iYi = β̂0ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i …………………………(4.12)
ΣX2iYi = β̂0ΣX2i + β̂1ΣX1iX2i + β̂2ΣX2i² ………………………...(4.13)
From (4.11) we obtain β̂0:
β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2 ------------------------------------------------- (4.14)
Substituting (4.14) into (4.12), we get:
ΣX1iYi = (Ȳ − β̂1X̄1 − β̂2X̄2)ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i
⇒ ΣX1iYi − ȲΣX1i = β̂1(ΣX1i² − X̄1ΣX1i) + β̂2(ΣX1iX2i − X̄2ΣX1i)
⇒ ΣX1iYi − nȲX̄1 = β̂1(ΣX1i² − nX̄1²) + β̂2(ΣX1iX2i − nX̄1X̄2) ------- (4.15)
We know that the resulting equations can be written compactly, in deviation form, as the matrix system:
[ Σx1²    Σx1x2 ] [ β̂1 ]   [ Σx1y ]
[ Σx1x2   Σx2²  ] [ β̂2 ] = [ Σx2y ]  …………. (4.20)
If we use Cramer's rule to solve the above system we obtain:
β̂1 = (Σx1y·Σx2² − Σx1x2·Σx2y) / (Σx1²·Σx2² − (Σx1x2)²) …………………………..…………….. (4.21)
β̂2 = (Σx2y·Σx1² − Σx1x2·Σx1y) / (Σx1²·Σx2² − (Σx1x2)²) ………………….……………………… (4.22)
We can also express β̂1 and β̂2 in terms of the covariances and variances of Y, X1 and X2:
β̂1 = [Cov(X1,Y)·Var(X2) − Cov(X1,X2)·Cov(X2,Y)] / [Var(X1)·Var(X2) − [Cov(X1,X2)]²] ---------(4.23)
β̂2 = [Cov(X2,Y)·Var(X1) − Cov(X1,X2)·Cov(X1,Y)] / [Var(X1)·Var(X2) − [Cov(X1,X2)]²] ---------(4.24)
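As a quick numerical check of equations (4.21), (4.22) and (4.14) — a sketch using a small made-up data set, not the text's example — the deviation-sum formulas should agree with a general least squares solver:

```python
# Check (4.21), (4.22), (4.14) against numpy's least-squares solver.
# The data below are illustrative only.
import numpy as np

Y  = np.array([10.0, 12.0, 15.0, 14.0, 18.0, 20.0])
X1 = np.array([ 2.0,  3.0,  5.0,  4.0,  6.0,  8.0])
X2 = np.array([ 1.0,  2.0,  2.0,  3.0,  4.0,  4.0])

# deviations from the means (the lower-case x, y of the text)
y  = Y  - Y.mean()
x1 = X1 - X1.mean()
x2 = X2 - X2.mean()

den   = (x1**2).sum() * (x2**2).sum() - (x1 * x2).sum()**2
b1hat = ((x1*y).sum()*(x2**2).sum() - (x1*x2).sum()*(x2*y).sum()) / den   # (4.21)
b2hat = ((x2*y).sum()*(x1**2).sum() - (x1*x2).sum()*(x1*y).sum()) / den   # (4.22)
b0hat = Y.mean() - b1hat * X1.mean() - b2hat * X2.mean()                  # (4.14)

# cross-check: full OLS with an intercept column gives the same coefficients
X = np.column_stack([np.ones_like(Y), X1, X2])
beta_ls = np.linalg.lstsq(X, Y, rcond=None)[0]
```

The agreement holds exactly (up to floating-point error) because the deviation-form solution is just the solved normal equations with the intercept concentrated out.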
4.3.1. The coefficient of determination (R²): two explanatory variables case
In the simple regression model, we introduced R² as a measure of the proportion of variation
in the dependent variable that is explained by variation in the explanatory variable. In
the multiple regression model the same measure is relevant, and the same formulas are valid, but
now we talk of the proportion of variation in the dependent variable explained by all
the explanatory variables included in the model. The coefficient of determination is:
R² = ESS/TSS = 1 − RSS/TSS = 1 − Σei²/Σyi² ------------------------------------- (4.25)
In the present model of two explanatory variables:
Σei² = Σ(yi − β̂1x1i − β̂2x2i)²
     = Σei(yi − β̂1x1i − β̂2x2i)
     = Σeiyi − β̂1Σx1iei − β̂2Σeix2i
     = Σeiyi,   since Σeix1i = Σeix2i = 0
     = Σyi(yi − β̂1x1i − β̂2x2i)
i.e. Σei² = Σy² − β̂1Σx1iyi − β̂2Σx2iyi
∴ R² = ESS/TSS = (β̂1Σx1iyi + β̂2Σx2iyi)/Σy² ----------------------------------(4.27)
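The identity behind (4.27) — that the explained share β̂1Σx1y + β̂2Σx2y over Σy² equals 1 − Σe²/Σy² — can be verified numerically; the sketch below uses simulated data for illustration only:

```python
# Verify that the two R^2 expressions, (4.25) and (4.27), coincide.
import numpy as np

rng = np.random.default_rng(0)
n  = 50
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y  = 1.0 + 2.0 * X1 - 1.5 * X2 + rng.normal(size=n)   # illustrative model

X = np.column_stack([np.ones(n), X1, X2])
b = np.linalg.lstsq(X, Y, rcond=None)[0]
e = Y - X @ b

y  = Y - Y.mean()
x1 = X1 - X1.mean()
x2 = X2 - X2.mean()

R2_explained = (b[1]*(x1*y).sum() + b[2]*(x2*y).sum()) / (y**2).sum()  # (4.27)
R2_residual  = 1.0 - (e**2).sum() / (y**2).sum()                       # (4.25)
```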
As in simple regression, R2 is also viewed as a measure of the prediction ability of the model
over the sample period, or as a measure of how well the estimated regression fits the data.
The value of R² is also equal to the squared sample correlation coefficient between Ŷt and Yt.
Since the sample correlation coefficient measures the linear association between two
variables, a high R² means there is a close association between the values of Yt and their
predicted values Ŷt; a low R² means there is little association between Yt and Ŷt, and the
model does not fit the data well.
4.3.2. Adjusted Coefficient of Determination (R̄²)
One difficulty with R² is that it can be made large by adding more and more variables, even if
the variables added have no economic justification. Algebraically, it is the fact that as the
variables are added the residual sum of squares (RSS) goes down (it can remain unchanged, but
this is rare) and thus R² goes up. If the model contains n−1 variables then R² = 1. The
manipulation of the model just to obtain a high R² is not wise. An alternative measure of
goodness of fit, called the adjusted R² and often symbolized as R̄², is usually reported by
regression programs. It is computed as:
R̄² = 1 − [Σei²/(n−k)] / [Σy²/(n−1)] = 1 − (1 − R²)·(n−1)/(n−k) --------------------------------(4.28)
This measure does not always go up when a variable is added, because of the degrees-of-freedom
term n−k: as the number of variables k increases, RSS goes down, but so does n−k. The net
effect on R̄² depends on the amount by which RSS falls. While solving
one problem, this corrected measure of goodness of fit unfortunately introduces another one.
It loses its interpretation; R̄² is no longer the percent of variation explained. This modified
R̄² is sometimes used, and misused, as a device for selecting the appropriate set of explanatory
variables.
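The two expressions for R̄² in (4.28) are algebraically identical, as a small sketch confirms (n, k and the sums of squares below are illustrative values, not from the text):

```python
# Equivalence of the two forms of adjusted R^2 in (4.28).
n, k = 20, 3            # observations and parameters (incl. intercept); illustrative
rss, tss = 40.0, 100.0  # illustrative residual and total sums of squares

R2             = 1.0 - rss / tss
R2_adj         = 1.0 - (rss / (n - k)) / (tss / (n - 1))   # degrees-of-freedom form
R2_adj_from_R2 = 1.0 - (1.0 - R2) * (n - 1) / (n - k)      # form in terms of R^2
```

Note that the adjustment always pulls R̄² below R² whenever k > 1.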
So far we have discussed the regression models containing one or two explanatory variables.
Let us now generalize the model assuming that it contains k variables. It will be of the form:
Y = β0 + β1X1 + β2X2 + ...... + βkXk + U
There are k+1 parameters to be estimated. The system of normal equations consists of k+1
equations, obtained by setting each partial derivative of Σei² equal to zero:
∂Σei²/∂β̂0 = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i − ...... − β̂kXki) = 0
∂Σei²/∂β̂1 = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i − ...... − β̂kXki)(X1i) = 0
……………………………………………………..
∂Σei²/∂β̂k = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i − ...... − β̂kXki)(Xki) = 0
The general form of the above equations (except the first) may be written as:
∂Σei²/∂β̂j = −2Σ(Yi − β̂0 − β̂1X1i − ...... − β̂kXki)(Xji) = 0,   where j = 1, 2, ...., k
The corresponding normal equations are:
ΣYi      = nβ̂0   + β̂1ΣX1i     + β̂2ΣX2i     + .......... + β̂kΣXki
ΣYiX1i  = β̂0ΣX1i + β̂1ΣX1i²    + β̂2ΣX1iX2i  + .......... + β̂kΣX1iXki
ΣYiX2i  = β̂0ΣX2i + β̂1ΣX1iX2i  + β̂2ΣX2i²    + .......... + β̂kΣX2iXki
   :         :         :            :                        :
ΣYiXki  = β̂0ΣXki + β̂1ΣX1iXki  + β̂2ΣX2iXki  + .......... + β̂kΣXki²
Solving the above normal equations directly results in algebraic complexity. But we can solve them
easily using matrices. Hence in the next section we will discuss the matrix approach to the linear
regression model.
In matrix notation the model is:
[ Y1 ]   [ 1  X11  X21  .......  Xk1 ] [ β0 ]   [ U1 ]
[ Y2 ]   [ 1  X12  X22  .......  Xk2 ] [ β1 ]   [ U2 ]
[ Y3 ] = [ 1  X13  X23  .......  Xk3 ] [ β2 ] + [ U3 ]
[ .  ]   [ .   .    .              . ] [ .  ]   [ .  ]
[ Yn ]   [ 1  X1n  X2n  .......  Xkn ] [ βk ]   [ Un ]
   Y  =                X                  β   +    U
To derive the OLS estimators of β, under the usual (classical) assumptions mentioned earlier,
we define the vectors of estimates and residuals:
β̂ = [β̂0, β̂1, ....., β̂k]'   and   e = [e1, e2, ....., en]'
Thus we can write: Y = Xβ̂ + e   and   e = Y − Xβ̂
We have to minimize:
∑ᵢ₌₁ⁿ ei² = e1² + e2² + e3² + ......... + en² = [e1, e2, ...... en]·[e1, e2, ...... en]'
i.e. Σei² = e'e = (Y − Xβ̂)'(Y − Xβ̂)
Hence β̂ is the vector of required least squares estimators, β̂0, β̂1, β̂2, ....., β̂k, given by
β̂ = (X'X)⁻¹X'Y. Writing C = (X'X)⁻¹X',
⇒ β̂ = CY …………………………………………….(4.33)
which shows that β̂ is a linear function of Y.
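A minimal sketch of the matrix solution follows, assuming simulated data; the normal equations (X'X)β̂ = X'Y are solved directly rather than forming the inverse explicitly, which is numerically preferable:

```python
# OLS in matrix form: solve (X'X) beta = X'Y. Data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(1)
n, kvars = 30, 2
Xvars = rng.normal(size=(n, kvars))
beta_true = np.array([5.0, 1.0, -2.0])                 # illustrative true parameters
X = np.column_stack([np.ones(n), Xvars])               # column of ones for the intercept
Y = X @ beta_true + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)           # equation (4.33)
e = Y - X @ beta_hat                                   # residual vector
```

The residuals satisfy X'e = 0, which is exactly the system of normal equations.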
3. Minimum variance
Before showing that all the OLS estimators are best (possess the minimum variance property), it is
important to derive their variances.
We know that var(β̂) = E[(β̂ − β)(β̂ − β)'], which is the matrix
[ E(β̂1−β1)²            E[(β̂1−β1)(β̂2−β2)]  .......  E[(β̂1−β1)(β̂k−βk)] ]
[ E[(β̂2−β2)(β̂1−β1)]    E(β̂2−β2)²          .......  E[(β̂2−β2)(β̂k−βk)] ]
[        :                     :                           :          ]
[ E[(β̂k−βk)(β̂1−β1)]    E[(β̂k−βk)(β̂2−β2)]  .......  E(β̂k−βk)²         ]

  [ var(β̂1)       cov(β̂1, β̂2)  .......  cov(β̂1, β̂k) ]
= [ cov(β̂2, β̂1)   var(β̂2)      .......  cov(β̂2, β̂k) ]
  [      :              :                     :       ]
  [ cov(β̂k, β̂1)   cov(β̂k, β̂2)  .......  var(β̂k)      ]

The above matrix is symmetric, containing the variances along its main diagonal and the
covariances of the estimators everywhere else. This matrix is, therefore, called the variance-
covariance matrix of the least squares estimators of the regression coefficients. Thus,
var(β̂) = E[(β̂ − β)(β̂ − β)'] ……………………………………………(4.35)
From the OLS solution, β̂ = (X'X)⁻¹X'Y = (X'X)⁻¹X'(Xβ + U) = β + (X'X)⁻¹X'U, so
var(β̂) = E[(X'X)⁻¹X'UU'X(X'X)⁻¹] = (X'X)⁻¹X'E(UU')X(X'X)⁻¹ = σu²(X'X)⁻¹X'X(X'X)⁻¹
(Note: σu², being a scalar, can be moved in front of or behind a matrix, while the identity matrix In can be suppressed.)
Thus we obtain var(β̂) = σu²(X'X)⁻¹ ……………………………………(4.37)
where (X'X) is the matrix
[ n       ΣX1i       .......  ΣXki    ]
[ ΣX1i    ΣX1i²      .......  ΣX1iXki ]
[   :        :                   :    ]
[ ΣXki    ΣX1iXki    .......  ΣXki²   ]
We can, therefore, obtain the variance of any estimator, say β̂1, by taking the corresponding
term from the principal diagonal of (X'X)⁻¹ and then multiplying it by σu².
The above holds when the X's are in their absolute form. When the x's are in deviation form,
we can write the multiple regression in matrix form as:
β̂ = (x'x)⁻¹x'y
where β̂ = [β̂1, β̂2, ....., β̂k]' and
(x'x) = [ Σx1²     Σx1x2    .......  Σx1xk ]
        [ Σx2x1    Σx2²     .......  Σx2xk ]
        [   :         :                 :  ]
        [ Σxkx1    Σxkx2    .......  Σxk²  ]
The above column vector β̂ does not include the constant term β̂0. Under such conditions the
variances of the slope parameters in deviation form can be written as:
var(β̂) = σu²(x'x)⁻¹ …………………………………………………….(4.38)
(The proof is the same as for (4.37) above.) In general we can illustrate the variances of the
parameters by taking the case of two explanatory variables. The multiple regression in
deviation form with two explanatory variables is:
ŷi = β̂1x1i + β̂2x2i
In this model:
(β̂ − β) = [ β̂1−β1 ]     and     (β̂ − β)' = [ (β̂1−β1)   (β̂2−β2) ]
          [ β̂2−β2 ]
∴ E[(β̂ − β)(β̂ − β)'] = [ E(β̂1−β1)²             E[(β̂1−β1)(β̂2−β2)] ]
                        [ E[(β̂1−β1)(β̂2−β2)]     E(β̂2−β2)²         ]
                      = [ var(β̂1)        cov(β̂1, β̂2) ]
                        [ cov(β̂1, β̂2)    var(β̂2)     ]
In the case of two explanatory variables, x in deviation form is:
x = [ x11  x21 ]        x' = [ x11  x12  .......  x1n ]
    [ x12  x22 ]             [ x21  x22  .......  x2n ]
    [  :    :  ]
    [ x1n  x2n ]
∴ σu²(x'x)⁻¹ = σu² [ Σx1²    Σx1x2 ]⁻¹
                   [ Σx1x2   Σx2²  ]
             = σu² / (Σx1²Σx2² − (Σx1x2)²) · [  Σx2²    −Σx1x2 ]
                                             [ −Σx1x2    Σx1²  ]
where Σx1²Σx2² − (Σx1x2)² is the determinant of (x'x).
i.e., var(β̂1) = σu²Σx2² / [Σx1²Σx2² − (Σx1x2)²] …………………………………… (4.39)
and var(β̂2) = σu²Σx1² / [Σx1²Σx2² − (Σx1x2)²] ………………. …….…….(4.40)
cov(β̂1, β̂2) = −σu²Σx1x2 / [Σx1²Σx2² − (Σx1x2)²] …………………………………….(4.41)
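Formulas (4.39)–(4.41) are just the elements of σu²(x'x)⁻¹ written out for the 2×2 case, which the following sketch confirms (σu² and the data are illustrative values):

```python
# Check (4.39)-(4.41) against the matrix form sigma_u^2 (x'x)^(-1).
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.normal(size=40); x1 -= x1.mean()   # regressors already in deviation form
x2 = rng.normal(size=40); x2 -= x2.mean()
sigma2 = 2.5                                # illustrative disturbance variance

xTx = np.array([[(x1**2).sum(), (x1*x2).sum()],
                [(x1*x2).sum(), (x2**2).sum()]])
V = sigma2 * np.linalg.inv(xTx)             # matrix form (4.38)

den    = (x1**2).sum()*(x2**2).sum() - (x1*x2).sum()**2
var_b1 =  sigma2 * (x2**2).sum() / den      # (4.39)
var_b2 =  sigma2 * (x1**2).sum() / den      # (4.40)
cov_b  = -sigma2 * (x1*x2).sum() / den      # (4.41)
```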
The only unknown part in the variances and covariances of the estimators is σu².
As we have seen in the simple regression model, σ̂² = Σei²/(n−2). For k parameters (including
the constant term) the estimator is σ̂² = Σei²/(n−k). In the above model we have three
parameters including the constant term, and so
σ̂² = Σei²/(n−3)
Minimum variance of β̂
To show that all the β̂i's in the vector β̂ are Best Estimators, we have also to prove that the
variances obtained in (4.37) are the smallest amongst all other possible linear unbiased
estimators. We follow the same procedure as in the case of the single explanatory variable
model: we first assume an alternative linear unbiased estimator and then establish that its
variance is greater than that of the OLS estimator.
Assume that β̃ is an alternative unbiased and linear estimator of β. Suppose that
β̃ = [(X'X)⁻¹X' + B]Y
where B is a matrix of constants (unbiasedness requires BX = 0). Then
var(β̃) = E[(β̃ − β)(β̃ − β)']
= E[{[(X'X)⁻¹X' + B]Y − β}{[(X'X)⁻¹X' + B]Y − β}']
= E[{[(X'X)⁻¹X' + B](Xβ + U) − β}{[(X'X)⁻¹X' + B](Xβ + U) − β}']
= E[{(X'X)⁻¹X'Xβ + (X'X)⁻¹X'U + BXβ + BU − β}{(X'X)⁻¹X'Xβ + (X'X)⁻¹X'U + BXβ + BU − β}']
= E[{(X'X)⁻¹X'U + BU}{(X'X)⁻¹X'U + BU}']      (∵ BX = 0)
= E[{(X'X)⁻¹X'U + BU}{U'X(X'X)⁻¹ + U'B'}]
= σu²[(X'X)⁻¹X'X(X'X)⁻¹ + BX(X'X)⁻¹ + (X'X)⁻¹X'B' + BB']
var(β̃) = σu²(X'X)⁻¹ + σu²BB' ……………………………………….(4.45)
Or, in other words, var(β̃) is greater than var(β̂) by the expression σu²BB'; since BB' is a
positive semidefinite matrix, this proves that the OLS estimators are best.
The total variation in Y can be decomposed as follows:
SST = Σyi² = Σ(Yi − Ȳ)² = total sum of squares
SSE = Σŷi² = Σ(Ŷi − Ȳ)² = explained sum of squares
SSR = Σei² = Σ(Yi − Ŷi)² = unexplained (residual) sum of squares
with degrees of freedom n−1, k−1 and n−k respectively. The corresponding mean sums of squares
are MSE = SSE/(k−1) and MSR = SSR/(n−k), and the statistic for testing the overall significance
of the regression is
Fcal = MSE/MSR ~ Fα(k−1, n−k)
In simple linear regression, the higher the R², the better the model is determined by the
explanatory variable in the model. In multiple linear regression, however, every time we
insert an additional explanatory variable into the model, the R² increases irrespective of the
improvement in the goodness of fit of the model. That means a high R² may not imply that the
model is good.
In multiple linear regression, therefore, we better interpret the adjusted R̄² than the ordinary
or unadjusted R² = Σŷ²/Σy². We have seen that the value of R² is always between zero and one,
but the adjusted R̄² can lie outside this range and can even be negative.
In the case of simple linear regression, R² = 1 − Σe²/Σy² is the square of the linear correlation
coefficient. Again, as the correlation coefficient lies between −1 and +1, the coefficient of
determination lies between 0 and 1. The R² of multiple linear regression also lies between 0 and
+1. The adjusted R̄², however, can sometimes be negative when the goodness of fit is poor. When
the adjusted R̄² value is negative, we consider it as zero and interpret it as no variation of the
dependent variable being explained by the regressors.
The coefficient of determination (R²) can be derived in matrix form as follows.
We know that Σei² = e'e = Y'Y − 2β̂'X'Y + β̂'X'Xβ̂. Since (X'X)β̂ = X'Y and ΣYi² = Y'Y,
∴ e'e = Y'Y − 2β̂'X'Y + β̂'X'Y = Y'Y − β̂'X'Y
We know that yi = Yi − Ȳ, so
Σyi² = ΣYi² − (1/n)(ΣYi)²
In matrix notation:
Σyi² = Y'Y − (1/n)(ΣYi)² ………………………………………………(4.48)
Equation (4.48) gives the total sum of squares variation in the model.
Explained sum of squares = Σyi² − Σei²
= Y'Y − (1/n)(ΣYi)² − e'e
= β̂'X'Y − (1/n)(ΣYi)² ……………………….(4.49)
Since R² = Explained sum of squares / Total sum of squares,
∴ R² = [β̂'X'Y − (1/n)(ΣYi)²] / [Y'Y − (1/n)(ΣYi)²] = [β̂'X'Y − nȲ²] / [Y'Y − nȲ²] ……………………(4.50)
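Equation (4.50) can be checked against the residual-based definition (4.25) on simulated data (illustrative only):

```python
# R^2 in matrix form, (4.50), versus the residual-based form, (4.25).
import numpy as np

rng = np.random.default_rng(3)
n = 25
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([1.0, 0.8, -0.5]) + rng.normal(size=n)   # illustrative model

b = np.linalg.solve(X.T @ X, X.T @ Y)
Ybar = Y.mean()
R2_matrix = (b @ X.T @ Y - n * Ybar**2) / (Y @ Y - n * Ybar**2)   # (4.50)

e = Y - X @ b
R2_check = 1.0 - (e @ e) / ((Y - Ybar) @ (Y - Ybar))              # (4.25)
```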
From the discussion made so far on the multiple regression model, you may, in general,
summarize the results as follows.
Let Y = β̂0 + β̂1X1 + β̂2X2 + ei ………………………………… (4.51)
A.
H 0 : β 1 =0
H 1 : β 1 ≠0
B.
H 0 : β 2 =0
H 1 : β 2 ≠0
The null hypothesis (A) states that, holding X2 constant, X1 has no (linear) influence on Y.
Similarly, hypothesis (B) states that, holding X1 constant, X2 has no influence on the dependent
variable Yi. To test these null hypotheses we will use the following tests:
i. Standard error test: Under this and the following testing methods we test only β̂1; the test
for β̂2 is done in the same way. First compute
SE(β̂1) = √var(β̂1) = √[ σ̂²Σx2i² / (Σx1i²Σx2i² − (Σx1ix2i)²) ],   where σ̂² = Σei²/(n−3)
If SE(β̂1) > ½|β̂1|, we accept the null hypothesis: the estimate β̂1 is not statistically
significant. If SE(β̂1) < ½|β̂1|, we reject the null hypothesis: the estimate β̂1 is statistically
significant.
Note: The smaller the standard errors, the stronger the evidence that the estimates are
statistically reliable.
ii. The student's t-test: We compute the t-ratio for each β̂i:
t* = (β̂i − βi)/SE(β̂i) ~ t(n−k),   where n is the number of observations and k is the number of
parameters.
If we have 3 parameters, the degrees of freedom will be n−3. So, under H0: β2 = 0,
t* = β̂2/SE(β̂2),   with n−3 degrees of freedom.
If |t*| < t (tabulated), we accept the null hypothesis, i.e. we conclude that β̂2 is
not significant and hence the regressor does not appear to contribute to the
explanation of the variations in Y.
If |t*| > t (tabulated), we reject the null hypothesis and accept the alternative one:
β̂2 is statistically significant. Thus, the greater the value of t*, the stronger the
evidence that βi is statistically significant.
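A sketch of the t-ratio computation under H0: βi = 0 follows (simulated data; the standard errors come from the diagonal of σ̂²(X'X)⁻¹ as derived above):

```python
# t-ratios t* = beta_hat / SE(beta_hat) for a three-parameter model.
import numpy as np

rng = np.random.default_rng(4)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([2.0, 1.5, 0.0]) + rng.normal(size=n)   # illustrative model

b = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ b
k = X.shape[1]                          # number of parameters (3 here)
sigma2_hat = (e @ e) / (n - k)          # sigma_hat^2 = sum(e^2)/(n-3)
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))
t_ratios = b / se                       # each compared with t(n-3) critical value
```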
Before turning to the analysis of variance, let us recall the corresponding results for the
simple regression model, Yi = β0 + β1Xi + Ui, whose derivations are summarized below.

The slope estimator is
β̂1 = Σxi(Yi − Ȳ)/Σxi² = ΣxiYi/Σxi² − ȲΣxi/Σxi²,   but Σxi = 0
∴ β̂1 = ΣxiYi/Σxi²
Let Ki = xi/Σxi², so that β̂1 = ΣKiYi. Ki has the following properties:
1. ΣKi = 0. This is because ΣKi = Σxi/Σxi² = 0/Σxi² = 0.
2. ΣKiXi = ΣKixi = 1. This is because, substituting xi = Xi − X̄ into the formula,
ΣKiXi = Σ(Xi − X̄)Xi/Σxi² = (ΣXi² − X̄ΣXi)/Σxi². But ΣXi = nX̄, so
ΣKiXi = (ΣXi² − nX̄²)/Σxi² = Σxi²/Σxi² = 1,   since ΣXi² − nX̄² = Σxi².
3. ΣKi² = 1/Σxi². This is because Ki = xi/Σxi² and Ki² = xi²/(Σxi²)², therefore
ΣKi² = Σxi²/(Σxi²)² = 1/Σxi².

Unbiasedness: β̂1 = β1 + ΣKiUi, so
E(β̂1) = E(β1 + ΣKiUi) = β1 + ΣKiE(Ui). But by the assumption that E(Ui) = 0,
E(β̂1) = β1

Variance: Var(β̂1) = E(β̂1 − β1)². But β̂1 − β1 = ΣKiUi, so
Var(β̂1) = E(ΣKiUi)² = E(ΣKi²Ui² + 2Σ(i≠j)KiKjUiUj) = σu²ΣKi²,
since E(Ui²) = σu² and E(UiUj) = 0 for i ≠ j. But ΣKi² = 1/Σxi²,
∴ Var(β̂1) = σu²/Σxi²

For the intercept, β̂0 = Ȳ − β̂1X̄. But β̂1 = ΣKiYi, so
β̂0 = Ȳ − X̄ΣKiYi = Σ(1/n − X̄Ki)Yi
   = Σ(1/n − X̄Ki)(β0 + β1Xi + Ui)
   = Σβ0/n + β1ΣXi/n + ΣUi/n − X̄β0ΣKi − X̄β1ΣKiXi − X̄ΣKiUi
   = β0 + β1X̄ + Ū − β1X̄ − X̄ΣKiUi      (using ΣKi = 0 and ΣKiXi = 1)
∴ β̂0 = β0 + Σ(1/n − X̄Ki)Ui
Var(β̂0) = E(β̂0 − β0)² = σu²Σ(1/n − X̄Ki)²
   = σu²(Σ1/n² − (2X̄/n)ΣKi + X̄²ΣKi²)
   = σu²(1/n + X̄²/Σxi²)
   = σu²(Σxi² + nX̄²)/(nΣxi²).   But Σxi² + nX̄² = ΣXi²
∴ Var(β̂0) = σu²ΣXi²/(nΣxi²)

The decomposition of the total variation, with the corresponding degrees of freedom and mean
sums of squares, is conventionally arranged in an ANOVA table:

Source of variation   Sum of Squares   Degrees of freedom   Mean Sum of Squares
Regression            SSE = Σŷi²       k−1                  MSE = SSE/(k−1)
Residual              SSR = Σei²       n−k                  MSR = SSR/(n−k)
Total                 SST = Σyi²       n−1

Note that SST = SSE + SSR, i.e. Σyi² = Σŷi² + Σei².
This implies that the total sum of squares is the sum of the explained (regression) sum of
squares and the residual (unexplained) sum of squares. In other words, the total variation in
the dependent variable is the sum of the variation in the dependent variable due to the
variation in the independent variables included in the model and the variation that remained
unexplained by the explanatory variables in the model. Analysis of variance (ANOVA) is the
technique of decomposing the total sum of squares into its components. As we can see here,
the technique decomposes the total variation in the dependent variable into the explained and
the unexplained variations. The degrees of freedom of the total variation are also the sum of
the degrees of freedom of the two components. By dividing the sum of squares by the
corresponding degrees of freedom, we obtain what is called the Mean Sum of Squares
(MSS).
The mean sums of squares due to regression, residual (errors) and total are calculated as the
ratio of each sum of squares to its corresponding degrees of freedom (look at the last column
of the above ANOVA table).
The final column leads to the test statistic, which can be computed as
Fcal = MSE/MSR, and which follows the F distribution with (k−1, n−k) degrees of freedom. Here
σ̂u² = Σei²/(n−k) is the estimate of σu² (in simple regression, σ̂u² = Σei²/(n−2)).
To see why the degrees of freedom enter σ̂u², consider the simple regression model
Yi = β0 + β1Xi + Ui,   and in mean form   Ȳ = β0 + β1X̄ + Ū.
Subtracting the second from the first gives the model in deviation form:
yi = β1xi + (Ui − Ū),   while the fitted value in deviation form is   ŷi = β̂1xi.
The residual is therefore
ei = yi − ŷi = β1xi + (Ui − Ū) − β̂1xi = (Ui − Ū) − (β̂1 − β1)xi
ei² = (Ui − Ū)² + (β̂1 − β1)²xi² − 2(β̂1 − β1)xi(Ui − Ū)
Σei² = Σ(Ui − Ū)² + (β̂1 − β1)²Σxi² − 2(β̂1 − β1)Σxi(Ui − Ū)
Now let us take the expectations one by one.
E[Σ(Ui − Ū)²] = E[ΣUi² − (ΣUi)²/n]
 = ΣE(Ui²) − [ΣE(Ui²) + 2Σ(i≠j)E(UiUj)]/n
 = nσu² − (nσu² + 0)/n
 = nσu² − σu² = (n − 1)σu²
E[(β̂1 − β1)²Σxi²] = Σxi²·E(β̂1 − β1)². But E(β̂1 − β1)² is the variance of β̂1, which is given as
σu²/Σxi², so this term equals σu².
For the last term, note that Σxi(Ui − Ū) = ΣxiUi (since Σxi = 0) and β̂1 − β1 = ΣKiUi, so
−2E[(β̂1 − β1)Σxi(Ui − Ū)] = −2E[(ΣKiUi)(ΣxiUi)] = −2σu²ΣKixi = −2σu².
Collecting the three terms,
E(Σei²) = (n − 1)σu² + σu² − 2σu² = (n − 2)σu²
so that σ̂u² = Σei²/(n − 2) is an unbiased estimator of σu² in simple regression.
For instance, in a demand model where X1 is the price of the good and X2 is the income of the
consumer, hypotheses about the price and income elasticities of demand can be tested this way.
Each estimated coefficient β̂ follows the t distribution with N − K degrees of freedom, where
K = the total number of parameters estimated, so the t-statistic is t = (β̂ − β)/SE(β̂).
Decision: Reject H0 if |tcal| > ttab.
Note: Using similar procedures one can also test linear equality restrictions, for example
β1 + β2 = 1, and other restrictions.
Illustration: The following table shows the value of a particular country's imports (Y), the
level of Gross National Product (X1) measured in arbitrary units, and the price index of
imported goods (X2), over a 12-year period.
Table 1: Data for multiple regression examples
Year  1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971
Y       57   43   73   37   64   48   56   50   39   43   69   60
X1     220  215  250  241  305  258  354  321  370  375  385  385
X2     125  147  118  160  128  149  145  150  140  115  155  152
a) Estimate the coefficients of the economic relationship and fit the model.
To estimate the coefficients of the economic relationship, we compute the entries given in
Table 2
Table 2: Computations of the summary statistics for coefficients for data of Table 1
Year Y X1 X2 x1 x2 y x1² x2² x1y x2y x1x2 y²
1960 57 220 125 -86.5833 -15.3333 3.75 7496.668 235.1101 -324.687 -57.4999 1327.608 14.0625
1961 43 215 147 -91.5833 6.6667 -10.25 8387.501 44.44489 938.7288 -68.3337 -610.558 105.0625
1962 73 250 118 -56.5833 -22.3333 19.75 3201.67 498.7763 -1117.52 -441.083 1263.692 390.0625
1963 37 241 160 -65.5833 19.6667 -16.25 4301.169 386.7791 1065.729 -319.584 -1289.81 264.0625
1964 64 305 128 -1.5833 -12.3333 10.75 2.506839 152.1103 -17.0205 -132.583 19.52731 115.5625
1965 48 258 149 -48.5833 8.6667 -5.25 2360.337 75.11169 255.0623 -45.5002 -421.057 27.5625
1966 56 354 145 47.4167 4.6667 2.75 2248.343 21.77809 130.3959 12.83343 221.2795 7.5625
1967 50 321 150 14.4167 9.6667 -3.25 207.8412 93.44509 -46.8543 -31.4168 139.3619 10.5625
1968 39 370 140 63.4167 -0.3333 -14.25 4021.678 0.111089 -903.688 4.749525 -21.1368 203.0625
1969 43 375 115 68.4167 -25.3333 -10.25 4680.845 641.7761 -701.271 259.6663 -1733.22 105.0625
1970 69 385 155 78.4167 14.6667 15.75 6149.179 215.1121 1235.063 231.0005 1150.114 248.0625
1971 60 385 152 78.4167 11.6667 6.75 6149.179 136.1119 529.3127 78.75022 914.8641 45.5625
Sum 639 3679 1684 0.0004 0.0004 0 49206.92 2500.667 1043.25 -509 960.6667 1536.25
Mean 53.25 306.5833 140.3333 0 0 0
The summary results in deviation form are then given by:
Σx1² = 49206.92,   Σx2² = 2500.667,   Σx1x2 = 960.6667
Σx1y = 1043.25,   Σx2y = −509,   Σy² = 1536.25
The coefficients are then obtained by substituting these sums into equations (4.21), (4.22)
and (4.14).
The fitted model is then written as: Ŷ = 75.40512 + 0.025365X1 − 0.21329X2
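The deviation sums and the fitted coefficients reported above can be reproduced directly from the Table 1 data; the tolerances in the sketch below allow for the text's rounding:

```python
# Reproducing part (a) of the illustration from the raw Table 1 data.
import numpy as np

Y  = np.array([57, 43, 73, 37, 64, 48, 56, 50, 39, 43, 69, 60], dtype=float)
X1 = np.array([220, 215, 250, 241, 305, 258, 354, 321, 370, 375, 385, 385], dtype=float)
X2 = np.array([125, 147, 118, 160, 128, 149, 145, 150, 140, 115, 155, 152], dtype=float)

y, x1, x2 = Y - Y.mean(), X1 - X1.mean(), X2 - X2.mean()   # deviation form

den = (x1**2).sum()*(x2**2).sum() - (x1*x2).sum()**2
b1 = ((x1*y).sum()*(x2**2).sum() - (x1*x2).sum()*(x2*y).sum()) / den   # (4.21)
b2 = ((x2*y).sum()*(x1**2).sum() - (x1*x2).sum()*(x1*y).sum()) / den   # (4.22)
b0 = Y.mean() - b1*X1.mean() - b2*X2.mean()                            # (4.14)
```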
b) Compute the variances and standard errors of the slopes.
First, you need to compute the estimate of the variance of the random term:
σ̂² = Σei²/(n−3),   where
Σei² = Σy² − β̂1Σx1y − β̂2Σx2y = 1536.25 − (0.025365)(1043.25) − (−0.21329)(−509) ≈ 1401.22
so σ̂² = 1401.22/9 ≈ 155.69.
24
MTU CANR, econometrics for 2nd year AgEC
Then, using (4.39) and (4.40):
var(β̂1) = σ̂²Σx2²/[Σx1²Σx2² − (Σx1x2)²] ≈ (155.69)(2500.667)/(1.2213 × 10⁸) ≈ 0.00319,
so SE(β̂1) ≈ 0.0565
var(β̂2) = σ̂²Σx1²/[Σx1²Σx2² − (Σx1x2)²] ≈ (155.69)(49206.92)/(1.2213 × 10⁸) ≈ 0.0627,
so SE(β̂2) ≈ 0.2505
Similarly, the standard error of the intercept is found to be 37.98177. The detail is left for you as
an exercise.
c) Calculate and interpret the coefficient of determination.
We can use the following summary results to obtain the R²:
Σŷ² = β̂1Σx1y + β̂2Σx2y = 135.0262,   Σy² = 1536.25
R² = Σŷ²/Σy² = 135.0262/1536.25 ≈ 0.088
That is, only about 8.8 percent of the variation in imports is explained by variations in gross
national product and the price index of imported goods, so the model fits the data poorly.
d) Compute the adjusted R².
R̄² = 1 − (1 − R²)(n−1)/(n−k) = 1 − (1 − 0.088)(11/9) ≈ −0.115
Since the adjusted R² is negative, we consider it as zero: essentially none of the variation in Y
is explained by the model once the loss of degrees of freedom is accounted for.
e) Construct 95% confidence interval for the true population parameters (partial regression
coefficients).[Exercise: Base your work on Simple Linear Regression]
f) Test the significance of X1 and X2 in determining the changes in Y using t-test.
The hypotheses are summarized in the following table.
The critical value (t0.05, 9) to be used here is 2.262. Like the standard error test, the t-test
reveals that both X1 and X2 are insignificant in determining the change in Y, since the calculated
t values are both less than the critical value.
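The reported standard errors and t-ratios can likewise be reproduced from the raw data (a sketch; the tolerances allow for rounding in the text):

```python
# Standard errors and t-ratios for the two slopes of the illustration.
import numpy as np

Y  = np.array([57, 43, 73, 37, 64, 48, 56, 50, 39, 43, 69, 60], dtype=float)
X1 = np.array([220, 215, 250, 241, 305, 258, 354, 321, 370, 375, 385, 385], dtype=float)
X2 = np.array([125, 147, 118, 160, 128, 149, 145, 150, 140, 115, 155, 152], dtype=float)
n = len(Y)

y, x1, x2 = Y - Y.mean(), X1 - X1.mean(), X2 - X2.mean()
den = (x1**2).sum()*(x2**2).sum() - (x1*x2).sum()**2
b1 = ((x1*y).sum()*(x2**2).sum() - (x1*x2).sum()*(x2*y).sum()) / den
b2 = ((x2*y).sum()*(x1**2).sum() - (x1*x2).sum()*(x1*y).sum()) / den

ess = b1*(x1*y).sum() + b2*(x2*y).sum()    # explained sum of squares
rss = (y**2).sum() - ess                   # residual sum of squares
sigma2_hat = rss / (n - 3)                 # sigma_hat^2

se_b1 = np.sqrt(sigma2_hat * (x2**2).sum() / den)   # (4.39)
se_b2 = np.sqrt(sigma2_hat * (x1**2).sum() / den)   # (4.40)
t1, t2 = b1 / se_b1, b2 / se_b2            # both compared with t(0.05, 9) = 2.262
```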
Exercise: Test the significance of X1 and X2 in determining the changes in Y using the standard
error test.
g) Test the overall significance of the model. (Hint: use α = 0.05)
This involves testing whether at least one of the two variables X1 and X2 determines the changes
in Y. The hypothesis to be tested is given by:
H0: β1 = β2 = 0   against   H1: at least one βj ≠ 0
The ANOVA table for the test is given as follows:
Source of variation   Sum of Squares     Degrees of freedom   Mean Sum of Squares
Regression            SSE = 135.0262     k−1 = 3−1 = 2        MSE = 67.5131
Residual              SSR = 1401.2238    n−k = 12−3 = 9       MSR = 155.6915
Total                 SST = 1536.25      n−1 = 12−1 = 11
Fcal = MSE/MSR = 67.5131/155.6915 ≈ 0.434, while the critical value F0.05(2, 9) = 4.26. Since
Fcal < Ftab, we do not reject H0: the regression is not significant overall, consistent with the
very low R².
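The ANOVA computation can be sketched as follows (the critical value 4.26 is F0.05(2, 9), taken from the F table):

```python
# Overall F-test for the illustration: F = MSE/MSR with (k-1, n-k) = (2, 9) df.
n, k = 12, 3
ess = 135.0262        # explained sum of squares (from the text)
tss = 1536.25         # total sum of squares (from the text)
rss = tss - ess       # residual sum of squares

mse = ess / (k - 1)   # mean square due to regression
msr = rss / (n - k)   # mean square due to residuals
F = mse / msr
F_crit = 4.26         # F(0.05; 2, 9) from the F table
```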
Extensions of Regression Models
As pointed out earlier, non-linearity may be expected in many economic relationships. In other
words, the relationship between Y and X can be non-linear rather than linear. Thus, once the
independent variables have been identified the next step is to choose the functional form of the
relationship between the dependent and the independent variables. Specification of the functional
form is important, because a correct explanatory variable may well appear to be insignificant or
to have an unexpected sign if an inappropriate functional form is used. Thus the choice of a
functional form for an equation is a vital part of the specification of that equation. The choice of
a functional form almost always should be based on an examination of the underlying economic
theory. The logical form of the relationship between the dependent variable and the independent
variable in question should be compared with the properties of various functional forms, and the
one that comes closest to that underlying theory should be chosen for the equation.
Some Commonly Used Functional Forms
a) The Linear Form: It is based on the assumption that the slope of the relationship between each
independent variable and the dependent variable is constant:
Yi = β0 + β1X1i + β2X2i + ...... + βkXki + ui,   so that ∂Y/∂Xi = βi   (i = 1, 2, ..., K)
In this case the elasticity is not constant.
If the hypothesized relationship between Y and X is such that the slope of the relationship can be
expected to be constant and the elasticity can therefore be expected to be variable, then the linear
functional form should be used.
Note: Economic theory frequently predicts only the sign of a relationship and not its functional
form. Under such circumstances, the linear form can be used until strong evidence that it is
inappropriate is found. Thus, unless theory, common sense, or experience justifies using some
other functional form, one should use the linear model.
b) Log-linear, double-log or constant elasticity model
The most common functional form that is non-linear in the variables (but still linear in the
coefficients) is the log-linear form. A log-linear form is often used because the elasticities,
and not the slopes, are constant, i.e. (ΔY/Y)/(ΔX/X) = constant.
Thus, given the assumption of a constant elasticity, the proper form is the exponential (log-
linear) form. Given:
Y = β0·X1^β1·X2^β2·e^ui
the log-linear functional form for the above equation can be obtained by a logarithmic
transformation of the equation:
lnY = lnβ0 + β1lnX1 + β2lnX2 + ui
The model can be estimated by OLS if the basic assumptions are fulfilled.
The model is also called a constant elasticity model because the coefficient of elasticity
between Y and each X (the βi) remains constant.
This functional form is used in the estimation of demand and production functions.
Note: We should make sure that there are no negative or zero observations in the data set before
we decide to use the log-linear model. Thus log-linear models should be run only if all the
variables take on positive values.
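A sketch of estimating a double-log model by OLS after the log transformation follows; the data, the elasticities 0.7 and −0.4, and the noise level are all simulated assumptions for illustration:

```python
# Double-log (constant elasticity) model estimated by OLS on logged data.
import numpy as np

rng = np.random.default_rng(5)
n = 100
X1 = rng.uniform(1.0, 10.0, size=n)          # strictly positive regressors
X2 = rng.uniform(1.0, 10.0, size=n)
Y = 3.0 * X1**0.7 * X2**(-0.4) * np.exp(rng.normal(scale=0.05, size=n))

# the log transformation makes the model linear in the coefficients
Z = np.column_stack([np.ones(n), np.log(X1), np.log(X2)])
b = np.linalg.solve(Z.T @ Z, Z.T @ np.log(Y))
# b[1] and b[2] estimate the constant elasticities (0.7 and -0.4 here)
```

Note the regressors are kept strictly positive, per the caveat above about logs of zero or negative values.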
c) Semi-log Form
The semi-log functional form is a variant of the log-linear equation in which some but not all of
the variables (dependent and independent) are expressed in terms of their logs. Models such as
(i) Yi = β0 + β1lnXi + ui (lin-log model)   and   (ii) lnYi = β0 + β1Xi + ui (log-lin model)
are called semi-log models. The semi-log functional form, in the case of taking the log
of one of the independent variables, can be used to depict a situation in which the impact of X on
Y is expected to 'tail off' as X gets bigger, as long as β1 is greater than zero.
Example: The Engel’s curve tends to flatten out, because as incomes get higher, a smaller
percentage of income goes to consumption and a greater percentage goes to saving.
Consumption thus increases at a decreasing rate.
Growth models are examples of semi-log forms
d) Polynomial Form
Polynomial functional forms express Y as a function of independent variables, some of which are
raised to powers other than one. For example, in a second-degree polynomial (quadratic)
equation, at least one independent variable is squared:
Yi = β0 + β1Xi + β2Xi² + ui
Such models produce slopes that change as the independent variables change. Thus the slope of
Y with respect to X is ∂Y/∂X = β1 + 2β2Xi.
A simple transformation of the polynomial (treating Xi² as an additional regressor) enables us
to use the OLS method to estimate the parameters of the model, provided the basic assumptions,
including ui ~ N(0, σu²), are fulfilled.
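The transformation can be sketched as follows: the squared term is simply constructed as a new regressor and the model estimated by OLS (simulated data for illustration):

```python
# Quadratic form Y = b0 + b1*X + b2*X^2 + u, estimated by OLS on [1, X, X^2].
import numpy as np

rng = np.random.default_rng(6)
n = 80
X = rng.uniform(0.0, 5.0, size=n)
Y = 1.0 + 2.0 * X - 0.5 * X**2 + rng.normal(scale=0.1, size=n)   # illustrative model

Z = np.column_stack([np.ones(n), X, X**2])   # X^2 enters as just another regressor
b = np.linalg.solve(Z.T @ Z, Z.T @ Y)

# the slope dY/dX = b1 + 2*b2*X now changes with X
slope_at_1 = b[1] + 2 * b[2] * 1.0
slope_at_4 = b[1] + 2 * b[2] * 4.0
```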
Qualitative factors can also be captured with dummy variables, as in:
Yi = β0 + β1D1i + β2D2i + ui
An estimated equation of this form, for example, might be:
Ŷi = 26,158.62 − 1,734.47D1i − 3,264.62D2i
e) Reciprocal Form: Yi = β0 + β1(1/Xi) + ui
An asymptote or limit value is set that the dependent variable will take as the value of the X-
variable increases indefinitely; β0 provides that value in the above case. The function
approaches the asymptote from the top or bottom depending on the sign of β1.
Example: the Phillips curve, a non-linear relationship between the rate of unemployment and the
percentage wage change.
a) Estimate and interpret the regression coefficients.
b) Compute the average and marginal productivity of labor and capital in the firms.
c) Compute the standard errors of the estimates.
d) Compute and interpret the coefficient of multiple determination.
e) Calculate the adjusted R2.
f) Test significance of individual coefficients.
g) Test the overall significance of the model.
h) Identify the type of economies of scale (returns to scale) for the firm and advise.
where Y is per capita consumption of Potato in Birr, X1 is real disposable per capita income in
Birr, X2 is the retail price of Potato per Kg, X3 is the retail price of Cabbage per Kg, X4 is the
retail price of Cauliflower per Kg, and ui is a random or error term.