Chapter FOUR
THE CLASSICAL REGRESSION ANALYSIS
[The Multiple Linear Regression Model]
4.1. Introduction
In simple regression we study the relationship between a dependent variable and a single
explanatory (independent) variable. But it is rarely the case that economic relationships
involve just two variables. Rather, a dependent variable Y can depend on a whole series of
explanatory variables or regressors. For instance, in demand studies we study the relationship
between the quantity demanded of a good and the price of the good, the prices of substitute goods
and the consumer's income. The model we assume is:
Yi = β0 + β1P1 + β2P2 + β3Xi + ui -------------------- (4.1)
where Yi is quantity demanded, P1 is the price of the good, P2 is the price of substitute goods,
Xi is the consumer's income, the β's are unknown parameters and ui is the disturbance.
Equation (4.1) is a multiple regression with three explanatory variables. In general, for k
explanatory variables we can write the model as follows:
Yi = β0 + β1X1i + β2X2i + β3X3i + ......... + βkXki + ui ------- (4.2)
where the Xki (i = 1, 2, 3, ......, n) are explanatory variables, Yi is the dependent variable,
βj (j = 0, 1, 2, ...., k) are unknown parameters and ui is the disturbance term. The
disturbance term is of similar nature to that in simple regression, reflecting:
- the basic random nature of human responses
- errors of aggregation
- errors of measurement
- errors in specification of the mathematical form of the model
i.e. E(ui uj) = 0 for i ≠ j
6. Independence of ui and Xi: Every disturbance term ui is independent of the explanatory variables.
The model:
Y = β0 + β 1 X 1 + β 2 X 2 +U i ……………………………………(4.3)
is multiple regression with two explanatory variables. The expected value of the above model
is called population regression equation i.e.
E(Y )=β 0 + β1 X 1 + β 2 X 2 , Since E(U i )=0 . …………………................(4.4)
where the βi are the population parameters: β0 is referred to as the intercept, and β1 and β2 are
also sometimes known as the regression slopes of the regression. Note that β2, for example,
measures the effect on E(Y) of a unit change in X2 when X1 is held constant.
Since the population regression equation is unknown to any investigator, it has to be estimated
from sample data. Let us suppose that the sample data has been used to estimate the
population regression equation. We leave the method of estimation unspecified for the
present and merely assume that equation (4.4) has been estimated by the sample regression
equation, which we write as:
Y^ = β^ 0 + β^ 1 X 1 + β^ 2 X 2 ……………………………………………….(4.5)
where the β̂j are estimates of the βj and Ŷ is known as the predicted value of Y.
To obtain expressions for the least squares estimators, we partially differentiate Σei² with
respect to β̂0, β̂1 and β̂2 and set the partial derivatives equal to zero.
∂[Σei²]/∂β̂0 = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i) = 0 …………………………(4.8)
∂[Σei²]/∂β̂1 = −2ΣX1i(Yi − β̂0 − β̂1X1i − β̂2X2i) = 0 …………………….. (4.9)
∂[Σei²]/∂β̂2 = −2ΣX2i(Yi − β̂0 − β̂1X1i − β̂2X2i) = 0 ………… ………..(4.10)
Summing from 1 to n, the multiple regression equation produces three Normal Equations:
ΣYi = nβ̂0 + β̂1ΣX1i + β̂2ΣX2i …………………………………….(4.11)
ΣX1iYi = β̂0ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i …………………………(4.12)
ΣX2iYi = β̂0ΣX2i + β̂1ΣX1iX2i + β̂2ΣX2i² ………………………...(4.13)
From (4.11) we obtain β̂0:
β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2 ------------------------------------------------- (4.14)
Substituting (4.14) into (4.12), we get:
ΣX1iYi = (Ȳ − β̂1X̄1 − β̂2X̄2)ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i
⇒ ΣX1iYi − ȲΣX1i = β̂1(ΣX1i² − X̄1ΣX1i) + β̂2(ΣX1iX2i − X̄2ΣX1i)
⇒ ΣX1iYi − nȲX̄1 = β̂1(ΣX1i² − nX̄1²) + β̂2(ΣX1iX2i − nX̄1X̄2) ------- (4.15)
We know that the resulting equations can be written compactly, in deviation form, as the matrix system:
[ Σx1²    Σx1x2 ] [ β̂1 ]   [ Σx1y ]
[ Σx1x2   Σx2²  ] [ β̂2 ] = [ Σx2y ]  …………. (4.20)
If we use Cramer's rule to solve the above system we obtain:
β̂1 = (Σx1y·Σx2² − Σx1x2·Σx2y) / (Σx1²·Σx2² − (Σx1x2)²) …………………………..…………….. (4.21)
β̂2 = (Σx2y·Σx1² − Σx1x2·Σx1y) / (Σx1²·Σx2² − (Σx1x2)²) ………………….……………………… (4.22)
We can also express β̂1 and β̂2 in terms of the covariances and variances of Y, X1 and X2:
β̂1 = [Cov(X1,Y)·Var(X2) − Cov(X1,X2)·Cov(X2,Y)] / [Var(X1)·Var(X2) − [Cov(X1,X2)]²] ---------(4.23)
β̂2 = [Cov(X2,Y)·Var(X1) − Cov(X1,X2)·Cov(X1,Y)] / [Var(X1)·Var(X2) − [Cov(X1,X2)]²] ---------(4.24)
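As a quick numerical check of equations (4.21), (4.22) and (4.14) — a sketch using a small made-up data set, not the text's example — the deviation-sum formulas should agree with a general least squares solver:

```python
# Check (4.21), (4.22), (4.14) against numpy's least-squares solver.
# The data below are illustrative only.
import numpy as np

Y  = np.array([10.0, 12.0, 15.0, 14.0, 18.0, 20.0])
X1 = np.array([ 2.0,  3.0,  5.0,  4.0,  6.0,  8.0])
X2 = np.array([ 1.0,  2.0,  2.0,  3.0,  4.0,  4.0])

# deviations from the means (the lower-case x, y of the text)
y  = Y  - Y.mean()
x1 = X1 - X1.mean()
x2 = X2 - X2.mean()

den   = (x1**2).sum() * (x2**2).sum() - (x1 * x2).sum()**2
b1hat = ((x1*y).sum()*(x2**2).sum() - (x1*x2).sum()*(x2*y).sum()) / den   # (4.21)
b2hat = ((x2*y).sum()*(x1**2).sum() - (x1*x2).sum()*(x1*y).sum()) / den   # (4.22)
b0hat = Y.mean() - b1hat * X1.mean() - b2hat * X2.mean()                  # (4.14)

# cross-check: full OLS with an intercept column gives the same coefficients
X = np.column_stack([np.ones_like(Y), X1, X2])
beta_ls = np.linalg.lstsq(X, Y, rcond=None)[0]
```

The agreement holds exactly (up to floating-point error) because the deviation-form solution is just the solved normal equations with the intercept concentrated out.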
4.3.1. The coefficient of determination (R²): two explanatory variables case
In the simple regression model, we introduced R² as a measure of the proportion of variation
in the dependent variable that is explained by variation in the explanatory variable. In
the multiple regression model the same measure is relevant, and the same formulas are valid, but
now we talk of the proportion of variation in the dependent variable explained by all
the explanatory variables included in the model. The coefficient of determination is:
R² = ESS/TSS = 1 − RSS/TSS = 1 − Σei²/Σyi² ------------------------------------- (4.25)
In the present model of two explanatory variables:
Σei² = Σ(yi − β̂1x1i − β̂2x2i)²
     = Σei(yi − β̂1x1i − β̂2x2i)
     = Σeiyi − β̂1Σx1iei − β̂2Σeix2i
     = Σeiyi,   since Σeix1i = Σeix2i = 0
     = Σyi(yi − β̂1x1i − β̂2x2i)
i.e. Σei² = Σy² − β̂1Σx1iyi − β̂2Σx2iyi
∴ R² = ESS/TSS = (β̂1Σx1iyi + β̂2Σx2iyi)/Σy² ----------------------------------(4.27)
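The identity behind (4.27) — that the explained share β̂1Σx1y + β̂2Σx2y over Σy² equals 1 − Σe²/Σy² — can be verified numerically; the sketch below uses simulated data for illustration only:

```python
# Verify that the two R^2 expressions, (4.25) and (4.27), coincide.
import numpy as np

rng = np.random.default_rng(0)
n  = 50
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y  = 1.0 + 2.0 * X1 - 1.5 * X2 + rng.normal(size=n)   # illustrative model

X = np.column_stack([np.ones(n), X1, X2])
b = np.linalg.lstsq(X, Y, rcond=None)[0]
e = Y - X @ b

y  = Y - Y.mean()
x1 = X1 - X1.mean()
x2 = X2 - X2.mean()

R2_explained = (b[1]*(x1*y).sum() + b[2]*(x2*y).sum()) / (y**2).sum()  # (4.27)
R2_residual  = 1.0 - (e**2).sum() / (y**2).sum()                       # (4.25)
```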
As in simple regression, R2 is also viewed as a measure of the prediction ability of the model
over the sample period, or as a measure of how well the estimated regression fits the data.
The value of R² is also equal to the squared sample correlation coefficient between Ŷt and Yt.
Since the sample correlation coefficient measures the linear association between two
variables, a high R² means there is a close association between the values of Yt and their
predicted values Ŷt; a low R² means there is little association between Yt and Ŷt, and the
model does not fit the data well.
4.3.2. Adjusted Coefficient of Determination (R̄²)
One difficulty with R² is that it can be made large by adding more and more variables, even if
the variables added have no economic justification. Algebraically, it is the fact that as the
variables are added the residual sum of squares (RSS) goes down (it can remain unchanged, but
this is rare) and thus R² goes up. If the model contains n−1 variables then R² = 1. The
manipulation of the model just to obtain a high R² is not wise. An alternative measure of
goodness of fit, called the adjusted R² and often symbolized as R̄², is usually reported by
regression programs. It is computed as:
R̄² = 1 − [Σei²/(n−k)] / [Σy²/(n−1)] = 1 − (1 − R²)·(n−1)/(n−k) --------------------------------(4.28)
This measure does not always go up when a variable is added, because of the degrees-of-freedom
term n−k: as the number of variables k increases, RSS goes down, but so does n−k. The net
effect on R̄² depends on the amount by which RSS falls. While solving
one problem, this corrected measure of goodness of fit unfortunately introduces another one.
It loses its interpretation; R̄² is no longer the percent of variation explained. This modified
R̄² is sometimes used, and misused, as a device for selecting the appropriate set of explanatory
variables.
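The two expressions for R̄² in (4.28) are algebraically identical, as a small sketch confirms (n, k and the sums of squares below are illustrative values, not from the text):

```python
# Equivalence of the two forms of adjusted R^2 in (4.28).
n, k = 20, 3            # observations and parameters (incl. intercept); illustrative
rss, tss = 40.0, 100.0  # illustrative residual and total sums of squares

R2             = 1.0 - rss / tss
R2_adj         = 1.0 - (rss / (n - k)) / (tss / (n - 1))   # degrees-of-freedom form
R2_adj_from_R2 = 1.0 - (1.0 - R2) * (n - 1) / (n - k)      # form in terms of R^2
```

Note that the adjustment always pulls R̄² below R² whenever k > 1.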
So far we have discussed the regression models containing one or two explanatory variables.
Let us now generalize the model assuming that it contains k variables. It will be of the form:
Y = β0 + β1X1 + β2X2 + ...... + βkXk + U
There are k+1 parameters to be estimated. The system of normal equations consists of k+1
equations, obtained by setting each partial derivative of Σei² equal to zero:
∂Σei²/∂β̂0 = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i − ...... − β̂kXki) = 0
∂Σei²/∂β̂1 = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i − ...... − β̂kXki)(X1i) = 0
……………………………………………………..
∂Σei²/∂β̂k = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i − ...... − β̂kXki)(Xki) = 0
The general form of the above equations (except the first) may be written as:
∂Σei²/∂β̂j = −2Σ(Yi − β̂0 − β̂1X1i − ...... − β̂kXki)(Xji) = 0,   where j = 1, 2, ...., k
The corresponding normal equations are:
ΣYi      = nβ̂0   + β̂1ΣX1i     + β̂2ΣX2i     + .......... + β̂kΣXki
ΣYiX1i  = β̂0ΣX1i + β̂1ΣX1i²    + β̂2ΣX1iX2i  + .......... + β̂kΣX1iXki
ΣYiX2i  = β̂0ΣX2i + β̂1ΣX1iX2i  + β̂2ΣX2i²    + .......... + β̂kΣX2iXki
   :         :         :            :                        :
ΣYiXki  = β̂0ΣXki + β̂1ΣX1iXki  + β̂2ΣX2iXki  + .......... + β̂kΣXki²
Solving the above normal equations directly results in algebraic complexity. But we can solve them
easily using matrices. Hence in the next section we will discuss the matrix approach to the linear
regression model.
In matrix notation the model is:
[ Y1 ]   [ 1  X11  X21  .......  Xk1 ] [ β0 ]   [ U1 ]
[ Y2 ]   [ 1  X12  X22  .......  Xk2 ] [ β1 ]   [ U2 ]
[ Y3 ] = [ 1  X13  X23  .......  Xk3 ] [ β2 ] + [ U3 ]
[ .  ]   [ .   .    .              . ] [ .  ]   [ .  ]
[ Yn ]   [ 1  X1n  X2n  .......  Xkn ] [ βk ]   [ Un ]
   Y  =                X                  β   +    U
To derive the OLS estimators of β, under the usual (classical) assumptions mentioned earlier,
we define the vectors of estimates and residuals:
β̂ = [β̂0, β̂1, ....., β̂k]'   and   e = [e1, e2, ....., en]'
Thus we can write: Y = Xβ̂ + e   and   e = Y − Xβ̂
We have to minimize:
∑ᵢ₌₁ⁿ ei² = e1² + e2² + e3² + ......... + en² = [e1, e2, ...... en]·[e1, e2, ...... en]'
i.e. Σei² = e'e = (Y − Xβ̂)'(Y − Xβ̂)
Hence β̂ is the vector of required least squares estimators, β̂0, β̂1, β̂2, ....., β̂k, given by
β̂ = (X'X)⁻¹X'Y. Writing C = (X'X)⁻¹X',
⇒ β̂ = CY …………………………………………….(4.33)
which shows that β̂ is a linear function of Y.
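A minimal sketch of the matrix solution follows, assuming simulated data; the normal equations (X'X)β̂ = X'Y are solved directly rather than forming the inverse explicitly, which is numerically preferable:

```python
# OLS in matrix form: solve (X'X) beta = X'Y. Data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(1)
n, kvars = 30, 2
Xvars = rng.normal(size=(n, kvars))
beta_true = np.array([5.0, 1.0, -2.0])                 # illustrative true parameters
X = np.column_stack([np.ones(n), Xvars])               # column of ones for the intercept
Y = X @ beta_true + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)           # equation (4.33)
e = Y - X @ beta_hat                                   # residual vector
```

The residuals satisfy X'e = 0, which is exactly the system of normal equations.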
3. Minimum variance
Before showing that all the OLS estimators are best (possess the minimum variance property), it is
important to derive their variances.
We know that var(β̂) = E[(β̂ − β)(β̂ − β)'], which is the matrix
[ E(β̂1−β1)²            E[(β̂1−β1)(β̂2−β2)]  .......  E[(β̂1−β1)(β̂k−βk)] ]
[ E[(β̂2−β2)(β̂1−β1)]    E(β̂2−β2)²          .......  E[(β̂2−β2)(β̂k−βk)] ]
[        :                     :                           :          ]
[ E[(β̂k−βk)(β̂1−β1)]    E[(β̂k−βk)(β̂2−β2)]  .......  E(β̂k−βk)²         ]

  [ var(β̂1)       cov(β̂1, β̂2)  .......  cov(β̂1, β̂k) ]
= [ cov(β̂2, β̂1)   var(β̂2)      .......  cov(β̂2, β̂k) ]
  [      :              :                     :       ]
  [ cov(β̂k, β̂1)   cov(β̂k, β̂2)  .......  var(β̂k)      ]

The above matrix is symmetric, containing the variances along its main diagonal and the
covariances of the estimators everywhere else. This matrix is, therefore, called the variance-
covariance matrix of the least squares estimators of the regression coefficients. Thus,
var(β̂) = E[(β̂ − β)(β̂ − β)'] ……………………………………………(4.35)
From the OLS solution, β̂ = (X'X)⁻¹X'Y = (X'X)⁻¹X'(Xβ + U) = β + (X'X)⁻¹X'U, so
var(β̂) = E[(X'X)⁻¹X'UU'X(X'X)⁻¹] = (X'X)⁻¹X'E(UU')X(X'X)⁻¹ = σu²(X'X)⁻¹X'X(X'X)⁻¹
(Note: σu², being a scalar, can be moved in front of or behind a matrix, while the identity matrix In can be suppressed.)
Thus we obtain var(β̂) = σu²(X'X)⁻¹ ……………………………………(4.37)
where (X'X) is the matrix
[ n       ΣX1i       .......  ΣXki    ]
[ ΣX1i    ΣX1i²      .......  ΣX1iXki ]
[   :        :                   :    ]
[ ΣXki    ΣX1iXki    .......  ΣXki²   ]
We can, therefore, obtain the variance of any estimator, say β̂1, by taking the corresponding
term from the principal diagonal of (X'X)⁻¹ and then multiplying it by σu².
The above holds when the X's are in their absolute form. When the x's are in deviation form,
we can write the multiple regression in matrix form as:
β̂ = (x'x)⁻¹x'y
where β̂ = [β̂1, β̂2, ....., β̂k]' and
(x'x) = [ Σx1²     Σx1x2    .......  Σx1xk ]
        [ Σx2x1    Σx2²     .......  Σx2xk ]
        [   :         :                 :  ]
        [ Σxkx1    Σxkx2    .......  Σxk²  ]
The above column vector β̂ does not include the constant term β̂0. Under such conditions the
variances of the slope parameters in deviation form can be written as:
var(β̂) = σu²(x'x)⁻¹ …………………………………………………….(4.38)
(The proof is the same as for (4.37) above.) In general we can illustrate the variances of the
parameters by taking the case of two explanatory variables. The multiple regression in
deviation form with two explanatory variables is:
ŷi = β̂1x1i + β̂2x2i
In this model:
(β̂ − β) = [ β̂1−β1 ]     and     (β̂ − β)' = [ (β̂1−β1)   (β̂2−β2) ]
          [ β̂2−β2 ]
∴ E[(β̂ − β)(β̂ − β)'] = [ E(β̂1−β1)²             E[(β̂1−β1)(β̂2−β2)] ]
                        [ E[(β̂1−β1)(β̂2−β2)]     E(β̂2−β2)²         ]
                      = [ var(β̂1)        cov(β̂1, β̂2) ]
                        [ cov(β̂1, β̂2)    var(β̂2)     ]
In the case of two explanatory variables, x in deviation form is:
x = [ x11  x21 ]        x' = [ x11  x12  .......  x1n ]
    [ x12  x22 ]             [ x21  x22  .......  x2n ]
    [  :    :  ]
    [ x1n  x2n ]
∴ σu²(x'x)⁻¹ = σu² [ Σx1²    Σx1x2 ]⁻¹
                   [ Σx1x2   Σx2²  ]
             = σu² / (Σx1²Σx2² − (Σx1x2)²) · [  Σx2²    −Σx1x2 ]
                                             [ −Σx1x2    Σx1²  ]
where Σx1²Σx2² − (Σx1x2)² is the determinant of (x'x).
i.e., var(β̂1) = σu²Σx2² / [Σx1²Σx2² − (Σx1x2)²] …………………………………… (4.39)
and var(β̂2) = σu²Σx1² / [Σx1²Σx2² − (Σx1x2)²] ………………. …….…….(4.40)
cov(β̂1, β̂2) = −σu²Σx1x2 / [Σx1²Σx2² − (Σx1x2)²] …………………………………….(4.41)
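Formulas (4.39)–(4.41) are just the elements of σu²(x'x)⁻¹ written out for the 2×2 case, which the following sketch confirms (σu² and the data are illustrative values):

```python
# Check (4.39)-(4.41) against the matrix form sigma_u^2 (x'x)^(-1).
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.normal(size=40); x1 -= x1.mean()   # regressors already in deviation form
x2 = rng.normal(size=40); x2 -= x2.mean()
sigma2 = 2.5                                # illustrative disturbance variance

xTx = np.array([[(x1**2).sum(), (x1*x2).sum()],
                [(x1*x2).sum(), (x2**2).sum()]])
V = sigma2 * np.linalg.inv(xTx)             # matrix form (4.38)

den    = (x1**2).sum()*(x2**2).sum() - (x1*x2).sum()**2
var_b1 =  sigma2 * (x2**2).sum() / den      # (4.39)
var_b2 =  sigma2 * (x1**2).sum() / den      # (4.40)
cov_b  = -sigma2 * (x1*x2).sum() / den      # (4.41)
```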
The only unknown part in the variances and covariances of the estimators is σu².
As we have seen in the simple regression model, σ̂² = Σei²/(n−2). For k parameters (including
the constant term) the estimator is σ̂² = Σei²/(n−k). In the above model we have three
parameters including the constant term, and so
σ̂² = Σei²/(n−3)
Minimum variance of β̂
To show that all the β̂i's in the vector β̂ are Best Estimators, we have also to prove that the
variances obtained in (4.37) are the smallest amongst all other possible linear unbiased
estimators. We follow the same procedure as in the case of the single explanatory variable
model: we first assume an alternative linear unbiased estimator and then establish that its
variance is greater than that of the OLS estimator.
Assume that β̃ is an alternative unbiased and linear estimator of β. Suppose that
β̃ = [(X'X)⁻¹X' + B]Y
where B is a matrix of constants (unbiasedness requires BX = 0). Then
var(β̃) = E[(β̃ − β)(β̃ − β)']
= E[{[(X'X)⁻¹X' + B]Y − β}{[(X'X)⁻¹X' + B]Y − β}']
= E[{[(X'X)⁻¹X' + B](Xβ + U) − β}{[(X'X)⁻¹X' + B](Xβ + U) − β}']
= E[{(X'X)⁻¹X'Xβ + (X'X)⁻¹X'U + BXβ + BU − β}{(X'X)⁻¹X'Xβ + (X'X)⁻¹X'U + BXβ + BU − β}']
= E[{(X'X)⁻¹X'U + BU}{(X'X)⁻¹X'U + BU}']      (∵ BX = 0)
= E[{(X'X)⁻¹X'U + BU}{U'X(X'X)⁻¹ + U'B'}]
= σu²[(X'X)⁻¹X'X(X'X)⁻¹ + BX(X'X)⁻¹ + (X'X)⁻¹X'B' + BB']
var(β̃) = σu²(X'X)⁻¹ + σu²BB' ……………………………………….(4.45)
Or, in other words, var(β̃) is greater than var(β̂) by the expression σu²BB'; since BB' is a
positive semidefinite matrix, this proves that the OLS estimators are best.
The total variation in Y can be decomposed as follows:
SST = Σyi² = Σ(Yi − Ȳ)² = total sum of squares
SSE = Σŷi² = Σ(Ŷi − Ȳ)² = explained sum of squares
SSR = Σei² = Σ(Yi − Ŷi)² = unexplained (residual) sum of squares
with degrees of freedom n−1, k−1 and n−k respectively. The corresponding mean sums of squares
are MSE = SSE/(k−1) and MSR = SSR/(n−k), and the statistic for testing the overall significance
of the regression is
Fcal = MSE/MSR ~ Fα(k−1, n−k)
In simple linear regression, the higher the R², the better the model is determined by the
explanatory variable in the model. In multiple linear regression, however, every time we
insert an additional explanatory variable into the model, the R² increases irrespective of the
improvement in the goodness of fit of the model. That means a high R² may not imply that the
model is good.
In multiple linear regression, therefore, we better interpret the adjusted R̄² than the ordinary
or unadjusted R² = Σŷ²/Σy². We have seen that the value of R² is always between zero and one,
but the adjusted R̄² can lie outside this range and can even be negative.
In the case of simple linear regression, R² = 1 − Σe²/Σy² is the square of the linear correlation
coefficient. Again, as the correlation coefficient lies between −1 and +1, the coefficient of
determination lies between 0 and 1. The R² of multiple linear regression also lies between 0 and
+1. The adjusted R̄², however, can sometimes be negative when the goodness of fit is poor. When
the adjusted R̄² value is negative, we consider it as zero and interpret it as no variation of the
dependent variable being explained by the regressors.
The coefficient of determination (R²) can be derived in matrix form as follows.
We know that Σei² = e'e = Y'Y − 2β̂'X'Y + β̂'X'Xβ̂. Since (X'X)β̂ = X'Y and ΣYi² = Y'Y,
∴ e'e = Y'Y − 2β̂'X'Y + β̂'X'Y = Y'Y − β̂'X'Y
We know that yi = Yi − Ȳ, so
Σyi² = ΣYi² − (1/n)(ΣYi)²
In matrix notation:
Σyi² = Y'Y − (1/n)(ΣYi)² ………………………………………………(4.48)
Equation (4.48) gives the total sum of squares variation in the model.
Explained sum of squares = Σyi² − Σei²
= Y'Y − (1/n)(ΣYi)² − e'e
= β̂'X'Y − (1/n)(ΣYi)² ……………………….(4.49)
Since R² = Explained sum of squares / Total sum of squares,
∴ R² = [β̂'X'Y − (1/n)(ΣYi)²] / [Y'Y − (1/n)(ΣYi)²] = [β̂'X'Y − nȲ²] / [Y'Y − nȲ²] ……………………(4.50)
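Equation (4.50) can be checked against the residual-based definition (4.25) on simulated data (illustrative only):

```python
# R^2 in matrix form, (4.50), versus the residual-based form, (4.25).
import numpy as np

rng = np.random.default_rng(3)
n = 25
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([1.0, 0.8, -0.5]) + rng.normal(size=n)   # illustrative model

b = np.linalg.solve(X.T @ X, X.T @ Y)
Ybar = Y.mean()
R2_matrix = (b @ X.T @ Y - n * Ybar**2) / (Y @ Y - n * Ybar**2)   # (4.50)

e = Y - X @ b
R2_check = 1.0 - (e @ e) / ((Y - Ybar) @ (Y - Ybar))              # (4.25)
```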
From the discussion made so far on the multiple regression model, you may, in general,
summarize the results as follows.
Let Y = β̂0 + β̂1X1 + β̂2X2 + ei ………………………………… (4.51)
A.
H 0 : β 1 =0
H 1 : β 1 ≠0
B.
H 0 : β 2 =0
H 1 : β 2 ≠0
The null hypothesis (A) states that, holding X2 constant, X1 has no (linear) influence on Y.
Similarly, hypothesis (B) states that, holding X1 constant, X2 has no influence on the dependent
variable Yi. To test these null hypotheses we will use the following tests:
i. Standard error test: Under this and the following testing methods we test only β̂1; the test
for β̂2 is done in the same way. First compute
SE(β̂1) = √var(β̂1) = √[ σ̂²Σx2i² / (Σx1i²Σx2i² − (Σx1ix2i)²) ],   where σ̂² = Σei²/(n−3)
If SE(β̂1) > ½|β̂1|, we accept the null hypothesis: the estimate β̂1 is not statistically
significant. If SE(β̂1) < ½|β̂1|, we reject the null hypothesis: the estimate β̂1 is statistically
significant.
Note: The smaller the standard errors, the stronger the evidence that the estimates are
statistically reliable.
ii. The student's t-test: We compute the t-ratio for each β̂i:
t* = (β̂i − βi)/SE(β̂i) ~ t(n−k),   where n is the number of observations and k is the number of
parameters.
If we have 3 parameters, the degrees of freedom will be n−3. So, under H0: β2 = 0,
t* = β̂2/SE(β̂2),   with n−3 degrees of freedom.
If |t*| < t (tabulated), we accept the null hypothesis, i.e. we conclude that β̂2 is
not significant and hence the regressor does not appear to contribute to the
explanation of the variations in Y.
If |t*| > t (tabulated), we reject the null hypothesis and accept the alternative one:
β̂2 is statistically significant. Thus, the greater the value of t*, the stronger the
evidence that βi is statistically significant.
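A sketch of the t-ratio computation under H0: βi = 0 follows (simulated data; the standard errors come from the diagonal of σ̂²(X'X)⁻¹ as derived above):

```python
# t-ratios t* = beta_hat / SE(beta_hat) for a three-parameter model.
import numpy as np

rng = np.random.default_rng(4)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([2.0, 1.5, 0.0]) + rng.normal(size=n)   # illustrative model

b = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ b
k = X.shape[1]                          # number of parameters (3 here)
sigma2_hat = (e @ e) / (n - k)          # sigma_hat^2 = sum(e^2)/(n-3)
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))
t_ratios = b / se                       # each compared with t(n-3) critical value
```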
Before turning to the analysis of variance, let us recall the corresponding results for the
simple regression model, Yi = β0 + β1Xi + Ui, whose derivations are summarized below.

The slope estimator is
β̂1 = Σxi(Yi − Ȳ)/Σxi² = ΣxiYi/Σxi² − ȲΣxi/Σxi²,   but Σxi = 0
∴ β̂1 = ΣxiYi/Σxi²
Let Ki = xi/Σxi², so that β̂1 = ΣKiYi. Ki has the following properties:
1. ΣKi = 0. This is because ΣKi = Σxi/Σxi² = 0/Σxi² = 0.
2. ΣKiXi = ΣKixi = 1. This is because, substituting xi = Xi − X̄ into the formula,
ΣKiXi = Σ(Xi − X̄)Xi/Σxi² = (ΣXi² − X̄ΣXi)/Σxi². But ΣXi = nX̄, so
ΣKiXi = (ΣXi² − nX̄²)/Σxi² = Σxi²/Σxi² = 1,   since ΣXi² − nX̄² = Σxi².
3. ΣKi² = 1/Σxi². This is because Ki = xi/Σxi² and Ki² = xi²/(Σxi²)², therefore
ΣKi² = Σxi²/(Σxi²)² = 1/Σxi².

Unbiasedness: β̂1 = β1 + ΣKiUi, so
E(β̂1) = E(β1 + ΣKiUi) = β1 + ΣKiE(Ui). But by the assumption that E(Ui) = 0,
E(β̂1) = β1

Variance: Var(β̂1) = E(β̂1 − β1)². But β̂1 − β1 = ΣKiUi, so
Var(β̂1) = E(ΣKiUi)² = E(ΣKi²Ui² + 2Σ(i≠j)KiKjUiUj) = σu²ΣKi²,
since E(Ui²) = σu² and E(UiUj) = 0 for i ≠ j. But ΣKi² = 1/Σxi²,
∴ Var(β̂1) = σu²/Σxi²

For the intercept, β̂0 = Ȳ − β̂1X̄. But β̂1 = ΣKiYi, so
β̂0 = Ȳ − X̄ΣKiYi = Σ(1/n − X̄Ki)Yi
   = Σ(1/n − X̄Ki)(β0 + β1Xi + Ui)
   = Σβ0/n + β1ΣXi/n + ΣUi/n − X̄β0ΣKi − X̄β1ΣKiXi − X̄ΣKiUi
   = β0 + β1X̄ + Ū − β1X̄ − X̄ΣKiUi      (using ΣKi = 0 and ΣKiXi = 1)
∴ β̂0 = β0 + Σ(1/n − X̄Ki)Ui
Var(β̂0) = E(β̂0 − β0)² = σu²Σ(1/n − X̄Ki)²
   = σu²(Σ1/n² − (2X̄/n)ΣKi + X̄²ΣKi²)
   = σu²(1/n + X̄²/Σxi²)
   = σu²(Σxi² + nX̄²)/(nΣxi²).   But Σxi² + nX̄² = ΣXi²
∴ Var(β̂0) = σu²ΣXi²/(nΣxi²)

The decomposition of the total variation, with the corresponding degrees of freedom and mean
sums of squares, is conventionally arranged in an ANOVA table:

Source of variation   Sum of Squares   Degrees of freedom   Mean Sum of Squares
Regression            SSE = Σŷi²       k−1                  MSE = SSE/(k−1)
Residual              SSR = Σei²       n−k                  MSR = SSR/(n−k)
Total                 SST = Σyi²       n−1

Note that SST = SSE + SSR, i.e. Σyi² = Σŷi² + Σei².
This implies that the total sum of squares is the sum of the explained (regression) sum of
squares and the residual (unexplained) sum of squares. In other words, the total variation in
the dependent variable is the sum of the variation in the dependent variable due to the
variation in the independent variables included in the model and the variation that remained
unexplained by the explanatory variables in the model. Analysis of variance (ANOVA) is the
technique of decomposing the total sum of squares into its components. As we can see here,
the technique decomposes the total variation in the dependent variable into the explained and
the unexplained variations. The degrees of freedom of the total variation are also the sum of
the degrees of freedom of the two components. By dividing the sum of squares by the
corresponding degrees of freedom, we obtain what is called the Mean Sum of Squares
(MSS).
The mean sums of squares due to regression, residual (errors) and total are calculated as the
ratio of each sum of squares to its corresponding degrees of freedom (look at the last column
of the above ANOVA table).
The final column leads to the test statistic, which can be computed as
Fcal = MSE/MSR, and which follows the F distribution with (k−1, n−k) degrees of freedom. Here
σ̂u² = Σei²/(n−k) is the estimate of σu² (in simple regression, σ̂u² = Σei²/(n−2)).
To see why the degrees of freedom enter σ̂u², consider the simple regression model
Yi = β0 + β1Xi + Ui,   and in mean form   Ȳ = β0 + β1X̄ + Ū.
Subtracting the second from the first gives the model in deviation form:
yi = β1xi + (Ui − Ū),   while the fitted value in deviation form is   ŷi = β̂1xi.
The residual is therefore
ei = yi − ŷi = β1xi + (Ui − Ū) − β̂1xi = (Ui − Ū) − (β̂1 − β1)xi
ei² = (Ui − Ū)² + (β̂1 − β1)²xi² − 2(β̂1 − β1)xi(Ui − Ū)
Σei² = Σ(Ui − Ū)² + (β̂1 − β1)²Σxi² − 2(β̂1 − β1)Σxi(Ui − Ū)
Now let us take the expectations one by one.
E[Σ(Ui − Ū)²] = E[ΣUi² − (ΣUi)²/n]
 = ΣE(Ui²) − [ΣE(Ui²) + 2Σ(i≠j)E(UiUj)]/n
 = nσu² − (nσu² + 0)/n
 = nσu² − σu² = (n − 1)σu²
E[(β̂1 − β1)²Σxi²] = Σxi²·E(β̂1 − β1)². But E(β̂1 − β1)² is the variance of β̂1, which is given as
σu²/Σxi², so this term equals σu².
For the last term, note that Σxi(Ui − Ū) = ΣxiUi (since Σxi = 0) and β̂1 − β1 = ΣKiUi, so
−2E[(β̂1 − β1)Σxi(Ui − Ū)] = −2E[(ΣKiUi)(ΣxiUi)] = −2σu²ΣKixi = −2σu².
Collecting the three terms,
E(Σei²) = (n − 1)σu² + σu² − 2σu² = (n − 2)σu²
so that σ̂u² = Σei²/(n − 2) is an unbiased estimator of σu² in simple regression.
For instance, in a demand model where X1 is the price of the good and X2 is the income of the
consumer, hypotheses about the price and income elasticities of demand can be tested this way.
Each estimated coefficient β̂ follows the t distribution with N − K degrees of freedom, where
K = the total number of parameters estimated, so the t-statistic is t = (β̂ − β)/SE(β̂).
Decision: Reject H0 if |tcal| > ttab.
Note: Using similar procedures one can also test linear equality restrictions, for example
β1 + β2 = 1, and other restrictions.
Illustration: The following table shows the value of a particular country's imports (Y), the
level of Gross National Product (X1) measured in arbitrary units, and the price index of
imported goods (X2), over a 12-year period.
Table 1: Data for multiple regression examples
Year  1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971
Y       57   43   73   37   64   48   56   50   39   43   69   60
X1     220  215  250  241  305  258  354  321  370  375  385  385
X2     125  147  118  160  128  149  145  150  140  115  155  152
a) Estimate the coefficients of the economic relationship and fit the model.
To estimate the coefficients of the economic relationship, we compute the entries given in
Table 2
Table 2: Computations of the summary statistics for coefficients for data of Table 1
Year Y X1 X2 x1 x2 y x1² x2² x1y x2y x1x2 y²
1960 57 220 125 -86.5833 -15.3333 3.75 7496.668 235.1101 -324.687 -57.4999 1327.608 14.0625
1961 43 215 147 -91.5833 6.6667 -10.25 8387.501 44.44489 938.7288 -68.3337 -610.558 105.0625
1962 73 250 118 -56.5833 -22.3333 19.75 3201.67 498.7763 -1117.52 -441.083 1263.692 390.0625
1963 37 241 160 -65.5833 19.6667 -16.25 4301.169 386.7791 1065.729 -319.584 -1289.81 264.0625
1964 64 305 128 -1.5833 -12.3333 10.75 2.506839 152.1103 -17.0205 -132.583 19.52731 115.5625
1965 48 258 149 -48.5833 8.6667 -5.25 2360.337 75.11169 255.0623 -45.5002 -421.057 27.5625
1966 56 354 145 47.4167 4.6667 2.75 2248.343 21.77809 130.3959 12.83343 221.2795 7.5625
1967 50 321 150 14.4167 9.6667 -3.25 207.8412 93.44509 -46.8543 -31.4168 139.3619 10.5625
1968 39 370 140 63.4167 -0.3333 -14.25 4021.678 0.111089 -903.688 4.749525 -21.1368 203.0625
1969 43 375 115 68.4167 -25.3333 -10.25 4680.845 641.7761 -701.271 259.6663 -1733.22 105.0625
1970 69 385 155 78.4167 14.6667 15.75 6149.179 215.1121 1235.063 231.0005 1150.114 248.0625
1971 60 385 152 78.4167 11.6667 6.75 6149.179 136.1119 529.3127 78.75022 914.8641 45.5625
Sum 639 3679 1684 0.0004 0.0004 0 49206.92 2500.667 1043.25 -509 960.6667 1536.25
Mean 53.25 306.5833 140.3333 0 0 0
The summary results in deviation form are then given by:
Σx1² = 49206.92,   Σx2² = 2500.667,   Σx1x2 = 960.6667
Σx1y = 1043.25,   Σx2y = −509,   Σy² = 1536.25
The coefficients are then obtained by substituting these sums into equations (4.21), (4.22)
and (4.14).
The fitted model is then written as: Ŷ = 75.40512 + 0.025365X1 − 0.21329X2
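The deviation sums and the fitted coefficients reported above can be reproduced directly from the Table 1 data; the tolerances in the sketch below allow for the text's rounding:

```python
# Reproducing part (a) of the illustration from the raw Table 1 data.
import numpy as np

Y  = np.array([57, 43, 73, 37, 64, 48, 56, 50, 39, 43, 69, 60], dtype=float)
X1 = np.array([220, 215, 250, 241, 305, 258, 354, 321, 370, 375, 385, 385], dtype=float)
X2 = np.array([125, 147, 118, 160, 128, 149, 145, 150, 140, 115, 155, 152], dtype=float)

y, x1, x2 = Y - Y.mean(), X1 - X1.mean(), X2 - X2.mean()   # deviation form

den = (x1**2).sum()*(x2**2).sum() - (x1*x2).sum()**2
b1 = ((x1*y).sum()*(x2**2).sum() - (x1*x2).sum()*(x2*y).sum()) / den   # (4.21)
b2 = ((x2*y).sum()*(x1**2).sum() - (x1*x2).sum()*(x1*y).sum()) / den   # (4.22)
b0 = Y.mean() - b1*X1.mean() - b2*X2.mean()                            # (4.14)
```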
b) Compute the variances and standard errors of the slopes.
First, you need to compute the estimate of the variance of the random term:
σ̂² = Σei²/(n−3),   where
Σei² = Σy² − β̂1Σx1y − β̂2Σx2y = 1536.25 − (0.025365)(1043.25) − (−0.21329)(−509) ≈ 1401.22
so σ̂² = 1401.22/9 ≈ 155.69.
24
MTU CANR, econometrics for 2nd year AgEC
Then, using (4.39) and (4.40):
var(β̂1) = σ̂²Σx2²/[Σx1²Σx2² − (Σx1x2)²] ≈ (155.69)(2500.667)/(1.2213 × 10⁸) ≈ 0.00319,
so SE(β̂1) ≈ 0.0565
var(β̂2) = σ̂²Σx1²/[Σx1²Σx2² − (Σx1x2)²] ≈ (155.69)(49206.92)/(1.2213 × 10⁸) ≈ 0.0627,
so SE(β̂2) ≈ 0.2505
Similarly, the standard error of the intercept is found to be 37.98177. The detail is left for you as
an exercise.
c) Calculate and interpret the coefficient of determination.
We can use the following summary results to obtain the R²:
Σŷ² = β̂1Σx1y + β̂2Σx2y = 135.0262,   Σy² = 1536.25
R² = Σŷ²/Σy² = 135.0262/1536.25 ≈ 0.088
That is, only about 8.8 percent of the variation in imports is explained by variations in gross
national product and the price index of imported goods, so the model fits the data poorly.
d) Compute the adjusted R².
R̄² = 1 − (1 − R²)(n−1)/(n−k) = 1 − (1 − 0.088)(11/9) ≈ −0.115
Since the adjusted R² is negative, we consider it as zero: essentially none of the variation in Y
is explained by the model once the loss of degrees of freedom is accounted for.
e) Construct 95% confidence interval for the true population parameters (partial regression
coefficients).[Exercise: Base your work on Simple Linear Regression]
f) Test the significance of X1 and X2 in determining the changes in Y using t-test.
The hypotheses are summarized in the following table.
The critical value (t0.05, 9) to be used here is 2.262. Like the standard error test, the t-test
reveals that both X1 and X2 are insignificant in determining the change in Y, since the calculated
t values are both less than the critical value.
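The reported standard errors and t-ratios can likewise be reproduced from the raw data (a sketch; the tolerances allow for rounding in the text):

```python
# Standard errors and t-ratios for the two slopes of the illustration.
import numpy as np

Y  = np.array([57, 43, 73, 37, 64, 48, 56, 50, 39, 43, 69, 60], dtype=float)
X1 = np.array([220, 215, 250, 241, 305, 258, 354, 321, 370, 375, 385, 385], dtype=float)
X2 = np.array([125, 147, 118, 160, 128, 149, 145, 150, 140, 115, 155, 152], dtype=float)
n = len(Y)

y, x1, x2 = Y - Y.mean(), X1 - X1.mean(), X2 - X2.mean()
den = (x1**2).sum()*(x2**2).sum() - (x1*x2).sum()**2
b1 = ((x1*y).sum()*(x2**2).sum() - (x1*x2).sum()*(x2*y).sum()) / den
b2 = ((x2*y).sum()*(x1**2).sum() - (x1*x2).sum()*(x1*y).sum()) / den

ess = b1*(x1*y).sum() + b2*(x2*y).sum()    # explained sum of squares
rss = (y**2).sum() - ess                   # residual sum of squares
sigma2_hat = rss / (n - 3)                 # sigma_hat^2

se_b1 = np.sqrt(sigma2_hat * (x2**2).sum() / den)   # (4.39)
se_b2 = np.sqrt(sigma2_hat * (x1**2).sum() / den)   # (4.40)
t1, t2 = b1 / se_b1, b2 / se_b2            # both compared with t(0.05, 9) = 2.262
```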
Exercise: Test the significance of X1 and X2 in determining the changes in Y using the standard
error test.
g) Test the overall significance of the model. (Hint: use α = 0.05)
This involves testing whether at least one of the two variables X1 and X2 determines the changes
in Y. The hypothesis to be tested is given by:
H0: β1 = β2 = 0   against   H1: at least one βj ≠ 0
The ANOVA table for the test is given as follows:
Source of variation   Sum of Squares     Degrees of freedom   Mean Sum of Squares
Regression            SSE = 135.0262     k−1 = 3−1 = 2        MSE = 67.5131
Residual              SSR = 1401.2238    n−k = 12−3 = 9       MSR = 155.6915
Total                 SST = 1536.25      n−1 = 12−1 = 11
Fcal = MSE/MSR = 67.5131/155.6915 ≈ 0.434, while the critical value F0.05(2, 9) = 4.26. Since
Fcal < Ftab, we do not reject H0: the regression is not significant overall, consistent with the
very low R².
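The ANOVA computation can be sketched as follows (the critical value 4.26 is F0.05(2, 9), taken from the F table):

```python
# Overall F-test for the illustration: F = MSE/MSR with (k-1, n-k) = (2, 9) df.
n, k = 12, 3
ess = 135.0262        # explained sum of squares (from the text)
tss = 1536.25         # total sum of squares (from the text)
rss = tss - ess       # residual sum of squares

mse = ess / (k - 1)   # mean square due to regression
msr = rss / (n - k)   # mean square due to residuals
F = mse / msr
F_crit = 4.26         # F(0.05; 2, 9) from the F table
```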
Extensions of Regression Models
As pointed out earlier, non-linearity may be expected in many economic relationships. In other
words, the relationship between Y and X can be non-linear rather than linear. Thus, once the
independent variables have been identified the next step is to choose the functional form of the
relationship between the dependent and the independent variables. Specification of the functional
form is important, because a correct explanatory variable may well appear to be insignificant or
to have an unexpected sign if an inappropriate functional form is used. Thus the choice of a
functional form for an equation is a vital part of the specification of that equation. The choice of
a functional form almost always should be based on an examination of the underlying economic
theory. The logical form of the relationship between the dependent variable and the independent
variable in question should be compared with the properties of various functional forms, and the
one that comes closest to that underlying theory should be chosen for the equation.
Some Commonly Used Functional Forms
a) The Linear Form: It is based on the assumption that the slope of the relationship between each
independent variable and the dependent variable is constant:
Yi = β0 + β1X1i + β2X2i + ...... + βkXki + ui,   so that ∂Y/∂Xi = βi   (i = 1, 2, ..., K)
In this case the elasticity is not constant.
If the hypothesized relationship between Y and X is such that the slope of the relationship can be
expected to be constant and the elasticity can therefore be expected to be variable, then the linear
functional form should be used.
Note: Economic theory frequently predicts only the sign of a relationship and not its functional
form. Under such circumstances, the linear form can be used until strong evidence that it is
inappropriate is found. Thus, unless theory, common sense, or experience justifies using some
other functional form, one should use the linear model.
b) Log-linear, double-log or constant elasticity model
The most common functional form that is non-linear in the variables (but still linear in the
coefficients) is the log-linear form. A log-linear form is often used because the elasticities,
and not the slopes, are constant, i.e. (ΔY/Y)/(ΔX/X) = constant.
Thus, given the assumption of a constant elasticity, the proper form is the exponential (log-
linear) form. Given:
Y = β0·X1^β1·X2^β2·e^ui
the log-linear functional form for the above equation can be obtained by a logarithmic
transformation of the equation:
lnY = lnβ0 + β1lnX1 + β2lnX2 + ui
The model can be estimated by OLS if the basic assumptions are fulfilled.
The model is also called a constant elasticity model because the coefficient of elasticity
between Y and each X (the βi) remains constant.
This functional form is used in the estimation of demand and production functions.
Note: We should make sure that there are no negative or zero observations in the data set before
we decide to use the log-linear model. Thus log-linear models should be run only if all the
variables take on positive values.
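A sketch of estimating a double-log model by OLS after the log transformation follows; the data, the elasticities 0.7 and −0.4, and the noise level are all simulated assumptions for illustration:

```python
# Double-log (constant elasticity) model estimated by OLS on logged data.
import numpy as np

rng = np.random.default_rng(5)
n = 100
X1 = rng.uniform(1.0, 10.0, size=n)          # strictly positive regressors
X2 = rng.uniform(1.0, 10.0, size=n)
Y = 3.0 * X1**0.7 * X2**(-0.4) * np.exp(rng.normal(scale=0.05, size=n))

# the log transformation makes the model linear in the coefficients
Z = np.column_stack([np.ones(n), np.log(X1), np.log(X2)])
b = np.linalg.solve(Z.T @ Z, Z.T @ np.log(Y))
# b[1] and b[2] estimate the constant elasticities (0.7 and -0.4 here)
```

Note the regressors are kept strictly positive, per the caveat above about logs of zero or negative values.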
c) Semi-log Form
The semi-log functional form is a variant of the log-linear equation in which some but not all of
the variables (dependent and independent) are expressed in terms of their logs. Models such as
(i) Yi = β0 + β1lnXi + ui (lin-log model)   and   (ii) lnYi = β0 + β1Xi + ui (log-lin model)
are called semi-log models. The semi-log functional form, in the case of taking the log
of one of the independent variables, can be used to depict a situation in which the impact of X on
Y is expected to 'tail off' as X gets bigger, as long as β1 is greater than zero.
Example: The Engel’s curve tends to flatten out, because as incomes get higher, a smaller
percentage of income goes to consumption and a greater percentage goes to saving.
Consumption thus increases at a decreasing rate.
Growth models are examples of semi-log forms
d) Polynomial Form
Polynomial functional forms express Y as a function of independent variables, some of which are
raised to powers other than one. For example, in a second-degree polynomial (quadratic)
equation, at least one independent variable is squared:
Yi = β0 + β1Xi + β2Xi² + ui
Such models produce slopes that change as the independent variables change. Thus the slope of
Y with respect to X is ∂Y/∂X = β1 + 2β2Xi.
A simple transformation of the polynomial (treating Xi² as an additional regressor) enables us
to use the OLS method to estimate the parameters of the model, provided the basic assumptions,
including ui ~ N(0, σu²), are fulfilled.
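The transformation can be sketched as follows: the squared term is simply constructed as a new regressor and the model estimated by OLS (simulated data for illustration):

```python
# Quadratic form Y = b0 + b1*X + b2*X^2 + u, estimated by OLS on [1, X, X^2].
import numpy as np

rng = np.random.default_rng(6)
n = 80
X = rng.uniform(0.0, 5.0, size=n)
Y = 1.0 + 2.0 * X - 0.5 * X**2 + rng.normal(scale=0.1, size=n)   # illustrative model

Z = np.column_stack([np.ones(n), X, X**2])   # X^2 enters as just another regressor
b = np.linalg.solve(Z.T @ Z, Z.T @ Y)

# the slope dY/dX = b1 + 2*b2*X now changes with X
slope_at_1 = b[1] + 2 * b[2] * 1.0
slope_at_4 = b[1] + 2 * b[2] * 4.0
```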
Qualitative factors can also be captured with dummy variables, as in:
Yi = β0 + β1D1i + β2D2i + ui
An estimated equation of this form, for example, might be:
Ŷi = 26,158.62 − 1,734.47D1i − 3,264.62D2i
e) Reciprocal Form: Yi = β0 + β1(1/Xi) + ui
An asymptote or limit value is set that the dependent variable will take as the value of the X-
variable increases indefinitely; β0 provides that value in the above case. The function
approaches the asymptote from the top or bottom depending on the sign of β1.
Example: the Phillips curve, a non-linear relationship between the rate of unemployment and the
percentage wage change.
a) Estimate and interpret the regression coefficients.
b) Compute the average and marginal productivity of labor and capital in the firms.
c) Compute the standard errors of the estimates.
d) Compute and interpret the coefficient of multiple determination.
e) Calculate the adjusted R2.
f) Test significance of individual coefficients.
g) Test the overall significance of the model.
h) Identify the type of economies of scale (returns to scale) for the firm and advise.
where Y is per capita consumption of Potato in Birr, X1 is real disposable per capita income in
Birr, X2 is the retail price of Potato per Kg, X3 is the retail price of Cabbage per Kg, X4 is the
retail price of Cauliflower per Kg, and ui is a random or error term.