
2010

MTU, CANR, Department of Agricultural Economics EC/2018


Chapter Three
THE CLASSICAL REGRESSION ANALYSIS
[The Simple Linear Regression Model]

Economic theories are mainly concerned with the relationships among various economic variables. These relationships, when phrased in mathematical terms, can predict the effect of one variable on another. The functional relationships of these variables define the dependence of one variable upon the other variable(s) in a specific form. The specific functional forms may be linear, quadratic, logarithmic, exponential, hyperbolic, or any other form.

In this chapter we shall consider the simple linear regression model, i.e. a relationship between two variables related in a linear form. We shall first discuss two important forms of relation, stochastic and non-stochastic, of which we shall be using the former in econometric analysis.

3.1. Stochastic and Non-stochastic Relationships


A relationship between X and Y, characterized as Y = f(X) is said to be deterministic or non-
stochastic if for each value of the independent variable (X) there is one and only one
corresponding value of dependent variable (Y). On the other hand, a relationship between X and
Y is said to be stochastic if for a particular value of X there is a whole probabilistic distribution
of values of Y. In such a case, for any given value of X, the dependent variable Y assumes some
specific value only with some probability. Let’s illustrate the distinction between stochastic and
non-stochastic relationships with the help of a supply function.

Assuming that the supply for a certain commodity depends on its price (other determinants taken
to be constant) and the function being linear, the relationship can be put as:
Q = f(P) = α + βP ………………………………………… (2.1)
The above relationship between P and Q is such that for a particular value of P, there is only one
corresponding value of Q. This is, therefore, a deterministic (non-stochastic) relationship since
for each price there is always only one corresponding quantity supplied. This implies that all the
variation in Y is due solely to changes in X, and that there are no other factors affecting the
dependent variable.
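As a small illustration, the deterministic supply schedule (2.1) can be coded directly. The parameter values α = 10 and β = 2 in this sketch are arbitrary choices for illustration, not values from the text.

```python
# A deterministic (non-stochastic) supply function: Q = alpha + beta * P.
# The coefficient values below are purely illustrative.
def quantity_supplied(price, alpha=10.0, beta=2.0):
    """Return the one and only quantity supplied at a given price."""
    return alpha + beta * price

# For each price there is exactly one quantity: repeated evaluation
# at the same price always yields the same answer.
q1 = quantity_supplied(5.0)
q2 = quantity_supplied(5.0)
```

Every price maps to a single quantity, which is exactly what "deterministic" means here.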


If this were true all the points of price-quantity pairs, if plotted on a two-dimensional plane,
would fall on a straight line. However, if we gather observations on the quantity actually
supplied in the market at various prices and we plot them on a diagram we see that they do not
fall on a straight line.

The deviation of the observations from the line may be attributed to several factors:
a. Omission of variables from the function
b. Random behavior of human beings
c. Imperfect specification of the mathematical form of the model
d. Error of aggregation
e. Error of measurement
In order to take into account the above sources of error, we introduce into econometric functions a random variable which is usually denoted by the letter 'u' or 'ε' and is called the error term, random disturbance, or stochastic term of the function, so called because u is supposed to 'disturb' the exact linear relationship which is assumed to exist between X and Y. By introducing this random variable into the function, the model is rendered stochastic, of the form:
Yᵢ = α + βXᵢ + uᵢ ………………………………………… (2.2)
Thus a stochastic model is a model in which the dependent variable is not only determined by the
explanatory variable(s) included in the model but also by others which are not included in the
model.
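A minimal simulation sketch of the stochastic model (2.2) makes the contrast concrete; the values α = 2, β = 0.5 and σ = 1 are illustrative assumptions, not from the text.

```python
import random

random.seed(42)

# Illustrative parameter values (assumed for this sketch): alpha = 2, beta = 0.5.
alpha, beta = 2.0, 0.5

def supply_observation(price, sigma=1.0):
    """One draw of Y = alpha + beta*X + u, with disturbance u ~ N(0, sigma^2)."""
    u = random.gauss(0.0, sigma)
    return alpha + beta * price + u

# At one and the same price, the observed quantity now varies from draw to draw:
draws = [supply_observation(10.0) for _ in range(5)]
```

Unlike the deterministic case, repeated observations at the same X scatter around the line because of the disturbance u.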
3.2. Simple Linear Regression model.
The above stochastic relationship (2.2) with one explanatory variable is called simple linear
regression model.
The true relationship which connects the variables involved is split into two parts: a part represented by a line and a part represented by the random term 'u'.

The scatter of observations represents the true relationship between Y and X. The line
represents the exact part of the relationship and the deviation of the observation from the line
represents the random component of the relationship.

Were it not for the errors in the model, we would observe all the points Y₁′, Y₂′, …, Yₙ′ on the line corresponding to X₁, X₂, …, Xₙ. However, because of the random disturbance, we observe Y₁, Y₂, …, Yₙ corresponding to X₁, X₂, …, Xₙ. These points diverge from the regression line by u₁, u₂, …, uₙ.

Yᵢ = (α + βXᵢ) + uᵢ
(dependent variable) = (regression line) + (random variable)

The first component, the regression line, is the part of Y explained by the changes in X; the second is the part of Y not explained by X, that is to say, the change in Y due to the random influence of uᵢ.

3.2.1 Assumptions of the Classical Linear Stochastic Regression Model.

The classicals made important assumptions in their analysis of regression. The most important of these assumptions are discussed below.

1. The model is linear in parameters.


The classicals assumed that the model should be linear in the parameters, regardless of whether the explanatory and dependent variables are linear or not. This is because if the parameters are non-linear it is difficult to estimate them: their values are not known, and all that is given is the data on the dependent and independent variables.

Example:
1. Y = α + βX + u is linear in both the parameters and the variables, so it satisfies the assumption.
2. ln Y = α + β ln X + u is linear only in the parameters. Since the classicals worry only about the parameters, the model satisfies the assumption.

Check yourself whether the following models satisfy the above assumption and give your answer to your tutor:
a. ln Y² = α + β ln X² + Uᵢ
b. Yᵢ = √α + βXᵢ + Uᵢ

2. Uᵢ is a random real variable

This means that the value which u may assume in any one period depends on chance; it may be
positive, negative or zero. Every value has a certain probability of being assumed by u in any
particular instance.

3. The mean value of the random variable(U) in any particular period is zero

This means that for each value of X, the random variable u may assume various values, some greater than zero and some smaller than zero, but if we considered all the possible positive and negative values of u for any given value of X, they would have an average value equal to zero. In other words, the positive and negative values of u cancel each other.

Mathematically:
E(Uᵢ) = 0 ………………………………………… (2.3)

4. The variance of the random variable (U) is constant in each period (the assumption of homoscedasticity)
For all values of X, the u's will show the same dispersion around their mean. In Fig. 2.c this assumption is denoted by the fact that the values that u can assume lie within the same limits, irrespective of the value of X. For X₁, u can assume any value within the range AB; for X₂, u can assume any value within the range CD, which is equal to AB, and so on.
Mathematically:
Var(Uᵢ) = E[Uᵢ − E(Uᵢ)]² = E(Uᵢ²) = σ² (since E(Uᵢ) = 0). This assumption of constant variance is called the homoscedasticity assumption, and the constant variance itself is called homoscedastic variance.

5. The random variable (U) has a normal distribution
This means the values of u (for each X) have a bell-shaped symmetrical distribution about their zero mean and constant variance σ², i.e.
Uᵢ ~ N(0, σ²) ………………………………………… (2.4)

6. The random terms of different observations (Uᵢ, Uⱼ) are independent (the assumption of no autocorrelation)
This means the value which the random term assumed in one period does not depend on the value which it assumed in any other period.
Algebraically:
Cov(uᵢ, uⱼ) = E{[uᵢ − E(uᵢ)][uⱼ − E(uⱼ)]}
= E(uᵢuⱼ) = 0 ………………………………………… (2.5)

7. The Xᵢ are a set of fixed values in the hypothetical process of repeated sampling which underlies the linear regression model.
This means that, in taking a large number of samples on Y and X, the Xᵢ values are the same in all samples, but the uᵢ values do differ from sample to sample, and so of course do the values of Yᵢ.
8. The random variable (U) is independent of the explanatory variables.
This means there is no correlation between the random variable and the explanatory variable. If two variables are unrelated their covariance is zero. Hence:
Cov(Xᵢ, Uᵢ) = 0 ………………………………………… (2.6)
Proof:
Cov(X, U) = E{[Xᵢ − E(Xᵢ)][Uᵢ − E(Uᵢ)]}
= E[(Xᵢ − E(Xᵢ))Uᵢ], given E(Uᵢ) = 0
= E(XᵢUᵢ) − E(Xᵢ)E(Uᵢ)
= E(XᵢUᵢ)
= XᵢE(Uᵢ), given that the Xᵢ are fixed
= 0
9. The explanatory variables are measured without error.
U absorbs the influence of omitted variables and possibly errors of measurement in the Y's, i.e., we will assume that the regressors are error-free, while the Y values may or may not include errors of measurement.
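The repeated-sampling process the assumptions describe can be mimicked in a small simulation sketch: the X's are held fixed across hypothetical samples, while fresh, independent N(0, σ²) disturbances generate a new Y vector each time. All numerical values here are illustrative assumptions.

```python
import random

random.seed(0)

# Fixed X values (assumption: X is non-stochastic, the same in repeated samples).
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
alpha, beta, sigma = 1.0, 2.0, 1.5   # illustrative "true" parameters

def draw_sample():
    """One hypothetical repeated sample: same X's, fresh normal disturbances."""
    u = [random.gauss(0.0, sigma) for _ in X]        # u_i ~ N(0, sigma^2), independent
    Y = [alpha + beta * x + e for x, e in zip(X, u)]
    return Y

sample1 = draw_sample()
sample2 = draw_sample()   # X unchanged, but the u's (and hence the Y's) differ
```

Across draws the Xᵢ never change; only the disturbances, and therefore the Yᵢ, vary, which is exactly the fixed-X assumption above.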

We can now use the above assumptions to derive the following basic concepts.

A. The dependent variable Yᵢ is normally distributed, i.e.
Yᵢ ~ N(α + βXᵢ, σ²) ………………………………………… (2.7)
Proof:
Mean: E(Yᵢ) = E(α + βXᵢ + uᵢ) = α + βXᵢ, since E(uᵢ) = 0
Variance: Var(Yᵢ) = E(Yᵢ − E(Yᵢ))²
= E(α + βXᵢ + uᵢ − (α + βXᵢ))²
= E(uᵢ)²
= σ² (since E(uᵢ)² = σ²)
∴ Var(Yᵢ) = σ² ………………………………………… (2.8)

The shape of the distribution of Yᵢ is determined by the shape of the distribution of uᵢ, which is normal by the normality assumption. Since α and β are constants, they do not affect the distribution of Yᵢ. Furthermore, the values of the explanatory variable, Xᵢ, are a set of fixed values by the fixed-X assumption and therefore do not affect the shape of the distribution of Yᵢ.
∴ Yᵢ ~ N(α + βXᵢ, σ²)

B. Successive values of the dependent variable are independent, i.e.
Cov(Yᵢ, Yⱼ) = 0
Proof:
Cov(Yᵢ, Yⱼ) = E{[Yᵢ − E(Yᵢ)][Yⱼ − E(Yⱼ)]}
= E{[α + βXᵢ + Uᵢ − E(α + βXᵢ + Uᵢ)][α + βXⱼ + Uⱼ − E(α + βXⱼ + Uⱼ)]}
(since Yᵢ = α + βXᵢ + Uᵢ and Yⱼ = α + βXⱼ + Uⱼ)
= E[(α + βXᵢ + Uᵢ − α − βXᵢ)(α + βXⱼ + Uⱼ − α − βXⱼ)], since E(uᵢ) = 0
= E(UᵢUⱼ) = 0 (from equation (2.5))
Therefore, Cov(Yᵢ, Yⱼ) = 0.

3.2.2 Methods of estimation


Specifying the model and stating its underlying assumptions are the first stage of any
econometric application. The next step is the estimation of the numerical values of the
parameters of economic relationships. The parameters of the simple linear regression model can
be estimated by various methods. Three of the most commonly used methods are:
1. Ordinary least square method (OLS)
2. Maximum likelihood method (MLM)
3. Method of moments (MM)
But, here we will deal with the OLS and the MLM methods of estimation.
3.2.2.1 The ordinary least square (OLS) method

The model Yᵢ = α + βXᵢ + Uᵢ is called the true relationship between Y and X, because Y and X represent their respective population values, and α and β are called the true parameters since they are estimated from the population values of Y and X. But it is difficult to obtain the population values of Y and X because of technical or economic reasons, so we are forced to take sample values of Y and X. The parameters estimated from the sample values of Y and X are called the estimators of the true parameters α and β, and are symbolized as α̂ and β̂.

The model Yᵢ = α̂ + β̂Xᵢ + eᵢ is called the estimated relationship between Y and X, since α̂ and β̂ are estimated from the sample of Y and X, and eᵢ represents the sample counterpart of the population random disturbance Uᵢ.
Estimation of α and β by the least square method (OLS) or classical least square (CLS) involves finding values for the estimates α̂ and β̂ which will minimize the sum of the squared residuals (Σeᵢ²).
From the estimated relationship Yᵢ = α̂ + β̂Xᵢ + eᵢ, we obtain:
eᵢ = Yᵢ − (α̂ + β̂Xᵢ) ………………………………………… (2.6)
Σeᵢ² = Σ(Yᵢ − α̂ − β̂Xᵢ)² ………………………………………… (2.7)
To find the values of α̂ and β̂ that minimize this sum, we have to partially differentiate Σeᵢ² with respect to α̂ and β̂ and set the partial derivatives equal to zero.
1. ∂Σeᵢ²/∂α̂ = −2Σ(Yᵢ − α̂ − β̂Xᵢ) = 0 ………………………………………… (2.8)
Rearranging this expression we get: ΣYᵢ = nα̂ + β̂ΣXᵢ ………… (2.9)
If we divide (2.9) by n and rearrange, we get:
α̂ = Ȳ − β̂X̄ ………………………………………… (2.10)
2. ∂Σeᵢ²/∂β̂ = −2ΣXᵢ(Yᵢ − α̂ − β̂Xᵢ) = 0 ………………………………………… (2.11)
Note at this point that the term in parentheses in equations (2.8) and (2.11) is the residual, eᵢ = Yᵢ − α̂ − β̂Xᵢ. Hence it is possible to rewrite (2.8) and (2.11) as −2Σeᵢ = 0 and −2ΣXᵢeᵢ = 0. It follows that:
Σeᵢ = 0 and ΣXᵢeᵢ = 0 ………………………………………… (2.12)
If we rearrange equation (2.11) we obtain;


ΣYᵢXᵢ = α̂ΣXᵢ + β̂ΣXᵢ² ………………………………………… (2.13)
Equations (2.9) and (2.13) are called the Normal Equations. Substituting the value of α̂ from (2.10) into (2.13), we get:
ΣYᵢXᵢ = ΣXᵢ(Ȳ − β̂X̄) + β̂ΣXᵢ²
= ȲΣXᵢ − β̂X̄ΣXᵢ + β̂ΣXᵢ²
ΣYᵢXᵢ − ȲΣXᵢ = β̂(ΣXᵢ² − X̄ΣXᵢ)
ΣXY − nX̄Ȳ = β̂(ΣXᵢ² − nX̄²)
β̂ = (ΣXY − nX̄Ȳ) / (ΣXᵢ² − nX̄²) ………………………………………… (2.14)

Equation (2.14) can be rewritten in a somewhat different way as follows:
Σ(X − X̄)(Y − Ȳ) = Σ(XY − XȲ − X̄Y + X̄Ȳ)
= ΣXY − ȲΣX − X̄ΣY + nX̄Ȳ
= ΣXY − nȲX̄ − nX̄Ȳ + nX̄Ȳ
Σ(X − X̄)(Y − Ȳ) = ΣXY − nX̄Ȳ ………………………………………… (2.15)
Σ(X − X̄)² = ΣX² − nX̄² ………………………………………… (2.16)
Substituting (2.15) and (2.16) in (2.14), we get:
β̂ = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²

Now, denoting (Xᵢ − X̄) as xᵢ and (Yᵢ − Ȳ) as yᵢ, we get:
β̂ = Σxᵢyᵢ / Σxᵢ² ………………………………………… (2.17)
The expression in (2.17) used to estimate the parameter coefficient is termed the formula in deviation form.
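The deviation-form formula (2.17), together with (2.10) for the intercept, translates directly into code. This is a plain-Python sketch; the test data are chosen to lie exactly on the line Y = 1 + 2X, so the estimates recover the coefficients exactly.

```python
def ols_fit(X, Y):
    """OLS slope and intercept via the deviation-form formulas (2.17) and (2.10):
    beta_hat = sum(x_i * y_i) / sum(x_i^2), alpha_hat = Ybar - beta_hat * Xbar."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    x = [xi - x_bar for xi in X]          # deviations of X from its mean
    y = [yi - y_bar for yi in Y]          # deviations of Y from its mean
    beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

# Data lying exactly on Y = 1 + 2X recover alpha = 1 and beta = 2:
a, b = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])
```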
3.2.2.2 Estimation of a function with zero intercept


Suppose it is desired to fit the line Yᵢ = α + βXᵢ + Uᵢ, subject to the restriction α = 0. To estimate β̂, the problem is put in the form of a restricted minimization problem, and then the Lagrange method is applied.
We minimize: Σeᵢ² = Σᵢ₌₁ⁿ (Yᵢ − α̂ − β̂Xᵢ)²
Subject to: α̂ = 0
The composite function then becomes:
Z = Σ(Yᵢ − α̂ − β̂Xᵢ)² − λα̂, where λ is a Lagrange multiplier.
We minimize the function with respect to α̂, β̂, and λ:
∂Z/∂α̂ = −2Σ(Yᵢ − α̂ − β̂Xᵢ) − λ = 0 ……………… (i)
∂Z/∂β̂ = −2Σ(Yᵢ − α̂ − β̂Xᵢ)(Xᵢ) = 0 ……………… (ii)
∂Z/∂λ = −α̂ = 0 ……………… (iii)
Substituting (iii) in (ii) and rearranging, we obtain:
ΣXᵢ(Yᵢ − β̂Xᵢ) = 0
ΣYᵢXᵢ − β̂ΣXᵢ² = 0
β̂ = ΣXᵢYᵢ / ΣXᵢ² ………………………………………… (2.18)
This formula involves the actual values (observations) of the variables and not their deviation forms, as in the case of the unrestricted value of β̂.
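Formula (2.18) for the regression through the origin is essentially a one-liner; the illustrative data below are generated as Y = 3X exactly.

```python
def ols_through_origin(X, Y):
    """Restricted OLS slope with alpha = 0, formula (2.18):
    beta_hat = sum(X_i * Y_i) / sum(X_i^2) -- actual values, not deviations."""
    return sum(x * y for x, y in zip(X, Y)) / sum(x ** 2 for x in X)

# Data generated as Y = 3X exactly, so the restricted slope is 3:
b = ols_through_origin([1, 2, 3], [3, 6, 9])
```

Note the contrast with (2.17): here the raw Xᵢ and Yᵢ enter the formula, not their deviations from the means.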

3.2.2.3. Statistical Properties of Least Square Estimators


There are various econometric methods with which we may obtain the estimates of the
parameters of economic relationships. We would like to an estimate to be as close as the value of
the true population parameters i.e. to vary within only a small range around the true parameter.

How are we to choose among the different econometric methods, the one that gives ‘good’
estimates? We need some criteria for judging the ‘goodness’ of an estimate.

‘Closeness’ of the estimate to the population parameter is measured by the mean and variance or
standard deviation of the sampling distribution of the estimates of the different econometric
methods. We assume the usual process of repeated sampling i.e. we assume that we get a very

large number of samples each of size ‘n’; we compute the estimates β^ ’s from each sample, and
for each econometric method and we form their distribution. We next compare the mean
(expected value) and the variances of these distributions and we choose among the alternative
estimates the one whose distribution is concentrated as close as possible around the population
parameter.

PROPERTIES OF OLS ESTIMATORS


The ideal or optimum properties that the OLS estimates possess may be summarized by the well-known Gauss-Markov theorem.
Statement of the theorem: “Given the assumptions of the classical linear regression model, the OLS estimators, in the class of linear and unbiased estimators, have the minimum variance, i.e. the OLS estimators are BLUE.”
According to the theorem, under the basic assumptions of the classical linear regression model, the least squares estimators are linear, unbiased and have minimum variance (i.e. are best of all linear unbiased estimators). Sometimes the theorem is referred to as the BLUE theorem, i.e. Best, Linear, Unbiased Estimator. An estimator is called BLUE if it is:
a. Linear: a linear function of a random variable, such as the dependent variable Y.
b. Unbiased: its average or expected value is equal to the true population parameter.
c. Minimum variance: it has minimum variance in the class of linear and unbiased estimators. An unbiased estimator with the least variance is known as an efficient estimator.
According to the Gauss-Markov theorem, the OLS estimators possess all the BLUE properties. The detailed proof of these properties is presented below. Dear colleague, let us prove these properties one by one.

a. Linearity (for β̂)
Proposition: α̂ and β̂ are linear in Y.
Proof: From (2.17), the OLS estimator of β̂ is given by:
β̂ = Σxᵢyᵢ/Σxᵢ² = Σxᵢ(Y − Ȳ)/Σxᵢ² = (ΣxᵢY − ȲΣxᵢ)/Σxᵢ²
(but Σxᵢ = Σ(X − X̄) = ΣX − nX̄ = nX̄ − nX̄ = 0)
⇒ β̂ = ΣxᵢY/Σxᵢ². Now, let Kᵢ = xᵢ/Σxᵢ² (i = 1, 2, …, n)
∴ β̂ = ΣKᵢY ………………………………………… (2.19)
⇒ β̂ = K₁Y₁ + K₂Y₂ + K₃Y₃ + ⋯ + KₙYₙ
∴ β̂ is linear in Y.
Check yourself question:
Show that α̂ is linear in Y. Hint: α̂ = Σ(1/n − X̄kᵢ)Yᵢ. Derive this relationship between α̂ and Y.
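The weights Kᵢ = xᵢ/Σxᵢ² of equation (2.19) can be computed explicitly, which also lets one verify numerically the properties ΣKᵢ = 0 and ΣKᵢXᵢ = 1 that the unbiasedness proof below relies on. The data are illustrative.

```python
def beta_hat_weights(X, Y):
    """Compute beta_hat as the linear combination sum(K_i * Y_i), eq (2.19),
    where K_i = x_i / sum(x_i^2) and x_i = X_i - Xbar."""
    n = len(X)
    x_bar = sum(X) / n
    x = [xi - x_bar for xi in X]
    sxx = sum(xi ** 2 for xi in x)
    K = [xi / sxx for xi in x]
    return sum(k * yi for k, yi in zip(K, Y)), K

beta_hat, K = beta_hat_weights([1, 2, 3, 4], [3, 5, 7, 9])
# The weights satisfy sum(K_i) = 0 and sum(K_i * X_i) = 1:
sum_K = sum(K)
sum_KX = sum(k * xi for k, xi in zip(K, [1, 2, 3, 4]))
```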

b. Unbiasedness
Proposition: α̂ and β̂ are the unbiased estimators of the true parameters α and β.
From your statistics course, you may recall that if θ̂ is an estimator of θ, then E(θ̂) − θ = the amount of bias, and if θ̂ is the unbiased estimator of θ then bias = 0, i.e. E(θ̂) − θ = 0 ⇒ E(θ̂) = θ.
In our case, α̂ and β̂ are estimators of the true parameters α and β. To show that they are unbiased estimators of their respective parameters means to prove that:
E(β̂) = β and E(α̂) = α
Proof (1): Prove that β̂ is unbiased, i.e. E(β̂) = β.

We know that β̂ = ΣkᵢYᵢ = Σkᵢ(α + βXᵢ + Uᵢ)
= αΣkᵢ + βΣkᵢXᵢ + Σkᵢuᵢ,
but Σkᵢ = 0 and ΣkᵢXᵢ = 1, since:
Σkᵢ = Σxᵢ/Σxᵢ² = Σ(X − X̄)/Σxᵢ² = (ΣX − nX̄)/Σxᵢ² = (nX̄ − nX̄)/Σxᵢ² = 0
⇒ Σkᵢ = 0 ………………………………………… (2.20)
ΣkᵢXᵢ = ΣxᵢXᵢ/Σxᵢ² = Σ(X − X̄)Xᵢ/Σxᵢ²
= (ΣX² − X̄ΣX)/(ΣX² − nX̄²) = (ΣX² − nX̄²)/(ΣX² − nX̄²) = 1
⇒ ΣkᵢXᵢ = 1 ………………………………………… (2.21)
β̂ = β + Σkᵢuᵢ ⇒ β̂ − β = Σkᵢuᵢ ………………………………………… (2.22)
E(β̂) = E(β) + ΣkᵢE(uᵢ), since the kᵢ are fixed
E(β̂) = β, since E(uᵢ) = 0
Therefore, β̂ is an unbiased estimator of β.
Proof (2): Prove that α̂ is unbiased, i.e. E(α̂) = α.
From the proof of the linearity property under 2.2.2.3(a), we know that:
α̂ = Σ(1/n − X̄kᵢ)Yᵢ
= Σ[(1/n − X̄kᵢ)(α + βXᵢ + Uᵢ)], since Yᵢ = α + βXᵢ + Uᵢ
= α + β(1/n)ΣXᵢ + (1/n)Σuᵢ − αX̄Σkᵢ − βX̄ΣkᵢXᵢ − X̄Σkᵢuᵢ
= α + (1/n)Σuᵢ − X̄Σkᵢuᵢ
⇒ α̂ − α = (1/n)Σuᵢ − X̄Σkᵢuᵢ = Σ(1/n − X̄kᵢ)uᵢ ………………………………………… (2.23)


Ε( α^ )=α+ 1 n ΣΕ(ui )− X̄ Σk i Ε(ui )


Ε( α^ )=α −−−−−−−−−−−−−−−−−−−−−−−−−−−−−(2 .24 )
∴ α^ is an unbiased estimator of α .
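Unbiasedness can be illustrated by the repeated-sampling experiment the proofs envisage: holding X fixed, draw many samples, re-estimate α̂ and β̂ each time, and average. A Monte Carlo sketch with assumed true values α = 1, β = 2:

```python
import random

random.seed(1)

true_alpha, true_beta, sigma = 1.0, 2.0, 1.0   # illustrative true parameters
X = [1.0, 2.0, 3.0, 4.0, 5.0]                  # fixed in repeated sampling
x_bar = sum(X) / len(X)
x = [xi - x_bar for xi in X]
sxx = sum(xi ** 2 for xi in x)

def one_sample_estimates():
    """Draw one sample with fresh disturbances; return (alpha_hat, beta_hat)."""
    Y = [true_alpha + true_beta * Xi + random.gauss(0.0, sigma) for Xi in X]
    y_bar = sum(Y) / len(Y)
    beta_hat = sum(xi * (yi - y_bar) for xi, yi in zip(x, Y)) / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

estimates = [one_sample_estimates() for _ in range(20000)]
mean_alpha = sum(a for a, _ in estimates) / len(estimates)
mean_beta = sum(b for _, b in estimates) / len(estimates)
```

The averages of the 20,000 estimates settle very close to the true α and β, as the proofs above predict.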

c. Minimum variance of α̂ and β̂
Now we have to establish that, out of the class of linear and unbiased estimators of α and β, α̂ and β̂ possess the smallest sampling variances. For this, we shall first obtain the variances of α̂ and β̂, and then establish that each has the minimum variance in comparison with the variances of other linear and unbiased estimators obtained by any econometric method other than OLS.
a. Variance of β̂
var(β̂) = E(β̂ − E(β̂))² = E(β̂ − β)² ………………………………………… (2.25)
Substituting (2.22) in (2.25), we get:
var(β̂) = E(Σkᵢuᵢ)²
= E[k₁²u₁² + k₂²u₂² + ⋯ + kₙ²uₙ² + 2k₁k₂u₁u₂ + ⋯ + 2kₙ₋₁kₙuₙ₋₁uₙ]
= E[k₁²u₁² + k₂²u₂² + ⋯ + kₙ²uₙ²] + E[2k₁k₂u₁u₂ + ⋯ + 2kₙ₋₁kₙuₙ₋₁uₙ]
= E(Σkᵢ²uᵢ²) + E(Σkᵢkⱼuᵢuⱼ), i ≠ j
= Σkᵢ²E(uᵢ²) + 2ΣkᵢkⱼE(uᵢuⱼ) = σ²Σkᵢ² (since E(uᵢuⱼ) = 0)
Since kᵢ = xᵢ/Σxᵢ², we have Σkᵢ² = Σxᵢ²/(Σxᵢ²)² = 1/Σxᵢ²
∴ var(β̂) = σ²Σkᵢ² = σ²/Σxᵢ² ………………………………………… (2.26)

b. Variance of α̂
var(α̂) = E(α̂ − E(α̂))² = E(α̂ − α)² ………………………………………… (2.27)
Substituting equation (2.23) in (2.27), we get:
var(α̂) = E[Σ(1/n − X̄kᵢ)uᵢ]²
= Σ(1/n − X̄kᵢ)² E(uᵢ)²
= σ²Σ(1/n − X̄kᵢ)²
= σ²Σ(1/n² − (2/n)X̄kᵢ + X̄²kᵢ²)
= σ²(1/n − (2X̄/n)Σkᵢ + X̄²Σkᵢ²)
= σ²(1/n + X̄²Σkᵢ²), since Σkᵢ = 0
= σ²(1/n + X̄²/Σxᵢ²), since Σkᵢ² = Σxᵢ²/(Σxᵢ²)² = 1/Σxᵢ²
Again: 1/n + X̄²/Σxᵢ² = (Σxᵢ² + nX̄²)/(nΣxᵢ²) = ΣX²/(nΣxᵢ²)
∴ var(α̂) = σ²(1/n + X̄²/Σxᵢ²) = σ²(ΣXᵢ²/(nΣxᵢ²)) ………………………………………… (2.28)
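Formula (2.26) can likewise be checked by simulation: the empirical variance of β̂ across repeated samples should approach σ²/Σxᵢ². All numerical values below are illustrative assumptions.

```python
import random

random.seed(7)

alpha, beta, sigma = 1.0, 2.0, 1.0    # illustrative true values
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # fixed regressor values
n = len(X)
x_bar = sum(X) / n
x = [xi - x_bar for xi in X]
sxx = sum(xi ** 2 for xi in x)
theoretical_var = sigma ** 2 / sxx    # formula (2.26): var(beta_hat) = sigma^2 / sum(x_i^2)

def beta_hat_once():
    """One repeated-sampling draw of the OLS slope."""
    Y = [alpha + beta * Xi + random.gauss(0.0, sigma) for Xi in X]
    y_bar = sum(Y) / n
    return sum(xi * (yi - y_bar) for xi, yi in zip(x, Y)) / sxx

betas = [beta_hat_once() for _ in range(20000)]
mean_b = sum(betas) / len(betas)
empirical_var = sum((b - mean_b) ** 2 for b in betas) / len(betas)
```

With 20,000 replications the empirical variance lands within a few percent of the theoretical value σ²/Σxᵢ².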

Dear student! We have computed the variances of the OLS estimators. Now it is time to check whether these variances of the OLS estimators possess the minimum variance property compared to the variances of other estimators of the true α and β, other than α̂ and β̂.
To establish that α̂ and β̂ possess the minimum variance property, we compare their variances with the variances of some other alternative linear and unbiased estimators of α and β, say α* and β*. Now, we want to prove that any other linear and unbiased estimator of the true population parameter, obtained from any other econometric method, has a larger variance than the OLS estimators.

Let us first show the minimum variance of β̂ and then that of α̂.
1. Minimum variance of β̂
Suppose β* is an alternative linear and unbiased estimator of β, and let:


β* = ΣwᵢYᵢ ………………………………………… (2.29)
where wᵢ ≠ kᵢ, but wᵢ = kᵢ + cᵢ.
β* = Σwᵢ(α + βXᵢ + uᵢ), since Yᵢ = α + βXᵢ + Uᵢ
= αΣwᵢ + βΣwᵢXᵢ + Σwᵢuᵢ
∴ E(β*) = αΣwᵢ + βΣwᵢXᵢ, since E(uᵢ) = 0
Since β* is assumed to be an unbiased estimator of β, it must be true that Σwᵢ = 0 and ΣwᵢXᵢ = 1 in the above equation.
But wᵢ = kᵢ + cᵢ, so:
Σwᵢ = Σ(kᵢ + cᵢ) = Σkᵢ + Σcᵢ
Therefore Σcᵢ = 0, since Σkᵢ = Σwᵢ = 0.
Again:
ΣwᵢXᵢ = Σ(kᵢ + cᵢ)Xᵢ = ΣkᵢXᵢ + ΣcᵢXᵢ
Since ΣwᵢXᵢ = 1 and ΣkᵢXᵢ = 1 ⇒ ΣcᵢXᵢ = 0.
From these values we can derive Σcᵢxᵢ = 0, where xᵢ = Xᵢ − X̄:
Σcᵢxᵢ = Σcᵢ(Xᵢ − X̄) = ΣcᵢXᵢ − X̄Σcᵢ
Since ΣcᵢXᵢ = 0 and Σcᵢ = 0 ⇒ Σcᵢxᵢ = 0
Thus, from the above calculations we can summarize the following results:
Σwᵢ = 0, ΣwᵢXᵢ = 1, Σcᵢ = 0, ΣcᵢXᵢ = 0
To prove whether β̂ has minimum variance or not, let us compute var(β*) to compare with var(β̂):
var(β*) = var(ΣwᵢYᵢ)
= Σwᵢ² var(Yᵢ)
∴ var(β*) = σ²Σwᵢ², since var(Yᵢ) = σ²

But Σwᵢ² = Σ(kᵢ + cᵢ)² = Σkᵢ² + 2Σkᵢcᵢ + Σcᵢ²
⇒ Σwᵢ² = Σkᵢ² + Σcᵢ², since Σkᵢcᵢ = Σcᵢxᵢ/Σxᵢ² = 0
Therefore:
var(β*) = σ²(Σkᵢ² + Σcᵢ²) = σ²Σkᵢ² + σ²Σcᵢ²
var(β*) = var(β̂) + σ²Σcᵢ²
Given that cᵢ is an arbitrary constant, σ²Σcᵢ² is positive, i.e. it is greater than zero. Thus var(β*) > var(β̂). This proves that β̂ possesses the minimum variance property. In a similar way we can prove that the least square estimate of the constant intercept (α̂) possesses minimum variance.

2. Minimum variance of α̂
We take a new estimator α*, which we assume to be a linear and unbiased estimator of α. The least square estimator α̂ is given by:
α̂ = Σ(1/n − X̄kᵢ)Yᵢ
By analogy with the proof of the minimum variance property of β̂, let us use the weights wᵢ = kᵢ + cᵢ. Consequently:
α* = Σ(1/n − X̄wᵢ)Yᵢ

Since we want α* to be an unbiased estimator of the true α, that is, E(α*) = α, we substitute Yᵢ = α + βXᵢ + uᵢ in α* and find the expected value of α*:
α* = Σ(1/n − X̄wᵢ)(α + βXᵢ + uᵢ)
= Σ(α/n + βXᵢ/n + uᵢ/n − X̄wᵢα − βX̄wᵢXᵢ − X̄wᵢuᵢ)
α* = α + βX̄ + Σuᵢ/n − αX̄Σwᵢ − βX̄ΣwᵢXᵢ − X̄Σwᵢuᵢ
For α* to be an unbiased estimator of the true α, the following must hold:
Σwᵢ = 0 and ΣwᵢXᵢ = 1 (the uᵢ terms vanish in expectation, since E(uᵢ) = 0)



i.e., Σwᵢ = 0 and ΣwᵢXᵢ = 1. These conditions imply that Σcᵢ = 0 and ΣcᵢXᵢ = 0.
As in the case of β̂, we need to compute var(α*) to compare with var(α̂):
var(α*) = var(Σ(1/n − X̄wᵢ)Yᵢ)
= Σ(1/n − X̄wᵢ)² var(Yᵢ)
= σ²Σ(1/n − X̄wᵢ)²
= σ²Σ(1/n² + X̄²wᵢ² − (2/n)X̄wᵢ)
= σ²(1/n + X̄²Σwᵢ² − (2X̄/n)Σwᵢ)
var(α*) = σ²(1/n + X̄²Σwᵢ²), since Σwᵢ = 0
But Σwᵢ² = Σkᵢ² + Σcᵢ²
⇒ var(α*) = σ²(1/n + X̄²(Σkᵢ² + Σcᵢ²))
var(α*) = σ²(1/n + X̄²/Σxᵢ²) + σ²X̄²Σcᵢ²
= σ²(ΣXᵢ²/(nΣxᵢ²)) + σ²X̄²Σcᵢ²
The first term is var(α̂), hence:
var(α*) = var(α̂) + σ²X̄²Σcᵢ²
⇒ var(α*) > var(α̂), since σ²X̄²Σcᵢ² > 0


Therefore, we have proved that the least square estimators of the linear regression model are best, linear and unbiased (BLU) estimators.

The variance of the random variable (Uᵢ)

You may observe that the variances of the OLS estimates involve σ², which is the population variance of the random disturbance term. But it is difficult to obtain the population data of the disturbance term because of technical and economic reasons. Hence it is difficult to compute σ²; this implies that the variances of the OLS estimates are also difficult to compute. But we can compute


these variances if we take the unbiased estimate of σ which is σ^ computed from the sample
2 2

value of the disturbance term ei from the expression:


Σe 2i
σ^ 2u=
n−2 …………………………………..2.30

To use σ̂² in the expressions for the variances of α̂ and β̂, we have to prove whether σ̂² is the unbiased estimator of σ², i.e., whether E(σ̂²) = E(Σeᵢ²/(n − 2)) = σ².
To prove this we have to compute Σeᵢ² from the expressions of Y, Ŷ, y, ŷ and eᵢ.
Y and ei .
Proof:
Yᵢ = α̂ + β̂Xᵢ + eᵢ
Ŷᵢ = α̂ + β̂Xᵢ
⇒ Yᵢ = Ŷᵢ + eᵢ ………………………………………… (2.31)
⇒ eᵢ = Yᵢ − Ŷᵢ ………………………………………… (2.32)
Summing (2.31) results in the following expression:
ΣYᵢ = ΣŶᵢ + Σeᵢ
ΣYᵢ = ΣŶᵢ, since Σeᵢ = 0
Dividing both sides of the above by n gives us:
ΣYᵢ/n = ΣŶᵢ/n ⇒ Ȳ = Ŷ̄ (the mean of the fitted values equals Ȳ) ………………………………………… (2.33)
Putting (2.31) and (2.33) together and subtracting:
Yᵢ = Ŷᵢ + eᵢ
Ȳ = Ŷ̄
⇒ (Yᵢ − Ȳ) = (Ŷᵢ − Ŷ̄) + eᵢ
⇒ yᵢ = ŷᵢ + eᵢ ………………………………………… (2.34)
From (2.34):


eᵢ = yᵢ − ŷᵢ ………………………………………… (2.35)
where the y's are in deviation form.
Now we have to express yᵢ and ŷᵢ in other expressions, as derived below.
From Yᵢ = α + βXᵢ + Uᵢ and Ȳ = α + βX̄ + Ū we get, by subtraction:
yᵢ = (Yᵢ − Ȳ) = β(Xᵢ − X̄) + (Uᵢ − Ū) = βxᵢ + (Uᵢ − Ū)
⇒ yᵢ = βxᵢ + (Uᵢ − Ū) ………………………………………… (2.36)
Note that we assumed earlier that E(u) = 0, i.e. in taking a very large number of samples we expect U to have a mean value of zero, but in any particular single sample Ū is not necessarily zero.
Similarly, from Ŷᵢ = α̂ + β̂Xᵢ and Ŷ̄ = α̂ + β̂X̄ we get, by subtraction:
Ŷᵢ − Ŷ̄ = β̂(Xᵢ − X̄)
⇒ ŷᵢ = β̂xᵢ ………………………………………… (2.37)
Substituting (2.36) and (2.37) in (2.35) we get:
eᵢ = βxᵢ + (uᵢ − ū) − β̂xᵢ
= (uᵢ − ū) − (β̂ − β)xᵢ
Summing the squared residuals over the n sample values yields:
Σeᵢ² = Σ[(uᵢ − ū) − (β̂ − β)xᵢ]²
= Σ[(uᵢ − ū)² + (β̂ − β)²xᵢ² − 2(β̂ − β)xᵢ(uᵢ − ū)]
= Σ(uᵢ − ū)² + (β̂ − β)²Σxᵢ² − 2(β̂ − β)Σxᵢ(uᵢ − ū)
2 i

Taking expected values we have:
E(Σeᵢ²) = E[Σ(uᵢ − ū)²] + E[(β̂ − β)²Σxᵢ²] − 2E[(β̂ − β)Σxᵢ(uᵢ − ū)] ………………………………………… (2.38)
The right-hand side terms of (2.38) may be rearranged as follows:
a. E[Σ(uᵢ − ū)²] = E(Σuᵢ² − ūΣuᵢ)
= E(Σuᵢ² − (Σuᵢ)²/n)
= ΣE(uᵢ²) − (1/n)E(Σuᵢ)²
= nσᵤ² − (1/n)E(u₁ + u₂ + ⋯ + uₙ)², since E(uᵢ²) = σᵤ²
= nσᵤ² − (1/n)E(Σuᵢ² + 2Σuᵢuⱼ)
= nσᵤ² − (1/n)(ΣE(uᵢ²) + 2ΣE(uᵢuⱼ)), i ≠ j
= nσᵤ² − (1/n)nσᵤ² − (2/n)ΣE(uᵢuⱼ)
= nσᵤ² − σᵤ², given E(uᵢuⱼ) = 0
= σᵤ²(n − 1) ………………………………………… (2.39)
b. E[(β̂ − β)²Σxᵢ²] = Σxᵢ² · E(β̂ − β)²
Given that the X's are fixed in all samples, and knowing that E(β̂ − β)² = var(β̂) = σᵤ²/Σxᵢ², hence:
Σxᵢ² · E(β̂ − β)² = Σxᵢ² · σᵤ²/Σxᵢ² = σᵤ² ………………………………………… (2.40)
c. −2E[(β̂ − β)Σxᵢ(uᵢ − ū)] = −2E[(β̂ − β)(Σxᵢuᵢ − ūΣxᵢ)]
= −2E[(β̂ − β)(Σxᵢuᵢ)], since Σxᵢ = 0
But from (2.22), (β̂ − β) = Σkᵢuᵢ; substituting this in the above expression, we get:
−2E[(β̂ − β)Σxᵢ(uᵢ − ū)] = −2E[(Σkᵢuᵢ)(Σxᵢuᵢ)]


= -2
Ε
[( ) Σx i u i
Σx 2
( Σx i ui )
i
] ,since
k i=
xi
∑ x i2

=−2 Ε
[ ]
( Σxi ui )2
Σx 2
i

[ ]
Σx 2 u 2 +2 Σx i x j ui u j
i i
=−2 Ε
Σx 2
i

[ ]
2
Σx Ε (u 2 ) +2 Σ ( x x ) Ε ( u u )
i i j i j
=−2 i≠ j
Σx 2 Σx 2
i i

Σx 2 Ε( u 2 )
i
=−2 ( given Ε( ui u j )=0 )
Σx
i2
=−2 Ε (u 2i )=−2 σ 2
…………………………………………………….(2.41)
Consequently, equation (2.38) can be written in terms of (2.39), (2.40) and (2.41) as follows:
E(Σeᵢ²) = (n − 1)σᵤ² + σᵤ² − 2σᵤ² = (n − 2)σᵤ² ………………………………………… (2.42)
From which we get:
E(Σeᵢ²/(n − 2)) = E(σ̂ᵤ²) = σᵤ² ………………………………………… (2.43)
since σ̂ᵤ² = Σeᵢ²/(n − 2).
The conclusion that we can draw from the above proof is that we can substitute σ̂² = Σeᵢ²/(n − 2) for σ² in the variance expressions of α̂ and β̂, since E(σ̂²) = σ². Hence the formulas for the variances of α̂ and β̂ become:
Var(β̂) = σ̂²/Σxᵢ² = Σeᵢ²/[(n − 2)Σxᵢ²] ………………………………………… (2.44)

Var(α̂) = σ̂²(ΣXᵢ²/(nΣxᵢ²)) = Σeᵢ²ΣXᵢ² / [n(n − 2)Σxᵢ²] ………………………………………… (2.45)
Note: Σeᵢ² can be computed as Σeᵢ² = Σyᵢ² − β̂Σxᵢyᵢ.
Do not worry about the derivation of this expression! We will perform the derivation in a subsequent subtopic.
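Putting (2.30), (2.44) and (2.45) together, and using the shortcut Σeᵢ² = Σyᵢ² − β̂Σxᵢyᵢ quoted in the note, gives a compact estimation routine. The five-point sample below is purely illustrative.

```python
def ols_with_variances(X, Y):
    """OLS fit plus the variance estimates (2.44) and (2.45),
    using sigma_hat^2 = sum(e_i^2) / (n - 2) from (2.30)."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    x = [xi - x_bar for xi in X]
    y = [yi - y_bar for yi in Y]
    sxx = sum(xi ** 2 for xi in x)
    beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    # Residual sum of squares via the shortcut sum(e^2) = sum(y^2) - beta_hat * sum(x*y)
    rss = sum(yi ** 2 for yi in y) - beta_hat * sum(xi * yi for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)                                     # (2.30)
    var_beta = sigma2_hat / sxx                                    # (2.44)
    var_alpha = sigma2_hat * sum(Xi ** 2 for Xi in X) / (n * sxx)  # (2.45)
    return alpha_hat, beta_hat, var_alpha, var_beta

a, b, va, vb = ols_with_variances([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

For this sample the routine gives β̂ = 0.6, α̂ = 2.2, Var(β̂) = 0.08 and Var(α̂) = 0.88.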

3.2.2.4. Statistical Tests of Significance of the OLS Estimators (First Order Tests)
After the estimation of the parameters and the determination of the least squares regression line, we need to know how 'good' the fit of this line is to the sample observations of Y and X; that is to say, we need to measure the dispersion of observations around the regression line. This knowledge is essential because the closer the observations to the line, the better the goodness of fit, i.e. the better the explanation of the variations of Y by the changes in the explanatory variables.
We divide the available criteria into three groups: the theoretical a priori criteria, the statistical criteria, and the econometric criteria. Under this section, our focus is on the statistical criteria (first order tests). The two most commonly used first order tests in econometric analysis are:
i. The coefficient of determination (the square of the correlation coefficient, i.e. R²). This test is used for judging the explanatory power of the independent variable(s).
ii. The standard error test of the estimators. This test is used for judging the statistical reliability of the estimates of the regression coefficients.

1. TESTS OF THE 'GOODNESS OF FIT' WITH R²

R² shows the percentage of the total variation of the dependent variable that can be explained by the changes in the explanatory variable(s) included in the model. To elaborate on this, let us draw a horizontal line corresponding to the mean value of the dependent variable Ȳ (see figure 'd' below). By fitting the line Ŷ = α̂₀ + β̂₁X we try to obtain the explanation of the variation of the dependent variable Y produced by the changes of the explanatory variable X.
[Figure 'd'. Actual and estimated values of the dependent variable Y. The figure shows the fitted line Ŷ = α̂₀ + β̂₁X, the horizontal line at the mean Ȳ, and an observed point Y, with the vertical deviations Y − Ȳ, Ŷ − Ȳ and Y − Ŷ marked.]
As can be seen from figure (d) above, Y − Ȳ measures the variation of the sample observation value of the dependent variable around the mean. The variation in Y that can be attributed to the influence of X (i.e. the regression line) is given by the vertical distance Ŷ − Ȳ. The part of the total variation in Y about Ȳ that cannot be attributed to X is equal to Y − Ŷ, which is referred to as the residual variation.
In summary:

eᵢ = Yᵢ − Ŷᵢ = deviation of the observation Yᵢ from the regression line.
yᵢ = Yᵢ − Ȳ = deviation of Yᵢ from its mean.
ŷᵢ = Ŷᵢ − Ȳ = deviation of the regressed (predicted) value (Ŷᵢ) from the mean.

Now, we may write the observed Y as the sum of the predicted value (Ŷ) and the residual term (eᵢ):

Yᵢ = Ŷᵢ + eᵢ
(observed Yᵢ = predicted Yᵢ + residual)

From equation (2.34) we can write the above equation in deviation form: yᵢ = ŷᵢ + eᵢ. By squaring and summing both sides, we obtain the following expression:

Σyᵢ² = Σ(ŷᵢ + eᵢ)² = Σ(ŷᵢ² + eᵢ² + 2ŷᵢeᵢ) = Σŷᵢ² + Σeᵢ² + 2Σŷᵢeᵢ


But Σŷᵢeᵢ = Σeᵢ(Ŷᵢ − Ȳ) = Σeᵢ(α̂ + β̂Xᵢ − Ȳ) = α̂Σeᵢ + β̂ΣeᵢXᵢ − ȲΣeᵢ

(but Σeᵢ = 0 and Σeᵢxᵢ = 0)

⇒ Σŷᵢeᵢ = 0 ………………………………………………(2.46)
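Result (2.46) rests on the normal-equation identities Σeᵢ = 0 and Σeᵢxᵢ = 0. A quick numerical check in Python (with data invented for the sketch) confirms that OLS residuals satisfy these identities, and hence that Σŷᵢeᵢ = 0:

```python
# Illustrative data, invented for this sketch.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [3.0, 5.5, 6.1, 9.0, 10.2, 12.8]
n = len(X)

Xbar, Ybar = sum(X) / n, sum(Y) / n
x = [xi - Xbar for xi in X]
beta_hat = sum(xi * (yi - Ybar) for xi, yi in zip(x, Y)) / sum(xi**2 for xi in x)
alpha_hat = Ybar - beta_hat * Xbar

e = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(X, Y)]   # residuals
yhat_dev = [alpha_hat + beta_hat * xi - Ybar for xi in X]        # yhat_i - Ybar

sum_e  = sum(e)                                      # should be 0
sum_xe = sum(xi * ei for xi, ei in zip(x, e))        # should be 0
sum_ye = sum(yh * ei for yh, ei in zip(yhat_dev, e)) # hence 0 (eq. 2.46)
```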
Therefore:

Σyᵢ² (total variation) = Σŷᵢ² (explained variation) + Σeᵢ² (unexplained variation) ………………………………...(2.47)

or, in words:

Total sum of squares (TSS) = Explained sum of squares (ESS) + Residual sum of squares (RSS)

i.e. TSS = ESS + RSS ……………………………………….(2.48)
Mathematically, the explained variation as a percentage of the total variation is:

ESS/TSS = Σŷ²/Σy² ……………………………………….(2.49)

From equation (2.37) we have ŷ = β̂x. Squaring and summing both sides gives us

Σŷ² = β̂²Σx² −−−−−−−−−−−−−−−−−−−−−−−(2.50)

We can substitute (2.50) in (2.49) and obtain:

ESS/TSS = β̂²Σx²/Σy² …………………………………(2.51)

Since β̂ = Σxᵢyᵢ/Σxᵢ², this becomes

ESS/TSS = (Σxy/Σx²)²(Σx²/Σy²) = (Σxy/Σx²)(Σxy/Σy²) ………………………………………(2.52)
Comparing (2.52) with the formula for the correlation coefficient:

r = Cov(X,Y)/(σₓσᵧ) = (Σxy/n)/(σₓσᵧ) = Σxy/√(Σx²Σy²) ………(2.53)

Squaring (2.53) results in:

r² = (Σxy)²/(Σx²Σy²) ………….(2.54)

Comparing (2.52) and (2.54), we see that they are exactly the same expression. Therefore:

ESS/TSS = (Σxy/Σx²)(Σxy/Σy²) = (Σxy)²/(Σx²Σy²) = r²

From (2.48), RSS = TSS − ESS. Hence R² becomes:

R² = (TSS − RSS)/TSS = 1 − RSS/TSS = 1 − Σeᵢ²/Σyᵢ² ………………………….…………(2.55)

From equation (2.55) we can derive:

RSS = Σeᵢ² = Σyᵢ²(1 − R²) −−−−−−−−−−−−−−−−−−−−−−−−−−−−(2.56)


The limits of R²: the value of R² falls between zero and one, i.e. 0 ≤ R² ≤ 1.
 Interpretation of R²
Suppose R² = 0.9; this means that the regression line gives a good fit to the observed data, since this line explains 90% of the total variation of the Y values around their mean. The remaining 10% of the total variation in Y is unaccounted for by the regression line and is attributed to the factors included in the disturbance term uᵢ.
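As a numerical illustration of the decomposition TSS = ESS + RSS (eq. 2.48) and of R² = 1 − RSS/TSS (eq. 2.55), the following Python sketch (with data invented for the example) computes the three sums of squares and R²:

```python
# Illustrative data, invented for this sketch.
X = [2.0, 4.0, 6.0, 8.0, 10.0]
Y = [3.1, 6.8, 9.2, 13.1, 15.8]
n = len(X)

Xbar, Ybar = sum(X) / n, sum(Y) / n
x = [xi - Xbar for xi in X]
y = [yi - Ybar for yi in Y]
beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi**2 for xi in x)
alpha_hat = Ybar - beta_hat * Xbar

yhat = [alpha_hat + beta_hat * xi for xi in X]
TSS = sum(yi**2 for yi in y)                        # total variation
ESS = sum((yh - Ybar)**2 for yh in yhat)            # explained variation
RSS = sum((yi - yh)**2 for yi, yh in zip(Y, yhat))  # residual variation

R2 = ESS / TSS   # equals 1 - RSS/TSS when the model has an intercept
```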
Check yourself questions:
a. Show that 0 ≤ R² ≤ 1.
b. Show that the square of the coefficient of correlation is equal to ESS/TSS.

Exercise:
Suppose r_xy is the correlation coefficient between Y and X, given by:

r_xy = Σxᵢyᵢ / (√Σxᵢ² √Σyᵢ²)

and let r²_yŷ be the square of the correlation coefficient between Y and Ŷ, given by:

r²_yŷ = (Σyŷ)² / (Σy²Σŷ²)

Show that: i) r²_yŷ = R²   ii) r_yŷ = r_yx

2. TESTING THE SIGNIFICANCE OF OLS PARAMETERS

To test the significance of the OLS parameter estimators we need the following:
 The variances of the parameter estimators
 An unbiased estimator of σ²
 The assumption of normality of the distribution of the error term.

We have already derived that:
 var(β̂) = σ̂²/Σx²
 var(α̂) = σ̂²ΣX²/(nΣx²)
 σ̂² = Σe²/(n−2) = RSS/(n−2)
For the purpose of estimating the parameters, the assumption of normality is not used; but we use this assumption to test the significance of the parameter estimators, because the testing methods or procedures are based on the assumption of normality of the disturbance term. Hence, before we discuss the various testing methods, it is important to see whether the parameter estimators are normally distributed or not.

We have already assumed that the error term is normally distributed with mean zero and variance σ², i.e. Uᵢ ~ N(0, σ²). Similarly, we also proved that Yᵢ ~ N[(α + βxᵢ), σ²]. Now, we want to show the following:

1. β̂ ~ N(β, σ²/Σx²)
2. α̂ ~ N(α, σ²ΣX²/nΣx²)

To show whether α̂ and β̂ are normally distributed or not, we need to make use of one property of the normal distribution: "any linear function of a normally distributed variable is itself normally distributed."

β̂ = ΣkᵢYᵢ = k₁Y₁ + k₂Y₂ + .... + kₙYₙ
α̂ = ΣwᵢYᵢ = w₁Y₁ + w₂Y₂ + .... + wₙYₙ

Since α̂ and β̂ are linear in Y, it follows that

β̂ ~ N(β, σ²/Σx²) ;  α̂ ~ N(α, σ²ΣX²/nΣx²)
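The stated sampling distribution of β̂ can be illustrated by simulation. The sketch below (all parameter values and sample sizes are invented for the illustration) repeatedly draws normal errors, re-estimates β̂, and compares the simulated variance of β̂ with the theoretical value σ²/Σx²:

```python
import random

random.seed(0)
# True parameters, invented for this Monte Carlo sketch.
alpha, beta, sigma = 2.0, 0.5, 1.0
X = [float(i) for i in range(1, 11)]           # fixed regressor values
Xbar = sum(X) / len(X)
x2 = sum((xi - Xbar)**2 for xi in X)           # sum of x_i^2 in deviation form

draws = []
for _ in range(2000):
    # Generate Y = alpha + beta*X + U with U ~ N(0, sigma^2), then re-estimate beta.
    Y = [alpha + beta * xi + random.gauss(0.0, sigma) for xi in X]
    Ybar = sum(Y) / len(Y)
    b = sum((xi - Xbar) * (yi - Ybar) for xi, yi in zip(X, Y)) / x2
    draws.append(b)

mean_b = sum(draws) / len(draws)
var_b = sum((b - mean_b)**2 for b in draws) / len(draws)
theory = sigma**2 / x2   # theoretical variance of beta-hat
```

Across the simulated samples, the average of β̂ is close to the true β and its dispersion is close to σ²/Σx², as the distributional result above asserts.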

The OLS estimates α̂ and β̂ are obtained from a sample of observations on Y and X. Since sampling errors are inevitable in all estimates, it is necessary to apply tests of significance in order to measure the size of the error and determine the degree of confidence in the validity of these estimates. This can be done by using various tests. The most common ones are:
i) the standard error test, ii) Student's t-test, and iii) the confidence interval.

All of these testing procedures reach the same conclusion. Let us now see these testing methods one by one.
i) Standard error test

This test helps us decide whether the estimates α̂ and β̂ are significantly different from zero, i.e. whether the sample from which they have been estimated might have come from a population whose true parameters are zero (α = 0 and/or β = 0).
Formally, we test the null hypothesis H₀: βᵢ = 0 against the alternative hypothesis H₁: βᵢ ≠ 0.
The standard error test may be outlined as follows.

First: Compute the standard errors of the parameters:
SE(β̂) = √var(β̂)
SE(α̂) = √var(α̂)

Second: Compare the standard errors with the numerical values of α̂ and β̂.

Decision rule:
 If SE(β̂ᵢ) > ½β̂ᵢ, accept the null hypothesis and reject the alternative hypothesis. We conclude that β̂ᵢ is statistically insignificant.
 If SE(β̂ᵢ) < ½β̂ᵢ, reject the null hypothesis and accept the alternative hypothesis. We conclude that β̂ᵢ is statistically significant.

The acceptance or rejection of the null hypothesis has a definite economic meaning. Namely, acceptance of the null hypothesis β = 0 (the slope parameter is zero) implies that the explanatory variable to which this estimate relates does not in fact influence the dependent variable Y and should not be included in the function, since the conducted test provided evidence that changes in X leave Y unaffected. In other words, acceptance of H₀ implies that the relationship between Y and X is in fact Y = α + (0)X = α, i.e. there is no relationship between X and Y.
Numerical example: Suppose that from a sample of size n = 30 we estimate the following supply function:

Q = 120 + 0.6P + eᵢ
SE:  (1.7)  (0.025)

Test the significance of the slope parameter at the 5% level of significance using the standard error test.

SE(β̂) = 0.025, β̂ = 0.6, so ½β̂ = 0.3.

This implies that SE(β̂ᵢ) < ½β̂ᵢ. The implication is that β̂ is statistically significant at the 5% level of significance.
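The decision rule in this example reduces to a one-line comparison. The following Python sketch simply restates the check from the text:

```python
# Values from the supply-function example in the text.
beta_hat = 0.6
se_beta = 0.025

# Standard error test: reject H0 (beta = 0) when SE(beta_hat) < |beta_hat|/2.
significant = se_beta < abs(beta_hat) / 2
```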

Note: The standard error test is an approximate test (approximated from the z-test and t-test) and implies a two-tail test conducted at the 5% level of significance.

ii) Student's t-test

Like the standard error test, this test is also important for testing the significance of the parameters. From statistics, any variable X can be transformed into t using the general formula:

t = (X − μ)/s_X , with n−1 degrees of freedom,

where
μ = value of the population mean
s_X = sample estimate of the population standard deviation, s_X = √[Σ(X − X̄)²/(n−1)]
n = sample size

We can derive the t-values of the OLS estimates:

t_β̂ = (β̂ − β)/SE(β̂)
t_α̂ = (α̂ − α)/SE(α̂)

with n−k degrees of freedom, where
SE = standard error
k = number of parameters in the model.

Since we have two parameters in simple linear regression with an intercept different from zero, our degrees of freedom are n−2. Like the standard error test, we formally test the hypothesis H₀: βᵢ = 0 against the alternative H₁: βᵢ ≠ 0 for the slope parameter, and H₀: α = 0 against the alternative H₁: α ≠ 0 for the intercept.

To undertake the above test we follow these steps.

Step 1: Compute t*, the computed value of t, by taking the value of β in the null hypothesis. In our case β = 0, so t* becomes:

t* = (β̂ − 0)/SE(β̂) = β̂/SE(β̂)

Step 2: Choose the level of significance. The level of significance is the probability of making a 'wrong' decision, i.e. the probability of rejecting the hypothesis when it is actually true, or the probability of committing a type I error. It is customary in econometric research to choose the 5% or the 1% level of significance. This means that in making our decision we allow (tolerate) five times out of a hundred to be 'wrong', i.e. to reject the hypothesis when it is actually true.

Step 3: Check whether it is a one-tail or a two-tail test. If the inequality sign in the alternative hypothesis is ≠, then it implies a two-tail test: divide the chosen level of significance by two and determine the critical region or critical value of t, called t_c. But if the inequality sign is either > or <, then it indicates a one-tail test and there is no need to divide the chosen level of significance by two to obtain the critical value of t from the t-table.

Example: If we have H₀: βᵢ = 0 against H₁: βᵢ ≠ 0, then this is a two-tail test. If the level of significance is 5%, divide it by two to obtain the critical value of t from the t-table.

Step 4: Obtain the critical value of t, called t_c, at α/2 and n−2 degrees of freedom for a two-tail test.

Step 5: Compare t* (the computed value of t) and t_c (the critical value of t):
 If |t*| > t_c, reject H₀ and accept H₁. The conclusion is that β̂ is statistically significant.
 If |t*| < t_c, accept H₀ and reject H₁. The conclusion is that β̂ is statistically insignificant.

Numerical Example:
Suppose that from a sample of size n = 20 we estimate the following consumption function:

C = 100 + 0.70Y + e
     (75.5)  (0.21)

The values in the brackets are standard errors. We want to test the null hypothesis H₀: βᵢ = 0 against the alternative H₁: βᵢ ≠ 0 using the t-test at the 5% level of significance.

a. The t-value for the test statistic is:

t* = (β̂ − 0)/SE(β̂) = β̂/SE(β̂) = 0.70/0.21 ≈ 3.3

b. Since the alternative hypothesis (H₁) is stated with an inequality sign (≠), it is a two-tail test; hence we divide α = 0.05 by two to obtain the critical value of 't' at α/2 = 0.025 and 18 degrees of freedom (df), i.e. n−2 = 20−2. From the t-table, 't_c' at the 0.025 level of significance and 18 df is 2.10.

c. Since t* = 3.3 and t_c = 2.1, t* > t_c. This implies that β̂ is statistically significant.
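The same computation can be expressed in Python; the critical value t_c = 2.10 is taken from the t-table as in the text:

```python
# Values from the consumption-function example in the text.
beta_hat = 0.70
se_beta = 0.21
t_c = 2.10   # critical t at alpha/2 = 0.025 and 18 df (from the t-table)

t_star = (beta_hat - 0.0) / se_beta   # computed t under H0: beta = 0
reject_H0 = abs(t_star) > t_c         # two-tail test
```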

iii) Confidence interval

Rejection of the null hypothesis does not mean that our estimates α̂ and β̂ are the correct estimates of the true population parameters α and β. It simply means that our estimate comes from a sample drawn from a population whose parameter β is different from zero.

In order to define how close the estimate is to the true parameter, we must construct a confidence interval for the true parameter; in other words, we must establish limiting values around the estimate within which the true parameter is expected to lie with a certain "degree of confidence". In this respect we say that, with a given probability, the population parameter will be within the defined confidence interval (confidence limits).

We choose a probability in advance and refer to it as the confidence level (interval coefficient). It is customary in econometrics to choose the 95% confidence level. This means that in repeated sampling the confidence limits, computed from the sample, would include the true population parameter in 95% of the cases. In the other 5% of the cases the population parameter will fall outside the confidence interval.
In a two-tail test at the α level of significance, the probability of obtaining the specific t-value −t_c or t_c is α/2 at n−2 degrees of freedom. The probability of obtaining any value of t = (β̂ − β)/SE(β̂) between −t_c and t_c at n−2 degrees of freedom is 1 − (α/2 + α/2), i.e. 1 − α.

i.e. Pr{−t_c < t* < t_c} = 1 − α …………………………………………(2.57)

but t* = (β̂ − β)/SE(β̂) …………………………………………………….(2.58)

Substituting (2.58) in (2.57) we obtain the following expression:

Pr{−t_c < (β̂ − β)/SE(β̂) < t_c} = 1 − α ………………………………………..(2.59)

Pr{−SE(β̂)t_c < β̂ − β < SE(β̂)t_c} = 1 − α ……… by multiplying by SE(β̂)
Pr{−β̂ − SE(β̂)t_c < −β < −β̂ + SE(β̂)t_c} = 1 − α ……… by subtracting β̂
Pr{β̂ + SE(β̂)t_c > β > β̂ − SE(β̂)t_c} = 1 − α ……… by multiplying by −1
Pr{β̂ − SE(β̂)t_c < β < β̂ + SE(β̂)t_c} = 1 − α ……… by interchanging

The limits within which the true β lies at the (1−α)% degree of confidence are:

[β̂ − SE(β̂)t_c , β̂ + SE(β̂)t_c], where t_c is the critical value of t at α/2 and n−2 degrees of freedom.
The test procedure is outlined as follows:
H₀: β = 0
H₁: β ≠ 0

Decision rule: If the hypothesized value of β in the null hypothesis is within the confidence interval, accept H₀ and reject H₁. The implication is that β̂ is statistically insignificant. If the hypothesized value of β in the null hypothesis is outside the limits, reject H₀ and accept H₁. This indicates that β̂ is statistically significant.
Numerical Example:
Suppose we have estimated the following regression line from a sample of 20 observations:

Y = 128.5 + 2.88X + e
     (38.2)   (0.85)

The values in the brackets are standard errors.
a. Construct a 95% confidence interval for the slope parameter.
b. Test the significance of the slope parameter using the constructed confidence interval.

Solution:
a. The limits within which the true β lies at the 95% confidence level are β̂ ± SE(β̂)t_c, with
β̂ = 2.88, SE(β̂) = 0.85, and t_c at the 0.025 level of significance and 18 degrees of freedom equal to 2.10.

⇒ β̂ ± SE(β̂)t_c = 2.88 ± 2.10(0.85) = 2.88 ± 1.79.

The confidence interval is: (1.09, 4.67)

b. The value of β in the null hypothesis is zero, which lies outside the confidence interval. Hence β is statistically significant.
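The interval in this example can be reproduced in Python (the small difference from the rounded interval (1.09, 4.67) is because the text rounds 2.10 × 0.85 = 1.785 to 1.79):

```python
# Values from the confidence-interval example in the text.
beta_hat = 2.88
se_beta = 0.85
t_c = 2.10   # critical t at alpha/2 = 0.025 and 18 df

lower = beta_hat - t_c * se_beta
upper = beta_hat + t_c * se_beta
contains_zero = lower <= 0.0 <= upper   # if False, beta is statistically significant
```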

3.2.3 Reporting the Results of Regression Analysis

The results of the regression analysis are reported in conventional formats. It is not sufficient merely to report the estimates of the β's. In practice we report the regression coefficients together with their standard errors and the value of R². It has become customary to present the estimated equations with standard errors placed in parentheses below the estimated parameter values. Sometimes the estimated coefficients, the corresponding standard errors, the p-values, and some other indicators are presented in tabular form. These results are supplemented by R², usually placed to the right of the regression equation.

Example:  Y = 128.5 + 2.88X ,  R² = 0.93
               (38.2)   (0.85)

The numbers in the parentheses below the parameter estimates are the standard errors. Some econometricians report the t-values of the estimated coefficients in place of the standard errors.
Review Questions
1. Econometrics deals with the measurement of economic relationships which are stochastic or random. The simplest form of economic relationship between two variables X and Y can be represented by:
   Yᵢ = β₀ + β₁Xᵢ + Uᵢ ; where β₀ and β₁ are regression parameters and Uᵢ is the stochastic disturbance term.
   What are the reasons for the insertion of the U-term in the model?
2. The following data refer to the demand for money (M) and the rate of interest (R) in eight different economies:
   M (in billions)  56   50   46   30   20   35   37   61
   R%              6.3  4.6  5.1  7.3  8.9  5.3  6.7  3.5

   a. Assuming a relationship M = α + βR + Uᵢ, obtain the OLS estimators of α and β.
   b. Calculate the coefficient of determination for the data and interpret its value.
   c. If in a 9th economy the rate of interest is R = 8.1, predict the demand for money (M) in this economy.
3. The following data refer to the price of a good 'P' and the quantity of the good supplied, 'S':
   P   2   7   5   1   4   8   2   8
   S  15  41  32   9  28  43  17  40
   a. Estimate the linear regression line E(S) = α + βP.
   b. Estimate the standard errors of α̂ and β̂.
   c. Test the hypothesis that price influences supply.
   d. Obtain a 95% confidence interval for α.

4. The following results have been obtained from a sample of 11 observations on the values of sales (Y) of a firm and the corresponding prices (X):
   X̄ = 519.18,  Ȳ = 217.82,  ΣXᵢ² = 3,134,543,  ΣXᵢYᵢ = 1,296,836,  ΣYᵢ² = 539,512
   i) Estimate the regression line of sales on price and interpret the results.
   ii) What is the part of the variation in sales which is not explained by the regression line?
   iii) Estimate the price elasticity of sales.
5. The following table includes the GNP (X) and the demand for food (Y) for a country over a ten-year period.
   year  1980 1981 1982 1983 1984 1985 1986 1987 1988 1989
   Y        6    7    8   10    8    9   10    9   11   10
   X       50   52   55   59   57   58   62   65   68   70
   a. Estimate the food demand function.
   b. Compute the coefficient of determination and find the explained and unexplained variation in food expenditure.
   c. Compute the standard errors of the regression coefficients and conduct tests of significance at the 5% level of significance.

6. A sample of 20 observations corresponding to the regression model Yᵢ = α + βXᵢ + Uᵢ gave the following data:
   ΣYᵢ = 21.9,  Σ(Yᵢ − Ȳ)² = 86.9
   ΣXᵢ = 186.2,  Σ(Xᵢ − X̄)² = 215.4
   Σ(Xᵢ − X̄)(Yᵢ − Ȳ) = 106.4
   a. Estimate α and β.
   b. Calculate the variances of the estimates.
   c. Estimate the conditional mean of Y corresponding to a value of X fixed at X = 10.

7. Suppose that a researcher estimates a consumption function and obtains the following results:
   C = 15 + 0.81 Yd ,  n = 19,  R² = 0.99
       (3.1)  (18.7)
   where C = consumption, Yd = disposable income, and the numbers in the parentheses are the 't-ratios'.
   a. Test the significance of Yd statistically using the t-ratios.
   b. Determine the estimated standard deviations of the parameter estimates.

8. State and prove the Gauss-Markov theorem.

9. Given the model Yᵢ = β₀ + β₁Xᵢ + Uᵢ with the usual OLS assumptions, derive the expression for the error variance.