
Chapter 3.
Two-Variable Regression Model: The Problem of Estimation

Ordinary Least Squares Method (OLS)
Recall the PRF: Yi = β1 + β2 Xi + ui

Since the PRF is not directly observable, it is estimated by the SRF; that is,

Yi = β̂1 + β̂2 Xi + ûi

And,

Yi = Ŷi + ûi
More on the Error Term

If

Yi = Ŷi + ûi

then

ûi = Yi − Ŷi

and

ûi = Yi − β̂1 − β̂2 Xi
More on the Error Term

We need to choose the SRF in such a way that the error terms are as small as possible. That is, the sum of residuals,

∑ûi = ∑(Yi − Ŷi),

should be as small as possible.
More on Error Terms

Therefore, the essential task is to find a criterion that minimizes the error disturbances in the SRF, so that all of the errors lie as close as possible to the fitted SRF line.
Then, the Least Squares Criterion Comes as a Solution

The Least Squares Criterion is based on:

∑ûi² = ∑(Yi − Ŷi)²
     = ∑(Yi − β̂1 − β̂2 Xi)²

Thus,

∑ûi² = f(β̂1, β̂2)
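The criterion above treats the sum of squared residuals as a function of the candidate coefficients. The following is a minimal Python sketch of that idea; the data set and the two candidate coefficient pairs are invented purely for illustration, and NumPy is assumed to be available.

```python
# Minimal sketch: the sum of squared residuals is a function of the
# candidate coefficients (b1, b2). Data and candidates are hypothetical.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

def sum_sq_residuals(b1, b2):
    """Return sum of (Yi - b1 - b2*Xi)^2 for candidate coefficients."""
    u_hat = Y - b1 - b2 * X
    return float(np.sum(u_hat ** 2))

# Two arbitrary candidate SRFs: the one with the smaller criterion value
# fits the data better according to the least squares criterion.
print(sum_sq_residuals(1.0, 1.5))
print(sum_sq_residuals(0.1, 2.0))
```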
Example to Least Squares Criterion

TheSufirst
m oModel is sBetter?
f square of Error
Why?
disturbances of th
e
second model is lo
wer
Regression Equation

Yi = β̂1 + β̂2 Xi + ûi

β̂2 = [n∑XiYi − ∑Xi∑Yi] / [n∑Xi² − (∑Xi)²]
   = ∑(Xi − X̄)(Yi − Ȳ) / ∑(Xi − X̄)²
   = ∑xiyi / ∑xi²

β̂1 = [∑Xi²∑Yi − ∑Xi∑XiYi] / [n∑Xi² − (∑Xi)²]
   = Ȳ − β̂2 X̄

where X̄ and Ȳ are the sample means of X and Y, and xi = Xi − X̄ and yi = Yi − Ȳ are deviations from those means.
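These formulas can be applied directly to data. Below is a minimal Python sketch (NumPy assumed, sample values hypothetical) that computes β̂2 from the deviation form ∑xiyi/∑xi² and then β̂1 = Ȳ − β̂2X̄.

```python
# Minimal sketch of the OLS formulas above, on a hypothetical sample.
import numpy as np

X = np.array([80.0, 100.0, 120.0, 140.0, 160.0, 180.0, 200.0])
Y = np.array([65.0, 70.0, 84.0, 93.0, 107.0, 115.0, 136.0])

x = X - X.mean()            # deviations from the sample mean of X
y = Y - Y.mean()            # deviations from the sample mean of Y

beta2_hat = np.sum(x * y) / np.sum(x ** 2)       # slope estimate
beta1_hat = Y.mean() - beta2_hat * X.mean()      # intercept estimate

Y_hat = beta1_hat + beta2_hat * X                # fitted values
u_hat = Y - Y_hat                                # residuals

print(beta1_hat, beta2_hat)
```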
The Classical Linear Regression Model (CLRM):
The Assumptions Underlying the Method of Least Squares

- Inferences about the true β1 and β2 are important because the estimated values need to be as close as possible to the population values.

- Therefore the CLRM, which is the cornerstone of most econometric theory, makes 10 assumptions.
Assumptions of CLRM:

- Assumption 1. Linear Regression Model
  The regression model is linear in the parameters, that is:
  Yi = β1 + β2 Xi + ui

- Assumption 2. X values are fixed in repeated sampling.
  More technically, X is assumed to be non-stochastic.
  X: $80 income level → Y: $60 weekly consumption of one family
  X: $80 income level → Y: $75 weekly consumption of another family

  Under Assumption 2, the analysis is conditional regression analysis, that is, conditional on the given values of the regressor(s) X.
Assumption 3. Zero Mean Value of the Disturbance ui

E(ui | Xi) = 0

Assumption 4. Homoscedasticity or Equal Variance of ui

var(ui | Xi) = E[ui − E(ui | Xi)]²
             = E(ui² | Xi)    (because of Assumption 3)
             = σ²

where var stands for variance.
Homoscedasticity vs Heteroscedasticity

var(ui | Xi) = σ²
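To make the contrast concrete, here is a minimal simulation sketch: one set of disturbances has constant variance for every Xi (homoscedastic), the other has variance that grows with Xi (heteroscedastic). All numerical choices are illustrative assumptions, and NumPy is assumed.

```python
# Minimal simulated contrast: constant-variance vs X-dependent-variance errors.
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(1.0, 10.0, 500)

u_homo = rng.normal(0.0, 2.0, size=X.size)        # var(u|X) = 4 for all X
u_hetero = rng.normal(0.0, 0.5 * X, size=X.size)  # var(u|X) grows with X

# Compare the spread of the errors for small vs large X values.
for name, u in [("homoscedastic", u_homo), ("heteroscedastic", u_hetero)]:
    print(name, u[X < 3].std(), u[X > 8].std())
```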
Assumption 5. No Autocorrelation Between the Disturbances

If the PRF is Yt = β1 + β2 Xt + ut, and ut and ut−1 are correlated, then Yt depends not only on Xt but also on ut−1.

Autocorrelation in Graphs
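A minimal sketch of such correlated disturbances follows: the errors are generated by an AR(1) process ut = ρut−1 + et, so successive errors are correlated. The value of ρ and the sample size are illustrative assumptions.

```python
# Minimal sketch: AR(1) disturbances violate the no-autocorrelation assumption.
import numpy as np

rng = np.random.default_rng(1)
T, rho = 200, 0.8
e = rng.normal(0.0, 1.0, size=T)

u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + e[t]   # today's error carries over part of yesterday's

# The sample correlation between u_t and u_{t-1} is far from zero.
print(np.corrcoef(u[1:], u[:-1])[0, 1])
```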
Assumption 6. Zero Covariance between ui and Xi.
Assumption 7. The number of observations n must be greater than the number of parameters to be estimated.
Assumption 8. Variability in X values: the X values in a given sample must not all be the same.
Assumption 9. The regression model is correctly specified.
Assumption 10. There is No Perfect Multicollinearity

That is, there is no perfect linear relationship among the explanatory variables:

Yt = β0 + β1 X1t + β2 X2t + ... + βn Xnt + ut

High correlation among the independent variables causes multicollinearity, which inflates standard errors and weakens hypothesis tests (low t values), as the sketch below illustrates.
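The following is a minimal Python sketch of that effect, under an invented data-generating process: when one regressor is nearly a linear function of another, the usual OLS standard errors (computed from s²(X′X)⁻¹) blow up. NumPy is assumed; all parameter values are illustrative.

```python
# Minimal sketch: near-perfect correlation between regressors inflates OLS SEs.
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)

def ols_std_errors(x2):
    X = np.column_stack([np.ones(n), x1, x2])      # design matrix with intercept
    y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
    beta = np.linalg.solve(X.T @ X, X.T @ y)       # OLS coefficients
    resid = y - X @ beta
    s2 = resid @ resid / (n - X.shape[1])          # residual variance estimate
    return np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))

x2_independent = rng.normal(size=n)                  # low correlation with x1
x2_collinear = x1 + rng.normal(scale=0.01, size=n)   # near-perfect correlation

print("independent regressors SEs:", ols_std_errors(x2_independent))
print("collinear regressors SEs:  ", ols_std_errors(x2_collinear))
```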
Properties of the Least-Squares Estimators:
The Gauss-Markov Theorem

- The Gauss-Markov Theorem combines the least squares approach of Gauss (1821) with the minimum variance approach of Markov (1900).

- The standard error of estimate is simply the standard deviation of the Y values about the estimated regression line and is often used as a summary measure of the "goodness of fit" of the estimated regression line.
BLUE (Best Linear Unbiased Estimator)

1. An estimator is linear, that is, a linear function of a random variable, such as the dependent variable Y in the regression model.
2. An estimator is unbiased, that is, its average or expected value, E(β̂2), is equal to the true value, β2.
3. An estimator has minimum variance in the class of all such linear unbiased estimators; an unbiased estimator with the least variance is known as an efficient estimator.

Therefore, in the regression context it can be proved that the OLS estimators are BLUE, which is the essence of the Gauss-Markov Theorem.
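The unbiasedness part of BLUE can be checked with a small Monte Carlo sketch: keeping X fixed in repeated samples (Assumption 2) and drawing fresh disturbances each time, the average of the OLS slope estimates should be close to the true β2. The true parameter values below are illustrative assumptions, and NumPy is assumed.

```python
# Minimal Monte Carlo sketch of unbiasedness of the OLS slope estimator.
import numpy as np

rng = np.random.default_rng(3)
beta1, beta2 = 2.0, 0.5
X = np.linspace(1.0, 20.0, 50)                # X fixed in repeated sampling
x = X - X.mean()

slopes = []
for _ in range(5000):
    u = rng.normal(0.0, 1.0, size=X.size)     # fresh disturbances each sample
    Y = beta1 + beta2 * X + u
    slopes.append(np.sum(x * (Y - Y.mean())) / np.sum(x ** 2))

print(np.mean(slopes))   # close to the true beta2 = 0.5
```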
The Coefficient of Determination, r²:
A Measure of "Goodness of Fit"

The coefficient of determination, r² (two-variable case) or R² (multiple regression), is a summary measure that tells how well the sample regression line fits the data.
The Ballentine View of R²

See Peter Kennedy, "Ballentine: A Graphical Aid for Econometrics", Australian Economic Papers, Vol. 20, 1981, pp. 414-416. The name Ballentine is derived from the emblem of the well-known Ballantine beer with its circles.
Coefficient of Determination, r²

TSS = ESS + RSS

where:
TSS = total sum of squares
ESS = explained sum of squares
RSS = residual sum of squares

If TSS = ESS + RSS, then:

1 = ESS/TSS + RSS/TSS
  = ∑(Ŷi − Ȳ)² / ∑(Yi − Ȳ)² + ∑ûi² / ∑(Yi − Ȳ)²
On r² more:

r² measures the proportion of the variation in Y explained by the regression model; therefore,

r² = ESS / TSS

And,

r² = ∑(Ŷi − Ȳ)² / ∑(Yi − Ȳ)² = ESS / TSS

Alternatively,

r² = 1 − ∑ûi² / ∑(Yi − Ȳ)²

r² = 1 − RSS / TSS
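Both expressions give the same number, and the decomposition TSS = ESS + RSS can be verified numerically. Below is a minimal Python sketch on a hypothetical sample (NumPy assumed).

```python
# Minimal numerical check of TSS = ESS + RSS and r^2 = ESS/TSS = 1 - RSS/TSS.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8, 12.1])

x = X - X.mean()
beta2_hat = np.sum(x * (Y - Y.mean())) / np.sum(x ** 2)
beta1_hat = Y.mean() - beta2_hat * X.mean()
Y_hat = beta1_hat + beta2_hat * X

TSS = np.sum((Y - Y.mean()) ** 2)       # total sum of squares
ESS = np.sum((Y_hat - Y.mean()) ** 2)   # explained sum of squares
RSS = np.sum((Y - Y_hat) ** 2)          # residual sum of squares

print(np.isclose(TSS, ESS + RSS))       # decomposition holds
print(ESS / TSS, 1 - RSS / TSS)         # both give the same r^2
```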
HW # 1:
Problem 3.20 (Chapter 3)

Consumer Prices and Money Supply in Japan

1982 to 2001
