ECON2280 Introductory Econometrics

Topic 2 Estimation

Yiming Cao

The University of Hong Kong

January 22, 2024

ECON2280: Introductory Econometrics Topic 2: Estimation Jan 22, 2024 1 / 38


Finding the “Best” Line to Fit the Data


Lecture Plan

Model Setup (25min)

Estimation Problem (15min)

The OLS Estimator with One Independent Variable (40min)

The OLS Estimator with Multiple Independent Variables (40min)


Model Simple Linear Regression

The Simple Regression Model


Model:
y = β0 + β1 x + u
y is called the dependent variable
Other names: endogenous variable, explained variable, response variable,
left-hand-side (LHS) variable, regressand
x is called the independent variable
Other names: exogenous variable, explanatory variable, right-hand-side (RHS)
variable, regressor
u is called the error term
Other names: disturbance, noise, unobservables
β0 and β1 are called the parameters of the model
The model is linear in β0 and β1

Parameters and Variables

Model:
y = β0 + β1 x + u

Parameters are fixed but unknown numbers in the model

Variables are observed numbers in the model

Errors are unobserved random numbers in the model

The goal of econometrics is to use the observed data to estimate the unknown
parameters


Interpretation of Parameters

Model:
y = β0 + β1 x + u
β1 is the slope of the regression line
β1 measures the marginal effect of x on y , holding other factors constant
$$\beta_1 = \frac{\Delta y}{\Delta x} \quad \text{if} \quad \frac{\Delta u}{\Delta x} = 0$$
β0 is the intercept of the regression line
β0 measures the average value of y when x = 0, assuming the errors have mean zero: E(u) = 0


Examples

Soybean yield and fertilizer use:

Yield = β0 + β1 Fertilizer + u

A simple wage equation:

Wage = β0 + β1 Educ + u


A Model Is A Data Generating Process (DGP)


The model is a data generating process (DGP) that generates the observed data
Also called a “population” model
For each observation i, we have: yi = β0 + β1 xi + ui
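
As a rough illustration of what "generating data from the model" means, the sketch below draws a sample from the simple regression DGP. The parameter values, the sample size, and the distributions of x and u are assumptions made up for this example; they do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the draw is reproducible

# Hypothetical "true" parameters, chosen only for illustration
beta0, beta1 = 1.0, 0.5
N = 200

x = rng.uniform(0, 10, size=N)   # observed regressor
u = rng.normal(0, 1, size=N)     # unobserved error with mean zero
y = beta0 + beta1 * x + u        # y_i = beta0 + beta1 * x_i + u_i

# In practice only (x, y) would be observed; beta0, beta1 and u would not
print(x[:3], y[:3])
```

The estimation problem is then to recover beta0 and beta1 from the pairs (x, y) alone.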


Why Called “Regression”?

The term “regression” was coined by Francis Galton in the 19th century
Galton was interested in the relationship between the height of fathers and sons
He found that the sons of tall fathers tend to be tall, but not as tall as their fathers
He also found that the sons of short fathers tend to be short, but not as short as
their fathers
He concluded that the sons “regress” toward the mean height of the population
This is the regression toward the mean phenomenon


“Regression to the Mean”

Why do kids of genius parents sometimes underperform?


Model Multiple Linear Regression

Multiple Regression Model


Model:
y = β0 + β1 x1 + β2 x2 + ... + βk xk + u

y is called the dependent variable
Other names: endogenous variable, explained variable, left-hand-side (LHS) variable
x1, x2, ..., xk are called the independent variables
Other names: explanatory variables, right-hand-side (RHS) variables, regressors
u is the error term
There are k + 1 unknown parameters: β0, β1, ..., βk
The model is linear in β0, β1, ..., βk



Why Multiple Regression?

Taking into account more factors that may affect y :

E.g., the wage equation:

Wage = β0 + β1 Education + β2 Experience + u


Explicitly holding constant (i.e., controlling for) other factors that may
affect y :

E.g., average test score and per student spending:

TestScore = β0 + β1 Spending + β2 AvgIncome + u

β1 measures the effect of spending on test scores, holding average family income constant

Otherwise, the effect of spending on test scores would be confounded by the effect
of family income on test scores
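
The confounding argument can be sketched with a small simulation. Everything here is hypothetical: the DGP, the coefficient values (a true spending effect of 3), and the helper `ols` are assumptions for illustration, not estimates from real school data.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5000

# Hypothetical DGP: richer districts spend more, and income also raises scores
avg_income = rng.normal(50, 10, size=N)
spending = 2.0 + 0.1 * avg_income + rng.normal(0, 1, size=N)
u = rng.normal(0, 5, size=N)
test_score = 600.0 + 3.0 * spending + 1.5 * avg_income + u   # true beta1 = 3

def ols(y, *regressors):
    """OLS coefficients (intercept first) computed by least squares."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

b_short = ols(test_score, spending)              # omits AvgIncome
b_long = ols(test_score, spending, avg_income)   # controls for AvgIncome

print("slope on Spending, no control:  ", b_short[1])
print("slope on Spending, with control:", b_long[1])
```

In this made-up setting the regression without AvgIncome attributes part of the income effect to spending, so its slope is far above 3, while the regression that controls for AvgIncome stays close to 3.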


Allow for more flexible functional forms:

E.g., family income and family consumption:

Consumption = β0 + β1 Income + β2 Income² + u

E.g., CEO salary, sales, and CEO tenure:

log(Salary) = β0 + β1 log(Sales) + β2 Tenure + β3 Tenure² + u

“Linear” regression means linear in the parameters, not necessarily linear in the
variables
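
To make "linear in the parameters" concrete, the sketch below fits a quadratic consumption function by ordinary least squares. The coefficient values and noise level are assumptions invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 1000

# Hypothetical consumption function that is quadratic in income
income = rng.uniform(1, 10, size=N)
consumption = 2.0 + 1.2 * income - 0.05 * income**2 + rng.normal(0, 0.2, size=N)

# Income^2 is just another column of X: the model stays linear in the betas
X = np.column_stack([np.ones(N), income, income**2])
beta_hat, *_ = np.linalg.lstsq(X, consumption, rcond=None)
print(beta_hat)   # estimates of (beta0, beta1, beta2)
```

The regressor is nonlinear in Income, yet the fitting step is the same linear least-squares problem as before.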



Methodology

The Estimation Problem

Model:
y = β0 + β1 x + u

Data generated from the model:



Estimate the model parameters β0 and β1 using the data:

yi = β̂0 + β̂1 xi + ûi = ŷi + ûi

β̂0 and β̂1 are called the estimators of β0 and β1

ŷi is called the fitted (or predicted) value of yi

ûi is called the residual

ŷ = β̂0 + β̂1 x is called the fitted regression line



A Visual Illustration


What Is A “Good” Fitting Line?


Estimation Methods

Least Squares: Minimize the sum of squared vertical distances between the data points and the fitted line.

Maximum Likelihood: Maximize the probability of observing the data.

Method of Moments: Match the moments implied by the model to the sample moments of the data.



Simple OLS Estimator

The OLS Estimator


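Model:
$$y = \beta_0 + \beta_1 x + u \tag{1}$$
Fitted value:
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \tag{2}$$
Residual:
$$\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \tag{3}$$
Sum of squared residuals:
$$\sum_{i=1}^{N} \hat{u}_i^2 = \sum_{i=1}^{N} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 \tag{4}$$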
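
The sum of squared residuals in Equation (4) can be evaluated for any candidate line. The sketch below does this for two candidates on a tiny made-up data set; the function name `ssr` and the numbers are assumptions for illustration.

```python
import numpy as np

def ssr(b0, b1, x, y):
    """Sum of squared residuals for the candidate line y = b0 + b1 * x."""
    residuals = y - b0 - b1 * x
    return float(np.sum(residuals**2))

# Tiny made-up data set (illustrative numbers only)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.0, 5.0, 6.0])

ssr_a = ssr(0.0, 1.0, x, y)   # candidate line y = x
ssr_b = ssr(0.5, 1.4, x, y)   # candidate line y = 0.5 + 1.4 x
print(ssr_a, ssr_b)
```

Under the least squares criterion the second candidate fits better, because its sum of squared residuals (about 0.2) is smaller than that of the first (10).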


Objective: find the β̂0 and β̂1 that minimize the sum of squared residuals:
$$\min_{\hat{\beta}_0, \hat{\beta}_1} \sum_{i=1}^{N} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 \tag{5}$$

FOC w.r.t. β̂0:
$$\sum_{i=1}^{N} -2 (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \tag{6}$$

FOC w.r.t. β̂1:
$$\sum_{i=1}^{N} -2 x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \tag{7}$$

Two linear equations in two unknowns, β̂0 and β̂1.

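Equation (6) implies:
$$\frac{1}{N} \sum_{i=1}^{N} y_i = \frac{1}{N} N \hat{\beta}_0 + \frac{1}{N} \hat{\beta}_1 \sum_{i=1}^{N} x_i \tag{8}$$

which can be written as:
$$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x} \tag{9}$$
where $\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$ and $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$.
Thus, we have:
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \tag{10}$$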


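Equation (7) implies:
$$\sum_{i=1}^{N} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \tag{11}$$

Substitute β̂0 = ȳ − β̂1 x̄ into Equation (11):
$$\sum_{i=1}^{N} x_i [y_i - (\bar{y} - \hat{\beta}_1 \bar{x}) - \hat{\beta}_1 x_i] = 0 \tag{12}$$

Rearrange:
$$\sum_{i=1}^{N} x_i (y_i - \bar{y}) = \hat{\beta}_1 \sum_{i=1}^{N} x_i (x_i - \bar{x}) \tag{13}$$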

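Note that:
$$\begin{aligned}
\sum_{i=1}^{N} x_i (x_i - \bar{x}) &= \sum_{i=1}^{N} [(x_i - \bar{x}) + \bar{x}] (x_i - \bar{x}) \\
&= \sum_{i=1}^{N} (x_i - \bar{x})^2 + \bar{x} \underbrace{\sum_{i=1}^{N} (x_i - \bar{x})}_{=0} \\
&= \sum_{i=1}^{N} (x_i - \bar{x})^2
\end{aligned} \tag{14}$$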

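Similarly:
$$\begin{aligned}
\sum_{i=1}^{N} x_i (y_i - \bar{y}) &= \sum_{i=1}^{N} [(x_i - \bar{x}) + \bar{x}] (y_i - \bar{y}) \\
&= \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) + \bar{x} \underbrace{\sum_{i=1}^{N} (y_i - \bar{y})}_{=0} \\
&= \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})
\end{aligned} \tag{15}$$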

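Combining Equations (13), (14), and (15), we have:
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \tag{16}$$

This is equivalent to:
$$\hat{\beta}_1 = \frac{\widehat{\mathrm{Cov}}(x, y)}{\widehat{\mathrm{Var}}(x)} \tag{17}$$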
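
As a minimal sketch, the closed-form expressions in Equations (10) and (16) can be coded directly and checked against the two first-order conditions. The function name `simple_ols` and the four data points are made up for this example; `np.polyfit` is used only as an independent cross-check.

```python
import numpy as np

def simple_ols(x, y):
    """Return (beta0_hat, beta1_hat) using Equations (10) and (16)."""
    x_bar, y_bar = x.mean(), y.mean()
    beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat

# Tiny made-up data set (illustrative numbers only)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.0, 5.0, 6.0])
b0, b1 = simple_ols(x, y)

# Residuals should satisfy both first-order conditions, (6) and (7)
u_hat = y - b0 - b1 * x
print(b0, b1)
print(np.sum(u_hat), np.sum(x * u_hat))   # both sums are zero up to rounding

# Cross-check against NumPy's degree-1 polynomial fit (slope comes first)
slope, intercept = np.polyfit(x, y, 1)
```

For these data the estimates are β̂0 = 0.5 and β̂1 = 1.4, up to floating-point rounding.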


The OLS Estimator and Correlation Coefficient

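The correlation coefficient between two variables x and y is defined as:
$$\rho_{xy} = \frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y} \tag{18}$$
where $\sigma_x = \sqrt{\mathrm{Var}(x)}$ and $\sigma_y = \sqrt{\mathrm{Var}(y)}$ are the standard deviations of x and y.
Thus, since $\widehat{\mathrm{Cov}}(x, y) = \hat{\rho}_{xy} \hat{\sigma}_x \hat{\sigma}_y$ and $\widehat{\mathrm{Var}}(x) = \hat{\sigma}_x^2$:
$$\hat{\beta}_1 = \hat{\rho}_{xy} \cdot \left( \frac{\hat{\sigma}_y}{\hat{\sigma}_x} \right) \tag{19}$$
β̂1 is the sample correlation between x and y scaled by the ratio of the sample
standard deviations of y and x.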
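
The equivalence between the covariance form and the correlation form of the slope can be checked numerically. This is only a sketch on simulated data from a hypothetical DGP; the one thing to watch is that the same degrees-of-freedom convention (`ddof=1` here) is used for every sample moment.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=500)
y = 1.0 + 2.0 * x + rng.normal(size=500)   # hypothetical DGP

# Slope as sample covariance over sample variance (same ddof in both)
beta1_cov = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Slope as sample correlation times the ratio of sample standard deviations
rho_hat = np.corrcoef(x, y)[0, 1]
beta1_corr = rho_hat * np.std(y, ddof=1) / np.std(x, ddof=1)

print(beta1_cov, beta1_corr)   # the two numbers agree up to rounding
```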



Multiple OLS Estimator

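The OLS Estimator with Two Independent Variables

Model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u \tag{20}$$
Fitted value:
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2} \tag{21}$$
Residual:
$$\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} \tag{22}$$
Sum of squared residuals:
$$\sum_{i=1}^{N} \hat{u}_i^2 = \sum_{i=1}^{N} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2})^2 \tag{23}$$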



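FOC w.r.t. β̂0:
$$\sum_{i=1}^{N} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2}) = 0 \tag{24}$$

FOC w.r.t. β̂1:
$$\sum_{i=1}^{N} x_{i1} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2}) = 0 \tag{25}$$

FOC w.r.t. β̂2:
$$\sum_{i=1}^{N} x_{i2} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2}) = 0 \tag{26}$$

Three linear equations in three unknowns: β̂0, β̂1, and β̂2.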
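
Because the three first-order conditions are linear in the unknowns, they can be stacked into a 3×3 linear system and solved numerically. The sketch below does this on simulated data; the DGP and its coefficients are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 300
x1 = rng.normal(size=N)
x2 = 0.5 * x1 + rng.normal(size=N)                    # regressors may be correlated
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=N)    # hypothetical DGP

# Coefficient matrix and right-hand side of the three normal equations,
# obtained by moving the beta terms of (24)-(26) to one side
A = np.array([
    [N,        x1.sum(),        x2.sum()],
    [x1.sum(), (x1 * x1).sum(), (x1 * x2).sum()],
    [x2.sum(), (x1 * x2).sum(), (x2 * x2).sum()],
])
b = np.array([y.sum(), (x1 * y).sum(), (x2 * y).sum()])

beta_hat = np.linalg.solve(A, b)    # (beta0_hat, beta1_hat, beta2_hat)
u_hat = y - beta_hat[0] - beta_hat[1] * x1 - beta_hat[2] * x2
print(beta_hat)
```

The resulting residuals sum to zero and are orthogonal to x1 and x2 up to rounding, which is exactly what the three conditions require.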



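The OLS Estimator with k Independent Variables

Model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u \tag{27}$$
Fitted value:
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2} + \dots + \hat{\beta}_k x_{ik} \tag{28}$$
Residual:
$$\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik} \tag{29}$$
Sum of squared residuals:
$$\sum_{i=1}^{N} \hat{u}_i^2 = \sum_{i=1}^{N} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik})^2 \tag{30}$$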



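FOCs:
$$\begin{cases}
\sum_{i=1}^{N} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) = 0 \\
\sum_{i=1}^{N} x_{i1} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) = 0 \\
\sum_{i=1}^{N} x_{i2} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) = 0 \\
\quad \vdots \\
\sum_{i=1}^{N} x_{ik} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) = 0
\end{cases} \tag{31}$$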

Matrix Representation of the OLS Estimator


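Denote:
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}, \quad
u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}, \quad
X = \begin{pmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{N1} & x_{N2} & \cdots & x_{Nk}
\end{pmatrix} \tag{32}$$

Then the model can be written as:
$$y = X\beta + u \tag{33}$$

Fitted values: ŷ = Xβ̂

Residuals: û = y − ŷ = y − Xβ̂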

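Minimize the sum of squared residuals:
$$\min_{\hat{\beta}} \hat{u}'\hat{u} = \min_{\hat{\beta}} (y - X\hat{\beta})'(y - X\hat{\beta}) \tag{34}$$

The first-order condition is −2X′(y − Xβ̂) = 0, that is, X′Xβ̂ = X′y. Provided X′X is invertible, the OLS estimator is:
$$\hat{\beta} = (X'X)^{-1} X'y \tag{35}$$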
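
A minimal NumPy sketch of the matrix formula follows. The simulated data and the coefficient vector `beta_true` are assumptions for illustration. Rather than forming the inverse explicitly, the code solves the linear system X′Xβ̂ = X′y, which yields the same β̂ as Equation (35) when X′X is invertible and is usually the numerically safer choice.

```python
import numpy as np

rng = np.random.default_rng(5)
N, k = 400, 3
beta_true = np.array([1.0, 0.5, -2.0, 0.3])   # hypothetical (beta0, ..., betak)

# Design matrix: a column of ones followed by k regressors
X = np.column_stack([np.ones(N), rng.normal(size=(N, k))])
y = X @ beta_true + rng.normal(size=N)

# Solve (X'X) beta_hat = X'y without forming the inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with NumPy's least-squares routine
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

u_hat = y - X @ beta_hat    # residuals; X'u_hat is zero up to rounding
print(beta_hat)
```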


Summary


Key Equations
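Simple regression model:
$$y = \beta_0 + \beta_1 x + u$$
Fitted value:
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$
Residual:
$$\hat{u}_i = y_i - \hat{y}_i$$
The simple OLS estimators:
$$\hat{\beta}_1 = \frac{\widehat{\mathrm{Cov}}(x, y)}{\widehat{\mathrm{Var}}(x)} = \hat{\rho}_{xy} \cdot \left( \frac{\hat{\sigma}_y}{\hat{\sigma}_x} \right)$$
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$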

Next Lecture

Topic 3 Interpretation of OLS Regressions.

Please read Chapters 2-1, 2-4, 2-7, 3-2 of the textbook.
