OLS Derivation PDF
Teresa Randazzo
(Introduction to Econometrics) 1
Today
Basic ingredients of regression analysis
The Simple Linear Regression Model
Issues:
1. How do we allow factors other than x to affect y? There is never an exact relationship between two variables!
2. What is the functional form of the relationship between y and x?
3. How can we be sure we are capturing a ceteris paribus relationship between y and x?
Basic ingredients of regression analysis
Suppose you want to know the mean earnings of women who recently graduated from college (µY).
- Unbiasedness: E(µ̂Y) = µY
- Consistency: Ȳ →p µY (Ȳ converges in probability to µY)
- Law of large numbers: an estimator is consistent if the probability that it falls within an interval of the true population value tends to one as the sample size increases.
Basic ingredients of regression analysis
Linear regression
What relationship exists between wage and a number of background characteristics (e.g. gender, age, education)?

y = f(x1, x2, ..., xk)

yi = xi′β + εi

where
- yi is the endogenous variable observed on unit i or at time i
- xi is a k × 1 vector of explanatory variables observed for unit i or at time i (gender, years of education, age, ...)
- β is a k × 1 vector of associated (slope) parameters
- εi is an unobservable disturbance term relative to unit i or time i
Ordinary Least Squares (OLS)
- Given a sample of N observations, we are interested in finding which linear combination of x1, ..., xk and a constant gives a good approximation of y
- Clearly, we would like to choose values for β1, ..., βk such that the differences yi − xi′β are small
- The most common approach is to choose β such that the sum of squared differences is as small as possible
- We determine β̂ by minimizing the following objective function:

S(β) = Σ_{i=1}^N (yi − xi′β)² = Σ_{i=1}^N ei²
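The objective function can be sketched numerically. A minimal illustration in Python, where the sample, the seed, and the coefficient values are made-up assumptions rather than anything from the lecture:

```python
import numpy as np

# Made-up sample: a constant plus one regressor (purely illustrative).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(scale=0.1, size=50)

def S(beta, X, y):
    """Sum of squared residuals: S(beta) = sum_i (y_i - x_i' beta)^2."""
    e = y - X @ beta
    return float(e @ e)

# S(beta) is smaller at the coefficients that generated the data
# than at an arbitrary guess such as beta = 0.
print(S(beta_true, X, y) < S(np.zeros(2), X, y))  # True
```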
Ordinary Least Squares (OLS)
Simple Linear Regression
Ordinary Least Squares (OLS)
Simple Linear Regression
- In the simplest case we have just one regressor and a constant
- Given the following Simple Linear Regression model:

y = β0 + β1 x1 + ε

- We want to know how y changes when x changes, holding the other factors in ε fixed
- Holding ε fixed means ∆ε = 0, so that:

∆y = β1 ∆x + ∆ε = β1 ∆x when ∆ε = 0

- We therefore have β1 = ∆y/∆x ⇒ β1 measures by how much y changes if x is increased by one unit, holding ε fixed
- Linearity implies that a one-unit change in x has the same effect on y, regardless of the initial value of x
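The constant-marginal-effect point can be checked directly; β0 = 3 and β1 = 2 below are made-up illustrative values, not from the slides:

```python
# Linearity check: with illustrative values beta0 = 3, beta1 = 2, a one-unit
# increase in x always shifts y by beta1, whatever the starting value of x.
beta0, beta1 = 3.0, 2.0

def y_of(x):
    return beta0 + beta1 * x

deltas = [y_of(x + 1) - y_of(x) for x in (0.0, 5.0, -10.0)]
print(deltas)  # [2.0, 2.0, 2.0]
```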
Ordinary Least Squares (OLS)
Simple Linear Regression
Examples
- #1: yield and fertilizer:

yield = β0 + β1 fertilizer + ε

Figure: Savings and income for 15 families, and the PRF E(savings|income)
Ordinary Least Squares (OLS)
Simple Linear Regression: Minimizing the Sum of Squared Residuals

yi = β0 + β1 x1,i + εi
Ordinary Least Squares (OLS)
Simple Linear Regression: Method of Moments

β̂0 = b0 = ȳ − β̂1 x̄

β̂1 = b1 = Σ_{i=1}^N (yi − ȳ)(xi − x̄) / Σ_{i=1}^N (xi − x̄)² = Cov(x, y) / Var(x)
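These two formulas can be verified on a small sample; the numbers below are made up for illustration:

```python
import numpy as np

# Illustrative data, roughly y = 2x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.9])

# Slope and intercept from the method-of-moments formulas.
b1 = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Equivalent form: sample covariance over sample variance.
b1_alt = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

print(round(b1, 2), round(b0, 2))  # 1.98 0.1
```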
Ordinary Least Squares (OLS)
Simple Linear Regression
Given the estimated intercept (β̂0) and slope (β̂1), the fitted values and residuals are:

ŷi = β̂0 + β̂1 xi

ε̂i = yi − ŷi
OLS in Matrix form
Bivariate case

yi = β1 + β2 xi2 + εi

[ y1 ]   [ 1  x12 ]            [ ε1 ]
[ y2 ] = [ 1  x22 ]  [ β1 ]  + [ ε2 ]
[ ⋮  ]   [ ⋮   ⋮  ]  [ β2 ]    [ ⋮  ]
[ yn ]   [ 1  xn2 ]            [ εn ]

[ y1 ]   [ x1′ ]            [ ε1 ]
[ y2 ] = [ x2′ ]  [ β1 ]  + [ ε2 ]
[ ⋮  ]   [ ⋮   ]  [ β2 ]    [ ⋮  ]
[ yn ]   [ xn′ ]            [ εn ]
Multivariate Regression Model
- Generally, we can write a model with two explanatory variables as:

y = β0 + β1 x1 + β2 x2 + ε

where β0 is the intercept, β1 measures the change in y with respect to x1, holding other factors fixed, and β2 measures the change in y with respect to x2, holding other factors fixed.
- In the model with two explanatory variables, the key assumption about how ε is related to x1 and x2 is:

E(ε|x1, x2) = 0

- For any values of x1 and x2 in the population, the average unobservable is equal to zero
- In the wage equation, the assumption E(ε|educ, exper) = 0 implies that other factors affecting wage are unrelated, on average, to educ and exper
Multivariate Regression Model

y = β0 + β1 x1 + β2 x2 + ... + βk xk + ε

where β0 is the intercept, β1 is the parameter associated with x1, β2 is the parameter associated with x2, and so on.
- The MLRM contains k + 1 (unknown) population parameters. We call β1, ..., βk the slope parameters.
- The error term ε contains factors other than x1, x2, ..., xk that affect y
Multivariate Regression Model

E(ε|x1, ..., xk) = 0

- At a minimum, this assumption requires that all factors in ε are uncorrelated with the explanatory variables
- We can make this condition closer to being true by controlling for more variables
Multivariate Regression Model
- Suppose we have x1, x2, ..., xk (k regressors) along with y. We want to fit an equation of the form:

ŷ = β̂0 + β̂1 x1 + β̂2 x2 + ... + β̂k xk
Multivariate Regression Model
- We can use multivariate calculus. The OLS first order conditions are the k + 1 linear equations in the k + 1 unknowns β̂0, β̂1, ..., β̂k:

Σ_{i=1}^n (yi − β̂0 − β̂1 xi1 − ... − β̂k xik) = 0
Σ_{i=1}^n xi1 (yi − β̂0 − β̂1 xi1 − ... − β̂k xik) = 0
Σ_{i=1}^n xi2 (yi − β̂0 − β̂1 xi1 − ... − β̂k xik) = 0
⋮
Σ_{i=1}^n xik (yi − β̂0 − β̂1 xi1 − ... − β̂k xik) = 0

- The OLS regression line is written as:

ŷ = β̂0 + β̂1 x1 + β̂2 x2 + ... + β̂k xk
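The first order conditions say exactly that the OLS residuals are orthogonal to every regressor (the constant column makes them sum to zero). A quick numerical check, where the simulated data and coefficient values are assumptions for illustration:

```python
import numpy as np

# Simulated data with a constant and k = 2 regressors (illustrative values).
rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=n)

# OLS fit via least squares.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# All k + 1 first order conditions hold: X' resid = 0 (up to rounding).
print(np.allclose(X.T @ resid, 0.0, atol=1e-8))  # True
```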
OLS in Matrix form
Multivariate case

[ y1 ]   [ 1  x12  ...  x1k ] [ β1 ]   [ ε1 ]
[ y2 ] = [ 1  x22  ...  x2k ] [ β2 ] + [ ε2 ]
[ ⋮  ]   [ ⋮   ⋮    ⋱    ⋮  ] [ ⋮  ]   [ ⋮  ]
[ yn ]   [ 1  xn2  ...  xnk ] [ βk ]   [ εn ]

  Y   =    X      β    +    ε
(n×1)    (n×k)  (k×1)     (n×1)
OLS in Matrix form

S(β) = Σ_{i=1}^N (yi − xi′β)²

min_β S(β)   ⇒   FOC: ∂S(β)/∂β = 0
OLS in Matrix Form
- Notice that ∂S(β)/∂β is a k × 1 vector, i.e.

∂S(β)/∂β = [ ∂S(β)/∂β1, ∂S(β)/∂β2, ..., ∂S(β)/∂βk ]′

- Setting this derivative to zero, ∂S(β)/∂β = −2 X′(y − Xβ) = 0, gives the normal equations X′X β̂ = X′y, and hence

β̂ = (X′X)⁻¹ X′y
OLS in Matrix Form
- Therefore we obtain

β̂ = (X′X)⁻¹ X′y = ( Σ_{i=1}^N xi xi′ )⁻¹ ( Σ_{i=1}^N xi yi )
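Both forms of β̂ give the same numbers. A sketch with simulated data (the shapes, seed, and coefficients are arbitrary assumptions):

```python
import numpy as np

# Simulated design matrix with a constant column (illustrative).
rng = np.random.default_rng(2)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# Matrix form: solve (X'X) beta = X'y rather than inverting explicitly.
beta_matrix = np.linalg.solve(X.T @ X, X.T @ y)

# Summation form: (sum_i x_i x_i')^{-1} (sum_i x_i y_i).
sum_xx = sum(np.outer(xi, xi) for xi in X)
sum_xy = sum(xi * yi for xi, yi in zip(X, y))
beta_sum = np.linalg.solve(sum_xx, sum_xy)

print(np.allclose(beta_matrix, beta_sum))  # True
```

Solving the linear system with `np.linalg.solve` is numerically preferable to forming `(X'X)⁻¹` explicitly, though both implement the same formula.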
Formal link between the two representations

X′y = Σ_{i=1}^N xi yi

Σ_{i=1}^N εi² = Σ_{i=1}^N (yi − xi′β)² = (y − Xβ)′(y − Xβ) = ε′ε

- These equations link the two (different but perfectly equivalent) representations.
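These identities are easy to confirm numerically; the random data and the chosen β below are arbitrary (β here need not be the OLS estimate, since the identities hold for any β):

```python
import numpy as np

# Arbitrary data and an arbitrary (not estimated) beta, for the identities only.
rng = np.random.default_rng(3)
X = rng.normal(size=(10, 2))
y = rng.normal(size=10)
beta = np.array([0.5, -1.0])

e = y - X @ beta

# X'y equals the sum over observations of x_i * y_i.
print(np.allclose(X.T @ y, sum(xi * yi for xi, yi in zip(X, y))))  # True

# (y - X beta)'(y - X beta) equals the scalar sum of squared residuals.
print(np.isclose(e @ e, sum(ei ** 2 for ei in e)))  # True
```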
Variance-Covariance Matrix
- The crucial object in the matrix representation y = Xβ + ε is the variance-covariance matrix of the disturbance vector, E(εε′):
          [ E(ε1²)   E(ε1ε2)  ...  E(ε1εn) ]
E(εε′) =  [ E(ε2ε1)  E(ε2²)   ...  E(ε2εn) ]
          [   ⋮        ⋮       ⋱     ⋮     ]
          [ E(εnε1)  E(εnε2)  ...  E(εn²)  ]

          [ σ1²  σ12  ...  σ1n ]
       =  [ σ21  σ2²  ...  σ2n ]
          [  ⋮    ⋮    ⋱    ⋮  ]
          [ σn1  σn2  ...  σn² ]
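Under the classical assumption of homoskedastic, uncorrelated errors, this matrix reduces to σ²Iₙ, a special case of the general form above. A Monte Carlo sketch (σ = 2, the dimension, and the number of replications are made-up choices):

```python
import numpy as np

# Draw many iid error vectors with sigma = 2 and average eps * eps'
# to approximate E(eps eps').
rng = np.random.default_rng(4)
sigma, n, reps = 2.0, 4, 200_000
eps = rng.normal(scale=sigma, size=(reps, n))

V = (eps[:, :, None] * eps[:, None, :]).mean(axis=0)  # Monte Carlo E(eps eps')
print(np.round(V, 1))  # approximately sigma^2 * I_4
```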