
1 Derivation of OLS estimators

1.1 Simple Regression: Two Variable Model

The Ordinary Least Squares (OLS) technique involves finding parameter estimates by minimizing the sum of square errors, or, what is the same thing, minimizing the sum of square residuals (SSR), $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$, where $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ is the fitted value of $Y_i$ corresponding to a particular observation $X_i$.

We minimize the SSR by taking the partial derivatives with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$, setting each equal to 0, and solving the resulting pair of simultaneous equations.

$$\frac{\partial}{\partial \hat{\beta}_0} \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2 = -2 \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) \tag{1}$$

$$\frac{\partial}{\partial \hat{\beta}_1} \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2 = -2 \sum_{i=1}^{n} X_i (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) \tag{2}$$
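As a sanity check on eqns. 1 and 2, the following minimal sketch compares the analytic partial derivatives with central finite differences of the SSR. The data set, the trial values 0.5 and 0.3, and the function names are hypothetical, chosen only for illustration.

```python
# Hypothetical toy data, chosen only for illustration (not from the text).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]


def ssr(b0, b1):
    """Sum of squared residuals for candidate estimates b0 and b1."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(X, Y))


def grad_analytic(b0, b1):
    """Right-hand sides of eqns. 1 and 2."""
    d_b0 = -2.0 * sum(y - b0 - b1 * x for x, y in zip(X, Y))
    d_b1 = -2.0 * sum(x * (y - b0 - b1 * x) for x, y in zip(X, Y))
    return d_b0, d_b1


def grad_numeric(b0, b1, h=1e-6):
    """Central finite-difference approximation to the same two derivatives."""
    d_b0 = (ssr(b0 + h, b1) - ssr(b0 - h, b1)) / (2.0 * h)
    d_b1 = (ssr(b0, b1 + h) - ssr(b0, b1 - h)) / (2.0 * h)
    return d_b0, d_b1


# Evaluate both versions at an arbitrary trial point (not the minimum).
analytic = grad_analytic(0.5, 0.3)
numeric = grad_numeric(0.5, 0.3)
print("analytic:", analytic)
print("numeric: ", numeric)
```

Because the SSR is quadratic in both parameters, the two results should agree up to floating-point rounding at any trial point.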

Equating these derivatives to zero and dividing by −2 we get
$$\sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0 \tag{3}$$

$$\sum_{i=1}^{n} X_i (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0 \tag{4}$$

Finally, rewriting eqns. 3 and 4 we obtain a pair of simultaneous equations (known as the normal equations):
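$$\sum_{i=1}^{n} Y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} X_i \tag{5}$$

$$\sum_{i=1}^{n} X_i Y_i = \hat{\beta}_0 \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 \tag{6}$$

Now we can solve for $\hat{\beta}_0$ and $\hat{\beta}_1$ simultaneously by multiplying eqn. 5 by $\sum_{i=1}^{n} X_i$ and multiplying eqn. 6 by $n$.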
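$$\sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i = n\hat{\beta}_0 \sum_{i=1}^{n} X_i + \hat{\beta}_1 \left( \sum_{i=1}^{n} X_i \right)^2 \tag{7}$$

$$n \sum_{i=1}^{n} X_i Y_i = n\hat{\beta}_0 \sum_{i=1}^{n} X_i + n\hat{\beta}_1 \sum_{i=1}^{n} X_i^2 \tag{8}$$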

Subtracting eqn. 7 from 8 we get:
$$n \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i = \hat{\beta}_1 \left[ n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2 \right]$$

from which it follows that

$$\hat{\beta}_1 = \frac{n \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2} \tag{9}$$
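The slope formula in eqn. 9 translates directly into code. The sketch below is illustrative only: the function name `ols_simple` and the data are hypothetical. It computes the slope from eqn. 9, recovers the intercept by rearranging normal equation 5, and then checks that the residuals satisfy eqns. 3 and 4.

```python
def ols_simple(X, Y):
    """Return (b0, b1): slope from eqn. 9, intercept from normal equation 5."""
    n = len(X)
    sum_x = sum(X)
    sum_y = sum(Y)
    sum_xy = sum(x * y for x, y in zip(X, Y))
    sum_xx = sum(x * x for x in X)

    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        # Happens when every X value is identical, so no slope can be estimated.
        raise ValueError("all X values are identical; slope is undefined")

    b1 = (n * sum_xy - sum_x * sum_y) / denom   # eqn. 9
    b0 = (sum_y - b1 * sum_x) / n               # eqn. 5 solved for the intercept
    return b0, b1


# Hypothetical toy data, chosen only for illustration (not from the text).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = ols_simple(X, Y)
residuals = [y - b0 - b1 * x for x, y in zip(X, Y)]

print("b0 =", b0, "b1 =", b1)
# Eqns. 3 and 4: both sums should be zero up to rounding.
print("sum of residuals:      ", sum(residuals))
print("sum of X_i * residual: ", sum(x * r for x, r in zip(X, residuals)))
```

The guard on the denominator reflects the fact that eqn. 9 is undefined when the regressor has no variation in the sample.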


Multiplying the numerator and denominator of eqn. 9 by $1/n^2$ gives the OLS expression for $\hat{\beta}_1$ corresponding to the text:

$$\hat{\beta}_1 = \frac{\frac{1}{n} \sum_{i=1}^{n} X_i Y_i - \bar{X}\bar{Y}}{\frac{1}{n} \sum_{i=1}^{n} X_i^2 - \bar{X}^2} \tag{10}$$

Given $\hat{\beta}_1$, we can calculate $\hat{\beta}_0$ from eqn. 5 of the normal equations:

$$\hat{\beta}_0 = \frac{\sum_{i=1}^{n} Y_i}{n} - \hat{\beta}_1 \frac{\sum_{i=1}^{n} X_i}{n} = \bar{Y} - \hat{\beta}_1 \bar{X} \tag{11}$$

Note, in deviation form, where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$, the OLS estimator $\hat{\beta}_1$ becomes

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} \tag{12}$$

Hence, the above two equations help us to find the OLS estimates of $\hat{\beta}_0$ and $\hat{\beta}_1$, respectively (note: when doing the calculations, find $\hat{\beta}_1$ first).

1.2 Multiple Regression: Three-variable Model

The goal is to find parameter estimates by minimizing the sum of square errors, i.e., minimize $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$, where $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i}$. We can do this by calculating the partial derivatives with respect to the three unknown parameters $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{\beta}_2$, equating each to zero, and solving, as was done with the simple regression model above. The normal equations then become:

$$n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} X_{1i} + \hat{\beta}_2 \sum_{i=1}^{n} X_{2i} = \sum_{i=1}^{n} Y_i$$

$$\hat{\beta}_0 \sum_{i=1}^{n} X_{1i} + \hat{\beta}_1 \sum_{i=1}^{n} X_{1i}^2 + \hat{\beta}_2 \sum_{i=1}^{n} X_{1i} X_{2i} = \sum_{i=1}^{n} X_{1i} Y_i$$

$$\hat{\beta}_0 \sum_{i=1}^{n} X_{2i} + \hat{\beta}_1 \sum_{i=1}^{n} X_{1i} X_{2i} + \hat{\beta}_2 \sum_{i=1}^{n} X_{2i}^2 = \sum_{i=1}^{n} X_{2i} Y_i$$

which can be easily solved using Cramer's rule or matrix algebra to find the formula for the parameter estimates.

An alternative approach is to begin by expressing all the data in the form of deviations from the sample means. The least-squares equation (for the three-variable regression model) is

$$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + e_i$$

Averaging over the sample observations gives

$$\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}_1 + \hat{\beta}_2 \bar{X}_2$$

which gives no term in $e$, since $\bar{e}$ is zero. Now, subtracting the second equation from the first gives us the deviation form:

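$$y_i = \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + e_i$$

where lowercase letters denote deviations from the sample means. Note the intercept $\hat{\beta}_0$ disappears from the deviation form of the equation, but it may be recovered from

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}_1 - \hat{\beta}_2 \bar{X}_2 \tag{13}$$

So, to minimize

$$SSR = \sum_{i=1}^{n} (y_i - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i})^2$$

we need to solve

$$\frac{\partial}{\partial \hat{\beta}_1} \sum_{i=1}^{n} (y_i - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i})^2 = 0$$

$$\frac{\partial}{\partial \hat{\beta}_2} \sum_{i=1}^{n} (y_i - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i})^2 = 0$$

which give, respectively,

$$\sum_{i=1}^{n} x_{1i} y_i = \hat{\beta}_1 \sum_{i=1}^{n} x_{1i}^2 + \hat{\beta}_2 \sum_{i=1}^{n} x_{1i} x_{2i} \tag{14}$$

$$\sum_{i=1}^{n} x_{2i} y_i = \hat{\beta}_1 \sum_{i=1}^{n} x_{1i} x_{2i} + \hat{\beta}_2 \sum_{i=1}^{n} x_{2i}^2 \tag{15}$$

To solve this, we can multiply eqn. 14 by $\sum_{i=1}^{n} x_{2i}^2$ and multiply eqn. 15 by $\sum_{i=1}^{n} x_{1i} x_{2i}$, and then subtract the latter from the former to get

$$\sum_{i=1}^{n} x_{1i} y_i \sum_{i=1}^{n} x_{2i}^2 - \sum_{i=1}^{n} x_{2i} y_i \sum_{i=1}^{n} x_{1i} x_{2i} = \hat{\beta}_1 \left[ \sum_{i=1}^{n} x_{1i}^2 \sum_{i=1}^{n} x_{2i}^2 - \left( \sum_{i=1}^{n} x_{1i} x_{2i} \right)^2 \right]$$

or

$$\hat{\beta}_1 = \frac{\left( \sum_{i=1}^{n} x_{1i} y_i \right) \left( \sum_{i=1}^{n} x_{2i}^2 \right) - \left( \sum_{i=1}^{n} x_{2i} y_i \right) \left( \sum_{i=1}^{n} x_{1i} x_{2i} \right)}{\left( \sum_{i=1}^{n} x_{1i}^2 \right) \left( \sum_{i=1}^{n} x_{2i}^2 \right) - \left( \sum_{i=1}^{n} x_{1i} x_{2i} \right)^2} \tag{16}$$

It follows that

$$\hat{\beta}_2 = \frac{\left( \sum_{i=1}^{n} x_{2i} y_i \right) \left( \sum_{i=1}^{n} x_{1i}^2 \right) - \left( \sum_{i=1}^{n} x_{1i} y_i \right) \left( \sum_{i=1}^{n} x_{1i} x_{2i} \right)}{\left( \sum_{i=1}^{n} x_{1i}^2 \right) \left( \sum_{i=1}^{n} x_{2i}^2 \right) - \left( \sum_{i=1}^{n} x_{1i} x_{2i} \right)^2} \tag{17}$$

Hence, equations 16, 17 and 13 help us to find the OLS estimates of $\hat{\beta}_1$, $\hat{\beta}_2$, and $\hat{\beta}_0$, respectively (note: when doing the calculations, find $\hat{\beta}_0$ last).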
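The deviation-form solution can be sketched in a few lines of code. The example below is a minimal illustration under stated assumptions: the function name `ols_three_variable` and the data are hypothetical. It applies eqns. 16 and 17 to the demeaned data, recovers the intercept from eqn. 13, and then checks the result against the three normal equations written in the original (non-demeaned) variables.

```python
def ols_three_variable(X1, X2, Y):
    """Return (b0, b1, b2) using eqns. 16, 17 and 13."""
    n = len(Y)
    mean_x1 = sum(X1) / n
    mean_x2 = sum(X2) / n
    mean_y = sum(Y) / n

    # Deviations from the sample means (the lowercase variables).
    x1 = [v - mean_x1 for v in X1]
    x2 = [v - mean_x2 for v in X2]
    y = [v - mean_y for v in Y]

    s11 = sum(a * a for a in x1)
    s22 = sum(a * a for a in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * b for a, b in zip(x1, y))
    s2y = sum(a * b for a, b in zip(x2, y))

    denom = s11 * s22 - s12 ** 2
    if denom == 0:
        # The regressors are perfectly collinear, or one of them is constant.
        raise ValueError("denominator of eqns. 16 and 17 is zero")

    b1 = (s1y * s22 - s2y * s12) / denom        # eqn. 16
    b2 = (s2y * s11 - s1y * s12) / denom        # eqn. 17
    b0 = mean_y - b1 * mean_x1 - b2 * mean_x2   # eqn. 13, computed last
    return b0, b1, b2


# Hypothetical toy data, chosen only for illustration (not from the text).
X1 = [1.0, 2.0, 3.0, 4.0, 5.0]
X2 = [2.0, 1.0, 4.0, 3.0, 6.0]
Y = [9.5, 7.5, 19.0, 18.5, 28.5]

b0, b1, b2 = ols_three_variable(X1, X2, Y)
resid = [yy - b0 - b1 * a - b2 * b for a, b, yy in zip(X1, X2, Y)]

print("b0, b1, b2 =", b0, b1, b2)
# The three normal equations are equivalent to these sums being zero.
print(sum(resid))
print(sum(a * r for a, r in zip(X1, resid)))
print(sum(b * r for b, r in zip(X2, resid)))
```

The zero-denominator check mirrors the shared denominator of eqns. 16 and 17, which vanishes when the two regressors are perfectly collinear in the sample.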