2.8 Matrix approach to simple linear regression

In this section we will briefly discuss a matrix approach to fitting simple linear
regression models. A random sample of size n gives n equations. For the full
SLRM we have
\[
\begin{aligned}
Y_1 &= \beta_0 + \beta_1 x_1 + \varepsilon_1 \\
Y_2 &= \beta_0 + \beta_1 x_2 + \varepsilon_2 \\
&\;\;\vdots \\
Y_n &= \beta_0 + \beta_1 x_n + \varepsilon_n
\end{aligned}
\]
We can write this in matrix formulation as
\[
Y = X\beta + \varepsilon, \tag{2.22}
\]
where Y is an (n × 1) vector of response variables (random sample), X is an (n × 2) matrix called the design matrix, β is a (2 × 1) vector of unknown parameters and ε is an (n × 1) vector of random errors. That is,
\[
Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \qquad
X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \qquad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \qquad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}.
\]
The assumptions about the random errors let us write
\[
\varepsilon \sim N_n\!\left(0, \sigma^2 I\right),
\]
that is, the vector ε has an n-dimensional normal distribution with mean
\[
E(\varepsilon) = E\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}
= \begin{pmatrix} E(\varepsilon_1) \\ E(\varepsilon_2) \\ \vdots \\ E(\varepsilon_n) \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = 0
\]
and the variance-covariance matrix
\[
\operatorname{Var}(\varepsilon) =
\begin{pmatrix}
\operatorname{var}(\varepsilon_1) & \operatorname{cov}(\varepsilon_1, \varepsilon_2) & \cdots & \operatorname{cov}(\varepsilon_1, \varepsilon_n) \\
\operatorname{cov}(\varepsilon_2, \varepsilon_1) & \operatorname{var}(\varepsilon_2) & \cdots & \operatorname{cov}(\varepsilon_2, \varepsilon_n) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{cov}(\varepsilon_n, \varepsilon_1) & \operatorname{cov}(\varepsilon_n, \varepsilon_2) & \cdots & \operatorname{var}(\varepsilon_n)
\end{pmatrix}
=
\begin{pmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{pmatrix}
= \sigma^2 I.
\]

This formulation is usually called the Linear Model (in β). All the models we
have considered so far can be written in this general form. The dimensions of the matrix X and of the vector β depend on the number p of parameters in the model; they are n × p and p × 1, respectively. In the full SLRM we have p = 2.
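As a quick illustration, here is a minimal simulation of the full SLRM in this matrix form; NumPy is assumed, and the covariate values, coefficients and error standard deviation are invented purely for the example.

```python
# A minimal sketch of Y = X beta + epsilon with illustrative values (not from the text).
import numpy as np

rng = np.random.default_rng(0)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # made-up covariate values
n = len(x)
beta = np.array([2.0, 0.5])               # (beta0, beta1), chosen for illustration
sigma = 0.3

X = np.column_stack([np.ones(n), x])      # (n x 2) design matrix: a column of 1's and the x's
eps = rng.normal(0.0, sigma, size=n)      # errors ~ N_n(0, sigma^2 I): independent, equal variance
Y = X @ beta + eps                        # (n x 1) response vector

print(X)
print(Y)
```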

The null model (p = 1)

Yi = β0 + εi for i = 1, . . . , n

is equivalent to
Y = 1β0 + ε
where 1 is an (n × 1) vector of 1’s.

The no-intercept model (p = 1)

Yi = β1 xi + εi for i = 1, . . . , n

can be written in matrix notation with
\[
X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad
\beta = \left(\beta_1\right).
\]

Quadratic regression (p = 3)

Yi = β0 + β1 xi + β2 xi² + εi for i = 1, . . . , n

can be written in matrix notation with


 
\[
X = \begin{pmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ \vdots & \vdots & \vdots \\ 1 & x_n & x_n^2 \end{pmatrix}, \qquad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}.
\]
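For illustration, the sketch below (NumPy again, with the same made-up x values) builds the design matrices for the null, no-intercept and quadratic models; only the columns of X change between models.

```python
# Design matrices for the reduced and quadratic models (illustrative sketch).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = len(x)

X_null = np.ones((n, 1))                          # null model: X = 1, p = 1
X_noint = x.reshape(-1, 1)                        # no-intercept model: X = (x_i), p = 1
X_quad = np.column_stack([np.ones(n), x, x**2])   # quadratic regression: columns 1, x, x^2, p = 3

print(X_null.shape, X_noint.shape, X_quad.shape)  # (n, p) in each case
```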


The normal equations obtained in the least squares method are given by
\[
X^T Y = X^T X \hat{\beta}.
\]

It follows that so long as $X^T X$ is invertible, i.e., its determinant is non-zero, the unique solution to the normal equations is given by
\[
\hat{\beta} = (X^T X)^{-1} X^T Y.
\]
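A minimal numerical sketch of this formula is given below; it reuses the invented toy data from the earlier sketch and compares the normal-equations solution with NumPy's least squares routine.

```python
# Sketch: solving the normal equations X^T Y = X^T X beta_hat numerically (toy data).
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = X @ np.array([2.0, 0.5]) + rng.normal(0.0, 0.3, size=len(x))

# beta_hat = (X^T X)^{-1} X^T Y; solve() avoids forming the inverse explicitly
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# np.linalg.lstsq solves the same least squares problem and should agree
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat, beta_lstsq)
```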
This is a common formula for all linear models where $X^T X$ is invertible. For the full simple linear regression model we have
 
\[
X^T Y = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}
= \begin{pmatrix} \sum Y_i \\ \sum x_i Y_i \end{pmatrix}
= \begin{pmatrix} n\bar{Y} \\ \sum x_i Y_i \end{pmatrix}
\]
and
\[
X^T X = \begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix}
= \begin{pmatrix} n & n\bar{x} \\ n\bar{x} & \sum x_i^2 \end{pmatrix}.
\]
The determinant of $X^T X$ is given by
\[
|X^T X| = n\sum x_i^2 - (n\bar{x})^2 = n\left(\sum x_i^2 - n\bar{x}^2\right) = nS_{xx}.
\]

Hence, the inverse of $X^T X$ is
\[
(X^T X)^{-1} = \frac{1}{nS_{xx}}\begin{pmatrix} \sum x_i^2 & -n\bar{x} \\ -n\bar{x} & n \end{pmatrix}
= \frac{1}{S_{xx}}\begin{pmatrix} \frac{1}{n}\sum x_i^2 & -\bar{x} \\ -\bar{x} & 1 \end{pmatrix}.
\]
So the solution to the normal equations is given by
\[
\begin{aligned}
\hat{\beta} &= (X^T X)^{-1} X^T Y \\
&= \frac{1}{S_{xx}}\begin{pmatrix} \frac{1}{n}\sum x_i^2 & -\bar{x} \\ -\bar{x} & 1 \end{pmatrix}
\begin{pmatrix} n\bar{Y} \\ \sum x_i Y_i \end{pmatrix} \\
&= \frac{1}{S_{xx}}\begin{pmatrix} \bar{Y}\sum x_i^2 - \bar{x}\sum x_i Y_i \\ \sum x_i Y_i - n\bar{x}\bar{Y} \end{pmatrix} \\
&= \frac{1}{S_{xx}}\begin{pmatrix} \bar{Y}\sum x_i^2 - n\bar{x}^2\bar{Y} + n\bar{x}^2\bar{Y} - \bar{x}\sum x_i Y_i \\ S_{xY} \end{pmatrix} \\
&= \frac{1}{S_{xx}}\begin{pmatrix} \bar{Y}\left(\sum x_i^2 - n\bar{x}^2\right) - \bar{x}\left(\sum x_i Y_i - n\bar{x}\bar{Y}\right) \\ S_{xY} \end{pmatrix} \\
&= \frac{1}{S_{xx}}\begin{pmatrix} \bar{Y}S_{xx} - \bar{x}S_{xY} \\ S_{xY} \end{pmatrix} \\
&= \begin{pmatrix} \bar{Y} - \hat{\beta}_1\bar{x} \\ \hat{\beta}_1 \end{pmatrix},
\end{aligned}
\]

which is the same result as we obtained before.
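As a quick check, the following sketch (same toy data as above) confirms numerically that the matrix solution coincides with the scalar formulas $\hat{\beta}_1 = S_{xY}/S_{xx}$ and $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{x}$.

```python
# Sketch: matrix solution vs. the scalar S_xx / S_xY formulas (toy data only).
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = X @ np.array([2.0, 0.5]) + rng.normal(0.0, 0.3, size=len(x))

Sxx = np.sum((x - x.mean())**2)
SxY = np.sum((x - x.mean()) * (Y - Y.mean()))
b1 = SxY / Sxx
b0 = Y.mean() - b1 * x.mean()

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(np.allclose(beta_hat, [b0, b1]))   # expect True
```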

Note:
Let A and B be a vector and a matrix of real constants and let Z be a vector of
random variables, all of appropriate dimensions so that the addition and multipli-
cation are possible. Then

\[
E(A + BZ) = A + B\,E(Z), \qquad
\operatorname{Var}(A + BZ) = \operatorname{Var}(BZ) = B\operatorname{Var}(Z)B^T.
\]

In particular,
\[
E(Y) = E(X\beta + \varepsilon) = X\beta, \qquad
\operatorname{Var}(Y) = \operatorname{Var}(X\beta + \varepsilon) = \operatorname{Var}(\varepsilon) = \sigma^2 I.
\]
These equalities let us prove the following theorem.
Theorem 2.7. The least squares estimator $\hat{\beta}$ of $\beta$ is unbiased and its variance-covariance matrix is
\[
\operatorname{Var}(\hat{\beta}) = \sigma^2 (X^T X)^{-1}.
\]

Proof. First we will show that $\hat{\beta}$ is unbiased. Here we have
\[
E(\hat{\beta}) = E\{(X^T X)^{-1} X^T Y\} = (X^T X)^{-1} X^T E(Y)
= (X^T X)^{-1} X^T X\beta = I\beta = \beta.
\]

Now, we will show the result for the variance-covariance matrix.

\[
\begin{aligned}
\operatorname{Var}(\hat{\beta}) &= \operatorname{Var}\{(X^T X)^{-1} X^T Y\} \\
&= (X^T X)^{-1} X^T \operatorname{Var}(Y)\, X (X^T X)^{-1} \\
&= \sigma^2 (X^T X)^{-1} X^T I X (X^T X)^{-1} = \sigma^2 (X^T X)^{-1}.
\end{aligned}
\]
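A small Monte Carlo sketch of Theorem 2.7 is shown below; the design, parameters and number of replications are invented for illustration only.

```python
# Sketch: empirical check that E(beta_hat) = beta and Var(beta_hat) = sigma^2 (X^T X)^{-1}.
import numpy as np

rng = np.random.default_rng(1)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
beta, sigma, reps = np.array([2.0, 0.5]), 0.3, 20000

XtX_inv = np.linalg.inv(X.T @ X)
estimates = np.empty((reps, 2))
for r in range(reps):
    Y = X @ beta + rng.normal(0.0, sigma, size=len(x))
    estimates[r] = XtX_inv @ X.T @ Y

print(estimates.mean(axis=0))   # close to beta (unbiasedness)
print(np.cov(estimates.T))      # close to sigma^2 * (X^T X)^{-1}
print(sigma**2 * XtX_inv)
```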


We denote the vector of residuals as
\[
e = Y - \hat{Y},
\]
where $\hat{Y} = \widehat{E(Y)} = X\hat{\beta}$ is the vector of fitted responses $\hat{\mu}_i$. It can be shown that the following theorem holds.

Theorem 2.8. The $n \times 1$ vector of residuals $e$ has mean
\[
E(e) = 0
\]
and variance-covariance matrix
\[
\operatorname{Var}(e) = \sigma^2\left(I - X(X^T X)^{-1} X^T\right).
\]

Hence, the variance of the residual $e_i$ is
\[
\operatorname{var}[e_i] = \sigma^2(1 - h_{ii}),
\]
where the leverage $h_{ii}$ is the $i$th diagonal element of the hat matrix $H = X(X^T X)^{-1} X^T$, i.e.,
\[
h_{ii} = x_i^T (X^T X)^{-1} x_i,
\]
where $x_i^T = (1, x_i)$ is the $i$th row of the matrix $X$.
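A brief numerical illustration of Theorem 2.8 is sketched below (NumPy assumed, toy data as before): it forms the hat matrix, checks that the residual vector equals $(I - H)Y$, and evaluates $\sigma^2(I - H)$, whose diagonal gives $\operatorname{var}[e_i] = \sigma^2(1 - h_{ii})$.

```python
# Sketch illustrating Theorem 2.8 with the same invented toy data.
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = len(x)
sigma = 0.3
X = np.column_stack([np.ones(n), x])
Y = X @ np.array([2.0, 0.5]) + rng.normal(0.0, sigma, size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T             # hat matrix
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ beta_hat                             # residual vector
print(np.allclose(e, (np.eye(n) - H) @ Y))       # residuals are (I - H) Y

var_e = sigma**2 * (np.eye(n) - H)               # Var(e) = sigma^2 (I - H)
print(np.allclose(np.diag(var_e), sigma**2 * (1 - np.diag(H))))   # var[e_i] = sigma^2 (1 - h_ii)
```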

The $i$th mean response can be written as
\[
E(Y_i) = \mu_i = x_i^T\beta = (1, x_i)\begin{pmatrix}\beta_0 \\ \beta_1\end{pmatrix} = \beta_0 + \beta_1 x_i
\]
and its estimator as
\[
\hat{\mu}_i = x_i^T\hat{\beta}.
\]
Then, the variance of the estimator is
\[
\operatorname{var}(\hat{\mu}_i) = \operatorname{var}(x_i^T\hat{\beta}) = \sigma^2 x_i^T (X^T X)^{-1} x_i = \sigma^2 h_{ii}
\]
and the estimator of this variance is
\[
\widehat{\operatorname{var}}(\hat{\mu}_i) = S^2 h_{ii},
\]
where $S^2$ is a suitable unbiased estimator of $\sigma^2$.
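The sketch below estimates these quantities from the toy data, assuming (as is usual for the SLRM, though not restated here) that $S^2$ is the residual mean square $RSS/(n-2)$.

```python
# Sketch: estimated variance of the fitted mean responses, var_hat(mu_hat_i) = S^2 h_ii.
# Assumes S^2 = RSS / (n - 2), the usual unbiased estimator of sigma^2 in the SLRM.
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = len(x)
X = np.column_stack([np.ones(n), x])
Y = X @ np.array([2.0, 0.5]) + rng.normal(0.0, 0.3, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
mu_hat = X @ beta_hat                            # fitted responses mu_hat_i = x_i^T beta_hat
S2 = np.sum((Y - mu_hat)**2) / (n - 2)           # residual mean square (assumed estimator of sigma^2)

h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)    # leverages h_ii
print(S2 * h)                                    # estimated variance of each fitted mean
```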

We can easily obtain the other results we have seen for the SLRM in non-matrix notation, now using matrix notation, both for the full model and for a reduced SLRM (no intercept or zero slope).

We have seen earlier that
\[
(X^T X)^{-1} = \frac{1}{nS_{xx}}\begin{pmatrix}\sum x_i^2 & -n\bar{x} \\ -n\bar{x} & n\end{pmatrix}.
\]

Now, by Theorem 2.7, $\operatorname{Var}[\hat{\beta}] = \sigma^2(X^T X)^{-1}$. Thus
\[
\operatorname{var}[\hat{\beta}_0] = \sigma^2\,\frac{\sum x_i^2}{nS_{xx}},
\]
which, by writing $\sum x_i^2 = \sum x_i^2 - n\bar{x}^2 + n\bar{x}^2$, can be written as $\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)$.
Also,
\[
\operatorname{cov}(\hat{\beta}_0, \hat{\beta}_1) = \sigma^2\,\frac{-n\bar{x}}{nS_{xx}} = \frac{-\sigma^2\bar{x}}{S_{xx}},
\]
and
\[
\operatorname{var}[\hat{\beta}_1] = \frac{\sigma^2}{S_{xx}}.
\]
The quantity $h_{ii}$ is given by
\[
h_{ii} = x_i^T (X^T X)^{-1} x_i
= \begin{pmatrix}1 & x_i\end{pmatrix}\frac{1}{nS_{xx}}\begin{pmatrix}\sum x_j^2 & -n\bar{x} \\ -n\bar{x} & n\end{pmatrix}\begin{pmatrix}1 \\ x_i\end{pmatrix}.
\]

We shall leave it as an exercise to show that this simplifies to

\[
h_{ii} = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}}.
\]
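A numerical check of this identity (not a proof) is sketched below, comparing the closed form with the diagonal of the hat matrix for the same made-up x values.

```python
# Sketch: checking h_ii = 1/n + (x_i - xbar)^2 / S_xx against diag(H).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = len(x)
X = np.column_stack([np.ones(n), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T
h_diag = np.diag(H)

Sxx = np.sum((x - x.mean())**2)
h_formula = 1.0 / n + (x - x.mean())**2 / Sxx

print(np.allclose(h_diag, h_formula))   # expect True
```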

2.8.1 Some specific examples


1. The Null model
As we have seen, this can be written as

Y = Xβ0 + ε
where X = 1 is an (n × 1) vector of 1's. So $X^T X = n$ and $X^T Y = \sum Y_i$, which gives
\[
\hat{\beta} = (X^T X)^{-1} X^T Y = \frac{1}{n}\sum Y_i = \bar{Y} = \hat{\beta}_0,
\]
\[
\operatorname{var}[\hat{\beta}] = (X^T X)^{-1}\sigma^2 = \frac{\sigma^2}{n}.
\]

2. No-intercept model
We saw that this example fits the General Linear Model with
\[
X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad \beta = \beta_1.
\]
So $X^T X = \sum x_i^2$ and $X^T Y = \sum x_i Y_i$, and we can calculate
\[
\hat{\beta} = (X^T X)^{-1} X^T Y = \frac{\sum x_i Y_i}{\sum x_i^2} = \hat{\beta}_1,
\]
\[
\operatorname{Var}[\hat{\beta}] = \sigma^2 (X^T X)^{-1} = \frac{\sigma^2}{\sum x_i^2}.
\]
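As a final check, the sketch below applies the general formula to both of these reduced models with the same invented data; the estimates agree with $\bar{Y}$ and $\sum x_i Y_i / \sum x_i^2$, respectively.

```python
# Sketch: the general formula beta_hat = (X^T X)^{-1} X^T Y applied to both reduced models.
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = len(x)
Y = 2.0 + 0.5 * x + rng.normal(0.0, 0.3, size=n)   # toy responses from the full model

# Null model: X = 1, so beta_hat = Ybar
X_null = np.ones((n, 1))
b_null = np.linalg.solve(X_null.T @ X_null, X_null.T @ Y)
print(b_null, Y.mean())                            # should agree

# No-intercept model: beta1_hat = sum(x_i Y_i) / sum(x_i^2)
X_noint = x.reshape(-1, 1)
b_noint = np.linalg.solve(X_noint.T @ X_noint, X_noint.T @ Y)
print(b_noint, np.sum(x * Y) / np.sum(x**2))       # should agree
```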
