
Math 235 w 15

Collinearity
Collinearity arises when there is a linear correlation between 2 or more explanatory variables.
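Example 1 ($p = 2$): $Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$, $i = 1, 2, \dots, n$
Suppose that $x_{i2} = \lambda x_{i1}$ for some $\lambda \in \mathbb{R}$;
then we can write the model:
$$Y_i = \beta_1 x_{i1} + \beta_2 \lambda x_{i1} + \varepsilon_i = (\beta_1 + \lambda \beta_2)\, x_{i1} + \varepsilon_i$$
so only the combination $\beta_1 + \lambda \beta_2$ can be estimated, not $\beta_1$ and $\beta_2$ separately.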
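A minimal numerical sketch of Example 1, assuming simulated data and an arbitrary value $\lambda = 2$ (neither comes from the lecture). It checks that the design matrix has rank 1 and that two different coefficient vectors with the same value of $\beta_1 + \lambda \beta_2$ give the same mean vector.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed; illustrative data only
n, lam = 50, 2.0                 # lam plays the role of lambda (assumed value)

x1 = rng.normal(size=n)
x2 = lam * x1                    # exact collinearity: x_{i2} = lambda * x_{i1}
X = np.column_stack([x1, x2])    # n x 2 design matrix

# X has two columns but only one linearly independent direction.
rank = np.linalg.matrix_rank(X)

# Two different coefficient vectors with the same beta_1 + lambda * beta_2
# produce the same mean vector X @ beta.
beta_a = np.array([1.0, 1.0])    # 1 + 2 * 1 = 3
beta_b = np.array([3.0, 0.0])    # 3 + 2 * 0 = 3
same_fit = np.allclose(X @ beta_a, X @ beta_b)

print("rank(X) =", rank)
print("identical mean vectors:", same_fit)
```

The data cannot distinguish `beta_a` from `beta_b`, which is why the least-squares problem has no unique solution here.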
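Example 2: in last week's quiz, we had
$Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \varepsilon_i$
Suppose
$x_{i1} = 1$ if does sport, $0$ otherwise
$x_{i2} = 1$ if doesn't do sport, $0$ otherwise
$x_{i3} = 1$ if not a smoker, $0$ otherwise
$x_{i4} = 1$ if smoker, $0$ otherwise
The design matrix $X$ ($n \times 4$) consists of the four 0/1 columns $x_1, \dots, x_4$. Every row has exactly one 1 among the first two columns and exactly one 1 among the last two, so
$$x_1 + x_2 = x_3 + x_4$$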
In this case $X$ is not of full column rank: $\operatorname{rank}(X) < 4$
$\Rightarrow$ $X^T X$ isn't invertible, $(X^T X)^{-1}$ doesn't exist, and we cannot compute $\hat\beta$.
In general, we say that the explanatory variables $X_k$ and $X_\ell$ are
i) Orthogonal if $\operatorname{cov}(X_k, X_\ell) = 0$
ii) Collinear if $\operatorname{corr}(X_k, X_\ell) = \pm 1$
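The rank deficiency in Example 2 can be checked numerically. The sketch below assumes six made-up observations (not the quiz data) and the coding "sport / no sport / non-smoker / smoker" for the four indicator columns.

```python
import numpy as np

# Hypothetical 0/1 labels for six observations (illustration only).
sport = np.array([1, 1, 0, 0, 1, 0])
smoker = np.array([1, 0, 1, 0, 0, 1])

# Columns: does sport, doesn't do sport, not a smoker, smoker.
X = np.column_stack([sport, 1 - sport, 1 - smoker, smoker])

# The linear dependence x1 + x2 = x3 + x4 holds row by row.
dependent = np.array_equal(X[:, 0] + X[:, 1], X[:, 2] + X[:, 3])

rank_X = np.linalg.matrix_rank(X)          # fewer than 4 columns are independent
rank_XtX = np.linalg.matrix_rank(X.T @ X)  # X^T X inherits the same rank

print("x1 + x2 == x3 + x4:", dependent)
print("rank(X) =", rank_X, " rank(X^T X) =", rank_XtX)
```

Since `X.T @ X` is a 4 x 4 matrix of rank 3, it is singular and the usual formula for the least-squares estimator cannot be applied.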
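Interaction (example pg. 99)
$Y_i$ = gas consumption
$x_{i1}$ = outside temperature
$x_{i2} = 1$ if insulation, $0$ otherwise
$$Y_i = \beta_1 + \beta_2 x_{i1} + \beta_3 x_{i2} + \beta_4 x_{i1} x_{i2} + \varepsilon_i =
\begin{cases}
\beta_1 + \beta_3 + (\beta_2 + \beta_4)\, x_{i1} + \varepsilon_i & \text{insulation } (x_{i2} = 1) \\
\beta_1 + \beta_2 x_{i1} + \varepsilon_i & \text{no insulation } (x_{i2} = 0)
\end{cases}$$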
Test for interaction
$H_0: \beta_4 = 0$ vs $H_A: \beta_4 \neq 0$ (see table pg. 18)
$$\hat y_i = 6{,}85 - 0{,}39\, x_{i1} - 2{,}26\, x_{i2} + 0{,}14\, x_{i1} x_{i2}$$
At level $\alpha = 5\%$, the p-value for $\beta_4$ is $0{,}00252 < 0{,}05$, so we reject the null hypothesis $H_0: \beta_4 = 0$, i.e. we believe that there is a real interaction between outside temperature and insulation for the gas consumption of the house.
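A rough sketch of how the interaction model can be fitted and read, assuming simulated data with made-up "true" coefficients (this is not the house data from the example). It fits the four-parameter model by least squares, forms the t statistic for $H_0: \beta_4 = 0$, and extracts the two implied regression lines.

```python
import numpy as np

rng = np.random.default_rng(2)            # simulated data, NOT the house data
n = 40
temp = rng.uniform(-2.0, 10.0, size=n)    # x_{i1}: outside temperature
insul = np.repeat([0.0, 1.0], n // 2)     # x_{i2}: 1 if insulation, 0 otherwise

# Assumed coefficients, chosen only for illustration.
y = 7.0 - 0.4 * temp - 2.0 * insul + 0.1 * temp * insul + rng.normal(0, 0.3, n)

# Columns: intercept, x1, x2, x1 * x2  ->  beta_1, ..., beta_4
X = np.column_stack([np.ones(n), temp, insul, temp * insul])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# t statistic for H0: beta_4 = 0, using sigma^2_hat * (X^T X)^{-1}.
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)
t_beta4 = beta_hat[3] / np.sqrt(cov_beta[3, 3])

# Implied (intercept, slope) for each group.
line_no_insul = (beta_hat[0], beta_hat[1])
line_insul = (beta_hat[0] + beta_hat[2], beta_hat[1] + beta_hat[3])

print("t statistic for beta_4:", t_beta4)
print("no insulation (intercept, slope):", line_no_insul)
print("insulation    (intercept, slope):", line_insul)
```

The p-value would come from comparing `t_beta4` with a t distribution on $n - 4$ degrees of freedom. The two extracted lines coincide with separate simple regressions fitted within each group, which is one way to sanity-check the coding of the interaction term.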
Covariate selection
Goal: to find out which of our covariates actually contribute to the regression model. We want to keep the covariates that have some influence on $y$. If $p$ is large, it is likely that $X^T X$ is not invertible.
Nested models
Suppose that we have the reduced model (RM):
$Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_{p_1} x_{i p_1} + \varepsilon_i$
and the full model (FM):
$Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_{p_2} x_{i p_2} + \varepsilon_i$
for $p_2 > p_1$.
We say that the RM is nested in the FM iff the FM contains all covariates of the RM.
We want to test:
$H_0: \beta_{p_1+1} = \beta_{p_1+2} = \dots = \beta_{p_2} = 0$
$H_1$: at least one $\beta_j \neq 0$, $j = p_1+1, \dots, p_2$
First idea: we know how to test $H_0: \beta_j = 0$ for a single $j$ from the regression program. In principle, we can perform an $\alpha$-level test for each of these $j = p_1+1, \dots, p_2$, and then if one of these tests rejects $H_0$, conclude $\beta_j \neq 0$. $\Rightarrow$ Problem of multiple testing.
We need to test jointly that $\beta_{p_1+1} = \dots = \beta_{p_2} = 0$.
We are going to construct an F-test statistic by comparing the sum of squared residuals for the reduced and full model.
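To see why running one $\alpha$-level test per coefficient is a problem, here is a small arithmetic sketch. It assumes the $m$ individual tests are independent, which is a simplification (tests on regression coefficients are generally correlated). Under that assumption, the chance of at least one false rejection when all null hypotheses are true is $1 - (1 - \alpha)^m$.

```python
def familywise_error(m, alpha=0.05):
    """P(at least one false rejection) for m independent alpha-level tests."""
    return 1.0 - (1.0 - alpha) ** m

# The overall error rate grows quickly with the number of tests.
for m in (1, 2, 5, 10):
    print(m, round(familywise_error(m), 4))
```

Already for a handful of coefficients the overall error rate is well above the nominal 5%, which motivates a single joint test.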
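Fact: inclusion of additional covariates in a regression model never increases the sum of squared residuals:
$SS_{res}(FM) \le SS_{res}(RM)$
Suppose the LSE exist for RM ($\tilde\beta$) and for FM ($\hat\beta$):
$$SS_{res}(RM) = \sum_{i=1}^{n} \left( y_i - \tilde\beta_1 x_{i1} - \dots - \tilde\beta_{p_1} x_{i p_1} \right)^2$$
$$SS_{res}(FM) = \sum_{i=1}^{n} \left( y_i - \hat\beta_1 x_{i1} - \dots - \hat\beta_{p_2} x_{i p_2} \right)^2$$
Claim: $SS_{res}(FM) \le SS_{res}(RM)$, because the RM fit minimises the same sum of squares under the restriction $\beta_{p_1+1} = \dots = \beta_{p_2} = 0$, and a global min is never larger than a min over a restricted set (see notes from lecture).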
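The claim can be illustrated numerically. The sketch below assumes simulated data in which the extra covariates of the FM have no effect on $y$ at all; even then the residual sum of squares of the FM is not larger than that of the RM. The helper name `ss_res` is introduced here for illustration only.

```python
import numpy as np

def ss_res(X, y):
    """Sum of squared residuals of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

rng = np.random.default_rng(3)            # simulated data, illustration only
n, p1, p2 = 30, 2, 5
X_fm = rng.normal(size=(n, p2))           # full model: all p2 covariates
X_rm = X_fm[:, :p1]                       # reduced model: first p1 covariates

# y depends only on the RM covariates; the extra ones are pure noise.
y = X_rm @ np.array([1.0, -2.0]) + rng.normal(size=n)

ss_rm, ss_fm = ss_res(X_rm, y), ss_res(X_fm, y)
print("SS_res(RM) =", ss_rm)
print("SS_res(FM) =", ss_fm)
```

A drop in the residual sum of squares is therefore not evidence by itself that the added covariates matter. The size of the drop has to be judged against the noise level.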
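F-test
$H_0: \beta_{p_1+1} = \beta_{p_1+2} = \dots = \beta_{p_2} = 0$
$H_A: \exists\, \beta_j$ s.t. $\beta_j \neq 0$, $j = p_1+1, \dots, p_2$
This is accomplished by considering an F-test:
$$F = \frac{\left( SS_{res}(RM) - SS_{res}(FM) \right) / (p_2 - p_1)}{SS_{res}(FM) / (n - p_2)} \overset{H_0}{\sim} F_{p_2 - p_1,\, n - p_2}$$
We reject $H_0$ if $F_{observed}$ is larger than $F_{p_2 - p_1,\, n - p_2,\, 1 - \alpha}$.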
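A minimal sketch of the F statistic for nested models, assuming simulated data; the function names `ss_res` and `f_statistic` are hypothetical helpers, not part of any library. Two responses are generated: one for which $H_0$ holds and one for which the extra coefficients are clearly non-zero.

```python
import numpy as np

def ss_res(X, y):
    """Sum of squared residuals of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def f_statistic(X_rm, X_fm, y):
    """Return (F, df1, df2) for testing the RM against the FM."""
    n, p2 = X_fm.shape
    p1 = X_rm.shape[1]
    df1, df2 = p2 - p1, n - p2
    ss_rm, ss_fm = ss_res(X_rm, y), ss_res(X_fm, y)
    F = ((ss_rm - ss_fm) / df1) / (ss_fm / df2)
    return F, df1, df2

rng = np.random.default_rng(4)            # simulated data, illustration only
n = 40
X_fm = rng.normal(size=(n, 4))            # p2 = 4
X_rm = X_fm[:, :2]                        # p1 = 2
noise = rng.normal(size=n)

y_null = X_fm @ np.array([1.0, -1.0, 0.0, 0.0]) + noise   # H0 true
y_alt = X_fm @ np.array([1.0, -1.0, 3.0, 3.0]) + noise    # H0 false

F_null, df1, df2 = f_statistic(X_rm, X_fm, y_null)
F_alt, _, _ = f_statistic(X_rm, X_fm, y_alt)
print("df =", (df1, df2))
print("F under H0:", F_null)
print("F under the alternative:", F_alt)
```

The observed value is then compared with the $(1 - \alpha)$ quantile of the $F_{df_1, df_2}$ distribution, taken from an F table or, if SciPy is available, from `scipy.stats.f.ppf(1 - alpha, df1, df2)`.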
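Example: $n$ observations
FM: $Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \varepsilon_i$
RM: $Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$
$H_0: \beta_3 = \beta_4 = 0$ at $\alpha$-level
1) Compute the LSE for RM ($\tilde\beta$) and for FM ($\hat\beta$)
2) Compute $SS_{res}(FM)$ and $SS_{res}(RM)$
3) Compute $F_{observed}$ and reject $H_0$ if it is larger than $F_{2,\, n-4,\, 1-\alpha}$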
Remark: if $p_1 = p_2 - 1$, then we use a t-test (see written notes for the lecture).
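The remark can be checked numerically: when the FM has exactly one extra covariate, the F statistic equals the square of the usual t statistic for that coefficient. The sketch assumes simulated data and takes the first covariate to be a column of ones (an assumption made here for illustration).

```python
import numpy as np

rng = np.random.default_rng(5)            # simulated data, illustration only
n = 25
X_fm = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # p2 = 3
X_rm = X_fm[:, :2]                                             # p1 = 2
y = X_fm @ np.array([1.0, 0.5, 0.3]) + rng.normal(size=n)

def fit(X, y):
    """Least-squares estimate and sum of squared residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return beta, float(r @ r)

beta_fm, ss_fm = fit(X_fm, y)
_, ss_rm = fit(X_rm, y)

# F statistic with p2 - p1 = 1 and n - p2 = n - 3 degrees of freedom.
sigma2_hat = ss_fm / (n - 3)
F = (ss_rm - ss_fm) / sigma2_hat

# t statistic for the last coefficient of the FM.
se_last = np.sqrt(sigma2_hat * np.linalg.inv(X_fm.T @ X_fm)[2, 2])
t = beta_fm[2] / se_last

print("F   =", F)
print("t^2 =", t ** 2)
```

Both tests therefore lead to the same decision in this case, since the square of a $t_{n - p_2}$ variable follows an $F_{1,\, n - p_2}$ distribution.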
Two estimators for $\sigma^2$:
$$\frac{SS_{res}(RM) - SS_{res}(FM)}{p_2 - p_1} \quad \text{and} \quad \frac{SS_{res}(FM)}{n - p_2}$$
Quiz
3.1 d)
3.2 a)
3.3 a)
3.4 c)
3.5 b)
3.6
3.7 c)
3.8 e)
3.9 c)
3.10 2)
3.11 a)
3.12 ?
3.13 b)
3.14 a) same
To be continued.
