
The Simple Regression Model

$y = \beta_0 + \beta_1 x + u$

Economics 20 - Prof. Anderson

Time Series Models for Forecasting

Review of Regression

Nguyễn Ngọc Anh
Trung tâm Nghiên cứu Chính sách và Phát triển (Development and Policies Research Center)
Nguyễn Việt Cường
Đại học Kinh tế Quốc dân (National Economics University)

What is regression?
Regression is one of the most important tools of economic researchers.
It is a method for describing and evaluating the relationship between one variable (the dependent variable, usually denoted y) and one or more other variables (the independent variables, x1, x2, ..., xk).

Regression versus correlation

In a correlation relationship, the two variables y and x are treated symmetrically.
In a regression model, the dependent variable and the independent variable are treated very differently. The variable y is assumed to be random, while x is assumed to be fixed (it takes fixed values).

Regression versus correlation (continued)

A regression model lets us estimate the population parameters and draw statistical inferences about them.
In econometrics, our goal is to estimate the causal effect on Y of a one-unit change in X.

The simple regression model

Conceptually, estimating a regression model is much like estimating a mean. In a regression model, statistical inference consists of:
- Estimation: how should the parameters be estimated?
- Hypothesis testing: is the estimated parameter different from 0?
- Confidence intervals: constructing a confidence interval for the estimated parameter.

The simple regression model

The model contains only one independent variable, k = 1, so y depends on a single variable x.
A model may contain several x variables; that case is treated later. The simple regression model can be used in settings such as:
- Inflation and unemployment
- How the return on a security is related to its risk
- Modelling the relationship between share prices and dividends

The simple regression model: an example

Suppose we have the following data:

Year, t    Excess return on fund XXX    Excess return on market index
           = rXXX,t - rft               = rmt - rft
1          17.8                         13.7
2          39.0                         23.2
3          12.8                         6.9
4          24.2                         16.8
5          17.2                         12.3

We want to understand the relationship between x (the excess return on the market index) and y (the excess return on fund XXX).

Scatter plot

[Figure: scatter plot of the excess return on fund XXX (vertical axis, 0 to 45) against the excess return on the market portfolio (horizontal axis, 0 to 25).]

Finding the line of best fit

We can use the equation
$y = \alpha + \beta x$
to describe the best-fitting straight line.
$\beta$ is the slope of the line.
This line is also called the population regression line.
We do not know $\alpha$ and $\beta$, so they must be estimated.
Is a line like this, which is completely deterministic, realistic?

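Notation and terminology
More generally, the simple linear regression model is $y = \alpha + \beta x + u$; this is called the population linear regression model. (Other slides write the same two parameters as $\beta_0$ and $\beta_1$.)
We usually call y the dependent variable and x the independent variable or control variable.
$\alpha$ is the intercept and $\beta$ is the slope.
u is the error term of the population regression line.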

Why is there an error term u?
- We may have omitted factors that affect $y_t$.
- The measurement or recording of $y_t$ may contain errors.
- There are random influences on $y_t$ that we cannot model.

Graphical representation of the model

Some assumptions
The mean of the error term in the population regression model is zero:
E(u) = 0
This is not a restrictive assumption, because we can always use the intercept $\beta_0$ to normalize the mean (expected value) of u, E(u), to zero.

Assumptions of the regression model

We need to make an assumption about how u and x are related.
We want to assume that knowing something about x tells us nothing about u, so that u and x are completely unrelated:
E(u|x) = E(u) = 0, which implies
$E(y|x) = \beta_0 + \beta_1 x$

E(u|x) = E(u) = 0


The method of least squares

The basic idea of regression is to estimate the population parameters from a sample of data.
Let {(xi, yi): i = 1, ..., n} denote a random sample of size n drawn from the population.
For each observation in this sample we have
$y_i = \alpha + \beta x_i + u_i$

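Population regression line, data points and the associated errors

[Figure: the population regression line $E(y|x) = \alpha + \beta x$ with four data points $(x_1, y_1), \dots, (x_4, y_4)$; each error $u_1, \dots, u_4$ is the vertical distance between a point and the line.]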

Estimation by least squares

To derive the least squares estimates, note that our main assumption is E(u|x) = E(u) = 0, which implies
Cov(x,u) = E(xu) = 0
Why? From basic probability theory,
Cov(X,Y) = E(XY) - E(X)E(Y)
so Cov(x,u) = E(xu) - E(x)E(u) = E(xu) because E(u) = 0, and E(xu) = E[x E(u|x)] = 0.

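Estimation by least squares

With the idea of finding the line of best fit, we can set up a minimization problem.
That is, we want to choose the parameters so that the following expression is minimized:

$$L = \sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} \left( y_i - (\hat{\alpha} + \hat{\beta} x_i) \right)^2$$

The first-order conditions are

$$\frac{\partial L}{\partial \hat{\alpha}} = -2 \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0$$

$$\frac{\partial L}{\partial \hat{\beta}} = -2 \sum_{i=1}^{n} x_i (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0$$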

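Estimation by least squares

We can use calculus to solve this minimization problem: take the first derivatives with respect to $\hat{\alpha}$ and $\hat{\beta}$, set them to zero, and solve the resulting equations. This gives the estimates of the regression parameters:

$$\hat{\alpha} = \bar{Y} - \hat{\beta} \bar{X}$$

$$\hat{\beta} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2} = \frac{S_{XY}}{S_X^2}$$

$S_{XY}$ = sample covariance of (X, Y)
$S_X^2$ = sample variance of X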
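
As a minimal sketch of these formulas, the following Python snippet applies them to the five observations of the fund example given earlier. It assumes, as the scatter plot axes suggest, that y is the excess return on fund XXX and x is the excess return on the market index. The function name `ols_simple` is illustrative and does not come from the slides.

```python
# Excess returns from the five-year example in the slides.
x = [13.7, 23.2, 6.9, 16.8, 12.3]   # market index excess return, rm_t - rf_t
y = [17.8, 39.0, 12.8, 24.2, 17.2]  # fund XXX excess return, rXXX_t - rf_t


def ols_simple(x, y):
    """Return (alpha_hat, beta_hat) for the regression y = alpha + beta*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares around the means; the 1/(n-1)
    # factors of the sample covariance and variance cancel in the ratio.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


alpha_hat, beta_hat = ols_simple(x, y)
print(f"alpha_hat = {alpha_hat:.2f}, beta_hat = {beta_hat:.2f}")
# prints: alpha_hat = -1.74, beta_hat = 1.64
```

Under these assumptions the fitted line is roughly y = -1.74 + 1.64x, so a one-point rise in the market excess return is associated with a rise of about 1.64 points in the fund's excess return in this sample.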

Summary of the slope estimate

The slope estimate is the sample covariance between x and y divided by the sample variance of x.
If x and y are positively correlated, the slope estimate is positive.
If x and y are negatively correlated, the slope estimate is negative.
We only need x to vary in the sample.

OLS
Intuitively, OLS fits a line through the sample points such that the sum of squared residuals is as small as possible, hence the name least squares.
The residual, $\hat{u}$, is an estimate of the error term u; it is the difference between the fitted line (the sample regression line) and the sample point.

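Sample regression line, data points and the associated residuals

[Figure: the sample regression line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ with four data points $(x_1, y_1), \dots, (x_4, y_4)$; each residual $\hat{u}_1, \dots, \hat{u}_4$ is the vertical distance between a point and the fitted line.]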

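The population regression line is the model that we believe generated the data, with true parameters $\alpha$ and $\beta$.
Population regression: $y_t = \alpha + \beta x_t + u_t$
Sample regression: $\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$
and we know that $\hat{u}_t = y_t - \hat{y}_t$.
We use the sample regression line to draw inferences about the population regression line.
We also want to know whether the estimates $\hat{\alpha}$ and $\hat{\beta}$ are good estimates.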

Properties of OLS

The OLS residuals sum to zero.
Hence the sample mean of the OLS residuals is also zero.
The sample covariance between the independent variable and the OLS residuals is zero.
The OLS regression line always passes through the point of sample means $(\bar{x}, \bar{y})$.

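In algebraic terms, we have

$$\sum_{i=1}^{n} \hat{u}_i = 0 \quad \text{and thus} \quad \frac{1}{n} \sum_{i=1}^{n} \hat{u}_i = 0$$

$$\sum_{i=1}^{n} x_i \hat{u}_i = 0$$

$$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}$$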
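
These algebraic properties can be checked numerically. The sketch below, offered only as an illustration, refits the line on the five fund observations from the earlier example and evaluates the three expressions; the variable names are chosen here and are not from the slides. Floating-point arithmetic gives values that are zero only up to rounding error.

```python
x = [13.7, 23.2, 6.9, 16.8, 12.3]   # market index excess return
y = [17.8, 39.0, 12.8, 24.2, 17.2]  # fund XXX excess return

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept from the closed-form formulas.
beta_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
alpha_hat = y_bar - beta_hat * x_bar

# Residuals: observed value minus fitted value.
residuals = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(x, y)]

sum_resid = sum(residuals)                                # sum of u_hat_i
sum_x_resid = sum(xi * ui for xi, ui in zip(x, residuals))  # sum of x_i * u_hat_i
fitted_at_mean = alpha_hat + beta_hat * x_bar             # line evaluated at x_bar

print(f"sum of residuals       = {sum_resid:.2e}")
print(f"sum of x_i * residual  = {sum_x_resid:.2e}")
print(f"fitted value at x_bar  = {fitted_at_mean:.4f}, y_bar = {y_bar:.4f}")
```

The first two sums come out at the order of machine precision, and the fitted value at the mean of x coincides with the mean of y.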

Properties of the OLS estimator

- Linear
- Unbiased
- Best (most efficient)

Best Linear Unbiased Estimator (BLUE)

Using STATA for OLS estimation

Running a regression in STATA is very simple. To estimate the regression of y on x, we only need to type the command
reg y x

Estimation using STATA

regress testscr str, robust

Regression with robust standard errors          Number of obs =     420
                                                F(  1,   418) =   19.26
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0512
                                                Root MSE      =  18.581

----------------------------------------------------------------------------
         |               Robust
 testscr |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
---------+------------------------------------------------------------------
     str |  -2.279808    .5194892   -4.39    0.000    -3.300945   -1.258671
   _cons |    698.933    10.36436   67.44    0.000     678.5602    719.3057
----------------------------------------------------------------------------

Goodness of fit

We can think of each observation as being made up of two parts, an explained part and an unexplained part: $y_i = \hat{y}_i + \hat{u}_i$. We then define the following:

$\sum (y_i - \bar{y})^2$ is the total sum of squares (SST)
$\sum (\hat{y}_i - \bar{y})^2$ is the explained sum of squares (SSE)
$\sum \hat{u}_i^2$ is the residual sum of squares (SSR)

Then SST = SSE + SSR

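Proof that SST = SSE + SSR

$$\sum (y_i - \bar{y})^2 = \sum [(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})]^2$$
$$= \sum [\hat{u}_i + (\hat{y}_i - \bar{y})]^2$$
$$= \sum \hat{u}_i^2 + 2 \sum \hat{u}_i (\hat{y}_i - \bar{y}) + \sum (\hat{y}_i - \bar{y})^2$$
$$= \text{SSR} + 2 \sum \hat{u}_i (\hat{y}_i - \bar{y}) + \text{SSE}$$

and we know that $\sum \hat{u}_i (\hat{y}_i - \bar{y}) = 0$.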

Goodness of fit

How do we judge the regression line we have estimated? Does it fit the data well?
We can compute the fraction of the total sum of squares (SST) that is explained by the model, and call this the R-squared of the regression:
$R^2 = \text{SSE}/\text{SST} = 1 - \text{SSR}/\text{SST}$
$R^2$ lies between 0 and 1; the closer it is to 1, the better the line fits the sample.
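
To make the decomposition concrete, here is a small illustrative sketch that computes SST, SSE, SSR and R-squared for the five fund observations from the earlier example. The variable names are chosen for this illustration only.

```python
x = [13.7, 23.2, 6.9, 16.8, 12.3]   # market index excess return
y = [17.8, 39.0, 12.8, 24.2, 17.2]  # fund XXX excess return

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS fit from the closed-form formulas.
beta_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
alpha_hat = y_bar - beta_hat * x_bar
fitted = [alpha_hat + beta_hat * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total sum of squares
sse = sum((fi - y_bar) ** 2 for fi in fitted)            # explained sum of squares
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual sum of squares

r_squared = sse / sst
print(f"SST = {sst:.3f}, SSE = {sse:.3f}, SSR = {ssr:.3f}")
print(f"R-squared = {r_squared:.3f}")
```

In this tiny sample the market excess return accounts for roughly 93% of the variation in the fund's excess return, and SSE plus SSR reproduces SST up to rounding.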

Sampling distribution of the OLS estimator

The OLS estimate is computed from one sample of data; a different sample would give a different value of $\hat{\beta}_1$. This is the sampling uncertainty of $\hat{\beta}_1$. We want to:
- quantify the sampling uncertainty of $\hat{\beta}_1$
- use $\hat{\beta}_1$ to test hypotheses such as $\beta_1 = 0$
- construct a confidence interval for $\beta_1$
All of these require the sampling distribution of the OLS estimator.

The distribution of $\hat{\beta}_1$
Like the sample mean $\bar{Y}$, $\hat{\beta}_1$ has a sampling distribution.
- What is the expected value $E(\hat{\beta}_1)$? If $E(\hat{\beta}_1) = \beta_1$, then OLS is unbiased, which is what we want.
- What is the variance $\text{var}(\hat{\beta}_1)$? (It measures the sampling uncertainty of the estimator.)
- What is the distribution of $\hat{\beta}_1$ in small samples? This is hard to characterize in general.
- What is the distribution of $\hat{\beta}_1$ in large samples? In large samples, $\hat{\beta}_1$ is approximately normally distributed.

Unbiasedness of OLS

Assume the population model is linear in parameters: $y = \beta_0 + \beta_1 x + u$
Assume we have a sample of size n, {(xi, yi): i = 1, 2, ..., n}, drawn from the population model. We can then write the sample model as $y_i = \beta_0 + \beta_1 x_i + u_i$
Assume E(u|x) = 0 and thus $E(u_i|x_i) = 0$
Assume there is variation in the $x_i$

Unbiasedness of OLS

To examine unbiasedness, we rewrite the estimator in terms of the population parameters.
Start with a simple rewriting of the formula:

$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x}) y_i}{s_x^2}, \quad \text{where } s_x^2 \equiv \sum (x_i - \bar{x})^2$$

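Unbiasedness of OLS

$$\sum (x_i - \bar{x}) y_i = \sum (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)$$
$$= \sum (x_i - \bar{x}) \beta_0 + \sum (x_i - \bar{x}) \beta_1 x_i + \sum (x_i - \bar{x}) u_i$$
$$= \beta_0 \sum (x_i - \bar{x}) + \beta_1 \sum (x_i - \bar{x}) x_i + \sum (x_i - \bar{x}) u_i$$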

Unbiasedness of OLS

$$\sum (x_i - \bar{x}) = 0, \qquad \sum (x_i - \bar{x}) x_i = \sum (x_i - \bar{x})^2$$

so the numerator can be rewritten as $\beta_1 s_x^2 + \sum (x_i - \bar{x}) u_i$, and thus

$$\hat{\beta}_1 = \beta_1 + \frac{\sum (x_i - \bar{x}) u_i}{s_x^2}$$

Unbiasedness of OLS

Let $d_i = (x_i - \bar{x})$, so that

$$\hat{\beta}_1 = \beta_1 + \frac{1}{s_x^2} \sum d_i u_i$$

then, treating the $x_i$ as fixed,

$$E(\hat{\beta}_1) = \beta_1 + \frac{1}{s_x^2} \sum d_i E(u_i) = \beta_1$$

Unbiasedness of OLS

The OLS estimators of $\beta_1$ and $\beta_0$ are unbiased.
The proof of unbiasedness relies on the four assumptions. If any one of them fails, OLS is not necessarily unbiased.
Note that unbiasedness is a property of the estimator; in a particular sample, the estimate we obtain may be somewhat different from the true parameter.
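
The distinction between the estimator and a single estimate can be illustrated with a small simulation. The sketch below is not from the slides: it assumes hypothetical true parameters (beta0 = 1, beta1 = 0.5), normal errors with standard deviation 2, and fixed regressor values 1, ..., 20, then repeatedly draws samples and averages the slope estimates.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical population parameters, chosen only for this illustration.
beta0, beta1, sigma = 1.0, 0.5, 2.0
x = [float(i) for i in range(1, 21)]   # fixed regressor values, n = 20
x_bar = sum(x) / len(x)
s_xx = sum((xi - x_bar) ** 2 for xi in x)


def slope_estimate(y):
    """OLS slope: sum of (x_i - x_bar) * y_i divided by sum of (x_i - x_bar)^2."""
    return sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / s_xx


estimates = []
for _ in range(5000):
    # Draw a fresh sample of errors with E(u|x) = 0 and build y from the model.
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    estimates.append(slope_estimate(y))

mean_estimate = sum(estimates) / len(estimates)
print(f"true slope = {beta1}, average estimate = {mean_estimate:.4f}")
print(f"smallest estimate = {min(estimates):.3f}, largest = {max(estimates):.3f}")
```

Individual estimates scatter noticeably around 0.5, while their average sits very close to it, which is what unbiasedness of the estimator means.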

Variance of the OLS estimator

We now know that the sampling distribution of our estimator is centered on the true parameter.
We want to know how spread out this distribution is.
This requires one additional assumption about the variance:
Assume $\text{Var}(u|x) = \sigma^2$ (homoskedasticity)

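Variance of the OLS estimator

$\text{Var}(u|x) = E(u^2|x) - [E(u|x)]^2$
$E(u|x) = 0$, so $\sigma^2 = E(u^2|x) = E(u^2) = \text{Var}(u)$
Thus $\sigma^2$ is also the unconditional variance, called the error variance.
$\sigma$, the square root of the error variance, is called the standard deviation of the error.
We can then say that $E(y|x) = \beta_0 + \beta_1 x$ and $\text{Var}(y|x) = \sigma^2$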

The homoskedastic case

[Figure: conditional densities f(y|x) at $x_1$ and $x_2$, with the same spread around the line $E(y|x) = \beta_0 + \beta_1 x$.]

The heteroskedastic case

[Figure: conditional densities f(y|x) at $x_1$, $x_2$ and $x_3$, with spread that differs across x, around the line $E(y|x) = \beta_0 + \beta_1 x$.]

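Variance of the OLS estimator

$$\text{Var}(\hat{\beta}_1) = \text{Var}\left( \beta_1 + \frac{1}{s_x^2} \sum d_i u_i \right)$$
$$= \left( \frac{1}{s_x^2} \right)^2 \text{Var}\left( \sum d_i u_i \right) = \left( \frac{1}{s_x^2} \right)^2 \sum d_i^2 \, \text{Var}(u_i)$$
$$= \left( \frac{1}{s_x^2} \right)^2 \sum d_i^2 \sigma^2 = \sigma^2 \left( \frac{1}{s_x^2} \right)^2 \sum d_i^2$$
$$= \sigma^2 \left( \frac{1}{s_x^2} \right)^2 s_x^2 = \frac{\sigma^2}{s_x^2}$$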

Variance of the OLS estimator

The larger the error variance $\sigma^2$, the larger the variance of the slope estimator.
The more the $x_i$ vary, the smaller the variance of the slope estimator.
As a result, a larger sample reduces the variance of the slope estimator.
The problem is that the error variance is unknown.
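
The variance result $\sigma^2 / s_x^2$ can be compared with a simulation. The sketch below is only an illustration under assumed values that do not come from the slides: homoskedastic normal errors with sigma = 2, hypothetical true parameters beta0 = 1 and beta1 = 0.5, and fixed regressor values 1, ..., 20.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is reproducible

# Hypothetical population, chosen only for this illustration.
beta0, beta1, sigma = 1.0, 0.5, 2.0
x = [float(i) for i in range(1, 21)]          # fixed regressor values
x_bar = sum(x) / len(x)
s_xx = sum((xi - x_bar) ** 2 for xi in x)     # s_x^2 in the slides' notation


def simulate_slopes(n_reps):
    """Draw n_reps samples from the model and return the OLS slope of each."""
    slopes = []
    for _ in range(n_reps):
        y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
        slopes.append(sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / s_xx)
    return slopes


slopes = simulate_slopes(5000)
empirical_var = statistics.pvariance(slopes)  # spread of the simulated slopes
theoretical_var = sigma ** 2 / s_xx           # sigma^2 / s_x^2

print(f"s_xx = {s_xx:.1f}")
print(f"theoretical variance = {theoretical_var:.6f}")
print(f"simulated variance   = {empirical_var:.6f}")
```

Rerunning with a larger `sigma` raises both numbers, while spreading the x values further apart (a larger `s_xx`) lowers them, in line with the statements above.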

Estimating the error variance

We do not know the error variance $\sigma^2$, because we do not observe the errors $u_i$.
What we observe are the residuals $\hat{u}_i$.
We can use the residuals to form an estimate of the error variance.

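Estimating the error variance

$$\hat{u}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$$
$$= (\beta_0 + \beta_1 x_i + u_i) - \hat{\beta}_0 - \hat{\beta}_1 x_i$$
$$= u_i - (\hat{\beta}_0 - \beta_0) - (\hat{\beta}_1 - \beta_1) x_i$$

Then an unbiased estimator of $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum \hat{u}_i^2 = \frac{\text{SSR}}{n-2}$$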

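Estimating the error variance

$\hat{\sigma} = \sqrt{\hat{\sigma}^2}$ is the standard error of the regression.
Recall that $\text{sd}(\hat{\beta}_1) = \sigma / s_x$, where $s_x = \sqrt{s_x^2}$.
If we substitute $\hat{\sigma}$ for $\sigma$, we obtain the standard error of $\hat{\beta}_1$:

$$\text{se}(\hat{\beta}_1) = \frac{\hat{\sigma}}{\left( \sum (x_i - \bar{x})^2 \right)^{1/2}}$$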
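
As an illustrative sketch, the following snippet computes $\hat{\sigma}^2 = \text{SSR}/(n-2)$ and the homoskedasticity-based standard error of the slope for the five fund observations from the earlier example. With only five observations the numbers are purely a demonstration of the arithmetic; the variable names are chosen here.

```python
import math

x = [13.7, 23.2, 6.9, 16.8, 12.3]   # market index excess return
y = [17.8, 39.0, 12.8, 24.2, 17.2]  # fund XXX excess return

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# OLS fit and residuals.
beta1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
residuals = [yi - beta0_hat - beta1_hat * xi for xi, yi in zip(x, y)]

ssr = sum(ui ** 2 for ui in residuals)
sigma2_hat = ssr / (n - 2)            # unbiased estimator of the error variance
sigma_hat = math.sqrt(sigma2_hat)     # standard error of the regression
se_beta1 = sigma_hat / math.sqrt(s_xx)  # standard error of the slope estimate

print(f"sigma2_hat = {sigma2_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
print(f"se(beta1_hat) = {se_beta1:.3f}")
```

The divisor n - 2 reflects the two parameters estimated before the residuals are formed.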

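Summary of the sampling distribution of $\hat{\beta}_1$

If the OLS assumptions hold, then the sampling distribution of $\hat{\beta}_1$ has:

$E(\hat{\beta}_1) = \beta_1$ (that is, $\hat{\beta}_1$ is unbiased)

$$\text{var}(\hat{\beta}_1) = \frac{1}{n} \times \frac{\text{var}[(X_i - \mu_X) u_i]}{\sigma_X^4} \propto \frac{1}{n}$$

When n is large, approximately

$$\frac{\hat{\beta}_1 - E(\hat{\beta}_1)}{\sqrt{\text{var}(\hat{\beta}_1)}} \sim N(0,1) \quad \text{(CLT)}$$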

Hypothesis testing and the standard error of $\hat{\beta}_1$

The goal of testing in the regression model is to use the data to test a hypothesis about the population, such as $\beta_1 = 0$, and to decide whether or not to reject it.
Null hypothesis and two-sided alternative:
$H_0: \beta_1 = \beta_{1,0}$ vs. $H_1: \beta_1 \neq \beta_{1,0}$
where $\beta_{1,0}$ is the hypothesized value.
Null hypothesis and one-sided alternative:
$H_0: \beta_1 = \beta_{1,0}$ vs. $H_1: \beta_1 < \beta_{1,0}$

General approach: construct the t-statistic (or z-statistic), then either compute the p-value or compare the statistic with the critical value of the N(0,1) distribution.
In general:
t = (estimator - hypothesized value) / (standard error of the estimator)
For testing the mean of Y:

$$t = \frac{\bar{Y} - \mu_{Y,0}}{s_Y / \sqrt{n}}$$

For testing $\beta_1$:

$$t = \frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)}$$

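Formula for $SE(\hat{\beta}_1)$

$$\hat{\sigma}_{\hat{\beta}_1}^2 = \frac{1}{n} \times \frac{\text{estimator of } \sigma_v^2}{(\text{estimator of } \sigma_X^2)^2} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^{n} \hat{v}_i^2}{\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2 \right]^2}$$

where $\hat{v}_i = (X_i - \bar{X}) \hat{u}_i$, and $SE(\hat{\beta}_1) = \sqrt{\hat{\sigma}_{\hat{\beta}_1}^2}$.

$SE(\hat{\beta}_1)$ looks complicated, but STATA computes it quickly, so there is no need to memorize this formula.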
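
For readers who want to see the arithmetic behind this formula, here is a hedged sketch that evaluates it step by step on the five fund observations from the earlier example. The formula is a large-sample result, so with n = 5 the output only demonstrates the calculation and should not be read as a reliable standard error; the variable names are chosen for this illustration.

```python
import math

x = [13.7, 23.2, 6.9, 16.8, 12.3]   # market index excess return
y = [17.8, 39.0, 12.8, 24.2, 17.2]  # fund XXX excess return

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# OLS fit and residuals u_hat_i.
beta1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
u_hat = [yi - beta0_hat - beta1_hat * xi for xi, yi in zip(x, y)]

# v_hat_i = (X_i - X_bar) * u_hat_i
v_hat = [(xi - x_bar) * ui for xi, ui in zip(x, u_hat)]

var_v_hat = sum(vi ** 2 for vi in v_hat) / (n - 2)   # estimator of sigma_v^2
var_x_hat = s_xx / n                                 # estimator of sigma_X^2

var_beta1_hat = (1.0 / n) * var_v_hat / var_x_hat ** 2
se_robust = math.sqrt(var_beta1_hat)

print(f"heteroskedasticity-robust SE(beta1_hat) = {se_robust:.3f}")
```

Unlike the homoskedasticity-based standard error, this version weights each squared residual by $(X_i - \bar{X})^2$, so it remains valid when the error variance changes with x.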

Example
Estimated regression: Test score = 698.9 - 2.28 × STR
STATA also reports the standard errors of the estimates:
$SE(\hat{\beta}_0) = 10.4$
$SE(\hat{\beta}_1) = 0.52$
We can compute the test statistic for $\beta_1$ under the null hypothesis $H_0: \beta_{1,0} = 0$:

$$t = \frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)} = \frac{-2.28 - 0}{0.52} = -4.38$$

At the 1% significance level the two-sided critical value is 2.58. Since |t| = 4.38 > 2.58, we reject the null hypothesis at the 1% level.
We could also compute the p-value, but STATA reports it for us.
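
The same calculation can be reproduced outside STATA. The sketch below uses the rounded estimate and standard error from this example (-2.28 and 0.52) and the large-sample N(0,1) approximation; the two-sided p-value is obtained from the complementary error function, and the variable names are illustrative.

```python
import math

beta1_hat = -2.28   # estimated slope on STR (rounded, from the example)
se_beta1 = 0.52     # its standard error (rounded, from the example)
beta1_null = 0.0    # hypothesized value under H0

# t-statistic: (estimate - hypothesized value) / standard error
t_stat = (beta1_hat - beta1_null) / se_beta1

# Two-sided p-value under N(0,1): P(|Z| > |t|) = erfc(|t| / sqrt(2))
p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))

# 95% confidence interval: estimate +/- 1.96 standard errors
ci_low = beta1_hat - 1.96 * se_beta1
ci_high = beta1_hat + 1.96 * se_beta1

reject_at_1pct = abs(t_stat) > 2.58

print(f"t = {t_stat:.2f}")                               # prints: t = -4.38
print(f"two-sided p-value = {p_value:.6f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")         # prints: 95% CI = (-3.30, -1.26)
print(f"reject H0 at the 1% level: {reject_at_1pct}")    # prints: reject H0 at the 1% level: True
```

The p-value is far below 0.01, which is why STATA displays it as 0.000.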

Estimation using STATA

regress testscr str, robust

Regression with robust standard errors          Number of obs =     420
                                                F(  1,   418) =   19.26
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0512
                                                Root MSE      =  18.581

----------------------------------------------------------------------------
         |               Robust
 testscr |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
---------+------------------------------------------------------------------
     str |  -2.279808    .5194892   -4.39    0.000    -3.300945   -1.258671
   _cons |    698.933    10.36436   67.44    0.000     678.5602    719.3057
----------------------------------------------------------------------------

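Summary: testing $H_0: \beta_1 = \beta_{1,0}$ vs. $H_1: \beta_1 \neq \beta_{1,0}$

Compute the t-statistic:

$$t = \frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)} = \frac{\hat{\beta}_1 - \beta_{1,0}}{\sqrt{\hat{\sigma}_{\hat{\beta}_1}^2}}$$

Reject the null hypothesis at the 5% significance level if |t| > 1.96.
Equivalently, reject the null hypothesis if the p-value is below 5%.
This procedure relies on a large-sample approximation and needs a sample of roughly 30 observations or more.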

Reading the STATA output

regress testscr str, robust

Regression with robust standard errors          Number of obs =     420
                                                F(  1,   418) =   19.26
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0512
                                                Root MSE      =  18.581

----------------------------------------------------------------------------
         |               Robust
 testscr |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
---------+------------------------------------------------------------------
     str |  -2.279808    .5194892   -4.39    0.000    -3.300945   -1.258671
   _cons |    698.933    10.36436   67.44    0.000     678.5602    719.3057
----------------------------------------------------------------------------

$\widehat{TestScore} = 698.9 - 2.28 \times STR$, $R^2 = .05$
Standard errors: (10.4) for the intercept and (0.52) for the slope
$t(\beta_1 = 0) = -4.38$ using the rounded values (STATA reports -4.39 from the unrounded ones), p-value = 0.000 (2-sided)
The 95% confidence interval for $\beta_1$ is (-3.30, -1.26)