2 - The Dependent Variable Is a Dummy Variable (EN)
Regression with a binary dependent
variable
1. Linear Probability Model (LPM)
2. Logit Model
3. Probit Model
4. Comparing LPM, Logit, Probit
I. Linear Probability Model (LPM)
1. Model: Probability analysis
We have a model:
$Y_i = \beta_0 + \beta_1 X_{1i} + u_i$
- Linear regression model: when X changes by 1 unit, the average value of Y, $E(Y|X)$, changes by $\beta_1$ units.
- LPM: when X changes by 1 unit, the probability that event A occurs changes by $\beta_1$.
We can write the regression model as:
+ $P(y=1|x) = \beta_0 + \beta_1 X_i$, a linear function of X
A regression model with a binary dependent variable is called a Linear Probability Model (LPM) because $P(y=1|x)$ is linear in the parameters $\beta_j$.
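As a minimal sketch of the idea, an LPM can be estimated by running ordinary least squares of the binary outcome on a constant and X. The data below are synthetic and purely illustrative; the sample size, the assumed true probabilities, and the helper name `lpm_prob` are assumptions made for this example, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: one regressor X and a binary outcome Y.
n = 500
x = rng.uniform(0, 10, size=n)
true_p = 0.1 + 0.07 * x  # assumed true P(Y=1|X); stays inside [0.1, 0.8]
y = (rng.uniform(size=n) < true_p).astype(float)

# OLS of Y on a constant and X gives the LPM estimates (b0, b1).
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1 = beta_hat


def lpm_prob(x_value):
    """Fitted P(Y=1 | X=x_value) under the estimated LPM."""
    return b0 + b1 * x_value


# A one-unit increase in X changes the fitted probability by exactly b1.
print("b0 =", b0, "b1 =", b1)
print("change in fitted probability:", lpm_prob(5.0) - lpm_prob(4.0))
```

Because the fitted probability is linear in X, the one-unit effect is the same at every value of X.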
How to interpret coefficients in LPM
Weakness of LPM
Exercise 1
A survey of 40 students, taken 6 months after graduation, records the variables GS (graduate score) and EN (English grade). Grades are on a 100-point scale.
Y = 1 if the student has a job, Y = 0 if the student has not yet found a job. (inlf)
Let the "probability of getting a job" be the probability that a student gets a job within 6 months of graduation. Significance level: $\alpha = 5\%$.
Variable Coefficient Std. Error t-Statistic Prob
1. Write the LPM.
2. Interpret the coefficients.
3. Estimate the probability of getting a job when (GS, EN) equals (70, 80) and (60, 60).
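Answer
1. LPM
$P(Y=1|X) = E(Y|GS, EN) = \beta_0 + \beta_1 GS_i + \beta_2 EN_i$
$\hat{p}_i^{LPM} = \hat{\beta}_0 + \hat{\beta}_1 GS_i + \hat{\beta}_2 EN_i$
$\hat{p}_i^{LPM} = -3.01567 + 0.020158\,GS_i + 0.02922\,EN_i$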
2. Interpretation:
- Both p-values are below 0.05, so GS and EN each have a statistically significant effect on Y.
- Holding other factors fixed, when GS increases by 1 point, the probability of getting a job increases by about 2 percentage points (0.020158).
- Holding other factors fixed, when EN increases by 1 point, the probability of getting a job increases by about 2.9 percentage points (0.02922).
3. When GS = 70; EN = 80:
$\hat{p}^{LPM}(Y=1|X) = -3.01567 + 0.020158 \times 70 + 0.02922 \times 80 \approx 0.733$
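The arithmetic for both score pairs in question 3 can be checked with a short script. This is only a sketch: the intercept is taken as negative (-3.01567), which is the sign consistent with the reported result of 0.733, and the function name `lpm_job_probability` is made up for illustration.

```python
# Coefficients of the fitted LPM from the exercise.
# Assumption: the intercept is -3.01567 (the sign that reproduces 0.733).
B0, B_GS, B_EN = -3.01567, 0.020158, 0.02922


def lpm_job_probability(gs, en):
    """Fitted probability of having a job under the LPM."""
    return B0 + B_GS * gs + B_EN * en


for gs, en in [(70, 80), (60, 60)]:
    p = lpm_job_probability(gs, en)
    is_valid = 0.0 <= p <= 1.0
    print(f"GS={gs}, EN={en}: p_hat={p:.3f}, inside [0, 1]: {is_valid}")
# GS=70, EN=80: p_hat=0.733, inside [0, 1]: True
# GS=60, EN=60: p_hat=-0.053, inside [0, 1]: False
```

The second pair gives a fitted value below zero, which illustrates a well-known weakness of the LPM: fitted "probabilities" are not restricted to the interval [0, 1].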
II. Logit and Probit Model
Logit Model
$p = \dfrac{\exp(z)}{1+\exp(z)}$
$\dfrac{\partial p(x)}{\partial x_j} = \beta_j\, p\,(1-p)$
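A small sketch may help connect the logistic formula to the marginal effect. It assumes a linear index z = b1 + b2·x with hypothetical coefficient values chosen only for illustration, and compares the analytic effect β_j·p·(1−p) with a numerical derivative.

```python
import math


def logistic(z):
    """Logit response probability p = exp(z) / (1 + exp(z))."""
    return math.exp(z) / (1.0 + math.exp(z))


def logit_marginal_effect(z, beta_j):
    """Effect of a small change in x_j on p: beta_j * p * (1 - p)."""
    p = logistic(z)
    return beta_j * p * (1.0 - p)


# Hypothetical coefficients for a single-regressor index z = b1 + b2 * x.
b1, b2 = -4.0, 0.8
x = 5.0
z = b1 + b2 * x

analytic = logit_marginal_effect(z, b2)

# Central finite difference of p with respect to x as a cross-check.
h = 1e-5
numeric = (logistic(b1 + b2 * (x + h)) - logistic(b1 + b2 * (x - h))) / (2 * h)

print("p =", logistic(z))
print("analytic effect =", analytic, "numeric effect =", numeric)
```

Unlike the LPM, the effect of x on the probability depends on where it is evaluated, because p(1−p) is largest at p = 0.5 and shrinks toward the tails.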
2. Estimating the Logit model
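Suppose the sample contains $N_i$ observations at each value $X_i$, and among these $N_i$ observations only $n_i$ have $Y_i = 1$. The point estimate of $p_i$ is then:
$\hat{p}_i = \dfrac{n_i}{N_i}$
We use the estimate $\hat{p}_i$ to estimate the model:
$L_i = \ln\left(\dfrac{p_i}{1-p_i}\right) = \beta_1 + \beta_2 X_i$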
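Estimation procedure:
Step 1: For each $X_i$, compute $\hat{p}_i = \dfrac{n_i}{N_i}$; $\hat{L}_i = \ln\left(\dfrac{\hat{p}_i}{1-\hat{p}_i}\right)$; $\hat{w}_i = N_i\,\hat{p}_i\,(1-\hat{p}_i)$
Transformed model: $\sqrt{\hat{w}_i}\,\hat{L}_i = \beta_1\sqrt{\hat{w}_i} + \beta_2\sqrt{\hat{w}_i}\,X_i + \sqrt{\hat{w}_i}\,u_i$
Applying OLS to the model above yields $\hat{\beta}_1$, $\hat{\beta}_2$.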
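The grouped-data procedure above can be sketched in a few lines of NumPy. The group values `X`, group sizes `N`, and success counts `n` below are hypothetical numbers invented for illustration, and the transformed regression is assumed to use the square root of the weights, as in the usual weighted least squares treatment of grouped logit data.

```python
import numpy as np

# Hypothetical grouped data: at each X_i there are N_i observations,
# of which n_i have Y = 1.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
N = np.array([40, 50, 60, 50, 40])
n = np.array([6, 13, 27, 32, 33])

# Step 1: estimated probabilities, log-odds, and weights.
p_hat = n / N
L_hat = np.log(p_hat / (1.0 - p_hat))
w = N * p_hat * (1.0 - p_hat)
sqrt_w = np.sqrt(w)

# Transformed model: sqrt(w)*L = b1*sqrt(w) + b2*sqrt(w)*X + sqrt(w)*u.
# OLS on the transformed variables, with no additional intercept column.
Z = np.column_stack([sqrt_w, sqrt_w * X])
beta_hat, *_ = np.linalg.lstsq(Z, sqrt_w * L_hat, rcond=None)
b1, b2 = beta_hat

print("b1 =", b1, "b2 =", b2)

# Fitted probability at a chosen X (here X = 3), mapped back from log-odds.
fitted_logit = b1 + b2 * 3.0
print("fitted p at X=3:", np.exp(fitted_logit) / (1.0 + np.exp(fitted_logit)))
```

This approach needs every group to have 0 < n_i < N_i, since the log-odds are undefined when the estimated probability is exactly 0 or 1.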
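We can rewrite this as follows:
$f(y|x_i;\beta) = [G(x_i\beta)]^{y}\,[1-G(x_i\beta)]^{1-y}, \quad y = 0, 1$
$\ell_i(\beta) = y\log[G(x_i\beta)] + (1-y)\log[1-G(x_i\beta)]$
Find the MLE of $\beta$, denoted $\hat{\beta}$, which maximizes the sum of the log-likelihood terms above.
Find $\hat{\beta}$ such that $\ell(\beta) = \sum_i \ell_i(\beta) \to \max$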
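To make the maximization concrete, here is a minimal sketch that maximizes the log-likelihood with Newton-Raphson steps. It assumes G is the logistic cdf (the logit case) and uses synthetic data; the function names, the number of iterations, and the "true" coefficients are illustrative assumptions, not something taken from the lecture.

```python
import numpy as np


def logistic_cdf(z):
    """G(z) for the logit model."""
    return 1.0 / (1.0 + np.exp(-z))


def log_likelihood(beta, X, y):
    """Sum over observations of y*log(G) + (1 - y)*log(1 - G)."""
    G = logistic_cdf(X @ beta)
    return np.sum(y * np.log(G) + (1.0 - y) * np.log(1.0 - G))


def fit_logit_mle(X, y, steps=25):
    """Newton-Raphson iterations on the logit log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        G = logistic_cdf(X @ beta)
        gradient = X.T @ (y - G)
        hessian = -(X * (G * (1.0 - G))[:, None]).T @ X
        beta = beta - np.linalg.solve(hessian, gradient)
    return beta


# Synthetic data with assumed true coefficients (-0.5, 1.0).
rng = np.random.default_rng(1)
n_obs = 400
x = rng.normal(size=n_obs)
X = np.column_stack([np.ones(n_obs), x])
true_beta = np.array([-0.5, 1.0])
y = (rng.uniform(size=n_obs) < logistic_cdf(X @ true_beta)).astype(float)

beta_hat = fit_logit_mle(X, y)
print("beta_hat =", beta_hat)
print("log-likelihood at beta_hat:", log_likelihood(beta_hat, X, y))
```

For a probit model the same likelihood applies with G replaced by the standard normal cdf; only the gradient and Hessian expressions change.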
Probit Model
1. Model
$P(y=1|x) = G(\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k) = G(z)$
$\dfrac{\partial P(y=1|x)}{\partial x_j} = g(z)\,\beta_j$, where $g(z) = \dfrac{dG(z)}{dz}$
+ Some properties of the cdf:
• Its values lie in the interval (0, 1)
• $F(0) = 0.5$; $F(-\infty) = 0$; $F(+\infty) = 1$
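These properties can be checked numerically for the standard normal cdf, which is the function normally used in a probit model. The sketch below builds the cdf from `math.erf` in the Python standard library; the function name `normal_cdf` and the test points are choices made for this example.

```python
import math


def normal_cdf(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print("F(0)   =", normal_cdf(0.0))
print("F(-40) =", normal_cdf(-40.0))  # numerically indistinguishable from 0
print("F(+40) =", normal_cdf(40.0))   # numerically indistinguishable from 1

# The cdf is increasing, so probabilities rise with the index z.
grid = [-3.0, -1.0, 0.0, 1.0, 3.0]
values = [normal_cdf(z) for z in grid]
print(list(zip(grid, values)))
```

In exact arithmetic the values stay strictly between 0 and 1; in floating point the extreme tails round to 0.0 and 1.0.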
Exercise
A survey of 40 students, taken 6 months after graduation, records the variables GS (graduate score) and EN (English grade). Grades are on a 100-point scale.
Y = 1 if the student has a job, Y = 0 if the student has not yet found a job.
Let the "probability of getting a job" be the probability that a student gets a job within 6 months of graduation. Significance level: $\alpha = 5\%$.
Exercise 1: Estimation of the model
Variable Coefficient Std. Error t P-value