Mei-Yuan Chen
Department of Finance
National Chung Hsing University
M.-Y. Chen E370ProbitLogit.tex August 19, 2022 1 / 37
Qualitative Response Data
1 binary data: mode of travel to work (0 for private and 1 for public), default or not
(0 for no default and 1 for default), buy or not (0 for not buying and 1
for buying), attend the class or not (0 for not attending and 1 for attending).
2 multinomial data: degree of agreement (1 = strongly disagree, 2 = disagree, 3 =
no opinion, 4 = agree and 5 = strongly agree), Moody's credit scoring (1 to
16, from the lowest to the highest credit rating).
3 truncated data: wage determination (the observed wage must exceed the statutory
minimum wage).
4 limited data: limits on stock returns (daily price limits on stock price movements).
This provides a very basic motivation for qualitative response and limited
dependent variable models in economics and finance.
Information and the Observational Rule
Observational Rules
Linear Probability Models
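Now consider the conditional distribution of Y given other random
variables X. For a binary Y, the conditional mean of Y given X is defined as
\[
E(Y \mid X = x) = 1 \cdot P(Y = 1 \mid X = x) + 0 \cdot P(Y = 0 \mid X = x) = P(Y = 1 \mid X = x).
\]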
Given a sample of random observations,
$\{(y_1, x_1')', (y_2, x_2')', \ldots, (y_n, x_n')'\}$, a linear probability model is
specified as
\[
y_i = x_i'\beta_0 + e_i, \quad i = 1, \ldots, n.
\]
Then
\[
e_i = -x_i'\beta_0 \ \text{ if } y_i = 0, \qquad e_i = 1 - x_i'\beta_0 \ \text{ if } y_i = 1.
\]
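Using $x_i'\beta_0 = P(y_i = 1 \mid X = x_i)$,
\[
\begin{aligned}
E(e_i) &= (1 - x_i'\beta_0) \cdot P(y_i = 1 \mid X = x_i) + (0 - x_i'\beta_0) \cdot P(y_i = 0 \mid X = x_i) \\
&= [1 - P(y_i = 1 \mid X = x_i)] \cdot P(y_i = 1 \mid X = x_i) \\
&\quad - P(y_i = 1 \mid X = x_i) \cdot [1 - P(y_i = 1 \mid X = x_i)] = 0, \\
\mathrm{var}(e_i) &= E(e_i^2) \\
&= (1 - x_i'\beta_0)^2 \cdot P(y_i = 1 \mid X = x_i) + (0 - x_i'\beta_0)^2 \cdot P(y_i = 0 \mid X = x_i) \\
&= [1 - P(y_i = 1 \mid X = x_i)]^2 \cdot P(y_i = 1 \mid X = x_i) \\
&\quad + [-P(y_i = 1 \mid X = x_i)]^2 \cdot [1 - P(y_i = 1 \mid X = x_i)] \\
&= [1 - P(y_i = 1 \mid X = x_i)] \cdot P(y_i = 1 \mid X = x_i) \cdot \{[1 - P(y_i = 1 \mid X = x_i)] + P(y_i = 1 \mid X = x_i)\} \\
&= [1 - P(y_i = 1 \mid X = x_i)] \cdot P(y_i = 1 \mid X = x_i) \\
&= (1 - x_i'\beta_0)(x_i'\beta_0),
\end{aligned}
\]
which is not constant across $i$: the errors are heteroskedastic.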
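The two moments above can be checked numerically. The following is a minimal Python sketch that evaluates the two-point distribution of $e_i$ directly; the function name `lpm_error_moments` and the probability values are illustrative choices, not part of the slides.

```python
# Check E(e_i) = 0 and var(e_i) = p_i (1 - p_i) for a linear probability model,
# where p_i = x_i' beta_0 = P(y_i = 1 | x_i). The p values below are illustrative.

def lpm_error_moments(p):
    """Return (mean, variance) of e = y - p when y is Bernoulli(p)."""
    # e equals 1 - p with probability p and -p with probability 1 - p
    mean = (1 - p) * p + (0 - p) * (1 - p)
    second_moment = (1 - p) ** 2 * p + (0 - p) ** 2 * (1 - p)
    return mean, second_moment - mean ** 2

for p in (0.1, 0.3, 0.5, 0.9):
    mean, var = lpm_error_moments(p)
    print(f"p = {p:.1f}  E(e) = {mean:.4f}  var(e) = {var:.4f}  p(1-p) = {p * (1 - p):.4f}")
```

The variance changes with $p_i$, and hence with $x_i$, which is the heteroskedasticity noted above.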
Weighted Least Squares Estimator
To obtain regression errors with constant variance, Goldberger (1964)
proposed a two-step weighted estimator for the linear
probability model.
(1) Construct the weight $w_i$ by
\[
w_i = \left[ \frac{1}{(x_i'\hat\beta_n)(1 - x_i'\hat\beta_n)} \right]^{1/2},
\]
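A two-step weighted estimator of this kind might be sketched in Python as follows. Several things here are assumptions rather than content of the slides: $\hat\beta_n$ is taken to be the first-step OLS estimate, the second step is taken to be least squares on the data multiplied by $w_i$, the fitted values are clipped to $[0.01, 0.99]$ so that the weights exist, and the data are simulated with one regressor plus a constant.

```python
import random

def wls_line(x, y, w):
    """Least squares of w_i*y_i on (w_i, w_i*x_i): minimizes sum w_i^2 (y_i - a - b x_i)^2."""
    s = [wi * wi for wi in w]  # squared weights enter the normal equations
    sw = sum(s)
    sx = sum(si * xi for si, xi in zip(s, x))
    sy = sum(si * yi for si, yi in zip(s, y))
    sxx = sum(si * xi * xi for si, xi in zip(s, x))
    sxy = sum(si * xi * yi for si, xi, yi in zip(s, x, y))
    b = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    a = (sy - b * sx) / sw
    return a, b

random.seed(0)
n = 2000
x = [random.uniform(0.0, 1.0) for _ in range(n)]
# Simulated data: P(y = 1 | x) = 0.2 + 0.6 x, which stays inside (0, 1)
y = [1.0 if random.random() < 0.2 + 0.6 * xi else 0.0 for xi in x]

# Step 1 (assumed): OLS, i.e. all weights equal to one
a_ols, b_ols = wls_line(x, y, [1.0] * n)

# Weights w_i = [1 / (fit_i (1 - fit_i))]^(1/2); clipping is a choice made here
fit = [min(max(a_ols + b_ols * xi, 0.01), 0.99) for xi in x]
w = [(1.0 / (f * (1.0 - f))) ** 0.5 for f in fit]

# Step 2 (assumed): least squares on the weighted data
a_wls, b_wls = wls_line(x, y, w)
print("OLS:", a_ols, b_ols)
print("WLS:", a_wls, b_wls)
```

Fitted values outside $(0, 1)$ make the weight undefined, which is why some rule such as clipping or dropping observations is needed in practice.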
Logit Transformation
Take the exponential of both sides of the above equation:
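Given a sample of random observations, $\{(y_i, x_i), i = 1, \ldots, n\}$, the likelihood
function is
\[
\begin{aligned}
L(y \mid x, \beta) &= \prod_{i=1}^{n} p_i^{y_i} \times (1 - p_i)^{1 - y_i} \\
&= \prod_{i=1}^{n} \left\{ \left[ \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} \right]^{y_i} \left[ 1 - \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} \right]^{1 - y_i} \right\},
\end{aligned}
\]
and the log-likelihood function is
\[
\log L(y \mid x, \beta) = \sum_{i=1}^{n} \left\{ y_i \log\left[ \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} \right] + (1 - y_i) \log\left[ 1 - \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} \right] \right\}.
\]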
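To make the logit log-likelihood concrete, here is a minimal Python sketch that evaluates it for a given $\beta$. The data set, the function names `logit_prob` and `logit_loglik`, and the trial parameter values are hypothetical and chosen only for illustration.

```python
import math

def logit_prob(xb):
    """p = exp(x'b) / (1 + exp(x'b)), written to avoid overflow for large |x'b|."""
    if xb >= 0:
        return 1.0 / (1.0 + math.exp(-xb))
    e = math.exp(xb)
    return e / (1.0 + e)

def logit_loglik(beta, X, y):
    """Sum over i of y_i log p_i + (1 - y_i) log(1 - p_i)."""
    total = 0.0
    for xi, yi in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, xi))
        p = logit_prob(xb)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# Hypothetical data: the first column is the constant term
X = [(1.0, -1.5), (1.0, -0.5), (1.0, 0.0), (1.0, 0.7), (1.0, 1.2), (1.0, 2.0)]
y = [0, 0, 1, 0, 1, 1]

print(logit_loglik((0.0, 0.0), X, y))   # beta = 0 gives p_i = 1/2 for every i
print(logit_loglik((-0.2, 1.0), X, y))  # a trial value with a positive slope
```

A maximum likelihood routine would search over `beta` for the largest value of `logit_loglik`; the sketch only evaluates the objective.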
Probit Transformation
\[
p_i = \Phi(x_i'\beta_0) = \int_{-\infty}^{x_i'\beta_0} \frac{1}{\sqrt{2\pi}} \exp(-u^2/2)\, du.
\]
\[
\log L(y \mid x, \beta) = \sum_{i=1}^{n} \{ y_i \log[\Phi(x_i'\beta)] + (1 - y_i) \log[1 - \Phi(x_i'\beta)] \}.
\]
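As an illustration of how the probit log-likelihood could be maximized, the sketch below evaluates it on a coarse grid of parameter values and keeps the best point. This is a deliberately crude stand-in for a proper optimizer; the data, the grid, the clipping of probabilities away from 0 and 1, and the function names are all assumptions made for the example.

```python
import math

def norm_cdf(z):
    """Standard normal CDF Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_loglik(beta, X, y):
    """Sum over i of y_i log Phi(x_i'b) + (1 - y_i) log(1 - Phi(x_i'b))."""
    total = 0.0
    for xi, yi in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, xi))
        # Clip so that log() stays finite at extreme grid points (a choice made here)
        p = min(max(norm_cdf(xb), 1e-12), 1.0 - 1e-12)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# Hypothetical data: the first column is the constant term
X = [(1.0, -1.5), (1.0, -0.5), (1.0, 0.0), (1.0, 0.7), (1.0, 1.2), (1.0, 2.0)]
y = [0, 0, 1, 0, 1, 1]

# Coarse grid search over (beta_0, beta_1) in [-3, 3] with step 0.1
grid = [k / 10.0 for k in range(-30, 31)]
best = max(((b0, b1) for b0 in grid for b1 in grid),
           key=lambda b: probit_loglik(b, X, y))
print("grid maximizer:", best, "log-likelihood:", probit_loglik(best, X, y))
```

In practice a Newton-type method would replace the grid, since the probit log-likelihood is globally concave in $\beta$.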
Marginal Effect on the Probability
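For the probit model,
\[
\frac{\partial P(y_i = 1)}{\partial x_{ij}} = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(x_i'\beta_0)^2}{2} \right) \beta_{0j} = \phi(x_i'\beta_0)\beta_{0j},
\]
and for the logit model, using the derivative
\[
\frac{d}{dx}\left[ e^{ax+b} (1 + e^{ax+b})^{-1} \right] = a e^{ax+b} (1 + e^{ax+b})^{-1} + (-1)\, a e^{ax+b} (1 + e^{ax+b})^{-2} e^{ax+b},
\]
we have
\[
\begin{aligned}
\frac{\partial P(y_i = 1)}{\partial x_{ij}}
&= \beta_{0j} \left[ \frac{\exp(x_i'\beta_0)}{1 + \exp(x_i'\beta_0)} \right] - \beta_{0j} \left[ \frac{\exp(x_i'\beta_0)}{(1 + \exp(x_i'\beta_0))^2} \right] \exp(x_i'\beta_0) \\
&= \beta_{0j} \left[ \frac{\exp(x_i'\beta_0)}{1 + \exp(x_i'\beta_0)} \right] \left[ 1 - \frac{\exp(x_i'\beta_0)}{1 + \exp(x_i'\beta_0)} \right] \\
&= P(y_i = 1)[1 - P(y_i = 1)]\beta_{0j}.
\end{aligned}
\]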
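The two marginal-effect formulas can be verified against a numerical derivative. The Python sketch below does this at one hypothetical point; the coefficient vector, the regressor values and the function names are illustrative assumptions.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def probit_marginal_effect(xb, beta_j):
    """phi(x'b) * beta_j."""
    return norm_pdf(xb) * beta_j

def logit_marginal_effect(xb, beta_j):
    """P(y = 1) * [1 - P(y = 1)] * beta_j."""
    p = logistic_cdf(xb)
    return p * (1.0 - p) * beta_j

# Hypothetical coefficients and regressors (first regressor is the constant)
beta = (0.5, -1.2)
x = (1.0, 0.8)
xb = sum(b * v for b, v in zip(beta, x))

# Central finite difference with respect to the second regressor
h = 1e-6
xb_up = beta[0] * x[0] + beta[1] * (x[1] + h)
xb_dn = beta[0] * x[0] + beta[1] * (x[1] - h)
fd_probit = (norm_cdf(xb_up) - norm_cdf(xb_dn)) / (2.0 * h)
fd_logit = (logistic_cdf(xb_up) - logistic_cdf(xb_dn)) / (2.0 * h)

print("probit:", probit_marginal_effect(xb, beta[1]), fd_probit)
print("logit: ", logit_marginal_effect(xb, beta[1]), fd_logit)
```

Both effects depend on $x_i$ through $x_i'\beta_0$, so they are usually reported at the sample mean of the regressors or averaged over the sample.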
Ordered Probit/Logit Models
The ordered probit/logit models apply when the outcomes of the dependent
variable have a natural (ordinal) ranking (i.e., the responses can be
ordered in some meaningful fashion). Consider a random sample,
$\{(y_i, x_i), i = 1, \ldots, n\}$, where $y_i = m$ for $m = 1, \ldots, M$ with a
natural ordering (that is, $m + 1$ is in some sense better than $m$). The
observed values are assumed to derive from some unobservable latent
variable $y_i^*$, where
\[
y_i^* = x_i'\beta_0 + u_i, \quad i = 1, \ldots, n,
\]
with
\[
y_i = m \quad \text{if } \alpha_{m-1} < y_i^* \le \alpha_m, \quad m = 1, \ldots, M,
\]
where $\alpha_0 = -\infty$ and $\alpha_M = \infty$.
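Then, the conditional probability of observing the $m$th category (i.e.,
$y_i = m$) can be written as
\[
P(y_i = m \mid x_i) = P(\alpha_{m-1} < y_i^* \le \alpha_m \mid x_i) = F(\alpha_m - x_i'\beta_0) - F(\alpha_{m-1} - x_i'\beta_0)
\]
for $m = 1, \ldots, M$, where $F$ is the distribution function of $u_i$ (the standard normal $\Phi$ for the ordered probit model and the logistic distribution function for the ordered logit model).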
Ordered Probit/Logit Model
Likelihood Function of Ordered Probit Model
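Denote $z_{im} = 1(y_i = m)$ for $m = 1, \ldots, M$ as the indicator function for the $i$th
observation $y_i = m$. The likelihood of the $i$th observation is
\[
l_i = \prod_{m=1}^{M} P(y_i = m \mid x_i)^{z_{im}} = \prod_{m=1}^{M} \left[ \Phi(\alpha_m - x_i'\beta) - \Phi(\alpha_{m-1} - x_i'\beta) \right]^{z_{im}},
\]
and the log-likelihood function of the sample is
\[
l(\alpha, \beta) = \sum_{i=1}^{n} \sum_{m=1}^{M} z_{im} \ln\left[ \Phi(\alpha_m - x_i'\beta) - \Phi(\alpha_{m-1} - x_i'\beta) \right].
\]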
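The ordered probit log-likelihood can be evaluated with a few lines of Python. In the sketch below the cut points, coefficient, data and function names are hypothetical; categories are coded $1, \ldots, M$, and $\alpha_0 = -\infty$, $\alpha_M = \infty$ are appended inside the function.

```python
import math

def norm_cdf(z):
    """Standard normal CDF with the infinite end points handled explicitly."""
    if z == math.inf:
        return 1.0
    if z == -math.inf:
        return 0.0
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_probs(xb, cuts):
    """P(y = m | x) = Phi(alpha_m - x'b) - Phi(alpha_{m-1} - x'b), m = 1..M.

    cuts holds the interior thresholds (alpha_1, ..., alpha_{M-1}).
    """
    alphas = [-math.inf] + list(cuts) + [math.inf]
    return [norm_cdf(alphas[m] - xb) - norm_cdf(alphas[m - 1] - xb)
            for m in range(1, len(alphas))]

def ordered_probit_loglik(beta, cuts, X, y):
    """Sum over i of the log probability of the observed category."""
    total = 0.0
    for xi, yi in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, xi))
        # z_im selects the single term with m = y_i
        total += math.log(category_probs(xb, cuts)[yi - 1])
    return total

# Hypothetical example with M = 3 categories and one regressor (no constant,
# since the cut points absorb the intercept)
cuts = (-0.5, 0.8)
beta = (1.0,)
X = [(-1.2,), (-0.3,), (0.1,), (0.9,), (1.7,)]
y = [1, 2, 2, 3, 3]

print(category_probs(0.1, cuts))
print(ordered_probit_loglik(beta, cuts, X, y))
```

The category probabilities sum to one for any $x_i'\beta$ because the differences of $\Phi$ telescope between $\alpha_0$ and $\alpha_M$.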
MLE for Ordered Probit Model
Measures for Goodness-of-Fit
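(1) Squared correlation between $y$ and $\hat y$:
\[
R^2 = \frac{\left[ \sum_{i=1}^{n} (y_i - \bar y_n)(\hat y_i - \bar y_n) \right]^2}{\left[ \sum_{i=1}^{n} (y_i - \bar y_n)^2 \right] \left[ \sum_{i=1}^{n} (\hat y_i - \bar y_n)^2 \right]}.
\]
Measures based on the squared residuals:
\[
R^2 = 1 - \frac{n}{n_1 n_2} \sum_{i=1}^{n} (y_i - \hat y_i)^2 \quad \text{(Efron's measure)},
\]
\[
R^2 = 1 - \left[ \frac{\sum_{i=1}^{n} (y_i - \hat y_i)^2}{\sum_{i=1}^{n} \hat y_i (1 - \hat y_i)} \right] \quad \text{(Amemiya's measure)},
\]
where $n_1$ and $n_2 = n - n_1$ are the numbers of observations with $y_i = 1$ and $y_i = 0$, so that $\sum_{i=1}^{n} (y_i - \bar y_n)^2 = n_1 n_2 / n$.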
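The squared correlation and Efron's measure are simple to compute from the observed $y_i$ and fitted probabilities $\hat y_i$. The Python sketch below uses hypothetical fitted values, and takes $n_1$ and $n_2$ to be the counts of ones and zeros in $y$ (the usual reading of Efron's measure, stated here as an assumption).

```python
def squared_correlation(y, yhat):
    """[sum (y - ybar)(yhat - ybar)]^2 / ([sum (y - ybar)^2][sum (yhat - ybar)^2])."""
    n = len(y)
    ybar = sum(y) / n
    num = sum((yi - ybar) * (fi - ybar) for yi, fi in zip(y, yhat)) ** 2
    den = (sum((yi - ybar) ** 2 for yi in y)
           * sum((fi - ybar) ** 2 for fi in yhat))
    return num / den

def efron_r2(y, yhat):
    """1 - (n / (n1 n2)) * sum (y - yhat)^2, with n1 ones and n2 zeros in y."""
    n = len(y)
    n1 = sum(1 for yi in y if yi == 1)
    n2 = n - n1
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    return 1.0 - n / (n1 * n2) * ssr

# Hypothetical outcomes and fitted probabilities
y = [0, 0, 1, 0, 1, 1]
yhat = [0.10, 0.25, 0.45, 0.60, 0.75, 0.90]

print("squared correlation:", squared_correlation(y, yhat))
print("Efron's measure:    ", efron_r2(y, yhat))
```

Since $\sum_i (y_i - \bar y_n)^2 = n_1 n_2 / n$ for binary data, Efron's measure is one minus the ratio of the residual sum of squares to the total sum of squares.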
(3) Likelihood ratio:
where $y_i^{S*}$ is the realization of the latent value of the selection
"tendency" for individual $i$ and $y_i^{O*}$ is the latent outcome. $x_i^S$ and
$x_i^O$ are explanatory variables for the selection and outcome equations,
respectively. $x^S$ and $x^O$ may or may not be equal.
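We observe that
\[
y_i^S = \begin{cases} 0 & \text{if } y_i^{S*} < 0 \\ 1 & \text{otherwise} \end{cases} \tag{3}
\]
\[
y_i^O = \begin{cases} 0 & \text{if } y_i^S = 0 \\ y_i^{O*} & \text{otherwise} \end{cases} \tag{4}
\]
i.e. we observe the outcome only if the latent selection variable $y_i^{S*}$ is positive.
The observed dependence between $y^O$ and $x^O$ can now be written as
\[
E[y^O \mid x^O = x_i^O, x^S = x_i^S, y^S = 1] = x_i^{O\prime}\beta^O + E[\epsilon^O \mid \epsilon^S \ge -x_i^{S\prime}\beta^S]. \tag{5}
\]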
Sample Selection Bias
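Given $(y_i^* \mid X = x_i) \sim N(x_i'\beta_0, \sigma_0^2)$ and $\epsilon_i = (y_i^* - x_i'\beta_0)/\sigma_0 \sim N(0, 1)$, the density
function of $\epsilon_i$ is
\[
\begin{aligned}
f_\epsilon(\epsilon_i) &= \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{\epsilon_i^2}{2} \right) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} \left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right)^2 \right) \\
&= \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(y_i^* - x_i'\beta_0)^2}{2\sigma_0^2} \right) = \sigma_0 \left[ \frac{1}{\sqrt{2\pi}\sigma_0} \exp\left( -\frac{(y_i^* - x_i'\beta_0)^2}{2\sigma_0^2} \right) \right] \\
&= \sigma_0 f_{y_i^* - x_i'\beta_0}(y_i^* - x_i'\beta_0) = \phi\left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right).
\end{aligned}
\]
Therefore,
\[
f_{y_i^* - x_i'\beta_0}(y_i^* - x_i'\beta_0) = f_{y_i^*}(y_i^*) = \frac{1}{\sigma_0} \phi\left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right).
\]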
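Then, $E(y_i \mid x_i, y_i > 0) = E(y_i^* \mid x_i, y_i^* > 0)$. As $y_i^* \sim N(x_i'\beta_0, \sigma_0^2)$ and
$f_{y_i^*}(y_i^*) = \frac{1}{\sigma_0} \phi\left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right)$, we have
\[
f(y_i^* \mid y_i^* > 0) = \frac{f_{y_i^*}(y_i^*)}{P(y_i^* \ge 0)} = \frac{\frac{1}{\sigma_0} \phi\left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right)}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)}.
\]
Then,
\[
\begin{aligned}
E(y_i^* \mid x_i, y_i^* > 0) &= \int_0^\infty y_i^* \, \frac{\frac{1}{\sigma_0} \phi\left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right)}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)} \, dy_i^* \\
&= \int_0^\infty \frac{\frac{y_i^* - x_i'\beta_0}{\sigma_0} \phi\left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right) + \frac{x_i'\beta_0}{\sigma_0} \phi\left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right)}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)} \, dy_i^*.
\end{aligned}
\]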
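Let $z = (y_i^* - x_i'\beta_0)/\sigma_0$, so that $\sigma_0\, dz = dy_i^*$. The first and second parts of the integral are then solved as follows.
(1):
\[
\begin{aligned}
\frac{1}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)} \int_0^\infty \frac{y_i^* - x_i'\beta_0}{\sigma_0} \phi\left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right) dy_i^*
&= \frac{\sigma_0}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)} \int_{-x_i'\beta_0/\sigma_0}^{\infty} z \phi(z)\, dz \\
&= \frac{\sigma_0}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)} \int_{-x_i'\beta_0/\sigma_0}^{\infty} -\frac{d\phi(z)}{dz}\, dz \\
&= \frac{\sigma_0}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)} \left[ -\phi(\infty) + \phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right) \right] \\
&= \frac{\sigma_0 \phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)} = \frac{\sigma_0 \phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)}{\Phi\left( \frac{x_i'\beta_0}{\sigma_0} \right)},
\end{aligned}
\]
where the second equality uses $d\phi(z)/dz = -z\phi(z)$, which follows from $d\exp(g(z))/dz = \exp(g(z))[dg(z)/dz]$.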
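(2):
\[
\begin{aligned}
\frac{x_i'\beta_0}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)} \int_0^\infty \frac{1}{\sigma_0} \phi\left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right) dy_i^*
&= \frac{x_i'\beta_0}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)} \int_0^\infty \frac{d\Phi\left( \frac{y_i^* - x_i'\beta_0}{\sigma_0} \right)}{dy_i^*}\, dy_i^* \\
&= \frac{x_i'\beta_0}{1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right)} \left( 1 - \Phi\left( \frac{-x_i'\beta_0}{\sigma_0} \right) \right) \\
&= x_i'\beta_0.
\end{aligned}
\]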
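Adding parts (1) and (2) gives $E(y_i^* \mid x_i, y_i^* > 0) = x_i'\beta_0 + \sigma_0 \phi(x_i'\beta_0/\sigma_0)/\Phi(x_i'\beta_0/\sigma_0)$, using the symmetry of $\phi$. The Python sketch below compares this closed form with a direct midpoint-rule integration of $y^* f(y^* \mid y^* > 0)$; the values of `mu` (standing in for $x_i'\beta_0$) and `sigma`, the integration range and the function names are illustrative assumptions.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncated_mean_formula(mu, sigma):
    """mu + sigma * phi(mu/sigma) / Phi(mu/sigma), the sum of parts (1) and (2)."""
    return mu + sigma * norm_pdf(mu / sigma) / norm_cdf(mu / sigma)

def truncated_mean_numeric(mu, sigma, upper_sd=12.0, steps=100000):
    """Midpoint-rule approximation of the integral of y f(y | y > 0) over (0, inf)."""
    upper = max(mu, 0.0) + upper_sd * sigma  # the tail beyond this point is negligible
    h = upper / steps
    denom = 1.0 - norm_cdf(-mu / sigma)      # P(y* >= 0)
    total = 0.0
    for k in range(steps):
        yk = (k + 0.5) * h
        total += yk * norm_pdf((yk - mu) / sigma) / sigma
    return total * h / denom

for mu, sigma in ((0.5, 2.0), (-1.0, 1.0), (3.0, 0.5)):
    print(mu, sigma, truncated_mean_formula(mu, sigma), truncated_mean_numeric(mu, sigma))
```

The second term is always positive, so the mean of the observations with $y_i > 0$ exceeds $x_i'\beta_0$.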
The unconditional mean of yi is,
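Estimating the model above by OLS gives in general biased results,
as $E[\epsilon^O \mid \epsilon^S \ge -x_i^{S\prime}\beta^S] \ne 0$.
Assuming the error terms follow a bivariate normal distribution:
\[
\begin{pmatrix} \epsilon^S \\ \epsilon^O \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & \sigma^2 \end{pmatrix} \right) \tag{6}
\]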
Sample Selection Bias Models
where $I(\cdot)$ is an indicator function: $I(\cdot) = 1$ if the argument is true and
$I(\cdot) = 0$ otherwise. $y_{2i}$ is unobservable; only $d_i$ is observable. Meanwhile, $y_{1i}$ is observable
only when $d_i = 1$; when $d_i = 0$, $y_{1i}$ is recorded as zero. In the literature, two
estimation methods have been developed for the sample selection model: the Heckman
two-stage method and the maximum likelihood method.
Heckman Two-Stage Estimator
Therefore, for the observations with di = 1, (7) can be represented as
Heckman’s Two-stage Procedures
1 Use the MLE (8) to estimate the parameter $\alpha$ in a probit model. Denote
the estimate as $\hat\alpha$ and substitute it into (11) to obtain
\[
\hat\lambda_i = \lambda(w_i'\hat\alpha), \quad i = 1, \ldots, n.
\]
2 Using the observations with $d_i = 1$, regress $y_{1i}$ on $x_i$ and $\hat\lambda_i$
by least squares. Denote by $\hat\beta$ and $\hat\sigma_{uv}$ the resulting estimates of
$\beta$ and $\sigma_{uv}$. The parameter $\sigma_u^2$ can be estimated by the following method
suggested by Heckman (1979):
\[
\hat\sigma_u^2 = \frac{1}{n} \sum_{i=1}^{n} \left[ \hat e_i^2 + \hat\sigma_{uv} \hat\lambda_i \right], \tag{13}
\]
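The first stage produces one $\hat\lambda_i$ per observation. The Python sketch below computes it under the assumption that $\lambda(\cdot)$ in (11) is the inverse Mills ratio $\lambda(c) = \phi(c)/\Phi(c)$, which is the usual form in Heckman's procedure but is not shown in this section. The estimate `alpha_hat`, the rows of `W` and the function names are hypothetical.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c); assumed here to be the function in (11)."""
    return norm_pdf(c) / norm_cdf(c)

# Hypothetical first-stage probit estimate and selection regressors
# (first column is the constant term)
alpha_hat = (0.3, 0.8)
W = [(1.0, -1.0), (1.0, -0.2), (1.0, 0.5), (1.0, 1.4)]

lambda_hat = []
for wi in W:
    index = sum(a * v for a, v in zip(alpha_hat, wi))  # w_i' alpha_hat
    lambda_hat.append(inverse_mills(index))

for wi, lam in zip(W, lambda_hat):
    print(wi, round(lam, 4))
```

In the second stage, `lambda_hat` would be added as an extra regressor for the observations with $d_i = 1$; its coefficient is the estimate of $\sigma_{uv}$.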
The performance of $\hat\beta$ and $\hat\sigma_{uv}$ clearly depends on the
performance of $\hat\lambda_i$. If $w_i$ and $x_i$ are highly correlated, a severe
multicollinearity problem arises between $\hat\lambda_i$ and $x_i$, so the Heckman
two-stage estimator will not perform well. This issue is discussed in
Olsen (1980), Nawata (1993) and Leung and Yu (1996, 2000).
Maximum Likelihood Estimator
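Under the assumption of normal distribution as in (9), the log-likelihood function of
(7) is
\[
\begin{aligned}
\log L = \sum_{i=1}^{n} \Big\{ & (1 - d_i) \log[1 - \Phi(w_i'\alpha)] \\
& + d_i \Big[ \log \Phi\{ [w_i'\alpha + \sigma_{uv}\sigma_u^{-2}(y_{1i} - x_i'\beta)][1 - \sigma_{uv}^2/\sigma_u^2]^{-1/2} \} \\
& \qquad - \log \sigma_u + \log \phi[\sigma_u^{-1}(y_{1i} - x_i'\beta)] \Big] \Big\}.
\end{aligned}
\]
Equivalently, with $\rho = \sigma_{uv}/\sigma_u$,
\[
\begin{aligned}
\log L = \sum_{i=1}^{n} \Big\{ & (1 - d_i) \log[1 - \Phi(w_i'\alpha)] \\
& + d_i \Big[ \log \Phi\{ [w_i'\alpha + \rho\sigma_u^{-1}(y_{1i} - x_i'\beta)][1 - \rho^2]^{-1/2} \} \\
& \qquad - \log \sigma_u + \log \phi[\sigma_u^{-1}(y_{1i} - x_i'\beta)] \Big] \Big\}. \tag{14}
\end{aligned}
\]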
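The two forms of the log-likelihood can be coded and compared directly. The Python sketch below makes two assumptions: the last term is the log of the standard normal density $\phi$ (the density of the observed outcome), and the two forms are linked by $\rho = \sigma_{uv}/\sigma_u$. The data, parameter values and function names are hypothetical.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def selection_loglik_cov(alpha, beta, sigma_u, sigma_uv, W, X, y1, d):
    """Log-likelihood written in terms of sigma_uv and sigma_u."""
    total = 0.0
    for wi, xi, yi, di in zip(W, X, y1, d):
        wa = dot(alpha, wi)
        if di == 0:
            total += math.log(1.0 - norm_cdf(wa))
        else:
            resid = yi - dot(beta, xi)
            arg = ((wa + sigma_uv / sigma_u ** 2 * resid)
                   * (1.0 - sigma_uv ** 2 / sigma_u ** 2) ** -0.5)
            total += (math.log(norm_cdf(arg)) - math.log(sigma_u)
                      + math.log(norm_pdf(resid / sigma_u)))  # density term (assumed phi)
    return total

def selection_loglik_rho(alpha, beta, sigma_u, rho, W, X, y1, d):
    """Log-likelihood written in terms of rho, as in (14)."""
    total = 0.0
    for wi, xi, yi, di in zip(W, X, y1, d):
        wa = dot(alpha, wi)
        if di == 0:
            total += math.log(1.0 - norm_cdf(wa))
        else:
            r = (yi - dot(beta, xi)) / sigma_u
            arg = (wa + rho * r) * (1.0 - rho ** 2) ** -0.5
            total += math.log(norm_cdf(arg)) - math.log(sigma_u) + math.log(norm_pdf(r))
    return total

# Hypothetical data: y1 is recorded as zero when d = 0
W = [(1.0, 0.2), (1.0, -0.7), (1.0, 1.1), (1.0, 0.4), (1.0, -1.3)]
X = [(1.0, 0.5), (1.0, -0.1), (1.0, 1.6), (1.0, 0.9), (1.0, -0.8)]
y1 = [1.4, 0.0, 2.9, 1.1, 0.0]
d = [1, 0, 1, 1, 0]
alpha, beta, sigma_u, rho = (0.2, 0.9), (0.5, 1.2), 1.5, 0.4

ll_rho = selection_loglik_rho(alpha, beta, sigma_u, rho, W, X, y1, d)
ll_cov = selection_loglik_cov(alpha, beta, sigma_u, rho * sigma_u, W, X, y1, d)
print(ll_rho, ll_cov)
```

Setting `rho = 0` splits the function into a probit log-likelihood for $d_i$ plus a normal regression log-likelihood for the selected observations.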