
3/31/2023

Logit Models:
Nonstop fun for the whole family
L9
“Gee dad, after dinner will you show us how to do logit models?”
“Hey I want to know about ordered Logit Models”
“Dad, I want to learn about Multinomial Logit models!!!”
“But I want Conditional Logit Models”

Logistic regression
• A version of the multiple regression in which the outcome (i.e., the DV) is a categorical variable.
– If the # of categories = 2, then it is called binary logistic regression (or simply a logit model)
– If the # of categories > 2, then it is called multinomial logistic regression (or multinomial logit model)

Multiple regression (MR) vs. logistic regression (LR)
• MR: Predicting value of Y for a given value of X…
• LR: Predict the probability of Y occurring given known values of X (or X’s)
– Linear probability models
– Logit models
– Binomial logit models

Form of Logit Relationship b/t DV & IV
[Figure: Prob of Event (DV), from 0 to 1.0, plotted against Level of Independent variable, from Low to High]
Hair et al, 4th ed; p. 131


Prob(event) / Prob(no event) = e^(B0 + B1X1 + … + BnXn)
– “e” is the base of the natural log (~2.71…)
– The estimated coefficients of B0, B1, …, Bn are
– ODDS RATIO - Measures the change in the ratio of the probabilities.

Dad, why can’t we use ordinary linear regression?
• Model needs to be linear…violated with categorical variable.
• Values are constrained to 0-1 …with a linear regression, predicted values of Y may exceed 1 or be less than 0. (See diagram)
• To deal with this problem…
– Transform the data to express nonlinear relationships in a linear form
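The out-of-range problem can be seen with a small numeric sketch. The intercept and slope below are hypothetical values chosen only for illustration; they are not estimates from these slides.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
B0 = -0.5
B1 = 0.2

def linear_prediction(x):
    """Linear probability model: the prediction is B0 + B1*x."""
    return B0 + B1 * x

def logistic_prediction(x):
    """Logit model: pass the same linear index through the logistic function."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * x)))

# Compare the two predictions at a few values of X.
for x in (0, 5, 10):
    print(x, round(linear_prediction(x), 3), round(logistic_prediction(x), 3))
```

With these assumed values the linear prediction falls below 0 at X = 0 and above 1 at X = 10, while the logistic prediction stays between 0 and 1.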

For coefficients
• Take natural log of both sides of the equation
• → Log odds ratio
Ln(Prob(event) / Prob(no event)) = B0 + B1X1 + … + BnXn

Dad…I’m kinda confused about the odds ratio thing
• Probability of success of some event is 0.8
• Then the probability of failure is 1 - 0.8 = 0.2
• Odds of success = Prob of success / Prob of failure
• Odds of success = .8/.2 = 4
• Also called the Odds Ratio (OR)
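The arithmetic in the 0.8 example can be checked with a few lines of Python. This is only a sketch; the function name `odds` is introduced here for illustration.

```python
import math

def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)

p_success = 0.8
o = odds(p_success)      # approximately 4, as on the slide
log_odds = math.log(o)   # the logit: natural log of the odds
print(round(o, 4), round(log_odds, 4))
```

The log of the odds is the quantity that the logit model sets equal to B0 + B1X1 + … + BnXn.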


Gee dad, how do I interpret the sign?
• If reporting coefficients
– A positive coefficient → increases the probability or likelihood of Y = 1.
– A negative coefficient → decreases the likelihood of Y = 1.
• If reporting the odds ratio
– A value greater than 1 → increases the odds in favor of Y = 1
– A value less than 1 (but positive…it will always be positive) → decreases the odds in favor of Y = 1

Odds Ratios versus Coefficients
Odds Ratio    Coefficient = Ln(Odds Ratio)
1.002267      0.00226
2.234545      0.80404
0.9           -0.10536
0.5           -0.69315

In other words…
• Odds ratio = 1.0 → associated with a regression coefficient of zero (B = 0)…indicating an absence of a relationship.
• Odds ratio > 1.0 → corresponds to a positive B (regression) coefficient…reflecting an increase in the odds of being in the case category associated with each unit increase in X.
• Odds ratio < 1.0 → corresponds to a negative B (regression) coefficient…reflecting a decrease in the odds of being in the case category associated with each unit increase in X.

A Fun Logit Example
• Predicting the Probability of promotion to Associate Professor
– Logit (promotion) = B0 + B1*Publications = .39*publications – 6.00; where B1 = .39 and B0 = -6.00
– The .39 implies that the “predicted logit” increases by 0.39 for each increase of one publication
• Odds (promotion) = e^(.39*publications – 6.00)
• Probability (promotion) = 1 / (1 + e^-(.39*publications – 6.00))
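The promotion example can be traced numerically. The sketch below uses the slide's coefficients (B0 = -6.00, B1 = .39); the publication counts fed in are hypothetical inputs, and the function names are introduced here for illustration.

```python
import math

B0 = -6.00   # intercept from the slide
B1 = 0.39    # coefficient on publications from the slide

def logit(publications):
    """Predicted log odds of promotion."""
    return B0 + B1 * publications

def odds(publications):
    """Predicted odds of promotion: e raised to the logit."""
    return math.exp(logit(publications))

def probability(publications):
    """Predicted probability; equivalent to odds / (1 + odds)."""
    return 1.0 / (1.0 + math.exp(-logit(publications)))

# Publication counts below are hypothetical inputs.
for pubs in (5, 10, 15, 20):
    print(pubs, round(logit(pubs), 2), round(odds(pubs), 3), round(probability(pubs), 3))
```

Each additional publication multiplies the odds by e^0.39 (about 1.48) regardless of the starting count, while the change in probability depends on where on the curve you start.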


Another Really Fun Logit Example
• DV = Predicting compliance with Mammogram Screening guidelines (1=compliance; 0=no compliance)

Variable    B        Exp(B) (Odds ratio)
PHYSREC     1.842    6.311
KNOWLEDG    -.079    .924
BENEFITS    .544     1.722
BARRIERS    -.581    .559
Const.      -3.051

• Interpretation
– For a physician’s recommendation (c.p.), B = 1.842 → higher likelihood of compliance.
– The odds ratio is computed using the exponential function → e^1.842 = 6.311 (“e” equals about 2.718).
– So the odds of compliance with a physician’s recommendation increase by a factor of over 6!
– Alternatively, perceived barriers (B = -.581) corresponds to an odds ratio of e^-.581 = .559. So the odds of compliance with perceived barriers are multiplied by a factor of .559 (i.e., they decrease).

Dad, you’re the best!!! But how do I talk about it in a paper?
• Hypotheses
– H1: X1 increases the likelihood of Y
– H2: X2 decreases the likelihood of Y
• Recall that it is a logistic relationship…
– Publications increase the likelihood of receiving tenure.
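The odds ratios in the mammogram table can be reproduced by exponentiating each coefficient. The sketch below uses the B values from the table; the percent-change column is an added convenience, not something reported in the slides. Small differences in the third decimal come from the coefficients being rounded.

```python
import math

# Coefficients (B) from the mammogram compliance table.
coefficients = {
    "PHYSREC": 1.842,
    "KNOWLEDG": -0.079,
    "BENEFITS": 0.544,
    "BARRIERS": -0.581,
}

# Odds ratio = exp(B); percent change in odds = (exp(B) - 1) * 100.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, ratio in odds_ratios.items():
    b = coefficients[name]
    print(f"{name:9s} B={b:+.3f}  OR={ratio:.3f}  change in odds={100 * (ratio - 1):+.1f}%")
```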

Odds Ratio & Ln(Odds Ratio)
• X1 – Gender dummy (Male = 1; female = 0)
• DV = Admittance to a particular graduate program
• Assume B1 coefficient = 1.694596
– Interpretation → a one unit change in gender (i.e., being male) results in a 1.694596 unit change in the log odds of being admitted.
• Odds ratio → e^1.694596 = 5.44
– Meaning = The odds of being admitted increase by a factor of 5.44 for males.

Assessing the model
• Likelihood value: Similar to sum of squared errors…in that it is an indicator of how much unexplained information there is after the model has been fitted.
– It is reported as -2 times the log of the likelihood value
– -2LL or -2 Log likelihood
– Well fitted model → -2LL has small value.
– Perfect fit → likelihood = 1; -2LL = 0
• LL…Compare (new model) vs. a baseline model (only a constant…that is, it assumes all coefficients equal “0”)…
– R^2 Logit = (-2LLnull – (-2LLmodel)) / (-2LLnull)
– The difference in -2LL between the two models is tested against a chi square distribution
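As a worked check of the R^2 Logit formula, the sketch below plugs in the two log pseudolikelihood values from the Stata admissions example in these slides (iteration 0 for the constant-only model, the final iteration for the fitted model). The variable names are introduced here for illustration.

```python
# Log likelihoods taken from the Stata iteration log in these slides:
# iteration 0 is the constant-only (null) model, the last iteration is the fitted model.
ll_null = -249.98826
ll_model = -229.25875

neg2ll_null = -2 * ll_null
neg2ll_model = -2 * ll_model

# R^2 Logit = (-2LLnull - (-2LLmodel)) / (-2LLnull)
pseudo_r2 = (neg2ll_null - neg2ll_model) / neg2ll_null

# Reduction in -2LL; this is the likelihood-ratio chi-square statistic.
lr_chi2 = neg2ll_null - neg2ll_model

print(round(pseudo_r2, 4), round(lr_chi2, 2))
```

The result agrees with the Pseudo R2 of 0.0829 shown in that output. The output itself reports a Wald chi2 rather than this likelihood-ratio statistic because the model was fitted with the robust option.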


But how do you estimate it???
• Maximum likelihood estimation
– Originally developed by R.A. Fisher in the 1920s, states that the desired probability distribution is the one that makes the observed data “most likely,”
– → one must seek the value of the parameter vector that maximizes the likelihood function.
• The resulting parameter vector (i.e., the coefficients), which is sought by searching the multi-dimensional parameter space, is called the MLE estimate

But dad, what about the predictor variables?
• The Wald test is unreliable in logit analysis…
– Use a LR (likelihood ratio) test
– Estimation of a logit model is usually by MLE.
– No universally-accepted goodness of fit measure (i.e., pseudo R2)
– Be careful if you use percentage of correct predictions
– Kennedy mentions a test of the fraction of 1’s correctly predicted + the fraction of zeroes correctly predicted. Should be greater than 1 (see p. 249)

Dichotomous DV (Two choices/categories)
• DV → 0-1 dummy variable
– Can’t have values other than 0 or 1 …(e.g., it is a buy, no buy decision)
• Logit model → uses logistic distribution
• Probit model → uses standard normal distribution
– They tend to produce similar results
– Heteroskedasticity causes major problems for Logit and Probit models
– Use probit models for sample selection (Heckman models)!!!

Bowen & Wiersema (Modeling Limited DVs)
• The interpretation of the directional impact (+ or -) of a change in an explanatory variable in the binary LM or PM is identical to that for OLS, except…the direction of the effect refers to the change in the probability of the choice for which y = 1.
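The difference between the logit and probit links can be sketched by evaluating the two cumulative distribution functions at the same index values. The index values below are arbitrary illustration points, and the function names are introduced here.

```python
import math

def logistic_cdf(z):
    """Logit link: cumulative logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Probit link: cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Index values are arbitrary illustration points.
for z in (-3, -1, 0, 1, 3):
    print(z, round(logistic_cdf(z), 3), round(normal_cdf(z), 3))
```

Both functions map any index value into the 0-1 range and both give 0.5 at an index of zero. The logistic curve has heavier tails, so logit and probit coefficients sit on different scales even when the predicted probabilities are close.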


See STATA Logit example

. logit admit gre gpa i.rank, robust

Iteration 0:   log pseudolikelihood = -249.98826
Iteration 1:   log pseudolikelihood = -229.66446
Iteration 2:   log pseudolikelihood = -229.25955
Iteration 3:   log pseudolikelihood = -229.25875
Iteration 4:   log pseudolikelihood = -229.25875

Logistic regression                     Number of obs = 400
                                        Wald chi2(5)  = 36.66
                                        Prob > chi2   = 0.0000
Log pseudolikelihood = -229.25875       Pseudo R2     = 0.0829

------------------------------------------------------------------------------
             |               Robust
       admit |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gre |   .0022644   .0011027     2.05   0.040     .0001032    .0044257
         gpa |   .8040377   .3451359     2.33   0.020     .1275838    1.480492
             |
        rank |
          2  |  -.6754429   .3144686    -2.15   0.032     -1.29179   -.0590958
          3  |  -1.340204   .3445257    -3.89   0.000    -2.015462   -.6649459
          4  |  -1.551464   .4160544    -3.73   0.000    -2.366915   -.7360121
             |
       _cons |  -3.989979   1.138089    -3.51   0.000    -6.220593   -1.759366
------------------------------------------------------------------------------

Interpreting the logit output
See Logit model annotated output
– For every one unit change in gre, the log odds of admission (versus non-admission) increases by 0.002.
– For a one unit increase in gpa, the log odds of being admitted to graduate school increases by 0.804.
– The indicator variables for rank have a slightly different interpretation. For example, having attended an undergraduate institution with rank of 2, versus an institution with a rank of 1, decreases the log odds of admission by 0.675.
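The coefficients in this output can be turned into odds ratios and a predicted probability. The sketch below copies the coefficients from the output; the applicant's values (gre = 600, gpa = 3.5, rank = 2) are hypothetical, and the function name is introduced here for illustration.

```python
import math

# Coefficients copied from the Stata output above.
coef = {"gre": 0.0022644, "gpa": 0.8040377, "rank2": -0.6754429,
        "rank3": -1.340204, "rank4": -1.551464, "_cons": -3.989979}

# Odds ratios are exp(coefficient); the constant is left out.
odds_ratios = {k: math.exp(v) for k, v in coef.items() if k != "_cons"}

def predicted_probability(gre, gpa, rank):
    """Predicted probability of admission; rank 1 is the omitted base category."""
    xb = coef["_cons"] + coef["gre"] * gre + coef["gpa"] * gpa
    if rank in (2, 3, 4):
        xb += coef[f"rank{rank}"]
    return 1.0 / (1.0 + math.exp(-xb))

# Hypothetical applicant, chosen only for illustration.
p = predicted_probability(gre=600, gpa=3.5, rank=2)
print({k: round(v, 6) for k, v in odds_ratios.items()}, round(p, 3))
```

The odds ratios for gre and gpa match the first two rows of the "Odds Ratios versus Coefficients" table earlier in the slides (1.002267 and 2.234545).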

Similar results for a probit model

Additional tests


Hoetker, 2007
• Since y* is unobserved, we do not know the distribution of the errors, ε
• In order to use maximum likelihood estimation (ML), we need to make some assumption about the distribution of the errors.
• A good (but a bit technical) summary of MLE:
– https://online.stat.psu.edu/stat415/lesson/1/1.2

• For logit models that report the odds ratio
– 1:1 → an event is equally likely to occur or not occur (50% prob)
– 2:1 → an event is twice as likely to occur as not (66.7% prob)
– The effect of a one unit change in variable X is to change the odds by a factor of exp(Bx)
– Values > 1 increase the odds of the event occurring
– Values < 1 decrease the odds of the event occurring
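The 1:1 and 2:1 figures follow from converting odds back into a probability. In the sketch below the coefficient B = ln(2) is a hypothetical value, picked so that a one unit change in X moves the odds from 1:1 to 2:1.

```python
import math

def odds_to_probability(odds):
    """Convert odds (e.g. 2 for 2:1) into a probability."""
    return odds / (1.0 + odds)

def odds_after_unit_change(odds, b):
    """A one unit change in X multiplies the odds by exp(B)."""
    return odds * math.exp(b)

b = math.log(2.0)   # hypothetical coefficient, chosen so that exp(B) = 2
start_odds = 1.0    # 1:1
new_odds = odds_after_unit_change(start_odds, b)

print(round(odds_to_probability(start_odds), 3))  # probability at 1:1
print(round(odds_to_probability(new_odds), 3))    # probability at 2:1
```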

Dad, “What if you have more than 2 categories?”
• Multinomial logit models
– Use when there are > 2 categories
– Categories
• Ordinal → consisting of ordered categories
– Socio-economic status (e.g., lower, middle, upper class)
– Use ordered logit model (in Stata, ologit command)
• or
• Nominal → consisting of unordered categories
– Favorite ice cream flavor (e.g., vanilla, chocolate, strawberry, manatee poop (FL))

CEO’s preferred flavor of ice cream: Chocolate, Vanilla, or Strawberry
[Image: Manatee Poop… Served at Lickety Split in Englewood FL!!!]


The Florida Manatee

Hypotheses and testing for multiple categories
• Hypothesis examples
– Age increases the likelihood of choosing chocolate versus manatee poop
– Musicians are more likely to choose vanilla versus
manatee poop
• Testing
– A variable named flavor
• chocolate = 1; vanilla = 2; manatee poop = 3
– Anchor on manatee poop
– mlogit flavor age Musicians gender, base(3) robust

Multinomial logit/multinomial probit
– Polychotomous DVs (Many choices/categories)
• Consider four choices → A, B, C, and D.
• Consider A as the Base case…compare A – B, A – C, and A – D
– Weakness of the MNL →
• Characterized by the IIA (Independence of irrelevant alternatives)
• MNL → it is inappropriate when two or more alternatives are close substitutes
• MNP allows the error terms to be correlated across alternatives, thereby permitting it to circumvent the IIA problem.

Interpretation of coefficient estimates
Manatee Poop vs. Vanilla or Chocolate vs. Vanilla
• Different than linear regression
– Other choices (chocolate or Manatee Poop) relative to the reference group (vanilla)
– For a 1 unit change in the IV, the multinomial log odds for preferring Manatee Poop to the reference group (Vanilla) is expected to increase/decrease by coefficient units, ceteris paribus.
• More useful to examine the marginal effects of a variable (Hoetker, 2007)
• “How much a change in a variable changes the probability of the focal outcome” (p. 334)
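The multinomial logit probabilities, a marginal effect, and the IIA property can all be illustrated in one short sketch. Every coefficient below is hypothetical (invented for illustration, not estimated from data), vanilla is taken as the reference category, and the marginal effect is approximated by a one-year finite difference.

```python
import math

# Hypothetical coefficients for illustration only (not estimates from any data).
# Vanilla is the reference category, so its index is fixed at 0.
COEF = {
    "chocolate": {"const": 0.2, "age": 0.01},
    "manatee poop": {"const": -1.0, "age": 0.03},
}

def probabilities(age, alternatives=("vanilla", "chocolate", "manatee poop")):
    """Multinomial logit choice probabilities over the offered alternatives."""
    index = {"vanilla": 0.0}
    for name, b in COEF.items():
        index[name] = b["const"] + b["age"] * age
    expo = {name: math.exp(index[name]) for name in alternatives}
    total = sum(expo.values())
    return {name: value / total for name, value in expo.items()}

p40 = probabilities(age=40)
p41 = probabilities(age=41)

# Marginal effect of one more year of age on P(manatee poop), by finite difference.
effect = p41["manatee poop"] - p40["manatee poop"]

# IIA: the chocolate/vanilla odds do not change when manatee poop is removed.
two_way = probabilities(age=40, alternatives=("vanilla", "chocolate"))
ratio_full = p40["chocolate"] / p40["vanilla"]
ratio_two = two_way["chocolate"] / two_way["vanilla"]
print(round(effect, 4), round(ratio_full, 4), round(ratio_two, 4))
```

The two ratios printed at the end are identical, which is the IIA property: under the MNL, dropping an alternative leaves the relative odds of the remaining ones untouched, whether or not that is realistic for close substitutes.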


Stata code
• Logit and Probit
– logit DV IV1 IV2…
– probit DV IV1 IV2…
• Multinomial Logit & Multinomial Probit
– mlogit DV IV1 IV2…
– mprobit DV IV1 IV2…

See STATA handout
• Multinomial Logit Model Example handout

Dad, one more question, “What if I have panel data?”
• Logit and Probit
– xtlogit DV IV1 IV2…
– xtprobit DV IV1 IV2…
• Multinomial Logit & Multinomial Probit
– xtmlogit DV IV1 IV2…
– xtmprobit DV IV1 IV2…

Dad, will you talk about Conditional Logit Models?
• “Not today son…We have to save some of the fun for next time”
• “Ok dad”


For more on Logit Models,
• https://www.youtube.com/watch?v=vCSh613UMic
