
Multinomial logit or probit model / polychotomous dependent variable


Model
Economic theory

• The probability that a particular consumer will choose a particular
alternative is given by the probability that the utility of that
alternative to that consumer is greater than the utility to that
consumer of all other alternatives. The consumer then picks the
alternative that maximizes his or her utility.
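The statement above can be written compactly. This is only a sketch: the symbols $U_{ij}$ (utility of alternative $j$ to consumer $i$), $V_{ij}$ (its systematic part) and $\varepsilon_{ij}$ (its random part) are notation assumed here and are not used elsewhere in these slides.

```latex
P_{ij} = \Pr\left(U_{ij} > U_{ik} \ \text{for all } k \neq j\right),
\qquad U_{ij} = V_{ij} + \varepsilon_{ij}
```

Under the standard assumption that the $\varepsilon_{ij}$ are independent type I extreme value errors, this probability takes the multinomial logit form. Assuming jointly normal errors instead leads to the multinomial probit.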
Multinomial Logit Model

• A simple extension of the logit model to the case where the
dependent variable can take more than two categorical values.

• Unordered dependent variable case: Y is not inherently ordered, so
any one of the categories can serve as the baseline for comparison.
For example:

• A traveler chooses how to get to work:
– Car
– Bus
– Train/subway
• A consumer chooses among beer brands:
– Dashen beer
– Harer beer
– Meta beer
• A voter chooses among parties:
– Labour
– Conservative
– Liberal Democrat
• A response variable with K categories generates K-1 equations.
– The multinomial logit model has K-1 equations instead of one.
That is why multinomial logit models are called multi-equation models.
• Each of these K-1 equations is a binary logistic regression
comparing one group with the reference group.
• The choice of reference category is arbitrary but should be
theoretically motivated.
• Example: if the dependent variable takes the values Y = 0, 1, 2 and
the reference category is 0, there are two logit functions:
– First equation: Y=1 versus Y=0
– Second equation: Y=2 versus Y=0
– The probabilities of all the categories of Y (all possible
outcomes of the dependent variable) add up to 1, or 100%:
– P1 + P2 + ... + PK = 1
– Fitting the multinomial logit is similar to running a series of
separate binary logit models to find the coefficients, but the
separate models would not give a single overall measure of the
deviance.
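The K-1 comparisons described above can be listed mechanically. The sketch below is illustrative only; the function name `logit_comparisons` is hypothetical and not part of any statistics package.

```python
def logit_comparisons(categories, reference):
    """Return the K-1 (category, reference) pairs, one per binary logit equation."""
    if reference not in categories:
        raise ValueError("reference must be one of the categories")
    return [(c, reference) for c in categories if c != reference]


# Y = 0, 1, 2 with reference category 0, as in the example above
pairs = logit_comparisons([0, 1, 2], reference=0)
for category, ref in pairs:
    print(f"Y={category} versus Y={ref}")
```

Changing `reference` changes which pairs are compared, but the number of equations stays at K-1.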

• Example: imagine the outcome is voting and the options are
– Labour
– Conservative
– Liberal Democrat
With Labour as the reference category, the two comparisons are
– Conservative vs. Labour
– Liberal Democrat vs. Labour
– Multinomial logistic regression estimates the K-1 sets of logit
coefficients simultaneously by maximum likelihood estimation (MLE).

• Subway versus bus (1 = subway, 0 = bus):

$$\ln\left(\frac{\text{Prob}_{si}}{\text{Prob}_{bi}}\right) = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i}$$

• $\text{Prob}_{si}$ = the probability of the ith person choosing the subway alternative
• $\text{Prob}_{bi}$ = the probability of the ith person choosing the bus, or base, alternative

• Car versus bus (1 = car, 0 = bus):

$$\ln\left(\frac{\text{Prob}_{ci}}{\text{Prob}_{bi}}\right) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}$$

• where s = subway, c = car, b = bus
• A key distinction is that the dependent variable of these equations is the log of
the odds of a given alternative being chosen compared to the base alternative.
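A short derivation, offered as a sketch, shows how the two log-odds equations determine all three probabilities. Here $A_i$ and $B_i$ are shorthand introduced only for this sketch: they stand for the right-hand sides of the subway and car equations for person $i$.

```latex
\begin{aligned}
\text{Prob}_{si} &= \text{Prob}_{bi}\, e^{A_i}, \qquad
\text{Prob}_{ci} = \text{Prob}_{bi}\, e^{B_i} \\
1 &= \text{Prob}_{bi} + \text{Prob}_{si} + \text{Prob}_{ci}
   = \text{Prob}_{bi}\left(1 + e^{A_i} + e^{B_i}\right) \\
\text{Prob}_{bi} &= \frac{1}{1 + e^{A_i} + e^{B_i}}, \quad
\text{Prob}_{si} = \frac{e^{A_i}}{1 + e^{A_i} + e^{B_i}}, \quad
\text{Prob}_{ci} = \frac{e^{B_i}}{1 + e^{A_i} + e^{B_i}}
\end{aligned}
```

The first line exponentiates each log-odds equation, and the second uses the fact that the three probabilities add up to 1.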
Multinomial Logit Model: Probabilities

• $\Pr(Y=1) = \dfrac{e^{b_1 X}}{e^{b_1 X} + e^{b_2 X} + e^{b_3 X}}$
• $\Pr(Y=2) = \dfrac{e^{b_2 X}}{e^{b_1 X} + e^{b_2 X} + e^{b_3 X}}$
• $\Pr(Y=3) = \dfrac{e^{b_3 X}}{e^{b_1 X} + e^{b_2 X} + e^{b_3 X}}$

• For a three-category dependent variable with category 1 as the
reference category (so that $b_1 = 0$ and $e^{b_1 X} = 1$), the
probability estimates are:
– $\Pr(Y=1) = \dfrac{1}{1 + e^{b_2 X} + e^{b_3 X}}$ (reference category)
– $\Pr(Y=2) = \dfrac{e^{b_2 X}}{1 + e^{b_2 X} + e^{b_3 X}}$
– $\Pr(Y=3) = \dfrac{e^{b_3 X}}{1 + e^{b_2 X} + e^{b_3 X}}$
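The probability formulas above can be evaluated directly. The following is a minimal sketch; the function name `mnl_probabilities` and the input values are hypothetical and chosen only for illustration.

```python
import math


def mnl_probabilities(linear_predictors):
    """Probabilities for the reference category and each other category.

    linear_predictors: the values b_k X for the non-reference categories
    (the reference category has b_1 X = 0 by normalization).
    Returns [Pr(reference), Pr(category 2), Pr(category 3), ...].
    """
    exps = [math.exp(v) for v in linear_predictors]
    denom = 1.0 + sum(exps)  # the 1 is exp(0) for the reference category
    return [1.0 / denom] + [e / denom for e in exps]


# Hypothetical values of b2*X and b3*X for one observation
probs = mnl_probabilities([0.5, -1.0])
print([round(p, 3) for p in probs])
print("sum of probabilities:", round(sum(probs), 6))
```

Dividing any non-reference probability by the reference probability recovers exp(b_k X), which is the odds form of the model.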

• In terms of odds ratios, the model can be rewritten as:

$$\frac{\text{Prob}_{si}}{\text{Prob}_{bi}} = e^{x_i \beta}$$

where $x_i \beta$ stands for the right-hand side of the subway-versus-bus equation.

– The sign of a coefficient estimate reflects the direction of
change in the ratio P(Y=k)/P(Y=1) in response to a ceteris paribus
change in the variable to which the coefficient is attached.
• It does not reflect the direction of change in the individual probability
Pr(Y=k).

– Significance of coefficients: similar to the binomial logit model.

– Goodness of fit: similar to the binomial logit model, using the
chi-square test and pseudo-R2.

Interpretation
mlogit brand female age, basecat(1)

Iteration 0: log likelihood = -795.89581
Iteration 1: log likelihood = -709.10396
Iteration 2: log likelihood = -703.08391
Iteration 3: log likelihood = -702.97081
Iteration 4: log likelihood = -702.9707

Multinomial logistic regression          Number of obs =    735
                                         LR chi2(4)    = 185.85
                                         Prob > chi2   = 0.0000
Log likelihood = -702.9707               Pseudo R2     = 0.1168
------------------------------------------------------------------------------
       brand |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
2            |
      female |   .5238143   .1942466     2.70   0.007      .143098      .90453
         age |   .3682065   .0550031     6.69   0.000     .2604024      .47601
       _cons |  -11.77466    1.77461    -6.64   0.000    -15.25283     -8.2964
-------------+----------------------------------------------------------------
3            |
      female |   .4659414   .2260895     2.06   0.039      .022814    .9090688
         age |   .6859082   .0626265    10.95   0.000     .5631626    .8086539
       _cons |   -22.7214   2.058027   -11.04   0.000    -26.75505   -18.68774
------------------------------------------------------------------------------
(brand==1 is the base outcome)
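The goodness-of-fit figures in the output header can be reproduced from the two log likelihoods. This sketch assumes that the iteration 0 log likelihood is that of the constants-only model and that the reported pseudo R2 is McFadden's measure.

```python
# Log likelihoods copied from the output above
ll_null = -795.89581  # iteration 0 (assumed: model with constants only)
ll_full = -702.9707   # final iteration (model with female and age)

# Likelihood-ratio chi-square: twice the gain in log likelihood
lr_chi2 = 2 * (ll_full - ll_null)

# McFadden's pseudo R2: proportional improvement over the null model
pseudo_r2 = 1 - ll_full / ll_null

print(f"LR chi2   = {lr_chi2:.2f}")
print(f"Pseudo R2 = {pseudo_r2:.4f}")
```

The test has 4 degrees of freedom because two explanatory variables appear in each of the two equations.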
• For a one-unit increase in age, we expect the log odds of Harer
relative to Dashen to change by β12, the coefficient on age in the
Harer equation, holding all other variables in the model constant.
• For example, from the table above, a one-unit increase in age raises
the log of the ratio of the two probabilities,
P(Harer=2)/P(Dashen=1), by 0.368, and
• raises the log of the ratio of the two probabilities
P(Meta=3)/P(Dashen=1) by 0.686. Therefore, in general, the older a
person is, the more he or she prefers Meta or Harer to Dashen, and
the effect is stronger for Meta.
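Predicted probabilities make this interpretation concrete. They also illustrate the caution that a positive coefficient need not raise the probability of its own category. The sketch below plugs the estimates from the output into the multinomial logit formula. The function name `brand_probabilities` and the two ages are assumptions chosen only for illustration.

```python
import math

# Coefficient estimates copied from the mlogit output (base outcome: brand 1)
COEF = {
    2: {"female": 0.5238143, "age": 0.3682065, "_cons": -11.77466},
    3: {"female": 0.4659414, "age": 0.6859082, "_cons": -22.7214},
}


def brand_probabilities(female, age):
    """Return {brand: predicted probability} for brands 1, 2 and 3."""
    scores = {1: 0.0}  # all coefficients of the base outcome are fixed at zero
    for brand, b in COEF.items():
        scores[brand] = b["_cons"] + b["female"] * female + b["age"] * age
    denom = sum(math.exp(s) for s in scores.values())
    return {brand: math.exp(s) / denom for brand, s in scores.items()}


for age in (32, 38):  # illustrative ages for a male respondent (female = 0)
    p = brand_probabilities(female=0, age=age)
    print(age, {brand: round(prob, 3) for brand, prob in p.items()})
```

With these estimates, the probability of brand 2 (Harer) falls from about 0.41 at age 32 to about 0.24 at age 38. The age coefficient in the Harer equation is positive, but the probability of brand 3 (Meta) rises faster. Only the ratio P(Harer=2)/P(Dashen=1) is guaranteed to rise with age.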
Odds ratio
• The ratio of the probability of choosing one outcome category to the
probability of choosing the reference category is often referred to as the
relative risk.

• So another way of interpreting the regression results is in terms of relative
risk ratios.
– For a one-unit increase in age, we expect the relative risk of choosing
brand 2 over brand 1 to be multiplied by exp(.3682) = 1.45. So the
relative risk is higher for older people.
• For a dichotomous (dummy) explanatory variable such as female, the ratio of
the relative risk of choosing brand 2 (Harer) over brand 1 (Dashen) for
females compared with males is exp(.5238) = 1.69.
• The log of the ratio of the two probabilities, P(Harer=2)/P(Dashen=1), is
higher by 0.52 for females than for males, and the log of the ratio of the two
probabilities P(Meta=3)/P(Dashen=1) is higher by 0.47.
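The relative risk ratios quoted above are the exponentiated coefficients from the output table. A minimal sketch of the arithmetic for the Harer versus Dashen equation:

```python
import math

# Coefficients from the brand 2 (Harer) versus brand 1 (Dashen) equation
coef_brand2 = {"female": 0.5238143, "age": 0.3682065}

# Exponentiating a logit coefficient gives the relative risk ratio
rrr_brand2 = {name: math.exp(b) for name, b in coef_brand2.items()}

for name, rrr in rrr_brand2.items():
    print(f"{name}: relative risk ratio = {rrr:.2f}")
```

A relative risk ratio above 1 means the variable raises the risk of choosing that brand relative to the base brand. Stata's `mlogit` can report these ratios directly through its `rrr` option.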
• Remark
– If a coefficient is insignificant, it does not mean that
its variable is completely irrelevant. It only means
that the variable has no statistically significant effect
on the choice between that alternative and the base
alternative.
