
Multinomial logistic regression

Hirbo Shore (MPH, Assistant Professor)
Multinomial logistic regression
• Handles dependent variables with more than two categories
• Compares multiple groups through a combination of binary
logistic regressions
• The group comparisons are equivalent to the comparisons for
a dummy-coded dependent variable, with the group with the
highest numeric code used as the reference group

• Used to explain the relationship between one nominal
dependent variable and one or more independent variables
• The response variable in multinomial logistic regression
follows a multinomial distribution, with the total number of
outcome categories denoted by J and each individual
category indexed by j
• The probability associated with the jth outcome category is
denoted $\pi_j$, j = 1, 2, …, J
• The multinomial logistic regression model consists of (J-1)
logits, with one outcome category serving as the reference
category
• Each logit is formed as the log odds of the jth outcome relative
to the reference category
• Typically, the last (or Jth) category is used as the reference
category, so the jth logit is $\ln(\pi_j/\pi_J)$, and the
j = 1, 2, …, (J-1) logits for all categories are
$\ln(\pi_1/\pi_J)$, $\ln(\pi_2/\pi_J)$, …, $\ln(\pi_{J-1}/\pi_J)$
• The jth logit, $\ln(\pi_j/\pi_J)$,
– represents the log odds of outcome j relative to the
baseline or reference category J
• If J=2 (indicating a total of two outcome categories), the only
logit would be
$\ln(\pi_1/\pi_2) = \ln(\pi_1/(1-\pi_1))$, the typical logistic regression model
• The multinomial logistic regression model is used to predict the
odds of outcome j relative to outcome J using the predictor
variable X:
$\ln(\pi_j/\pi_J) = \alpha_j + \beta_j X$
• The j subscript on the model parameters (intercept and slopes)
is needed because it is likely that the model parameters would
differ depending on the outcome category being modeled.
• For example,
– Suppose that we use the age at which a respondent got
married as a predictor of the respondent's highest level
of education, where education level is measured as less than
high school, high school, junior college, bachelor's degree,
or graduate degree.
– If we use the less than high school category as the
reference category, then the relationship between age at
marriage and the odds of completing high school is likely
to differ from the relationship between age at
marriage and the odds of completing a bachelor's degree
• Although multinomial logistic models are similar in form and
interpretation to binary logistic models, estimation algorithms
for multinomial logistic models estimate all (J-1) models
simultaneously to obtain the smallest standard errors for the
parameters.
• Therefore, fitting all (J-1) models simultaneously is
preferable to fitting (J-1) separate logistic regression models
• The other advantage is that the odds involving any two
outcomes can be directly modeled using the estimated parameters
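• For example, to predict the odds of outcome a relative to
outcome b, we can simply take the difference between the
logit for outcome a and the logit for outcome b:

$\ln(\pi_a/\pi_b) = \ln\left(\frac{\pi_a/\pi_J}{\pi_b/\pi_J}\right) = \ln(\pi_a/\pi_J) - \ln(\pi_b/\pi_J)$
$= (\alpha_a + \beta_a X) - (\alpha_b + \beta_b X)$
$= (\alpha_a - \alpha_b) + (\beta_a - \beta_b) X$

where $\alpha_j$ and $\beta_j$ are the intercept and slope of the jth
logit and X is the predictor variable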
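The difference-of-logits identity can be checked numerically. The sketch below assumes a hypothetical set of three outcome probabilities (not taken from the lecture data) with the last category as the reference; the function name baseline_logits is illustrative only.

```python
import math

def baseline_logits(probs):
    """Return the J-1 baseline-category logits ln(pi_j / pi_J),
    using the last category as the reference."""
    ref = probs[-1]
    return [math.log(p / ref) for p in probs[:-1]]

# Hypothetical probabilities for J = 3 outcome categories (they sum to 1).
probs = [0.2, 0.5, 0.3]
logit_a, logit_b = baseline_logits(probs)

# Log odds of outcome a (category 1) relative to outcome b (category 2),
# obtained as the difference of the two baseline logits.
log_odds_a_vs_b = logit_a - logit_b

# Direct computation from the probabilities, for comparison.
direct = math.log(probs[0] / probs[1])

print(round(log_odds_a_vs_b, 6), round(direct, 6))
```

Both printed values agree, because the reference probability cancels when one baseline logit is subtracted from the other.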
• With three outcome categories, there are two comparisons for
each independent variable.
• Multinomial logistic regression provides a set of coefficients
for each of the two comparisons
• Predicted group membership can be compared to actual group
membership to obtain a measure of classification accuracy.
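As a minimal sketch of the classification accuracy idea, the snippet below compares predicted and actual group codes. The eight cases are hypothetical and the function name classification_accuracy is illustrative, not part of any statistical package.

```python
def classification_accuracy(actual, predicted):
    """Proportion of cases whose predicted group equals the actual group."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical group codes for eight cases (illustrative only).
actual = [1, 2, 2, 3, 1, 2, 3, 2]
predicted = [1, 2, 2, 2, 1, 2, 3, 1]

accuracy = classification_accuracy(actual, predicted)
print(accuracy)  # 6 of the 8 cases match, so this prints 0.75
```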
Model for three categories

Need k-1 generalized logits to represent a dependent variable
with k categories
The model
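A sketch of what the two generalized logits look like for three categories, assuming a single predictor $X$ and category 3 as the reference. The symbols $\alpha_j$ and $\beta_j$ for the intercepts and slopes are assumed notation, not necessarily the author's:

```latex
\ln\!\left(\frac{\pi_1}{\pi_3}\right) = \alpha_1 + \beta_1 X,
\qquad
\ln\!\left(\frac{\pi_2}{\pi_3}\right) = \alpha_2 + \beta_2 X
```

With more predictors, each logit simply gains one additional slope term per predictor.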
Meaning of the regression coefficients

A positive regression coefficient for logit j means that higher
values of the independent variable are associated with
greater chances of response category j, compared to the
reference category.
Odds Ratio

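• To predict the odds of the jth outcome relative to the reference
category, J, we can exponentiate the jth logit:

$\mathrm{OR}_j = \pi_j/\pi_J = \exp(\alpha_j + \beta_j X)$

where $\alpha_j + \beta_j X$ is the prediction equation of the jth logit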
The predicted probability
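• To predict the probability of the jth outcome, $\pi_j$, note that
the sum of all odds (or exponentiated logits), including that of
the Jth (reference) category, is
$\frac{\pi_1}{\pi_J} + \frac{\pi_2}{\pi_J} + \dots + \frac{\pi_{J-1}}{\pi_J} + \frac{\pi_J}{\pi_J} = \frac{\pi_1 + \pi_2 + \dots + \pi_J}{\pi_J} = \frac{1}{\pi_J}$
• The odds of the jth outcome are $\pi_j/\pi_J$; therefore, the
predicted probability of the jth outcome is
$\pi_j = \frac{\pi_j/\pi_J}{\sum_{k=1}^{J} \pi_k/\pi_J} = \frac{\exp(\alpha_j + \beta_j X)}{1 + \sum_{k=1}^{J-1} \exp(\alpha_k + \beta_k X)}$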
• To compute the predicted probability for category j, $\pi_j$, the
exponentiated prediction equation of the jth logit is used in
the numerator, and the sum of the exponentiated prediction
equations over all logits, plus 1 for the reference category
(whose logit is 0), is used in the denominator
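The calculation can be sketched in a few lines. The logit values below are hypothetical, the reference category is assumed to be the last one, and predicted_probabilities is an illustrative name rather than a library function.

```python
import math

def predicted_probabilities(logits):
    """Convert J-1 baseline-category logits into J probabilities.

    The reference category's logit is 0, so its odds are exp(0) = 1.
    """
    odds = [math.exp(z) for z in logits] + [1.0]  # reference category last
    total = sum(odds)
    return [o / total for o in odds]

# Hypothetical logits for categories 1 and 2 relative to category 3.
logits = [0.5, -0.2]
probs = predicted_probabilities(logits)

print([round(p, 4) for p in probs])
print(round(sum(probs), 10))  # the J probabilities sum to 1
```

Taking the log of any predicted probability divided by the reference probability recovers the corresponding logit, which is a quick way to check the computation.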
Requirements of the multinomial logistic regression model

• Multinomial logistic regression analysis requires that the
dependent variable be non-metric.
– Dichotomous, nominal, and ordinal variables satisfy the
level of measurement requirement.
• Multinomial logistic regression analysis requires that the
independent variables be metric or dichotomous.
• Multinomial logistic regression does not make any
assumptions of normality, linearity, and homogeneity of
variance for the independent variables.
• Because it does not impose these requirements, it is preferred
to discriminant analysis when the data do not satisfy these
assumptions.
• The minimum number of cases per independent variable is 10
• For the preferred case-to-variable ratio, we will use 20 to 1.
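A small helper can make the two thresholds concrete. This is a sketch only: the sample size of 150 cases and 10 independent variables is hypothetical, and the function name check_sample_size is illustrative.

```python
def check_sample_size(n_cases, n_independent_vars, minimum=10, preferred=20):
    """Classify a sample against the 10:1 minimum and 20:1 preferred
    case-to-variable ratios."""
    ratio = n_cases / n_independent_vars
    if ratio < minimum:
        return "below minimum"
    if ratio < preferred:
        return "meets minimum only"
    return "meets preferred"

# Hypothetical study: 150 cases and 10 independent variables (ratio 15 to 1).
status = check_sample_size(150, 10)
print(status)
```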
Example
• Consider data on the program that students prefer after
graduating from high school. The dependent variable is program
(1=general, 2=academic, 3=vocational). The predictors are
sex (0=male, 1=female), SES (1=low, 2=middle, 3=high),
schtyp=school type (1=public, 2=private), reading score,
writing score, maths score, science score, social studies score,
honors English (0=not enrolled, 1=enrolled), and awards
Descriptives
Single binary predictor
• Command: mlogit prog i.schtyp, rrr base(3)
• Interpretation
– A person educated at a private institution is 3.69 times more
likely than a person educated at a public institution to join
the general program rather than the vocational program
(reference category)
– A person educated at a private institution is 7.11 times more
likely than a person educated at a public institution to join
the academic program rather than the vocational program
• Multinomial logistic regression using a single independent
variable (vocational category set as the reference category)
• In Stata, the rrr option is used to report relative-risk ratios,
which are interpreted like odds ratios
• Interpretation
– Socio-economic status is not associated with joining the
general program rather than the vocational program
– People with high SES are 3.80 times more likely to join the
academic program rather than the vocational program,
compared with people with low SES
Multiple independent variables
Command: mlogit prog i.ses i.schtyp i.honors read write math science socst, rrr base(3)
• Interpretation
– A person with middle SES is 76.3% less likely to join the
academic program than the vocational program when other
factors in the model are kept constant
– Likewise, a person from middle SES is 69.2% less likely to
join the academic program rather than the vocational program
when other factors in the model are kept constant
• For a unit increase in maths score, the likelihood of joining
the academic program rather than the vocational program
increases by 13.2%, keeping other variables in the model constant
• For a unit increase in social studies score, the likelihood of
joining the academic program rather than the vocational program
increases by 7.4%, keeping other variables in the model constant
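The percentages in these interpretations follow from the relative-risk ratios as (RRR - 1) × 100. The sketch below assumes ratios of 1.132, 1.074, and 0.237, which are back-calculated from the percentages quoted above rather than read from the Stata output; percent_change is an illustrative name.

```python
def percent_change(rrr):
    """Percent change in the relative risk of an outcome versus the base
    outcome implied by a relative-risk ratio."""
    return (rrr - 1.0) * 100.0

# Ratios back-calculated from the quoted percentages (assumed values).
examples = [("maths score", 1.132),
            ("social studies score", 1.074),
            ("middle SES", 0.237)]

for label, rrr in examples:
    change = percent_change(rrr)
    direction = "increase" if change > 0 else "decrease"
    print(f"{label}: {abs(change):.1f}% {direction}")
```

A ratio above 1 reads as a percentage increase and a ratio below 1 as a percentage decrease, in both cases relative to the base outcome and holding the other predictors constant.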
Testing the effects of the independent variables
• Likelihood-ratio test (lrtest)
– Similar to binary or ordinal logistic regression
• Wald test
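The likelihood-ratio statistic is twice the difference between the log-likelihoods of the full and reduced models. The sketch below assumes three outcome categories and a single one-coefficient predictor being dropped, so the test has 2 degrees of freedom (one coefficient per logit). The log-likelihood values are hypothetical and lr_test_2df is an illustrative name, not a Stata or Python library function.

```python
import math

def lr_test_2df(loglik_reduced, loglik_full):
    """Likelihood-ratio test with 2 degrees of freedom.

    Returns the LR chi-square statistic and its p-value. For 2 df the
    chi-square upper-tail probability has the closed form exp(-x / 2).
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value

# Hypothetical log-likelihoods (not from the lecture output).
statistic, p_value = lr_test_2df(loglik_reduced=-204.1, loglik_full=-194.0)
print(round(statistic, 2), p_value)
```

For other degrees of freedom the closed form no longer applies, and the p-value would come from a general chi-square distribution function instead.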
Thank you for listening
