
Multinomial logistic regression using SPSS

Mike Crowson, Ph.D.


Created July 11, 2019

Binary logistic regression is used when a researcher is modeling a predictive relationship between one or
more independent variables and a binary dependent variable. Although this is probably the most common
form of logistic regression in the research literature, other logistic regression models are useful when
the dependent variable has more than two categories, whether those categories are unordered or ordered.

In this video, I provide a demonstration of how to carry out and interpret a multinomial logistic regression,
which is generally used when values on the dependent variable represent unordered categories (i.e., the
variable is nominal).

A link to the data used, as well as this PowerPoint, will be made available for download underneath the
video description. Additionally, a “running” document containing links to other videos on logistic regression,
including videos using other programs, will be made available as well.

If you find the video and materials useful, please take the time to “like” the video and share the link with
others. Also, please consider subscribing to my YouTube channel.
YouTube video link: https://www.youtube.com/watch?v=1BL5cL8_Cyc

For more videos and resources, check out my website:
https://sites.google.com/view/statistics-for-the-real-world/home
Example: We are studying predictors of people’s voting behavior during the 2016 Presidential Election. We
hypothesize that age, gender identification, economic beliefs, and religious beliefs will predict whether a
person voted for Hillary Clinton (coded 1 on “vote2016”), Donald Trump (coded 2 on “vote2016”), an Other
Candidate (coded 3), or Did Not Vote (coded 4). [Note: The “other” category on the dependent variable was
created because of the low frequencies – i.e., sparse counts – associated with the other candidates.
Nevertheless, it is not a terribly informative category because the “Other candidates” reflected perspectives
that are an amalgam of liberal and conservative ideological positions. As such, this category is only being
used as “another group” and not as one that represents a commitment to vote for a specific type of
candidate associated with a given party platform.]

Data found at
https://drive.google.com/open?id=1r6dX2SHqYjz7UCYYnFFVL7zbCDBoT_Ts
Our sample comprises n=120 observations. “Age” is the participant’s self-reported age.
“Econ.conlib” is a rating of self-reported economic liberalism, where participants rated their beliefs on a
scale ranging from 1=extremely conservative to 7=extremely liberal (higher scores, thus, reflect greater
liberalism). “Rel.conlib” is a rating of self-reported religious liberalism, where participants rated their beliefs
on a scale ranging from 1=extremely conservative to 7=extremely liberal (higher scores, thus, reflect greater
liberalism). Gender identification is coded 0=identified male, 1=identified female.

When setting up our analysis, we will treat individuals indicating they voted for Hillary Clinton (group 1) as
the reference (or baseline) category against which all other groups are compared. [We could use a different
reference category if we were interested in comparisons between some other baseline category and the
remaining groups.]
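
Written out, the model being set up here can be sketched as a set of baseline-category logit equations, one for each non-reference group. The coefficient symbols b_{kj} are notation assumed here for illustration (they do not appear in the slides); the variable names are those of the data set:

```latex
\[
\ln\!\left[\frac{P(\text{vote2016} = j)}{P(\text{vote2016} = 1)}\right]
  = b_{0j} + b_{1j}\,\text{age} + b_{2j}\,\text{genderid}
  + b_{3j}\,\text{econ.conlib} + b_{4j}\,\text{rel.conlib},
  \qquad j = 2, 3, 4
\]
```

With Clinton voters (group 1) in the denominator, each of the three equations has its own intercept and its own four slopes, so the model estimates 3 × 4 = 12 slope coefficients in total.
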
Because ‘genderid’ is a binary variable, we can include it under Covariate(s), along with the remaining
IVs.

We’ll need to click the ‘Reference category’ button under ‘Dependent’ to re-set the reference category to
the first group (i.e., Hillary Clinton voters).
Click on Statistics. I prefer to request the Classification table and Goodness-of-fit, along with the other defaults.
The “Case Processing Summary” table contains information on the number and % of cases observed in each
category of the dependent variable.
The “Model Fitting Information” table contains a Likelihood Ratio chi-square test, comparing the full model
(i.e., containing all the predictors) against a null (or intercept only model). Statistical significance indicates
that the full model represents a significant improvement in fit over the null model.

In this example, we see that the full model is a significant improvement in fit over a null model
[χ²(12)=71.567, p<.001].
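
The degrees of freedom and p-value reported above can be checked by hand. The sketch below assumes that the 12 degrees of freedom come from 4 predictors times 3 non-reference outcome categories (the extra parameters in the full model relative to the intercept-only model). The function name is hypothetical, and the closed-form tail probability it uses applies only when the degrees of freedom are even:

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail p-value of a chi-square statistic when df is even.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)**i / i!,
    which holds only for even degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)

n_predictors = 4        # age, genderid, econ.conlib, rel.conlib
n_nonreference = 3      # Trump, Other, Did Not Vote (Clinton is the baseline)
df = n_predictors * n_nonreference   # 12, matching the reported test

lr_chi_square = 71.567  # reported in the Model Fitting Information table
p_value = chi2_sf_even_df(lr_chi_square, df)
print(df, p_value < 0.001)
```

The resulting p-value is far below .001, consistent with the result reported above.
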
The “Goodness of Fit” table contains the Deviance and Pearson chi-square tests, which are useful for
determining whether a model exhibits good fit to the data. Non-significant test results indicate that
the model fits the data well (Field, 2018; Petrucci, 2009). [Note: The two tests do not always agree, as
in the case we see here, so the results are somewhat mixed.]

Pearson’s chi-square test indicates that the model does not fit the data well [χ²(309)=370.099, p=.010],
whereas the Deviance chi-square does indicate good fit [χ²(309)=231.961, p=1.00].
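
To see how the two tests can disagree, it may help to recompute both p-values from the reported statistics. The sketch below is a minimal standard-library implementation of the chi-square upper-tail probability (a series expansion and a continued fraction for the regularized incomplete gamma function). It is not what SPSS runs internally, and the function name is hypothetical:

```python
import math

def chi2_sf(x, df, max_iter=10000, eps=1e-14):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half = df / 2.0, x / 2.0
    log_prefix = -half + a * math.log(half) - math.lgamma(a)
    if half < a + 1.0:
        # Series expansion for the lower tail, then take the complement.
        n, term = a, 1.0 / a
        total = term
        for _ in range(max_iter):
            n += 1.0
            term *= half / n
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * math.exp(log_prefix)
    # Continued fraction (modified Lentz method) for the upper tail.
    tiny = 1e-300
    b = half + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(log_prefix)

p_pearson = chi2_sf(370.099, 309)   # Pearson statistic well above its df
p_deviance = chi2_sf(231.961, 309)  # Deviance statistic well below its df
print(round(p_pearson, 3), round(p_deviance, 3))
```

A chi-square variable has an expected value equal to its degrees of freedom, so a statistic of about 370 on 309 df falls in the upper tail (a small p-value), whereas a statistic of about 232 falls far below 309 (a p-value near 1).
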
The “Pseudo R-Square” table contains values that are treated as rough analogues to the R-square value in OLS
regression. In general, there is no strong guidance in the literature on how these should be used or
interpreted (Lomax & Hahs-Vaughn, 2012; Osborne, 2015; Pituch & Stevens, 2016; Smith & McKenna,
2013).
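
For readers who want to see where such values come from, here is a minimal sketch of the three pseudo-R-square measures SPSS typically reports (Cox & Snell, Nagelkerke, McFadden), written in terms of the model likelihood ratio chi-square. The function names are hypothetical. `null_minus2ll` stands for the full -2 log-likelihood of the intercept-only model, which is not given in these slides, so only the Cox & Snell value is computed from the reported numbers:

```python
import math

def cox_snell_r2(lr_chi_square, n):
    """Cox & Snell pseudo R-square from the model LR chi-square and sample size."""
    return 1.0 - math.exp(-lr_chi_square / n)

def nagelkerke_r2(lr_chi_square, n, null_minus2ll):
    """Nagelkerke: Cox & Snell rescaled by its maximum attainable value."""
    max_cox_snell = 1.0 - math.exp(-null_minus2ll / n)
    return cox_snell_r2(lr_chi_square, n) / max_cox_snell

def mcfadden_r2(lr_chi_square, null_minus2ll):
    """McFadden: proportional reduction in the -2 log-likelihood."""
    return lr_chi_square / null_minus2ll

# Values reported earlier in the slides: LR chi-square = 71.567, n = 120.
cox_snell = cox_snell_r2(71.567, 120)
print(round(cox_snell, 3))
```

Because Nagelkerke divides Cox & Snell by a quantity smaller than 1, it is always at least as large as Cox & Snell, which is one reason the three measures should not be compared with one another as if they were on the same scale.
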
The “Likelihood Ratio Tests” table contains tests of the overall contribution of each independent variable to the
model (Note: if a variable is entered as a factor, the result for that variable is treated as an omnibus test of
that factor). Using the conventional α=.05 threshold, we see that economic liberalism was the only significant
predictor in the model, although age was “near significant” (at p=.051).
The “Parameter Estimates” table involves comparisons of each voter group against the Reference Category (Hillary
Clinton voters). Specifically, the regression coefficients indicate which predictors significantly discriminate
between persons voting for Donald Trump (coded 1 in this portion of the model) and those voting for Clinton
(coded 0); between persons voting for ‘Other candidate’ (coded 1 in this portion of the model) and those voting
for Clinton (coded 0); and between those who ‘Did not vote’ (coded 1 in this portion of the model) and those
voting for Clinton (coded 0).
The B column contains regression coefficients (expressed in the metric of log-odds). The Exp(B) column
contains odds ratios (Field, 2018; Osborne, 2015; Petrucci, 2009).

See Tabachnick & Fidell (2013, pp. 490-502) for a thorough demonstration of how to interpret multinomial
logistic regression results based on SPSS output, including odds ratios from the Exp(B) column in the output.
The first set of coefficients represents comparisons between Hillary Clinton voters (coded 0) and those voting
for Donald Trump (coded 1 in this portion of the output). Only ‘economic liberalism’ was a significant predictor
(b=-1.568, s.e.=.328, p<.001) in the model, as persons scoring higher on this variable were less likely to vote for
Donald Trump. The odds ratio of .208 indicates that for every one-unit increase in economic liberalism, the
odds of a person voting for Trump rather than Clinton changed by a factor of .208 (in other words, the odds
decreased by about 79%).
[Note: The remaining coefficients are interpreted as you would in standard binary logistic regression. See
the additional links provided at the end of this PowerPoint.]
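
The link between the B column and the Exp(B) column can be checked directly. The sketch below uses a hypothetical helper to convert a coefficient and its standard error into an odds ratio, a Wald z statistic with a two-sided p-value, and an approximate 95% confidence interval for the odds ratio. The 1.96 multiplier assumes a normal approximation, and the interval is not taken from the slides:

```python
import math

def summarize_coefficient(b, se, z_crit=1.96):
    """Return (odds ratio, Wald z, two-sided p, CI for the odds ratio)."""
    odds_ratio = math.exp(b)
    wald_z = b / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(wald_z) / math.sqrt(2.0))
    ci = (math.exp(b - z_crit * se), math.exp(b + z_crit * se))
    return odds_ratio, wald_z, p_two_sided, ci

# Economic liberalism, Trump vs. Clinton (values reported above).
odds_ratio, wald_z, p_value, ci = summarize_coefficient(-1.568, 0.328)
print(round(odds_ratio, 3), p_value < 0.001, ci[1] < 1.0)
```

An odds ratio below 1 with a confidence interval that excludes 1 tells the same story as the negative, statistically significant coefficient. The same helper can be applied to any other row of the output.
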
The second set of coefficients represents comparisons between Hillary Clinton voters (coded 0) and those
voting for an ‘Other candidate’ (coded 1 in this portion of the output). Again, only ‘economic liberalism’ was a
significant predictor (b=-.808, s.e.=.290, p=.005) in the model, as persons scoring higher on this variable were
less likely to vote for the ‘Other candidate’. The odds ratio of .446 indicates that for every one-unit increase in
economic liberalism, the odds of a person voting for ‘Other candidate’ rather than Clinton changed by a factor
of .446 (in other words, the odds decreased by about 55%).
The final set of coefficients represents comparisons between Hillary Clinton voters (coded 0) and those who ‘Did
not vote’ (coded 1 in this portion of the output). Age was a significant negative predictor (b=-.106, s.e.=.043,
p=.013), indicating that persons who were older were more likely to vote for Clinton than to not vote. The
regression coefficients for economic and religious liberalism are consistent with the notion that individuals rating
themselves as more economically or religiously liberal were more likely to vote for Clinton than to not vote at all.
Nevertheless, these predictors were not significant in the model.
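
Because age is measured in years, the per-unit odds ratio can look deceptively close to 1. As a sketch, the coefficient can be rescaled to a larger age difference by multiplying it before exponentiating. The 10-year comparison below is an illustrative choice, not something reported in the output, and the function name is hypothetical:

```python
import math

def odds_factor(b, units=1.0):
    """Multiplicative change in the odds for a `units`-unit increase in a predictor."""
    return math.exp(b * units)

b_age = -0.106  # age coefficient, Did Not Vote vs. Clinton (reported above)

per_year = odds_factor(b_age)               # change in odds per additional year
per_decade = odds_factor(b_age, units=10)   # change in odds per additional 10 years
print(round(per_year, 3), round(per_decade, 3))
```

Each additional year of age multiplies the odds of not voting (versus voting for Clinton) by roughly .90, so a 10-year difference multiplies those odds by roughly .35, holding the other predictors constant.
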
The “Classification” table contains classification statistics used to determine which group memberships were
best predicted by the model.

Hillary Clinton voters were correctly predicted by the model 75.8% of the time [as 25 of the 33 people who
actually voted for Clinton were predicted to do so by the model; 25/(25+2+0+6) = .758]. Donald Trump voters
were correctly predicted by the model 82.4% of the time. Persons indicating that they Did Not Vote were
correctly predicted by the model 55.9% of the time. The model did a particularly poor job of predicting (at a
rate of 5.3%) those who voted for an Other candidate.
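
The percent-correct figure for each group is the count on the diagonal of the classification table divided by that row's total. A minimal sketch using only the Clinton row, whose counts are given above, follows. The function name is hypothetical, and the counts for the other rows are not reproduced because they are not stated in the text:

```python
def percent_correct(row_counts, correct_index):
    """Percentage of a row's observed cases that were assigned to the correct category."""
    total = sum(row_counts)
    if total == 0:
        raise ValueError("row has no cases")
    return 100.0 * row_counts[correct_index] / total

# Observed Clinton voters: 25 predicted as Clinton, with the remaining
# 2, 0, and 6 cases predicted into the other three categories.
clinton_row = [25, 2, 0, 6]
clinton_pct = percent_correct(clinton_row, correct_index=0)
print(round(clinton_pct, 1))  # 75.8
```
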
References

Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). Los Angeles: Sage.

Lomax, R.G., & Hahs-Vaughn, D.L. (2012). An introduction to statistical concepts (3rd ed.). New York: Routledge.

Osborne, J.W. (2015). Best practices in logistic regression. Los Angeles: Sage.

Osborne, J.W. (2017). Regression and linear modeling: Best practices and modern methods. Thousand Oaks, CA: Sage.

Petrucci, C.J. (2009). A primer for social worker researchers on how to conduct a multinomial logistic regression. Journal
of Social Service Research, 35, 193-205.

Pituch, K.A., & Stevens, J.P. (2016). Applied multivariate statistics for the social sciences (6th ed.). New York: Routledge.

Smith, T.J., & McKenna, C.M. (2013). A comparison of logistic regression pseudo R² indices. Multiple Linear Regression
Viewpoints, 39, 17-26. Retrieved from http://www.glmj.org/archives/articles/Smith_v39n2.pdf on June 20, 2019.

Tabachnick, B.G., & Fidell, L.S. (2013). Using multivariate statistics (6th ed.). New York: Pearson.

Additional links covering interpretation of binary logistic regression results

https://drive.google.com/open?id=1atjwuodokqqNE98oCjbrOpO6SuC-cwWf
https://drive.google.com/open?id=1JfEgp0u4ZiyVOh3yV2pyQ-hjedBafwWV
