ORDINAL REGRESSION 2014 Edition

Copyright © 2012 by G. David Garson and Statistical Associates Publishing

Preview. Do not post or distribute.



© 2014 by G. David Garson and Statistical Associates Publishing. All rights
reserved worldwide in all media.
ISBN: 978-1-62638-029-5
The author and publisher of this eBook and accompanying materials make no
representation or warranties with respect to the accuracy, applicability, fitness, or
completeness of the contents of this eBook or accompanying materials. The
author and publisher disclaim any warranties (express or implied),
merchantability, or fitness for any particular purpose. The author and publisher
shall in no event be held liable to any party for any direct, indirect, punitive,
special, incidental or other consequential damages arising directly or indirectly
from any use of this material, which is provided “as is”, and without warranties.
Further, the author and publisher do not warrant the performance, effectiveness
or applicability of any sites listed or linked to in this eBook or accompanying
materials. All links are for information purposes only and are not warranted for
content, accuracy or any other implied or explicit purpose. This eBook and
accompanying materials are © copyrighted by G. David Garson and Statistical
Associates Publishing. No part of this may be copied, or changed in any format,
sold, or used in any way under any circumstances.
Contact:
G. David Garson, President
Statistical Publishing Associates
274 Glenn Drive
Asheboro, NC 27205 USA

Email: sa.publishers@gmail.com
Web: www.statisticalassociates.com


Table of Contents
Overview
Data examples in this volume
Key Terms and Concepts
  Location variables and thresholds
  Prediction equations
Ordinal Regression in SPSS
  Overview
  SPSS inputs
    The main “Ordinal Regression” dialog
    The Ordinal Regression “Location” dialog
    The Ordinal Regression “Options” dialog
    The Ordinal Regression “Scale” dialog
    The Ordinal Regression “Bootstrap” dialog
    The Ordinal Regression “Output” dialog
  SPSS outputs
    Overview
    The parallel lines test
    Tests and effect size measures for model goodness of fit
    Parameter estimates
    Odds ratios
    Other output
Ordinal Regression in SAS
  Overview
  SAS syntax for ordinal regression
  SAS output for ordinal regression
    The parallel lines test
    Testing the global null hypothesis
    Parameter estimates
    Type 3 Analysis of Effects
    Odds ratio estimates
    R-square
    Association of predicted probabilities and observed responses
    Model fit statistics
    Saving estimates
Ordinal regression in Stata
  Overview
  Stata input for ordinal regression
  Stata output for ordinal regression
    The parallel lines test
    Overview
    Likelihood ratio test of the model
    Pseudo-R²
    Parameter estimates
    Odds ratios
    Model fit statistics
    Saving estimates
    Other Stata statistical output
Partial proportional odds models
  Overview
  Partial proportional odds models in SAS
    Example
    Overview
    Determining variables to constrain
    The PPO model
    Interpreting PPO results
    Likelihood ratio tests
  Partial proportional odds models in Stata
    Example
    Overview
    Categorical predictor variables
    Determining variables to constrain
    The PPO model
    Interpreting PPO results
    Likelihood ratio tests
    Postestimation
Assumptions
  Parallel lines assumption
  Adequate cell count
  One ordinal dependent variable
  Data level of predictor variables
  Normal distribution of the dependent variable
  Adequate sample size
  No complete or quasi-complete separation
  Absence of high multicollinearity
Frequently Asked Questions
  Why not use ordinary least-squares regression instead of ordinal (logit) regression?
  Why not use ANOVA instead of ordinal (logit) regression?
  Why do parameter estimates differ between packages, and what is "parameterization"?
  Does the direction of coding of the ordinal dependent matter?
  How do I save predicted values as variables?
    SPSS
    SAS
    Stata
  What are heteroskedastic ordinal regression models?
    SPSS
    SAS
    Stata
  When should I use a link function other than logit?
  What are ordinal probit models?
    SPSS
    SAS
    Stata
  What are ordinal regression signal-response models (probit link)?
  In Stata’s gologit2 partial proportional odds procedure, how are standardized estimates obtained?
  What is the SPSS syntax for ordinal regression models?
Acknowledgements
Bibliography


Overview
Ordinal regression, also called the ordered logit model, is used with ordinal
dependent (response) variables, where the independent variables may be
categorical factors or continuous covariates. Ordinal regression avoids the
measurement error inherent in OLS regression using ordinal data. When the
response variable is ordinal rather than nominal in data level, ordinal regression
also has more statistical power than multinomial regression.
Ordinal regression models are sometimes called “cumulative logit models” since
they are a variant on logistic regression, except using the cumulative logit link
rather than the logit link (though other link functions are possible). Ordinal
regression models are also called “proportional odds models” because of their
requirement that the regression lines they generate must be proportional,
meaning parallel, sharing the same regression coefficients but varying in their
intercepts.

Ordinal regression creates multiple prediction equations. For an ordinal
dependent variable with k categories, k – 1 equations will be created, each with a
different intercept but all with the same b coefficients (slopes) for the predictor
variables. It is “k – 1” because no equation is created for the reference level,
which is usually the highest-coded level. Sharing the same slopes means that
ordinal regression requires assuming that the effects of the independent variables
are the same for each level of the dependent variable. In practice, researchers
often consider it sufficiently "the same" if the slopes do not cross. The "test of
parallel lines" checks this critical assumption, and its result is almost always
reported in ordinal regression studies.

Ordinal regression models are also called “proportional odds models” since the
k – 1 regression lines are parallel, hence proportional, and because the b
coefficients may be converted to odds ratios as in logistic regression. The odds
ratio is e, the base of the natural logarithm, raised to the power of b, that is,
exp(b). Odds ratios are discussed below.
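
As a minimal sketch of how the k – 1 prediction equations and the odds ratio fit together, the Python fragment below assumes the generic form ln[prob(Y ≤ j)/prob(Y > j)] = a_j + b·x, with one intercept a_j per equation and a single shared slope b. The intercepts, the slope, and the function names are hypothetical illustrations, not estimates from the example data, and statistical packages differ in their sign conventions.

```python
import math

def logistic(z):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(intercepts, b, x):
    """Return prob(Y = 1), ..., prob(Y = k) for one predictor value x.

    Assumes ln[prob(Y <= j) / prob(Y > j)] = intercepts[j] + b * x,
    so the k - 1 cumulative equations share the slope b.
    """
    cum = [logistic(a + b * x) for a in intercepts] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

def cumulative_odds(intercepts, b, x):
    """Odds of Y <= j versus Y > j for each of the k - 1 equations."""
    return [math.exp(a + b * x) for a in intercepts]

# Hypothetical values, not estimates from the survey data
intercepts = [-1.0, 1.5]   # k - 1 = 2 intercepts for a 3-level outcome
b = 0.4                    # one shared slope

probs = category_probs(intercepts, b, x=2.0)
odds_x2 = cumulative_odds(intercepts, b, x=2.0)
odds_x3 = cumulative_odds(intercepts, b, x=3.0)
ratios = [o3 / o2 for o2, o3 in zip(odds_x2, odds_x3)]

print(probs)    # three category probabilities that sum to 1
print(ratios)   # one odds ratio per equation
```

Because the slope is shared, the ratio of cumulative odds for a one-unit increase in x comes out as exp(b) in every equation, which is the proportional odds property described above.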
See also the separate Statistical Associates "Blue Book" volume on Generalized
Linear Models. Ordinal regression is a special case of generalized linear modeling
(GZLM). Identical parameter and model fit estimates can be obtained using the
GZLM procedure, but options vary somewhat between it and the stand-alone
ordinal regression procedures discussed in this volume. Other coverage of ordinal
regression is found in the separate Statistical Associates "Blue Book" on Probit
Regression and Response Models, which covers ordinal signal-response models.

Data examples in this volume


The example dataset used in this volume is listed below, with versions for SPSS
(.sav), SAS (.sas7bdat), and Stata (.dta). In SPSS format it is survey_sample.sav, an
example file provided in the SPSS "Samples" folder.
A subset of the General Social Survey 2012 was used for this volume. Variables
are described below.

• Click here to download survey_sample.sav for SPSS.
• Click here to download survey_sample.sas7bdat for SAS.
• Click here to download survey_sample.dta for Stata.

In this example, the dependent variable is "happy". It has three levels: 1=very
happy, 2=pretty happy, 3=not too happy. What is modeled is the log odds of
"happy" = 1 vs. "happy" equaling the higher values, 2 or 3. Also modeled is the log
odds of "happy" = 1 or 2 vs. "happy" equaling the higher value, 3. Put another
way, ordinal regression models odds of cumulative counts, not odds of individual
levels of the dependent variable. For example, for "happy" = 1, we model
ln[prob(happy = 1)/prob(happy > 1)]. We also model, for "happy" = 2,
ln[prob(happy = 1 or 2)/prob(happy > 2)]. With three levels, these two equations
are the complete set. The SAS default reverses the comparisons compared to SPSS,
as discussed further below, but this can be overridden by the researcher.

In the example the predictor (location) variables are the main effects of:

• sex: 1 = male, 2 = female
• marital2: 1 = married, 2 = all other
• agecat3: 1 = < 35, 2 = 35-64, 3 = 65+
• educ: highest year of school completed (continuous)
• income: total family income (continuous)
• tvhours: television viewing hours per day (continuous)


Key Terms and Concepts


Location variables and thresholds
The dependent variable is, of course, ordinal in level. In most statistical packages,
a binary variable also may be entered as a dependent variable, though there is
little reason to do this and in most cases the dependent variable is ordinal. Each
level of the dependent variable except the reference category (usually the
highest-coded category) has the same b coefficients but a different intercept. The
“threshold” is the intercept or, in SPSS, is -1 times the intercept. The value of any
given observation on the ordinal dependent variable depends on whether that
observation has crossed a given threshold or cut-point separating the levels of the
ordinal dependent variable.
Predictor variables, which are called “location variables,” may be categorical
(factors) or may be continuous (covariates). The location coefficients in SPSS have
the reverse sign of those in SAS, causing odds ratios to differ also, though
interpretation does not, as explained below.
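
A small numeric sketch may help show why reversed signs change the reported odds ratio without changing the interpretation. The coefficient value below is hypothetical and the variable names are illustrative only; the sketch assumes nothing beyond the statement above that one package reports b where the other reports –b.

```python
import math

b_one_package = 0.4                # hypothetical location coefficient
b_other_package = -b_one_package   # same effect under the reversed sign convention

or_one = math.exp(b_one_package)       # odds ratio reported by the first package
or_other = math.exp(b_other_package)   # odds ratio reported by the second package

print(round(or_one, 3), round(or_other, 3))
print(math.isclose(or_one * or_other, 1.0))   # the two odds ratios are reciprocals
```

The two odds ratios are reciprocals of each other: each describes the same effect, one stated in terms of the odds of the lower categories and the other in terms of the odds of the higher categories.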

Prediction equations
Ordinal regression will result in (k - 1) predictions for the dependent variable,
where k is the number of its categories. For instance, for the case of a dependent
variable with five values, the first prediction will be for the log odds of a score of 1
compared to a score higher than 1 (that is, ln[prob(1)/prob(>1)]). The second
prediction will be the log odds of a score of 1 or 2 compared to a score higher
than 2, and so on through the fourth prediction. Note that it is unnecessary to have
a fifth prediction to predict the probability of a score from 1 to 5 as by definition
that probability is 100%.
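
The following Python sketch, offered only as an illustration, computes the k – 1 cumulative log odds for a five-category variable. The category probabilities are hypothetical, and the function name is an assumption made for this example rather than part of any statistical package.

```python
import math

def cumulative_log_odds(probs):
    """Return the k - 1 cumulative log odds ln[prob(Y <= j) / prob(Y > j)]."""
    log_odds = []
    cum = 0.0
    for p in probs[:-1]:          # no equation is needed for the last category
        cum += p
        log_odds.append(math.log(cum / (1.0 - cum)))
    return log_odds

# Hypothetical probabilities for scores 1 through 5 (they sum to 1)
probs = [0.10, 0.20, 0.40, 0.20, 0.10]
log_odds = cumulative_log_odds(probs)

print(len(log_odds))                     # number of predictions, k - 1
print([round(v, 3) for v in log_odds])   # log odds rise as more categories are pooled
```

A fifth value is not computed because the cumulative probability of a score from 1 to 5 is 1, for which the odds are undefined.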

Ordinal Regression in SPSS


Overview
The data and variables used in this section are described above. In this national
sample, the ordinal variable “happy” is predicted from the categorical factors sex,
marital2, and agecat3, and from the continuous covariates educ, income, and
tvhours.


Note that "happy" has three levels: 1=very happy, 2=pretty happy, 3=not too
happy. Thus a higher value means less happiness in life. The highest level (3 = not
too happy) is the reference level for comparison purposes when interpreting odds
ratios, discussed further below.

For reader convenience, coding of predictor variables is repeated here:

• sex: 1 = male, 2 = female
• marital2: 1 = married, 2 = all other
• agecat3: 1 = < 35, 2 = 35-64, 3 = 65+
• educ: highest year of school completed (continuous)
• income: total family income (continuous)
• tvhours: television viewing hours per day (continuous)

SPSS inputs
The main “Ordinal Regression” dialog

The main ordinal regression dialog box in SPSS is where the dependent variable,
factors, and covariates are entered, as depicted below.


The dependent variable

In SPSS the ordinal dependent variable may be coded in numeric or string terms
but coding is assumed to be ascending, with the first category corresponding to
the lowest value. What is predicted is not the raw value of the dependent variable
but some transformation of it. Usually this is the logistic function, using a logit
transformation, but SPSS offers four other transformations discussed below (ex.,
probit is available). Note that predicted values may be saved as variables, as
discussed in the FAQ below.

Factors

Factors are categorical predictor variables. In SPSS, they must be coded in
numeric form.

In the example below, unhappiness is predicted from sex, marital status, and age
category. The unhappiness variable is coded 1 = very happy, 2 = pretty happy, and
3 = not too happy. Higher values indicate greater unhappiness. Sex (1 = male, 2 =
female), marital status (1 = married, 2 = all others), and age category (1 = under
35, 2 = 35-64, 3 = over 64) are entered as categorical factors. Respondent's
highest year of education, total family income, and hours per day watching
television are entered as continuous covariates.

Is factor space adequate?

Factor space is the table formed by all the categorical variables plus the ordinal
dependent variable, which is also categorical. By common rule of thumb, no cell in
factor space should have a count of 0 and 80% or more of the cells should have
counts greater than 5. For the example, there are 36 cells in factor space: 3 levels
of happy times 3 levels of agecat3 times 2 levels of marital2 times 2 levels of sex.
In the figure below, factor space is adequate because there are no 0-count cells
and only one cell of five or less. That is the cell for married, not too happy males
under 35.

Why is this important? After computing ordinal regression, the researcher will
want to discuss effect sizes in a general way, generalizing from his or her sample
to the population from which it was taken. Because there are only four cases of
married, not too happy males under 35, the researcher’s generalizations logically
could be footnoted with a note stating the generalizations do not apply to that
group. The “no 0 cells, 80% more than 5” rule is, in effect, a permission to
generalize if this condition is met. Methodologists are tacitly agreeing that if the
rule is met, there are not “too many” gaps in the data to generalize one’s findings.
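
The rule of thumb above can be checked mechanically once the cell counts are known. The Python sketch below is one possible way to do so and is not part of SPSS: the function name, its arguments, and the toy data are all hypothetical, with the data constructed only to mimic a factor space of 36 cells in which a single cell holds four cases.

```python
from collections import Counter
from itertools import product

def check_factor_space(records, levels, min_count=5, min_share=0.80):
    """Apply the "no 0 cells, 80% more than 5" rule of thumb.

    records: iterable of tuples, one value per categorical variable.
    levels:  list of the possible values of each variable, in the same order.
    """
    counts = Counter(records)
    cells = list(product(*levels))            # every cell, including empty ones
    empty = [c for c in cells if counts[c] == 0]
    large = [c for c in cells if counts[c] > min_count]
    share_large = len(large) / len(cells)
    return {
        "n_cells": len(cells),
        "n_empty": len(empty),
        "share_above_min": share_large,
        "adequate": not empty and share_large >= min_share,
    }

# Levels of happy, agecat3, marital2, and sex as coded in this volume
levels = [(1, 2, 3), (1, 2, 3), (1, 2), (1, 2)]

# Hypothetical toy data: 6 cases in every cell except one cell with 4 cases
# (not too happy, under 35, married, male)
records = []
for cell in product(*levels):
    n = 4 if cell == (3, 1, 1, 1) else 6
    records.extend([cell] * n)

result = check_factor_space(records, levels)
print(result)
```

Listing every possible cell with `product` matters here because a cell with no cases never appears in the data itself and would otherwise go uncounted.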

Factor space for the example is viewed in SPSS by running a crosstabulation of the
three factors (sex, marital status, age category) plus the categorical dependent
variable. (An ordinal variable is a type of categorical variable.) As illustrated
below, no factor cell is empty and only one cell contains fewer than 5
observations, so cell count is adequate for ordinal regression analysis. In SPSS,
this table is obtained by selecting Analyze > Descriptive Statistics > Crosstabs,
then entering marital2 as the row variable, sex as the column variable, and agecat3
as the layer variable for layer 1 and the dependent variable, happy, as the layer
variable for layer 2. The SPSS syntax is
this:
CROSSTABS
/TABLES=marital2 BY sex BY agecat3 BY happy
/FORMAT=AVALUE TABLES
/CELLS=COUNT
/COUNT ROUND CELL.


Covariates

Covariates are continuous numeric independent variables which may be
conceived of as control variables or as additional predictors. When covariates are
in the model, the factor space warning in SPSS, reporting the proportion of cells
with an expected count less than 5, will be inflated and should be disregarded.
Put another way, cell count adequacy depends on factor space, not covariate
space, as discussed further in the assumptions section below regarding cell
count.

The Ordinal Regression “Location” dialog

The location model

The location model is the ordinal regression analog to the OLS regression model.
It specifies the predictor variables. That is, "location" simply refers to the
independent variables and their b coefficients. In SPSS the location model is
specified by clicking the "Location" button in the main "Ordinal Regression"
dialog, leading to the screen illustrated below. Even though factors and covariates
were specified in the previous main “Ordinal Regression” dialog discussed above,
the “Location” dialog still is needed because not all predictors may be used for a
given run of a model and because the researcher may choose to create
interaction and nested effects in addition to main effects.

The "Factors/covariates" pane on the left lists any factors or covariates entered in
the initial "Ordinal Regression" dialog. If the “Custom” radio button is selected,
the researcher then uses the variables listed on the left to construct main, nested,
and/or interaction effects, moving them into the "Location model" pane on the
right. In this example, happy (actually unhappiness since higher values correspond
to being less happy) is predicted from sex, marital status, age category, highest
year of school completed (educ), total family income, and hours of television
viewed per day.

It is also possible to select the “Main effects” radio button to get the default
location model. The default model consists of all of the specified covariates and
all of the main factor effects. For this example, it would be identical to the custom
model shown below.


Interactions

Interactions represent the joint effect of two or more variables over and above
the main effects of those variables. The “Type” drop-down menu
supports interaction effects (not shown). Simply control-click the variables
wanted, then click on the enter arrow in the “Type” pane to create an interaction
such as sex*marital2. Interactions may be between factors, between covariates,
or between a factor and a covariate. Multi-way interactions are supported, not
just two-way.

Creation of an interaction effect is illustrated below for SPSS. For syntax-based
packages like SAS and Stata, the interaction term (educ*marital2 below) may be
entered directly in the variable list. For this example, however, no interactions are
included in the model.


Nested effects

The model may also contain nested effects, including multiple levels of nesting. A
nested effect is one where levels of one variable are dependent on which level of
a second variable they are nested in. For example, cereal(brand) is a nested effect
where the particular levels of cereal (ex., All-Bran, Apple Jacks, Corn Pops, etc.)
depend on which level of brand is being considered (here, Kellogg’s brand).

Nested effects may be more complex. For example, A(B(C)) means that B is nested
within C, and A is nested within B(C). Nesting within an interaction effect is valid
(ex., A(B*C) means that A is nested within B*C). Factors inside a nested effect
must be distinct (ex., A(B*A) is invalid). Factors cannot be nested within a
covariate effect (ex., if A and B are factors and X is a covariate, then X(A) is valid
but A(X) is invalid).

There are no nested effects in the current example.


In the next three sections (the Options, Scale, and Bootstrap button dialogs) most
researchers accept the defaults. To skip these sections and go to the “Output”
button dialog, click here.

The Ordinal Regression “Options” dialog

The Options button in SPSS is where the estimation options are specified, as in
the figure below:

The Options dialog in SPSS has elements related to the iterative process used in
calculating ordinal regression and also elements related to confidence intervals
and statistical tolerance.

• Maximum iterations can be increased if the model fails to converge, forcing
the algorithm to try longer.
• Maximum step-halvings can be increased to check for a higher log-likelihood
(that is, a lower –2 log-likelihood, hence better model fit) between iterations,
forcing the algorithm to undertake a finer analysis.
• Log-likelihood convergence is an alternative to parameter convergence.
Both are criteria for stopping the algorithm's search for a convergent
solution. In the figure, the log-likelihood value of "0" indicates this criterion
is not used (the default). If the researcher overrides this, convergence will
be assumed if the absolute or relative change in log-likelihood is less than
the value specified for "log-likelihood convergence."
• Parameter convergence is the default stopping criterion. Convergence is
assumed if the maximum absolute change in each of the parameter
estimates is less than the value specified for "parameter convergence"
(default = .000001). Making this a larger number (ex., .00001) liberalizes the
criterion for convergence. This would only be done in exploratory research
or when the model would not converge under default settings. Any such
change should be reported in the researcher's analysis.
• Confidence interval. This sets the confidence level (default is .95).
• Delta. The delta value is how much to add to cells with 0 observed
frequency (default = 0). In syntax one may also specify BIAS, which is the
value to add to all observed cell frequencies (default=0). Delta and bias
must be in the range 0 to 1.
• Singularity tolerance. The value for "Singularity tolerance" is used to check
for singularity (default = 10^-8). Singularity is the mathematical condition
caused by redundant (very highly correlated) predictors. Singularity can
lead to unreliable estimates (ones in which small data changes lead to large
parameter estimate changes) or failure to converge on a solution at all.
• Link function. The link function specifies what transformation is applied to
the dependent variable (that is, to the cumulative probabilities of the
ordinal categories). By default, ordinal regression models use the
(cumulative) logit link function. That is, ordinal regression by default is a
form of logit regression model, specifically a "cumulative logit model".
However, SPSS offers five possible link functions, not just logit.

1. Logit. f(x) = log(x / (1 – x)). This is the default (and used in the
example in this module) and is recommended when the dependent
ordinal variable has relatively equal categories. As it offers more
interpretable parameter estimates, including odds ratios as measures
of effect size, and as use of other link functions often makes no
substantive difference in findings, the logit link function is by far the
most utilized in ordinal regression.
2. Probit. f(x) = Φ^-1(x), where Φ^-1 is the inverse standard normal
cumulative distribution function. Recommended when categories of
the dependent variable are normally distributed.
3. Complementary log-log. f(x) = log(– log(1 – x)). Recommended when
higher categories are more probable than lower categories. Cloglog
models are also called "conditional ratio models" because this link is
a ratio of the conditional probability that the dependent is at a given
level given the predictors, to the conditional probability it is at a
higher level. Cloglog models are also called "proportional hazard
models" (Bender and Benner, 2000).
4. Negative log-log. f(x) = –log(– log(x)). Recommended when lower
categories are more probable than higher categories.
5. Cauchit. f(x) = tan(π(x – 0.5)). Recommended when many extreme
values are present. Note: The Cauchit link is the inverse Cauchy link.
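
The five link formulas listed above can be compared side by side with a short sketch. This is an illustration only, not SPSS code: the function names are hypothetical, and the argument p stands for a cumulative probability strictly between 0 and 1.

```python
import math
from statistics import NormalDist

# Each function maps a cumulative probability p (0 < p < 1) to the link scale.
# Function names are illustrative assumptions, not SPSS identifiers.

def logit(p):
    return math.log(p / (1.0 - p))

def probit(p):
    # Inverse of the standard normal cumulative distribution function
    return NormalDist().inv_cdf(p)

def cloglog(p):
    # Complementary log-log
    return math.log(-math.log(1.0 - p))

def nloglog(p):
    # Negative log-log
    return -math.log(-math.log(p))

def cauchit(p):
    # Inverse of the standard Cauchy cumulative distribution function
    return math.tan(math.pi * (p - 0.5))

LINKS = {
    "logit": logit,
    "probit": probit,
    "cloglog": cloglog,
    "nloglog": nloglog,
    "cauchit": cauchit,
}

# Tabulate each link at a few cumulative probabilities
for p in (0.1, 0.5, 0.9):
    row = ", ".join(f"{name}={f(p):.3f}" for name, f in LINKS.items())
    print(f"p={p}: {row}")
```

Logit, probit, and Cauchit are symmetric around p = 0.5, where each equals zero, while the two log-log links are asymmetric, which is consistent with their being recommended when higher or lower categories are more probable.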

The Ordinal Regression “Scale” dialog

When a probit rather than logit link is selected in ordinal regression, the
researcher may need to tell SPSS how to model variances that differ across
groups. In the “Scale” dialog, the researcher may create such a scale
model. As shown in the figure below, factors and covariates are listed, from which
the researcher may construct main and interaction effects.

When is a location-scale model needed in a probit signal-response study? In
probit signal-response models, subjects rate their confidence that a signal is
present. Default models assume that error variances are the same for the
group in which the signal was actually present and the group in which it was
not present. When the equal variances assumption is violated, computed standard
errors and significance levels are wrong. That is, parameter estimates are biased.
To avoid bias, the researcher creates what is variously called a scale model, a
location-scale model, or a heterogeneous model.

Scale models are outside the scope of the present volume, which focuses on logit
link ordinal models. For discussion of scale models, see the separate Statistical
Associates “Blue Book” volume on “Probit and Logit Response Models”.

The Ordinal Regression “Bootstrap” dialog

Default significance testing provides asymptotic significance estimates which
depend on large samples and the assumption of known, usually normal,
distributions. When samples are small, the distribution is unknown or
non-normal, or errors (residuals) are heteroskedastic*, bootstrapping may be
used to derive estimates of standard errors and significance levels. Sometimes
bootstrapping also is used simply because the asymptotic formula seems too
complicated, as for standard errors of the median.

Bootstrapping is a resampling method which uses repeated samples, drawn with
replacement from the same original data sample, to compute some test statistic.
The distribution of this statistic over a large number of runs (e.g., 1,000) of
the resampling process is used to estimate the variance of the statistic in the
underlying population, thereby allowing the significance of the statistic to be
estimated.
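
The resampling process just described can be sketched in a few lines for the standard error of the median. This is a minimal illustration and not the SPSS bootstrap procedure: the function name, the data values, and the fixed random seed are all assumptions made for the example.

```python
import random
import statistics

def bootstrap_se(data, stat=statistics.median, n_runs=1000, seed=0):
    """Estimate the standard error of `stat` by resampling with replacement.

    Hypothetical helper for illustration; `seed` only makes runs repeatable.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_runs):
        # Draw a resample of the same size as the original sample
        resample = rng.choices(data, k=n)
        estimates.append(stat(resample))
    # The spread of the resampled statistics estimates the standard error
    return statistics.stdev(estimates)

# Made-up sample values, e.g. television viewing hours for 11 respondents
sample = [2, 3, 3, 4, 5, 5, 6, 8, 9, 12, 15]
se_median = bootstrap_se(sample)
print(f"median = {statistics.median(sample)}, bootstrap SE = {se_median:.3f}")
```

Because each resample is drawn only from the observed values, the estimate reflects this particular sample.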

As such, bootstrapped significance estimates are data-driven, which means they
may not generalize to another sample. For this reason, cross-validation is
recommended when bootstrapping is used. Cross-validation, in turn, means
developing the statistical model on a development dataset (e.g., even-numbered
observations) and then testing it on a validation dataset (e.g., odd-numbered
observations).
* The spellings heteroskedasticity and heteroscedasticity are used with almost
equal frequency, with both being correct. As the former has the slight edge in
frequency, it is used in this volume.

The Ordinal Regression “Output” dialog

The Output button in SPSS is where statistical output and saved variables are
specified. The default outputs are goodness of fit statistics, summary statistics,
and parameter estimates. While not a default, requesting the test of parallel lines
is strongly recommended. Other output options include iteration history,
correlation and covariance of estimates, and outputting to file for each case the
predicted category of the dependent variable, response probabilities, the log-
likelihood, and more. The figure below shows settings for the example which
follows.

SPSS outputs
Overview

In the SPSS example discussed below, the dependent variable is "happy", coded
1=very happy, 2=pretty happy, 3=not too happy. What is modeled is the log odds
of "happy" = 1 vs. "happy" equaling the higher values, 2 or 3. Also modeled is the
log odds of "happy" = 1 or 2 vs. "happy" equaling the higher value, 3. Put another
way, ordinal regression models odds of cumulative counts, not odds of individual
levels of the dependent. For example, for "happy" = 1, we model
ln[prob(happy=1)/prob(happy>1)]. We also model, for "happy" = 2,
ln[prob(happy=1 or 2)/prob(happy>2)]. With three levels there are only these
two cumulative logits. The SAS default reverses the comparisons, as discussed
further below, but this may be overridden by the researcher.
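
As a small worked illustration of the two cumulative logits, the sketch below computes them from a set of category probabilities. The probabilities are made up for the example and are not taken from the dataset used in this volume; the function name is likewise hypothetical.

```python
import math

def cumulative_logits(probs):
    """Return ln[P(Y <= j) / P(Y > j)] for every level j except the last.

    `probs` lists the probability of each ordered level and must sum to 1.
    """
    logits = []
    cumulative = 0.0
    for p in probs[:-1]:
        cumulative += p
        logits.append(math.log(cumulative / (1.0 - cumulative)))
    return logits

# Hypothetical probabilities for happy = 1, 2, 3
probs = [0.30, 0.55, 0.15]
logits = cumulative_logits(probs)
print("ln[P(happy=1)/P(happy>1)]      =", round(logits[0], 4))
print("ln[P(happy=1 or 2)/P(happy>2)] =", round(logits[1], 4))
```

Note that the second logit is always larger than the first, since the cumulative probability can only grow as more levels are included.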

In this example, the location (predictor) variables are the main effects of:

• sex: 1 = male, 2 = female
• marital2: 1 = married, 2 = all other
• agecat3: 1 = < 35, 2 = 35-64, 3 = 65+
• educ: highest year of school completed (continuous)
• income: total family income (continuous)
• tvhours: television viewing hours per day (continuous)

The parallel lines test

The parallel lines assumption, also called the "proportionality of odds"
assumption, is critical to ordinal regression. Ordinal regression computes multiple
thresholds, which are the intercepts times -1 (discussed below), but only one set
of effect coefficients (b's, which are the slopes of the effects). There is one
prediction equation for each threshold and it is assumed the slopes are identical,
meaning the lines will be parallel, separated by the magnitude of the thresholds.
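
The idea of several thresholds sharing one set of slopes can be sketched as follows. The sketch assumes the common cumulative logit parameterization, logit[P(Y <= j)] = threshold_j minus the linear predictor; the threshold and slope values are invented for illustration and are not estimates from the example in this volume.

```python
import math

def cumulative_probs(thresholds, slopes, x):
    """P(Y <= j) at each threshold, using one shared set of slopes.

    Assumed parameterization: logit[P(Y <= j)] = threshold_j - sum(b * x).
    """
    eta = sum(b * v for b, v in zip(slopes, x))  # single linear predictor
    return [1.0 / (1.0 + math.exp(-(theta - eta))) for theta in thresholds]

def logit(p):
    return math.log(p / (1.0 - p))

thresholds = [-0.5, 2.0]   # hypothetical thresholds for a 3-level dependent
slopes = [0.8, -0.03]      # hypothetical slopes for two predictors

for x in ([0, 12], [1, 16]):
    cum = cumulative_probs(thresholds, slopes, x)
    gap = logit(cum[1]) - logit(cum[0])
    print(x, [round(p, 3) for p in cum], "logit gap:", round(gap, 3))
```

Whatever predictor values are supplied, the gap between the two cumulative logits equals the difference between the thresholds, which is what "parallel lines" means on the logit scale.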

The parallel lines test is non-significant in a well-fitting model which meets the
parallel lines assumption. The test is a likelihood ratio test of the difference in
-2 Log Likelihood between a model constrained to have equal slopes for the
predictor (location) variables and an unconstrained model. Note that the parallel
lines test is available only for the location-only model (not a location-scale model
for unequal variances, discussed below).
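
The arithmetic of such a likelihood ratio test can be sketched as below. The -2 Log Likelihood values are invented for illustration, not output from the example, and the degrees of freedom assume six slope parameters and a three-level dependent, so that the unconstrained model adds 6 x (3 - 2) = 6 parameters. The chi-square tail probability is computed with a hand-written series so that the sketch needs only the standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with df degrees of freedom.

    Uses the series expansion of the regularized lower incomplete gamma
    function; adequate for illustration, not tuned for extreme values.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

# Hypothetical -2 Log Likelihood values (not from the example dataset)
neg2ll_constrained = 2310.4    # equal slopes across thresholds
neg2ll_unconstrained = 2290.1  # separate slopes for each threshold
df = 6                         # assumed: 6 slope parameters x (3 levels - 2)

lr_chi_square = neg2ll_constrained - neg2ll_unconstrained
p_value = chi2_sf(lr_chi_square, df)
print(f"chi-square = {lr_chi_square:.1f}, df = {df}, p = {p_value:.4f}")
```

A p value below the chosen alpha level would indicate that the slopes differ across thresholds, that is, that the parallel lines assumption is violated.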

Although not part of SPSS default output, perhaps the first output table to
examine is that testing the critical parallel lines assumption: that the slopes of the
predictor variables (to be shown as the parameter estimates for the location
variables) are the same for each level of the dependent variable ("happy" in the
example).

In this example, the parallel lines test is significant, meaning that the regression
slopes do differ significantly across levels of the dependent variable, "happy".
When the test of parallel lines returns a finding of significance, there are four
options:

1. Drop ordinal regression in favor of a partial proportional odds model,
discussed below. These models do not require ordinal regression’s parallel
lines assumption but do not “throw away” ordinal information about the
dependent variable, as do multinomial regression models.
2.

END OF PREVIEW OF FIRST 25 PAGES


© 2006, 2008, 2010, 2012, 2013, & 2014 by G. David Garson and Statistical Associates
Publishers. Worldwide rights reserved in all languages and all media. Do not copy or post on
other servers, even for educational use. Last updated: 5/12/2014.

Statistical Associates Publishing
Blue Book Series

The entire Statistical Associates library of 50 “Blue Books” on statistical topics
below is available on DVD in no-password pdf format under individual license at
http://www.amazon.com/dp/1626380201 .

Association, Measures of
Canonical Correlation
Case Studies
Cluster Analysis
Content Analysis
Correlation
Correlation, Partial
Correspondence Analysis
Cox Regression
Creating Simulated Datasets
Crosstabulation
Curve Estimation & Nonlinear Regression
Delphi Method in Quantitative Research
Discriminant Function Analysis
Ethnographic Research
Evaluation Research
Factor Analysis
Focus Group Research
Game Theory
Generalized Linear Models/Generalized Estimating Equations
GLM (Multivariate), MANOVA, and MANCOVA
GLM (Univariate), ANOVA, and ANCOVA
Grounded Theory
Life Tables & Kaplan-Meier Survival Analysis
Literature Review in Research and Dissertation Writing
Logistic Regression: Binary & Multinomial
Log-linear Models
Longitudinal Analysis
Missing Values & Data Imputation
Multidimensional Scaling
Multiple Regression
Narrative Analysis
Network Analysis
Neural Network Models
Nonlinear Regression
Ordinal Regression
Parametric Survival Analysis
Partial Correlation
Partial Least Squares Regression
Participant Observation
Path Analysis
Power Analysis
Probability
Probit and Logit Response Models
Research Design
Scales and Measures
Significance Testing
Social Science Theory in Research and Dissertation Writing
Structural Equation Modeling
Survey Research & Sampling
Testing Statistical Assumptions
Two-Stage Least Squares Regression
Validity & Reliability
Variance Components Analysis
Weighted Least Squares Regression

Statistical Associates Publishing
http://www.statisticalassociates.com
sa.publishers@gmail.com
