American Sociological Review
2018, Vol. 83(6) 1281–1283
© American Sociological Association 2018
DOI: 10.1177/0003122418806282
https://doi.org/10.1177/0003122418806282
journals.sagepub.com/home/asr

Editors’ Comment: A Few Guidelines for Quantitative Submissions

Sarah A. Mustillo (a), Omar A. Lizardo (b), and Rory M. McVeigh (a)
(a) University of Notre Dame
(b) University of California-Los Angeles

In the three years we have been editing ASR, we have been impressed with the methodological breadth and depth of the submissions to the journal. Among the subset of papers that use primarily quantitative analytic strategies, an equally impressive range of methods and techniques is on display. The field has come a long way since any of the three of us were in graduate school and, indeed, many of the articles we have published in our role as editors represent the forefront of sophistication in techniques ranging from fixed and random effects on the one end to web scraping and text analysis on the other. In this editorial, we would like to focus on a set of issues that seem to come up repeatedly in the thousands of papers we have read. These are not errors per se, but fall in the category of gaps or lags between previously accepted practices among quantitative scholars in sociology and the state-of-the-art consensus among quantitative methodologists. These issues happen with such frequency that we feel compelled to offer some recommendations for future ASR submissions.

P-Values and One versus Two-Tailed Tests

Debates about the utility of p-values abound in the scientific literature. On the one hand, those concerned about replicability and standards for new discovery argue that the threshold for statistical significance should be reduced below .05 (Benjamin et al. 2018). On the other hand, some argue that we should do away with p-values and null hypothesis significance testing altogether (McShane et al. 2017). We will not take a stand in this debate except to say that, in general, p < .10 and one-tailed tests should only be used in rare, exceptional circumstances with proper justification. Many papers attempt to justify use of p < .10 standards by pointing to “directionality” in their verbally stated hypotheses. Others use vague language of p < .10 indicating “borderline” or “suggestive” findings. We do not find the first rationale compelling. In terms of the second practice, ASR is our discipline’s top journal. We need to be publishing strong evidence rather than “suggestive” findings.

Testing Mediation

We get many submissions to the journal attempting to test mediation with a stripped-down version of the Baron and Kenny (1986) steps. Authors usually proceed like this: they run one model with their key predictor plus controls and then a second model adding the mediator. If the coefficient of the key predictor is reduced or rendered nonsignificant, the authors conclude that the main effect has been mediated.

There are several problems with this approach. Most commonly, authors fail to run a significance test for the difference in magnitude between coefficients. This step is necessary to determine whether mediation has occurred. The coefficient of the key predictor can be reduced or even rendered nonsignificant yet still be in the window of what could be considered to have occurred by chance alone. As Gelman and Stern (2006) note, changes in statistical significance may not themselves be significant.
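To make the point about testing the change in the key predictor's coefficient concrete, the following is a minimal sketch rather than a procedure the editorial prescribes. It assumes linear (OLS) models, a single mediator, no control variables, and simulated data; the function names `coef_change` and `bootstrap_ci` and all numeric settings are hypothetical. It estimates the difference between the key predictor's coefficient without and with the mediator, then uses a percentile bootstrap (one of several possible approaches) to ask whether that difference could plausibly be zero.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def coef_change(x, m, y):
    """OLS coefficient on x in y ~ x minus the coefficient on x in y ~ x + m."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    total = sxy / sxx  # model 1: key predictor only
    direct = (smm * sxy - sxm * smy) / (sxx * smm - sxm ** 2)  # model 2: adds mediator
    return total - direct


def bootstrap_ci(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the change in the coefficient on x."""
    rng = random.Random(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        draws.append(
            coef_change([x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx])
        )
    draws.sort()
    lower = draws[int((alpha / 2) * n_boot)]
    upper = draws[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated data (hypothetical values): x affects m, and both affect y.
rng = random.Random(42)
n = 400
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.2 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

estimate = coef_change(x, m, y)
lower, upper = bootstrap_ci(x, m, y)
print(f"change in coefficient: {estimate:.3f}, 95% interval [{lower:.3f}, {upper:.3f}]")
```

Under these assumptions, an interval that excludes zero suggests the drop in the coefficient is larger than chance alone would produce; an interval that includes zero means the apparent reduction should not be read as evidence of mediation, however the significance stars changed between the two models.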
Occasionally, authors run a significance test like a Sobel (1986) test to see if the reduction is statistically significant, which is a step in the right direction. More recent tests and procedures that improve upon the Sobel test have been developed by various authors in various traditions and can be implemented in different software packages. Some of these deal with the next issue we will mention, which is that mediation in nonlinear models such as logit and probit cannot be determined by looking at changes in coefficient magnitude across models. See work by MacKinnon (2008), Imai, Keele, and Tingley (2010), Karlson, Holm, and Breen (2012), and VanderWeele (2015, 2016) for more detail. We recommend that authors who aim to test mediation in future ASR publications implement more sophisticated strategies as appropriate.

Interactions in Models with Categorical Dependent Variables

Various problems have been raised with using z-statistics (and associated p-values) of the coefficient of a multiplicative term to test for a statistical interaction in nonlinear models with categorical dependent variables. Allison (1999), Williams (2009), and others focus on one type of problem (e.g., differences in residual variation among groups); others, such as Mood (2010), Breen and Karlson (2013), and Long and Mustillo (forthcoming), focus on a litany of issues. The case is closed: don’t use the coefficient of the interaction term to draw conclusions about statistical interaction in categorical models such as logit, probit, Poisson, and so on. Each scholar recommends a different approach for testing interactions. We recommend that future authors use the appropriate technique depending on the particular application.

“Multivariate” versus “Multivariable”

Scholars from many disciplines use these terms interchangeably to describe their models, but they do not mean the same thing. Sociologists are not alone in this common practice. Using them interchangeably can cause confusion as to what kind of model is actually being estimated.

Simple regression refers to a model with one independent variable and one dependent variable. Multiple regression refers to a model with multiple independent variables and one dependent variable. Another term for multiple regression is multivariable regression.

A multivariate model is an entirely different model from those mentioned above: a multivariate model is a model with multiple dependent variables, such as factor analysis, a structural equation model, or a latent growth curve model. Given how often these terms are confused in published work, many argue that the distinction has become arbitrary or semantic, but we think it is important to maintain the distinction given that multivariate statistics is a well-developed branch of statistics in its own right, often the subject of entire courses, and it is consistent with usage in other fields. This is important, since ASR papers are read widely across the social sciences, not just sociology.

Measurement

Many authors need to take measurement issues more seriously than they currently do. We often receive submissions in which key variables, both independent and dependent, are ad hoc creations of the authors in which the rigors of measurement science have not been properly engaged. This strategy may be acceptable for simple, straightforward concepts, but for anything more complex, validated measures or measures with a body of literature behind them would be preferred when possible, appropriate, or relevant.

A less serious concern, but still a concern, is authors who use validated scales in a manner that is inconsistent with published validity work on those scales. If a validated scale has 12 items that were coded 1 to 4 and added together in the published validity work, then the scale should be treated as such unless justification is provided for treating it differently. Often we receive submissions in which items are cherry-picked from the scale, or the coding
of such items is changed, or the summing scheme is altered. Any of these changes may undermine the validity of the scale.

Methods Sections

Better organization of methods sections and more detail about procedures is one final area that could use attention. Many submissions fail to provide enough detail on data collection procedures, sample size, missing data, decisions about who is included or excluded from the sample, what types of models are estimated and why, response rate of a survey, selection effects, measurement of the variables, and so forth. At times, all of this information is provided, but instead of being neatly packaged in the methods section, it is doled out piecemeal across the methods and results or relegated to obscure sections in an Appendix. The easiest papers for readers to follow are the ones in which the methods are tightly linked to the front-end and in which the methods proceed in a unified, organized fashion: data, variables, models.

In summary, sociology has made substantial progress in terms of the range and sophistication of quantitative methods used to address important questions in our field. We offer these recommendations to assist scholars in making decisions about which methods to use and how to best present results in areas that are currently lacking clear guidelines.

References

Allison, Paul D. 1999. “Comparing Logit and Probit Coefficients across Groups.” Sociological Methods & Research 28(2):186–208.

Baron, Reuben M., and David A. Kenny. 1986. “Moderator-Mediator Variables Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations.” Journal of Personality and Social Psychology 51(6):1173–82.

Benjamin, Daniel J., James O. Berger, Magnus Johannesson, Brian A. Nosek, E.-J. Wagenmakers, Richard Berk, Kenneth A. Bollen, et al. 2018. “Redefine Statistical Significance.” Nature Human Behaviour 2(1):6–10.

Breen, Richard, and Kristian Bernt Karlson. 2013. “Counterfactual Causal Analysis and Nonlinear Probability Models.” Pp. 167–87 in Handbook of Causal Analysis for Social Research. Dordrecht: Springer.

Imai, Kosuke, Luke Keele, and Dustin Tingley. 2010. “A General Approach to Causal Mediation Analysis.” Psychological Methods 15(4):309–334.

Gelman, Andrew, and Hal Stern. 2006. “The Difference between ‘Significant’ and ‘Not Significant’ Is Not Itself Statistically Significant.” The American Statistician 60(4):328–31.

Karlson, Kristian B., Anders Holm, and Richard Breen. 2012. “Comparing Regression Coefficients between Same-Sample Nested Models using Logit and Probit: A New Method.” Sociological Methodology 42(1):286–313.

Long, J. Scott, and Sarah A. Mustillo. Forthcoming. “Comparing Groups in Binary Regression Models Using Predictions.” Sociological Methods and Research.

MacKinnon, David P. 2008. Introduction to Statistical Mediation Analysis. New York: Routledge.

McShane, Blakeley B., David Gal, Andrew Gelman, Christian Robert, and Jennifer L. Tackett. 2017. “Abandon Statistical Significance.” Unpublished Manuscript. Retrieved June 21, 2018 (http://www.stat.columbia.edu/~gelman/research/unpublished/abandon.pdf).

Mood, Carina. 2010. “Logistic Regression: Why We Cannot Do What We Think We Can Do, and What We Can Do About It.” European Sociological Review 26(1):67–82.

Sobel, Michael E. 1986. “Some New Results on Indirect Effects and Their Standard Errors in Covariance Structure Models.” Sociological Methodology 16:159–86.

VanderWeele, Tyler J. 2015. Explanation in Causal Inference: Methods for Mediation and Interaction. Oxford, UK: Oxford University Press.

VanderWeele, Tyler J. 2016. “Mediation Analysis: A Practitioner’s Guide.” Annual Review of Public Health 37:17–32.

Williams, Richard. 2009. “Using Heterogeneous Choice Models to Compare Logit and Probit Coefficients across Groups.” Sociological Methods & Research 37(4):531–59.
