Multivariate Behavioral Research

ISSN: 0027-3171 (Print) 1532-7906 (Online) Journal homepage: http://www.tandfonline.com/loi/hmbr20

The Relationship Between the Standardized Root Mean Square Residual and Model Misspecification in Factor Analysis Models

Dexin Shi, Alberto Maydeu-Olivares & Christine DiStefano

To cite this article: Dexin Shi, Alberto Maydeu-Olivares & Christine DiStefano (2018): The
Relationship Between the Standardized Root Mean Square Residual and Model Misspecification in
Factor Analysis Models, Multivariate Behavioral Research, DOI: 10.1080/00273171.2018.1476221

To link to this article: https://doi.org/10.1080/00273171.2018.1476221

Published online: 30 Dec 2018.

MULTIVARIATE BEHAVIORAL RESEARCH
https://doi.org/10.1080/00273171.2018.1476221

The Relationship Between the Standardized Root Mean Square Residual and
Model Misspecification in Factor Analysis Models
Dexin Shi (a), Alberto Maydeu-Olivares (a, b), and Christine DiStefano (a)
(a) University of South Carolina; (b) University of Barcelona

ABSTRACT
We argue that the definition of close fitting models should embody the notion of substantially ignorable misspecifications (SIM). A SIM model is a misspecified model that might be selected, based on parsimony, over the true model should knowledge of the true model be available. Because in applications the true model (i.e., the data generating mechanism) is unknown, we investigate the relationship between the population standardized root mean square residual (SRMR) values and various model misspecifications in factor analysis models to better understand the magnitudes of the SRMR. Summary effect sizes of misfit such as the SRMR are necessarily insensitive to some non-ignorable localized misspecifications (i.e., the presence of a few large residual correlations in large models). Localized misspecifications may be identified by examining the largest standardized residual covariance. Based on the findings, our population reference values for close fit are based on a two-index strategy: (1) largest absolute value of standardized residual covariance ≤ 0.10, and (2) SRMR ≤ 0.05 × R̄² (the average R² of the manifest variables); for acceptable fit our values are 0.15 and 0.10 × R̄², respectively.

KEYWORDS
Structural equation modeling (SEM); standardized root mean square residual (SRMR); close fit

In structural equation modeling (SEM), the assessment of model-data fit has long been an important, but difficult, issue. The most common test of fit, the likelihood ratio (LR)-based chi-square test, is typically used to evaluate the discrepancy between a proposed model and the data. The results of the test suggest if the model is an adequate representation of the data. However, the LR chi-square is a test of exact fit, meaning it is testing that there is no discrepancy between the hypothesized model and the data. In most empirical situations, the model under consideration is to some degree incorrect, i.e., misspecified (Box, 1979; MacCallum, 2003). Thus, in large samples, the use of the LR chi-square test will often suggest an unacceptable fit, even when the model misspecification is relatively minor.

In applications, tests of exact fit often reject the fitted model and researchers are keenly interested in determining whether their misfitting model is actionable (i.e., it could be retained and substantive inferences could be drawn from it), or should be rejected and a better model should be sought. Current practices involving the decision of whether a misspecified model is actionable (whether it fits 'closely') or not are often based on goodness-of-fit indices (e.g., the Comparative Fit Index, CFI; Bentler, 1990). Sample values of these goodness-of-fit indices are compared to fixed cutoff values that have been proposed in the literature (e.g., Hu & Bentler, 1999). If a sample estimate meets the recommended cutoff value (e.g., CFI ≥ 0.95), the model is retained as a 'close fitting' model. Otherwise, the model is rejected.

Researchers have pointed out several problems with the practice of using goodness-of-fit indices with this 'hypothesis testing' approach. One source of problems involves the use of goodness-of-fit indices (Barrett, 2007; Maydeu-Olivares, 2017; Yuan, 2005). Another source of problems involves the definition (or lack thereof) of a 'close fitting' model. We describe each of these topics in turn.

Goodness-of-fit indices versus effect sizes of model misfit
One main concern involving goodness-of-fit indices is that the procedures are largely heuristic, and are not grounded in statistical theory. A decision is made solely by evaluating the estimated value (i.e., sample statistic), which may be a biased estimator of the population parameter of interest. In addition, the sampling variability of the statistic is blatantly ignored. Thus, researchers may not know how widely estimates vary across samples.

A better alternative for assessing close fit is to use effect size measures of misfit. Effect sizes of model misfit are population parameters that capture the discrepancy between the fitted model and the data generating process. For effect sizes of model misfit, statistical theory is available, which enables the construction of confidence intervals and, if of interest, statistical tests (Maydeu-Olivares, 2017)[1]. Various forms of effect sizes exist in the SEM literature, including measures which are unstandardized (e.g., the root mean squared error of approximation, or RMSEA; Browne & Cudeck, 1993; Steiger, 1989, 1990), standardized (e.g., the standardized root mean squared residual, or SRMR; Bentler, 1995; Jöreskog & Sörbom, 1988), or relative (e.g., the goodness-of-fit index, or GFI; Jöreskog & Sörbom, 1988; Maiti & Mukherjee, 1990; Steiger, 1989).

[1] In this paper, we distinguish between effect sizes of model misfit and goodness of fit indices. The term goodness of fit indices is reserved for sample statistics used to adjudge model fit disregarding their sampling variability and without referencing the population parameter.

Currently, the most widely used effect size of model misfit is the RMSEA (Browne & Cudeck, 1993; Steiger, 1990). The RMSEA measures the unstandardized discrepancy between the population and the fitted model, adjusted by the degrees of freedom (df) of the model. Formal statistical inferences can be made by testing the hypothesis that RMSEA ≤ c, where c is the reference cutoff in the population suggesting close fit. The most commonly used cutoff value is based on the recommendation from Browne and Cudeck (1993, p. 144), where the authors stated that 'practical experience has made us feel that a value of the RMSEA of about 0.05 or less would indicate a close fit of the model in relation to the degrees of freedom'.

However, the population RMSEA is impossible to interpret because it is in an unstandardized metric. As Edwards (2013, p. 213) puts it: 'We do not know what a 0.01 difference in RMSEA values means. We do not know that a model with an RMSEA of 0.12 is incapable of telling us something useful about the world. We do not know that a model with an RMSEA of 0.01 is telling us anything useful about the world.' Besides the level of model misspecification, the population RMSEA is dependent on other characteristics of the population model (i.e., 'incidental parameters', Saris, Satorra, & van der Veld, 2009). For example, the same population RMSEA (say 0.05) may hold a different meaning in terms of the model misspecification when models differ in terms of the magnitude of factor loadings and model size (Chen, Curran, Bollen, Kirby, & Paxton, 2008; Savalei, 2012; Shi, Lee, & Maydeu-Olivares, 2018).

Standardized effect sizes are preferable to unstandardized measures as they facilitate the interpretation of the magnitude of misfit. The most popular standardized effect size of misfit is the SRMR, which can be crudely interpreted as the average standardized residual covariance. Recently, Maydeu-Olivares (2017) derived an unbiased estimator of the population SRMR and its asymptotic standard error, and suggested using a standard normal reference distribution to approximate its asymptotic distribution. As a result, the SRMR can be used to provide a statistical test of close fit (i.e., SRMR ≤ c) and it is an attractive alternative to the use of the RMSEA. A major advantage of using the SRMR over the RMSEA is that its value can be substantively interpreted. Also, in finite samples, Maydeu-Olivares, Shi, and Rosseel (2018) showed that compared to the RMSEA, the SRMR yielded more accurate empirical rejection rates and better coverage of its population value, especially when the number of observed variables was larger than 30.

A few studies have examined the behavior of the sample SRMR under various types of model misspecifications (Beauducel & Wittmann, 2005; Fan & Sivo, 2005, 2007; Garrido, Abad, & Ponsoda, 2016; Hu & Bentler, 1998, 1999), including the most influential simulation study conducted by Hu and Bentler (1999). In trying to find a balance between minimizing Type I errors (i.e., rejecting a correctly specified model) and maximizing power (i.e., rejecting a misspecified model), Hu and Bentler (1999) suggested that models with sample SRMR values < 0.08 generally indicated adequate fit. These previous studies have focused on simulating data through use of finite samples, which inherently include sampling error. In addition, the formula used is a biased estimate of the population SRMR, and the resultant estimate may be noticeably different from its population counterpart in small samples (Maydeu-Olivares, 2017). Given that previous studies have focused on the sample SRMR, researchers would benefit from greater understanding of how the population SRMR is affected by model misspecifications.
SEM misfit may best be characterized as a multivariate problem, and it requires examining all statistically significant standardized residual covariances, especially the ones with the largest absolute value (Maydeu-Olivares & Shi, 2017; McDonald & Ho, 2002b; Raykov, 2000)[2]. An examination of the full matrix of residual covariances, or at least the most extreme value within that matrix, is seldom performed in practice. However, the SRMR being (crudely) the average standardized residual covariance, researchers have suggested that it is only meaningful to interpret the SRMR when there is little variability among the standardized residual covariances and there are no clear outliers (i.e., some standardized residual covariances much larger than the rest; McDonald & Ho, 2002; Raykov, 2000). Also, the SRMR may not be sensitive to model misspecifications which cause only a small proportion of residuals to be large in a residual covariance matrix that may include many zeros or small values. However, these aspects have not been thoroughly explored; as of yet, the relationship between the largest standardized residual covariance and model misspecification has not been fully explored, and therefore no clear criteria are available to assess close fit.

[2] When interpreting the standardized residual covariances, we ignore the signs and refer to the 'largest standardized residual covariance' as the one with the largest absolute value.

A definition of close fit: substantively ignorable misfit

Although the population SRMR is easy to interpret in a standardized metric, it is not intuitive to researchers what a specific value of SRMR (for example, 0.05) implies in terms of misspecification(s) in the fitted model. Therefore, it is necessary to investigate the relationship between the magnitudes of effect size (i.e., the population SRMR) and some common types of model misspecification to gain greater understanding of the meaning of the population parameter. Thus, we aim at addressing Edwards's (2013) concern (using the SRMR) and at helping applied researchers make a more informed decision on whether to retain or reject a misspecified model.

We recognize that whether to retain or reject a misspecified model depends on the purpose of the application, and therefore it is necessarily subjective. Yet, in many instances, an application may serve several purposes, or the study may be purely exploratory. In these instances, researchers may find it helpful that some reference cutoff values be provided. Choosing a reference cutoff value (c) is a difficult but unavoidable issue, because a decision needs to be made regarding the model's use. A direct analogy involves p values and significance levels. Statistics has often been described as the quantification of uncertainty. From this viewpoint, statistics finishes once a p value (or a confidence interval) is obtained. However, if a decision must be made, the use of an agreed upon significance level (a cutoff value) greatly facilitates scientific communication.

Putting forth suggested cutoff values to distinguish between close and non-close fitting models is made more difficult by the absence of any definition of close fitting model in the literature. For instance, based on their practical experience, Browne and Cudeck (1993) simply defined a model as fitting closely when RMSEA ≤ 0.05. To overcome this shortcoming we define a model to provide a close fit to the data generating mechanism if its misspecification is substantively ignorable. More specifically, we define a model with substantively ignorable misspecifications (SIM) as a misspecified model that might be selected, based on parsimony, over the true model should knowledge of the true model be available. To the best of our knowledge, the first research that used the notion of SIM (without explicitly defining it) to specify criteria for close fit is Maydeu-Olivares and Joe (2014), who used SIM to establish cutoff values for the use of the RMSEA in IRT models. Also, our use of SIM is consistent with some of the most influential writings in goodness-of-fit assessment in SEM (e.g., Saris et al., 2009).

Because in applications the data generating mechanism (or, in short, the true model, although the expression is an oxymoron) is unknown, it is necessary to probe different combinations of true and fitted models and consider for each combination which misspecifications are substantively ignorable. In this article, we focus on factor analysis models and we consider three classes of misspecifications: (1) misspecified dimensionality (e.g., fitting a one-factor model when the true model is a two-factor model); (2) fitting a factor model with independent clusters when the true model contains small cross-loadings; and (3) fitting a model with uncorrelated errors when the true model contains correlated errors.

Consider the following specific example of the first scenario: choosing between a one-factor model and a two-factor model with independent clusters (i.e., every item loads on a single factor) whose factors correlate ρ. What magnitude of ρ is substantively ignorable? Certainly not 0, but neither is it 0.3. We believe that most researchers confronted with a choice between a one-factor model and a two-factor model with ρ = .99 would choose a one-factor model.
Consequently, the misspecification obtained when fitting a one-factor model in this case is substantively ignorable. We also believe that the misfit corresponding to ρ = 0.90 is substantively ignorable, but we do not wish to go further. We prefer to err on the safe side (the same spirit was used to establish cutoff values—significance levels—for p values). Therefore we define a one-factor model fitted to a two-factor model with ρ ≥ 0.90 as fitting closely, and not fitting closely otherwise. We use similar criteria for other combinations of true and fitted models. For instance, we consider (standardized) cross-loadings ≤ 0.10 as substantively ignorable (provided they do not follow a substantively meaningful pattern). We believe most researchers would prefer to fix cross-loadings to zero when their standardized magnitude is ≤ 0.10, but again we prefer to err on the safe side, and we do not wish to go further. Similarly, we consider correlations among the residuals ≤ 0.10 as substantively ignorable (provided the correlations are not patterned).

To summarize, to help researchers to make a more informed decision on whether to retain or reject a misspecified model, we investigate the relationship between the population SRMR and model misspecifications in the context of factor analysis models. In addition to the SRMR, the behavior of the largest standardized residual covariance (i.e., defined in terms of absolute values) is inspected. In particular, we are interested in examining whether models with substantively ignorable misfit (SIM, our definition of close fit) can be distinguished from non-SIM models using these two parameters.

The remainder of this article is organized as follows. We first review statistical theory and clarify the formula for unbiased estimation of the population SRMR. Next, using population covariance matrices, we explore the behaviors of the population SRMR and the largest standardized residual covariance under different types and degrees of model misspecification. That is, for an array of true and fitted models we specify which ones we consider close fitting or acceptable (i.e., actionable) and we report the corresponding values of the SRMR (and the largest standardized residual covariance). We conclude this discussion by offering practical guidance for empirical research when using the SRMR to assess goodness of fit.

Statistical theory for the SRMR

Let σ_ij denote the unknown population covariance between variables i and j (or the variance if i = j) and σ⁰_ij denote the population covariance (or variance) under the fitted model. Then, the population SRMR (P.SRMR) is defined as (Maydeu-Olivares, 2017):

P.SRMR = \sqrt{\frac{e_s' e_s}{t}} = \sqrt{\frac{1}{t}\sum_{i \le j}\left(\frac{\sigma_{ij} - \sigma_{ij}^{0}}{\sqrt{\sigma_{ii}\,\sigma_{jj}}}\right)^{2}}   (1)

Here, e_s is the vector of the population standardized residual covariances, and t = p(p + 1)/2 signifies the number of unique elements in the (residual) covariance matrix, where p denotes the number of observed variables being modeled. Thus, Equation (1) approximates the average population standardized residual covariance.

In finite samples, let s_ij be the sample covariance, σ̂_ij denote the model-implied covariance, e_s be the t-vector of the standardized residual covariances with elements

e_{ij} = \frac{s_{ij} - \hat{\sigma}_{ij}}{\sqrt{s_{ii}\,s_{jj}}}   (2)

and N_s represent the asymptotic covariance matrix of e_s. Maydeu-Olivares (2017) showed that regardless of the discrepancy function and distributional assumptions used, an asymptotically unbiased estimate of the population SRMR of Equation (1) can be expressed as

\widehat{SRMR}_u = \frac{1}{\hat{k}_s}\sqrt{\frac{\max(e_s' e_s - \mathrm{tr}(\hat{N}_s),\,0)}{t}}, \quad \text{where } \hat{k}_s = 1 - \frac{\mathrm{tr}(\hat{N}_s^{2}) + 2\,e_s'\hat{N}_s e_s}{4\,(e_s' e_s)^{2}}.   (3)

However, the asymptotic covariance matrix, N_s, depends on the discrepancy function used to estimate the model, and on whether normality or asymptotically distribution free (ADF: Browne, 1982) assumptions are used.

In typical applications, SEM software programs compute a sample counterpart of the population SRMR in Equation (1) as

\widehat{SRMR}_b = \sqrt{\frac{e_s' e_s}{t}} = \sqrt{\frac{1}{t}\sum_{i \le j}\left(\frac{s_{ij} - \hat{\sigma}_{ij}}{\sqrt{s_{ii}\,s_{jj}}}\right)^{2}}   (4)

where the elements in the equation are defined earlier. It is noted that the sample SRMR shown in Equation (4) is a biased estimate of the population SRMR. Following Maydeu-Olivares (2017), we derived the expected value of the biased estimate SRMR_b. This can be approximated in large samples using

E(\widehat{SRMR}_b) \approx \sqrt{\frac{\mathrm{tr}(N_s) + e_s' e_s}{t}}\;\frac{8[\mathrm{tr}(N_s) + e_s' e_s]^{2} - 2\,\mathrm{tr}(N_s^{2}) - 4\,e_s' N_s e_s}{8[\mathrm{tr}(N_s) + e_s' e_s]^{2}}.   (5)
In Appendix A, we support the accuracy of this approximation by demonstrating the biases of the SRMR_b and SRMR_u in estimating the population SRMR using a small simulation example. It is noted that the SRMR_b (Equation (4)) generally reported in software packages is upwardly biased, meaning the index typically suggests worse model fit than is actually present. The amount of bias is compounded when sample size is small and low standardized factor loadings are present.
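As an illustration of Equations (2) and (4), the biased sample SRMR can be computed directly from any fitted model. The sketch below is our illustration (not code from the original study) and assumes a single-group lavaan model without a mean structure.

# Illustration only: the biased sample SRMR of Equation (4), computed
# directly from a single-group lavaan model `fit` with no mean structure.
library(lavaan)
srmr_biased <- function(fit) {
  S     <- lavInspect(fit, "sampstat")$cov    # observed sample covariances
  Sigma <- fitted(fit)$cov                    # model-implied covariances
  D     <- diag(1 / sqrt(diag(S)))
  E     <- D %*% (S - Sigma) %*% D            # standardized residuals, Equation (2)
  e     <- E[lower.tri(E, diag = TRUE)]       # the t = p(p + 1)/2 unique elements
  sqrt(mean(e^2))                             # Equation (4)
}
# The result should closely match fitMeasures(fit, "srmr"), up to minor
# implementation details of the software's SRMR variant.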
Confidence intervals for the SRMR and tests of close fit can be obtained using its unbiased estimate and a reference normal distribution. Specifically, with large samples, a 100(1 − α)% confidence interval for the SRMR can be obtained using

\Pr\left(\widehat{SRMR}_u - z_{\alpha/2}\,SE(\widehat{SRMR}_u) \le SRMR \le \widehat{SRMR}_u + z_{\alpha/2}\,SE(\widehat{SRMR}_u)\right) = 1 - \alpha,   (6)

where SE(·) denotes the asymptotic standard error, which is given as (Maydeu-Olivares, 2017):

SE(\widehat{SRMR}_u) = \hat{k}_s\sqrt{\frac{\mathrm{tr}(N_s^{2}) + 2\,e_s' N_s e_s}{2\,t\,e_s' e_s}}.   (7)

In addition, a statistical test of model close fit can be conducted with hypotheses

H_0: SRMR \le c \quad \text{vs.} \quad H_1: SRMR > c,

where c > 0 is a reference cutoff value for close fit; p values are obtained using p = 1 − Φ(z), where Φ(·) denotes the standard normal distribution function and z = (SRMR_u − c)/SE(SRMR_u).
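The following sketch illustrates the arithmetic of this interval and test; the numerical values of the unbiased SRMR and its standard error are invented for the example (an unbiased estimate and standard error can be obtained, for instance, from recent versions of lavaan's lavResiduals(), whose exact output names may vary across versions).

# Illustration of the normal-approximation test of close fit and the CI in
# Equation (6); srmr_u and se are placeholder (made-up) values.
srmr_u <- 0.043                               # unbiased SRMR estimate (illustrative)
se     <- 0.004                               # asymptotic standard error (illustrative)
c0     <- 0.05                                # reference cutoff for close fit

z    <- (srmr_u - c0) / se                    # H0: SRMR <= c0  vs.  H1: SRMR > c0
p    <- 1 - pnorm(z)                          # one-sided p value
ci90 <- srmr_u + c(-1, 1) * qnorm(0.95) * se  # 90% confidence interval for the SRMR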
Population SRMR and model misspecification

The relationship between the population SRMR and misspecification in factor analysis models is explored in this section. In addition to the population SRMR, the behavior of the largest standardized residual covariance (i.e., defined in terms of absolute value) was inspected across different model misspecifications. The performance of these two parameters was evaluated through a simulation study in which we generated population covariance matrices, as the purpose of this study is to better understand the behavior of the population SRMR.

The population model used was a confirmatory factor analysis (CFA) model with two correlated factors. The following three types of misspecification, which are often observed in practice when fitting CFA models, were considered.

a. Misspecified dimensionality. The population model was a two-factor, independent clusters model with correlated factors. A one-factor model was fit to the two-dimensional structure.

b. Omitting cross-loadings. Items were related to each factor via an independent clusters structure; however, one or multiple indicators cross-loaded on both factors. The fitted model assumed an independent clusters structure for both factors, where the cross-loading value(s) was incorrectly fixed to zero.

c. Omitting residual correlations. In the population model, one or multiple residual correlations (covariances) were present. A simple structure model with no correlated errors was fit.

To produce these population covariance matrices, factor variances were fixed to one. The error variances were set such that all factor loadings (including cross-loadings) are on a standardized scale. Other characteristics that were manipulated are as follows:

Magnitude of factor loadings. The population factor loadings (λ) included low (0.40), medium (0.60), and high (0.80). The primary factor loadings are the same for all items across factors.

Model size. The model size is indicated by the number of observed variables (p; Moshagen, 2012; Shi, Lee, & Terry, 2015, 2018)[3]. Model size ranged from small to very large, including p = 12, 36, 72, or 144. Equal numbers of items loaded on each factor.

[3] The size of an SEM model has been indicated by different indices, including the number of observed variables (p), the number of parameters to be estimated (q), the degrees of freedom (df = p(p + 1)/2 − q), and the ratio of the observed variables to latent factors (p/f). Recent studies have suggested that the number of observed variables (p) is the most important determinant of model size effects (Moshagen, 2012; Shi, Lee, & Terry, 2015, 2017). Therefore, in the current study, we define large models as SEM models with many observed variables.

Magnitude of model misspecification. In the scenario in which multidimensionality was ignored, the level of misspecification was manipulated by changing the degree of correlation between the factors in the true model. The true correlation coefficients ranged from 0.60 to 0.90, in increments of 0.10. Given that the estimated model collapsed across two factors in the population model, a smaller inter-factor correlation indicated a greater level of misspecification. When the misspecification included omitted cross-loadings or residual correlations, the inter-factor correlation was fixed to 0.30 and the level of misspecification was determined by the population value of the omitted parameters. The omitted cross-loadings or residual correlations ranged from 0.10 to 0.40, in increments of 0.10. Here, larger values indicated a higher level of model misspecification.
Number of omitted parameters. When the misspecifications were introduced by omitting cross-loadings or residual correlations, we manipulated the number of omitted cross-loadings/residual correlations. The number of omitted parameters ranged from one to four, in increments of one.

In summary, the number of conditions examined was 48 = 3 (factor loading levels) × 4 (model size levels) × 4 (factor inter-correlation levels) for misspecified dimensionality. When investigating situations with cross-loadings or residual correlations, the number of conditions examined was 192 = 3 (factor loading levels) × 4 (model size levels) × 4 (magnitudes of omitted parameters) × 4 (number of omitted parameters).

For each condition, a population covariance matrix was computed. Then, the population values of SRMR were calculated by fitting the misspecified model to the population covariance matrices with maximum likelihood (ML) estimation, using the lavaan package in R (Rosseel, 2012; R Development Core Team, 2015). In addition to the SRMR values, every standardized residual covariance was obtained, and the largest absolute value for the standardized residual covariance was used as an alternative index for evaluating model fit. Analyses of variance (ANOVAs) were conducted to identify conditions which affected the outcome of interest. An eta squared (η²) value above 10% was used to identify conditions that contributed to sizeable amounts of variability in the outcome. To better demonstrate the behaviors of the population SRMR and the largest standardized residual covariance, all results were presented in the form of figures; tables that include complete simulation results are provided as Supplementary data.
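The following sketch illustrates this paradigm for one assumed condition (p = 12, standardized loadings of 0.60, inter-factor correlation of 0.90, with a one-factor model fitted); it is our illustration of the approach, not the actual study code, and the variable names are arbitrary.

# Population two-factor independent clusters model, misspecified one-factor fit.
library(lavaan)

lambda <- 0.60; rho <- 0.90; p <- 12
Lambda <- cbind(c(rep(lambda, 6), rep(0, 6)),
                c(rep(0, 6), rep(lambda, 6)))
Phi    <- matrix(c(1, rho, rho, 1), 2, 2)
Theta  <- diag(1 - rowSums(Lambda^2))              # unit-variance (standardized) scale
Sigma  <- Lambda %*% Phi %*% t(Lambda) + Theta     # population covariance matrix
colnames(Sigma) <- rownames(Sigma) <- paste0("y", 1:p)

# Misspecified (fitted) model: a single factor. A huge nominal N makes
# sampling error negligible, so the reported values are essentially
# population values.
model_1f <- paste("f =~", paste0("y", 1:p, collapse = " + "))
fit <- cfa(model_1f, sample.cov = Sigma, sample.nobs = 1e6)

fitMeasures(fit, "srmr")                           # (approximate) population SRMR

# Largest standardized residual covariance (Equation (2)), computed by hand
S_hat <- fitted(fit)$cov
D     <- diag(1 / sqrt(diag(Sigma)))
E     <- D %*% (Sigma - S_hat) %*% D
max(abs(E[lower.tri(E)]))                          # largest |residual|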
Figure 1. Behavior of (A) the population SRMR, (B) largest standardized residual covariance, and (C) SRMR/communality in models with misspecified dimensionality.

Behavior of the population SRMR and largest standardized residual covariance in models with misspecified dimensionality

The behavior of the SRMR and the behavior of the largest absolute value of the standardized residual covariance under the misspecification of disregarding multidimensionality are shown in Figure 1 (panels A and B, respectively). In these figures, different markers were used to indicate different values of inter-factor correlations (i.e., degree of model misspecification), and for each level of inter-factor correlation, the population SRMR and the largest (absolute value of) standardized residual covariance were plotted against the magnitudes of the factor loadings. Given a fixed level of factor loading and inter-factor correlation, cases with different model sizes were labeled separately. By comparing Figure 1(A, B), we can see that when dimensionality is misspecified, the SRMR value and the largest standardized residual covariance provide almost the same information. A linear regression model predicting the SRMR from the largest standardized covariance showed that the two parameters are almost perfectly correlated (R² = 0.988). Therefore, only results for the SRMR are discussed to avoid redundancy.

The results from the ANOVA showed that the most important sources of population SRMR variance were the magnitude of factor loadings (η² = 0.50) and inter-factor correlations (η² = 0.39). Specifically, Figure 1(A) demonstrates the SRMR's sensitivity to misspecification of the factor structure; as the inter-factor correlation decreased (indicating a more severe misspecification), the population SRMR increased accordingly. In addition, at any fixed level of model misspecification (e.g., ρ = 0.8), the population SRMR depends on the values of the standardized factor loadings according to a curvilinear relationship. The higher the factor loadings, the higher the population SRMR. Moreover, the effects of model misspecification and factor loading are multiplicative, meaning that the higher the level of model misspecification, the higher the effect of the factor loading. Finally, model size did not have a great impact on the results.
For a fixed level of factor loading and inter-factor correlation, variability displayed in Figure 1(A) due to differences in model size was ignorable (η² < 0.01).

As seen in Figure 1(A), the commonly used cutoff of SRMR = 0.08 (the solid horizontal line) appears to be too liberal a criterion for identifying misspecification in the factor structure. For example, when factor loadings are 0.6, fitting a one-factor model to a two-factor structure with a low inter-factor correlation (e.g., ρ = 0.6), the reference of SRMR ≤ 0.08 would always suggest acceptable model fit. This finding holds regardless of model size. In fact, because the population SRMR heavily depends on the size of factor loadings, it is difficult to find a single reference to separate practically well-fitting models (e.g., ρ > 0.9) from non-closely fitting models that should be rejected (e.g., ρ = 0.6).

A curvilinear relationship was observed between the SRMR and the magnitude of the standardized factor loading (Figure 1(A)). Therefore, we considered using and evaluating the SRMR in light of the communality (i.e., the squared standardized loading, or λ²)[4]. Using the ratio of SRMR to the communality (SRMR/λ²) as the outcome variable, the results from the ANOVA indicated that the inter-factor correlation explained a very large amount of the variance (η² = 0.99). Figure 1(C) depicts the relationship between SRMR/λ² and the standardized factor loadings across levels of inter-factor correlation (ρ) and model size. As shown, the population SRMR/λ² only depended on the level of model misspecification (ρ), regardless of the magnitude of the factor loadings and model size. Thus, if close fit is defined as fitting a one-factor model to two-factor data when inter-factor correlations are ≥ 0.90, such models imply that SRMR/λ² ≤ 0.05 (the solid line in Figure 1(C)).

[4] To compute the ratio of SRMR to communality, we used the average (standardized) factor loadings estimated by fitting the misspecified models to the population covariance matrices.

Behavior of the population SRMR and largest standardized residual covariance in models with omitted cross-loadings

When the models were misspecified by omitting cross-loadings, ANOVA results showed that the most important sources of population SRMR variance were the size of the cross-loading (η² = 0.28), the magnitude of the factor loadings (η² = 0.18), and model size (η² = 0.16). In Figure 2, we plotted the behavior of the SRMR in the presence of one to four omitted cross-loadings in four separate panels (i.e., panels A–D). These figures show that the population SRMR increased as the magnitude of the omitted cross-loading(s) increased. However, the commonly used cutoff is not sensitive enough to detect misspecification caused by omitting cross-loading(s). Almost all misspecified models would be retained using the guideline of 0.08 for close fit (the solid horizontal line), even when the size of the omitted cross-loading(s) reached 0.40. In addition, by comparing the four panels in Figure 2, the population SRMR slightly increased as the number of omitted cross-loadings increased. However, the effect of the number of omitted cross-loadings on the SRMR was quite small (η² = 0.06). The population SRMR was also influenced by the magnitude of the factor loadings; higher factor loadings were associated with a larger population SRMR, especially when the omitted cross-loading value was large. For example, when the number of observed variables was 12 and one cross-loading of 0.40 magnitude was omitted, the population SRMR increased from 0.029 to 0.085 as the magnitude of the (primary) loading values increased from 0.40 to 0.80.

For a fixed level of primary factor loading and cross-loading(s), the variability of the population SRMR was still noticeable, indicating a substantial effect of model size. Specifically, holding other conditions constant, larger model size was associated with smaller SRMR values. Moreover, the effect of model size was more noticeable as the level of model misspecification and the magnitude of the factor loading increased. For example, when the factor loadings were set to 0.80 and one cross-loading of 0.40 was omitted, the population SRMR decreased from 0.085 (p = 12) to 0.024 (p = 144). With a smaller factor loading (i.e., 0.40) and a lower level of model misspecification (i.e., cross-loading = 0.20), model size had a smaller effect on the SRMR, yielding values which ranged from 0.016 (p = 12) to 0.006 (p = 144).

The behavior of SRMR/λ² in the presence of omitted cross-loadings is shown in Figure 3. ANOVA results indicated that the most important sources of variance in SRMR/λ² were the size of cross-loading(s) (η² = 0.46) and model size (η² = 0.24). It is noted that the effect of model size on SRMR/λ² was less noticeable as the level of model misspecification decreased. For example, when the factor loading was 0.40 and one cross-loading with size of 0.40 was omitted, the SRMR to communality ratio (SRMR/λ²) decreased from 0.183 (p = 12) to 0.074 (p = 144). When the model misspecification was less severe (i.e., cross-loading = 0.10), the range of SRMR/λ² was smaller across model sizes, with values from 0.053 (p = 12) to 0.019 (p = 144). Generally speaking, if researchers would accept models that omit minor cross-loadings
(i.e., cross-loading ≤ 0.10), SRMR/λ² ≤ 0.05 (the solid line in Figure 3) could be used as a reference.

Figure 2. Behavior of the population SRMR in models with one (A) to four (D) omitted cross-loadings.

Using the largest absolute value of the standardized residual covariance as the outcome variable, ANOVA results indicated that the level of cross-loadings (η² = 0.37), the magnitude of factor loadings (η² = 0.17), model size (η² = 0.15), and the number of omitted cross-loadings (η² = 0.14) were the major sources that explained the majority of the variance. Each panel in Figure 4 illustrates the behavior of the largest absolute value of the standardized residual covariance in terms of the magnitudes of factor loadings, levels of cross-loadings, and model size across various numbers of omitted cross-loadings. We can see that the largest absolute value of the standardized residual covariance increased when the magnitude of factor loadings and the level of cross-loadings increased.

In addition, the largest standardized residual covariance generally increased as the number of omitted cross-loadings increased. However, as more cross-loadings were omitted, the amount of increase in the largest standardized residual covariance 'leveled off' (i.e., smaller increases were observed). For example, when p = 144 and the magnitude of the factor loading was 0.6, if omitting one cross-loading of 0.20, the largest standardized residual covariance was 0.107; the largest standardized residual covariance increased to 0.224 as two cross-loadings were omitted. Nevertheless, as the number of omitted cross-loadings kept increasing, the largest standardized residual covariance tended to remain stable, yielding values of 0.222 (omitting three cross-loadings) and 0.219 (omitting four cross-loadings).

Given a fixed level of factor loading and cross-loading, it can be seen that the variability of the points displayed in Figure 4 is much smaller than those in Figure 2. This implies that the effect of model size on
the largest absolute value of the standardized residual covariance was much smaller than the effect on the SRMR. Moreover, the effects of model size on the SRMR and the largest standardized residual covariance were in opposite directions. That is, as p increased, the largest standardized residual covariance tended to slightly decrease. As shown in Figure 4, if close fit is defined as omitting cross-loadings ≤ 0.10, such models can usually be identified by applying the cutoff with the largest absolute value of the standardized residual covariance ≤ 0.10 (the solid horizontal line).

Figure 3. Behavior of SRMR/communality in models with one (A) to four (D) omitted cross-loadings.

Behavior of the population SRMR and largest standardized residual covariance in models with omitted correlations among the residuals

In the presence of omitted residual correlations, ANOVA results showed that the most important sources of population SRMR variance were model size (η² = 0.53) and the size of the residual correlations (η² = 0.13). In Figure 5, we present the behavior of the population SRMR across the study conditions. Panels A–D represent conditions where one to four residual correlations were omitted. For each panel in Figure 5 (i.e., the same number of omitted residual correlations), given a fixed level of factor loadings and residual correlations, the variability of the population SRMR was very large. This indicates that model size was the dominant factor for the values of the population SRMR. Specifically, models with more observed variables yielded noticeably smaller population SRMR values. For example, when the factor loadings were 0.40, and four residual correlations of 0.40 were ignored, the population SRMR decreased from 0.070 (p = 12) to 0.007 (p = 144). In addition, for a fixed number and level of residual correlations, the population SRMR slightly decreased as the magnitude of factor loadings increased; yet, the effect size of the magnitude of factor loadings was quite small (η² = 0.07).
For example, when p = 12, and four residual correlations of 0.10 were ignored, the population SRMR dropped marginally, from 0.018 (λ = 0.40) to 0.008 (λ = 0.80).

Figure 4. Behavior of the largest standardized residual covariance in models with one (A) to four (D) omitted cross-loadings.

The figures also show that the population SRMR increased as the number and magnitude of the omitted residual correlations increased. However, the value of the population SRMR was still rather small, even when four residual correlations with size of 0.40 were ignored. The commonly used cutoff for the SRMR failed to detect the misspecification caused by omitting residual correlations. That is, all misspecified models considered would be retained using the guideline of 0.08 for close fit (i.e., denoted by the solid horizontal line). Based on the results, when the magnitude of model misspecifications increased, the related change in the population SRMR could be small. Moreover, the population SRMR was greatly impacted by model size. Therefore, it is problematic to use the SRMR (or SRMR/λ²) as the criterion to detect model misspecifications from omitting residual correlations.

On the other hand, for the largest absolute value of the standardized residual covariance, the ANOVA results indicated that the level of model misspecification (residual correlations) could explain the majority of the variance (η² = 0.62). The magnitude of the factor loading was an important source of variance (η² = 0.31); however, the main effect of model size was negligible (η² < 0.01). Figure 6 illustrates the behavior of the largest absolute value of the standardized residual covariance when omitted residual correlations were present. We can see that the largest absolute value of the standardized residual covariance was more sensitive than the SRMR to identify misspecified models with omitted residual correlations. The largest absolute value of the standardized residual covariance increased as the size of the residual correlations increased, indicating a more severe misspecification.
In addition, the value of the largest standardized residual covariance was much larger than the SRMR obtained from the same misspecified model. The largest absolute value of the standardized residual covariance was negatively associated with the magnitude of the factor loading. In addition, as the magnitude of the factor loading increased, the decrease in the largest standardized residual covariance tended to be more gradual when the level of model misspecification was smaller. For example, for a fixed model size (i.e., p = 12), when the model was misspecified by omitting one residual correlation = 0.40, the largest absolute value of the standardized residual covariance decreased from 0.313 (λ = 0.40) to 0.139 (λ = 0.80). If the omitted residual correlation = 0.10, the largest absolute value of the standardized residual covariance dropped more gradually, from 0.080 (λ = 0.40) to 0.035 (λ = 0.80). Therefore, when the misspecifications occurred by omitting residual correlations, the largest absolute value of the standardized residual covariance was a more suitable index than the SRMR for assessing model fit. As shown in Figure 6, close fitting models can be identified by the largest absolute value of the standardized residual covariance ≤ 0.10 (the solid horizontal line), which approximately corresponds to omitting correlated residuals with correlations of 0.10.

Figure 5. Behavior of the population SRMR in models with one (A) to four (D) omitted residual correlations.

A numerical example: fitting a five-factor model to the SPSI-R

We provide a numerical example to illustrate our discussion.
The Social Problem Solving Inventory–Revised (SPSI-R; D'Zurilla, Nezu, & Maydeu-Olivares, 2002) is a 52-item questionnaire that, according to its authors, measures five attributes. Each item is scored using five ordered categories. Maydeu-Olivares and D'Zurilla (1996) report confirmatory (CFA) and exploratory (EFA) factor analyses fitted to a sample of N = 601 individuals. These data will be re-analyzed here. We do not report parameter estimates as these are available in the original sources. Rather, we wish to focus on answering the question 'do these models fit closely enough?'. If the answer is negative, a better model should be sought for these data.

Figure 6. Behavior of the largest standardized residual covariance in models with one (A) to four (D) omitted residual correlations.

The data are quite normal: the excess kurtosis for all items is well below one, and only one item has skewness > |1|. Consequently, we simply provide fit results under normality assumptions to simplify our presentation. First we fitted the independent clusters CFA model of Maydeu-Olivares and D'Zurilla (1996): every item is loaded only by its factor. The CFA yields χ² (i.e., the LR statistic) = 3209.87 on 1264 df, p < .001. A 90% confidence interval (CI) for the RMSEA is (0.048; 0.053), and the unbiased SRMR is 0.053. The model does not fit the data exactly, but by current standards of 'close' fit, we would conclude based on this information that the model provides a close fit to the data. In so doing, all that we have done is to apply some cutoff values recommended in the literature; the definition of 'close' is based on those cutoffs. A model fits closely if the estimated values are below the cutoffs, and it does not fit closely otherwise (Barrett, 2007).

In this article, we have put forth a definition of close fit. We deem a model to provide a close fit to the data generating process if we would choose the fitted model over the data generating process based on parsimony, i.e., if the fitted model showed substantively irrelevant misfit.
We estimate a 90% CI for the SRMR as (0.048; 0.057), we obtained the average R² of the items, R̄² = 0.42, and we inspected the largest (in absolute value) standardized residual covariances. There are two residuals > |0.2| and over 30 residuals > |0.1|, all statistically significant at the 5% level after applying a Bonferroni correction for the (53 × 52)/2 = 1378 residuals inspected. We cannot ignore such large residuals and we conclude that the model does not fit closely the unknown data generating process. In fact, we conclude that the fit is not even acceptable, in spite of the estimated RMSEA. Therefore, a better model must be sought. Our conclusion is supported by an inspection of the modification indices. Five of the modification indices involving the factor loadings involved expected standardized cross-loadings larger than |0.3| and as high as |0.45|; the modification indices for these five loadings ranged from 44 to 58 (on 1 df).
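Readers wishing to run analogous checks on their own data could proceed along the following lines; the sketch is purely illustrative (the SPSI-R data are not distributed with this article), and `fit` stands for any fitted single-group lavaan model.

# Counting large standardized residual covariances and inspecting
# modification indices for a fitted lavaan model `fit`.
library(lavaan)
S     <- lavInspect(fit, "sampstat")$cov
Sigma <- fitted(fit)$cov
D     <- diag(1 / sqrt(diag(S)))
E     <- D %*% (S - Sigma) %*% D     # standardized residual covariances
res   <- abs(E[lower.tri(E)])

sum(res > 0.10)                      # how many residuals exceed |0.10|
max(res)                             # the largest residual

mi <- modindices(fit)                # candidate added parameters
mi <- mi[order(-mi$mi), ]            # sort by modification index
head(subset(mi, op == "=~"), 10)     # cross-loadings; columns mi and sepc.all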
To obtain a better fitting model, we estimated an EFA model using a target rotation (Browne, 1972) instead of adding cross-loadings (or correlated residuals) based on the modification indices results. An examination of the estimated loadings reveals that the structure conforms to that put forth by Maydeu-Olivares and D'Zurilla (1996): all secondary loadings are smaller than the primary loadings. Regarding the fit of this model, we obtained χ² = 2233.67 on 1076 df, p < .001. A 90% CI for the RMSEA is (0.040; 0.045), and the unbiased SRMR is 0.021. The average communality is R̄² = 0.45, and a 90% CI for the SRMR is (0.020, 0.023). There are only four standardized residual covariances larger than |0.1|; a 90% CI for the largest is (−0.18, −0.14). Now, SRMR/R̄² = 0.051, and an inspection of Figure 1(C) reveals that for this R̄², the estimated SRMR is only slightly above what we would have obtained if we had fitted a one-factor model to a two-factor independent clusters model with correlation 0.9. As we would rather use a one-factor model than a two-factor model with correlation 0.9, this leads us to believe that this EFA model fits the unknown data generating process rather closely. Also, we inspected Figures 4 and 6 involving the relationship between the largest standardized residual and the size of localized misfit. To properly interpret these figures, it is convenient to estimate the average factor loading. To do so, we simply took λ = √R̄² = .67. Inspection of Figures 4–6 gives us pause to conclude that the EFA model fits closely. Rather, the estimated value of the largest residual suggests that there is some localized misfit in the model that exceeds our standards. Taking together the estimated values of the SRMR and the largest standardized residual, we conclude that the model provides an acceptable fit to the data generating model: on average, the model provides a good overall fit to the data, but the model does not capture well enough every association among these 52 items. The conclusion is not surprising: with 52 items, finding a model with substantively ignorable misspecifications requires a lot of work.

Discussion

This research investigated the relationship between the population SRMR (and largest standardized residual covariance) and different types and degrees of misspecification in factor analysis models. Population covariance matrices were used to determine the impact of study conditions on the population SRMR without the impact of sampling error. Conditions of factor loading size and model size were manipulated.

Findings showed that the population SRMR was sensitive to model misspecification due to fitting a one-factor model to two-factor data or ignoring cross-loadings, but less sensitive to misspecifications introduced by omitting residual correlations. These findings are consistent with previous studies (Fan & Sivo, 2005; Hu & Bentler, 1998, 1999) using sample estimates of the SRMR.

The SRMR can be approximately interpreted as the average standardized residual covariance. As such, it is sensitive to model misspecifications that cause a substantial proportion of large elements (in absolute value) in the standardized residual covariance matrix. When the nature of the model misspecification results in only a small proportion of large residuals, and/or many structural zeros appear in the residual covariance matrix, the average residual covariance cannot provide an accurate representation of the misspecification. Therefore, the SRMR was not sensitive to model misspecifications caused by omitting a few residual correlations, especially when the number of observed variables (and thus, the number of residual covariances) is large. In such situations, examining the largest (in absolute value) standardized residual covariance is more useful. In fact, our findings revealed that both global fit and local fit are important concepts to consider when evaluating models (DiStefano, 2016; McDonald & Ho, 2002b; Raykov, 2000; West, Taylor, & Wu, 2012). That is, good global fit implies that even in the presence of some local misspecifications, the overall model holds up well, and should not be completely discarded.
On the other hand, the information about local misfit can help researchers identify the sources of poor global fit and thereby improve the model.

We also explored the effects of two possible 'incidental parameters' (i.e., model size and the magnitude of the factor loading). In general, larger standardized factor loadings were associated with higher population SRMR values when the model misspecifications occurred by ignoring multidimensionality or omitting cross-loadings. However, when the model was misspecified by ignoring correlated residuals, for a fixed level of residual correlations, the population SRMR tended to slightly decrease as the factor loading increased. The findings of the current study are consistent with previous methodological research regarding the influence of measurement quality (i.e., standardized factor loadings) on fit indices. Methodologists have shown that for a given level of model misspecification, poor measurement quality is associated with better model fit (i.e., the reliability paradox; see Hancock & Mueller, 2011; Heene et al., 2011; McNeish, An, & Hancock, 2018). This result can be somewhat counter-intuitive. In particular, researchers would judge the value of ignorable secondary factor loadings against the values of the primary loadings. For a given size of the secondary loading, it should be more ignorable when the primary loadings are larger; however, the population SRMR tends to increase as the values of the primary loadings increase, suggesting worse fit.

In addition, when omitting cross-loadings or residual correlations, the population SRMR decreased as the number of observed variables increased. However, when misspecifying dimensionality (i.e., fitting a one-factor model to two-factor data), the effect of model size on the population SRMR is quite small. The patterns described above are slightly different from the behavior of the population RMSEA (Savalei, 2012). In Savalei's (2012) study, she noted that the population RMSEA generally decreased as the magnitude of the factor loading decreased and the number of observed variables increased, regardless of the type of model misspecification. According to its definition, the RMSEA penalizes model complexity by incorporating the degrees of freedom in its formulation, and it measures the discrepancy due to approximation per df. Therefore, for models with a fixed level of misspecification, the population RMSEA generally decreases as p increases, because a higher p is typically associated with larger degrees of freedom (df). The SRMR, on the other hand, is (approximately) the average standardized residual covariance. When misspecifying dimensionality (e.g., fitting a one-factor model to two-factor data), the majority of the residual covariances are impacted by the misspecifications. Therefore, their average values are less sensitive to the size of the residual covariance matrix (as a function of p).

We showed that the population SRMR is not only determined by the size of the model misspecification, but may be influenced by other factors, including the type of model misspecification, the magnitude of the factor loading, and model size. As a result, the practice of using the SRMR with a single cutoff for testing model close fit may be misleading. For example, using the existing criterion of sample SRMR ≤ 0.08 would lead us to retain as a close fitting model a one-factor model fitted to data generated according to a two-factor model with an inter-factor correlation of ρ = 0.6 if the standardized factor loadings were ≤ 0.60 (e.g., λ ≤ 0.60). Moreover, regardless of the magnitude of the factor loading or the model size, the currently accepted close fit criteria cannot detect misspecifications caused by omitting correlated residuals, even if the number and size of the omitted residual correlations are fairly large (e.g., four residual correlations of size 0.40).

In order to better distinguish close fitting models from models with substantially non-ignorable misspecification, we recommend examining both the largest standardized residual covariance (in absolute value) and the SRMR in light of the average estimated communality, that is, R̄². Using this two-index strategy will capture different types of model misspecifications. In addition, for each index, to identify closely fitting models, a reference that is relatively robust to the magnitude of the factor loading and model size can be proposed. In this study, we considered three instances of misspecified models with substantively ignorable misspecification (i.e., providing a close fit to the true model): (1) fitting a one-factor model to two-factor data when inter-factor correlations ≥ 0.90; (2) omitting cross-loadings with standardized values ≤ 0.10; or (3) ignoring correlated residuals with correlations ≤ 0.10. At the population level, such closely fitting models can be identified when: (a) the largest absolute value of the standardized residual covariance ≤ 0.10, and (b) the SRMR to average communality (i.e., R̄²) ratio SRMR/R̄² ≤ .05 (or SRMR ≤ .05 × R̄²).

In Appendix B, we further evaluate the generalizability of the proposed reference values in more complex situations (i.e., the population model has three correlated factors with different magnitudes of factor loadings). The two-index strategy also performed reasonably well under the complex study conditions.
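As an illustration of the two-index strategy (our sketch, not an official routine), the following code screens a fitted single-group lavaan model against the population reference values for close fit given above. Note that it uses the biased sample SRMR reported by the software; the unbiased estimate and its confidence interval are preferable whenever sampling error is non-negligible, as discussed further below.

# Two-index screening of a single-group lavaan fit against the
# population reference values for close fit.
library(lavaan)
two_index_check <- function(fit, cut_resid = 0.10, cut_ratio = 0.05) {
  S     <- lavInspect(fit, "sampstat")$cov        # observed covariances
  Sigma <- fitted(fit)$cov                        # model-implied covariances
  D     <- diag(1 / sqrt(diag(S)))
  E     <- D %*% (S - Sigma) %*% D                # standardized residuals
  max_resid <- max(abs(E[lower.tri(E)]))          # largest |residual covariance|

  srmr  <- unname(fitMeasures(fit, "srmr"))       # biased sample SRMR, Equation (4)
  r2bar <- mean(lavInspect(fit, "rsquare"))       # average communality of the indicators

  list(max_resid  = max_resid,
       srmr_ratio = srmr / r2bar,
       close_fit  = (max_resid <= cut_resid) && (srmr / r2bar <= cut_ratio))
}
# For the "acceptable fit" reference values introduced below, the same
# function can be called with cut_resid = 0.15 and cut_ratio = 0.10.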
Table 1. SRMR reference values that meet the criteria for close fit (SRMR ≤ .05 × R̄²) and adequate fit (.05 × R̄² < SRMR ≤ .10 × R̄²) for selected values of average communality (R̄²).

R̄²       SRMR reference value for close fit    SRMR reference value for adequate fit
0.010     0.001                                 0.001
0.040     0.002                                 0.004
0.090     0.005                                 0.009
0.160     0.008                                 0.016
0.250     0.013                                 0.025
0.360     0.018                                 0.036
0.490     0.025                                 0.049
0.640     0.032                                 0.064
Table 2. Population SRMR and average sample estimates.

                 Factor loading = 0.8                        Factor loading = 0.4
N        P.SRMR   E[SRMRb]   SRMRb    SRMRu      P.SRMR   E[SRMRb]   SRMRb    SRMRu
50       0.058    0.074      0.076    0.057      0.014    0.097      0.097    0.021
100      0.058    0.066      0.067    0.058      0.014    0.069      0.069    0.016
200      0.058    0.062      0.063    0.058      0.014    0.050      0.050    0.014
300      0.058    0.061      0.061    0.058      0.014    0.042      0.042    0.013
400      0.058    0.060      0.060    0.058      0.014    0.037      0.037    0.013
500      0.058    0.059      0.060    0.058      0.014    0.034      0.034    0.013
600      0.058    0.059      0.059    0.058      0.014    0.031      0.031    0.013
700      0.058    0.059      0.059    0.058      0.014    0.029      0.030    0.013
800      0.058    0.059      0.059    0.058      0.014    0.028      0.028    0.013
900      0.058    0.059      0.059    0.058      0.014    0.027      0.027    0.014
1000     0.058    0.058      0.059    0.058      0.014    0.026      0.026    0.014
2000     0.058    0.058      0.058    0.058      0.014    0.021      0.021    0.014
5000     0.058    0.058      0.058    0.058      0.014    0.017      0.017    0.014
10000    0.058    0.058      0.058    0.058      0.014    0.016      0.016    0.014
Researchers may differ in what they consider a SIM, and we acknowledge that our own personal criteria for SIM, and hence for close fit, are quite stringent. We believe many researchers would also consider (1) fitting a one-factor model to two-factor data with an inter-factor correlation of ρ ≥ 0.80, (2) setting cross-loadings ≤ 0.20 to zero, or (3) setting residual correlations ≤ 0.20 to zero to be substantively ignorable misspecifications. As a result, we believe models with these levels of misspecification are acceptable. Our results indicate that models with acceptable misfit can be distinguished using our two-index strategy with the following criteria: (1) largest absolute value of the standardized residual covariance ≤ 0.15, and (2) SRMR ≤ 0.10 × R̄². For convenience, we provide in Table 1 the values of the SRMR corresponding to our definitions of close and acceptable fit for various levels of average communality (R̄² of the observed indicators). We are unable to provide values of the biased SRMR corresponding to our definitions of close and acceptable fit because the expected value of the biased SRMR is a function of 1/√N (Appendix A).

Figure 7. Relationship between the expected value of the biased sample SRMR, population SRMR, and sample size (N).

It is worth noting that these reference values for close and adequate fit do not have to serve as fixed criteria. The recommendations are based on our personal cutoffs for deciding whether to retain or reject a misspecified model. We acknowledge that the definition of a closely fitting model is subjective; in practice, researchers may differ in what they consider a closely or acceptably fitting model, and they need not agree with our classification of 'close' and 'acceptable' fit models. However, we provide enough information for them to make an informed decision on what cutoff to use given their qualitative decision on which models are close enough to be actionable given the true models.
That is, different reference values can be obtained from the figures⁵ or easily generated by following the paradigm of the current study, allowing researchers to employ a criterion that meets the unique needs of their proposed research.

⁵ Also see the supplemental tables.

In the current study, we focus on population parameters. Of course, in applications only sample estimates are available. Note that, because of sampling error, the reference values we provide are not sample cutoffs. That is, it may not be proper to compare the sample SRMR (or the largest standardized residual covariance) with the population cutoffs unless the sample size is very large. To account for sampling variability, we recommend applying the population criteria with confidence intervals (or statistical tests of model close fit; e.g., SRMRc), which allow researchers to make population inferences using sample data. Specifically, for both the SRMR and an individual standardized residual covariance term, statistical theory is available to obtain their asymptotic standard errors and, in turn, to compute confidence intervals (Maydeu-Olivares, 2017; Maydeu-Olivares & Shi, 2017; Maydeu-Olivares, Shi, & Rosseel, 2018; Ogasawara, 2001). Simulation studies have also shown that in finite samples the point estimates and the confidence intervals are quite accurate for both the SRMR and the individual standardized residual covariances, even in samples of size 100 and in very large models (Maydeu-Olivares, Shi, & Rosseel, 2018; Maydeu-Olivares & Shi, 2017). Of course, when considering the statistical significance (or confidence intervals) of individual standardized residuals, it is important to adjust for multiple testing. In our experience (Maydeu-Olivares & Shi, 2017), the Bonferroni method suffices for this purpose, although more complex procedures (e.g., Benjamini & Hochberg, 1995) may be employed. Additional work is needed to further explore the use of individual standardized residuals to assess the overall fit of a model. We hope that this paper enlightens SEM researchers and provides additional information to assist them when conducting the difficult task of assessing model-data fit.
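As a rough illustration of the Bonferroni adjustment for individual standardized residuals recommended above, the following R/lavaan sketch flags residuals that remain significant after correction. It assumes a previously fitted lavaan object fit (a hypothetical placeholder) and assumes that residuals(fit, type = "standardized") returns residuals divided by their standard errors, as in recent lavaan versions; it is not the authors' code.

```r
# A minimal sketch (R / lavaan) of a Bonferroni-adjusted screen of individual
# standardized residuals. `fit` is a previously fitted lavaan object
# (hypothetical); "standardized" residuals (residual / SE) may be NA for
# some elements, which is handled below.
library(lavaan)

z     <- residuals(fit, type = "standardized")$cov
zlow  <- z[lower.tri(z)]                 # unique off-diagonal residual statistics
pvals <- 2 * pnorm(-abs(zlow))           # two-sided p values
m     <- sum(!is.na(pvals))              # number of tests actually available

flagged <- which(!is.na(pvals) & pvals < 0.05 / m)   # Bonferroni at alpha = .05
length(flagged)                                      # residuals still significant
```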
Article information

Conflict of interest disclosures: Each author signed a form for disclosure of potential conflicts of interest. No authors reported any financial or other conflicts of interest in relation to the work described.

Ethical principles: The authors affirm having followed professional ethical guidelines in preparing this work. These guidelines include obtaining informed consent from human participants, maintaining ethical treatment and respect for the rights of human or animal participants, and ensuring the privacy of participants and their data, such as ensuring that individual participants cannot be identified in reported results or from publicly available original or archival data.

Funding: This work was supported by Grant SES-1659936 from the National Science Foundation.

Role of the funders/sponsors: None of the funders or sponsors of this research had any role in the design and conduct of the study; collection, management, analysis, and interpretation of data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Acknowledgments: We are indebted to Yves Rosseel for implementing the unbiased SRMR in lavaan. We thank Peter Molenaar and the anonymous reviewers for their valuable suggestions on prior versions of this manuscript. We also acknowledge the Research Computing Center at the University of South Carolina for providing the computing resources that contributed to the results of this paper. The ideas and opinions expressed herein are those of the authors alone, and endorsement by the authors' institutions or the National Science Foundation is not intended and should not be inferred.

ORCID

Dexin Shi  http://orcid.org/0000-0002-4120-6756

References

Barrett, P. (2007). Structural equation modelling: Adjudging model fit. Personality and Individual Differences, 42, 815–824. https://doi.org/10.1016/j.paid.2006.09.018
Beauducel, A., & Wittmann, W. W. (2005). Simulation study on fit indexes in CFA based on data with slightly distorted simple structure. Structural Equation Modeling: A Multidisciplinary Journal, 12(1), 41–75. https://doi.org/10.1207/s15328007sem1201_3
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57(1), 289–300. http://www.jstor.org/stable/2346101
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107(2), 238–246. https://doi.org/10.1037/0033-2909.107.2.238
Bentler, P. M. (1995). EQS 5 [Computer program]. Encino, CA: Multivariate Software, Inc.
Box, G. E. P. (1979). Some problems of statistics and everyday life. Journal of the American Statistical Association, 74, 1–4. https://doi.org/10.1080/01621459.1979.10481600
Browne, M. W. (1972). Oblique rotation to a partially specified target. British Journal of Mathematical and Statistical Psychology, 25, 207–212. https://doi.org/10.1111/j.2044-8317.1972.tb00492.x
Browne, M. W. (1982). Covariance structures. In D. M. Hawkins (Ed.), Topics in applied multivariate analysis (pp. 72–141). Cambridge: Cambridge University Press.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Newbury Park, CA: Sage.
Chen, F., Curran, P. J., Bollen, K. A., Kirby, J., & Paxton, P. (2008). An empirical evaluation of the use of fixed cutoff points in RMSEA test statistic in structural equation models. Sociological Methods & Research, 36, 462–494. https://doi.org/10.1177/0049124108314720
D'Zurilla, T. J., Nezu, A. M., & Maydeu-Olivares, A. (2002). Manual of the Social Problem-Solving Inventory–Revised. North Tonawanda, NY: Multi-Health Systems, Inc.
DiStefano, C. (2016). Using fit indices in structural equation modeling. In K. Schweizer & C. DiStefano (Eds.), Principles and methods of test construction: Standards and recent advancements (pp. 166–196). Göttingen, Germany: Hogrefe Publishers.
Edwards, M. C. (2013). Purple unicorns, true models, and other things I've never seen. Measurement: Interdisciplinary Research & Perspective, 11, 107–111. https://doi.org/10.1080/15366367.2013.835178
Fan, X., & Sivo, S. A. (2005). Sensitivity of fit indexes to misspecified structural or measurement model components: Rationale of two-index strategy revisited. Structural Equation Modeling: A Multidisciplinary Journal, 12, 343–367. https://doi.org/10.1207/s15328007sem1203_1
Fan, X., & Sivo, S. A. (2007). Sensitivity of fit indices to model misspecification and model types. Multivariate Behavioral Research, 42, 509–529. https://doi.org/10.1080/00273170701382864
Garrido, L. E., Abad, F. J., & Ponsoda, V. (2016). Are fit indices really fit to estimate the number of factors with categorical variables? Some cautionary findings via Monte Carlo simulation. Psychological Methods, 21(1), 93–111. https://doi.org/10.1037/met0000064
Gerbing, D. W., & Anderson, J. C. (1985). The effects of sampling error and model characteristics on parameter estimation for maximum likelihood confirmatory factor analysis. Multivariate Behavioral Research, 20, 255–271. https://doi.org/10.1207/s15327906mbr2003_2
Hancock, G. R., & Mueller, R. O. (2011). The reliability paradox in assessing structural relations within covariance structure models. Educational and Psychological Measurement, 71, 306–324.
Heene, M., Hilbert, S., Draxler, C., Ziegler, M., & Bühner, M. (2011). Masking misfit in confirmatory factor analysis by increasing unique variances: A cautionary note on the usefulness of cutoff values of fit indices. Psychological Methods, 16, 319–336.
Hoogland, J. J., & Boomsma, A. (1998). Robustness studies in covariance structure modeling. Sociological Methods & Research, 26, 329–367. https://doi.org/10.1177/0049124198026003003
Hu, L., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3, 424–453. https://doi.org/10.1037//1082-989X.3.4.424
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55. https://doi.org/10.1080/10705519909540118
Jöreskog, K. G., & Sörbom, D. (1988). LISREL 7: A guide to the program and applications (2nd ed.). Chicago, IL: International Education Services.
MacCallum, R. C. (2003). Working with imperfect models. Multivariate Behavioral Research, 38(1), 113–139. https://doi.org/10.1207/S15327906MBR3801_5
Maiti, S. S., & Mukherjee, B. N. (1990). A note on distributional properties of the Jöreskog–Sörbom fit indices. Psychometrika, 55, 721–726. https://doi.org/10.1007/BF02294619
Maydeu-Olivares, A. (2017). Assessing the size of model misfit in structural equation models. Psychometrika, 82, 533–558. https://doi.org/10.1007/s11336-016-9552-7
Maydeu-Olivares, A., & D'Zurilla, T. J. (1996). A factor-analytic study of the Social Problem-Solving Inventory: An integration of theory and data. Cognitive Therapy and Research, 20, 115–133. https://doi.org/10.1007/BF02228030
Maydeu-Olivares, A., & Joe, H. (2014). Assessing approximate fit in categorical data analysis. Multivariate Behavioral Research, 49, 305–328. https://doi.org/10.1080/00273171.2014.911075
Maydeu-Olivares, A., & Shi, D. (2017). Effect sizes of model misfit in structural equation models: Standardized residual covariances and residual correlations. Methodology, 13(Suppl. 1), 23–30. https://doi.org/10.1027/1614-2241/a000129
Maydeu-Olivares, A., Shi, D., & Rosseel, Y. (2018). Assessing fit in structural equation models: Monte-Carlo evaluation of RMSEA versus SRMR confidence intervals and tests of close fit. Structural Equation Modeling: A Multidisciplinary Journal, 25, 389–402.
McNeish, D., An, J., & Hancock, G. R. (2018). The thorny relation between measurement quality and fit index cutoffs in latent variable models. Journal of Personality Assessment, 100(1), 43–52.
McDonald, R. P., & Ho, M.-H. R. (2002). Principles and practice in reporting structural equation analyses. Psychological Methods, 7(1), 64–82. https://doi.org/10.1037//1082-989X.7.1.64
Moshagen, M. (2012). The model size effect in SEM: Inflated goodness-of-fit statistics are due to the size of the covariance matrix. Structural Equation Modeling: A Multidisciplinary Journal, 19(1), 86–98. https://doi.org/10.1080/10705511.2012.634724
Ogasawara, H. (2001). Standard errors of fit indices using residuals in structural equation modeling. Psychometrika, 66, 421–436. https://doi.org/10.1007/BF02294443
R Development Core Team. (2015). R: A language and environment for statistical computing. Vienna, Austria: The R Foundation for Statistical Computing.
Raykov, T. (2000). On sensitivity of structural equation modeling to latent relation misspecifications. Structural Equation Modeling, 7, 596–607. https://doi.org/10.1207/S15328007SEM0704_4
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48, 1–36. https://doi.org/10.18637/jss.v048.i02
Saris, W. E., Satorra, A., & van der Veld, W. M. (2009). Testing structural equation models or detection of misspecifications? Structural Equation Modeling: A Multidisciplinary Journal, 16, 561–582. https://doi.org/10.1080/10705510903203433
Savalei, V. (2012). The relationship between root mean square error of approximation and model misspecification in confirmatory factor analysis models. Educational and Psychological Measurement, 72, 910–932. https://doi.org/10.1177/0013164412452564
Shi, D., Lee, T., & Terry, R. A. (2015). Abstract: Revisiting the model size effect in structural equation modeling (SEM). Multivariate Behavioral Research, 50, 142–142. https://doi.org/10.1080/00273171.2014.989012
Shi, D., Lee, T., & Terry, R. A. (2018). Revisiting the model size effect in structural equation modeling. Structural Equation Modeling: A Multidisciplinary Journal, 25, 21–40. https://doi.org/10.1080/10705511.2017.1369088
Shi, D., Lee, T., & Maydeu-Olivares, A. (2018). Understanding the model size effect on SEM fit indices. Educational and Psychological Measurement. https://doi.org/10.1177/0013164418783530
Steiger, J. H. (1989). A supplementary module for SYSTAT and SYGRAPH. Evanston, IL: Systat, Inc.
Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25, 173–180. https://doi.org/10.1207/s15327906mbr2502_4
West, S. G., Taylor, A. B., & Wu, W. (2012). Model fit and model selection in structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural equation modeling. New York: Guilford Press.
Yuan, K.-H. (2005). Fit indices versus test statistics. Multivariate Behavioral Research, 40(1), 115–148. https://doi.org/10.1207/s15327906mbr4001_5

Appendix A

Computation of the population SRMR and bias of its unbiased and biased estimates

To illustrate the biases of the biased and unbiased estimates of the population SRMR, we provide a simulation example. The population model was an independent-cluster confirmatory factor model with two correlated factors (correlation coefficient ρ = 0.80). Each factor had five normally distributed indicators. The population factor loadings were set to either 0.80 (error variance = 0.36) or 0.40 (error variance = 0.84). Misspecification was introduced by ignoring the multidimensional structure and fitting a common factor model to the two-factor data. Table 2 provides the mean, across 1000 replications, of the biased and unbiased SRMRs obtained across a number of sample sizes (ranging from 50 to 10,000 observations). Maximum likelihood estimation was used. All computations were performed using the lavaan package in R (R Development Core Team, 2015; Rosseel, 2012).

Following Maydeu-Olivares (2017), the population SRMR was computed as follows: the population covariance matrix was input into lavaan as if it were the sample covariance matrix, along with the target sample size. The estimated standardized residuals were then used to compute the population SRMR. As a side product, we also computed the expected value of the biased sample SRMR given in Equation (4). The results are summarized in Table 2. In this table, we see how the behavior of the sample SRMRs depended on the factor loading. More accurate SRMR values were obtained with higher factor loadings, undoubtedly because they were more accurately estimated (Gerbing & Anderson, 1985; Hoogland & Boomsma, 1998). When the factor loading was 0.4, the unbiased SRMR provided estimates with a relative bias of <10% in samples of 100 observations and higher, but a sample size of 1000 was needed to obtain a relative bias of <1%. In contrast, when the factor loading was 0.8, the relative bias of the unbiased SRMR estimate did not exceed 1%, even when the sample size was 50.

The results in Table 2 also reveal that the biased SRMR overestimated the population SRMR. This suggests that under such conditions, model fit may appear poorer than it actually is. The bias was not exceptionally large when the factor loading was high (0.8) and the sample size was at least 200: for a population value of 0.058, the average of the biased SRMR was 0.063 at a sample size of 200 (the unbiased average was 0.058). However, when the factor loading was low (0.4), the bias of the biased SRMR was unacceptable even with a sample size of 10,000. Thus, for a population value of 0.014, the average of the biased SRMR was 0.069 at a sample size of 200 (the unbiased average was 0.016).

We also see in Table 2 that the asymptotic approximation of the average behavior of the biased SRMR was rather accurate. At low factor loadings, our approximation was accurate to three digits, even with a sample size of 50. At high factor loadings, three-digit accuracy was obtained as soon as the sample size reached 300. Even in the worst case (50 observations), the approximation was fairly accurate (an expected mean of 0.074 versus an actual mean of 0.076).

Although it is not apparent in Equation (5), because N is embedded in Ns, the expected value of the biased sample SRMR can be well approximated by a function of 1/√N for each value of the population SRMR. This is illustrated graphically in Figure 7.
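A minimal R/lavaan sketch of this population-SRMR computation is given below. It rebuilds the two-factor population covariance matrix underlying the loading = 0.80 condition of Table 2 and fits the misspecified one-factor model to it. The object names are ours, not the authors' code, and lavaan's default SRMR may differ in small computational details from the formula used in the article.

```r
# A minimal sketch (R / lavaan) of the population-SRMR computation described
# in Appendix A, for the loading = 0.80 condition of Table 2.
library(lavaan)

# Two-factor population model: 5 indicators per factor, loadings .80, rho = .80
Lambda <- matrix(c(rep(.8, 5), rep(0, 5),
                   rep(0, 5), rep(.8, 5)), nrow = 10, ncol = 2)
Phi    <- matrix(c(1, .8, .8, 1), 2, 2)
Theta  <- diag(1 - rowSums((Lambda %*% Phi) * Lambda))   # standardized error variances
Sigma_pop <- Lambda %*% Phi %*% t(Lambda) + Theta
rownames(Sigma_pop) <- colnames(Sigma_pop) <- paste0("y", 1:10)

# Misspecified one-factor model, fitted to the population matrix as if it
# were a sample covariance matrix
mod1f <- paste("f =~", paste0("y", 1:10, collapse = " + "))
fit   <- cfa(mod1f, sample.cov = Sigma_pop, sample.nobs = 1000, std.lv = TRUE)

fitMeasures(fit, "srmr")                    # should be close to the .058 in Table 2
max(abs(residuals(fit, type = "cor")$cov))  # largest standardized residual covariance
```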
Appendix B

Using the standardized root mean square residual and the largest standardized residual covariance to identify close fitting models

This appendix evaluates the performance of the two-index strategy and examines the generalizability of the proposed reference values under more complex modeling situations. The population model used was a CFA model with three correlated factors. The total number of observed variables (p) was 36, 72, or 144, resulting in 12, 24, or 48 variables per factor. The population variances of all three factors were set to one. The error variances were set such that all factor loadings were on a standardized scale. To reflect a more realistic situation, items loading on the same factor had loadings of different magnitudes: low (0.40), medium (0.60), or high (0.80). For a given factor, the number of items at each of the three loading levels was the same (e.g., p = 12, three items of each loading size). Three types of model misspecification were manipulated, as described below.

Misspecified dimensionality. The population model consists of three correlated factors. The fitted models were either a one-factor model or a two-factor model collapsing the correlation between factors one and two. The level of misspecification was manipulated by changing the degree of correlation among the factors in the population: when the unidimensional structure was fitted, smaller inter-factor correlations in the population model indicated a greater level of misspecification. The seven patterns of inter-factor correlations that were included are summarized in Table 3. In total, 42 cases (3 model sizes × 7 inter-factor correlation patterns × 2 fitted models) were considered. Following the definition used in this study, closely fitting models imply that a unidimensional structure was fit when the inter-factor correlations were ≥ 0.90.

Table 3. Patterns of inter-factor correlations considered in Appendix B.

        Inter-factor correlations         Close fit?
        ρ12      ρ23      ρ13             1F CFA    2F CFA
        0.9      0.9      0.9             Yes       Yes
        0.9      0.9      0.8             No        Yes
        0.9      0.9      0.7             No        Yes
        0.9      0.8      0.8             No        Yes
        0.9      0.8      0.7             No        Yes
        0.8      0.8      0.8             No        No
        0.7      0.7      0.7             No        No

Omitting cross-loadings. The population model consists of three correlated factors; however, three indicators loaded on two different factors⁶. The fitted model ignored the cross-loadings, with their values incorrectly fixed to zero. The inter-factor correlations were set to 0.30, and the level of misspecification was determined by the population value of the omitted cross-loadings. The magnitudes of the (standardized) cross-loadings were 0.10, 0.20, or 0.30. We also manipulated the locations of the cross-loadings: they occurred on items with low (0.40), medium (0.60), or high (0.80) primary factor loadings. The number of conditions considered was 27 = 3 (model size levels) × 3 (magnitudes of omitted cross-loadings) × 3 (locations of cross-loadings). Closely fitting models were defined as those omitting cross-loadings with standardized values ≤ 0.10.

Omitting residual correlations. In the population model, three correlated residuals were present⁷. A simple-structure three-factor CFA model with no correlated residuals was fit. The correlated residuals were introduced between items with either low (0.40), medium (0.60), or high (0.80) factor loadings. The level of model misspecification was indicated by the value of the residual correlations (0.10, 0.20, or 0.30); larger values implied a higher level of model misspecification. For omitted residual correlations, the total number of conditions was 27 = 3 (model size levels) × 3 (magnitudes of omitted residual correlations) × 3 (locations of correlated residuals). If correlated residuals with correlations ≤ 0.10 were ignored, the (misspecified) model is considered closely fitting.

Following the same procedure discussed in the earlier section, for each simulated condition we computed the population SRMR and the individual standardized residual covariances. The ratio of the SRMR to the average communality (i.e., the average squared standardized factor loading, λ̄²) and the largest absolute value of the standardized residual covariance were calculated and used as the two indices for examining model close fit. In Figure 8, for each type of model misspecification, we present the relationship between SRMR/λ̄² and the largest standardized residual covariance, along with the recommended reference values for model close fit (indicated by the solid horizontal line and the dashed vertical line, respectively). In addition, markers are used to differentiate cases that would be denoted close fitting models from those that would not (based on the definitions discussed above). As shown in the figure, the proposed reference values performed well in identifying close fitting models. All close fitting models yielded a largest absolute standardized residual covariance ≤ 0.10 and SRMR/λ̄² ≤ 0.05 (i.e., they fall in the lower left quadrant created by the two reference lines). Conversely, using the two-index cutoffs, almost all models with more severe misspecifications could be successfully identified.

Figure 8. Using the two-index strategy for identifying close fitting models.

⁶ The three indicators loaded on factors 1 & 2, factors 1 & 3, and factors 2 & 3, respectively.
⁷ The three correlated residuals were introduced between items from factors 1 & 2, factors 1 & 3, and factors 2 & 3, respectively.
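Finally, to illustrate how additional reference values could be generated under the Appendix B paradigm, the sketch below constructs one simplified condition: a three-factor population model with 12 indicators per factor, mixed standardized loadings of 0.40/0.60/0.80, inter-factor correlations of 0.30, and a single omitted residual correlation of 0.10 (the appendix used three such residuals). All names are hypothetical placeholders; this is a sketch of the paradigm, not the authors' simulation code.

```r
# A minimal sketch (R / lavaan) of the Appendix B paradigm, simplified to one
# omitted residual correlation of .10.
library(lavaan)

p_f <- 12                                       # indicators per factor (p = 36)
lam <- rep(c(.4, .6, .8), each = p_f / 3)       # mixed loading sizes within a factor
Lambda <- matrix(0, 3 * p_f, 3)
for (k in 1:3) Lambda[((k - 1) * p_f + 1):(k * p_f), k] <- lam
Phi   <- matrix(.3, 3, 3); diag(Phi) <- 1       # inter-factor correlations of .30
Theta <- diag(1 - rowSums((Lambda %*% Phi) * Lambda))
# one residual correlation of .10 between an item of factor 1 and one of factor 2
Theta[1, p_f + 1] <- Theta[p_f + 1, 1] <- .10 * sqrt(Theta[1, 1] * Theta[p_f + 1, p_f + 1])

Sigma <- Lambda %*% Phi %*% t(Lambda) + Theta
rownames(Sigma) <- colnames(Sigma) <- paste0("y", 1:(3 * p_f))

# Simple-structure three-factor model that ignores the correlated residual
mod <- paste(paste("f1 =~", paste0("y",  1:12, collapse = " + ")),
             paste("f2 =~", paste0("y", 13:24, collapse = " + ")),
             paste("f3 =~", paste0("y", 25:36, collapse = " + ")), sep = "\n")
fit <- cfa(mod, sample.cov = Sigma, sample.nobs = 10000, std.lv = TRUE)

srmr  <- fitMeasures(fit, "srmr")
maxr  <- max(abs(residuals(fit, type = "cor")$cov))
r2bar <- mean(lavInspect(fit, "rsquare"))
round(c(SRMR = unname(srmr), largest_resid = maxr, SRMR_to_R2 = unname(srmr) / r2bar), 4)
```

For this mild misspecification, both indices should fall below the close-fit reference values proposed in the article; increasing the omitted residual correlation (or the omitted cross-loadings) shows how the indices move past the cutoffs.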
