11.1 Introduction
H. Baumgartner ()
Department of Marketing, Smeal College of Business, The Pennsylvania State University,
482 Business Building, University Park, PA 16802, USA
e-mail: jxb14@psu.edu
B. Weijters
Department of Personnel Management, Work and Organizational Psychology, Ghent University,
Henri Dunantlaan 2, Ghent 9000, Belgium
11 Structural Equation Modeling 337
other latent variables in the model. In contrast, attitude and USE are endogenous
latent variables (although USE is not really latent in the present case) because
they depend on other constructs in the model. The Greek letters ξ (ksi) and η (eta) are sometimes used to refer to exogenous and endogenous latent variables, respectively, but more descriptive names are used in the present case. Directed paths from exogenous to endogenous latent variables are sometimes called γ (gamma) and directed paths from endogenous to other endogenous latent variables are called β (beta), although it is not necessary to make this distinction. The model assumes
that the determinants of an endogenous latent variable do not account for all of the
variation in the variable, which implies that an error term (ζ, zeta) is associated
with each endogenous latent variable (a so-called error in equation or equation
disturbance); there is no error term for Probit [P(USE)] since it is fixed in the present
case. Curved arrows starting and ending at the same variable indicate variances, and
two-way arrows between variables indicate covariances. For example, the curved
arrows associated with the five belief factors are the variances of the exogenous
constructs, which are denoted by φii (phi). For simplicity, the variances of the errors in equations (which are usually denoted by ψii, psi) and the covariances between the exogenous constructs (φij) are not shown explicitly in the model; usually, nonzero covariances between the exogenous constructs are assumed by default.
If the constructs in one’s theory are latent variables, they have to be linked to
observed measures. Except for USE, each of the other six constructs is measured by
three observed (manifest) variables or indicators, which are shown as rectangles
(or squares). The letters x and y are sometimes used to refer to the indicators
of exogenous and endogenous latent constructs, respectively, but more descriptive
names are used in the present case. The model assumes that a respondent’s observed
score for a given variable is a function of the underlying latent variable of theoretical
interest; this is called a reflective indicator model, and the corresponding indicators
are sometimes called effect indicators. Since observed variables are fallible, there
is also a unique component of variation, which is frequently (and somewhat
inaccurately) equated with random error variance. The errors are usually denoted by
δ (delta) for indicators of exogenous latent variables and ε (epsilon) for indicators of endogenous latent variables; the corresponding variances are θδ and θε (theta), respectively (which are not shown explicitly in Fig. 11.1). The strength of the relationship between an indicator and its underlying latent variable (construct, factor) is called a factor loading and is usually denoted by λ (lambda).
The observed USE measure is a 0/1 variable in the present case (self-scanning
was or was not used during the particular shopping trip in question) and one may
assume that the observed variable is a crude (binary) measure of an underlying latent
variable indicating a consumer’s propensity to use self-scanning. This requires the
estimation of a threshold parameter.
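The threshold idea can be sketched numerically. The following is a hedged illustration (the function names are ours, not the chapter's), assuming the latent use propensity is normally distributed: the binary indicator equals 1 exactly when the propensity exceeds the threshold.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_one(tau, latent_mean=0.0, latent_sd=1.0):
    """P(y = 1) = P(y* > tau): the observed binary indicator is 1
    whenever the normal latent propensity y* exceeds the threshold tau."""
    return 1.0 - norm_cdf((tau - latent_mean) / latent_sd)
```

With a standardized propensity, a threshold of 0 implies a 50% use rate; raising the threshold lowers the implied probability of use, which is what the estimated threshold parameter captures.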
Of course, the model depicted in Fig. 11.1 can also be specified algebraically.
This is shown in Table 11.1. In Table 11.1, it is assumed that all relationships
between variables are linear. This is not explicitly expressed in the model in
Fig. 11.1 (which could be interpreted as a nonparametric structural equation model),
but relationships between variables are usually assumed to be linear (esp. when the
model is estimated with commonly used SEM programs), unless a distribution other
than the normal distribution is specified for a variable.
Note that the model in Fig. 11.1 or Table 11.1 is specified for observed variables
that have been mean-centered. In this case, latent variable means and equation
intercepts can be ignored. Although the means can be estimated, they usually do
not provide important additional information. However, in multi-sample analysis, to
be discussed below, means may be relevant (e.g., one may want to compare means
across samples) and are often modeled explicitly.
The hypothesized model shown in Fig. 11.1 contains six relationships between
constructs that are specified to be nonzero (i.e., the effect of the five belief factors
on attitude, and the effect of attitude on USE). However, one could argue that the
relationships that are assumed to be zero are even more important, because these
restrictions allow the researcher to test the plausibility of the specified model.
The model in Fig. 11.1 contains several restrictions. First, it is hypothesized that,
controlling for attitude, there are no direct effects from the five benefit factors on
the use of self-scanning. Technically speaking, the model assumes that the effects
of benefit beliefs on the use of self-scanning are mediated by consumers’ attitudes
(see Chap. 8). Second, the errors in equations are hypothesized to be uncorrelated.
This means that there are no influences on attitude and use that are common to both
constructs other than those contained in the model. Third, each observed variable
is allowed to load only on its assumed underlying factor; non-target loadings are
specified to be zero. Fourth, the model assumes that all errors of measurement are
uncorrelated. Models in which at least some of the error correlations are nonzero
could be entertained. Testing the model on empirical data will show whether these
assumptions are justified.
Before a model can be estimated or tested, it is important to ascertain that the
specified model is actually identified. Identification refers to whether or not a unique
solution is possible. A unique aspect of structural equation models is that many
variables in the model are unobserved. For example, in the measurement equations
for the observed variables, all the right-hand side variables are unobserved (see
Table 11.1). A first requirement for identification is that the scale in which the latent
variables are measured be fixed. This can be done by setting one factor loading per
latent variable to one or standardizing the factor variance to one. In Fig. 11.1, one
loading per factor was fixed at one.
A second requirement is that the number of model parameters (i.e., the number
of parameters to be estimated) not be greater than the number of unique elements
in the variance-covariance matrix of the observed measures. Since the number of unique variances and covariances is p(p + 1)/2, where p is the number of observed variables (19 in the present case), and since p(p + 1)/2 − r is the degrees of freedom
of the model, where r is the number of model parameters, this requirement says
that the number of degrees of freedom must be nonnegative. This is a necessary
condition for model identification, but it is not sufficient. If the model in Fig. 11.1
did not contain a categorical variable, the number of estimated parameters would be
53 and the model would have 137 degrees of freedom. Because of the presence of
the 0/1 USE variable, the situation is more complex, but the degrees of freedom of
the model is still 137. Thus, the necessary condition for identification is satisfied.
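The degrees-of-freedom count just described is simple arithmetic and can be verified directly (a minimal sketch; the function name is ours):

```python
def degrees_of_freedom(p, r):
    """df = p(p+1)/2 - r, where p is the number of observed variables
    and r is the number of free model parameters."""
    unique_moments = p * (p + 1) // 2  # unique variances and covariances
    return unique_moments - r

# The chapter's example: 19 observed variables give 19*20/2 = 190 unique
# moments; with 53 free parameters the model has 190 - 53 = 137 df.
df = degrees_of_freedom(19, 53)
```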
There are no identification rules that are both necessary and sufficient and can
be applied to any type of model. This makes determining model identification a
nontrivial task, at least for certain models. However, simple identification rules are
available for commonly encountered models and some of these will be described
later in the chapter.
Estimation means finding values of the model parameters such that the discrepancy
between the sample variance/covariance matrix of the observed variables (S) and the covariance matrix implied by the model is as small as possible.
¹ See Chap. 12 for a discussion of alternative methods that relax these assumptions.
The fit of a specified model to empirical data can be tested with a chi-square test,
which examines whether the null hypothesis of perfect fit is tenable. In principle,
this is an attractive test of the overall fit of the model, but in practice there are
two problems. First, the test is based on strong assumptions, which are often not
met in real data (although as explained earlier, robust versions of the test are
available). Second, on the one hand the test requires a large sample size, but on
the other hand, as the sample size increases, it becomes more likely that (possibly
minor and practically unimportant) misspecifications will lead to the rejection of a
hypothesized model.
Because of these shortcomings of the chi-square test of overall model fit,
many alternative fit indices have been proposed. Although researchers’ reliance
on these fit indices is somewhat controversial (model evaluation is based on mere
rules of thumb, and some authors argue that researchers dismiss a significant
chi-square test too easily), several alternative fit indices are often reported in
practice. Definitions, brief explanations, important characteristics, and commonly
used cutoffs for assessing model fit are summarized in Table 11.2.
We offer the following guidelines to researchers assessing the overall fit of a
model. First, a significant chi-square statistic should not be ignored because of the
presumed weaknesses of the test; after all, a significant chi-square value does show
that the model is inconsistent with the data. Close inspection of the hypothesized
model is necessary to determine whether or not the discrepancies identified by the
chi-square test are serious (even if some of the alternative fit indices suggest that
the fit of the model is reasonable). Second, surprisingly often, different fit indices
suggest different conclusions (i.e., the CFI may indicate a good fit of the model,
whereas the RMSEA is problematic). In these cases, particular care is required in
interpreting the model results. Third, a hypothesized model may be problematic
even when the overall fit indices are favorable (e.g., if estimated error variances are
negative or path coefficients have the wrong sign). Fourth, a well-fitting model is not
necessarily the “true” model. There may be other models that fit equally or nearly
equally well. In summary, overall fit indices seem to be most helpful in alerting
researchers to possible problems with the specified model.
Table 11.2 Summary of commonly used overall fit (or lack-of-fit) indices

Minimum fit function chi-square (χ²)
  Definition(a): (N − 1)f
  Characteristics(b): BF, SA, NNO, NP
  Interpretation and use: Tests the hypothesis that the specified model fits perfectly (within the limits of sampling error); the obtained χ² value should be smaller than χ²crit; note that the minimum fit function χ² is only one possible chi-square statistic and that different discrepancy functions will yield different χ² values

Root mean square error of approximation (RMSEA)
  Definition: √[(χ² − df) / ((N − 1)df)]
  Characteristics: BF, SA, NNO, P
  Interpretation and use: Estimates how well the fitted model approximates the population covariance matrix per df; Browne and Cudeck (1992) suggest that a value of 0.05 indicates a close fit and that values up to 0.08 are reasonable; Hu and Bentler (1999) recommend a cutoff value of 0.06; a p-value for testing the hypothesis that the discrepancy is smaller than 0.05 may be reported

Standardized root mean square residual (SRMR)
  Definition: square root of the mean squared standardized residual between the observed and model-implied covariances
  Characteristics: BF, SA
  Interpretation and use: Hu and Bentler (1999) recommend a cutoff value close to 0.08

Comparative Fit Index (CFI)
  Definition: 1 − (χ²t − dft) / (χ²n − dfn)
  Characteristics: GF, IM, NO, NP
  Interpretation and use: Measures the proportionate improvement in fit (defined in terms of noncentrality, i.e., χ² − df) as one moves from the baseline to the target model; originally, values greater than 0.90 were deemed acceptable, but Hu and Bentler (1999) recommend a cutoff value of 0.95

Tucker and Lewis nonnormed fit index (TLI, NNFI)
  Definition: (χ²n/dfn − χ²t/dft) / (χ²n/dfn − 1)
  Characteristics: GF, IM, ANO, P
  Interpretation and use: Measures the proportionate improvement in fit (defined in terms of noncentrality) as one moves from the baseline to the target model, per df; originally, values greater than 0.90 were deemed acceptable, but Hu and Bentler (1999) recommend a cutoff value of 0.95

(a) N = sample size; f = minimum of the fitting function; df = degrees of freedom; r = number of parameters estimated; p = number of observed variables; χ²crit = critical value of the χ² distribution with the appropriate number of degrees of freedom and for a given significance level; the subscripts n and t refer to the null (or baseline) and target models, respectively. The baseline model is usually the model of complete independence of all observed variables
(b) GF = goodness-of-fit index (i.e., the larger the fit index, the better the fit); BF = badness-of-fit index (i.e., the smaller the fit index, the better the fit); SA = stand-alone fit index (i.e., the model is evaluated in an absolute sense); IM = incremental fit index (i.e., the model is evaluated relative to a baseline model); NO = normed (in the sample) fit index; ANO = normed (in the population) fit index, but only approximately normed in the sample (i.e., can fall outside the [0, 1] interval); NNO = nonnormed fit index; NP = no correction for parsimony; P = correction for parsimony
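The index definitions in Table 11.2 can be sketched numerically. The following is a hedged illustration (the helper names are ours, and the sample size used in the comment is an assumption, since N is not reported at this point in the chapter):

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation:
    sqrt(max(chi2 - df, 0) / ((n - 1) * df))."""
    return math.sqrt(max(chi2 - df, 0.0) / ((n - 1) * df))

def cfi(chi2_t, df_t, chi2_n, df_n):
    """Comparative fit index: proportionate reduction in noncentrality
    (chi2 - df) when moving from the baseline (n) to the target (t) model."""
    d_t = max(chi2_t - df_t, 0.0)
    d_n = max(chi2_n - df_n, d_t)
    return 1.0 - d_t / d_n if d_n > 0 else 1.0

def tli(chi2_t, df_t, chi2_n, df_n):
    """Tucker-Lewis index: per-df improvement over the baseline model."""
    ratio_n = chi2_n / df_n
    ratio_t = chi2_t / df_t
    return (ratio_n - ratio_t) / (ratio_n - 1.0)

# With an assumed N of 500, rmsea(195.70, 80, 500) comes out near 0.054,
# in line with the CFA result reported in the empirical illustration.
```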
Often, researchers will iterate between examining the overall fit of the model,
inspecting residuals and modification indices, and looking at some of the details
of the specified model. However, once the researcher is comfortable with the final
model, this model has to be interpreted in detail. Usually, this will involve the
following. First, all model parameters are checked for consistency with expectations
and significance tests are conducted at least for the parameters that are of substantive
interest. Second, depending on the model, certain other analyses will be conducted.
For example, a researcher will usually want to report evidence about the reliability
and convergent validity of the observed measures, as well as the discriminant
validity of the constructs. Third, for models containing endogenous latent variables,
the amount of variance in each endogenous variable explained by the exogenous
latent variables should be reported. Finally, for some models one may want to
conduct particular model comparisons. For example, if a model contains three layers
of relationships, one may wish to examine to what extent the variables in the middle
layer mediate or channel the relationships between the variables in the first and third
layer. Or if a multi-sample analysis is performed, one may wish to test the invariance
of particular paths across different groups. More details about local fit assessment
will be provided below.
IIRxi = λ²ij φjj / (λ²ij φjj + θii)   (11.1)
total variance in an indicator should be substantive variance (i.e., IIR ≥ 0.5). One can
also summarize the reliability of all indicators of a given construct by computing the
average of the individual-item reliabilities. This is usually called average variance
extracted (AVE), that is,
AVEξj = Σ IIRxi / K   (11.2)

where K is the number of indicators (xi) for the construct in question (ξj). Similar to
IIR, a common rule of thumb is that AVE should be at least 0.5.
As a set, all measures of a given construct combined should be strongly related to
the underlying construct. One common index is composite reliability (CR), which
is defined as the squared correlation between an unweighted sum (or average) of the
measures of a construct and the construct itself. CR is a generalization of coefficient
alpha to a situation in which items can have different loadings on the underlying
factor and it can be computed as follows:
CR = (Σ λij)² φjj / ((Σ λij)² φjj + Σ θii)   (11.3)
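Equations (11.1)–(11.3) can be computed directly from the unstandardized loadings, the factor variance, and the error variances. A minimal sketch (function names and the example numbers are ours, not the chapter's):

```python
def iir(loading, factor_var, error_var):
    """Individual-item reliability (Eq. 11.1):
    lambda^2 * phi / (lambda^2 * phi + theta)."""
    substantive = loading ** 2 * factor_var
    return substantive / (substantive + error_var)

def ave(loadings, factor_var, error_vars):
    """Average variance extracted (Eq. 11.2): the mean of the IIRs."""
    iirs = [iir(l, factor_var, e) for l, e in zip(loadings, error_vars)]
    return sum(iirs) / len(iirs)

def composite_reliability(loadings, factor_var, error_vars):
    """Composite reliability (Eq. 11.3):
    (sum lambda)^2 * phi / ((sum lambda)^2 * phi + sum theta)."""
    num = sum(loadings) ** 2 * factor_var
    return num / (num + sum(error_vars))

# Hypothetical standardized solution: three items, loadings 0.8,
# factor variance 1.0, error variances 0.36 each.
```

With those hypothetical values each IIR and the AVE equal 0.64, and CR is about 0.84, so all three rules of thumb (IIR ≥ 0.5, AVE ≥ 0.5, CR above conventional reliability cutoffs) would be satisfied.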
unavoidable when the number of items in a scale is rather large (e.g., a personality
scale may consist of 20 or more items) and has certain advantages (e.g., parceling
may be used strategically to correct for lack of normality), but parceling has to be
used with care (e.g., the items in the parcel should be unidimensional). Particularly
when the factor structure of a set of items is not well-understood, item parceling is
not recommended. An alternative to item parceling is to average all the measures of
a given construct, fix the loading on the construct to one, and set the error variance
to (1 − α) times the variance of the average of the observed measures, where α is an estimate of the reliability of the composite of observed measures (such as coefficient α). However, the same caution as for item parceling is applicable here as well.
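The single-composite recipe just described amounts to one line of arithmetic. A hedged sketch (the function name and example values are ours):

```python
def composite_error_variance(composite_var, alpha):
    """Fixed error variance for an averaged single indicator:
    (1 - alpha) * Var(composite), with the loading fixed to one."""
    return (1.0 - alpha) * composite_var

# Hypothetical values: a composite with variance 0.50 and coefficient
# alpha of 0.85 gets a fixed error variance of 0.15 * 0.50 = 0.075.
theta = composite_error_variance(0.50, 0.85)
```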
The congeneric measurement model makes strong assumptions about the factor
loading matrix and the covariance matrix of the unique factors. Each indicator loads
on a single substantive factor, and the unique factors are uncorrelated.
It is possible to relax the assumption that the loadings of observed measures on
nontarget factors are zero. In Exploratory Structural Equation Modeling (ESEM),
the congeneric confirmatory factor model is replaced with an exploratory factor
model in which the number of factors is determined a priori and the initial factor
solution is rotated using target rotation (Marsh et al. 2014). The fit of the congeneric
factor model can be compared to the fit of an exploratory structural equation model
using a chi-square difference test (based on the difference of the two chi-square
values and the difference in the degrees of freedom of the two models) and, ideally,
the restrictions in the congeneric factor model will not decrease the fit substantially,
although frequently the fit does get worse. An alternative method for modeling a
more flexible factor pattern is based on Bayesian Structural Equation Modeling
(BSEM) (Muthén and Asparouhov 2012). In this approach, informative priors with
a small variance are specified for the cross-loadings (e.g., a normal prior with a
mean of zero and a variance of 0.01 for the standardized loadings, which implies
a 95 percent confidence interval for the loadings ranging from −0.2 to +0.2).²
Although both methods tend to improve the fit of specified models and may avoid
distortions of the factor solution when the congeneric measurement model is clearly
inconsistent with the data, the two approaches abandon the ideal that an indicator
should only be related to a single construct, which creates problems with the
interpretation of hypothesized factors.
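The BSEM prior in the example above implies its ±0.2 interval by simple normal-distribution arithmetic, which can be checked directly (a minimal sketch):

```python
import math

# A normal prior with mean 0 and variance 0.01 for a standardized
# cross-loading implies a 95% interval of 0 +/- 1.96 * sd.
prior_sd = math.sqrt(0.01)          # sd = 0.1
lo, hi = -1.96 * prior_sd, 1.96 * prior_sd
# (lo, hi) is (-0.196, +0.196), i.e., roughly -0.2 to +0.2 as in the text.
```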
The assumption that the substantive factors specified in the congeneric measure-
ment model are the only sources of covariation between observed measures is also
limiting. Frequently, there will be significant modification indices suggesting that
the covariation between certain unique factors should be freely estimated. However,
² See Chap. 16 on Bayesian Analysis.
of the sigmoid curve) and bi the difficulty parameter (i.e., the value of ξj at which the probability of a response of 1 is 0.5). The model is similar to logistic or probit regression, except that the explanatory variable ξj is latent rather than observed
(Wu and Zumbo 2007). The IRT model for binary data can be extended to ordinal
responses. The interested reader is referred to Baumgartner and Weijters (2017) for
a recent discussion.
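The two-parameter IRT model for a binary item can be sketched as follows (a hedged illustration; `theta` stands in for the latent variable, and the function name is ours):

```python
import math

def p_response(theta, a, b):
    """Two-parameter logistic IRT model for a binary item:
    P(y = 1 | theta) = 1 / (1 + exp(-a * (theta - b))),
    where a is the discrimination (slope of the sigmoid)
    and b is the difficulty parameter."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

At theta = b the probability is exactly 0.5, which is the defining property of the difficulty parameter described above; larger a values make the curve steeper around b.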
11.5.1 Introduction
one product, and their observed self-scanning use or non-use could be matched with
their entry survey data). In this sample, 65% (35%) were female (male). Further,
63% had had education after secondary school. As for age, 1% were aged 12–19,
21% 20–29, 21% 30–39, 28% 40–49, 19% 50–59, 7% 60–69, 2% 70–79, and 1%
80–89 years. Finally, 36% used self-scanning during their visit to the store.
In what follows, we illustrate the use of SEM on the self-scanning data, roughly
following the outline of the preceding exposition. Thus, we start with a CFA of the
five belief constructs. Next, we test measurement invariance of this factor structure
across men and women (multi-sample measurement). We then move on to full SEM,
testing a two-group (men/women) mediation model where the five belief factors are
used as antecedents of self-scanning use, mediated by attitude toward self-scanning
use. All analyses were run in Mplus 7.4.
Our first aim is to assess the factor structure of the five belief factors (PU, PEU,
REL, FUN and NEW). Note that the factor models are intended as stand-alone
examples of a measurement analysis. If a factor analysis were used as a precursor
to a full structural equation model, it would be common to also include the
endogenous constructs and their indicators in the measurement analysis. We start by
running an exploratory factor analysis where the 15 belief items freely load on five
factors using the default ML estimator with oblique GEOMIN rotation. This model
shows acceptable fit to the data: χ²(40) = 86.725, p < 0.001; RMSEA = 0.048 (90% confidence interval (CI) = [0.034, 0.062]); SRMR = 0.014; CFI = 0.989; TLI = 0.970. Each of the five factors shows loadings for the three target items
that are statistically significant (p < 0.05) and substantial (all loadings were greater
than 0.50, although most loadings were greater than 0.80). There were also six
significant cross-loadings, suggesting that the factor pattern does not have perfect
simple structure. However, these six cross-loadings do not seem problematic as they
are small (most are smaller than 0.10, and none are greater than 0.20).
We proceed to test a confirmatory factor analysis (CFA) of the five belief
factors. Even though the CFA model fits the data significantly worse than the
exploratory factor model (the two models are nested and can be compared with a chi-square difference test, Δχ²(40) = 108.975, p < 0.001), the fit of the CFA model is deemed acceptable, especially in terms of the alternative fit indices: χ²(80) = 195.70; RMSEA = 0.054 (90% CI = [0.044, 0.064]); SRMR = 0.037; CFI = 0.972; TLI = 0.963. Closer inspection of the local fit of the model shows
that five modification indices for factor loadings constrained to zero have a value
greater than 10; these five modification indices are for the non-target loadings
identified in the exploratory factor analysis. Although statistically significant, they
are not large enough to warrant model modifications, as this would come at the
expense of parsimony and replicability. Table 11.4 reports the CFA results for
individual items and factors. Overall, the results are satisfactory, with the exception
of two items that have problematic IIR values (less than 0.50). All AVE values
are at least 0.50 and all CR values are larger than 0.70, in support of convergent
validity. Table 11.5 evaluates discriminant and convergent validity by showing the
AVE’s and correlations for all factors. Discriminant validity is supported as the
squared correlations between constructs are smaller than the AVE’s of the constructs
involved in the correlation.
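The discriminant-validity check just applied is the Fornell and Larcker (1981) criterion, which is easy to state as a predicate (a minimal sketch; the function name and example values are ours):

```python
def discriminant_ok(ave_i, ave_j, corr_ij):
    """Fornell-Larcker criterion: the squared correlation between two
    constructs should be smaller than the AVE of each construct."""
    shared = corr_ij ** 2
    return shared < ave_i and shared < ave_j

# Hypothetical values: AVEs of 0.60 and 0.55 with a factor correlation
# of 0.45 pass (shared variance 0.2025 is below both AVEs); a factor
# correlation of 0.80 would fail (shared variance 0.64 exceeds both).
```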
Now that we have established a viable factor model, we can test for measurement
invariance between male and female respondents. To this purpose, we use the same
CFA model as before, but additionally specify gender as the grouping variable
and run a sequence of three models with constraints corresponding to configural
invariance, metric invariance and scalar invariance. Table 11.6 reports the model fit
results.
The comparison of the metric invariance model with the configural invariance
model shows no significant deterioration in fit, so metric invariance can be
accepted. Strictly speaking, the χ² difference testing scalar invariance against metric
invariance is significant at the 0.05 level, but there are good reasons to nevertheless
accept scalar invariance: the χ² difference is small, and the alternative fit indices
(CFI, TLI, SRMR, and RMSEA) do not deteriorate much, particularly the ones that
take into account model parsimony (TLI and RMSEA). The information-theory
based fit index BIC is lowest for the scalar invariance model. Moreover, closer
inspection of the results shows that the modification indices are rather small (the
highest modification index for an item intercept is 6.42). In sum, it is reasonable
to conclude that the five beliefs related to self-scanning are measured equivalently
among men and women, both in terms of scale metrics and item intercepts.
As a result, we can use the CFA model to compare factor means. To do so, we
set the factor means to zero in the male group while freely estimating the factor
means in the female group. None of the factor means are significantly different
across groups, although two differences come close: the means of PEU (t D 1.664,
p D 0.096) and REL (t D 1.709, p D 0.088) are somewhat lower for women than
for men.
To illustrate the use of full SEM, we test the model shown in Fig. 11.1, although
we include gender as a grouping variable and test the invariance of structural paths
across men and women. In order for comparisons of structural coefficients to be
meaningful, we imposed equality of factor loadings across groups. It was already
established that the belief items satisfy metric invariance, and additional analyses
showed that metric invariance also held for the indicators of attitude. Table 11.7
reports the model fit indices for a partial mediation model in which the five belief
factors influence USE (more specifically, the probit of the probability of use of SST)
both directly and indirectly via attitude (model A) and a model with full mediation
in which there are no direct effects of the five belief factors on USE (model B).
Model B shows significantly worse fit than model A. Closer inspection of the results
reveals a significant modification index for the direct effect of PEU on USE in the
female group. In model C, we therefore release the direct effect of PEU on USE,
and the resulting model does not show a deterioration in fit relative to model A.
We can conclude that there are no direct effects of four of the belief factors (PU,
REL, FUN, and NEW) on USE, but PEU has a direct effect for women. Figure 11.3
presents the unstandardized path coefficients estimated for model C. Note that the
regressions of USE on ATT and on PEU are probit regressions, which means that
the path coefficients are interpreted as the increase in the probit index (z-score) of
the probability of USE of SST for a unit increase in attitude or PEU (as measured
by a five point scale). Although the path coefficients are not identical for men and
women, none of the coefficients were significantly different across groups (the chi-
square difference test comparing a model with freely estimated coefficients and a
model with invariant coefficients was Δχ²(7) = 6.77, p = 0.45). PU, PEU, and
FUN have significant effects on ATT for both males and females and the effect
of REL is marginal for women; the effect of NEW is non-significant for both
men and women. PEU also has a direct effect on USE for women. In a bootstrap
analysis based on 1000 bootstrap samples the effect of FUN on ATT for men is only
marginal. The indirect effects of PU, PEU, and FUN are significant for both men
and women, and the indirect effect of REL is marginal for women, based on a Sobel
test. However, the indirect effects of PEU and FUN are fragile for men based on
a bootstrap analysis with 1000 bootstrap samples (i.e., PEU is not significant and
FUN is marginal), and the indirect effect of REL for women is nonsignificant. Note
that the indirect effects are “naïve” indirect effects, not causally defined indirect
effects (see Muthén and Asparouhov 2015). Figure 11.3 also reports the R²'s for the
various endogenous constructs, which range from 0.55 to 0.71.
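The probit interpretation described above can be made concrete by converting the probit index to a probability with the standard normal CDF. A hedged sketch (the helper names and the baseline and coefficient values are hypothetical, not the model's estimates):

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_use(probit_index):
    """Convert a probit index (z-score) into P(USE = 1)."""
    return norm_cdf(probit_index)

# Hypothetical values: with a baseline probit index of -1.0, a path
# coefficient of 0.5 means a one-unit increase in attitude moves the
# index to -0.5, raising P(USE) from about 0.16 to about 0.31.
baseline = prob_use(-1.0)
after_unit_increase = prob_use(-1.0 + 0.5)
```

This also shows why probit coefficients are not constant probability changes: the same 0.5 shift in the index produces a larger probability change near the middle of the curve than in its tails.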
In summary, the findings show that perceptions of PU, PEU, and FUN determine
consumers’ attitude toward self-scanning technology, and that attitude influences
actual use of self-scanning. PEU also has a direct effect on USE for women, but
overall the structural model is largely invariant across genders.
References
Anderson, J.C., Gerbing, D.W.: Structural equation modeling in practice: a review and recom-
mended two-step approach. Psychol. Bull. 103, 411–423 (1988)
Baumgartner, H., Homburg, C.: Applications of structural equation modeling in marketing and
consumer research: a review. Int. J. Res. Mark. 13, 139–161 (1996)
Baumgartner, H., Weijters, B.: Measurement models for marketing constructs. In: Wierenga,
B., van der Lans, R. (eds.) Handbook of Marketing Decision Models, Springer, New York,
forthcoming (2017)
Bollen, K.A.: Structural Equations with Latent Variables. Wiley, New York (1989)
Browne, M.W., Cudeck, R.: Alternative ways of assessing model fit. Sociol. Methods Res. 21,
230–258 (1992)
Diamantopoulos, A., Riefler, P., Roth, K.P.: Advancing formative measurement models. J. Bus.
Res. 61, 1203–1218 (2008)
Fornell, C., Larcker, D.F.: Evaluating structural equation models with unobservable variables and
measurement error. J. Mark. Res. 18, 39–50 (1981)
Homburg, C., Stierl, M., Borneman, T.: Corporate social responsibility in business-to-business
markets: how organizational customers account for supplier corporate social responsibility
engagement. J. Mark. 77(6), 54–72 (2013)
Hu, L.-T., Bentler, P.M.: Cutoff criteria for fit indexes in covariance structure analysis: conventional
criteria versus new alternatives. Struct. Equ. Model. 6(1), 1–55 (1999)
Hulland, J., Chow, Y.H., Lam, S.: Use of causal models in marketing research: A review. Int. J.
Res. Mark. 13, 181–197 (1996)
Hult, G.T., Ketchen Jr., D.J., Griffith, D.A., Finnegan, C.A., Gonzalez-Padron, T., Harmancioglu,
N., Huang, Y., Talay, M.B., Cavusgil, S.T.: Data equivalence in cross-cultural international
business research: assessment and guidelines. J. Int. Bus. Stud. 39, 1027–1044 (2008)
Jöreskog, K.G., Sörbom, D.: LISREL 8.8 for Windows [Computer Software]. Scientific Software
International, Inc., Skokie, IL (2006)
Kamata, A., Bauer, D.J.: A note on the relation between factor analytic and item response theory
models. Struct. Equ. Model. 15(1), 136–153 (2008)
Klein, A., Moosbrugger, H.: Maximum likelihood estimation of latent interaction effects with the
LMS method. Psychometrika. 65, 457–474 (2000)
Marsh, H.W.: Confirmatory factor analyses of multitrait-multimethod data: many problems and a
few solutions. Appl. Psychol. Meas. 13, 335–361 (1989)
Marsh, H.W., Morin, A.J., Parker, P.D., Kaur, G.: Exploratory structural equation modeling: an
integration of the best features of exploratory and confirmatory factor analysis. Annu. Rev.
Clin. Psychol. 10, 85–110 (2014)
Marsh, H.W., Wen, Z., Hau, K.-T., Nagengast, B.: Structural equation models of latent interaction
and quadratic effects. In: Hancock, G.R., Mueller, R.O. (eds.) Structural Equation Modeling:
A Second Course, 2nd edn, pp. 267–308. Information Age Publishing, Charlotte, NC (2013)
Martínez-López, F.J., Gázquez-Abad, J.C., Sousa, C.M.P.: Structural equation modelling in
marketing and business research. Eur. J. Mark. 47(1/2), 115–152 (2013)
Muthén, B., Asparouhov, T.: Bayesian structural equation modeling: a more flexible representation
of substantive theory. Psychol. Methods. 17(3), 313–335 (2012)
Muthén, B., Asparouhov, T.: Causal effects in mediation modeling: an introduction with applica-
tions to latent variables. Struct. Equ. Model. 22, 12–23 (2015)
Steenkamp, J-B.E.M., Baumgartner, H.: Assessing measurement invariance in cross-national
consumer research. J. Consum. Res. 25, 78–90 (1998)
Weijters, B., Baumgartner, H., Schillewaert, N.: Reversed item bias: an integrative model. Psychol.
Methods. 18(3), 320–334 (2013)
Weijters, B., Rangarajan, D., Falk, T., Schillewaert, N.: Determinants and outcomes of customers’
use of self-service technology in a retail setting. J. Serv. Res. 10(August), 3–21 (2007)
Wu, A.D., Zumbo, B.D.: Thinking about item response theory from a logistic regression perspec-
tive. In: Sawilowsky, S.S. (ed.) Real Data Analysis, pp. 241–269. Information Age Publishing,
Charlotte, NC (2007)