
Model Fit

Model fit refers to how well our proposed model (in this case, the model of the factor structure)
accounts for the correlations between variables in the dataset. If we are accounting for all the
major correlations inherent in the dataset (with regards to the variables in our model), then we
will have good fit; if not, then there is a significant discrepancy between the correlations
proposed and the correlations observed, and thus we have poor model fit: our proposed model
does not fit the observed (estimated) model, i.e., the correlations in the dataset.


There are specific measures that can be calculated to determine goodness of fit. The metrics that
ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is
inversely related to sample size and the number of variables in the model. Thus, the thresholds
below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al.
2010 on page 654.

Modification indices

Modification indices offer suggested remedies to discrepancies between the proposed and
estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix
model fit, as all regression lines between latent and observed variables are already in place.
Therefore, in a CFA, we look to the modification indices for the covariances. We cannot covary
error terms with observed or latent variables, or with other error terms that are not part of the
same factor. Thus, the only modification available to us is to covary error terms that are part of
the same factor. The figure below illustrates this rule. In general, you want to address the largest
modification indices before addressing more minor ones.


Residuals are much like modification indices; they point out where the discrepancies are between
the proposed and estimated models. However, they also indicate whether or not those
discrepancies are significant. A significant standardized residual is one with an absolute value
greater than 0.4. Significant residuals significantly decrease your model fit. Fixing model fit per
the residuals matrix is similar to fixing model fit per the modification indices. The same rules
apply. For a more specific run-down of how to calculate and locate residuals, refer to the HOW
TO CFA video tutorial.
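As a rough illustration of the idea, the sketch below flags cells of a residual matrix whose absolute value exceeds the 0.4 guideline. The function name and input matrices are hypothetical, and this is a simplification: AMOS standardizes each residual by its standard error, which a raw difference of correlations does not reproduce.

```python
def flag_large_residuals(observed, implied, threshold=0.4):
    """Flag off-diagonal cells where |observed - implied| exceeds the threshold.

    observed, implied: square correlation matrices as nested lists.
    NOTE: simplified sketch; AMOS reports residuals standardized by their
    standard errors, which this raw difference does not reproduce.
    """
    n = len(observed)
    flagged = []
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only; matrix is symmetric
            residual = observed[i][j] - implied[i][j]
            if abs(residual) > threshold:
                flagged.append((i, j, round(residual, 3)))
    return flagged
```

For example, an observed correlation of 0.9 against an implied correlation of 0.3 leaves a residual of 0.6, which would be flagged for attention.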

Validity and Reliability

It is absolutely necessary to establish convergent and discriminant validity, as well as reliability,

when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving
on to test a causal model will be useless - garbage in, garbage out! There are a few measures that
are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance
Extracted (AVE), Maximum Shared Squared Variance (MSV), and Average Shared Squared
Variance (ASV). The thresholds for these values are as follows:


Reliability
CR > 0.7

Convergent Validity
AVE > 0.5
CR > AVE

Discriminant Validity
MSV < AVE
ASV < AVE



If you have convergent validity issues, then your variables do not correlate well with each other
within their parent factor; i.e., the latent factor is not well explained by its observed variables. If
you have discriminant validity issues, then your variables correlate more highly with variables
outside their parent factor than with the variables within their parent factor; i.e., the latent factor
is better explained by some other variables (from a different factor) than by its own observed
variables.

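To make the thresholds concrete, here is a minimal sketch (hypothetical helper names) computing CR and AVE from a factor's standardized loadings, using the standard formulas CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances) and AVE = mean of squared loadings:

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)  # error variance of each standardized item
    return s ** 2 / (s ** 2 + error)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)
```

A factor whose three items all load at 0.8 gives CR of about 0.84 and AVE of 0.64, passing the CR > 0.7 and AVE > 0.5 thresholds above.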
If you need to cite these suggested thresholds, please use the following:

Hair, J., Black, W., Babin, B., and Anderson, R. (2010). Multivariate Data Analysis (7th ed.).
Upper Saddle River, NJ: Prentice-Hall.

Measurement Model Invariance

Before creating composite variables for a path analysis, configural and metric invariance should
be tested during the CFA to validate that the factor structure and loadings are sufficiently
equivalent across groups, otherwise your composite variables will not be very useful (because
they are not actually measuring the same underlying latent construct for both groups).


Configural invariance tests whether the factor structure represented in your CFA achieves
adequate fit when both groups are tested together and freely (i.e., without any cross-group path
constraints). To do this, simply build your measurement model as usual, create two groups in
AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as
usual. If the resultant model achieves good fit, then you have configural invariance. If you don't
pass the configural invariance test, then you may need to look at the modification indices to
improve your model fit or to see how to restructure your CFA.


If we pass the test of configural invariance, then we need to test for metric invariance. To test for
metric invariance, simply perform a chi-square difference test on the two groups just as you
would for a structural model. The evaluation is the same as in the structural model invariance
test: if you have a significant p-value for the chi-square difference test, then you have evidence
of differences between groups; otherwise, they are invariant and you may proceed to make your
composites from this measurement model (but make sure you use the whole dataset when you
create composites, instead of using the split dataset).

Contingency Plans

If you do not achieve invariant models, here are some appropriate approaches in the order I
would attempt them.

1. Modification indices: Fit the model for each group using the unconstrained
measurement model. You can toggle between groups when looking at modification
indices. So, for example, for males, there might be a high MI for the covariance between
e1 and e2, but for females this might not be the case. Go ahead and add those covariances
appropriately for both groups. When you add these covariances in AMOS, they are added for
both groups, even if you only needed them for one. If fitting the model this way does not
solve your invariance issues, then you will need to look at differences in regression weights.
2. Regression weights: You need to figure out which item or items are causing the
trouble (i.e., which ones do not measure the same across groups). The lack of invariance is
most likely due to one of two things: the strength of the loading for one or more items differs
significantly across groups, or an item or two load better on a factor
other than their own for one or more groups. To address the first issue, just look at the
standardized regression weights for each group to see if there are any major differences
(just eyeball it). If you find a regression weight that is exceptionally different (for
example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then
you may need to remove that item if possible. Retest and see if invariance issues are
solved. If not, try addressing the second issue (explained next).
3. Standardized Residual Covariances: To address the second issue, you need to
analyze the standardized residual covariances.

2nd Order Factors

Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it
right, it won't run. The pictures below offer a simple example of how you would model a 2nd
order factor in a measurement model and in a structural model.


Structural Equation Modeling

Structural equation modeling (SEM) grows out of and serves purposes similar to multiple
regression, but in a more powerful way which takes into account the modeling of interactions,
nonlinearities, correlated independents, measurement error, correlated error terms, multiple latent
independents each measured by multiple indicators, and one or more latent dependents also each
with multiple indicators. SEM may be used as a more powerful alternative to multiple regression,
path analysis, factor analysis, time series analysis, and analysis of covariance. That is, these
procedures may be seen as special cases of SEM, or, to put it another way, SEM is an extension
of the general linear model (GLM) of which multiple regression is a part.

SEM is an umbrella concept for analyses such as mediation and moderation. This wiki page
provides general instruction and guidance regarding how to write hypotheses for different types
of SEMs, what to do with control variables, mediation, interaction, multi-group analyses, and
model fit for structural models. Videos and slide presentations are provided in the subsections.


Hypotheses

Hypotheses are a keystone to causal theory. However, wording hypotheses is clearly a struggle
for many researchers (just select at random any article from a good academic journal, and count
the wording issues!). In this section we offer examples of how you might word different types of
hypotheses. These examples are not exhaustive, but they are safe.

Direct effects

Diet has a positive effect on weight loss.

An increase in hours spent watching television will negatively affect weight loss.

Mediated effects

For mediated effects, be sure to indicate the direction of the mediation (positive or negative), the
degree of the mediation (partial, full, or simply indirect), and the direction of the mediated
relationship (positive or negative).

Exercise positively and partially mediates the positive relationship between diet and
weight loss.
Television time positively and fully mediates the positive relationship between diet and
weight loss.
Diet affects weight loss positively and indirectly through exercise.

Interaction effects

Exercise positively moderates the positive relationship between diet and weight loss.
Exercise amplifies the positive relationship between diet and weight loss.
TV time negatively moderates (dampens) the positive relationship between diet and
weight loss.

Multi-group effects

Body Mass Index (BMI) moderates the relationship between exercise and weight loss,
such that for those with a low BMI, the effect is negative (i.e., you gain weight - muscle
mass), and for those with a high BMI, the effect is positive (i.e., exercising leads to
weight loss).
Age moderates the relationship between exercise and weight loss, such that for age < 40,
the positive effect is stronger than for age > 40.
Diet moderates the relationship between exercise and weight loss, such that for Western
diets the effect is positive and weak, and for Eastern (Asian) diets the effect is positive and
strong.

Handling controls

When including controls in hypotheses (yes, you should include them), simply add at the end of
any hypothesis, "when controlling for [list control variables here]." For example:

Exercise positively moderates the positive relationship between diet and weight loss
when controlling for TV time.
Diet has a positive effect on weight loss when controlling for TV time.

Supporting Hypotheses

Getting the wording right is only part of the battle, and is mostly useless if you cannot support
your reasoning for WHY you think the relationships proposed in the hypotheses should exist.
Simply saying X has a positive effect on Y is not sufficient to make a causal statement. You
must then go on to explain the various reasons behind your hypothesized relationship. Take diet
and weight loss, for example. The hypothesis is, "Diet has a positive effect on weight loss."

Weight is gained as we consume calories. Diet reduces the number of calories consumed.
Therefore, the more we diet, the more weight we should lose.


Controls

Controls are potentially confounding variables that we need to account for, but that don't drive
our theory. For example, in Dietz and Gortmaker 1985, their theory was that TV time had a
negative effect on school performance. But there are many things that could affect school
performance, possibly even more than the amount of time spent in front of the TV. So, in order
to account for these other potentially confounding variables, the authors control for them. They
are basically saying that regardless of IQ, time spent reading for pleasure, hours spent doing
homework, or the amount of time parents spend reading to their child, an increase in TV time
still significantly decreases school performance. These
relationships are shown in the figure below.

As a cautionary note, you should nearly always include some controls; however, these control
variables still count against your sample size calculations. So, the more controls you have, the
higher your sample size needs to be. Also, you get a higher R-square, but with increasingly
smaller gains for each added control. Sometimes you may even find that adding a control
drowns out all the effects of the IVs; in such a case you may need to run your tests without
that control variable (but then you can only say that your IVs, though significant, only account
for a small amount of the variance in the DV). With that in mind, you can't and shouldn't control
for everything, and as always, your decision to include or exclude controls should be based on
theory.

Handling controls in AMOS is easy, but messy (see the figure below). You simply treat them
like the other exogenous variables (the ones that don't have arrows going into them), and have
them regress on whichever endogenous variables they may logically affect. In this case, I have
valShort, a potentially confounding variable, as a control, with regards to valLong. And I have
LoyRepeat as a control on LoyLong. I've also covaried the controls with each other and with the
other exogenous variables. When using controls in a moderated mediation analysis, go ahead and
put the controls in at the very beginning.

When reporting the model, you do need to include the controls in all your tests and output, but
you should consolidate them at the bottom where they can be out of the way. Also, just so you
don't get any crazy ideas, you would not test for any mediation between a control and a
dependent variable. However, you may report how the control affects a dependent variable
differently based on a moderating variable. For example, valShort may have a stronger effect on
valLong for males than for females. This is something that should be reported, but not
necessarily focused on, as it is not likely a key part of your theory. Lastly, even if effects from
controls are not significant, you do not need to trim them from your model (although, there are also
other schools of thought on this issue).



Mediation

Mediation models are used to describe chains of causation. Mediation is often used to provide a
more accurate explanation of the causal effect the antecedent has on the dependent variable. The
mediator is usually that variable that is the missing link in a chain of causation. For example,

Intelligence leads to increased performance - but not in all cases, as not all intelligent people are
high performers. Thus, some other variable is needed to explain the reason for the inconsistent
relationship between IV and DV. This other variable is called a mediator. In this example, work
effectiveness may be a good mediator. We would say that work effectiveness fully and positively
mediates the relationship between intelligence and performance. Thus, the direct relationship
between intelligence and performance is better explained through the mediator of work
effectiveness. The logic is, even if you are intelligent, if you don't work smarter, then you won't
perform well. However, intelligent people tend to work smarter (but not always).


There are three main types of simple mediation: 1) partial, 2) full, and 3) indirect. Partial
mediation means that both the direct and indirect effects from the IV to DV are significant. Full
means that the direct effect drops out of significance when the mediator is added, and that the
indirect effect is significant. Indirect means that the direct effect never was significant, but that
the indirect effect is. The figure below illustrates these types of mediation. Please refer to the
step by step guide listed above for determining significance of the mediation.
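The three decision rules above can be summarized in a small sketch. The function name and boolean inputs are hypothetical stand-ins for the significance results you would read off your own output (direct effect without the mediator, direct effect with the mediator, and the indirect effect):

```python
def mediation_type(direct_sig_without_mediator, direct_sig_with_mediator, indirect_sig):
    """Classify a simple mediation from three significance flags (booleans)."""
    if not indirect_sig:
        return "no mediation"          # no significant indirect effect at all
    if not direct_sig_without_mediator:
        return "indirect-only"         # direct effect never was significant
    if direct_sig_with_mediator:
        return "partial"               # direct and indirect both significant
    return "full"                      # direct effect drops out when mediator added
```

For example, a direct effect that is significant alone but drops out of significance once the mediator is added, alongside a significant indirect effect, is classified as full mediation.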



Interactions

In factorial designs, interaction effects are the joint effects of two predictor variables in addition
to the individual main effects. This is another form of moderation (along with multi-grouping)
i.e., the X to Y relationship changes form (gets stronger, weaker, changes signs) depending on
the value of another explanatory variable (the moderator). So, for example:

you lose 1 pound of weight for every hour you exercise

you lose 1 pound of weight for every 500 calories you cut back from your regular diet
but when you exercise while dieting, you lose 2 pounds for every 500 calories you cut back
from your regular diet, in addition to the 1 pound you lose for exercising for one hour; thus in
total, you lose three pounds

So, the multiplicative effect of exercising while dieting is greater than the additive effects of
doing one or the other. Here is another simple example:

Chocolate is yummy
Cheese is yummy
but combining chocolate and cheese is yucky!
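The diet-and-exercise arithmetic above is exactly what a multiplicative interaction term captures. This toy sketch (hypothetical function, coefficients of 1 chosen to match the numbers in the example) models weight loss as two main effects plus an exercise-by-diet product term:

```python
def pounds_lost(hours_exercised, calories_cut):
    """Toy interaction model matching the example above:
    1 lb per hour of exercise, 1 lb per 500 kcal cut, plus an extra
    1 lb per (hour x 500 kcal) when both are done together."""
    diet_units = calories_cut / 500.0
    # main effects + interaction (product) term, all with coefficient 1
    return hours_exercised + diet_units + hours_exercised * diet_units
```

One hour of exercise alone or a 500-kcal cut alone each yields 1 pound, but doing both together yields 3 pounds, not 2: the extra pound is the interaction effect.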

The following figure is an example of a simple interaction model.


Interactions enable more precise explanation of causal effects by providing a method for
explaining not only how X affects Y, but also under what circumstances the effect of X changes
depending on the moderating variable of Z. Interpreting interactions is somewhat tricky.
Interactions should be plotted (as demonstrated in the HOW TO video). Once plotted, the
interpretation can be made using the following four examples (in the figures below) as a guide.
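The usual way to plot an interaction is via simple slopes: the effect of X on Y evaluated at low and high values of the moderator Z. This sketch (hypothetical function name and coefficients) computes those slopes for a standard model y = b0 + b_x*x + b_z*z + b_xz*x*z with a mean-centered moderator:

```python
def simple_slopes(b_x, b_xz, z_sd):
    """Slope of X on Y at -1 SD and +1 SD of the moderator Z,
    for y = b0 + b_x*x + b_z*z + b_xz*x*z (z mean-centered)."""
    return {"-1 SD": b_x - b_xz * z_sd, "+1 SD": b_x + b_xz * z_sd}
```

With b_x = 1.0, b_xz = 0.4, and a moderator SD of 1.0, the slope of X is 0.6 at low Z and 1.4 at high Z, i.e., the moderator amplifies the positive X-to-Y relationship.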

Model fit again

You already did model fit in your CFA, but you need to do it again in your structural model in
order to demonstrate sufficient exploration of alternative models. The method is the same: look
at modification indices, residuals, and standard fit measures like CFI, RMSEA, etc. The one thing
that should be noted here in particular, however, is the logic that determines how you apply
the modification indices to error terms.

If the correlated variables are not logically causally correlated, but merely statistically correlated,
then you may covary the error terms in order to account for the systematic statistical correlations
without implying a causal relationship.
  o e.g., burnout from customers is highly correlated with burnout from management
  o We expect these to have similar values (residuals) because they are logically similar and
    have similar structures, but they do not necessarily have any causal ties.
If the correlated variables are logically causally correlated, then simply add a regression line.
  o e.g., burnout from customers is highly correlated with satisfaction with customers
  o We expect burnC to predict satC, so not accounting for it is negligent.

Lastly, remember, you don't need to create the BEST fit, just good fit. If a BEST fit model (i.e.,
one in which all modification indices are addressed) isn't logical, or does not fit with your theory,
you may need to simply settle for a model that has worse (yet sufficient) fit, and then explain
why you did not choose the better fitting model.


Multi-group Moderation

Multi-group moderation is a special form of moderation in which a dataset is split along values
of a categorical variable (such as gender), and then a given model is tested with each set of data.
Using the gender example, the model is tested for males and females separately. The use of
multi-group moderation is to determine if relationships hypothesized in a model will differ based
on the value of the moderator (e.g., gender). Take the diet and weight loss hypothesis for
example. A multi-group moderation model would answer the question: does dieting affect
weight loss differently for males than for females? In the videos above, you will learn how to set
up a multigroup model in AMOS, and test it using chi-square differences, and using critical
ratios. Really, using critical ratios takes about one minute after the model is set up, and it
leaves no room for human error, whereas using the chi-square method can take upwards of 30
minutes and leaves a lot of room for human error. So, I recommend the easy method!
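The critical-ratio comparison reduces to a z-test on the difference between the same unstandardized path estimated in the two groups. This sketch (hypothetical function name; inputs are the estimates and standard errors you would read from the two groups' output) shows the computation:

```python
import math

def critical_ratio_for_difference(b1, se1, b2, se2):
    """z-statistic comparing the same unstandardized path across two groups:
    z = (b1 - b2) / sqrt(se1^2 + se2^2). |z| > 1.96 suggests a significant
    group difference at roughly p < .05."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
```

For example, estimates of 0.5 (SE 0.1) for males and 0.2 (SE 0.1) for females give z of about 2.12, exceeding 1.96, so the path differs significantly by gender.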

From Measurement Model to Structural Model

Many of the examples in the videos so far have taught concepts using a set of composite
variables (instead of latent factors with observed items). Many will want to utilize the full power
of SEM by building true structural models (with latent factors). This is not a difficult thing.
Simply remove the covariance arrows from your measurement model (after CFA), then draw
single-headed arrows from IVs to DVs. Make sure you put error terms on the DVs, then run it.
It's that easy. Refer to the video for a demonstration.

Creating Composites from Latent Factors

If you would like to create composite variables (as used in many of the videos) from latent
factors, it is an easy thing to do. However, you must remember two very important caveats:

You are not allowed to have any missing values in the data used. These will need to be
imputed beforehand in SPSS or Excel (I have two tools for this in my Stats Tools Package
- one for imputing, and one for simply removing the entire row that has missing data).
Latent factor names must not have any spaces or hard returns in them. They must be
single continuous strings (FactorOne or Factor_One instead of Factor One).

After those two caveats are addressed, then you can simply go to the Analyze menu, and select
Data Imputation. Select Regression Imputation, and then click on the Impute button. This will
create a new SPSS dataset with the same name as the current dataset except it will be followed
by an "_C". This can be found in the same folder as your current dataset.
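For readers working outside AMOS, the idea behind composites can be sketched as simple mean scores per factor. Note this is a simplified alternative, not the regression imputation AMOS performs (which weights items by their loadings); the function name and data shapes are hypothetical:

```python
def make_composites(rows, factor_items):
    """Mean-score composites from already-imputed item responses.

    rows: list of dicts mapping item name -> numeric score (no missing values).
    factor_items: dict mapping a factor name (a single string, no spaces,
                  e.g. "FactorOne") to the list of its item names.
    """
    composites = []
    for row in rows:
        scores = {}
        for factor, items in factor_items.items():
            values = [row[item] for item in items]
            if any(v is None for v in values):
                raise ValueError("impute missing values before creating composites")
            scores[factor] = sum(values) / len(values)  # mean of the factor's items
        composites.append(scores)
    return composites
```

A respondent scoring 1 and 3 on a factor's two items receives a composite of 2.0 for that factor; missing values raise an error, mirroring the no-missing-data caveat above.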

Page 18 of 18