
An Introduction to Structural Equation Modelling Page 1

Structural Equation Modeling (SEM) techniques


Building and Testing Models

1. SEM basics
2. The confirmatory approach of SEM
3. Assumptions underlying SEM
4. Superiority of SEM over Multiple Linear Regression Analysis
5. The structural equation modeling process
6. Goodness-of-fit measures

Appendix: Checklist for Building and Testing SEM Models using AMOS Graphics


Source and further reading:
This outline follows in parts the University of Texas guide Structural Equation
Modeling using AMOS, which can be downloaded freely:
http://www.utexas.edu/cc/stat/tutorials/amos/index.html


1. SEM basics

SEM is a general statistical modelling technique for establishing relationships
between variables. A key characteristic of SEM is that observed variables are
conceptualised as representing a number of latent variables (latent factors or
constructs). These latent variables cannot be measured directly but only inferred from
(the relationships between) measured variables. The vast majority of psychological
constructs are by nature latent variables not amenable to direct measurement
(e.g. aggressiveness, negative affect or flow).

SEM is an extension of Multiple Linear Regression Analysis (MLRA) and consists of a
series of MLRA equations with all equations being fitted simultaneously. Hence, SEM
is an extension of the General Linear Model (GLM) and SEM approaches encompass
diverse statistical techniques such as:
- ANOVA/MANOVA
- MLRA
- Path analysis (PA) as a subset of SEM with only observed variables
- Confirmatory Factor Analysis (CFA)
- Causal modelling with latent variables.

The fundamental principle of SEM techniques lies in testing whether or not a specified
model, which is based on theory, fits the data. Models that do not fit the data are re-
jected, whereas models that fit the data sufficiently are provisionally accepted.
Using ML (maximum likelihood) chi-square tests, SEM programmes (such as LISREL,
AMOS or SEPATH) calculate a battery of different goodness-of-fit indices (resem-
bling R Square in MLRA).
In addition, in order to test the strengths of paths in the model, structural path coeffi-
cients are calculated (being equivalent to regression coefficients in MLRA).


Since a model rarely fits well at first, modifications are frequently required to
obtain a better-fitting model. LISREL and AMOS provide modification indices, which
give the expected reduction in the overall chi-square fit statistic for each possible
path that can be added to the model. As will be shown, a reduction in the chi-square
fit statistic reflects increased overall fit of the model being tested.
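
As a rough illustration of how modification indices are read, the sketch below screens a set of candidate paths by their expected chi-square reduction. The path names and index values are hypothetical, and the 3.84 threshold (the critical chi-square value for one degree of freedom at the .05 level) is an assumed screening rule, not something the text above prescribes.

```python
# Critical chi-square value for df = 1 at alpha = .05 (assumed screening threshold).
CHI2_CRIT_DF1 = 3.84


def candidate_paths(mod_indices, threshold=CHI2_CRIT_DF1):
    """Return (path, index) pairs whose expected chi-square drop exceeds the
    threshold, sorted from the largest expected drop to the smallest."""
    flagged = [(path, mi) for path, mi in mod_indices.items() if mi > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical modification indices, as they might be copied from program output.
mod_indices = {
    "E1 <-> E2": 12.4,
    "F1 -> X5": 2.1,
    "F2 -> X3": 6.7,
}

flagged = candidate_paths(mod_indices)
for path, mi in flagged:
    print(f"{path}: expected chi-square reduction of about {mi}")
```

A path flagged this way is only a candidate: it should be added to the model only if it can also be justified on theoretical grounds.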


Nomenclature and symbols of variables:

Indicators: observed (or manifest) variables (directly measured)

Factors: latent variables, i.e. unobserved variables which cannot
be directly measured but inferred by the relationship (correlation)
of observed variables in the equation.

Error terms [E]: measurement error of observed variables

Residuals or disturbance terms [D]: error terms of factors (prediction error):
Data = Model + Residual, hence Residual = Data - Model.

In SEM (as opposed to PA) each variable in the model is conceptualised as a latent
one, measured by multiple indicators (with a recommended minimum of three indica-
tors per latent variable).
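
In equation form, the roles of the error terms E and the disturbance terms D can be sketched as follows. The loadings \lambda_i and the path coefficient \beta are notation introduced here for illustration only; F_1 stands for an exogenous factor measured by three indicators X_1 to X_3, and F_2 for an endogenous factor predicted by F_1.

```latex
\begin{aligned}
X_i &= \lambda_i F_1 + E_i, \qquad i = 1, 2, 3 && \text{(measurement model)} \\
F_2 &= \beta F_1 + D && \text{(structural model)}
\end{aligned}
```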

Latent variables (factors) encompass both:
- Independent variables = exogenous (upstream) variables
- Dependent variables = endogenous (downstream) variables.
Contrary to endogenous variables, exogenous variables have no prior causal vari-
ables.


The five steps in SEM

Model Specification
In causal modelling this comprises both measurement and structural models

Model Identification
Over-identified models are the gold standard

Model Estimation
Estimation Method [Maximum Likelihood] and assumptions

Model Testing
Assessment of the overall fit of a given model

Model Modification
Aimed at increasing the overall goodness-of-fit based on modification indices
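
The identification step can be checked by counting knowns and unknowns. The sketch below assumes the usual counting rule that p observed variables supply p(p + 1)/2 distinct variances and covariances; the function name and the example model are hypothetical.

```python
def identification_status(n_observed, n_free_params):
    """Classify a model by comparing knowns with unknowns.

    knowns   = distinct variances and covariances of the observed variables,
               i.e. p * (p + 1) / 2 for p observed variables
    unknowns = number of free parameters to be estimated
    """
    knowns = n_observed * (n_observed + 1) // 2
    df = knowns - n_free_params
    if df < 0:
        status = "under-identified"
    elif df == 0:
        status = "just-identified"
    else:
        status = "over-identified"
    return knowns, df, status


# Hypothetical two-factor model with three indicators per factor:
# 6 loadings + 6 error variances + 1 factor covariance = 13 free parameters.
knowns, df, status = identification_status(n_observed=6, n_free_params=13)
print(knowns, df, status)
```

This counting rule is a necessary condition only: a model with df > 0 can still fail to be identified for other reasons.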

[Figure: observed variables, latent factors and error/disturbance terms (E/D)]


2. The confirmatory approach of SEM

SEM is generally viewed as a confirmatory rather than an exploratory procedure, using
one of the three following approaches:
1. Strictly confirmatory approach: A model is tested using SEM goodness-of-fit
tests to determine if the pattern of variances and covariances is consistent with a
structural (path) model specified by the researcher. However, an accepted
model is only a not-disconfirmed model (as alternative, not yet examined, mod-
els may fit the data better).
2. Alternative models approach: One may test two or more models to determine
which one has the best fit. There are numerous goodness-of-fit measures.
3. Model development approach: In practice much SEM research combines con-
firmatory and exploratory purposes: A model is tested using SEM procedures,
found to be deficient and an alternative model is then tested based on the
changes suggested by the SEM modification indices.

Regardless of the approach, SEM cannot itself draw causal arrows in the models or
resolve causal ambiguities. The specified model has to be theory-based, i.e.
theoretical insight and judgement by the researcher are still of utmost importance!


Theory -> Model Construction -> Instrument Construction -> Data Collection -> Model Testing -> Results -> Interpretation


3. Assumptions underlying SEM

Although SEM builds on PA (Path Analysis), it relaxes many (but not all) of PA's
assumptions pertaining to data level, interactions and uncorrelated error.

- Multivariate normal distribution of the indicators: Each indicator (observed
variable) should be normally distributed for each value of each other indicator.
Multivariate normality is required by the Maximum Likelihood Estimation (MLE),
which is the predominant method in SEM for estimating structural (path) coeffi-
cients. However, Kline (1998) suggests that under conditions of severe non-
normality of data, SEM parameter estimates (e.g. path estimates) are still fairly
accurate but corresponding significance indices are too high.
- Multivariate normal distribution of the latent dependent variables: Each de-
pendent latent variable in the model should be normally distributed for each
value of each other latent variable. Dichotomous latent variables violate this as-
sumption.
- Linearity: SEM assumes linear relationships between indicator and latent vari-
ables. However, as with regression analysis, it is possible to add logarithmic or
other nonlinear transformations of the original variable to the model.
- Indirect measurement: Typically, all variables in the model are latent variables.

- Multiple indicators: A minimum of three indicators should be used to measure
each latent variable in the model. Multiple indicators are part of a strategy to
lower measurement error and increase data reliability.
- Not theoretically underidentified or just-identified models: A model is just
identified or saturated if there are as many parameters to be estimated as there
are elements in the covariance matrix (df = 0). Researchers seek an overidentified
model, i.e. one where the number of observed variances and covariances
(knowns) is greater than the number of parameters to be estimated (unknowns);
in this case df > 0.
- Not empirically underidentified due to high multicollinearity: A model can be
theoretically identified but still not solvable due to such empirical problems as high
multicollinearity or path estimates close to zero in non-recursive models.
- Data of interval level of measurement assumed. However, unlike traditional
PA, SEM explicitly models error, including error arising from use of ordinal data.
Exogenous variables may be dichotomous or dummy variables, but unless spe-
cial approaches are taken categorical dummy variables may not be used as en-
dogenous variables.
- Multicollinearity: Complete multicollinearity is assumed to be absent, but corre-
lations among the independents may be modelled explicitly in SEM.
- Sample size should not be small, since SEM relies on tests which are sensitive
to sample size. In the literature sample sizes commonly run to 200-400 participants
for models with 10-15 indicator variables. Stevens (1996) suggests having at least
15 cases per indicator or measured variable. Another commonly applied rule of
thumb is: N > 50 + 8k (with k = number of indicators).
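
The two sample-size rules of thumb quoted in the last item can be compared for a given number of indicators. This is a minimal sketch; the function name and the choice of 12 indicators are hypothetical.

```python
def minimum_sample_sizes(n_indicators):
    """Smallest N satisfying each of the two rules of thumb quoted above."""
    # Stevens (1996): at least 15 cases per indicator.
    per_indicator = 15 * n_indicators
    # N > 50 + 8k is a strict inequality, so the smallest integer N is 50 + 8k + 1.
    fifty_plus_8k = 50 + 8 * n_indicators + 1
    return {"15_per_indicator": per_indicator, "n_gt_50_plus_8k": fifty_plus_8k}


# Hypothetical model with 12 indicators.
rules = minimum_sample_sizes(12)
print(rules)
```

For models of this size the 15-cases-per-indicator rule is the more demanding of the two.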



4. Superiority of SEM over Multiple Linear Regression Analysis

Despite the numerous assumptions discussed above, LISREL and AMOS can fit models
with the following characteristics (which would be subject to limitations in MLRA):
- Multiple dependent outcome variables are permitted
- Mediating variables can be included in the same single model as predictors
- No assumption that each predictor is measured without error
- Multicollinearity among the predictors does not severely hinder result interpreta-
tion.



5. The structural equation modeling process

The SEM process centres around two steps:

1. Validating the measurement model through Confirmatory Factor Analysis
(CFA)

2. Fitting the structural model through Path Analysis (PA)


A pure measurement model is a CFA model in which there is unmeasured covariance
between each possible pair of latent variables. Unmeasured covariance implies that
one always draws two-headed covariance arrows connecting all pairs of exogenous
variables (associations, no causations). The structural (path) model may be con-
trasted with the measurement model. It is the set of exogenous and endogenous vari-
ables in the model together with direct effects connecting them (straight arrows) and
error terms (disturbance terms) of these variables.

After validation of the measurement model, two or more models (one of which
may be the null model) are then compared in terms of model fit, which measures the
extent to which the covariances predicted by the model correspond to the covariances
in the data. Modification indices and other coefficients may be used to alter one
or more models to improve fit.
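
To make the comparison of predicted and observed covariances concrete, the sketch below computes the covariance matrix implied by a one-factor model and subtracts it from a sample covariance matrix. All numbers are hypothetical, and the one-factor structure with uncorrelated errors is an assumption chosen to keep the example short.

```python
def implied_covariance(loadings, factor_var, error_vars):
    """Covariance matrix implied by a one-factor model with uncorrelated errors:
    cov(Xi, Xj) = loading_i * loading_j * factor_var, plus error_var_i on the diagonal."""
    p = len(loadings)
    sigma = [[loadings[i] * loadings[j] * factor_var for j in range(p)] for i in range(p)]
    for i in range(p):
        sigma[i][i] += error_vars[i]
    return sigma


def residual_matrix(sample_cov, model_cov):
    """Element-wise difference: sample covariance minus model-implied covariance."""
    return [[s - m for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(sample_cov, model_cov)]


# Hypothetical parameter estimates and sample covariance matrix.
loadings = [1.0, 0.8, 0.6]
factor_var = 1.0
error_vars = [0.5, 0.4, 0.3]
sample_cov = [
    [1.50, 0.82, 0.58],
    [0.82, 1.04, 0.50],
    [0.58, 0.50, 0.66],
]

sigma = implied_covariance(loadings, factor_var, error_vars)
residuals = residual_matrix(sample_cov, sigma)
for row in residuals:
    print([round(r, 2) for r in row])
```

Residuals close to zero throughout indicate that the model reproduces the observed covariances well.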



6. Goodness-of-fit tests

Goodness-of-fit tests determine whether the model can be provisionally accepted or
has to be rejected. Since the null hypothesis under test is that the model fits the
data, the researcher hopes to find a small, non-significant chi-square model fit
statistic, so that the null hypothesis cannot be rejected.

If the p-value of the chi-square statistic is greater than .05, the null hypothesis
that the model fits the data cannot be rejected and the model is provisionally accepted.
A non-statistically significant chi-square value indicates that the sample covariance ma-
trix and the reproduced model-implied covariance matrix are similar.
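
For reference, under maximum likelihood estimation the chi-square statistic is usually written in terms of a discrepancy function between the two matrices. The symbols below are notation assumed here rather than taken from the text: S is the sample covariance matrix, \Sigma(\theta) the model-implied covariance matrix, p the number of observed variables and N the sample size. Some programs use a slightly different multiplier than N - 1.

```latex
F_{ML} = \ln\lvert \Sigma(\theta) \rvert - \ln\lvert S \rvert
         + \operatorname{tr}\!\left( S \, \Sigma(\theta)^{-1} \right) - p,
\qquad
\chi^2 = (N - 1)\, F_{ML}
```

F_{ML} equals zero when the two matrices are identical, which is why a small chi-square value corresponds to similar matrices.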

The fit indices outputs in AMOS and LISREL contain a plethora of model fit statistics
which are all designed to test or describe overall model fit. Commonly reported fit
statistics are the chi-square (labelled discrepancy in the AMOS output) reported with
its df and p and furthermore the Tucker-Lewis Index (TLI) and the Root Mean Square
Error of Approximation (RMSEA) and its lower and upper confidence interval
boundaries.
AMOS and LISREL provide around 20 different goodness-of-fit measures; which of them
to report is a matter of wide dispute among methodologists. Kline (1998) recommends
at least four tests, such as chi-square, GFI (Goodness-of-fit index), NFI (Normed fit
index) or CFI (Comparative fit index), NNFI (Non-normed fit index) and SRMR
(Standardised root mean square residual).

Model fit criterion                                 (Acceptable) level               Interpretation
Chi-square [Discrepancy/CMIN]                       Tabled χ²-value                  If insignificant, model can be provisionally accepted
Goodness-of-fit [GFI]                               0 [no fit] to 1 [perfect fit]    Values close to .95 reflect good fit
Root-mean-square error of approximation [RMSEA]     < .05 (.10)                      Values less than .05 (.10) represent a good model fit
Tucker-Lewis Index [TLI]                            0 [no fit] to 1 [perfect fit]    Values close to .95 reflect good fit
Parsimonious Fit Index [PFI]                        0 [no fit] to 1 [perfect fit]    Compares values in alternative models
Root mean square error of approximation [RMSEA]     P < .06                          Suggested cut-off for good model fit
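
Two of the indices in the table can be computed by hand from the chi-square values that the programs report. The sketch below uses the commonly cited formulas for the RMSEA point estimate and the TLI; the input values are hypothetical, and individual programs may differ in detail (for example, using N instead of N - 1 in the RMSEA).

```python
import math


def rmsea(chi2, df, n):
    """Point estimate of the RMSEA from the model chi-square, its df and sample size n."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def tli(chi2_model, df_model, chi2_null, df_null):
    """Tucker-Lewis Index: compares the chi-square/df ratio of the tested model
    with that of the null (independence) model."""
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)


# Hypothetical output for a model with 9 indicators and N = 301.
chi2_model, df_model = 36.0, 24
chi2_null, df_null = 600.0, 36
n = 301

rmsea_value = rmsea(chi2_model, df_model, n)
tli_value = tli(chi2_model, df_model, chi2_null, df_null)
print(f"RMSEA = {rmsea_value:.3f}, TLI = {tli_value:.3f}")
```

With these hypothetical inputs both values fall on the favourable side of the cut-offs tabulated above (RMSEA below .05, TLI close to .95).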


Appendix: Checklist for Building and Testing SEM Models using AMOS Graphics

Both Models 1 and 2 shall be built from scratch, tested and interpreted (in terms of
goodness-of-fit indices and path coefficients).
Model 2 provides an example of how a model's fit can be significantly increased on the
basis of modification indices.


I. Building the Model

1. Specification of the measurement part of the model using the drawing tools of
AMOS-Graphics

2. Reading Data into AMOS
In our case we read in SPSS Raw Data; other options: Covariance and
Correlation Matrices

3. Label observed variables

4. Name the latent variables

5. Specify the relationships among the latent variables (the structural model)

6. Create error and residual terms for any latent variables predicted by other
variables in the model


II. Running the Model

1. Selecting AMOS Analysis Options <<Save the file before running the analysis>>

2. Calculating the Estimates


III. Interpreting the AMOS Output

1. Tests of Absolute Fit

2. Tests of Relative Fit <<Save before modifying the model>>


IV. Modifying the Model to Obtain Superior Goodness of Fit









References:

Bentler, P. M. & Chou, C.-P. (1987). Practical issues in structural modeling.
Sociological Methods and Research, 16 (1): 78-117.

Byrne, B. M. (2001). Structural Equation Modeling with AMOS: Basic Concepts,
Applications, and Programming. USA: Lawrence Erlbaum Publishers.

Stevens, J. (1996). Applied multivariate statistics for the social sciences. Mahwah, NJ:
Lawrence Erlbaum Publishers.
