
Confirmatory Factor Analysis (CFA)

 Confirmatory factor analysis (CFA) is a multivariate statistical
procedure used to test how well a set of measured variables represents a
smaller number of constructs.
 Confirmatory factor analysis (CFA) and exploratory factor analysis
(EFA) are similar techniques, but EFA simply explores the data and
provides information about the number of factors required to represent
it.
 In exploratory factor analysis, all measured variables are related to every
latent variable. 
 But in confirmatory factor analysis (CFA), researchers can specify the
number of factors required in the data and which measured variable is
related to which latent variable. 
 Confirmatory factor analysis (CFA) is a tool that is used to confirm or
reject the measurement theory.

General Purpose – Procedure


 Defining individual constructs: The first step is to define the
constructs theoretically. This involves a pretest to evaluate the
construct items, followed by a confirmatory test of the measurement
model conducted using confirmatory factor analysis (CFA).

 Developing the overall measurement model theory: In confirmatory
factor analysis (CFA), we should consider the concept of
unidimensionality, distinguishing between-construct error variance
from within-construct error variance.

 Designing a study to produce the empirical results: The measurement
model must be specified. Most commonly, one loading estimate per
construct is fixed to a value of one to set the scale of the latent
variable.

 Assessing the measurement model validity: Measurement model validity
is assessed by comparing the theoretical measurement model with the
observed data to see how well the data fit.

Assumptions

The assumptions of a CFA include:
 Multivariate normality,
 A sufficient sample size (n > 200),
 The correct a priori model specification, and
 Data drawn from a random sample.

Key Terms:
 Theory: A systematic set of causal relationships that provides a
comprehensive explanation of a phenomenon.
 Model: A specified set of dependent relationships that can be used to test
the theory.
 Path analysis: Used to test structural equations.
 Path diagram: Shows the graphical representation of cause and effect
relationships of the theory.
 Endogenous variable: The outcome variables in a causal
relationship.
 Exogenous variable: The predictor variables.
 Confirmatory analysis: Used to test the pre-specified relationship.
 Goodness of fit: The degree to which the observed input matrix is
predicted by the estimated model.
 Latent variables: Variables that are not directly observed but are
inferred from other, observed variables.

Evaluating model fit

In CFA, several statistical tests are used to determine how well the model fits
the data.
Absolute fit indices
Absolute fit indices determine how well the a priori model fits, or
reproduces, the data.
Absolute fit indices include, but are not limited to, the Chi-Squared test,
RMSEA, GFI, AGFI, RMR, and SRMR.
Chi-squared test
The chi-squared test indicates the difference between the observed and
expected covariance matrices. Values closer to zero indicate a better fit,
i.e., a smaller difference between the expected and observed covariance
matrices.
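Under maximum-likelihood estimation, the model chi-squared statistic follows directly from the ML discrepancy function. The sketch below uses a hypothetical sample matrix and sample size, and simply evaluates that function rather than fitting a full CFA model:

```python
import numpy as np

def ml_chi_square(S: np.ndarray, Sigma: np.ndarray, n: int) -> float:
    """Likelihood-ratio chi-squared from the ML discrepancy function:
    F = ln|Sigma| - ln|S| + tr(S @ inv(Sigma)) - p,  chi2 = (n - 1) * F."""
    p = S.shape[0]
    _, logdet_s = np.linalg.slogdet(S)
    _, logdet_m = np.linalg.slogdet(Sigma)
    f = logdet_m - logdet_s + np.trace(S @ np.linalg.inv(Sigma)) - p
    return (n - 1) * f

# A model-implied matrix identical to the sample matrix fits perfectly,
# giving a chi-squared of (numerically) zero.
S = np.array([[1.0, 0.4],
              [0.4, 1.0]])
print(round(ml_chi_square(S, S, 200), 6))
```

Any discrepancy between the two matrices makes the statistic positive, which is why smaller values indicate better fit.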
Root mean square error of approximation
The root mean square error of approximation (RMSEA) avoids issues of sample
size by analyzing the discrepancy between the hypothesized model, with
optimally chosen parameter estimates, and the population covariance matrix.
The RMSEA ranges from 0 to 1, with smaller values indicating better model fit.
A value of .06 or less is indicative of acceptable model fit.
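The RMSEA has a simple closed form based on the model chi-squared, its degrees of freedom, and the sample size. The input values below are hypothetical:

```python
import math

def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation.

    chi2: model chi-squared statistic
    df:   model degrees of freedom
    n:    sample size
    """
    # A negative numerator (chi2 < df) is truncated to zero: perfect fit.
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Hypothetical values: chi2 = 85.3 on 40 df with n = 250
print(round(rmsea(85.3, 40, 250), 3))  # 0.067 -- within the .06-.08 range
```

Because the sample size appears in the denominator, the RMSEA penalizes a given chi-squared discrepancy less as n grows, which is how it avoids the chi-squared test's sample-size sensitivity.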
Root mean square residual and standardized root mean square residual
The root mean square residual (RMR) and standardized root mean square
residual (SRMR) are the square root of the discrepancy between the sample
covariance matrix and the model covariance matrix. The standardized root
mean square residual value ranges from 0 to 1, with a value of .08 or less being
indicative of an acceptable model.
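The SRMR can be computed directly from the sample and model-implied covariance matrices by standardizing each residual and averaging over the unique elements. The two matrices below are hypothetical:

```python
import numpy as np

def srmr(S: np.ndarray, Sigma: np.ndarray) -> float:
    """Standardized root mean square residual between the sample
    covariance matrix S and the model-implied matrix Sigma."""
    d = np.sqrt(np.diag(S))
    # Standardize residuals by the sample standard deviations.
    resid = (S - Sigma) / np.outer(d, d)
    # Average over the unique (lower-triangular, incl. diagonal) elements.
    idx = np.tril_indices_from(resid)
    return float(np.sqrt(np.mean(resid[idx] ** 2)))

# Hypothetical 3-variable example
S = np.array([[1.00, 0.45, 0.30],
              [0.45, 1.00, 0.40],
              [0.30, 0.40, 1.00]])
Sigma = np.array([[1.00, 0.50, 0.35],
                  [0.50, 1.00, 0.35],
                  [0.35, 0.35, 1.00]])
print(round(srmr(S, Sigma), 3))  # 0.035 -- below the .08 cutoff
```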
Goodness of fit index and adjusted goodness of fit index
The goodness of fit index (GFI) is a measure of fit between the hypothesized
model and the observed covariance matrix. The adjusted goodness of fit index
(AGFI) corrects the GFI, which is affected by the number of indicators of each
latent variable. The GFI and AGFI range between 0 and 1, with a value over
.90 generally indicating acceptable model fit.
Relative fit indices
Normed fit index and non-normed fit index
The normed fit index (NFI) analyzes the discrepancy between the chi-squared
value of the hypothesized model and the chi-squared value of the null model. 
Values for both the NFI and NNFI should range between 0 and 1, with a cutoff
of .95 or greater indicating a good model fit.
Comparative fit index
The comparative fit index (CFI) analyzes the model fit by examining the
discrepancy between the data and the hypothesized model, while adjusting for
the issues of sample size inherent in the chi-squared test of model fit, and the
normed fit index.
CFI values range from 0 to 1, with larger values indicating better fit. Previously,
a CFI value of .90 or larger was considered to indicate acceptable model fit.
However, recent studies have indicated that a value greater than .90 is needed to
ensure that misspecified models are not deemed acceptable (Hu & Bentler,
1999). Thus, a CFI value of .95 or higher is presently accepted as an indicator of
good fit (Hu & Bentler, 1999).
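Both relative indices compare the fitted model's chi-squared against that of the null (independence) model. The sketch below uses the standard formulas with hypothetical chi-squared values and degrees of freedom:

```python
def nfi(chi2_model: float, chi2_null: float) -> float:
    """Normed fit index: relative chi-squared improvement over the null model."""
    return (chi2_null - chi2_model) / chi2_null

def cfi(chi2_model: float, df_model: int,
        chi2_null: float, df_null: int) -> float:
    """Comparative fit index, based on noncentrality (chi2 - df),
    truncated at zero."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, 0.0)
    return 1.0 - d_model / max(d_model, d_null)

# Hypothetical values: model chi2 = 85.3 (40 df), null chi2 = 880.0 (55 df)
print(round(nfi(85.3, 880.0), 3))           # 0.903
print(round(cfi(85.3, 40, 880.0, 55), 3))   # 0.945
```

Note how the CFI's degrees-of-freedom adjustment yields a higher value than the NFI for the same data, reflecting its correction for the NFI's sample-size bias.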
Metrics
The cutoff thresholds applied above (e.g., RMSEA ≤ .06, SRMR ≤ .08, CFI ≥
.95) are from Hu and Bentler (1999).

Modification indices
Modification indices offer suggested remedies to discrepancies between the
proposed and estimated model.

Validity and Reliability


 There are a few measures that are useful for establishing validity and
reliability: Composite Reliability (CR), Average Variance Extracted (AVE),
Maximum Shared Variance (MSV), and Average Shared Variance (ASV).

Convergent Validity
The extent to which different measures of the same construct converge or
strongly correlate with one another.
 AVE > 0.5
Reliability

 CR > 0.7
Discriminant Validity
The extent to which measures of different constructs diverge or minimally
correlate with one another.
 MSV < AVE
 Square root of AVE greater than inter-construct correlations (Fornell &
Larcker, 1981)
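CR and AVE can be computed directly from a construct's standardized loadings. The loadings below, and the inter-construct correlation used in the Fornell-Larcker check, are hypothetical:

```python
import numpy as np

# Hypothetical standardized loadings for one latent construct
loadings = np.array([0.72, 0.81, 0.68, 0.75])
errors = 1.0 - loadings ** 2   # error variances for standardized items

# Composite reliability: (sum of loadings)^2 / ((sum)^2 + sum of errors)
cr = loadings.sum() ** 2 / (loadings.sum() ** 2 + errors.sum())

# Average variance extracted: mean of the squared loadings
ave = np.mean(loadings ** 2)

print(round(cr, 3), round(ave, 3))  # CR > 0.7 and AVE > 0.5 both hold here

# Fornell-Larcker check: sqrt(AVE) should exceed this construct's
# correlation with every other construct (hypothetical value 0.55 here).
print(np.sqrt(ave) > 0.55)  # True
```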

If you have convergent validity issues, then your variables do not correlate well
with each other within their parent factor; i.e., the latent factor is not well
explained by its observed variables.
If you have discriminant validity issues, then your variables correlate more
highly with variables outside their parent factor than with the variables within
their parent factor; i.e., the latent factor is better explained by some other
variables (from a different factor), than by its own observed variables.

• James Gaskin: Validity during CFA


• http://statwiki.kolobkreations.com/index.php?title=Main_Page

Common Method Bias (CMB)

Common method bias refers to a bias in your dataset due to something external
to the measures. Something external to the question may have influenced the
response given. For example, collecting data using a single (common) method,
such as an online survey, may introduce systematic response bias that will either
inflate or deflate responses. A study that has significant common method bias is
one in which a majority of the variance can be explained by a single factor. To
test for common method bias, you can perform a few different tests.
Harman’s single factor test

 Note that Harman's single factor test is no longer widely accepted and
is considered an outdated and inferior approach.
Harman's single factor test checks whether the majority of the variance can be
explained by a single factor. To do this, constrain the number of factors
extracted in your EFA to just one (rather than extracting via eigenvalues),
then examine the unrotated solution. If CMB is an issue, a single factor will
account for the majority of the variance in the model.
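As a rough numerical sketch, the variance explained by the first unrotated factor can be approximated from the eigenvalues of the item correlation matrix (a stand-in for a one-factor EFA; the correlation matrix below is hypothetical):

```python
import numpy as np

# Hypothetical correlation matrix: six items with a uniform correlation of .30
p, r = 6, 0.30
R = np.full((p, p), r)
np.fill_diagonal(R, 1.0)

# The largest eigenvalue approximates the variance captured by the first
# unrotated factor; dividing by the total variance (p) gives its share.
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
explained = eigvals[0] / eigvals.sum()
print(round(explained, 3))  # 0.417 -- under 50%, so CMB is not indicated
```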
Common Latent Factor
This method uses a common latent factor (CLF) to capture the common
variance among all observed variables in the model. To do this, simply add a
latent factor to your AMOS CFA model and connect it to all observed items in
the model. Then compare the standardized regression weights from this model
to the standardized regression weights of a model without the CLF. If there
are large differences (greater than 0.200, for example), then you will want to
retain the CLF as you either impute composites from factor scores or as you
move into the structural model.
