
Principal axis factor analysis using SPSS

Mike Crowson, Ph.D.


October 29, 2019

Data can be downloaded from here:


https://drive.google.com/open?id=1QpsTAmMJ4d8MUj-vZuGN7QQ_3f-BMmsK
For our example, we will factor analyze (using principal axis factoring) 24 survey items where
participants rated their level of agreement using a scale ranging from 1=strongly disagree to
7=strongly agree.

The survey items (see next slide) appear in SPSS as…

(Subset of items under Data View)

(Items as they appear under Variable View)

(Survey items)
Steps to consider when carrying out exploratory factor analysis (EFA)

1. Address the question of whether it makes sense to conduct factor analysis on the
correlation matrix. Tabachnick & Fidell (2013) suggest examining the correlation
matrix of the measured variables itself. If no correlations are found in excess of .30 (in
absolute value), then there is little point in factor analyzing the matrix. Next, we can
address this question by referring to results from Bartlett’s test, as well as the Kaiser-
Meyer-Olkin Measure of Sampling Adequacy (i.e., KMO MSA). See the discussion by
Dziuban & Shirkey (1974).
a. If Bartlett’s test is significant, then this is considered an indication that it is
appropriate to factor analyze the matrix (as significance indicates that the sample
correlation matrix is significantly different from an identity matrix).
b. KMO MSA < .50 indicates that a matrix is “unacceptable” for factoring. However,
using Kaiser & Rice’s (1974) terminology, the factorability of a matrix can be
considered with the following ranges in mind: .50’s (miserable), .60’s
(mediocre), .70’s (middling), .80’s (meritorious), .90’s (marvelous).
2. Consider the individual items and their contributions to the relationships in the matrix
and potential factor solution. Field (2018) suggested checking whether a variable exhibits
any correlations > .30 with the other variables in the matrix and considering deletion of
variables that do not meet this threshold. Additionally, consider the MSA’s of the
individual items, which can be done by examining the principal diagonal of the anti-
image correlation matrix [use Kaiser & Rice’s (1974) system to identify potentially
problematic items]. The syntax sketch following this list shows one way to request
these diagnostics in SPSS.
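
A minimal SPSS syntax sketch for requesting the diagnostics in steps 1 and 2. The item names pq1 through pq24 are an assumption based on the dataset used in this presentation; adjust them to your own variable names. The KMO keyword prints both the KMO MSA and Bartlett’s test, and AIC prints the anti-image matrices.

* Sketch: request the correlation matrix, KMO MSA and Bartlett's test,
  and the anti-image matrices for the 24 items (assumed names pq1-pq24).
FACTOR
  /VARIABLES pq1 TO pq24
  /MISSING LISTWISE
  /PRINT CORRELATION KMO AIC
  /EXTRACTION PAF.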
Steps to consider when carrying out exploratory factor analysis (EFA)

3. Address the question of whether multicollinearity may be a problem. One way of doing
this is to examine the determinant of the correlation matrix. Values close to 0 may signal
a problem (Tabachnick & Fidell, 2013). Field (2018) suggests a threshold of .00001 as a
means of identifying collinearity. Additional approaches include (a) examining the
correlation matrix for correlations in the .80’s or .90’s (Field, 2018) and/or (b) examining
the R-squares after regressing each variable onto the remainder [in the latter, high R-
squares signal that the variation in one variable is a linear function of the other
variables included in the analysis; see Tabachnick & Fidell, 2013]. A syntax sketch for
requesting the determinant (and a scree plot, relevant to step 4) follows this list.
4. Once you have determined that it is appropriate to factor analyze the correlation
matrix, then identify the number of factors that account for the correlations among the
variables. Potential considerations/strategies: (a) eigenvalue cutoff rule (not generally
recommended with PAF); (b) scree test; (c) parallel analysis; (d) retain as many factors as
account for a certain percentage of the variation; (e) factor meaningfulness (this last is
considered by examining factor loadings; see next). See the discussion by Pituch & Stevens
(2016).
5. Name and describe factors. Most often, this involves first performing some type of
rotation and then interpreting the factor loadings. Factors are named by considering
those measured variables loading at some minimum threshold (e.g., .30, .32, or .40) on
them. Pituch and Stevens (2016) suggest a threshold of .40.
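
A sketch covering steps 3 and 4, again under the assumption that the items are named pq1 through pq24: DET prints the determinant of the correlation matrix, and PLOT EIGEN produces the scree plot of eigenvalues.

* Sketch: request the determinant and a scree plot.
FACTOR
  /VARIABLES pq1 TO pq24
  /PRINT INITIAL DET
  /PLOT EIGEN
  /EXTRACTION PAF.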
Subset of the table of descriptive statistics

Total effective N = 253


Here is a subset of the correlations among our variables. There are many correlations
above .30 (in absolute value) in this matrix, which can be considered an indication that it is
appropriate to conduct factor analysis on it.

Additionally, inspection of the full matrix (again, not shown here) reveals no evidence of
multicollinearity as none of the bivariate r’s > .80. [Again, an additional strategy may be to
regress each variable onto the remainder and examine the R-square values. But we will bypass
that approach in this presentation.]
Once again, the determinant can be used
as one means for assessing
multicollinearity. Field (2018) suggests that
the determinant be > .00001. In this
example, that threshold is met.

As we see here, Bartlett’s test is statistically significant (p<.001) and the KMO MSA is .870
(“meritorious”). Both of these results indicate that it is appropriate to conduct factor analysis
on the correlation matrix.
This is a subset of the anti-image correlation matrix. Values in the principal diagonal (with ‘a’
superscripts) are MSA’s for each item. Most of the items had MSA’s above .80 (a couple in
the .70’s). Using Kaiser & Rice’s (1974) criteria, these results provide good indication that the
items are appropriate for inclusion in the factor analysis.
Here, we see the scree plot with the eigenvalues computed from our data. These are actually
the eigenvalues from an initial principal components analysis (PCA). Visually, it appears that
there may be 3-4 factors in the data.

We can also conduct a parallel analysis by comparing these eigenvalues against randomly
generated eigenvalues. An easy way to obtain the randomly generated eigenvalues is the
parallel analysis engine at: https://analytics.gonzaga.edu/parallelengine/
The table on the left contains random eigenvalues from a PCA, whereas the table on the right
contains the eigenvalues from the data. We see that the first three eigenvalues based on the
original data are greater than the random eigenvalues. This suggests a three-factor solution.

Now, let’s force a three-factor solution and request Varimax rotation…
We will use a loading criterion of .40 (in absolute
value), as recommended by Pituch & Stevens
(2016). To make the table of factor loadings
easier to read, we will click on “Suppress small
coefficients” and type in “.40”.
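
The equivalent FACTOR syntax might look like the following (item names pq1 through pq24 assumed, as before). CRITERIA FACTORS(3) forces the three-factor solution, and FORMAT BLANK(.40) suppresses loadings below .40 in absolute value, mirroring the point-and-click steps above.

FACTOR
  /VARIABLES pq1 TO pq24
  /PRINT INITIAL EXTRACTION ROTATION
  /FORMAT BLANK(.40)
  /CRITERIA FACTORS(3) ITERATE(25)
  /EXTRACTION PAF
  /ROTATION VARIMAX.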
Prior to rotation, Factors 1-3 accounted for 23.60%, 16.73%, and 5.30% of the variation,
respectively. Following rotation, the factors accounted for 23.30%, 12.52%, and 9.82%
of the variance, respectively.
The items pq1 – pq12 all loaded unambiguously onto
factor 1. These items all reflected “Political interest
and involvement”. The positively-worded items all
have positive loadings (representing affirmation of
interest and involvement), whereas the negatively-
worded items have negative loadings (representing
sentiments opposing interest and involvement).
Items 13, 15, 17, 20, 22, 23, and 24 loaded onto factor
2 and appear to represent “Dogmatic certainty”.

Items 16, 18, 19, and 21 might represent something
along the lines of “Belief flexibility”.
Subset of communality estimates: these are interpreted as the proportion of
variation in each item accounted for by the retained factors.
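
As a point of reference (a standard result, not something shown in the SPSS output itself): for an orthogonal solution such as Varimax, an item’s communality is simply the sum of its squared loadings across the retained factors,

h_i^2 = \sum_{j=1}^{m} \lambda_{ij}^2

where \lambda_{ij} is the loading of item i on factor j and m is the number of retained factors.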
Now, let’s re-analyze the data, again forcing our three-factor solution but using Promax
rotation…
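
The corresponding syntax sketch differs from the Varimax run only in the rotation line; PROMAX(4) uses SPSS’s default kappa of 4.

FACTOR
  /VARIABLES pq1 TO pq24
  /PRINT INITIAL EXTRACTION ROTATION
  /FORMAT BLANK(.40)
  /CRITERIA FACTORS(3) ITERATE(25)
  /EXTRACTION PAF
  /ROTATION PROMAX(4).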
Here, we have the pattern matrix. Its coefficients are analogous to beta coefficients in the
context of regression: they reflect the relationship between each variable and a given
factor, controlling for the correlations among factors.

Again, items pq1 – pq12 all loaded unambiguously
onto factor 1 and reflected “Political interest and
involvement”. The positively-worded items all have
positive loadings (representing affirmation of interest
and involvement), whereas the negatively-worded
items have negative loadings (representing
sentiments opposing interest and involvement).

Items 13, 15, 17, 20, 22, and 23 loaded onto factor 2
and reflected “Dogmatic certainty”.

Items 16, 18, 19, and 21 again appear to reflect “Belief
flexibility”.
Here, we have a correlation matrix reflecting the factor correlations based on the model.
Given our naming of the factors, we can say that “Political interest and involvement” (f1)
was somewhat positively related to “Dogmatic certainty” (f2) and “Belief flexibility” (f3).
On the other hand, “Dogmatic certainty” was moderately and negatively correlated with
“Belief flexibility” (r = -.352).
One last piece of output: the structure matrix

This is just a subset of the structure matrix. This matrix contains the correlations
between each measured variable and each factor. Pituch & Stevens (2016) do not
recommend focusing on this matrix when interpreting factors; rather, they recommend
using the pattern matrix.
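
For context (a standard result for oblique rotations such as Promax, not specific to this output): the structure matrix S is the product of the pattern matrix P and the factor correlation matrix \Phi,

S = P \Phi

which is why the pattern and structure matrices coincide when the factors are uncorrelated, as under Varimax.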
References

Dziuban, C.D., & Shirkey, E.C. (1974). When is a correlation matrix appropriate for factor
analysis? Some decision rules. Psychological Bulletin, 81, 358-361.

Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). Los Angeles, CA: Sage.

Kaiser, H.F., & Rice, J. (1974). Little Jiffy, Mark IV. Educational and Psychological
Measurement, 34, 111-117.

Matsunaga, M. (2010). How to factor-analyze your data right: Do’s, don’ts, and how-to’s.
International Journal of Psychological Research, 3, 97-110.

Pituch, K.A., & Stevens, J.P. (2016). Applied multivariate statistics for the social sciences (6th
ed.). Thousand Oaks, CA: Sage.

Tabachnick, B.G., & Fidell, L.S. (2013). Using multivariate statistics (6th ed.). Upper Saddle
River, NJ: Pearson.
