
Factor Analysis with SPSS

Karl L. Wuensch
Dept. of Psychology
East Carolina University
What is a Common Factor?
It is an abstraction, a hypothetical
construct that relates to at least two of our
measurement variables.
We want to estimate the common factors
that contribute to the variance in our
variables.
Is this an act of discovery or an act of
invention?
What is a Unique Factor?
It is a factor that contributes to the
variance in only one variable.
There is one unique factor for each
variable.
The unique factors are unrelated to one
another and unrelated to the common
factors.
We want to exclude these unique factors
from our solution.
Iterated Principal Factors Analysis
The most common type of FA.
Also known as principal axis FA.
We eliminate the unique variance by
replacing, on the main diagonal of the
correlation matrix, 1s with estimates of
communalities.
Initial estimate of communality = R²
between one variable and all others.
Let's Do It
Using the beer data, change the extraction
method to principal axis.
Look at the Initial Communalities
They were all 1s for our PCA.
They sum to 5.675.
We have eliminated 7 - 5.675 = 1.325
units of unique variance.
Communalities

Initial Extraction
COST .738 .745
SIZE .912 .914
ALCOHOL .866 .866
REPUTAT .499 .385
COLOR .922 .892
AROMA .857 .896
TASTE .881 .902
Extraction Method: Principal Axis Factoring.
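The sums quoted on these slides can be checked with a few lines of Python. This is only an arithmetic sketch: the values are copied from the Communalities table, and the variable names are mine.

```python
# Communalities copied from the table above: (initial, extraction).
communalities = {
    "COST":    (0.738, 0.745),
    "SIZE":    (0.912, 0.914),
    "ALCOHOL": (0.866, 0.866),
    "REPUTAT": (0.499, 0.385),
    "COLOR":   (0.922, 0.892),
    "AROMA":   (0.857, 0.896),
    "TASTE":   (0.881, 0.902),
}

# Each standardized variable contributes one unit of variance.
n_variables = len(communalities)

initial_sum = round(sum(init for init, _ in communalities.values()), 3)
extraction_sum = round(sum(ext for _, ext in communalities.values()), 3)
unique_initial = round(n_variables - initial_sum, 3)

print(initial_sum, unique_initial, extraction_sum)
```

The first two numbers are the 5.675 and 1.325 from the slide; the third is the sum of the final (extraction) communalities.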
Iterate!
Using the estimated communalities, obtain
a solution.
Take the communalities from the first
solution and insert them into the main
diagonal of the correlation matrix.
Solve again.
Take communalities from this second
solution and insert into correlation matrix.
Solve again.
Repeat this, over and over, until the
changes in communalities from one
iteration to the next are trivial.
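The loop described above can be sketched in plain Python. This is an illustration, not what SPSS does internally: it extracts only one factor, uses power iteration to find the leading eigenvector, and runs on a small hypothetical 3-variable correlation matrix (built from loadings .8, .7, .6), not on the beer data. All function names are mine.

```python
import math

def smc_three(r, i):
    """R^2 of variable i regressed on the other two (3-variable case only)."""
    j, k = [x for x in range(3) if x != i]
    numerator = r[i][j] ** 2 + r[i][k] ** 2 - 2 * r[i][j] * r[i][k] * r[j][k]
    return numerator / (1 - r[j][k] ** 2)

def dominant_eigenpair(m, steps=200):
    """Leading eigenvalue/eigenvector of a symmetric matrix by power iteration.

    Assumes the eigenvalue largest in magnitude is positive.
    """
    n = len(m)
    v = [1 / math.sqrt(n)] * n
    for _ in range(steps):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * mv[i] for i in range(n))  # Rayleigh quotient
    return eigenvalue, v

def iterated_paf_one_factor(r, tol=1e-8, max_iter=1000):
    """Iterate communalities until they stop changing (one factor only)."""
    n = len(r)
    h = [smc_three(r, i) for i in range(n)]  # initial estimates: R^2
    for iteration in range(1, max_iter + 1):
        reduced = [row[:] for row in r]
        for i in range(n):
            reduced[i][i] = h[i]  # communalities replace the 1s
        eigenvalue, v = dominant_eigenpair(reduced)
        loadings = [math.sqrt(eigenvalue) * x for x in v]
        new_h = [a * a for a in loadings]  # communality = squared loading
        change = max(abs(a - b) for a, b in zip(new_h, h))
        h = new_h
        if change < tol:
            break
    return loadings, h, iteration

# Hypothetical correlations implied by one factor with loadings .8, .7, .6.
R = [[1.00, 0.56, 0.48],
     [0.56, 1.00, 0.42],
     [0.48, 0.42, 1.00]]
loadings, final_h, n_iter = iterated_paf_one_factor(R)
print([round(a, 3) for a in loadings], [round(c, 3) for c in final_h], n_iter)
```

With more than one factor, the same loop would keep the first m eigenvectors and sum the squared loadings across factors for each variable.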
Our final communalities sum to 5.6.
After excluding 1.4 units of unique
variance, we have extracted 5.6 units of
common variance.
That is 5.6 / 7 = 80% of the total variance
in our seven variables.
We have packaged those 5.6 units of
common variance into two factors:

Total Variance Explained

         Extraction Sums of Squared Loadings      Rotation Sums of Squared Loadings
Factor   Total   % of Variance   Cumulative %     Total   % of Variance   Cumulative %
1        3.123   44.620          44.620           2.879   41.131          41.131
2        2.478   35.396          80.016           2.722   38.885          80.016
Extraction Method: Principal Axis Factoring.
Our Rotated Factor Loadings
Not much different from those for the PCA.
Rotated Factor Matrix (a)

           Factor 1     Factor 2
TASTE      .950         -2.17E-02
AROMA      .946         2.106E-02
COLOR      .942         6.771E-02
SIZE       7.337E-02    .953
ALCOHOL    2.974E-02    .930
COST       -4.64E-02    .862
REPUTAT    -.431        -.447
Extraction Method: Principal Axis Factoring.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.
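With an orthogonal rotation, the rotated loadings tie back to the earlier tables: squared loadings summed across a row give that variable's communality, and summed down a column give that factor's rotation sum of squared loadings. A short sketch, with loadings copied from the rotated factor matrix (the names are mine, and small discrepancies are rounding in the printed output):

```python
# Varimax-rotated loadings copied from the table above: (Factor 1, Factor 2).
rotated_loadings = {
    "TASTE":   (0.950, -0.0217),
    "AROMA":   (0.946, 0.02106),
    "COLOR":   (0.942, 0.06771),
    "SIZE":    (0.07337, 0.953),
    "ALCOHOL": (0.02974, 0.930),
    "COST":    (-0.0464, 0.862),
    "REPUTAT": (-0.431, -0.447),
}

# Row sums of squared loadings: communality of each variable.
communality = {name: sum(a * a for a in row)
               for name, row in rotated_loadings.items()}

# Column sums of squared loadings: variance packaged into each factor.
ssl = [sum(row[k] ** 2 for row in rotated_loadings.values()) for k in range(2)]

for name, h2 in communality.items():
    print(f"{name:8s} {h2:.3f}")
print("Rotation SSL:", [round(s, 3) for s in ssl])
```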
Reproduced and Residual
Correlation Matrices
Correlations between variables result from
their sharing common underlying factors.
Try to reproduce the original correlation
matrix from the correlations between
factors and variables (the loadings).
The difference between the reproduced
correlation matrix and the original
correlation matrix is the residual matrix.
We want these residuals to be small.
Check "Reproduced" under Descriptives
in the Factor Analysis dialog box to get
both of these matrices:
Reproduced Correlations

                                   COST        SIZE        ALCOHOL     REPUTAT     COLOR       AROMA       TASTE
Reproduced Correlation   COST      .745(b)     .818        .800        -.365       1.467E-02   -2.57E-02   -6.28E-02
                         SIZE      .818        .914(b)     .889        -.458       .134        8.950E-02   4.899E-02
                         ALCOHOL   .800        .889        .866(b)     -.428       9.100E-02   4.773E-02   8.064E-03
                         REPUTAT   -.365       -.458       -.428       .385(b)     -.436       -.417       -.399
                         COLOR     1.467E-02   .134        9.100E-02   -.436       .892(b)     .893        .893
                         AROMA     -2.57E-02   8.950E-02   4.773E-02   -.417       .893        .896(b)     .898
                         TASTE     -6.28E-02   4.899E-02   8.064E-03   -.399       .893        .898        .902(b)
Residual (a)             COST                  1.350E-02   -3.295E-02  -4.02E-02   3.328E-03   -2.05E-02   -1.16E-03
                         SIZE      1.350E-02               1.495E-02   6.527E-02   4.528E-02   8.097E-03   -2.32E-02
                         ALCOHOL   -3.29E-02   1.495E-02               -3.47E-02   -1.88E-02   -3.54E-03   3.726E-03
                         REPUTAT   -4.02E-02   6.527E-02   -3.471E-02              6.415E-02   -2.59E-02   -4.38E-02
                         COLOR     3.328E-03   4.528E-02   -1.884E-02  6.415E-02               1.557E-02   1.003E-02
                         AROMA     -2.05E-02   8.097E-03   -3.545E-03  -2.59E-02   1.557E-02               -2.81E-02
                         TASTE     -1.16E-03   -2.32E-02   3.726E-03   -4.38E-02   1.003E-02   -2.81E-02
Extraction Method: Principal Axis Factoring.
a. Residuals are computed between observed and reproduced correlations. There are 2 (9.0%) nonredundant residuals with
absolute values greater than 0.05.
b. Reproduced communalities
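For orthogonal factors, the reproduced correlation between two variables is the sum, across factors, of the products of their loadings. The sketch below recomputes a few entries of the reproduced matrix from the Varimax-rotated loadings shown earlier. The function name is mine, and agreement is only to about three decimals because the printed loadings are rounded.

```python
# Varimax-rotated loadings from the rotated factor matrix: (Factor 1, Factor 2).
loadings = {
    "TASTE":   (0.950, -0.0217),
    "AROMA":   (0.946, 0.02106),
    "COLOR":   (0.942, 0.06771),
    "SIZE":    (0.07337, 0.953),
    "ALCOHOL": (0.02974, 0.930),
    "COST":    (-0.0464, 0.862),
    "REPUTAT": (-0.431, -0.447),
}

def reproduced_r(var_a, var_b):
    """Reproduced correlation: sum over factors of loading_a * loading_b."""
    return sum(a * b for a, b in zip(loadings[var_a], loadings[var_b]))

for pair in [("COST", "SIZE"), ("TASTE", "AROMA"), ("REPUTAT", "COLOR")]:
    print(pair, round(reproduced_r(*pair), 3))

# Applying the function to a variable and itself gives its reproduced communality.
print("COST communality:", round(reproduced_r("COST", "COST"), 3))
```

Subtracting a reproduced value from the corresponding observed correlation gives the residual for that pair.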
Nonorthogonal (Oblique) Rotation
The axes will not be perpendicular, the
factors will be correlated with one another.
The factor loadings (in the pattern matrix)
will no longer be equal to the correlation
between each factor and each variable.
They will still equal the beta weights, the
A's in
$X_j = A_{1j}F_1 + A_{2j}F_2 + \cdots + A_{mj}F_m + U_j$
Promax rotation is available in SAS.
First a Varimax rotation is performed.
Then the axes are rotated obliquely.
Here are the beta weights, in the Pattern
Matrix, the correlations in the Structure
Matrix, and the correlations between
factors:
Pattern Matrix (a) (beta weights)

           Factor 1     Factor 2
TASTE      .955         -7.14E-02
AROMA      .949         -2.83E-02
COLOR      .943         1.877E-02
SIZE       2.200E-02    .953
ALCOHOL    -2.05E-02    .932
COST       -9.33E-02    .868
REPUTAT    -.408        -.426
Extraction Method: Principal Axis Factoring.
Rotation Method: Promax with Kaiser Normalization.
a. Rotation converged in 3 iterations.

Structure Matrix (correlations)

           Factor 1     Factor 2
TASTE      .947         .030
AROMA      .946         .072
COLOR      .945         .118
SIZE       .123         .956
ALCOHOL    .078         .930
COST       -.002        .858
REPUTAT    -.453        -.469
Extraction Method: Principal Axis Factoring.
Rotation Method: Promax with Kaiser Normalization.

Factor Correlation Matrix

Factor 1 2
1 1.000 .106
2 .106 1.000
Extraction Method: Principal Axis Factoring.
Rotation Method: Promax with Kaiser Normalization.
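The three Promax tables are linked: the structure matrix (correlations) is the pattern matrix (beta weights) post-multiplied by the factor correlation matrix. A minimal sketch with the values copied from the output above. The names are mine, and agreement is to within rounding of the printed values.

```python
# Pattern matrix (beta weights) copied from the Promax output.
pattern = {
    "TASTE":   (0.955, -0.0714),
    "AROMA":   (0.949, -0.0283),
    "COLOR":   (0.943, 0.01877),
    "SIZE":    (0.0220, 0.953),
    "ALCOHOL": (-0.0205, 0.932),
    "COST":    (-0.0933, 0.868),
    "REPUTAT": (-0.408, -0.426),
}

# Factor correlation matrix from the same output.
phi = [[1.000, 0.106],
       [0.106, 1.000]]

# Structure = Pattern x Phi, computed one row at a time.
structure = {
    name: tuple(sum(row[k] * phi[k][j] for k in range(2)) for j in range(2))
    for name, row in pattern.items()
}

for name, (s1, s2) in structure.items():
    print(f"{name:8s} {s1:6.3f} {s2:6.3f}")
```

When the factors are uncorrelated, the factor correlation matrix is the identity, which is why the pattern and structure matrices coincide under an orthogonal rotation.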
Exact Factor Scores
You can compute, for each subject,
estimated factor scores.
Multiply each standardized variable score
by the corresponding standardized scoring
coefficient.
For our first subject,
Factor 1 = (-.294)(.41) + (.955)(.40) + (-.036)(.22)
+ (1.057)(-.07) + (.712)(.04) + (1.219)(.03)
+ (-1.14)(.01) = 0.23.
SPSS will not only give you the scoring
coefficients, but also compute the
estimated factor scores for you.
In the Factor Analysis window, click
Scores and select Save As Variables,
Regression, Display Factor Score
Coefficient Matrix.
Here are the scoring coefficients:
Factor Score Coefficient Matrix

Factor
1 2
COST .026 .157
SIZE -.066 .610
ALCOHOL .036 .251
REPUTAT .011 -.042
COLOR .225 -.201
AROMA .398 .026
TASTE .409 .110
Extraction Method: Principal Axis Factoring.
Rotation Method: Varimax with Kaiser Normalization.
Factor Scores Method: Regression.

Look back at the data sheet and you will
see the estimated factor scores.
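The hand calculation for the first subject can be redone with the three-decimal scoring coefficients from the table. The z scores are those quoted in the worked example. Which z score belongs to which variable is my inference from matching the rounded coefficients (.41, .40, .22, ...) to the Factor 1 column, so treat that mapping as an assumption.

```python
# Factor 1 scoring coefficients from the Factor Score Coefficient Matrix.
factor1_coefficients = {
    "TASTE": 0.409, "AROMA": 0.398, "COLOR": 0.225, "SIZE": -0.066,
    "ALCOHOL": 0.036, "COST": 0.026, "REPUTAT": 0.011,
}

# Standardized scores for the first subject, as quoted in the worked example.
# ASSUMPTION: the variable order is inferred, not stated on the slide.
first_subject_z = {
    "TASTE": -0.294, "AROMA": 0.955, "COLOR": -0.036, "SIZE": 1.057,
    "ALCOHOL": 0.712, "COST": 1.219, "REPUTAT": -1.14,
}

# Estimated factor score: sum of z score times scoring coefficient.
factor1_score = sum(first_subject_z[name] * coef
                    for name, coef in factor1_coefficients.items())
print(round(factor1_score, 2))
```

The result agrees with the 0.23 obtained by hand using two-decimal coefficients.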
R² of the Variables With Each Factor
These are treated as indicators of the internal
consistency of the solution.
.70 and above is good.
They are in the main diagonal of this matrix

Factor Score Covariance Matrix


Factor 1 2
1 .966 .003
2 .003 .953
These squared multiple correlation
coefficients are equal to the variance of
the factor scores.
Use the Factor Scores
Let us see how the factor scores are
related to the SES and Group variables.
Use multiple regression to predict SES
from the factor scores.
Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .988(a)   .976       .976                .385
a. Predictors: (Constant), FAC2_1, FAC1_1

ANOVA (b)

Model          Sum of Squares   df    Mean Square   F          Sig.
1 Regression   1320.821         2     660.410       4453.479   .000(a)
  Residual     32.179           217   .148
  Total        1353.000         219
a. Predictors: (Constant), FAC2_1, FAC1_1
b. Dependent Variable: SES

Coefficients (a)

                 Standardized
                 Coefficients                        Correlations
Model            Beta            t         Sig.     Zero-order   Part
1 (Constant)                     134.810   .000
  FAC1_1         .681            65.027    .000     .679         .681
  FAC2_1         -.718           -68.581   .000     -.716        -.718
a. Dependent Variable: SES
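The Model Summary figures follow directly from the sums of squares in the ANOVA table. A short check, using only values printed in the output (the variable names are mine):

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 1320.821, 2
ss_residual, df_residual = 32.179, 217
ss_total, df_total = 1353.000, 219

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

r_squared = ss_regression / ss_total
adjusted_r_squared = 1 - (1 - r_squared) * df_total / df_residual
f_ratio = ms_regression / ms_residual
std_error_estimate = math.sqrt(ms_residual)

print(round(r_squared, 3), round(adjusted_r_squared, 3),
      round(f_ratio, 1), round(std_error_estimate, 3))
```

The F ratio matches the printed 4453.479 only approximately, because the sums of squares in the output are themselves rounded.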
Also, use independent t to compare
groups on mean factor scores.
Group Statistics

         GROUP   N     Mean        Std. Deviation   Std. Error Mean
FAC1_1   1       121   -.4198775   .97383364        .08853033
         2       99    .5131836    .71714232        .07207552
FAC2_1   1       121   .5620465    .88340921        .08030993
         2       99    -.6869457   .55529938        .05580969

Independent Samples Test

                                       Levene's Test for
                                       Equality of Variances   t-test for Equality of Means
                                                                                                     95% Confidence Interval
                                                                                                     of the Difference
                                       F        Sig.           t        df        Sig. (2-tailed)   Lower      Upper
FAC1_1   Equal variances assumed       19.264   .000           -7.933   218       .000              -1.16487   -.701253
         Equal variances not assumed                           -8.173   215.738   .000              -1.15807   -.708049
FAC2_1   Equal variances assumed       25.883   .000           12.227   218       .000              1.047657   1.450327
         Equal variances not assumed                           12.771   205.269   .000              1.056175   1.441809
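Both versions of the t test can be recomputed from the Group Statistics table alone. The sketch below implements the pooled-variance and Welch (separate-variances) formulas in plain Python; the function names are mine.

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """t and df assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

def welch_t(m1, s1, n1, m2, s2, n2):
    """t and Welch-Satterthwaite df, not assuming equal variances."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, df

# (mean, SD, n) for group 1 followed by group 2, from Group Statistics.
fac1 = (-0.4198775, 0.97383364, 121, 0.5131836, 0.71714232, 99)
fac2 = (0.5620465, 0.88340921, 121, -0.6869457, 0.55529938, 99)

fac1_pooled, fac1_welch = pooled_t(*fac1), welch_t(*fac1)
fac2_pooled, fac2_welch = pooled_t(*fac2), welch_t(*fac2)

for label, (t, df) in [("FAC1 pooled", fac1_pooled), ("FAC1 Welch", fac1_welch),
                       ("FAC2 pooled", fac2_pooled), ("FAC2 Welch", fac2_welch)]:
    print(f"{label:12s} t = {t:8.3f}  df = {df:8.3f}")
```

Levene's test is significant for both factors, so the separate-variances rows are the ones to report.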
Unit-Weighted Factor Scores
Define subscale 1 as simple sum or mean
of scores on all items loading well (> .4) on
Factor 1.
Likewise for Factor 2, etc.
Suzie Cue's answers are
Color, Taste, Aroma, Size, Alcohol, Cost, Reputation
80, 100, 40, 30, 75, 60, 10
Aesthetic Quality = 80+100+40-10 = 210
Cheap Drunk = 30+75+60-10 = 155
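The same unit-weighted sums, written as a small Python sketch. The weights of +1 and -1 mirror the two sums above, and the dictionary layout is just one convenient way to hold them.

```python
# Suzie Cue's answers, as listed above.
answers = {"Color": 80, "Taste": 100, "Aroma": 40, "Size": 30,
           "Alcohol": 75, "Cost": 60, "Reputation": 10}

# Unit weights: +1 for items loading positively, -1 for Reputation,
# which loads negatively on both factors.
unit_weights = {
    "Aesthetic Quality": {"Color": 1, "Taste": 1, "Aroma": 1, "Reputation": -1},
    "Cheap Drunk": {"Size": 1, "Alcohol": 1, "Cost": 1, "Reputation": -1},
}

subscale_scores = {
    scale: sum(weight * answers[item] for item, weight in weights.items())
    for scale, weights in unit_weights.items()
}
print(subscale_scores)  # {'Aesthetic Quality': 210, 'Cheap Drunk': 155}
```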
It may be better to use factor scoring
coefficients (rather than loadings) to
determine unit weights.
Grice (2001) evaluated several techniques
and found the best to be assigning a unit
weight of 1 to each variable that has a
scoring coefficient at least 1/3 as large as
the largest for that factor.
Using this rule, we would not include
Reputation on either subscale and would
drop Cost from the second subscale.
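The one-third rule can be applied mechanically to the Factor Score Coefficient Matrix. The sketch below is my reading of the rule as stated on the slide: it compares absolute values of the coefficients and gives a weight of -1 to a qualifying negative coefficient. Both of those details are assumptions, not something the slide spells out.

```python
# Scoring coefficients from the Factor Score Coefficient Matrix: (Factor 1, Factor 2).
scoring_coefficients = {
    "COST":    (0.026, 0.157),
    "SIZE":    (-0.066, 0.610),
    "ALCOHOL": (0.036, 0.251),
    "REPUTAT": (0.011, -0.042),
    "COLOR":   (0.225, -0.201),
    "AROMA":   (0.398, 0.026),
    "TASTE":   (0.409, 0.110),
}

def grice_unit_weights(coefficients, factor_index):
    """Unit weights for variables whose coefficient is at least 1/3 of the largest."""
    largest = max(abs(row[factor_index]) for row in coefficients.values())
    cutoff = largest / 3
    weights = {}
    for name, row in coefficients.items():
        coefficient = row[factor_index]
        if abs(coefficient) >= cutoff:
            weights[name] = 1 if coefficient > 0 else -1  # sign handling is assumed
    return weights

factor1_weights = grice_unit_weights(scoring_coefficients, 0)
factor2_weights = grice_unit_weights(scoring_coefficients, 1)
print(factor1_weights)
print(factor2_weights)
```

On Factor 2 the cutoff is .610 / 3, about .203, so Cost (.157) is dropped and Color (-.201) falls just short of it.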
Item Analysis
and Cronbach's Alpha
Are our subscales reliable?
Test-retest reliability
Cronbach's Alpha (internal consistency)
Mean split-half reliability
With correction for attenuation
Is a conservative estimate of reliability
AQ = Color + Taste + Aroma - Reputation
Must negatively weight Reputation prior to
item analysis.
Transform, Compute,
NegRep = -1 * Reputat.
Analyze, Scale, Reliability Analysis
Statistics
Scale if item deleted.

Continue, OK
Shoot for an alpha of at least .70 for
research instruments.
Note that deletion of the Reputation item
would increase alpha to .96.
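For readers who want to see what the Reliability Analysis procedure computes, here is a sketch of the standard formula, alpha = k/(k-1) × (1 - sum of item variances / variance of the total score). The item scores below are made up for illustration and are not the beer data; the function name is mine.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of per-item score lists of equal length."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# HYPOTHETICAL scores for five respondents on four items.
color = [80, 60, 90, 40, 70]
taste = [100, 55, 85, 45, 75]
aroma = [40, 50, 95, 35, 65]
reputation = [10, 60, 20, 70, 30]

# Reputation must be negatively weighted before item analysis (NegRep).
neg_rep = [-1 * score for score in reputation]

alpha_with_reputation = cronbach_alpha([color, taste, aroma, neg_rep])
alpha_without_reputation = cronbach_alpha([color, taste, aroma])
print(round(alpha_with_reputation, 3), round(alpha_without_reputation, 3))
```

Comparing the two values mimics the "Scale if item deleted" output for a single item.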
Comparing Two Groups' Factor
Structure
Eyeball Test
Same number of well defined factors in both
groups?
Same variables load well on same factors in
both groups?
Pearson r
Just correlate the loadings for one factor in
one group with those for the corresponding
factor in the other group.
If there are many small loadings, r may be
large due to the factors being similar on small
loadings despite lack of similarity on the larger
loadings.
CC, Tucker's coefficient of congruence
Follow the instructions in the document
"Comparing Two Groups' Factor Structures:
Pearson r and the Coefficient of Congruence"
CC of .85 to .94 corresponds to similar
factors, and .95 to 1 to essentially identical
factors.
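Tucker's coefficient of congruence is the sum of the cross-products of the two sets of loadings divided by the square root of the product of their sums of squares. A sketch follows; the first group's loadings are the Varimax Factor 1 loadings from this workshop, while the second group's loadings are invented purely for illustration. The function names are mine.

```python
import math

def congruence(x, y):
    """Tucker's coefficient of congruence between two loading vectors."""
    cross_products = sum(a * b for a, b in zip(x, y))
    return cross_products / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

def interpret(cc):
    """Label a coefficient using the cutoffs quoted above."""
    if cc >= 0.95:
        return "essentially identical"
    if cc >= 0.85:
        return "similar"
    return "not similar"

# Factor 1 loadings: TASTE, AROMA, COLOR, SIZE, ALCOHOL, COST, REPUTAT.
group_a = [0.950, 0.946, 0.942, 0.07337, 0.02974, -0.0464, -0.431]
# HYPOTHETICAL loadings for the corresponding factor in a second group.
group_b = [0.90, 0.88, 0.91, 0.15, -0.05, 0.02, -0.50]

cc = congruence(group_a, group_b)
print(round(cc, 3), interpret(cc))
```

Unlike Pearson r, the coefficient does not center the loadings at their mean, so a constant shift in the loadings lowers congruence even when r stays high.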
Cross-Scoring
Obtain scoring coefficients for each group.
For each group, compute factor scores using
coefficients obtained from the analysis for that
same group (SG) and using coefficients
obtained from the analysis for the other group
(OG).
Correlate SG factor scores with OG factor
scores.
Cattell's Salient Similarity Index
Factors (one from one group, one from the
other group) are compared in terms of
similarity of loadings.
Cattell's Salient Similarity Index, s, can be
transformed to a p value testing the null that
the factors are not related to one another.
See my document "Cattell's s" for details.
Required Number of Subjects and
Variables
Rules of Thumb (not very useful)
100 or more subjects.
at least 10 times as many subjects as you
have variables.
as many subjects as you can, the more the
better.
It depends; see the references in the
handout.
Start out with at least 6 variables per
expected factor.
Each factor should have at least 3
variables that load well.
If loadings are low, need at least 10
variables per factor.
Need at least as many subjects as
variables. The more of each, the better.
When there are overlapping factors
(variables loading well on more than one
factor), need more subjects than when
structure is simple.
If communalities are low, need more
subjects.
If communalities are high (> .6), you can
get by with fewer than 100 subjects.
With moderate communalities (.5), need
100-200 subjects.
With low communalities and only 3-4 high
loadings per factor, need over 300
subjects.
With low communalities and poorly defined
factors, need over 500 subjects.
What I Have Not Covered Today
LOTS.
For a brief introduction to reliability,
validity, and scaling, see Document or
Slideshow.
For an SAS version of this workshop, see
Document or Slideshow.
Practice Exercises
Animal Rights, Ethical Ideology, and
Misanthropy
Rating Characteristics of Criminal
Defendants
