Karl L. Wuensch
Dept. of Psychology
East Carolina University
What is a Common Factor?
• It is an abstraction, a hypothetical
construct that relates to at least two of our
measurement variables.
• We want to estimate the common factors
that contribute to the variance in our
variables.
• Is this an act of discovery or an act of
invention?
What is a Unique Factor?
• It is a factor that contributes to the
variance in only one variable.
• There is one unique factor for each
variable.
• The unique factors are unrelated to one
another and unrelated to the common
factors.
• We want to exclude these unique factors
from our solution.
Iterated Principal Factors Analysis
• The most common type of FA.
• Also known as principal axis FA.
• We eliminate the unique variance by
replacing, on the main diagonal of the
correlation matrix, 1’s with estimates of
communalities.
• Initial estimate of communality = the
squared multiple correlation (R²) between
one variable and all the others.
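Those squared multiple correlations can be read directly off the inverse of the correlation matrix. A minimal numpy sketch, using a hypothetical 3-variable correlation matrix (not the beer data):

```python
import numpy as np

# Hypothetical 3-variable correlation matrix (illustration only).
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

# SMC of variable i with all the others:
# 1 - 1 / (i-th diagonal element of R-inverse).
initial_h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
print(np.round(initial_h2, 4))
```

These values replace the 1's on the main diagonal of R before the first extraction.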
Let's Do It
• Using the beer data, change the extraction
method to principal axis.
Look at the Initial Communalities
• They were all 1’s for our PCA.
• They sum to 5.675.
• We have eliminated 7 – 5.675 = 1.325
units of unique variance.
Communalities
           Initial   Extraction
COST         .738       .745
SIZE         .912       .914
ALCOHOL      .866       .866
REPUTAT      .499       .385
COLOR        .922       .892
AROMA        .857       .896
TASTE        .881       .902
Extraction Method: Principal Axis Factoring.
Iterate!
• Using the estimated communalities, obtain
a solution.
• Take the communalities from the first
solution and insert them into the main
diagonal of the correlation matrix.
• Solve again.
• Take communalities from this second
solution and insert into correlation matrix.
• Solve again.
• Repeat this, over and over, until the
changes in communalities from one
iteration to the next are trivial.
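The iterate-until-trivial-change loop above can be sketched in a few lines. This is a simplified illustration with numpy, not SPSS's exact algorithm; the function name and defaults are my own:

```python
import numpy as np

def iterated_paf(R, n_factors, tol=1e-6, max_iter=500):
    """Iterated principal axis factoring (a sketch, not SPSS's exact code)."""
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))    # start from the SMCs
    Rw = R.copy()
    for _ in range(max_iter):
        np.fill_diagonal(Rw, h2)                  # communalities on the diagonal
        vals, vecs = np.linalg.eigh(Rw)           # eigenvalues, ascending order
        keep = np.argsort(vals)[::-1][:n_factors] # retain the largest
        loadings = vecs[:, keep] * np.sqrt(np.clip(vals[keep], 0.0, None))
        new_h2 = (loadings ** 2).sum(axis=1)      # updated communalities
        if np.max(np.abs(new_h2 - h2)) < tol:     # change is trivial: stop
            h2 = new_h2
            break
        h2 = new_h2
    return loadings, h2

# Small hypothetical example (not the beer data).
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
loadings, h2 = iterated_paf(R, n_factors=1)
print(np.round(h2, 3))
```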
• Our final communalities sum to 5.6.
• After excluding 1.4 units of unique
variance, we have extracted 5.6 units of
common variance.
• That is 5.6 / 7 = 80% of the total variance
in our seven variables.
• We have packaged those 5.6 units of
common variance into two factors:
Rotated Factor Matrix
           Factor 1   Factor 2
TASTE        .950      -.022
AROMA        .946       .021
COLOR        .942       .068
SIZE         .073       .953
ALCOHOL      .030       .930
COST        -.046       .862
REPUTAT     -.431      -.447
Extraction Method: Principal Axis Factoring.
Rotation Method: Varimax with Kaiser Normalization.
Rotation converged in 3 iterations.
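Rotation changes individual loadings but not the communalities: each variable's communality is still the sum of its squared loadings across the two factors, so the 5.6 figure can be checked from the rotated loadings themselves. A plain-Python check, with the loadings transcribed from the table above (E-notation expanded):

```python
# Varimax-rotated loadings: (Factor 1, Factor 2) per variable.
loadings = {
    "TASTE":   ( 0.950,  -0.0217),
    "AROMA":   ( 0.946,   0.02106),
    "COLOR":   ( 0.942,   0.06771),
    "SIZE":    ( 0.07337, 0.953),
    "ALCOHOL": ( 0.02974, 0.930),
    "COST":    (-0.0464,  0.862),
    "REPUTAT": (-0.431,  -0.447),
}

# Communality of a variable = sum of its squared loadings.
h2 = {v: a * a + b * b for v, (a, b) in loadings.items()}
total_common = sum(h2.values())
print(round(total_common, 2))          # ~5.6 units of common variance
print(round(100 * total_common / 7))   # ~80% of the total variance
```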
Reproduced and Residual
Correlation Matrices
• Correlations between variables result from
their sharing common underlying factors.
• Try to reproduce the original correlation
matrix from the correlations between
factors and variables (the loadings).
• The difference between the reproduced
correlation matrix and the original
correlation matrix is the residual matrix.
• We want these residuals to be small.
• Check “Reproduced” under “Descriptives”
in the Factor Analysis dialogue box, to get
both of these matrices:
• Reproduced Correlations
Factor Correlation Matrix
Factor      1        2
1         1.000     .106
2          .106    1.000
Extraction Method: Principal Axis Factoring.
Rotation Method: Promax with Kaiser Normalization.
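For orthogonal factors, the reproduced correlation between two variables is the sum of the products of their loadings, i.e. R-hat = ΛΛ′. A numpy sketch with a small hypothetical one-factor example (not the beer data; the loadings shown are approximate values I supply for illustration):

```python
import numpy as np

# "Observed" correlation matrix for three variables.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

# One-factor loadings that fit this matrix closely.
lam = np.array([0.866, 0.693, 0.577])

reproduced = np.outer(lam, lam)       # R-hat = lambda lambda'
residual = R - reproduced             # residual matrix
np.fill_diagonal(residual, 0.0)       # diagonal holds uniqueness; ignore it
print(np.abs(residual).max())         # small residuals indicate good fit
```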
Exact Factor Scores
• You can compute, for each subject,
estimated factor scores.
• Multiply each standardized variable score
by the corresponding standardized scoring
coefficient.
• For our first subject,
Factor 1 = (-.294)(.41) + (.955)(.40) + (-.036)(.22)
+ (1.057)(-.07) + (.712)(.04) + (1.219)(.03)
+ (-1.14)(.01) = 0.23.
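The arithmetic above is just a dot product of the subject's standardized scores with the scoring coefficients. A plain-Python check, assuming the scores and coefficients pair up in the order printed above:

```python
# Subject 1's standardized variable scores and the Factor 1 scoring
# coefficients, in the order given in the worked example (order assumed).
z = [-0.294, 0.955, -0.036, 1.057, 0.712, 1.219, -1.14]
w = [0.41, 0.40, 0.22, -0.07, 0.04, 0.03, 0.01]

factor1 = sum(zi * wi for zi, wi in zip(z, w))
print(round(factor1, 2))   # 0.23
```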
• SPSS will not only give you the scoring
coefficients, but also compute the
estimated factor scores for you.
• In the Factor Analysis window, click
Scores and select Save As Variables,
Regression, Display Factor Score
Coefficient Matrix.
• Here are the scoring coefficients:
Factor Score Coefficient Matrix
Factor
1 2
COST .026 .157
SIZE -.066 .610
ALCOHOL .036 .251
REPUTAT .011 -.042
COLOR .225 -.201
AROMA .398 .026
TASTE .409 .110
Extraction Method: Principal Axis Factoring.
Rotation Method: Varimax with Kaiser Normalization.
Factor Scores Method: Regression.
• Use the estimated factor scores in other
analyses, for example, regressing SES on
the two factors:
ANOVA
Model           Sum of Squares   df   Mean Square      F       Sig.
1  Regression       1320.821      2      660.410    4453.479   .000a
   Residual           32.179    217         .148
   Total            1353.000    219
a. Predictors: (Constant), FAC2_1, FAC1_1
b. Dependent Variable: SES
Coefficients
                Standardized
                Coefficients            Correlations
Model              Beta        t      Sig.  Zero-order   Part
1  (Constant)             134.810    .000
   FAC1_1          .681    65.027    .000      .679      .681
   FAC2_1         -.718   -68.581    .000     -.716     -.718
a. Dependent Variable: SES
• Also, use independent t to compare
groups on mean factor scores.
Group Statistics
         GROUP    N      Mean       Std. Deviation   Std. Error Mean
FAC1_1     1     121   -.4198775      .97383364         .08853033
           2      99    .5131836      .71714232         .07207552
FAC2_1     1     121    .5620465      .88340921         .08030993
           2      99   -.6869457      .55529938         .05580969
• Click Continue, then OK.
• Shoot for an alpha of at least .70 for
research instruments.
• Note that deletion of the Reputation item
would increase alpha to .96.
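Coefficient alpha for a k-item scale is α = (k/(k−1))(1 − Σs²ᵢ / s²ₜₒₜₐₗ), where the s²ᵢ are the item variances and s²ₜₒₜₐₗ is the variance of the total scores. A minimal plain-Python sketch with made-up data (the function name and the example scores are mine, not from the workshop):

```python
def cronbach_alpha(items):
    """items: one list of scores per item, all the same length (one per subject)."""
    k = len(items)
    n = len(items[0])
    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    totals = [sum(item[j] for item in items) for j in range(n)]
    return (k / (k - 1)) * (1.0 - sum(var(item) for item in items) / var(totals))

# Two made-up items scored by four subjects.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 2, 4, 4]])
print(round(alpha, 3))   # 0.941
```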
Comparing Two Groups’ Factor
Structure
• Eyeball Test
– Same number of well-defined factors in both
groups?
– Same variables load well on same factors in
both groups?
• Pearson r
– Just correlate the loadings for one factor in
one group with those for the corresponding
factor in the other group.
– If there are many small loadings, r may be
large due to the factors being similar on small
loadings despite lack of similarity on the larger
loadings.
• CC, Tucker’s coefficient of congruence
– Follow the instructions in the document
Comparing Two Groups’ Factor Structures:
Pearson r and the Coefficient of Congruence.
– A CC of .85 to .94 indicates similar
factors; .95 to 1.00 indicates essentially
identical factors.
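Both indices take only a few lines to compute. A plain-Python sketch with made-up loadings for one factor in two groups; note that Tucker's CC uses raw cross-products while Pearson r centers the loadings first, which is why r can be inflated by many near-zero loadings:

```python
import math

def pearson_r(a, b):
    """Pearson correlation between two loading vectors (centers the loadings)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a)
                    * sum((y - mb) ** 2 for y in b))
    return num / den

def congruence(a, b):
    """Tucker's coefficient of congruence (does NOT center the loadings)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den

# Made-up loadings for the corresponding factor in two groups.
g1 = [0.95, 0.94, 0.92, 0.07, -0.43]
g2 = [0.90, 0.91, 0.88, 0.10, -0.40]
print(pearson_r(g1, g2), congruence(g1, g2))
```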
• Cross-Scoring
– Obtain scoring coefficients for each group.
– For each group, compute factor scores using
coefficients obtained from the analysis for that
same group (SG) and using coefficients
obtained from the analysis for the other group
(OG).
– Correlate SG factor scores with OG factor
scores.
• Cattell’s Salient Similarity Index
– Factors (one from one group, one from the
other group) are compared in terms of
similarity of loadings.
– Cattell’s Salient Similarity Index, s, can be
transformed to a p value testing the null
hypothesis that the factors are not related to
one another.
– See my document Cattell’s s for details.
Required Number of Subjects and
Variables
• Rules of Thumb (not very useful)
– 100 or more subjects.
– at least 10 times as many subjects as you
have variables.
– as many subjects as you can, the more the
better.
• It depends – see the references in the
handout.
• Start out with at least 6 variables per
expected factor.
• Each factor should have at least 3
variables that load well.
• If loadings are low, need at least 10
variables per factor.
• Need at least as many subjects as
variables. The more of each, the better.
• When there are overlapping factors
(variables loading well on more than one
factor), need more subjects than when
structure is simple.
• If communalities are low, need more
subjects.
• If communalities are high (> .6), you can
get by with fewer than 100 subjects.
• With moderate communalities (.5), need
100-200 subjects.
• With low communalities and only 3-4 high
loadings per factor, need over 300
subjects.
• With low communalities and poorly defined
factors, need over 500 subjects.
What I Have Not Covered Today
• LOTS.
• For a brief introduction to reliability,
validity, and scaling, see Document or
Slideshow.
• For an SAS version of this workshop, see
Document or Slideshow.
Practice Exercises
• Animal Rights, Ethical Ideology, and
Misanthropy
• Rating Characteristics of Criminal
Defendants