
Principal Components Analysis with SPSS

Download the Instructional Documents
• Point your browser to
• Click on Principal Components Analysis.
• Save, Desktop, Save.
• Do the same for Factor Analysis.
When to Use PCA

• You have a set of p continuous variables.
• You want to repackage their variance into
m components.
• You will usually want m to be < p, but not at the cost of losing much of the variance.
Components and Variables
• Each component is a weighted linear
combination of the variables
Ci = Wi1X1 + Wi2X2 + … + WipXp
• Each variable is a weighted linear
combination of the components.

Xj = A1jC1 + A2jC2 + … + AmjCm
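The two relations above are just matrix products. A minimal NumPy sketch (all numbers are made up purely for illustration):

```python
import numpy as np

# Two cases measured on p = 3 standardized variables (made-up data)
X = np.array([[ 0.5, -1.2,  0.7],
              [-0.3,  0.8, -1.1]])

# p x m weight matrix: each of m = 2 components is a weighted
# linear combination of the variables
W = np.array([[ 0.6,  0.2],
              [ 0.5, -0.7],
              [ 0.4,  0.6]])
C = X @ W          # component scores, one column per component

# p x m loading matrix: each variable is a weighted linear
# combination of the components
A = np.array([[ 0.8,  0.3],
              [ 0.7, -0.6],
              [ 0.6,  0.5]])
X_hat = C @ A.T    # variables reconstructed from the components

print(C.shape, X_hat.shape)   # (2, 2) (2, 3)
```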
Factors and Variables
• In Factor Analysis, we exclude from the
solution any variance that is unique, not
shared by the variables.
Xj = A1jF1 + A2jF2 + … + AmjFm + Uj

• Uj is the unique variance for Xj
Goals of PCA and FA
• Data reduction.
• Discover and summarize pattern of
intercorrelations among variables.
• Test theory about the latent variables
underlying a set of measured variables.
• Construct a test instrument.
• There are many other uses of PCA and FA.
Data Reduction
• Ossenkopp and Mazmanian (Physiology and
Behavior, 34: 935-941).
• 19 behavioral and physiological variables.
• A single criterion variable: physiological
response to four hours of cold-restraint stress.
• Extracted five factors.
• Used multiple regression to develop a model
for predicting the criterion from the five
factors.
Exploratory Factor Analysis
• Want to discover the pattern of
intercorrelations among variables.
• Wilt et al., 2005 (thesis).
• Variables are items on the SOIS at ECU.
• Found two factors, one evaluative, one on
difficulty of course.
• Compared face-to-face (FTF) students to
distance education (DE) students on factor
structure and means.
Confirmatory Factor Analysis
• Have a theory regarding the factor
structure for a set of variables.
• Want to confirm that the theory describes
the observed intercorrelations well.
• Thurstone: Intelligence consists of seven
independent factors rather than one global
factor.
Construct Test Instrument
• Write a large set of items designed to test the
constructs of interest.
• Administer the survey to a sample of persons
from the target population.
• Use FA to help select those items that will be
used to measure each of the constructs of
interest.
• Use Cronbach's alpha to check the reliability of
the resulting scales.
An Unusual Use of PCA
• Poulson, Braithwaite, Brondino, and Wuensch
(1997, Journal of Social Behavior and
Personality, 12, 743-758).
• Simulated jury trial, seemingly insane
defendant killed a man.
• Criterion variable = recommended verdict
– Guilty
– Guilty But Mentally Ill
– Not Guilty By Reason of Insanity.
• Predictor variables = jurors’ scores on 8 scales.
• Discriminant function analysis.
• Problem with multicollinearity.
• Used PCA to extract eight orthogonal
components.
• Predicted recommended verdict from
these 8 components.
• Transformed results back to the original
variables.
A Simple, Contrived Example
• Consumers rate importance of seven
characteristics of beer.
– Low cost
– Large size of bottle
– High alcohol content
– Reputation of brand
– Color
– Aroma
– Taste
• Download FACTBEER.SAV from
• Analyze, Data Reduction, Factor.
• Scoot beer variables into box.
• Click Descriptives and then check Initial
Solution, Coefficients, and KMO and
Bartlett’s Test of Sphericity. Click Continue.
• Click Extraction and then select Principal
Components, Correlation Matrix,
Unrotated Factor Solution, Scree Plot, and
Eigenvalues Over 1. Click Continue.
• Click Rotation. Select Varimax and
Rotated Solution. Click Continue.
• Click Options. Select Exclude Cases Listwise
and Sorted By Size. Click Continue.

• Click OK, and SPSS completes the Principal
Components Analysis.
Checking for Unique Variables
• Check the correlation matrix.
• If there are any variables not well
correlated with some others, might as well
delete them.
• Bartlett’s test of sphericity tests null that
the matrix is an identity matrix, but does
not help identify individual variables that
are not well correlated with others.
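That test statistic can be computed directly from the determinant of the correlation matrix. A sketch using the standard chi-square formula (the 3 × 3 correlation matrix here is made up, not the beer data):

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett's chi-square test that the p x p correlation
    matrix R is an identity matrix; n is the number of cases."""
    p = R.shape[0]
    chi2 = -((n - 1) - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df

# Toy correlation matrix with moderate correlations
R = np.array([[1.0, 0.6, 0.4],
              [0.6, 1.0, 0.5],
              [0.4, 0.5, 1.0]])
chi2, df = bartlett_sphericity(R, n=100)
print(round(chi2, 1), df)            # sizeable chi2 with df = p(p-1)/2 = 3

# An identity matrix (no correlations at all) gives chi2 = 0
chi2_id, _ = bartlett_sphericity(np.eye(3), n=100)
```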
• For each variable, check R2 between it and the
remaining variables.
• Look at partial correlations – variables with
large partial correlations share variance with
one another but not with the remaining
variables – this is problematic.
• Kaiser’s MSA will tell you, for each variable,
how much of this problem exists.
• The smaller the MSA, the greater the problem.
• An MSA of .9 is marvelous, .5 miserable.
• Use SAS to get the partial correlations and
individual MSAs.
• SPSS gives only an overall MSA, which is
of no use in identifying problematic
variables.
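Since SPSS reports only the overall KMO, the per-variable MSAs can be computed by hand from the inverse of the correlation matrix. A sketch of Kaiser's formula (the 3 × 3 matrix is made up for illustration):

```python
import numpy as np

def msa(R):
    """Kaiser's measure of sampling adequacy for each variable:
    sum of squared correlations divided by that sum plus the sum
    of squared partial (anti-image) correlations."""
    Rinv = np.linalg.inv(R)
    d = np.sqrt(np.diag(Rinv))
    P = -Rinv / np.outer(d, d)       # partial correlations
    np.fill_diagonal(P, 0.0)
    R0 = R - np.eye(R.shape[0])      # correlations, diagonal zeroed
    r2 = (R0 ** 2).sum(axis=0)
    p2 = (P ** 2).sum(axis=0)
    return r2 / (r2 + p2)

R = np.array([[1.0, 0.6, 0.4],
              [0.6, 1.0, 0.5],
              [0.4, 0.5, 1.0]])
print(np.round(msa(R), 2))           # one MSA per variable, between 0 and 1
```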
KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy
Bartlett's Test of Sphericity: Approx. Chi-Square = 1637.9, df = 21, Sig. = .000
Extracting Principal Components
• From p variables we can extract p components.
• Each of p eigenvalues represents the amount of
standardized variance that has been captured by
one component.
• The first component accounts for the largest
possible amount of variance.
• The second captures as much as possible of what
is left over, and so on.
• Each is orthogonal to the others.
• Each variable has standardized variance = 1.
• The total standardized variance in the p
variables = p.
• The sum of the m = p eigenvalues = p.
• All of the variance is extracted.
• For each component, the proportion of
variance extracted = eigenvalue / p.
• For our beer data, here are the
eigenvalues and proportions of variance
for the seven components:

Initial Eigenvalues
Component    Total    % of Variance    Cumulative %
1            3.313    47.327            47.327
2            2.616    37.369            84.696
3             .575     8.209            92.905
4             .240     3.427            96.332
5             .134     1.921            98.252
6             .09      1.221            99.473
7             .04       .527           100.000
Extraction Method: Principal Component Analysis.
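The arithmetic behind that table can be sketched with NumPy (a made-up 3 × 3 correlation matrix stands in for the 7 × 7 beer matrix):

```python
import numpy as np

# Toy correlation matrix standing in for the beer data
R = np.array([[1.0, 0.7, 0.2],
              [0.7, 1.0, 0.3],
              [0.2, 0.3, 1.0]])

eigvals = np.linalg.eigvalsh(R)[::-1]   # eigenvalues, largest first
prop = eigvals / R.shape[0]             # proportion of variance for each

print(np.round(eigvals, 3))             # the p eigenvalues sum to p
print(np.round(np.cumsum(prop), 3))     # cumulative proportion ends at 1.0
```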
How Many Components to Retain
• From p variables we can extract p
components.
• We probably want fewer than p.
• Simple rule: Keep as many as have
eigenvalues ≥ 1.
• A component with eigenvalue < 1
captured less than one variable’s worth of
variance.
• Visual Aid: Use a Scree Plot
• Scree is rubble at base of cliff.
• For our beer data:
Scree Plot
[Plot of eigenvalue against component number, 1–7]
• Only the first two components have
eigenvalues greater than 1.
• Big drop in eigenvalue between
component 2 and component 3.
• Components 3-7 are scree.
• Try a 2 component solution.
• Should also look at solution with one fewer
and with one more component.
Loadings, Unrotated and Rotated
• loading matrix = factor pattern matrix =
component matrix.
• Each loading is the Pearson r between one
variable and one component.
• Since the components are orthogonal, each
loading is also a β weight for predicting the
variable from the components.
• Here are the unrotated loadings for our 2
component solution:
Component Matrix (a)
           Component 1    Component 2
COLOR         .760          -.576
AROMA         .736          -.614
REPUTAT      -.735          -.071
TASTE         .710          -.646
COST          .550           .734
ALCOHOL       .632           .699
SIZE          .667           .675
Extraction Method: Principal Component Analysis.
a. 2 components extracted.

• All variables load well on the first component:
economy and quality vs. reputation.
• Second component is more interesting,
economy versus quality.
• Rotate these axes so that the two
dimensions pass more nearly through the
two major clusters (COST, SIZE, ALCOHOL
and COLOR, AROMA, TASTE).
• The number of degrees by which I rotate
the axes is the angle PSI. For these data,
rotating the axes -40.63 degrees has the
desired effect.
• Component 1 = Quality versus reputation.
• Component 2 = Economy (or cheap
drunk) versus reputation.
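The rotation itself is nothing more than post-multiplying the unrotated loading matrix by a 2 × 2 rotation matrix. Using the unrotated loadings from the earlier slide and psi = -40.63 degrees reproduces the rotated solution:

```python
import numpy as np

# Unrotated loadings for the 2-component beer solution
L = np.array([[ .760, -.576],    # COLOR
              [ .736, -.614],    # AROMA
              [-.735, -.071],    # REPUTAT
              [ .710, -.646],    # TASTE
              [ .550,  .734],    # COST
              [ .632,  .699],    # ALCOHOL
              [ .667,  .675]])   # SIZE

psi = np.deg2rad(-40.63)                      # rotation angle
T = np.array([[np.cos(psi), -np.sin(psi)],
              [np.sin(psi),  np.cos(psi)]])
L_rot = L @ T

# Rows now match the Rotated Component Matrix,
# e.g. TASTE -> (.960, -.028), COLOR -> (.952, .058)
print(np.round(L_rot, 3))
```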
Rotated Component Matrix (a)
           Component 1    Component 2
TASTE         .960          -.028
AROMA         .958           .01
COLOR         .952           .06
SIZE          .07            .947
ALCOHOL       .02            .942
COST         -.061           .916
REPUTAT      -.512          -.533
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.
Number of Components in the
Rotated Solution
• Try extracting one fewer component, and try
one more.
• Which produces the more sensible solution?
• Error = difference between the obtained
structure and the true structure.
• Overextraction (too many components) produces
less error than underextraction.
• If there is only one true factor and no unique
variables, can get “factor splitting.”
• In this case, the first unrotated factor ≅ the true
factor.
• But rotation splits the factor, producing an
imaginary second factor and corrupting
the first.
• Can avoid this problem by including a
garbage variable that will be removed prior
to the final solution.
Explained Variance
• Square the loadings and then sum them across
variables.
• Get, for each component, the amount of
variance explained.
• Prior to rotation, these are eigenvalues.
• Here are the SSL for our data, after rotation:
Total Variance Explained
Rotation Sums of Squared Loadings
Component    Total    % of Variance    Cumulative %
1            3.017    43.101            43.101
2            2.912    41.595            84.696
Extraction Method: Principal Component Analysis.

• After rotation the two components together
account for (3.02 + 2.91) / 7 = 85% of the
total variance.
• If the last component has a small SSL,
one should consider dropping it.
• If SSL = 1, the component has extracted
one variable’s worth of variance.
• If only one variable loads well on a
component, the component is not well
defined.
• If only two load well, it may be reliable, if
the two variables are highly correlated with
one another but not with other variables.
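The sums of squared loadings can be verified directly from the rotated loadings on the earlier slide:

```python
import numpy as np

# Rotated loadings (rows = variables, columns = components)
L_rot = np.array([[ .960, -.028],   # TASTE
                  [ .958,  .010],   # AROMA
                  [ .952,  .060],   # COLOR
                  [ .070,  .947],   # SIZE
                  [ .020,  .942],   # ALCOHOL
                  [-.061,  .916],   # COST
                  [-.512, -.533]])  # REPUTAT

ssl = (L_rot ** 2).sum(axis=0)      # sum of squared loadings per component
print(np.round(ssl, 2))             # [3.02 2.91]
print(round(ssl.sum() / 7, 2))      # proportion of total variance: 0.85
```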
Naming Components
• For each component, look at how it is
correlated with the variables.
• Try to name the construct represented by
that factor.
• If you cannot, perhaps you should try a
different solution.
• I have named our components “aesthetic
quality” and “cheap drunk.”
• For each variable, sum the squared
loadings across components.
• This gives you the R2 for predicting the
variable from the components, which is the
proportion of the variable’s variance that has
been extracted by the components.
• Here are the communalities for our beer
data. “Initial” is with all 7 components;
“Extraction” is for our 2 component solution.

Communalities
           Initial    Extraction
COST        1.000       .842
SIZE        1.000       .901
ALCOHOL     1.000       .889
REPUTAT     1.000       .546
COLOR       1.000       .910
AROMA       1.000       .918
TASTE       1.000       .922
Extraction Method: Principal Component Analysis.
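Each extraction communality is just the row-wise sum of squared rotated loadings, so the table can be reproduced from the rotated solution:

```python
import numpy as np

# Rotated loadings (rows = variables, columns = components)
L_rot = np.array([[ .960, -.028],   # TASTE
                  [ .958,  .010],   # AROMA
                  [ .952,  .060],   # COLOR
                  [ .070,  .947],   # SIZE
                  [ .020,  .942],   # ALCOHOL
                  [-.061,  .916],   # COST
                  [-.512, -.533]])  # REPUTAT

h2 = (L_rot ** 2).sum(axis=1)       # communality = R2 for each variable
print(np.round(h2, 3))              # TASTE .922, AROMA .918, ... REPUTAT .546
```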
Orthogonal Rotations
• Varimax -- minimize the complexity of the
components by making the large loadings
larger and the small loadings smaller
within each component.
• Quartimax -- makes large loadings larger
and small loadings smaller within each
variable.
• Equamax – a compromise between these two.
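One way to see what Varimax is doing: define its criterion (the variance of the squared loadings within each component, summed over components) and check that the rotated beer solution scores higher than the unrotated one. A sketch, using the loadings from the earlier slides:

```python
import numpy as np

def varimax_criterion(L):
    """Variance of the squared loadings within each component,
    summed over components; Varimax seeks the orthogonal
    rotation that maximizes this."""
    return (L ** 2).var(axis=0).sum()

# Unrotated and rotated loadings for the beer data (from the slides)
L_unrot = np.array([[ .760, -.576], [ .736, -.614], [-.735, -.071],
                    [ .710, -.646], [ .550,  .734], [ .632,  .699],
                    [ .667,  .675]])
L_rot = np.array([[ .952,  .060], [ .958,  .010], [-.512, -.533],
                  [ .960, -.028], [-.061,  .916], [ .020,  .942],
                  [ .070,  .947]])

# Rotation pushed large loadings larger and small ones smaller,
# so the criterion increased:
print(varimax_criterion(L_unrot) < varimax_criterion(L_rot))   # True
```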
Oblique Rotations
• Axes drawn between the two clusters in
the upper right quadrant would not be
perpendicular.
• May better fit the data with axes that are
not perpendicular, but at the cost of having
components that are correlated with one
another.
• More on this later.