
Factor Analysis: Confirmatory 599

Y coordinates. Both secondary facets thus generate additional cutting points and thereby more intervals on the base dimensions.

HUDAP also contains programs for doing MSA. MSA is seldom used in practice, because its solutions are rather indeterminate [2]. They can be transformed in many ways that can radically change their appearance, which makes it difficult to interpret and to replicate them. One can, however, make MSA solutions more robust by enforcing additional constraints such as linearity onto the boundary lines or planes [9]. Such constraints are not related to content and are for that reason rejected by many facet theorists who prefer data analysis that is as 'intrinsic' [7] as possible. On the other hand, if a good MSA solution is found under 'extrinsic' side constraints, it certainly also exists for the softer intrinsic model.

[Figure 4. POSAC solution for the data in Table 2: item points plotted in the plane of Dim. 1 (horizontal) by Dim. 2 (vertical), partitioned by boundary lines A, B, and C, with the joint direction indicated.]

References

[1] Amar, R. & Toledano, S. (2001). HUDAP Manual, Hebrew University of Jerusalem, Jerusalem.
[2] Borg, I. & Shye, S. (1995). Facet Theory: Form and Content, Sage, Newbury Park.
[3] Elizur, D. (1970). Adapting to Innovation: A Facet Analysis of the Case of the Computer, Jerusalem Academic Press, Jerusalem.
[4] Guttman, L. (1944). A basis for scaling qualitative data, American Sociological Review 9, 139–150.
[5] Guttman, L. (1965). A faceted definition of intelligence, Scripta Hierosolymitana 14, 166–181.
[6] Guttman, L. (1991a). Two structural laws for intelligence tests, Intelligence 15, 79–103.
[7] Guttman, L. (1991b). Louis Guttman: In Memoriam - Chapters from an Unfinished Textbook on Facet Theory, Israel Academy of Sciences and Humanities, Jerusalem.
[8] Levy, S. & Guttman, L. (1985). The partial-order of severity of thyroid cancer with the prognosis of survival, in Ins and Outs of Solving Problems, J.F. Marchotorchino, J.-M. Proth & J. Jansen, eds, Elsevier, Amsterdam, pp. 111–119.
[9] Lingoes, J.C. (1968). The multivariate analysis of qualitative data, Multivariate Behavioral Research 1, 61–94.
[10] Shye, S. (1985). Multiple Scaling, North-Holland, Amsterdam.

(See also Multidimensional Unfolding)

INGWER BORG

Factor Analysis: Confirmatory

Of primary import to factor analysis, in general, is the notion that some variables of theoretical interest cannot be observed directly; these unobserved variables are termed latent variables or factors. Although latent variables cannot be measured directly, information related to them can be obtained indirectly by noting their effects on observed variables that are believed to represent them. The oldest and best-known statistical procedure for investigating relations between sets of observed and latent variables is that of factor analysis. In using this approach to data analyses, researchers examine the covariation among a set of observed variables in order to gather information on the latent constructs (i.e., factors) that underlie them. In factor analysis models, each observed variable is hypothesized to be determined by two types of influences: (a) the latent variables (factors) and (b) unique variables (called either residual or error variables). The strength of the relation between a factor and an observed variable is usually termed the loading of the variable on the factor.

Exploratory versus Confirmatory Factor Analysis

There are two basic types of factor analyses: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). EFA is most appropriately used when the links between the observed variables and their underlying factors are unknown or uncertain. It is considered to be exploratory in the sense that the researcher has no prior knowledge that the observed variables do, indeed, measure the intended factors. Essentially, the researcher uses EFA to determine factor structure. In contrast, CFA is appropriately used when the researcher has some knowledge of the underlying latent variable structure. On the basis of theory and/or empirical research, he or she postulates relations between the observed measures and the underlying factors a priori, and then tests this hypothesized structure statistically. More specifically, the CFA approach examines the extent to which a highly constrained a priori factor structure is consistent with the sample data. In summarizing the primary distinction between the two methodologies, we can say that whereas EFA operates inductively in allowing the observed data to determine the underlying factor structure a posteriori, CFA operates deductively in postulating the factor structure a priori [5].

Of the two factor analytic approaches, CFA is by far the more rigorous procedure. Indeed, it enables the researcher to overcome many limitations associated with the EFA model; these are as follows: First, whereas the EFA model assumes that all common factors are either correlated or uncorrelated, the CFA model makes no such assumptions. Rather, the researcher specifies, a priori, only those factor correlations that are considered to be substantively meaningful. Second, with the EFA model, all observed variables are directly influenced by all common factors. With CFA, each factor influences only those observed variables with which it is purported to be linked. Third, whereas in EFA, the unique factors are assumed to be uncorrelated, in CFA, specified covariation among particular uniquenesses can be tapped. Finally, provided with a malfitting model in EFA, there is no mechanism for identifying which areas of the model are contributing most to the misfit. In CFA, on the other hand, the researcher is guided to a more appropriately specified model via indices of misfit provided by the statistical program.

Hypothesizing a CFA Model

Given the a priori knowledge of a factor structure and the testing of this factor structure based on the analysis of covariance structures, CFA belongs to a class of methodology known as structural equation modeling (SEM). The term structural equation modeling conveys two important notions: (a) that structural relations can be modeled pictorially to enable a clearer conceptualization of the theory under study, and (b) that the causal processes under study are represented by a series of structural (i.e., regression) equations. To assist the reader in conceptualizing a CFA model, I now describe the specification of a hypothesized CFA model in two ways; first, as a graphical representation of the hypothesized structure and, second, in terms of its structural equations.

Graphical Specification of the Model

CFA models are schematically portrayed as path diagrams (see Path Analysis and Path Diagrams) through the incorporation of four geometric symbols: a circle (or ellipse) representing unobserved latent factors, a square (or rectangle) representing observed variables, a single-headed arrow (->) representing the impact of one variable on another, and a double-headed arrow (<->) representing covariance between pairs of variables. In building a CFA model, researchers use these symbols within the framework of three basic configurations, each of which represents an important component in the analytic process. We turn now to the CFA model presented in Figure 1, which represents the postulated four-factor structure of nonacademic self-concept (SC) as tapped by items comprising the Self Description Questionnaire-I (SDQ-I; [15]). As defined by the SDQ-I, nonacademic SC embraces the constructs of physical and social SCs.

On the basis of the geometric configurations noted above, decomposition of this CFA model conveys the following information: (a) there are four factors, as indicated by the four ellipses labeled Physical SC (Appearance; F1), Physical SC (Ability; F2), Social SC (Peers; F3), and Social SC (Parents; F4); (b) the four factors are intercorrelated, as indicated by the six two-headed arrows; (c) there are 32 observed variables, as indicated by the 32 rectangles (SDQ1-SDQ66); each represents one item from the SDQ-I; (d) the observed variables measure the factors in

[Figure 1. Hypothesized CFA model: a path diagram in which four factor ellipses, Physical SC (Appearance) F1, Physical SC (Ability) F2, Social SC (Peers) F3, and Social SC (Parents) F4, each send single-headed arrows (the first in each set fixed to 1.0) to eight SDQ item rectangles, with one error term (E1-E66) attached to each item.]



the following pattern: Items 1, 8, 15, 22, 38, 46, 54, and 62 measure Factor 1, Items 3, 10, 24, 32, 40, 48, 56, and 64 measure Factor 2, Items 7, 14, 28, 36, 44, 52, 60, and 69 measure Factor 3, and Items 5, 19, 26, 34, 42, 50, 58, and 66 measure Factor 4; (e) each observed variable measures one and only one factor; and (f) errors of measurement associated with each observed variable (E1-E66) are uncorrelated (i.e., there are no double-headed arrows connecting any two error terms). Although the error variables, technically speaking, are unobserved variables, and should have ellipses around them, common convention in such diagrams omits them in the interest of clarity.

In summary, a more formal description of the CFA model in Figure 1 argues that: (a) responses to the SDQ-I are explained by four factors; (b) each item has a nonzero loading on the nonacademic SC factor it was designed to measure (termed target loadings), and zero loadings on all other factors (termed nontarget loadings); (c) the four factors are correlated; and (d) measurement error terms are uncorrelated.

Structural Equation Specification of the Model

From a review of Figure 1, you will note that each observed variable is linked to its related factor by a single-headed arrow pointing from the factor to the observed variable. These arrows represent regression paths and, as such, imply the influence of each factor in predicting its set of observed variables. Take, for example, the arrow pointing from Physical SC (Appearance) to SDQ1. This symbol conveys the notion that responses to Item 1 of the SDQ-I assessment measure are 'caused' by the underlying construct of physical SC, as it reflects one's perception of his or her physical appearance. In CFA, these symbolized regression paths represent factor loadings and, as with all factor analyses, their strength is of primary interest. Thus, specification of a hypothesized model focuses on the formulation of equations that represent these structural regression paths. Of secondary importance are any covariances between the factors and/or between the measurement errors.

The building of these equations, in SEM, embraces two important notions: (a) that any variable in the model having an arrow pointing at it represents a dependent variable, and (b) dependent variables are always explained (i.e., accounted for) by other variables in the model. One relatively simple approach to formulating these structural equations, then, is first to note each dependent variable in the model and then to summarize all influences on these variables. Turning again to Figure 1, we see that there are 32 variables with arrows pointing toward them; all represent observed variables (SDQ1-SDQ66). Accordingly, these regression paths can be summarized in terms of 32 separate equations as follows:

SDQ1  = F1 + E1
SDQ8  = F1 + E8
SDQ15 = F1 + E15
...
SDQ62 = F1 + E62

SDQ3  = F2 + E3
SDQ10 = F2 + E10
...
SDQ64 = F2 + E64

SDQ7  = F3 + E7
SDQ14 = F3 + E14
...
SDQ69 = F3 + E69

SDQ5  = F4 + E5
SDQ19 = F4 + E19
...
SDQ66 = F4 + E66                                    (1)

Although, in principle, there is a one-to-one correspondence between the schematic presentation of a model and its translation into a set of structural equations, it is important to note that neither one of these representations tells the whole story. Some parameters, critical to the estimation of the model, are not explicitly shown and thus may not be obvious to the novice CFA analyst. For example, in both the schematic model (see Figure 1) and the linear structural equations cited above, there is no indication that either the factor variances or the error variances are parameters in the model. However, such parameters are essential to all structural equation models and therefore must be included in the model specification. Typically, this specification is made via a separate program command statement, although some programs may incorporate default values. Likewise, it is equally important to draw your attention to the specified nonexistence of certain parameters in a model. For example, in Figure 1, we detect no curved

arrow between E1 and E8, which would suggest the lack of covariance between the error terms associated with the observed variables SDQ1 and SDQ8. (Error covariances can reflect overlapping item content and, as such, represent the same question being asked, but with a slightly different wording.)

Testing a Hypothesized CFA Model

Testing for the validity of a hypothesized CFA model requires the satisfaction of certain statistical assumptions and entails a series of analytic steps. Although a detailed review of this testing process is beyond the scope of the present chapter, a brief outline is now presented in an attempt to provide readers with at least a flavor of the steps involved. (For a nonmathematical and paradigmatic introduction to SEM based on three different programmatic approaches to the specification and testing of a variety of basic CFA models, readers are referred to [6-9]; for a more detailed and analytic approach to SEM, readers are referred to [3], [14], [16], and [17].)

Statistical Assumptions

As with other multivariate methodologies, SEM assumes that certain statistical conditions have been met. Of primary importance is the assumption that the data are multivariate normal (see Catalogue of Probability Density Functions). In essence, the concept of multivariate normality embraces three requirements: (a) that the univariate distributions are normal; (b) that the joint distributions of all variable combinations are normal; and (c) that all bivariate scatterplots are linear and homoscedastic [14]. Violations of multivariate normality can lead to the distortion of goodness-of-fit indices related to the model as a whole (see, e.g., [10], [12]; see also Goodness of Fit) and to positively biased tests of significance related to the individual parameter estimates [14].

Estimating the Model

Once the researcher determines that the statistical assumptions have been met, the hypothesized model can then be tested statistically in a simultaneous analysis of the entire system of variables. As such, some parameters are freely estimated while others remain fixed to zero or some other nonzero value. (Nonzero values such as the 1's specified in Figure 1 are typically assigned to certain parameters for purposes of model identification and latent factor scaling.) For example, as shown in Figure 1, and in the structural equation above, the factor loading of SDQ8 on Factor 1 is freely estimated, as indicated by the single-headed arrow leading from Factor 1 to SDQ8. By contrast, the factor loading of SDQ10 on Factor 1 is not estimated (i.e., there is no single-headed arrow leading from Factor 1 to SDQ10); this factor loading is automatically fixed to zero by the program. Although there are four main methods for estimating parameters in CFA models, maximum likelihood estimation remains the one most commonly used and is the default method for all SEM programs.

Evaluating Model Fit

Once the CFA model has been estimated, the next task is to determine the extent to which its specifications are consistent with the data. This evaluative process focuses on two aspects: (a) goodness-of-fit of the model as a whole, and (b) goodness-of-fit of individual parameter estimates. Global assessment of fit is determined through the examination of various fit indices and other important criteria. In the event that goodness-of-fit is adequate, the model argues for the plausibility of postulated relations among variables; if it is inadequate, the tenability of such relations is rejected. Although there is now a wide array of fit indices from which to choose, typically only one or two need be reported, along with other fit-related indicators. A typical combination of these evaluative criteria might include the Comparative Fit Index (CFI; Bentler, [1]), the standardized root mean square residual (SRMR), and the Root Mean Square Error of Approximation (RMSEA; [18]), along with its 90% confidence interval. Indicators of a well-fitting model would be evidenced from a CFI value equal to or greater than .93 [11], an SRMR value of less than .08 [11], and an RMSEA value of less than .05 [4].

Goodness-of-fit related to individual parameters of the model focuses on both the appropriateness (i.e., no negative variances, no correlations >1.00) and statistical significance (i.e., estimate divided by standard error >1.96) of their estimates. For parameters to remain specified in a model, their estimates must be statistically significant.
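The evaluative criteria above can be computed from the model and baseline chi-square statistics using standard formulas. The following sketch is illustrative only: the numeric inputs are hypothetical, it is not tied to any particular SEM program, and SRMR is omitted because it requires the full observed and implied covariance matrices.

```python
import math

def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation from the model chi-square."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative Fit Index relative to the independence (null) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, 0.0)
    denom = max(d_model, d_null)
    return 1.0 if denom == 0.0 else 1.0 - d_model / denom

def parameter_significant(estimate, std_error):
    """Individual-parameter test: |estimate / SE| must exceed 1.96."""
    return abs(estimate / std_error) > 1.96

# Hypothetical results for a sample of N = 500:
print(round(rmsea(95.0, 50, 500), 3))        # 0.042, below the .05 cutoff
print(round(cfi(95.0, 50, 2000.0, 66), 3))   # 0.977, above the .93 cutoff
print(parameter_significant(0.82, 0.05))     # True
```

A confidence interval for the RMSEA (reported by SEM programs alongside the point value) requires the noncentral chi-square distribution and is not sketched here.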

[Figure 2. Standardized estimates for hypothesized CFA model: the diagram of Figure 1 annotated with standardized factor loadings (e.g., .82 for SDQ15 on F1), standardized error coefficients (e.g., .58 for E15), and latent factor correlations on the double-headed arrows (e.g., .41 between F1 and F2). Freely estimated parameters are marked with asterisks.]
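Although the diagram itself does not reproduce here, its standardized solution can be checked numerically: for each item, the squared loading (variance explained by the factor) and the squared error coefficient (unexplained variance) should sum to roughly 1.0. A sketch using the SDQ15 values reported in Figure 2:

```python
# Standardized values reported in Figure 2 for item SDQ15.
loading, error_coeff = 0.82, 0.58

explained = loading ** 2          # share of item variance due to Factor 1
unexplained = error_coeff ** 2    # share left to the error term

print(round(100 * explained))     # 67 (percent)
print(round(100 * unexplained))   # 34 (percent)

# The two shares sum to 1.01 rather than 1.00 only because the
# reported coefficients are rounded to two decimals.
print(round(explained + unexplained, 2))   # 1.01
```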



Post Hoc Model-fitting

Presented with evidence of a poorly fitting model, the hypothesized CFA model would be rejected. Analyses then proceed in an exploratory fashion as the researcher seeks to determine which parameters in the model are misspecified. Such information is gleaned from program output that focuses on modification indices (MIs), estimates that derive from testing for the meaningfulness of all constrained (or fixed) parameters in the model. For example, the constraint that the loading of SDQ10 on Factor 1 is zero, as per Figure 1, would be tested. If the MI related to this fixed parameter is large, compared to all other MIs, then this finding would argue for its specification as a freely estimated parameter. In this case, the new parameter would represent a loading of SDQ10 on both Factor 1 and Factor 2. Of critical importance in post hoc model-fitting, however, is the requirement that only substantively meaningful parameters be added to the original model specification.

Interpreting Estimates

Shown in Figure 2 are standardized parameter estimates resulting from the testing of the hypothesized CFA model portrayed in Figure 1. Standardization transforms the solution so that all variables have a variance of 1; factor loadings will still be related in the same proportions as in the original solution, but parameters that were originally fixed will no longer have the same values. In a standardized solution, factor loadings should generally be less than 1.0 [14].

Turning first to the factor loadings and their associated errors of measurement, we see that, for example, the regression of Item SDQ15 on Factor 1 (Physical SC; Appearance) is .82. Because SDQ15 loads only on Factor 1, we can interpret this estimate as indicating that Factor 1 accounts for approximately 67% (100 × .82²) of the variance in this item. The measurement error coefficient associated with SDQ15 is .58, thereby indicating that some 34% (as a result of decimal rounding) of the variance associated with this item remains unexplained by Factor 1. (It is important to note that, unlike the LISREL program [13], which does not standardize errors in variables, the EQS program [2] used here does provide these standardized estimated values; see Structural Equation Modeling: Software.)

Finally, values associated with the double-headed arrows represent latent factor correlations. Thus, for example, the value of .41 represents the correlation between Factor 1 (Physical SC; Appearance) and Factor 2 (Physical SC; Ability). These factor correlations should be consistent with the theory within which the CFA model is grounded.

In conclusion, it is important to emphasize that only issues related to the specification of first-order CFA models, and only a cursory overview of the steps involved in testing these models, has been included here. Indeed, sound application of SEM procedures in testing CFA models requires that researchers have a comprehensive understanding of the analytic process. Of particular importance are issues related to the assessment of multivariate normality, appropriateness of sample size, use of incomplete data, correction for nonnormality, model specification, identification, and estimation, evaluation of model fit, and post hoc model-fitting. Some of these topics are covered in other entries, as well as the books and journal articles cited herein.

References

[1] Bentler, P.M. (1990). Comparative fit indexes in structural models, Psychological Bulletin 107, 238–246.
[2] Bentler, P.M. (2004). EQS 6.1: Structural Equations Program Manual, Multivariate Software Inc, Encino.
[3] Bollen, K. (1989). Structural Equations with Latent Variables, Wiley, New York.
[4] Browne, M.W. & Cudeck, R. (1993). Alternative ways of assessing model fit, in Testing Structural Equation Models, K.A. Bollen & J.S. Long, eds, Sage, Newbury Park, pp. 136–162.
[5] Bryant, F.B. & Yarnold, P.R. (1995). Principal-components analysis and exploratory and confirmatory factor analysis, in Reading and Understanding Multivariate Statistics, L.G. Grimm & P.R. Yarnold, eds, American Psychological Association, Washington.
[6] Byrne, B.M. (1994). Structural Equation Modeling with EQS and EQS/Windows: Basic Concepts, Applications, and Programming, Sage, Thousand Oaks.
[7] Byrne, B.M. (1998). Structural Equation Modeling with LISREL, PRELIS, and SIMPLIS: Basic Concepts, Applications, and Programming, Erlbaum, Mahwah.
[8] Byrne, B.M. (2001a). Structural Equation Modeling with AMOS: Basic Concepts, Applications, and Programming, Erlbaum, Mahwah.
[9] Byrne, B.M. (2001b). Structural equation modeling with AMOS, EQS, and LISREL: comparative approaches to testing for the factorial validity of a measuring instrument, International Journal of Testing 1, 55–86.

[10] Curran, P.J., West, S.G. & Finch, J.F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis, Psychological Methods 1, 16–29.
[11] Hu, L.-T. & Bentler, P.M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives, Structural Equation Modeling 6, 1–55.
[12] Hu, L.-T., Bentler, P.M. & Kano, Y. (1992). Can test statistics in covariance structure analysis be trusted? Psychological Bulletin 112, 351–362.
[13] Jöreskog, K.G. & Sörbom, D. (1996). LISREL 8: User's Reference Guide, Scientific Software International, Chicago.
[14] Kline, R.B. (1998). Principles and Practice of Structural Equation Modeling, Guilford Press, New York.
[15] Marsh, H.W. (1992). Self Description Questionnaire (SDQ) I: A Theoretical and Empirical Basis for the Measurement of Multiple Dimensions of Preadolescent Self-concept: A Test Manual and Research Monograph, Faculty of Education, University of Western Sydney, Macarthur, New South Wales.
[16] Maruyama, G.M. (1998). Basics of Structural Equation Modeling, Sage, Thousand Oaks.
[17] Raykov, T. & Marcoulides, G.A. (2000). A First Course in Structural Equation Modeling, Erlbaum, Mahwah.
[18] Steiger, J.H. (1990). Structural model evaluation and modification: an interval estimation approach, Multivariate Behavioral Research 25, 173–180.

(See also History of Path Analysis; Linear Statistical Models for Causation: A Critical Review; Residuals in Structural Equation, Factor Analysis, and Path Analysis Models; Structural Equation Modeling: Checking Substantive Plausibility)

BARBARA M. BYRNE

Factor Analysis: Exploratory

Introduction

This year marks the one hundredth anniversary for exploratory factor analysis (EFA), a method introduced by Charles Spearman in 1904 [21]. It is testimony to the deep insights of Spearman as well as many who followed that EFA continues to be central to multivariate analysis so many years after its introduction. In a recent search of electronic sources, where I restricted attention to the psychological and social sciences (using PsychINFO), more than 20 000 articles and books were identified in which the term 'factor analysis' had been used in the summary, well over a thousand citations from the last decade alone.

EFA, as it is known today, was for many years called common factor analysis. The method is in some respects similar to another well-known method called principal component analysis (PCA), and because of various similarities, these methods are frequently confused. One of the purposes of this article will be to try to dispel at least some of the confusion.

The general methodology currently seen as an umbrella for both exploratory factor analysis and confirmatory factor analysis (see Factor Analysis: Confirmatory) is called structural equation modeling (SEM). Although EFA can be described as an exploratory or unrestricted structural equation model, it would be a shame to categorize EFA as nothing more than a SEM, as doing so does an injustice to its long history as the most used and most studied latent variable method in the social and behavioral sciences. This is somewhat like saying that analysis of variance (ANOVA), which has been on the scene for more than seventy-five years and which is prominently related to experimental design, is just a multiple linear regression model. There is some truth to each statement, but it is unfair to the rich histories of EFA and ANOVA to portray their boundaries so narrowly.

A deeper point about the relationships between EFA and SEM is that these methods appeal to very different operational philosophies of science. While SEMs are standardly seen as founded on rather strict hypothetico-deductive logic, EFAs are not. Rather, EFA generally invokes an exploratory search for structure that is open to new structures not imagined prior to analysis. Rozeboom [20] has carefully examined the logic of EFA, using the label explanatory induction to describe it; this term neatly summarizes EFA's reliance on data to induce hypotheses about structure, and its general concern for explanation.

Several recent books, excellent reviews, and constructive critiques of EFA have become available to help understand its long history and its potential for

effective use in modern times [6, 8, 15, 16, 23, 25]. A key aim of this article is to provide guidance with respect to literature about factor analysis, as well as to software to aid applications.

Basic Ideas of EFA Illustrated

Given a matrix of correlations or covariances (see Correlation and Covariance Matrices) among a set of manifest or observed variables, EFA entails a model whose aim is to explain or account for correlations using a smaller number of 'underlying variables' called common factors. EFA postulates common factors as latent variables, so they are unobservable in principle. Spearman's initial model, developed in the context of studying relations among psychological measurements, used a single common factor to account for all correlations among a battery of tests of intellectual ability. Starting in the 1930s, Thurstone generalized the 'two-factor' method of Spearman so that EFA became a multiple (common) factor method [22]. In so doing, Thurstone effectively broadened the range of prospective applications in science. The basic model for EFA today remains largely that of Thurstone. EFA entails an assumption that there exist uniqueness factors as well as common factors, and that these two kinds of factors complement one another in mutually orthogonal spaces. An example will help clarify the central ideas.

Table 1 contains a correlation matrix for all pairs of five variables, the first four of which correspond to ratings by the seventeenth century art critic de Piles (using a 20 point scale) of 54 painters for whom data were complete [7]. Works of these painters were rated on four characteristics: composition, drawing, color, and expression. Moreover, each painter was associated with a particular 'School.' For current purposes, all information about Schools is ignored except for distinguishing the most distinctive School D (Venetian) from the rest using a binary variable. For more details, see the file 'painters' in the Modern Applied Statistics with S (MASS) library in R or Splus software (see Software for Statistical Analyses), and note that the original data and several further analyses can be found in the MASS library [24].

Table 1  Correlations among pairs of variables, painter data of [8]

              Composition  Drawing  Color  Expression  School D
Composition          1.00
Drawing              0.42     1.00
Color               −0.10    −0.52   1.00
Expression           0.66     0.57  −0.20        1.00
School D            −0.29    −0.36   0.53       −0.45      1.00

Table 1 exhibits correlations among the painter variables, where upper triangle entries are ignored since the matrix is symmetric. Table 2 exhibits a common factor coefficients matrix (of order 5 × 2) that corresponds to the initial correlations, where entries of highest magnitude are in bold print. The final column of Table 2 is labeled h², the standard notation for variable communalities. Because these factor coefficients correspond to an orthogonal factor solution, that is, uncorrelated common factors, each communality can be reproduced as a (row) sum of squares of the two factor coefficients to its left; for example, (0.76)² + (−0.09)² = 0.59. The columns labeled 1 and 2 are factor loadings, each of which is properly interpreted as a (product-moment) correlation between one of the original manifest variables (rows) and a derived common factor (columns).

Table 2  Factor loadings for 2-factor EFA solution, painter data

                        Factor
Variable name        1       2      h²
Composition       0.76   −0.09    0.59
Drawing           0.50   −0.56    0.56
Color            −0.03    0.80    0.64
Expression        0.81   −0.26    0.72
School D         −0.30    0.62    0.47
Avg. Col. SS      0.31    0.28    0.60

Post-multiplying the factor coefficient matrix by its transpose yields numbers that approximate the corresponding entries in the correlation matrix. For example, the inner product of the rows for Composition

and Drawing is 0.76 × 0.50 + (−0.09) × (−0.56) = 0.43, which is close to 0.42, the observed correlation; so the corresponding residual equals −0.01. Pairwise products for all rows reproduce the observed correlations in Table 1 quite well, as only one residual exceeds 0.05 in magnitude, and the mean residual is 0.01.

The final row of Table 2 contains the average sum of squares for the first two columns; the third entry is the average of the communalities in the final column, as well as the sum of the two average sums of squares to its left: 0.31 + 0.28 ≈ 0.60. These results demonstrate an additive decomposition of common variance in the solution matrix, where 60 percent of the total variance is common among these five variables and 40 percent is uniqueness variance.

Users of EFA have often confused communality with reliability, but these two concepts are quite distinct. Classical common factor and psychometric test theory entail the notion that the uniqueness is the sum of two (orthogonal) parts, specificity and error. Consequently, uniqueness variance is properly seen as an upper bound for error variance; alternatively, communality is in principle a lower bound for reliability. It might help to understand this by noting that each EFA entails analysis of just a sample of observed variables or measurements in some domain, and that the addition of more variables within the general domain will generally increase shared variance as well as individual communalities. As battery size is increased, individual communalities increase toward upper limits that are in principle close to variable reliabilities. See [15] for a more elaborate discussion.

To visualize results for my example, I plot the common factor coefficients in a plane, after making some modifications in signs for selected rows and the second column. Specifically, I reverse the signs of the 3rd and 5th rows, as well as in the second column, so that all values in the factor coefficients matrix become positive. Changes of this sort are always permissible, but we need to keep track of the changes, in this case by renaming the third variable to 'Color[−1]' and the final binary variable to 'School.D[−1]'. Plotting the revised coefficients by rows yields the five labeled points of Figure 1.

[Figure 1  Plot of variables-as-points in 2-factor space, painter data. Axes: Factor 1 and Factor 2, each scaled 0.0 to 1.0; labeled points: Composition, Expression, Drawing, Color[−1], School.D[−1].]

In addition to plotting points, I have inserted vectors to correspond to 'transformed' factors; the arrows show an 'Expression–Composition' factor and a second, correlated, 'Drawing–Color[−1]' factor. That the School.D variable also loads highly on this second factor, and is also related to, that is, not orthogonal to, the point for Expression, shows that mean ratings, especially for the Drawing, Expression, and Color variates (the latter in an opposite direction), are notably different between Venetian School artists and painters from the collection of other schools. This can be verified by examination of the correlations (sometimes called point biserials) between the School.D variable and all the ratings variables in Table 1; the skeptical reader can easily acquire these data and study details. In fact, one of the reasons for choosing this example was to show that EFA as an exploratory data analytic method can help in studies of relations among quantitative and categorical variables. Some connections of EFA with other methods will be discussed briefly in the final section.

In modern applications of factor analysis, investigators ordinarily try to name factors in terms of dimensions of individual difference variation, to identify latent variables that in some sense appear to 'underlie' observed variables. In this case, my ignorance of the works of these classical painters, not to mention of the thinking of de Piles as related to his ratings, led to my literal, noninventive factor names.

Before going on, it should be made explicit that insertion of the factor-vectors into this plot, and the attempt to name factors, are best regarded as discretionary parts of the EFA enterprise. The key output of such an analysis is the identification of the subspace defined by the common factors, within which
variables can be seen to have certain distinctive structural relationships with one another. In other words, it is the configuration of points in the derived space that provides the key information for interpreting factor results; a relatively low-dimensional subspace provides insights into structure, as well as quantification of how much variance variables have in common. Positioning or naming of factors is generally optional, however common. When the number of derived factors exceeds two or three, factor transformation is an almost indispensable part of an EFA, regardless of whether attempts are made to name factors.

Communalities generally provide information as to how much variance variables have in common or share, and can sometimes be indicative of how highly predictable variables are from one another. In fact, the squared multiple correlation of each variable with all others in the battery is often recommended as an initial estimate of communality for each variable. Communalities can also signal (un)reliability, depending on the composition of the battery of variables and the number of factors; recall the foregoing discussion on this matter.

Note that there are no assumptions that point configurations for variables must have any particular form. In this sense, EFA is more general than many of its counterparts. Its exploratory nature also means that prior structural information is usually not part of an EFA, although this idea will eventually be qualified in the context of reviewing factor transformations. Even so, clusters or hierarchies of either variables or entities may sometimes be identified in EFA solutions. In our example, application of the common factor method yields a relatively parsimonious model in the sense that two common factors account for all relationships among variables. However, EFA was, and is usually, antiparsimonious in another sense, as there is one uniqueness factor for each variable as well as common factors to account for all entries in the correlation table.

Some Relationships Between EFA and PCA

As noted earlier, EFA is often confused with PCA. In fact, misunderstanding occurs so often in reports, published articles, and textbooks that it will be useful to describe how these methods compare, at least in a general way. More detailed or more technical discussions concerning such differences are available in [15].

As noted, the key aim of EFA is usually to derive a relatively small number of common factors to explain or account for (off-diagonal) covariances or correlations among a set of observed variables. However, despite being an exploratory method, EFA entails use of a falsifiable model at the level of manifest observations or correlations (covariances). For such a model to make sense, relationships among manifest variables should be approximately linear. When approximate linearity does not characterize relationships among variables, attempts can be made to transform (at least some of) the initial variables to 'remove bends' in their relationships with other variables, or perhaps to remove outliers. Square root, logarithmic, reciprocal, and other nonlinear transformations are often effective for such purposes. Some investigators question such steps, but rather than asking why nonlinear transformations should be considered, a better question usually is, 'Why should the analyst believe the metric used at the outset for particular variables should be expected to render relationships linear, without reexpressions or transformations?' Given at least approximate linearity among all pairs of variables – the inquiry about which is greatly facilitated by examining pairwise scatterplots among all pairs of variables – common factor analysis can often facilitate explorations of relationships among variables. The prospects for effective or productive applications of EFA are also dependent on thoughtful efforts at the stage of study design, a matter to be briefly examined below. With reference to our example, the pairwise relationships between the various pairs of de Piles's ratings of painters were found to be approximately linear.

In contrast to EFA, principal components analysis does not engage a model. PCA generally entails an algebraic decomposition of an initial data matrix into mutually orthogonal derived variables called components. Alternatively, PCA can be viewed as a linear transformation of the initial data vectors into uncorrelated variates with certain optimality properties. Data are usually centered at the outset by subtracting means for each variable and then scaled so that all variances are equal, after which the (rectangular) data matrix is resolved using a method called singular value decomposition (SVD). Components from an SVD are usually ordered so that
the first component accounts for the largest amount of variance, the second the next largest amount, subject to the constraint that it be uncorrelated with the first, and so forth. The first few components will often summarize the majority of variation in the data, as these are principal components. When used in this way, PCA is justifiably called a data reduction method, and it has often been successful in showing that a rather large number of variables can be summarized quite well using a relatively small number of derived components.

Conventional PCA can be completed by simply computing a table of correlations of each of the original variables with the chosen principal components; indeed, doing so yields a PCA counterpart of the EFA coefficients matrix in Table 2 if two components are selected. Furthermore, sums of squares of correlations in this table, across variables, show the total variance each component explains. These component-level variances are the eigenvalues produced when the correlation matrix associated with the data matrix is resolved into eigenvalues and eigenvectors. Alternatively, given the original (centered and scaled) data matrix, and the eigenvalues and vectors of the associated correlation matrix, it is straightforward to compute principal components. As in EFA, derived PCA coefficient matrices can be rotated or transformed, and for purposes of interpretation this has become routine.

Given its algebraic nature, there is no particular reason for transforming variables at the outset so that their pairwise relationships are even approximately linear. This can be done, of course, but absent a model, or any particular justification for concentrating on pairwise linear relationships among variables, principal components analysis of correlation matrices is somewhat arbitrary. Because PCA is just an algebraic decomposition of data, it can be used for any kind of data; no constraints are made about the dimensionality of the data matrix, no constraints on data values, and no constraints on how many components to use in analyses. These points imply that for PCA, assumptions are also optional regarding statistical distributions, either individually or collectively. Accordingly, PCA is a highly general method, with potential for use for a wide range of data types or forms. Given their basic form, PCA methods provide little guidance for answering model-based questions, such as those central to EFA. For example, PCA generally offers little support for assessing how many components ('factors') to generate, or try to interpret; nor is there assistance for choosing samples or extrapolating beyond extant data for purposes of statistical or psychometric generalization. The latter concerns are generally better dealt with using models, and EFA provides what in certain respects is one of the most general classes of models available.

To make certain other central points about PCA more concrete, I return to the correlation matrix for the painter data. I also conducted a PCA with two components (but to save space I do not present the table of 'loadings'). That is, I constructed the first two principal component variables, and found their correlations with the initial variables. A plot (not shown) of the principal component loadings analogous to that of Figure 1 shows the variables to be configured similarly, but all points are further from the origin. The row sums of squares of the component loadings matrix were 0.81, 0.64, 0.86, 0.83, and 0.63, values that correspond to the communality estimates in the third column of the common factor matrix in Table 2. Across all five variables, PCA row sums of squares (which should not be called communalities) range from 14 to 37 percent larger than the h2 entries in Table 2, an average of 27 percent; this means that component loadings are substantially larger in magnitude than their EFA counterparts, as will be true quite generally. For any data system, given the same number of components as common factors, component solutions yield row sums of squares that tend to be at least somewhat, and often markedly, larger than corresponding communalities.

In fact, these differences between characteristics of the PCA loadings and common factor loadings signify a broad point worthy of discussion. Given that principal components are themselves linear combinations of the original data vectors, each of the data variables tends to be part of the linear combination with which it is correlated. The largest weights for each linear combination correspond to variables that most strongly define the corresponding linear combination, and so the corresponding correlations in the Principal Component (PC) loading matrix tend to be highest, and indeed to have spuriously high magnitudes. In other words, each PC coefficient in the matrix that constitutes the focal point for interpretation of results tends to have a magnitude that is 'too large' because the corresponding variable is correlated partly with itself, the more so for variables
that are largest parts of corresponding components. Also, this effect tends to be exacerbated when principal components are rotated. Contrastingly, common factors are latent variables, outside of the space of the data vectors, and common factor loadings are not similarly spurious. For example, EFA loadings in Table 2, being correlations of observed variables with latent variables, do not reflect self-correlations, as do their PCA counterparts.

The Central EFA Questions: How Many Factors? What Communalities?

Each application of EFA requires a decision about how many common factors to select. Since the common factor model is at best an approximation to the real situation, questions such as how many factors, or what communalities, are inevitably answered with some degree of uncertainty. Furthermore, particular features of given data can make formal fitting of an EFA model tenuous. My purpose here is to present EFA as a true exploratory method based on common factor principles, with the understanding that formal 'fitting' of the EFA model is secondary to 'useful' results in applications; moreover, I accept that certain decisions made in contexts of real data analysis are inevitably somewhat arbitrary and that any given analysis will be incomplete. A wider perspective on relevant literature will be provided in the final section.

The history of EFA is replete with studies of how to select the number of factors; hundreds of both theoretical and empirical approaches have been suggested for the number-of-factors question, as this issue has been seen as basic for much of the past century. I shall summarize some of what I regard as the most enduring principles or methods, while trying to shed light on when particular methods are likely to work effectively, and how the better methods can be attuned to reveal relevant features of extant data.

Suppose scores have been obtained on some number of correlated variables, say p, for n entities, perhaps persons. To entertain a factor analysis (EFA) for these variables generally means to undertake an exploratory structural analysis of linear relations among the p variables by analyzing a p × p covariance or correlation matrix. Standard outputs of such an analysis are a factor loading matrix for orthogonal or correlated common factors, as well as communality estimates, and perhaps factor score estimates. All such results are conditioned on the number, m, of common factors selected for analysis. I shall assume that in deciding to use EFA, there is at least some doubt, a priori, as to how many factors to retain, so extant data will be the key basis for deciding on the number of factors. (I shall also presume that the data have been properly prepared for analysis, appropriate nonlinear transformations made, and so on, with the understanding that even outwardly small changes in the data can affect criteria bearing on the number of factors, and more.)

The reader who is even casually familiar with EFA is likely to have learned that one way to select the number of factors is to see how many eigenvalues (of the correlation matrix; recall PCA) exceed a certain criterion. Indeed, the 'roots-greater-than-one' rule has become a default in many programs. Alas, rules of this sort are generally too rigid to serve reliably for their intended purpose; they can lead either to overestimates or underestimates of the number of common factors. Far better than using any fixed cutoff is to understand certain key principles and then learn some elementary methods and strategies for choosing m. In some cases, however, two or more values of m may be warranted, in different solutions, to serve distinctive purposes for different EFAs of the same data.

A second thing even a nonspecialist may have learned is to employ a 'scree' plot (SP) to choose the number of factors in EFA. An SP entails plotting eigenvalues, ordered from largest to smallest, against their ordinal position, 1, 2, . . ., and so on. Ordinarily, the SP is based on eigenvalues of a correlation matrix [5]. While the usual SP sometimes works reasonably well for choosing m, there is a mismatch between such a standard SP and another relevant fact: a tacit assumption of this method is that all p communalities are the same. But to assume equal communalities is usually to make a rather strong assumption, one quite possibly not supported by the data in hand.

A better idea for the SP entails computing the original correlation matrix, R, as well as its inverse R−1. Then, denoting the diagonal of the inverse as D2 (entries of which exceed unity), rescale the initial correlation matrix to DRD, and then compute eigenvalues of this rescaled correlation matrix. Since the largest entries in D2 correspond to variables that are most predictable from all others, and vice versa, the
effect is to weigh variables more if they are more predictable, less if they are less predictable from other variables in the battery. (The complement of the reciprocal of any D2 entry is in fact the squared multiple correlation (SMC) of that variable with all others in the set.) An SP based on eigenvalues of DRD allows for variability of communalities, and is usually realistic in assuming that communalities are at least roughly proportional to SMC values.

Figure 2 provides illustrations of two scree plots based on DRD, as applied to two simulated random samples. Although real data were used as the starting point for each simulation, both samples are just simulation sets of (the same size as) the original data set, where four factors had consistently been identified as the 'best' number to interpret.

Each of these two samples yields a scree plot, and both are given in Figure 2 to provide some sense of the sampling variation inherent in such data; in this case, each plot leads to breaks after four common factors – where the break is found by reading the plot from right to left. But the slope between four and five factors is somewhat greater for one sample than the other, so one sample identifies m as four with slightly more clarity than the other. In fact, for some other samples examined in preparing these scree plots, breaks came after three or five factors, not just four. Note that for smaller samples greater variation can be expected in the eigenvalues, and hence the scree breaks will generally be less reliable indicators of the number of common factors.

So what is the principle behind the scree method? The answer is that the variance of the p − m smallest eigenvalues is closely related to the variance of residual correlations associated with fitting off-diagonals of the observed correlation matrix in successive choices for m, the number of common factors. When a break occurs in the eigenvalue plot, it signifies a notable drop in the sum of squares of residual correlations after fitting the common factor model to the observed correlation matrix for a particular value of m. I have constructed a horizontal line in Figure 2 to correspond to the mean of the 20 smallest eigenvalues (24 − 4) of DRD, to help see the variation these so-called 'rejected' eigenvalues have around their mean. In general, it is the variation around such a mean of rejected eigenvalues that one seeks to reduce to a 'reasonable' level when choosing m in the EFA solution, since a 'good' EFA solution accounts well for the off-diagonals of the correlation matrix. Methods such as bootstrapping – wherein multiple versions of DRD are generated over a series of bootstrap samples of the original data matrix – can be used to get a clearer sense of sampling variation, and probably should become part of standard practice in EFA, both at the level of selecting the number of factors and in assessing variation in various derived EFA results.

When covariances or correlations are well fit by some relatively small number of common factors, scree plots often provide flexible, informative, and quite possibly persuasive evidence about the number of common factors. However, SPs alone can be misleading, and further examination of the data may be helpful. The issue in selecting m vis-à-vis the SP concerns the nature or reliability of the information in eigenvectors associated with corresponding eigenvalues. Suppose some number m∗ is seen as a possible underestimate for m; then deciding to add one more factor, to have m∗ + 1 factors, is to decide that the additional eigenvector adds useful or meaningful structural information to the EFA solution. It is possible that m∗ is an 'underestimate' solely because a single correlation coefficient is poorly fit, and that adding a common factor merely reduces a single 'large' residual correlation. But especially if the use of m∗ + 1 factors yields a factor loading matrix that upon rotation (see below) improves interpretability in general, there may be ex post facto evidence that m∗ was indeed an underestimate. Similar reasoning may be applied when moving to m∗ + 2 factors, and so on. Note that sampling variation can also result in sample reordering of so-called population eigenvectors.

[Figure 2  Two scree plots, for two simulated data sets, each n = 145. Vertical axis: eigenvalues of the matrix DRD (0 to 25); horizontal axis: ordinal position 1:24; both plots show a scree break at four factors.]
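The DRD rescaling just described is simple to carry out numerically. The sketch below is my own illustration (not code from the article), assuming NumPy and an invertible correlation matrix R; it returns the eigenvalues one would plot in a scree plot like Figure 2, together with the SMC values mentioned parenthetically above:

```python
import numpy as np

def scree_eigenvalues(R):
    """Eigenvalues of DRD, where D^2 = diag(R^{-1}).

    The complement of the reciprocal of each D^2 entry is the squared
    multiple correlation (SMC) of that variable with all the others,
    so variables more predictable from the rest get more weight.
    """
    Rinv = np.linalg.inv(R)
    d2 = np.diag(Rinv)                     # entries of D^2 (each exceeds unity)
    smc = 1.0 - 1.0 / d2                   # squared multiple correlations
    d = np.sqrt(d2)
    DRD = R * np.outer(d, d)               # same as diag(d) @ R @ diag(d)
    evals = np.linalg.eigvalsh(DRD)[::-1]  # ordered largest to smallest
    return evals, smc

# Toy 3-variable correlation matrix (illustrative values only)
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
evals, smc = scree_eigenvalues(R)
```

Plotting evals against 1, 2, . . ., p and reading from right to left for a break gives the SP; the mean of the p − m smallest values plays the role of the horizontal line drawn in Figure 2.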
An adjunct to an SP that is too rarely used is simply to plot the distribution of the residual correlations, either as a histogram or in relation to the original correlations, for, say, m, m + 1, and m + 2 factors in the vicinity of the scree break; outliers or other anomalies in such plots can provide evidence that goes usefully beyond the SP when selecting m. Factor transformation(s) (see below) may be essential to one's final decision. Recall that it may be a folly even to think there is a single 'correct' value for m for some data sets.

Were one to use a different selection of variables to compose the data matrix for analysis, or perhaps make changes in the sample (deleting or adding cases), or try various different factoring algorithms, further modifications may be expected in the number of common factors. Finally, there is always the possibility that there are simply too many distinctive dimensions of individual difference variation, that is, common factors, for the EFA method to work effectively in some situations. It is not unusual that more variables, larger samples, or generally more investigative effort are required to resolve some basic questions such as how many factors to use in analysis.

Given some choice for m, the next decision is usually that of deciding what factoring method to use. The foregoing idea of computing DRD, finding its eigenvalues, and producing an SP based on those, can be linked directly to an EFA method called image factor analysis (IFA) [13], which has probably been underused, in that several studies have found it to be a generally sound and effective method. IFA is a noniterative method that produces common factor coefficients and communalities directly. IFA is based on the m largest eigenvalues, say, the diagonal entries of Λm, and corresponding eigenvectors, say Qm, of the matrix denoted DRD above. Given a particular factor method, communality estimates follow directly from selection of the number of common factors. The analysis usually commences from a correlation matrix, so communality estimates are simply row sums of squares of the (orthogonal) factor coefficients matrix, which for m common factors is computed as D−1 Qm (Λm − φ̄ I)1/2, where φ̄ is the average of the p − m smallest eigenvalues. IFA may be especially defensible for EFA when sample size is limited; more details are provided in [17], including a sensible way to modify the diagonal D2 when the number of variables is a 'substantial fraction' of sample size.

A more commonly used EFA method is called maximum likelihood factor analysis (MLFA), for which algorithms and software are readily available and generally well understood. The theory for this method has been studied perhaps more than any other, and it tends to work effectively when the EFA problem has been well-defined and the data are 'well-behaved.' Specialists regularly advocate use of the MLFA method [1, 2, 16, 23], and it is often seen as the common factor method of choice when the sample is relatively large. Still, MLFA is an iterative method that can lead to poor solutions, so one must be alert in case it fails in some way. Maximum likelihood EFA methods generally call for large n's, using an assumption that the sample has been drawn randomly from a parent population for which multivariate normality (see Catalogue of Probability Density Functions) holds, at least approximately; when this assumption is violated seriously, or when sample size is not 'large,' MLFA may not serve its exploratory purpose well. Statistical tests may sometimes be helpful, but the sample size issue is vital if EFA is used for testing statistical hypotheses. There can be a mismatch between exploratory use of EFA and statistical testing, because small samples may not be sufficiently informative to reject any factor model, while large samples may lead to rejection of every model in some domains of application. Generally, scree methods for choosing the number of factors are superior to statistical testing procedures.

Given a choice of factoring methods – and of course there are many algorithms in addition to IFA and MLFA – the generation of communality estimates follows directly from the choice of m, the number of common factors. However, some EFA methods or algorithms can yield numerically unstable results, particularly if m is a substantial fraction of p, the number of variables, or when n is not large in relation to p. Choice of factor methods, like many other methodological decisions, is often best made in consultation with an expert.

Factor Transformations to Support EFA Interpretation

Given at least a tentative choice for m, EFA methods such as IFA or MLFA can be used straightforwardly to produce matrices of factor coefficients to account for structural relations among variables. However,
attempts to interpret factor coefficient matrices with- ‘varimax’, or ‘equamax.’ Dispensing with quotations,
out further efforts to transform factors usually fall we merely note that in general, equamax solutions
short unless m = 1 or 2, as in our illustration. For tend to produce simple structure solutions for which
larger values of m, factor transformation can bring different factors account for nearly equal amounts of
order out of apparent chaos, with the understanding common variance; quartimax, contrastingly, typically
that order can take many forms. Factor transformation generates one broad or general factor followed by
algorithms normally take one of three forms: Pro- m − 1 ‘smaller’ ones; varimax produces results inter-
crustes fitting to a prespecified target (see Procrustes mediate between these extremes. The last, varimax,
Analysis), orthogonal simple structure, or oblique is the most used of the orthogonal simple structure
simple structure. All modern methods entail use of rotations, but choice of a solution should not be based
specialized algorithms. I shall begin with Procrustean too strongly on generic popularity, as particular fea-
methods and review each class of methods briefly. tures of a data set can make other methods more
Procrustean methods owe their name to a figure effective. Orthogonal solutions offer the appealing
of ancient Greek mythology, Procrustes, who made feature that squared common factor coefficients show
a practice of robbing highway travelers, tying them directly how much of each variable’s common vari-
up, and stretching them, or cutting off their feet ance is associated with each factor. This property is
to make them fit a rigid iron bed. In the context lost when factors are transformed obliquely. Also, the
of EFA, Procrustes methods are more benign; they factor coefficients matrix alone is sufficient to inter-
merely invite the investigator to prespecify his or her pret orthogonal factors; not so when derived factors
beliefs about structural relations among variables in are mutually correlated. Still, forcing factors to be
the form of a target matrix, and then transform an initial factor coefficients matrix to put it in relatively close conformance with the target. Prespecification of configurations of points in m-space, preferably on the basis of hypothesized structural relations that are meaningful to the investigator, is a wise step for most EFAs even if Procrustes methods are not to be used explicitly for transformations. This is because explication of beliefs about structures can afford (one or more) reference system(s) for interpretation of empirical data structures however they were initially derived. It is a long-respected principle that prior information, specified independently of extant empirical data, generally helps to support scientific interpretations of many kinds, and EFA should be no exception. In recent times, however, methods such as confirmatory factor analysis (CFA) are usually seen as making Procrustean EFA methods obsolete because CFA methods offer generally more sophisticated numerical and statistical machinery to aid analyses. Still, as a matter of principle, it is useful to recognize that the general methodology of EFA has for over sixty years permitted, and in some respects encouraged, incorporation of sharp prior questions in structural analyses.

Orthogonal rotation algorithms provide relatively simple ways for transforming factors and these have been available for nearly forty years. Most commonly, an 'orthomax' criterion is optimized, using methods that have been dubbed 'quartimax', […] uncorrelated can be a weakness when the constraint of orthogonality limits factor coefficient configurations unrealistically, and this is a common occurrence when several factors are under study.

Oblique transformation methods allow factors to be mutually correlated. For this reason, they are more complex and have a more complicated history. A problem for many years was that by allowing factors to be correlated, oblique transformation methods often allowed the m-factor space to collapse; successful methods avoided this unsatisfactory situation while tending to work well for wide varieties of data. While no methods are entirely acceptable by these standards, several, notably those of Jennrich and Sampson (direct quartimin) [12], Harris and Kaiser (obliquimax), Rozeboom (Hyball) [18], Yates (geomin) [25], and Hendrickson and White (promax) [9], are especially worthy of consideration for applications. Browne [2], in a recent overview of analytic rotation methods for EFA, stated that Jennrich and Sampson [12] 'solved' the problems of oblique rotation; however, he went on to note that '. . . we are not at a point where we can rely on mechanical exploratory rotation by a computer program if the complexity of most variables is not close to one' [2, p. 145]. Methods such as Hyball [19] facilitate random starting positions in m-space of transformation algorithms to produce multiple solutions that can then be compared for interpretability. The promax method is notable not only because it often works well, but also because it combines elements of Procrustean logic with analytical orthogonal transformations. Yates' geomin [25] is also a particularly attractive method in that the author went back to Thurstone's basic ideas for achieving simple structure and developed ways for them to be played out in modern EFA applications. A special reason to favor simple structure transformations is provided in [10, 11], where the author noted that standard errors of factor loadings will often be substantially smaller when population structures are simple than when they are not; of course this calls attention to the design of the battery of variables.
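As a concrete illustration of the 'orthomax' family just mentioned, the criterion that quartimax (γ = 0) and varimax (γ = 1) maximize can be evaluated directly. The sketch below is ours, not from the article; the two loading matrices are invented, and Python stands in for the S/R idiom used elsewhere in this entry:

```python
# Hedged sketch: the orthomax rotation criterion
#   Q(L) = sum_j [ sum_i l_ij^4 - (gamma/p) * (sum_i l_ij^2)^2 ],
# maximized by rotation algorithms; gamma = 0 gives quartimax, gamma = 1 varimax.
# Loading matrices here are illustrative only.

def orthomax(loadings, gamma):
    """Evaluate the orthomax criterion for a p x m loading matrix (list of rows)."""
    p = len(loadings)
    m = len(loadings[0])
    total = 0.0
    for j in range(m):
        col = [row[j] for row in loadings]
        s2 = sum(l * l for l in col)    # sum of squared loadings on factor j
        s4 = sum(l ** 4 for l in col)   # sum of fourth powers
        total += s4 - (gamma / p) * s2 * s2
    return total

simple = [[1.0, 0.0],
          [0.0, 1.0]]           # each variable loads on one factor only
mixed = [[0.7071, 0.7071],
         [0.7071, 0.7071]]      # every variable loads equally on both factors

# A simple structure scores higher under both quartimax and varimax.
print(orthomax(simple, 0.0), orthomax(mixed, 0.0))   # quartimax values
print(orthomax(simple, 1.0), orthomax(mixed, 1.0))   # varimax values
```

Rotation software searches over orthogonal rotations of an initial solution for the maximum of this criterion; the point of the sketch is only that "simple structure" patterns score higher than factorially complex ones.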
Factor Analysis: Exploratory 615

Estimation of Factor Scores

It was noted earlier that latent variables, that is, common factors, are basic to any EFA model. A strong distinction is made between observable variates and the underlying latent variables seen in EFA as accounting for manifest correlations or covariances between all pairs of manifest variables. The latent variables are by definition never observed or observable in a real data analysis, and this is not related to the fact that we ordinarily see our data as a sample (of cases, or rows); latent variables are in principle not observable, either for statistically defined samples or for their population counterparts. Nevertheless, it is not difficult to estimate the postulated latent variables, using linear combinations of the observed data. Indeed, many different kinds of factor score estimates have been devised over the years (see Factor Score Estimation).

Most methods for estimating factor scores are not worth mentioning because of one or another kind of technical weakness. But there are two methods that are worthy of consideration for practical applications in EFA where factor score estimates seem needed. These are called regression estimates and Bartlett (also, maximum likelihood) estimates of factor scores, and both are easily computed in the context of IFA. Recalling that D2 was defined as the diagonal of the inverse of the correlation matrix, now suppose the initial data matrix has been centered and scaled as Z, where Z′Z = R; then, using the notation given earlier in the discussion of IFA, Bartlett estimates of factor scores can be computed as Xm−Bartlett = Z D Qm (Λm − θ I)−1/2. The discerning reader may recognize that these factor score estimates can be further simplified using the singular value decomposition of matrix Z D; indeed, these score estimates are just rescaled versions of the first m principal components of Z D. Regression estimates, in turn, are further column rescalings of the same m columns in Xm−Bartlett. MLFA factor score estimates are easily computed, but to discuss them goes beyond our scope; see [15]. Rotated or transformed versions of factor score estimates are also not complicated; the reader can go to factor score estimation (FSE) for details.
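The IFA quantities entering the Bartlett formula can be checked by hand in the smallest possible case. For a 2 × 2 correlation matrix with off-diagonal element r, everything is closed form: the diagonal of R−1 has equal elements 1/(1 − r2), the eigenvalues of DRD are 1/(1 − r) and 1/(1 + r), and with m = 1 the unrotated IFA loadings work out to √r for both variables. The Python sketch below is our illustration (not part of the original S code), mirroring the logic of the `ifa` function given at the end of this article:

```python
import math

# Hedged sketch: image-factor-analysis quantities for R = [[1, r], [r, 1]],
# using the closed forms available in this 2-variable special case.

r = 0.49                          # illustrative correlation
d2 = 1.0 / (1.0 - r * r)          # diagonal of R^{-1} (both elements equal)
d = math.sqrt(d2)                 # so D R D = R / (1 - r^2)

lam1 = 1.0 / (1.0 - r)            # larger eigenvalue of D R D
lam2 = 1.0 / (1.0 + r)            # smaller eigenvalue
q1 = [1.0 / math.sqrt(2.0)] * 2   # eigenvector belonging to lam1

m = 1
theta = lam2                      # mean of the remaining (p - m) eigenvalues
dg = math.sqrt(lam1 - theta)      # the (lam_m - theta)^{1/2} rescaling factor

# unrotated common factor loadings: D^{-1} q1 (lam1 - theta)^{1/2}
fac = [(1.0 / d) * q * dg for q in q1]
print(fac)   # both loadings equal sqrt(r)
```

The implied correlation is the product of the two loadings, which recovers r exactly, so the one-factor image solution reproduces this tiny correlation matrix.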
EFA in Practice: Some Guidelines and Resources

Software packages such as CEFA [3], which implements MLFA as well as geomin among other methods, and Hyball [18], can be downloaded from the web without cost, and they facilitate use of most of the methods for factor extraction as well as factor transformation. These packages are based on modern methods, they are comprehensive, and they tend to offer advantages that most commercial software for EFA does not. What these methods lack, to some extent, is mechanisms to facilitate modern graphical displays. Splus and R software, the latter of which is also freely available from the web [r-project.org], provide excellent modern graphical methods as well as a number of functions to implement many of the methods available in CEFA, and several in Hyball. A small function for IFA is provided at the end of this article; it works in both R and Splus. In general, however, no one source provides all methods, mechanisms, and management capabilities for a fully operational EFA system – nor should this be expected, since what one specialist means by 'fully operational' necessarily differs from that of others.

Nearly all real-life applications of EFA require decisions bearing on how and how many cases are selected, how variables are to be selected and transformed to help ensure approximate linearity between variates; next, choices about factoring algorithms or methods, the number(s) of common factors, and factor transformation methods must be made. That there be no notably weak links in this chain is important if an EFA project is to be most informative. Virtually all questions are contextually bound, but the literature of EFA can provide guidance at every step.

Major references on EFA application, such as that of Carroll [4], point up many of the possibilities and a perspective on related issues. Carroll suggests that special value can come from side-by-side analyses of the same data using EFA methods and those based on structural equation modeling (SEM). McDonald [15] discusses EFA methods in relation to SEM. Several authors have made connections between EFA and other multivariate methods such as basic regression; see [14, 17] for examples.
– – an S function for Image Factor Analysis – –

'ifa' <- function(rr, mm) {
    # routine is based on image factor analysis;
    # it generates an unrotated common factor coefficients
    # matrix & a scree plot; in R, follow w/ promax or
    # varimax; in Splus follow w/ rotate.
    # rr is taken to be symmetric matrix of correlations
    # or covariances; mm is no. of factors.
    # For additional functions or assistance, contact:
    # rpruzek@uamail.albany.edu
    rinv <- solve(rr)  # takes inverse of R; so R must be nonsingular
    sm2i <- diag(rinv)
    smrt <- sqrt(sm2i)
    dsmrt <- diag(smrt)
    rsr <- dsmrt %*% rr %*% dsmrt
    reig <- eigen(rsr, sym = T)
    vlamd <- reig$va
    vlamdm <- vlamd[1:mm]
    qqm <- as.matrix(reig$ve[, 1:mm])
    theta <- mean(vlamd[(mm + 1):nrow(qqm)])
    dg <- sqrt(vlamdm - theta)
    if (mm == 1) fac <- dg[1] * diag(1/smrt) %*% qqm
    else fac <- diag(1/smrt) %*% qqm %*% diag(dg)
    plot(1:nrow(rr), vlamd, type = "o")
    abline(h = theta, lty = 3)
    title("Scree plot for IFA")
    print("Common factor coefficients matrix is: fac")
    print(fac)
    list(vlamd = vlamd, theta = theta, fac = fac)
}

References

[1] Browne, M.W. (1968). A comparison of factor analytic techniques, Psychometrika 33, 267–334.
[2] Browne, M.W. (2001). An overview of analytic rotation in exploratory factor analysis, Multivariate Behavioral Research 36, 111–150.
[3] Browne, M.W., Cudeck, R., Tateneni, K. & Mels, G. (1998). CEFA: Comprehensive Exploratory Factor Analysis (computer software and manual). [http://quantrm2.psy.ohio-state.edu/browne/]
[4] Carroll, J.B. (1993). Human Cognitive Abilities: A Survey of Factor Analytic Studies, Cambridge University Press, New York.
[5] Cattell, R.B. (1966). The scree test for the number of factors, Multivariate Behavioral Research 1, 245–276.
[6] Darlington, R. (2000). Factor Analysis (Instructional Essay on Factor Analysis). [http://comp9.psych.cornell.edu/Darlington/factor.htm]
[7] Davenport, M. & Studdert-Kennedy, G. (1972). The statistical analysis of aesthetic judgement: an exploration, Applied Statistics 21, 324–333.
[8] Fabrigar, L.R., Wegener, D.T., MacCallum, R.C. & Strahan, E.J. (1999). Evaluating the use of exploratory factor analysis in psychological research, Psychological Methods 3, 272–299.
[9] Hendrickson, A.E. & White, P.O. (1964). PROMAX: a quick method for transformation to simple structure, British Journal of Statistical Psychology 17, 65–70.
[10] Jennrich, R.I. (1973). Standard errors for obliquely rotated factor loadings, Psychometrika 38, 593–604.
[11] Jennrich, R.I. (1974). On the stability of rotated factor loadings: the Wexler phenomenon, British Journal of Mathematical and Statistical Psychology 26, 167–176.
[12] Jennrich, R.I. & Sampson, P.F. (1966). Rotation for simple loadings, Psychometrika 31, 313–323.
[13] Jöreskog, K.G. (1969). Efficient estimation in image factor analysis, Psychometrika 34, 51–75.
[14] Lawley, D.N. & Maxwell, A.E. (1973). Regression and factor analysis, Biometrika 60, 331–338.
[15] McDonald, R.P. (1984). Factor Analysis and Related Methods, Lawrence Erlbaum Associates, Hillsdale.
[16] Preacher, K.J. & MacCallum, R.C. (2003). Repairing Tom Swift's electric factor analysis machine, Understanding Statistics 2, 13–43. [http://www.geocities.com/Athens/Acropolis/8950/tomswift.pdf]
[17] Pruzek, R.M. & Lepak, G.M. (1992). Weighted structural regression: a broad class of adaptive methods for improving linear prediction, Multivariate Behavioral Research 27, 95–130.
[18] Rozeboom, W.W. (1991). HYBALL: a method for subspace-constrained oblique factor rotation, Multivariate Behavioral Research 26, 163–177. [http://web.psych.ualberta.ca/∼rozeboom/]
[19] Rozeboom, W.W. (1992). The glory of suboptimal factor rotation: why local minima in analytic optimization of simple structure are more blessing than curse, Multivariate Behavioral Research 27, 585–599.
[20] Rozeboom, W.W. (1997). Good science is abductive, not hypothetico-deductive, in What if there were no significance tests?, Chapter 13, L.L. Harlow, S.A. Mulaik & J.H. Steiger, eds, Lawrence Erlbaum Associates, Hillsdale, NJ.
[21] Spearman, C. (1904). General intelligence objectively determined and measured, American Journal of Psychology 15, 201–293.
[22] Thurstone, L.L. (1947). Multiple-factor Analysis: A Development and Expansion of the Vectors of Mind, University of Chicago Press, Chicago.
[23] Tucker, L. & MacCallum, R.C. (1997). Exploratory factor analysis. [unpublished, but available: http://www.unc.edu/∼rcm/book/factornew.htm]
[24] Venables, W.N. & Ripley, B.D. (2002). Modern Applied Statistics with S, Springer, New York.
[25] Yates, A. (1987). Multivariate Exploratory Data Analysis: A Perspective on Exploratory Factor Analysis, State University of New York Press, Albany.

ROBERT PRUZEK
Factor Analysis: Multiple Groups

Factor Analysis: Multiple Groups with Means

The confirmatory factor analysis (CFA) (see Factor Analysis: Confirmatory) model is a very effective approach to modeling multivariate relationships across multiple groups. The CFA approach to factorial invariance has its antecedents in exploratory factor analysis. Cattell [4] developed a set of principles by which to judge the rotated solutions from two populations, with the goal being simultaneous simple structure. Further advancements were made by Horst & Schaie [7] and culminated with work by Meredith [13], in which he gave methods for rotating solutions from different populations to achieve one best-fitting factor pattern. Confirmatory factor analytic techniques have made exploratory methods of testing for invariance obsolete by allowing an exact structure to be hypothesized. The multiple-group CFA approach is particularly useful when making cross-group comparisons because it allows for (a) simultaneous estimation of all parameters (including mean-level information) for all groups and (b) direct statistical comparisons of the estimated parameters across the groups. The theoretical basis for selecting groups can vary from nominal variables such as gender, race, clinical treatment group, or nationality to continuous variables that can be easily categorized, such as age-group or grade level. When making comparisons across distinct groups, however, it is critical to determine that the constructs of interest have the same meaning in each group (i.e., they are said to be measurement equivalent, or have strong factorial invariance; see below). This condition is necessary in order to make meaningful comparisons across groups [1].

In order to determine measurement equivalence, the analyses should go beyond the standard covariance structures information of the traditional CFA model to also include the mean structure information [9, 14, 16, 21]. We refer to such integrated modeling as mean and covariance structure (MACS) modeling. MACS analyses are well suited to establish construct comparability (i.e., factorial invariance) and, at the same time, detect possible between-group differences because they allow: (a) simultaneous model fitting of an hypothesized factorial structure in two or more groups (i.e., the expected pattern of indicator-to-construct relations for both the intercepts and factor loadings), (b) tests of the cross-group equivalence of both intercepts and loadings, (c) corrections for measurement error whereby estimates of the latent constructs' means and covariances are disattenuated (i.e., estimated as true and reliable values), and (d) strong tests of substantive hypotheses about possible cross-group differences on the constructs [11, 14].

The General Factor Model. To understand the logic and steps involved in multiple-group MACS modeling, we begin with the matrix algebra notation for the general factor model, which, for multiple populations g = 1, 2, . . . , G, is represented by:

    Xg = τg + Λg ξg + δg                (1)
    E(Xg) = µxg = τg + Λg κg            (2)
    Σg = Λg Φg Λg′ + Θg                 (3)

where x is a vector of observed or manifest indicators, ξ is a vector of latent constructs, τ is a vector of intercepts of the manifest indicators, Λ is the factor pattern or loading matrix of the indicators, κ represents the means of the latent constructs, Φ is the variance-covariance matrix of the latent constructs, and Θ is a symmetric matrix with the variances of the error term, δ, along the diagonal and possible covariances among the residuals in the off-diagonal. All of the parameter matrices are subscripted with a g to indicate that the parameters may take different values in each population. For the common factor model, we assume that the indicators (i.e., items, parcels, scales, responses, etc.) are continuous variables that are multivariate normal (see Catalogue of Probability Density Functions) in the population and that the elements of δ have a mean of zero and are independent of the estimated elements in the other parameter matrices.
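The role of equations (2) and (3) is easy to see numerically: given values for τ, Λ, κg, Φg, and Θg, they return the mean vector and covariance matrix that the model implies for group g. The Python sketch below uses a one-factor, three-indicator example with made-up parameter values (illustrative only, not taken from any study discussed here):

```python
# Hedged sketch: model-implied moments for one group in a MACS model,
#   mu_g    = tau + Lambda * kappa_g              (eq. 2)
#   Sigma_g = Lambda * Phi_g * Lambda' + Theta_g  (eq. 3)
# One factor, three indicators; all parameter values are invented.

tau = [0.0, 0.5, 1.0]     # indicator intercepts
lam = [1.0, 0.8, 1.2]     # factor loadings (one column of Lambda)
kappa = 0.3               # latent mean for this group
phi = 0.9                 # latent variance for this group
theta = [0.4, 0.3, 0.5]   # residual variances (diagonal of Theta)

p = len(lam)
mu = [tau[i] + lam[i] * kappa for i in range(p)]
sigma = [[lam[i] * lam[j] * phi + (theta[i] if i == j else 0.0)
          for j in range(p)] for i in range(p)]

print(mu)      # model-implied indicator means for group g
print(sigma)   # model-implied covariance matrix for group g
```

With a second group holding the same τ and Λ but different κg, Φg, and Θg, the same two lines yield that group's implied moments; cross-group equality constraints amount to forcing selected entries of these parameter arrays to be identical.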
In a MACS framework, there are six types of parameter estimate that can be evaluated for equivalence across groups. The first three components refer to the measurement level: (a) Λ, the unstandardized regression coefficients of the indicators on the latent constructs (the loadings of the indicators), (b) τ, the intercepts or means of the indicators, and (c) Θ, the residual variances of each indicator, which is the aggregate of the unique factor variance and the unreliable variance of an indicator. The other three types of parameter refer to the latent construct level: (d) κ, the means of the latent constructs, (e) φii, the latent variances, and (f) φij, the latent covariances or correlations [9, 12, 14].

Taxonomy of Invariance

A key aspect of multiple-group MACS modeling is the ability to assess the degree of factorial invariance of the constructs across groups. Factorial invariance addresses whether the constructs' measurement properties (i.e., the intercepts and loadings, which reflect the reliable components of the measurement space) are the same in two or more populations. This question is distinct from whether the latent aspects of the constructs are the same (e.g., the constructs' mean levels or covariances). This latter question deals with particular substantive hypotheses about possible group differences on the constructs (i.e., the reliable and true properties of the constructs). The concept of invariance is typically thought of and described as a hierarchical sequence of invariance starting with the weakest form and working up to the strictest form.

Although we will often discuss the modeling procedures in terms of two groups, the extension to three or more groups is straightforward (see, e.g., [9]).

Configural Invariance. The most basic form of factorial invariance is ensuring that the groups have the same basic factor structure. The groups should have the same number of latent constructs, the same number of manifest indicators, and the same pattern of fixed and freed (i.e., estimated) parameters. If these conditions are met, the groups are said to have configural invariance. As the weakest form of invariance, configural invariance only requires the same pattern of fixed and freed estimates among the manifest and latent variables, but does not require that the coefficients be equal across groups.

Weak Factorial Invariance. Although termed 'weak factorial invariance', this level of invariance is more restricted than configural invariance. Specifically, in addition to the requirement of having the same pattern of fixed and freed parameters across groups, the loadings are equated across groups. The manifest means and residual variances are free to vary. This condition is also referred to as pattern invariance [15] or metric invariance [6]. Because the factor variances are free to vary across groups, the factor loadings are, technically speaking, proportionally equivalent (i.e., weighted by the differences in latent variances). If weak factorial invariance is found to be untenable (see 'testing' below), then only configural invariance holds across groups. Under this condition, one has little basis to suppose that the constructs are the same in each group and systematic comparisons of the constructs would be difficult to justify. If invariance of the loadings holds, then one has a weak empirical basis to consider the constructs to be equivalent, which would allow cross-group comparisons of the latent variances and covariances, but not the latent means.
Strong Factorial Invariance. As Meredith [14] compellingly argued, any test of factorial invariance should include the manifest means – weak factorial invariance is not a complete test of invariance. With strong factorial invariance, the loadings and the intercepts are equated (and, like the variances of the constructs, the latent means are allowed to vary in the second and all subsequent groups). This strong form of factorial invariance, also referred to as scalar invariance [22], is required in order for individuals with the same ability in separate groups to have the same score on the instrument. With any less stringent condition, two individuals with the same true level of ability would not have the same expected value on the measure. This circumstance would be problematic because, for example, when comparing groups based on gender on a measure of mathematical ability, one would want to ensure that a male and a female with the same level of ability would receive the same score.

An important advantage of strong factorial invariance is that it establishes the measurement equivalence (or construct comparability) of the measures. In this case, constructs are defined in precisely the same operational manner in each group; as a result, they can be compared meaningfully and with quantitative precision. Measurement equivalence indicates that (a) the constructs are generalizable entities in each subpopulation, (b) sources of bias and error (e.g., cultural bias, translation errors, varying conditions of administration) are minimal, (c) subgroup differences have not differentially affected the constructs' underlying measurement characteristics (i.e., constructs are comparable because the indicators' specific variances are independent of cultural influences after conditioning on the construct-defining common variance; [14]), and (d) between-group differences in the constructs' mean, variance, and covariance relations are quantitative in nature (i.e., the nature of group differences can be assessed as mean-level, variance, and covariance or correlational effects) at the construct level. In other words, with strong factorial invariance, the broadest spectrum of hypotheses about the primary construct moments (means, variances, covariances, correlations) can be tested while simultaneously establishing measurement equivalence (i.e., two constructs can demonstrate different latent relations across subgroups, yet still be defined equivalently at the measurement level).

Strict Factorial Invariance. With strict factorial invariance, all conditions are the same as for strong invariance but, in addition, the residual variances are equated across groups. This level of invariance is not required for making veridical cross-group comparisons because the residuals are where the aggregate of the true measurement error variance and the indicator-specific variance is represented. Here, the factors that influence unreliability are not typically expected to operate in an equivalent manner across the subgroups of interest. In addition, the residuals reflect the unique factors of the measured indicators (i.e., variance that is reliable but unique to the particular indicator). If the unique factors differ trivially with regard to subgroup influences, this violation of selection theorem [14] can be effectively tolerated, if sufficiently small, by allowing the residuals to vary across the subgroups. In other words, strong factorial invariance is less biasing than strict factorial invariance because, even though the degree of random error may be quite similar across groups, if it is not exactly equal, the nonequal portions of the random error are forced into other parameters of a given model, thereby introducing potential sources of bias. Moreover, in practical applications of cross-group research such as cross-cultural studies, some systematic bias (e.g., translation bias) may influence the reliable component of a given residual. Assuming these sources of bias and error are negligible (see 'testing' below), they could be represented as unconstrained residual variance terms across groups in order to examine the theoretically meaningful common-variance components as unbiasedly as possible.

Partial Invariance. Widaman and Reise [23] and others have also introduced the concept of partial invariance, which is the condition when a constraint of invariance is not warranted for one or a few of the loading parameters. When invariance is untenable, one may then attempt to determine which indicators contribute significantly to the misfit [3, 5]. It is likely that only a few of the indicators deviate significantly across groups, giving rise to the condition known as partial invariance. When partial invariance is discovered, there are a variety of ways to proceed: (a) one can leave the estimate in the model, but not constrain it to be invariant across groups, and argue that the invariant indicators are sufficient to establish comparability of the constructs [23]; (b) one can argue that the differences between indicators are small enough that they would not make a substantive difference and proceed with invariance constraints in place [9]; (c) one could decide to reduce the number of indicators by only using indicators that are invariant across groups [16]; (d) one could conclude that, because invariance cannot be attained, the instrument must be measuring different constructs across the multiple groups and, therefore, not use the instrument at all [16]. Millsap and Kwok [16] also describe a method to assess the severity of the violations of invariance by evaluating the sensitivity and specificity at various selection points.
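The hierarchy just described – configural, weak, strong, strict – amounts to checking progressively more parameter matrices for cross-group equality. A schematic Python sketch of that logic (the data structures and function name are our own invention, not the output of any SEM package; exact equality stands in for "constrained equal"):

```python
# Hedged sketch: report the highest level of factorial invariance that a set
# of group solutions satisfies. Each group is a dict with its pattern of
# fixed/freed loadings plus estimated loadings, intercepts, and residual
# variances. All numbers below are illustrative.

def invariance_level(groups):
    base = groups[0]
    if any(g["pattern"] != base["pattern"] for g in groups[1:]):
        return "none"        # not even the same configuration of fixed/freed
    if any(g["loadings"] != base["loadings"] for g in groups[1:]):
        return "configural"  # same pattern, loadings differ
    if any(g["intercepts"] != base["intercepts"] for g in groups[1:]):
        return "weak"        # loadings equal, intercepts differ
    if any(g["residuals"] != base["residuals"] for g in groups[1:]):
        return "strong"      # loadings + intercepts equal, residuals differ
    return "strict"          # everything equated

boys = {"pattern": (1, 1, 1), "loadings": (1.0, 0.8, 1.2),
        "intercepts": (0.0, 0.5, 1.0), "residuals": (0.4, 0.3, 0.5)}
girls = dict(boys, intercepts=(0.0, 0.7, 1.1))  # same loadings, new intercepts

print(invariance_level([boys, girls]))  # -> "weak"
```

In practice, of course, the decision is made by comparing fitted constrained and unconstrained models rather than by inspecting point estimates, but the nesting of the four levels is exactly this.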
Selection Theorem Basis for Expecting Invariance. The loadings and intercepts of a construct's indicators can be expected to be invariant across groups under a basic tenet of selection theorem – namely, conditional independence ([8, 14]; see also [18]). In particular, if subpopulation influences (i.e., the basis for selecting the groups) and the specific components (unique factors) of the construct's manifest indicators are independent when conditioned on the common construct components, then an invariant measurement space can be specified even under extreme selection conditions. When conditional independence between the indicators' unique factors and the selection basis holds, the construct information (i.e., common variance) contains, or carries, information about subpopulation influences. This expectation is quite reasonable if one assumes that the subpopulations derive from a common population from which the subpopulations can be described as 'selected' on the basis of one or more criteria (e.g., experimental treatment, economic affluence, degree of industrialization, degree of individualism, etc.). This expectation is also reasonable if one assumes on the basis of a specific theoretical view that the constructs should exist in each assessed subpopulation and that the constructs' indicators reflect generally equivalent domain representations.

Because manifest indicators reflect both common and specific sources of variance, cross-group effects may influence not only the common construct-related variance of a set of indicators but also the specific variance of one or more of them [17]. Measurement equivalence will hold if these effects have influenced only the common-variance components of a set of construct indicators and not their unique-specific components [8, 14, 18]. If cross-group influences differentially and strongly affect the specific components of indicators, nonequivalence would emerge. Although measurement nonequivalence can be a meaningful analytic outcome, it disallows, when sufficiently strong, quantitative construct comparisons.

Identification Constraints

There are three methods of placing constraints on the model parameters in order to identify the constructs and model (see Identification). When a mean structure is used, the location must be identified in addition to the scale of the other estimated parameters.

The first method of identification and scale setting is to fix a parameter in the latent model. For example, to set the scale for the location parameters, one can fix the latent factor mean, κ, to zero (or a nonzero value). Similarly, to set the scale for the variance-covariance and loading parameters, one can fix the variances, φii, to 1.0 (or any other nonzero value). The advantage of this approach is that the estimated latent means in each subsequent group are relative mean differences from the first group. Because this first group is fixed at zero, the significance of the latent mean estimates in the subsequent groups is the significance of the difference from the first group. Fixing the latent variances to 1.0 has the advantage of providing estimates of the associations among the latent constructs in correlational metric as opposed to an arbitrary covariance metric.

The second common method is known as the marker-variable method. To set the location parameters, one element of τ is set to zero (or a nonzero value) for each construct. To set the scale, one element of λ is fixed to 1.0 (or any other nonzero value) for each construct. This method of identification is less desirable than the first and third methods because the location and scale of the latent construct are determined arbitrarily on the basis of which indicator is chosen. Reise, Widaman, and Pugh [19] recommend that, if one chooses this approach, the marker variables should be supported by previous research or selected on the basis of strong theory.

A third possible identification method is to constrain the sum of τ for each factor to zero [20]. For the scale identification, the λs for a factor should sum to p, the number of manifest variables. This method forces the mean and variance of the latent construct to be the weighted average of all of its indicators' means and loadings. The method has the advantage of providing a nonarbitrary scale that can legitimately vary across constructs and groups. It would be feasible, in fact, to compare the differences in means of two different constructs if one was theoretically motivated to do so (see [20] for more details of this method).
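The three identification strategies impose different, easily checked constraints on τ, λ, κ, and φ. A small Python sketch (the function names, tolerance, and example values are our own, for illustration only):

```python
# Hedged sketch: constraint checks corresponding to the three identification
# methods described above. Names and numbers are invented for illustration.

TOL = 1e-8

def fixed_factor_ok(kappa, phi):
    # method 1: fix the latent mean to 0 and the latent variance to 1
    return abs(kappa) < TOL and abs(phi - 1.0) < TOL

def marker_variable_ok(tau, lam):
    # method 2: fix one intercept to 0 and one loading to 1 per construct
    return abs(tau[0]) < TOL and abs(lam[0] - 1.0) < TOL

def effects_coded_ok(tau, lam):
    # method 3: intercepts sum to 0; loadings sum to p (average loading = 1)
    return abs(sum(tau)) < TOL and abs(sum(lam) - len(lam)) < TOL

tau = [-0.5, 0.2, 0.3]
lam = [0.9, 1.1, 1.0]
print(effects_coded_ok(tau, lam))    # True: sums are 0 and p = 3
print(marker_variable_ok(tau, lam))  # False: tau[0] is not fixed at 0
```

Each check encodes exactly one of the constraint sets; a fitted multiple-group model would satisfy one of them per construct (usually in the first group only, with the other groups' latent parameters left free).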
Testing for Measurement Invariance and Latent Construct Differences

In conducting cross-group tests of equality, either a statistical or a modeling rationale can be used for evaluating the tenability of the cross-group restrictions [9]. With a statistical rationale, an equivalence test is conducted as a nested-model comparison between a model in which specific parameters are constrained to equality across groups and one in which these parameters (and all others) are freely estimated in all groups. The difference in χ2 between the two models is a test of the equality restrictions (with degrees of freedom equal to the difference in their degrees of freedom). If the test is nonsignificant, then the statistical evidence indicates no cross-group differences between the equated parameters. If it is significant, then evidence of cross-group inequality exists.
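This nested-model comparison reduces to simple arithmetic once both models have been fitted. A Python sketch (the fit statistics are invented for illustration; the critical values are the standard χ2 cutoffs at α = .05):

```python
# Hedged sketch: chi-square difference (likelihood-ratio) test for nested
# invariance models. Fit statistics below are invented; the dictionary holds
# standard chi-square critical values at alpha = .05 for small delta-df.

CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def chi_square_difference(chi_constrained, df_constrained, chi_free, df_free):
    """Return (delta chi-square, delta df, reject-equality?) for nested models."""
    delta = chi_constrained - chi_free   # constrained model always fits no better
    ddf = df_constrained - df_free       # one df per equality constraint imposed
    return delta, ddf, delta > CHI2_CRIT_05[ddf]

# e.g., three loadings constrained equal across two groups vs. freely estimated:
delta, ddf, significant = chi_square_difference(112.4, 50, 104.2, 47)
print(round(delta, 1), ddf, significant)   # 8.2 3 True
```

Here the constraints are rejected (8.2 > 7.815 on 3 df), so at least one of the equated parameters differs across groups; a nonsignificant difference would instead support retaining the invariance constraints.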
The other rationale is termed a modeling rationale [9]. Here, model constraints are evaluated using practical fit indices to determine the overall adequacy of a fitted model. This rationale is used for large models with numerous constrained parameters because the χ2 statistic is an overly sensitive index of model fit, particularly for large numbers of constraints and when estimated on large sample sizes (e.g., [10]). From this viewpoint, if a model with numerous constraints evinces adequate levels of practical fit, then the set of constraints are reasonable approximations of the data.

Both rationales could be used in testing the measurement level and the latent level parameters. Because these two levels represent distinctly and qualitatively different empirical and theoretical goals, however, their corresponding rationale could also be different. Specifically, testing measurement equivalence involves evaluating the general tenability of an imposed indicator-to-construct structure via overall model fit indices. Here, various sources of model misfit (random or systematic) may be deemed substantively trivial if model fit is acceptable (i.e., if the model provides a reasonable approximation of the data; [2, 9]). The conglomerate effects of these sources of misfit, when sufficiently small, can be depicted parsimoniously as residual variances and general lack of fit, with little or no loss to theoretical meaningfulness (i.e., the trade-off between empirical accuracy and theoretical parsimony; [11]). When compared to a non-invariance model, an invariance model differs substantially in interpretability and parsimony (i.e., fewer parameter estimates than a non-invariance model), and it provides the theoretical and mathematical basis for quantitative between-group comparisons.

In contrast to the measurement level, the latent level reflects interpretable, error-free effects among constructs. Here, testing them for evidence of systematic differences (i.e., the hypothesis-testing phase of an analysis) is probably best done using a statistical rationale, because it provides a precise criterion for testing the specific theoretically driven questions about the constructs and because such substantive tests are typically narrower in scope (i.e., fewer parameters are involved). However, such tests should carefully consider issues such as error rate and effect size.

Numerous examples of the application of MACS modeling can be found in the literature; however, Little [9] offers a detailed didactic discussion of the issues and steps involved when making cross-group comparisons (including the LISREL source code used to estimate the models and a detailed figural representation). His data came from a cross-cultural study of personal agency beliefs about school performance that included 2493 boys and girls from Los Angeles, Moscow, Berlin, and Prague. Little conducted an 8-group MACS comparison of boys and girls across the four sociocultural settings. His analyses demonstrated that the constructs were measurement equivalent (i.e., had strong factorial invariance) across all groups, indicating that the translation process did not unduly influence the measurement properties of the instrument. However, the constructs themselves revealed a number of theoretically meaningful differences, including striking differences in the mean levels and the variances across the groups, but no differences in the strength of association between the two primary constructs examined.

Extensions to Longitudinal MACS Modeling

The issues related to cross-group comparisons with MACS models are directly applicable to longitudinal MACS modeling. That is, establishing the measurement equivalence (strong metric invariance) of a construct's indicators over time is just as important as establishing their equivalence across subgroups. One additional component of longitudinal MACS modeling that needs to be addressed is the fact that the specific variances of the indicators of a construct will have some degree of association across time. Here, independence of the residuals is not assumed, but rather dependence of the unique factors is expected. In this regard, the a priori factor model, when fit
comparisons. across time, would specify and estimate all possible
residual correlations of an indicator with itself across each measurement occasion.

Summary

MACS models are a powerful tool for cross-group and longitudinal comparisons. Because the means or intercepts of measured indicators are included explicitly in MACS models, they provide a very strong test of the validity of construct comparisons (i.e., measurement equivalence). Moreover, the form of the group- or time-related differences can be tested on many aspects of the constructs (i.e., means, variances, and covariances or correlations). As outlined here, the tenability of measurement equivalence (i.e., construct comparability) can be tested using model fit indices (i.e., the modeling rationale), whereas specific hypotheses about the nature of possible group differences on the constructs can be tested using precise statistical criteria. A measurement equivalent model is advantageous for three reasons: (a) it is theoretically very parsimonious and, thus, a reasonable a priori hypothesis to entertain, (b) it is empirically very parsimonious, requiring fewer estimates than a non-invariance model, and (c) it provides the mathematical and theoretical basis by which quantitative cross-group or cross-time comparisons can be conducted. In other words, strong factorial invariance indicates that constructs are fundamentally similar in each group or across time (i.e., comparable), and hypotheses about the nature of possible group- or time-related influences can be meaningfully tested on any of the constructs' basic moments across time or across each group, whether the groups are defined on the basis of culture, gender, or any other grouping criteria.

References

[1] Bollen, K.A. (1989). Structural Equations with Latent Variables, Wiley, New York.
[2] Browne, M.W. & Cudeck, R. (1993). Alternative ways of assessing model fit, in Testing Structural Equation Models, K.A. Bollen & J.S. Long, eds, Sage Publications, Newbury Park, pp. 136–162.
[3] Byrne, B.M., Shavelson, R.J. & Muthén, B. (1989). Testing for the equivalence of factor covariance and mean structures: the issue of partial measurement invariance, Psychological Bulletin 105, 456–466.
[4] Cattell, R.B. (1944). Parallel proportional profiles and other principles for determining the choice of factors by rotation, Psychometrika 9, 267–283.
[5] Cheung, G.W. & Rensvold, R.B. (1999). Testing factorial invariance across groups: a reconceptualization and proposed new method, Journal of Management 25, 1–27.
[6] Horn, J.L. & McArdle, J.J. (1992). A practical and theoretical guide to measurement invariance in aging research, Experimental Aging Research 18, 117–144.
[7] Horst, P. & Schaie, K.W. (1956). The multiple group method of factor analysis and rotation to a simple structure hypothesis, Journal of Experimental Education 24, 231–237.
[8] Lawley, D.N. & Maxwell, A.E. (1971). Factor Analysis as a Statistical Method, 2nd Edition, Butterworth, London.
[9] Little, T.D. (1997). Mean and covariance structures (MACS) analyses of cross-cultural data: practical and theoretical issues, Multivariate Behavioral Research 32, 53–76.
[10] Marsh, H.W., Balla, J.R. & McDonald, R.P. (1988). Goodness-of-fit indexes in confirmatory factor analysis: the effect of sample size, Psychological Bulletin 103, 391–410.
[11] McArdle, J.J. (1996). Current directions in structural factor analysis, Current Directions 5, 11–18.
[12] McArdle, J.J. & Cattell, R.B. (1994). Structural equation models of factorial invariance in parallel proportional profiles and oblique confactor problems, Multivariate Behavioral Research 29, 63–113.
[13] Meredith, W. (1964). Rotation to achieve factorial invariance, Psychometrika 29, 187–206.
[14] Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance, Psychometrika 58, 525–543.
[15] Millsap, R.E. (1997). Invariance in measurement and prediction: their relationship in the single-factor case, Psychological Methods 2, 248–260.
[16] Millsap, R.E. & Kwok, O. (2004). Evaluating the impact of partial factorial invariance on selection in two populations, Psychological Methods 9, 93–115.
[17] Mulaik, S.A. (1972). The Foundations of Factor Analysis, McGraw-Hill, New York.
[18] Muthén, B.O. (1989). Factor structure in groups selected on observed scores, British Journal of Mathematical and Statistical Psychology 42, 81–90.
[19] Reise, S.P., Widaman, K.F. & Pugh, R.H. (1995). Confirmatory factor analysis and item response theory: two approaches for exploring measurement invariance, Psychological Bulletin 114, 552–566.
[20] Slegers, D.W. & Little, T.D. (in press). Evaluating contextual influences using multiple-group, longitudinal mean and covariance structures (MACS) methods, in Modeling Contextual Influences in Longitudinal Data, T.D. Little, J.A. Bovaird & J. Marquis, eds, Lawrence Erlbaum, Mahwah.
[21] Sörbom, D. (1982). Structural equation models with structured means, in Systems Under Direct Observation, K.G. Jöreskog & H. Wold, eds, Praeger, New York, pp. 183–195.
[22] Steenkamp, J.B. & Baumgartner, H. (1998). Assessing measurement invariance in cross-national consumer research, Journal of Consumer Research 25, 78–90.
[23] Widaman, K.F. & Reise, S.P. (1997). Exploring the measurement invariance of psychological instruments: applications in the substance use domain, in The Science of Prevention: Methodological Advances from Alcohol and Substance Abuse Research, K.J. Bryant & M. Windle et al., eds, American Psychological Association, Washington, pp. 281–324.

Further Reading

MacCallum, R.C., Browne, M.W. & Sugawara, H.M. (1996). Power analysis and determination of sample size for covariance structure modeling, Psychological Methods 1, 130–149.
McGaw, B. & Jöreskog, K.G. (1971). Factorial invariance of ability measures in groups differing in intelligence and socioeconomic status, British Journal of Mathematical and Statistical Psychology 24, 154–168.

(See also Structural Equation Modeling: Latent Growth Curve Analysis)

TODD D. LITTLE AND DAVID W. SLEGERS

Factor Analysis: Multitrait–Multimethod

The Multitrait–Multimethod Matrix

The well-known paper by Campbell and Fiske [6] proposed the multitrait–multimethod (MTMM) matrix as a measurement design to study trait validity across assessment methods. Their central idea was that traits should be independent of and detectable by a variety of measurement methods. In particular, the magnitude of a trait should not change just because a different assessment method is used. Campbell and Fiske's main distinction was between two forms of validity, identified as convergent and discriminant. Convergent validity assures that measures of the same trait are statistically related to each other and that their error and unique components are relatively small. Discriminant validity postulates that measures of one trait are not too highly correlated with measures of different traits, and particularly not too highly correlated just because they share the same assessment method.

The variables of an MTMM matrix follow a crossed-factorial measurement design whereby each of t traits is assessed with each of m measurement methods. Table 1 gives an example of how the observed variables and their correlation coefficients are arranged in the correlation matrix, conventionally ordering traits within methods. Because the matrix is symmetric, only entries in its lower half have been marked. Particular types of correlations are marked symbolically:

V – validity diagonals, correlations of measures of the same traits assessed with different methods.
M – monomethod triangles, correlations of measures of different traits that share the same methods.
H – heterotrait–heteromethod triangles, correlations of measures of different traits obtained with different methods.
1 – main diagonal, usually containing unit entries. It is not uncommon to see the unit values replaced by reliability estimates.

Campbell and Fiske [6] proposed four qualitative criteria for evaluating convergent and discriminant validity by the MTMM matrix.

CF1 (convergent validity): '…the entries in the validity diagonal [V] should be significantly different from zero and sufficiently large…'
CF2 (discriminant validity): '…a validity diagonal value [V] should be higher than the values lying in its column and row in the heterotrait–heteromethod triangles [H].'
CF3 (discriminant validity): '…a variable correlate higher with an independent effort to measure the same trait [V] than with measures designed to get at different traits which happen to employ the same method [M].'
CF4 (discriminant validity): '…the same pattern of trait interrelationship be shown in all of the heterotrait triangles of both the monomethod [M] and heteromethod [H] blocks.'

Depending on which of the criteria were satisfied, convergent or discriminant validity of assessment instruments would then be ascertained or rejected.
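Criteria CF2 and CF3 are mechanical enough to be checked programmatically. The sketch below is not part of the original article — the helper functions and their names are our own — but it applies the two discriminant-validity comparisons to a correlation matrix stored traits-within-methods, using the Flamer attitude correlations reproduced in Table 2 (traits ADC, AM, AL; methods Likert, Thurstone, semantic differential):

```python
t, m = 3, 3  # traits per method, number of methods
# Variable order: traits within methods, (ADC, AM, AL) x (L, Th, SD),
# as in Table 2 of the article (Flamer 1978 data, sample A).
lower = [
    [],
    [-0.15],
    [0.19, -0.12],
    [0.72, -0.11, 0.19],
    [-0.01, 0.61, -0.03, -0.02],
    [0.26, -0.04, 0.34, 0.27, 0.01],
    [0.42, -0.15, 0.21, 0.40, 0.01, 0.34],
    [-0.06, 0.72, -0.05, -0.03, 0.75, -0.03, 0.00],
    [0.13, -0.12, 0.46, 0.17, -0.01, 0.44, 0.33, 0.00],
]
p = m * t
R = [[1.0] * p for _ in range(p)]          # expand to a full symmetric matrix
for i in range(p):
    for j in range(i):
        R[i][j] = R[j][i] = lower[i][j]

def idx(method, trait):
    """Position of a variable when traits are nested within methods."""
    return method * t + trait

def cf2_holds(R, t, m):
    """CF2: each validity value exceeds the H values in its row and column
    of the same heteromethod block."""
    for mi in range(m):
        for mj in range(mi + 1, m):
            for a in range(t):
                v = R[idx(mi, a)][idx(mj, a)]
                hetero = [R[idx(mi, a)][idx(mj, b)] for b in range(t) if b != a]
                hetero += [R[idx(mi, b)][idx(mj, a)] for b in range(t) if b != a]
                if any(h >= v for h in hetero):
                    return False
    return True

def cf3_holds(R, t, m):
    """CF3: each validity value exceeds the monomethod heterotrait
    correlations involving the same two variables."""
    for mi in range(m):
        for mj in range(mi + 1, m):
            for a in range(t):
                v = R[idx(mi, a)][idx(mj, a)]
                mono = [R[idx(mi, a)][idx(mi, b)] for b in range(t) if b != a]
                mono += [R[idx(mj, a)][idx(mj, b)] for b in range(t) if b != a]
                if any(r >= v for r in mono):
                    return False
    return True

print(cf2_holds(R, t, m), cf3_holds(R, t, m))  # -> True True
```

For these data both checks pass, consistent with the article's reading of Table 2. CF1 (significance and size of the validity values) and CF4 (similarity of heterotrait patterns) involve statistical tests and pattern comparisons rather than simple inequalities and are left out of this sketch.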
Table 1 Components of a 3-trait–3-method correlation matrix

                       Method 1           Method 2           Method 3
                       T1   T2   T3       T1   T2   T3       T1   T2   T3
Method 1   Trait 1     1
           Trait 2     M    1
           Trait 3     M    M    1
Method 2   Trait 1     V    H    H        1
           Trait 2     H    V    H        M    1
           Trait 3     H    H    V        M    M    1
Method 3   Trait 1     V    H    H        V    H    H        1
           Trait 2     H    V    H        H    V    H        M    1
           Trait 3     H    H    V        H    H    V        M    M    1

Confirmatory Factor Analysis Approach to MTMM

Confirmatory factor analysis (CFA) (see Factor Analysis: Confirmatory) was proposed as a model-oriented approach to MTMM matrix analysis by [1], [11], [12], and [13]. Among the several competing multivariate models for MTMM matrix analysis reviewed by [17] and [18], CFA is the only approach with an appreciable following in the literature.

Under the factor model (see Factor Analysis: Exploratory), the n × p observed data matrix X of n observations on p variables arises as a linear combination of n × k, k < p factor scores, with factor loading matrix Λ, and uncorrelated residuals E. The covariance structure of the observed data is

    Σx = Λ Φ Λ′ + Θ,                                  (1)

where Φ is the covariance matrix of the k latent factors and Θ the diagonal covariance matrix of the residuals. There are two prominent models for MTMM factor analysis: the trait-only model [11, 12] expressing the observed variables in terms of t correlated trait factors and the trait-method factor model [1, 12, 13] with t trait and m method factors.

Confirmatory Factor Analysis – Trait-only Model

The trait-only model allows one factor per trait. Trait factors are usually permitted to correlate. For the nine-variable MTMM matrix shown in Table 1, assuming the same variable order, the loading matrix Λτ has the following simple structure:

    Λτ = [ λ1,1    0      0
            0     λ2,2    0
            0      0     λ3,3
           λ4,1    0      0
            0     λ5,2    0                           (2)
            0      0     λ6,3
           λ7,1    0      0
            0     λ8,2    0
            0      0     λ9,3 ]

and the matrix of factor correlations is

    Φτ = [  1     φ21    φ31
           φ21     1     φ32                          (3)
           φ31    φ32     1  ]

All zero entries in Λτ and the diagonal entries in Φτ are fixed (predetermined) parameters; the p factor loading parameters λi,j, the t(t − 1)/2 factor correlations, and the p uniqueness coefficients in the diagonal of Θ are estimated from the data. The model is identified when three or more methods are included in the measurement design. For the special case that all intertrait correlations are nonzero, model identification requires only two methods (two-indicator rule [2]).

The worked example uses the MTMM matrix of Table 2 on the basis of data by Flamer [8], also published in [9] and [22]. The traits are Attitude toward Discipline in Children (ADC), Attitude toward Mathematics (AM), and Attitude toward the Law (AL). The methods are all paper-and-pencil, differing by response format: dichotomous Likert (L) scales, Thurstone (Th) scales, and the semantic differential (SD) technique. Distinctly larger entries in the validity diagonals (in bold face) and similar patterns of small off-diagonal correlations in the monomethod triangles and heterotrait–heteromethod blocks suggest some stability of the traits across the three methods.

Table 2 Flamer (1978) attitude data, sample A (N = 105)^a

          ADC-L   AM-L    AL-L    ADC-Th  AM-Th   AL-Th   ADC-SD  AM-SD   AL-SD
ADC-L      1.00
AM-L      −0.15    1.00
AL-L       0.19   −0.12    1.00
ADC-Th     0.72   −0.11    0.19    1.00
AM-Th     −0.01    0.61   −0.03   −0.02    1.00
AL-Th      0.26   −0.04    0.34    0.27    0.01    1.00
ADC-SD     0.42   −0.15    0.21    0.40    0.01    0.34    1.00
AM-SD     −0.06    0.72   −0.05   −0.03    0.75   −0.03    0.00    1.00
AL-SD      0.13   −0.12    0.46    0.17   −0.01    0.44    0.33    0.00    1.00

^a Reproduced with permission from materials held in the University of Minnesota Libraries.

Table 3 Trait-only factor analysis of the Flamer attitude data

Factor loading matrix Λ̂τ

                              Trait factors            Uniqueness
Method           Trait        ADC    AM     AL         estimates θ̂
Likert           ADC          0.85   0.0    0.0        0.28
                 AM           0.0    0.77   0.0        0.41
                 AL           0.0    0.0    0.61       0.63
Thurstone        ADC          0.84   0.0    0.0        0.29
                 AM           0.0    0.80   0.0        0.36
                 AL           0.0    0.0    0.62       0.62
Semantic diff.   ADC          0.50   0.0    0.0        0.75
                 AM           0.0    0.95   0.0        0.12
                 AL           0.0    0.0    0.71       0.50

Factor correlations Φ̂τ

         ADC     AM      AL
ADC      1.0
AM      −0.07    1.0
AL       0.39   −0.05    1.0

χ² = 23.28    df = 24    P = 0.503    N = 105

The parameter estimates for the trait-only factor model are shown in Table 3. The solution is admissible and its low maximum-likelihood χ²-value signals acceptable statistical model fit. No additional model terms are called for. This factor model postulates considerable generality of traits across methods, although the large uniqueness estimates of some of the attitude measures indicate low factorial validity, limiting their practical use.

Performance of the trait-only factor model with other empirical MTMM data is mixed. In Wothke's [21] reanalyses of 23 published MTMM matrices, the model estimates were inadmissible or failed to converge in 10 cases. Statistically acceptable model fit was found with only 2 of the 23 data sets.

Confirmatory Factor Analysis – Traits Plus Methods Model

Measures may not only be correlated because they reflect the same trait but also because they share
the same assessment method. Several authors [1, 12] have therefore proposed the less restrictive trait-method factor model, permitting systematic variation due to shared methods as well as shared traits. The factor loading matrix of the expanded model simply has several columns of method factor loadings appended to the right, one column for each method:

    Λτμ = [ λ1,1    0      0     λ1,4    0      0
             0     λ2,2    0     λ2,4    0      0
             0      0     λ3,3   λ3,4    0      0
            λ4,1    0      0      0     λ4,5    0
             0     λ5,2    0      0     λ5,5    0     (4)
             0      0     λ6,3    0     λ6,5    0
            λ7,1    0      0      0      0     λ7,6
             0     λ8,2    0      0      0     λ8,6
             0      0     λ9,3    0      0     λ9,6 ]

A particularly interesting form of factor correlation matrix is the block-diagonal model, which implies independence between trait and method factors:

    Φτμ = [ Φτ   0
             0   Φμ ]                                 (5)

In the structured correlation matrix (5), the submatrix Φτ contains the correlations among traits and the submatrix Φμ contains the correlations among methods.

While the block-diagonal trait-method model appeared attractive when first proposed, there has been growing evidence that its parameterization is inherently flawed. Inadmissible or unidentified model solutions are nearly universal with both simulated and empirical MTMM data [3, 15, 21]. In addition, identification problems of several aspects of the trait-method factor model have been demonstrated formally [10, 14, 16, 20]. For instance, consider factor loading structures whose nonzero entries are proportional by rows and columns:

    Λτμ^(p) = [ λ1       0       0      λ4       0       0
                0      δ2·λ2     0     δ2·λ4     0       0
                0        0     δ3·λ3   δ3·λ4     0       0
               δ4·λ1     0       0       0     δ4·λ5     0
                0      δ5·λ2     0       0     δ5·λ5     0      (6)
                0        0     δ6·λ3     0     δ6·λ5     0
               δ7·λ1     0       0       0       0     δ7·λ6
                0      δ8·λ2     0       0       0     δ8·λ6
                0        0     δ9·λ3     0       0     δ9·λ6 ]

where the δi are mt − 1 nonzero scale parameters for the rows of Λτμ^(p), with δ1 = 1 fixed and all other δi estimated, and the λk are a set of m + t nonzero scale parameters for the columns of Λτμ^(p), with all λk estimated. Grayson and Marsh [10] proved algebraically that factor models with loading matrix (6) and factor correlation structure (5) are unidentified no matter how many traits and methods are analyzed. Even if Λτμ^(p) is further constrained by setting all (row) scale parameters to unity (δi = 1), the factor model will remain unidentified [20].

Currently, identification conditions for the general form of the trait-method model are not completely known. Identification and admissibility problems appear to be the rule with empirical MTMM data, although an identified, admissible, and fitting solution has been reported for one particular dataset [2]. However, in order to be identified, the estimated factor loadings must necessarily be different from the proportional structure in (6) – a difference that would complicate the evaluation of trait validity. Estimation itself can also be difficult: the usually iterative estimation process often approaches an intermediate solution of the form (6) and cannot continue because the matrix of second derivatives of the fit function becomes rank deficient at that point. This is a serious practical problem because condition (6) is so general that it 'slices' the identified solution space into many disjoint subregions, so that the model estimates can become extremely sensitive to the choice of start values. Kenny and Kashy [14] noted that '…estimation problems increase as the factor loadings become increasingly similar.'

There are several alternative modeling approaches that the interested reader may want to consult: (a) CFA with alternative factor correlation structures [19]; (b) CFA with correlated uniqueness
coefficients [4, 14, 15]; (c) covariance components analysis [22]; and (d) the direct product model [5]. Practical implementation issues for several of these models are reviewed in [14] and [22].

Conclusion

About thirty years of experience with confirmatory factor analysis of MTMM data have proven less than satisfactory. Trait-only factor analysis suffers from poor fit to most MTMM data, while the block-diagonal trait-method model is usually troubled by identification, convergence, or admissibility problems, or by combinations thereof. In the presence of method effects, there is no generally accepted multivariate model to yield summative measures of convergent and discriminant validity. In the absence of such a model, '(t)here remains the basic eyeball analysis as in the original article [6]. It is not always dependable; but it is cheap' [7].

References

[1] Althauser, R.P. & Heberlein, T.A. (1970). Validity and the multitrait-multimethod matrix, in Sociological Methodology 1970, E.F. Borgatta, ed., Jossey-Bass, San Francisco.
[2] Bollen, K.A. (1989). Structural Equations with Latent Variables, Wiley, New York.
[3] Brannick, M.T. & Spector, P.E. (1990). Estimation problems in the block-diagonal model of the multitrait-multimethod matrix, Applied Psychological Measurement 14(4), 325–339.
[4] Browne, M.W. (1980). Factor analysis for multiple batteries by maximum likelihood, British Journal of Mathematical and Statistical Psychology 33, 184–199.
[5] Browne, M.W. (1984). The decomposition of multitrait-multimethod matrices, British Journal of Mathematical and Statistical Psychology 37, 1–21.
[6] Campbell, D.T. & Fiske, D.W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix, Psychological Bulletin 56, 81–105.
[7] Fiske, D.W. (1995). Reprise, new themes and steps forward, in Personality, Research Methods and Theory. A Festschrift Honoring Donald W. Fiske, P.E. Shrout & S.T. Fiske, eds, Lawrence Erlbaum Associates, Hillsdale.
[8] Flamer, S. (1978). The effects of number of scale alternatives and number of items on the multitrait-multimethod matrix validity of Likert scales, Unpublished Dissertation, University of Minnesota.
[9] Flamer, S. (1983). Assessment of the multitrait-multimethod matrix validity of Likert scales via confirmatory factor analysis, Multivariate Behavioral Research 18, 275–308.
[10] Grayson, D. & Marsh, H.W. (1994). Identification with deficient rank loading matrices in confirmatory factor analysis multitrait-multimethod models, Psychometrika 59, 121–134.
[11] Jöreskog, K.G. (1966). Testing a simple structure hypothesis in factor analysis, Psychometrika 31, 165–178.
[12] Jöreskog, K.G. (1971). Statistical analysis of sets of congeneric tests, Psychometrika 36(2), 109–133.
[13] Jöreskog, K.G. (1978). Structural analysis of covariance and correlation matrices, Psychometrika 43(4), 443–477.
[14] Kenny, D.A. & Kashy, D.A. (1992). Analysis of the multitrait-multimethod matrix by confirmatory factor analysis, Psychological Bulletin 112(1), 165–172.
[15] Marsh, H.W. & Bailey, M. (1991). Confirmatory factor analysis of multitrait-multimethod data: a comparison of alternative models, Applied Psychological Measurement 15(1), 47–70.
[16] Millsap, R.E. (1992). Sufficient conditions for rotational uniqueness in the additive MTMM model, British Journal of Mathematical and Statistical Psychology 45, 125–138.
[17] Millsap, R.E. (1995). The statistical analysis of method effects in multitrait-multimethod data: a review, in Personality, Research Methods and Theory. A Festschrift Honoring Donald W. Fiske, P.E. Shrout & S.T. Fiske, eds, Lawrence Erlbaum Associates, Hillsdale.
[18] Schmitt, N. & Stults, D.M. (1986). Methodology review: analysis of multitrait-multimethod matrices, Applied Psychological Measurement 10, 1–22.
[19] Widaman, K.F. (1985). Hierarchically nested covariance structure models for multitrait-multimethod data, Applied Psychological Measurement 9, 1–26.
[20] Wothke, W. (1984). The estimation of trait and method components in multitrait-multimethod measurement, Unpublished doctoral dissertation, University of Chicago.
[21] Wothke, W. (1987). Multivariate linear models of the multitrait-multimethod matrix, in Paper Presented at the Annual Meeting of the American Educational Research Association, Washington (paper available through ERIC).
[22] Wothke, W. (1996). Models for multitrait-multimethod matrix analysis, in Advanced Structural Equation Modeling. Issues and Techniques, G.A. Marcoulides & R.E. Schumacker, eds, Lawrence Erlbaum Associates, Mahwah.

(See also History of Path Analysis; Residuals in Structural Equation, Factor Analysis, and Path
Analysis Models; Structural Equation Modeling: Overview)

WERNER WOTHKE

Factor Analysis of Personality Measures

The technique of factor analysis was developed about 100 years ago by Charles Spearman [12], who applied the technique to the observed correlations among measures of mental abilities. Briefly, factor analysis is a statistical technique that derives aggregates of variables (typically called 'factors') from the observed relations (typically indexed by correlations) among those variables. The result of Spearman's analysis was the identification of a single factor that seemed to underlie observed scores on a large number of measures of human mental ability. Subsequently, further applications of factor analysis to the mental ability domain indicated that the one-factor model was too simple. In particular, Louis Thurstone suggested seven primary mental ability factors rather than the single factor claimed by Spearman. Interestingly, Thurstone's 1933 American Psychological Association presidential address, Vectors of the Mind [13], in which he presented this alternate view of the structure of mental abilities, focused as much or more on the application of factor analysis to personality data, and this represents the first presentation of a major factor analysis of personality measures. Thurstone, however, later dropped this line of investigation to focus on mental abilities.

Numerous other personality scientists soon followed Thurstone's initial lead and began using factor analytic techniques to identify, evaluate, and refine the major dimensions of personality. The personality theories and measures of Raymond Cattell and Hans Eysenck represent two major early applications, and more recently the factor analyses of Jack Digman, Lewis Goldberg, Paul Costa, and Jeff McCrae, and a host of others have laid the foundation for a widely used, though not by any means universally accepted, five-factor structure of personality often called the Big Five. Today there are a variety of structural models of personality that are based on factor analyses. A number of these are summarized in Table 1.

Table 1 Illustration of the major structural models of personality based on factor analysis

Number of factors   Representative labels and structure                                    Associated theorists
2                   Love–Hate; Dominance–Submission (interpersonal circle)                 Leary, Wiggins
2                   Alpha (A, C, N), Beta (E, O) (higher-order factors of the Big Five)    Digman
3                   Extroversion, Neuroticism, Psychoticism                                Eysenck
5                   E, A, C, N, O (Big Five; Five-Factor Model)                            Digman, Goldberg, Costa & McCrae
7                   E, A, C, N, O + Positive and Negative Evaluation                       Tellegen, Waller, Benet
16                  16 PF; 16 primary factors further grouped into five more global        Cattell
                    factors^a

Note: E = Extroversion, A = Agreeableness, C = Conscientiousness, N = Neuroticism, O = Openness.
^a A complete list of the labels for the 16 PF can be found in [3].

This table is intended to be illustrative rather than comprehensive or definitive. There are other systems, other variants on the systems shown here, and other scientists who might be listed. More comprehensive tables can be found in [4, 9, 10]. Although Table 1 represents only a portion of the factor analytic models of personality, it is sufficient to raise the fundamental issue that will be the focus of this contribution: Why does the same general analytic strategy (factor analysis) result in structural models of personality that are so diverse? In addressing this issue, I will consider the variety of factor analytic procedures that result from different subjective decisions about the conduct of a factor analysis. A more thorough discussion of these issues can be found in [6] and [7]. These decisions include
(a) the sample of observed items to be factored, (b) the method of factor extraction, (c) the criteria for deciding the number of factors to be extracted, (d) the type of factor rotation if any, and (e) the naming of the factors. Readers who believe that science is objective and who believe that the diversity of results obtained from factor analyses is prima facie evidence that the technique is unscientific will find the tone of this contribution decidedly unsympathetic to that view.

What Personality Variables are to be Included in a Factor Analysis? The first decision in any scientific study is what to study. This is an inherently subjective decision and, at its broadest level, is the reason that some of us become, say, chemists and others of us become, say, psychologists. In the more specific case of studying the structure of human personality, we must also begin with a decision of which types of variables are relevant to personality. Factor analysis, just as any other statistical technique, can only operate on the data that are presented. In the case of personality structure, for example, a factor representing Extroversion will only be found if items that indicate Extroversion are present in the data: no Extroversion items, no Extroversion factor. An historical example of the influence of this decision on the study of personality structure was Cattell's elimination of a measure of intelligence from early versions of the domains he factored. This marked the point at which a powerful individual difference variable, intelligence, disappeared from the study of personality. More recently, the decision on the part of Big Five theorists to exclude terms that are purely evaluative, such as 'nice' or 'evil', from the personality domain meant that no factors representing general evaluation were included in the structure of personality. Adding such items to the domain to be factored resulted, not surprisingly, in a model called the Big Seven, as shown in Table 1.

Cattell's decision to exclude intelligence items or Big Five theorists' decisions to exclude purely evaluative items represent different views of what is meant by personality. It would be difficult to identify those views as correct or incorrect in any objective sense, but recognizing these different views can help clarify the differences in Table 1 and in other factor analyses of personality domains. The point is that understanding the results of a factor analysis of personality measures must begin with a careful evaluation of the measures that are included (and excluded) and the rationale behind such inclusion or exclusion. Probably the most prominent rationale for selecting variables for a factor analysis in personality has been the 'lexical hypothesis'. This hypothesis roughly states that all of the most important ways that people differ from each other in personality will become encoded in the natural language as single-word person-descriptive terms such as 'friendly' or 'dependable'. On the basis of this hypothesis, one selects words from a list of all possible terms that describe people culled from a dictionary and then uses those words as stimuli for which people are asked to describe themselves or others on those terms. Cattell used such a list that was compiled by Allport and Odbert [1] in his analyses, and more recently, the Big Five was based on a similar and more recent list compiled by Warren Norman [11].

How (and Why) Should Personality Factors be Extracted? The basic data used in a factor analysis of personality items are responses (typically ratings of descriptiveness of the item about one's self or possibly another person) from N subjects to k personality items; for example, 'talks to strangers', 'is punctual', or 'relaxed'. These N × k responses are then converted into a k × k correlation (or less often a covariance) matrix, and the k × k matrix is then factor analyzed to yield a factor matrix showing the 'loadings' of the k variables on the m factors. Specifically, factor analysis operates on the common (shared) variance of the variables as measured by their intercorrelations. The amount of variance a variable shares with the other variables is called the variable's communality. Factor analysis proceeds by extracting factors iteratively such that the first factor accounts for as much of the total common variance across the items (called the factor's eigenvalue) as possible, the second factor accounts for as much of the remaining common variance as possible, and so on. Figure 1 shows a heuristic factor matrix. The elements of the matrix are the estimated correlations between each variable and each factor. These correlations are called 'loadings'. To the right of the matrix is a column containing the final communality estimates (usually symbolized as h²). These are simply the sum of the squared loadings for each variable across the m factors and thus
630 Factor Analysis of Personality Measures

Factors goal of factor analysis has been attributed to Cyril


Items 1 2 3.... j... m Burt, among others.
Communalities
A second reason for extracting factors is perhaps
h 21 more profound. This reason is to discover the under-
1 r11 ....... lying factors that ‘cause’ individuals to respond to
h 22
2 r21 .......... the items in certain ways. This view of factors as
h 23
3 . (loadings) ‘causes’, is, of course, more controversial because of
. the ‘correlations do not imply causation’ rule. How-
. .
. ever, this rule should not blind us to the fact that
. . relations among variables are caused by something;
h 2i
i . rij it is just that we do not necessarily know what that
.
. cause is on the basis of correlations alone.
. The difference between this descriptive and
.
h 2k explanatory view of factor analysis is the foundation
k of the two major approaches to factor extraction;
Eigenvalues L1 L 2 L 3 . . . . L j . . . Lm principal component analysis (PC) and principle
axis factor analysis (PF) (see Factor Analysis:
Figure 1 Heuristic representation of a factor matrix Exploratory). Figure 2 illustrates the difference
between these two approaches using structural model
diagrams. As can be seen in Figures 2(a) and 2(b) the
represent the total common variance in each vari- difference between PC and PF is the direction of the
able that is accounted for by the factors. At the arrows in the diagram. Conceptually, the direction of
bottom of the matrix are the eigenvalues of the fac- the arrows indicates the descriptive emphasis of PC
tors. These are the sum of the squared loadings for analysis and the causal emphasis of PF analysis. In
each factor across the k variables and thus represent PC analysis the items together serve to ‘define’ the
the total amount of variance accounted for by each component and it serves to summarize the items that
factor. define it. In PF analysis the underlying factor serves
The point at which the correlation matrix is as a cause of why people respond consistently to a set
converted to a factor matrix represents the next of items. The similarity of the items, the responses
crucial subjective decision point in the factor analysis. to which are caused by the factor, is used to label
Although the communalities of the variables can the cause, which could be biological, conditioned,
be calculated from the final factor matrix, these or cognitive.
communalities must be initially estimated for the Because correlations are bidirectional, the direc-
factor analysis to proceed and the investigator must tion of the arrows in a path diagram is statisti-
decide how those initial communality values are to cally arbitrary and both diagrams will be equally
be estimated. The vast majority of factor analyses supported by the same correlation matrix. How-
are based on one of two possible decisions about ever, there is a crucial additional difference between
these estimates. In principle, these decisions reflect Figures 2(a) and 2(b) that does lead to different
the investigators belief about the nature of factors results between PC and PF. This difference is shown
and the goal of the factor analysis. One reason for in Figure 1(c), which adds error terms to the item
extracting factors from a matrix of correlations is responses when they are viewed as ‘caused’ by the
simply as an aid to interpreting the complex patterns factor in PF analysis. The addition of error in PF
implied by those correlations. The importance of analysis recognizes that the response to items is not
this aid can be readily appreciated by anyone who perfectly predicted by the underlying factor. That is,
has tried to discern how groups of variables are there is some ‘uniqueness’ or ‘error’ in individuals’
similar and different on the basis of the 4950 unique responses in addition to the common influence of the
correlations available from a set of 100 items, or, factor. In PC analysis, no error is assigned to the item
less ambitiously, among the 435 unique correlations responses as they are not viewed as caused by the fac-
available from a set of 30 items or measures. This tor. It is at this point that the statistical consequence
‘orderly simplification’ of a correlation matrix as a of these views becomes apparent. In PC analysis, the
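The bookkeeping relating loadings, communalities, and eigenvalues described above can be sketched numerically; the loading matrix below is invented purely for illustration:

```python
import numpy as np

# A small invented loading matrix: k = 4 items, m = 2 factors.
A = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.6],
    [0.2, 0.5],
])

# Communality h^2 of each item: sum of squared loadings across its row.
communalities = (A ** 2).sum(axis=1)

# Eigenvalue of each factor: sum of squared loadings down its column.
eigenvalues = (A ** 2).sum(axis=0)

print(np.round(communalities, 2))  # [0.65 0.53 0.37 0.29]
print(np.round(eigenvalues, 2))    # [1.18 0.66]
```

Summing either vector gives the same total common variance (here 1.84), which is the bookkeeping identity behind the row and column margins of a factor matrix such as Figure 1.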
In PC analysis, the initial communality estimates for the items are all fixed at 1.0, as all of the variance is assumed to be common. In PF analysis, the initial communality estimates are generally less than 1.0 (see next paragraph) to reflect that some of the item variance is unique. The consequence of recognizing that some of the variability in people's responses to items is unique to that item is to reduce the amount of variance that can be 'explained' or attributed to the factors. Thus, PF analysis typically results in factors that account for less variance than PC analysis.

[Figure 2  Path diagrams illustrating the difference between PC and PF: (a) Items 1-3 with arrows pointing to a Component; (b) a Factor with arrows pointing to Items 1-3; (c) a Factor with arrows pointing to Items 1-3, each item also receiving an arrow from an Error term]

There is also a computational consequence of choosing PF over PC analysis. PF analysis is much more difficult from a computational standpoint than PC because one needs to estimate the error or uniqueness of the items before the analysis can proceed. This is typically done by regressing each item on all the others in the set to be factored, and using the resulting R² as the estimate of the item's common variance (communality) and 1 − R² as the item's unique variance. Multiple linear regression requires inverting a correlation matrix, a time-consuming, tedious, and error-prone task. If one were to factor, say, 100 items one would have to invert 100 matrices. This task would simply be beyond the skills and temperament of most investigators, and as a consequence the vast majority of historical factor analyses used the PC approach, which requires no matrix inversion. Today we have computers, which, among other things, are designed for time-consuming, tedious, and error-prone tasks, so the computational advantage of PC is no longer of much relevance. However, the conservative nature of science, which tends to foster continuity of methods and measures, has resulted in the vast majority of factor analyses of personality items continuing to be based on PC, regardless of the (often unstated) view of the investigator about the nature of factors or the goal of the analysis.

Within the domain of personality it is often the case that similar factor structures emerge from the same data regardless of whether PC or PF is employed, probably because the initial regression-based communality estimates for personality variables in PF tend to approach the 1.0 estimates used by PC analysis. Thus, the decision to use PC or PF on personality data may be of little practical consequence. However, the implied view of factors as descriptive or causal by PC or PF, respectively, still has important implications for the study of personality. The causal view of factors must be a disciplined view to avoid circularity. For example, it is easy to 'explain' that a person has responded in an agreeable manner because they are high on the agreeableness factor. Without further specifying, and independently testing, the source of that factor (e.g., genetic, cognitive, environmental), the causal assertion is circular ('He is agreeable because he is agreeable') and untestable.
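The R²-based starting values just described can be obtained from a single inversion of the item correlation matrix rather than one regression per item; the shortcut SMC_i = 1 − 1/(R⁻¹)_ii is mathematically equivalent to regressing item i on all the others. The correlation matrix below is invented for illustration:

```python
import numpy as np

# An invented correlation matrix for three personality items.
R = np.array([
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])

# Squared multiple correlation of each item with all the others,
# from one matrix inversion: SMC_i = 1 - 1 / (R^-1)_ii.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

initial_communalities = smc     # PF starting values, all below 1.0
uniquenesses = 1.0 - smc        # the 1 - R^2 unique variance

print(np.round(initial_communalities, 3))  # [0.262 0.319 0.173]
```

By comparison, PC analysis would simply start every diagonal entry at 1.0.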
The PC view avoids this problem by simply using the factor descriptively without implying a cause.

However, the 'merely' descriptive view of factors is scientifically less powerful, and two of the earliest and most influential factor analytic models of personality, those of Cattell [3] and Eysenck [5], both viewed factors as causal. Eysenck based his three factors on a strong biological theory that included the role of individual differences in brain structure and systems of biological activation and inhibition as the basis of personality, and Eysenck used factor analysis to evaluate his theory by seeing if factors consistent with his theory could be derived from personality ratings. Cattell, on the other hand, did not base his 16 factors on an explicit theory but instead viewed factor analysis as a tool for empirically discovering the important and replicable factors that caused personality. The widely accepted contemporary model of five factors also has both descriptive and causal interpretations. The term 'Five-Factor Model' used by Costa and McCrae among others emphasizes a causal interpretation, whereas the term 'Big Five' used by Goldberg among others emphasizes the descriptive view.

How Many Factors are There?

Probably the most difficult issue in factor analysis is deciding on the number of factors. Within the domain of personality, we have seen that the number of factors extracted is influenced crucially by the decision of how many and what type of items to factor. However, another reason that different investigators may report different numbers of factors is that there is no single criterion for deciding how many factors are needed or useful to account for the common variance among a set of items. The problem is that as one extracts more factors one necessarily accounts for more common variance. Indeed, in PC analysis one can extract as many factors as there are items in the data set, and in doing so one can account for all the variance. Thus, the decision about the number of factors to extract is ultimately based on the balance between the statistical goal of accounting for variance and the substantive goal of simplifying a set of data into a smaller number of meaningful descriptive components or underlying causal factors. The term 'meaningful' is the source of the inherent subjectivity in this decision.

The most common objective criterion that has been used to decide on the number of factors is Kaiser's 'eigenvalues greater than 1.0' rule. The logic of this rule is that, at a minimum, a factor should account for more common variance than any single item. On the basis of this logic, it is clear that this rule only applies to PC analysis, where the common variance of an item is set at 1.0, and indeed Kaiser proposed this rule for PC analysis. Nonetheless, one often sees this rule misapplied in PF analyses. Although there is a statistical objectivity about this rule, in practice its application often results in factors that are specific to only one or two items and/or factors that are substantively difficult to interpret or name.

One recent development that addresses the number-of-factors problem is the use of factor analyses based on maximum-likelihood criteria. In principle, this provides a statistical test of the 'significance' of the amount of additional variance accounted for by each additional factor. One then keeps extracting factors until the additional variance accounted for by each factor does not significantly increase over the variance accounted for by the previous factor. However, it is still often the case that factors that account for 'significantly' more variance do not include large numbers of items and/or are not particularly meaningful. Thus, the tension between statistical significance and substantive significance remains, and ultimately the number of factors reported reflects a subjective balance between these two criteria.

How Should the Factors be Arranged (Rotated)?

Yet another source of subjectivity in factor analysis results because the initial extraction of factors does not provide a statistically unique set of factors. Statistically, factors are extracted to account for as much variance as possible. However, once a set of factors is extracted, it turns out that there are many different combinations of factors and item loadings that will account for exactly the same amount of total variance of each item. From a statistical standpoint, as long as a group of factors accounts for the same amount of total variance, there is no basis for choosing one group over another. Thus, investigators are free to select whatever arrangements of factors and item loadings they wish. The term that is used to describe the rearrangement of factors among a set of personality items is 'rotation', which comes from the geometric view of factors as vectors moving (rotating) through a space defined by items.
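The geometric idea of rotation can be made concrete with a minimal sketch of varimax, one common orthogonal rotation criterion, using the standard SVD-based algorithm; the loading values are invented, and real software adds refinements (such as Kaiser row normalization) omitted here:

```python
import numpy as np

def varimax(A, n_iter=100, tol=1e-8):
    """Rotate loading matrix A toward simple structure using the
    standard SVD-based varimax algorithm (minimal sketch)."""
    k, m = A.shape
    T = np.eye(m)          # accumulated orthogonal rotation
    crit_old = 0.0
    for _ in range(n_iter):
        L = A @ T
        # Column-wise centering of cubed loadings drives each item
        # toward one high loading and near-zero loadings elsewhere.
        u, s, vt = np.linalg.svd(
            A.T @ (L**3 - L * (L**2).sum(axis=0) / k))
        T = u @ vt
        if s.sum() - crit_old < tol:
            break
        crit_old = s.sum()
    return A @ T

# Start from an invented simple-structure pattern, smear it with a
# 30-degree rotation, and let varimax recover the clean pattern.
B = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.8], [0.0, 0.6]])
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
A = B @ np.array([[c, -s], [s, c]])    # loadings with no simple structure
L = varimax(A)
print(np.round(L, 2))  # close to B, up to column order and sign
```

Because the rotation is orthogonal, each item's communality is unchanged; only how the common variance is divided among the factors is rearranged.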
There is a generally accepted criterion, called simple structure, that is used to decide how to rotate factors. An ideal simple structure is one where each item correlates 1.0 with one factor and 0.00 with the other factors. In actual data this ideal will not be realized, but the goal is to come as close to this ideal as possible for as many items as possible. The rationale for simple structure is simplicity, and this rationale holds for both PC and PF analyses. For PC analysis, simple structure results in a description of the relations among the variables that is easy to interpret because there is little item overlap between factors. For PF analysis the rationale is that scientific explanations should be as simple as possible. However, there are several different statistical strategies that can be used to approximate simple structure, and the decision about which strategy to use is again a subjective one.

The major distinction between strategies to achieve simple structure is oblique versus orthogonal rotation of factors. As the labels imply, oblique rotation allows the factors to be correlated with each other, whereas orthogonal rotation constrains the factors to be uncorrelated. Most factor analyses use an orthogonal rotation based on a specific strategy called 'Varimax'. Although other orthogonal strategies exist (e.g., Equimax, Quartimax), the differences among these in terms of rotational results are usually slight and one seldom encounters these alternative orthogonal approaches. Orthogonal approaches to the rotation of personality factors probably dominate in the literature because of their computational simplicity relative to oblique rotations. However, the issue of computational simplicity is no longer of much concern with the computer power available today, so the continued preference for orthogonal rotations may, as with the preference for PC over PF, be historically rather than scientifically based.

Oblique rotations of personality factors have some distinct advantages over orthogonal rotations. In general these advantages result because oblique rotations are less constrained than orthogonal ones. That is, oblique rotations allow the factors to be correlated with each other, whereas orthogonal rotations force the factors to be uncorrelated. Thus, in the pursuit of simple structure, oblique rotations will be more successful than orthogonal ones because oblique rotations have more flexibility. Oblique rotations can, in some sense, transfer the complexity of items that are not simple (load on more than one factor) to the factors by making the relations among the factors more complex. Perhaps the best way to appreciate the advantage of oblique rotations over orthogonal ones is to note that if the simple structure factors are orthogonal or nearly so, oblique rotations will leave the factors essentially uncorrelated and oblique rotations will become identical (or nearly so) to orthogonal ones. A second advantage of oblique rotations of personality factors is that they allow the investigator to explore higher-order factor models, that is, factors of factors. Two of the systems shown in Table 1, Digman's Alpha and Beta factors and Cattell's five Global Factors for the 16PF, represent such higher-order factor solutions.

Simple structure is generally accepted as a goal of factor rotation and is the basis for all the specific rotational strategies available in standard factor analytic software. However, within the field of personality there has been some theoretical recognition that simple structure may not be the most appropriate way to conceptualize personality. The best historical example of this view is the interpersonal circle of Leary, Wiggins, and others [14]. A circular arrangement of items around two orthogonal axes means that some items must load equally highly on both factors, which is not simple structure. In the interpersonal circle, for example, an item such as 'trusting' has both loving and submissive aspects, and so would load complexly on both the Love-Hate and Dominance-Submission factors. Likewise, 'cruel' has both dominant and hateful aspects. More recently, a complex version of the Big Five called the AB5C structure, which explicitly recognizes that many personality items are blends of more than one factor, was introduced in [8]. In using factor analysis to identify or evaluate circumplex models, or any personality models that explicitly view personality items as blends of factors, simple structure will not be an appropriate criterion for arranging the factors.

What Should the Factors be Called?

In previous sections the importance of the meaningfulness and interpretation of personality factors as a basis for evaluating the acceptability of a factor solution has been emphasized. But, of course, the interpretation and naming of factors is another source of the inherent subjectivity in the process. This subjectivity is no different from the subjectivity of all of science when it comes to interpreting the results – but the
fact that different, but reasonable, scientists will often disagree about the meaning or implications of the same data certainly applies to the results of a factor analysis.

The interpretation problem in factor analysis is perhaps particularly pronounced because factors, personality or otherwise, have no objective reality. Indeed, factors do not result from a factor analysis; rather, the result is a matrix of factor loadings such as the one shown in Figure 1. On the basis of the content of the items and their loadings in the matrix, the 'nature' of the factor is inferred. That is, we know a factor through the variables with which it is correlated. It is because factors do not exist and are not, therefore, directly observed that we often call them 'latent' factors. Latent factors have the same properties as other latent variables such as 'depression', 'intelligence', or 'time'. None of these variables is observed or measured directly, but rather they are measured via observations that are correlated with them, such as loss of appetite, vocabulary knowledge, or the movement of the hand on a watch. A second complication is that in the factor analyses described here there are no statistical tests of whether a particular loading is significant; instead, different crude standards such as loadings over 0.50 or over 0.30 have been used to decide if an item is 'on' a factor. Different investigators can, of course, decide on different standards, with the result that factors are identified by different items, even in the same analysis.

Thus, it should come as no surprise that different investigators will call the 'same' factor by a different name. Within the domain of the interpersonal circle, for example, the factors have been called Love and Hate, or Affiliation and Dominance. Within the Big Five, various labels have been applied to each factor, as shown in Table 2. Although there is a degree of similarity among the labels in each column, there are clear interpretive differences as well. The implication of this point is that one should not simply look at the name or interpretation an investigator applies to a factor, but also at the factor-loading matrix, so that the basis for the interpretation can be evaluated. It is not uncommon to see the same label applied to somewhat different patterns of loadings, or for different labels to be applied to the same pattern of loadings.

Some investigators, perhaps out of recognition of the difficulty and subjectivity of factor naming, have eschewed applying labels at all and instead refer to factors by number. Thus, in the literature on the Big Five, one may see reference to Factors I, II, III, IV, and V. Of course, those investigators know the 'names' that are typically applied to the numbered factors, and these are shown in Table 2. Another approach has been to name the factors with uncommon labels to try to separate the abstract scientific meaning of a factor from its everyday interpretation. In particular, Cattell used this approach with the 16PF, where he applied labels to his factors such as 'Parmia', 'Premsia', 'Autia', and so on. Of course, translations of these labels into their everyday equivalents soon appeared (Parmia is 'Social Boldness', Premsia is 'Sensitivity', and Autia is 'Abstractedness'), but the point can be appreciated, even if not generally followed.

Table 2  Some of the different names applied to the Big Five personality factors in different systems

  Factor I          Factor II         Factor III          Factor IV             Factor V
  Extroversion      Agreeableness     Conscientiousness   Emotional Stability   Openness to Experience
  Surgency          Femininity        High Ego Control    Neuroticism (r)       Intellect
  Power             Love              Prudence            Adjustment            Imagination
  Low Ego Control   Likeability       Work Orientation    Anxiety (r)           Rebelliousness
  Sociability       Impulsivity (r)

  Note: r = label is reversed relative to the other labels.

A Note on Confirmatory Factor Analysis

This presentation of factor analysis of personality measures has focused almost exclusively on approaches to factor analysis that are often referred to as 'exploratory' (see Factor Analysis: Exploratory). This label is somewhat misleading as it implies that investigators use factor analysis just to 'see what happens'. Most investigators are not quite so clueless, and the factor analysis of personality items usually
takes place under circumstances where the investigator has some specific ideas about what items should be included in the set to be factored, and hypotheses about how many factors there are, what items will be located on the same factor, and even what the factors will be called. In this sense, the analysis has some 'confirmatory' components.

In fact, the term 'exploratory' refers to the fact that in these analyses a correlation matrix is submitted for analysis and the analysis generates the optimal factors and loadings empirically for that sample of data, without regard to the investigator's ideas and expectations. Thus the investigator's beliefs do not guide the analyses and so they are not directly tested. Indeed, there is no hypothesis-testing framework within exploratory factor analysis, and this is why most decisions associated with this approach to factor analysis are subjective.

The term confirmatory factor analysis (CFA) (see Factor Analysis: Confirmatory) is generally reserved for a particular approach that is based on structural equation modeling as represented in programs such as LISREL, EQS, or AMOS (see Structural Equation Modeling: Software). CFA is explicitly guided by the investigator's beliefs and hypotheses. Specifically, the investigator indicates the number of factors, designates the variables that load on each factor, and indicates if the factors are correlated (oblique) or uncorrelated (orthogonal). The analyses then proceed to generate a hypothetical correlation matrix based on the investigator's specifications, and this matrix is compared to the empirical correlation matrix based on the items. Chi-square goodness-of-fit tests, and various modifications of these as 'fit indices', are available for evaluating how close the hypothesized matrix is to the observed matrix. In addition, the individual components of the model, such as the loadings of individual variables on specific factors and proposed correlations among the factors, can be statistically tested. Finally, the increment in the goodness-of-fit of more complex models relative to simpler ones can be tested to see if the greater complexity is warranted.

Clearly, when investigators have some idea about what type of factor structure should emerge from their analysis, and investigators nearly always have such an idea, CFA would seem to be the method of choice. However, the application of CFA to personality data has been slow to develop and is not widely used. The primary reason for this is that CFA does not often work well with personality data. Specifically, even when item sets that seem to have a well-established structure, such as those contained in the Big Five Inventory (BFI-44 [2]) or the Eysenck Personality Questionnaire (EPQ [5]), are subjected to CFA based on that structure, the fit of the established structure to the observed correlations is generally below the minimum standards of acceptable fit.

The obvious interpretation of this finding is that factor analyses of personality measures do not lead to structures that adequately summarize the complex relations among those measures. This interpretation is undoubtedly correct. What is not correct is the further conclusion that structures such as those represented by five or three or seven factors, or circumplexes, or whatever, are therefore useless or misleading characterizations of personality.

Factor analyses of personality measures are intended to simplify the complex observed relations among personality measures. Thus, it is not surprising that factor analytic solutions do not summarize all the variation and covariation among personality measures. The results of CFA indicate that factor analytic models of personality simply do not capture all the complexity in human personality, but this is not their purpose. To adequately represent this complexity, items would need to load on a number of factors (no simple structure); factors would need to correlate with each other (oblique rotations); and many small factors representing only one or two items might be required. Moreover, such structures might well be specific to a given sample and would not generalize. The cost of 'correctly' modeling personality would be the loss of the simplicity that the factor analysis was initially designed to provide. Certainly the factor analysis of personality measures is an undertaking where Whitehead's dictum, 'Seek simplicity but distrust it', applies.

CFA can still be a powerful tool for evaluating the relations among personality measures. The point of this discussion is simply that CFA should not be used to decide if a particular factor analytic model is 'correct', as the model almost certainly is not correct because it is too simple. Rather, CFA should be used to compare models of personality by asking if adding more factors or correlations among factors significantly improves the fit of a model. That is, when the question is changed from 'Is the model correct?' to 'Which model is significantly better?', CFA can be a most appropriate tool.
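The implied-versus-observed comparison at the heart of CFA can be sketched in a few lines. This is only an illustration of the algebra, not a substitute for the estimation performed by programs such as LISREL, EQS, or AMOS, and every number below is invented:

```python
import numpy as np

# Hypothesized one-factor model for three standardized items:
# loadings (Lambda), factor variance (Phi), unique variances (Theta).
lam = np.array([[0.8], [0.7], [0.6]])
phi = np.array([[1.0]])
theta = np.diag(1.0 - (lam**2).ravel())   # keeps the implied diagonal at 1.0

# Correlation matrix implied by the specification:
# Sigma = Lambda Phi Lambda' + Theta
sigma = lam @ phi @ lam.T + theta

# An invented 'observed' correlation matrix to compare against.
observed = np.array([
    [1.00, 0.58, 0.46],
    [0.58, 1.00, 0.40],
    [0.46, 0.40, 1.00],
])

residuals = observed - sigma      # where the specified model misses the data
print(np.round(sigma, 2))
print(np.round(residuals, 2))
```

Fit statistics summarize such residuals over the whole matrix, and competing specifications (more factors, correlated factors) are compared by how much they reduce the discrepancy.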
Finally, it is important to note that CFA also does not address the decision in factor analysis of personality measures that probably has the most crucial impact on the results. This is the initial decision about what variables are to be included in the analysis.

Summary

Factor analysis of personality measures has resulted in a wide variety of possible structures of human personality. This variety results because personality psychologists have different theories about what constitutes the domain of personality and they have different views about the goals of factor analysis. In addition, different reasonable criteria exist for determining the number of factors and for rotating and naming those factors. Thus, the evaluation of any factor analysis must include not simply the end result, but all the decisions that were made on the way to achieving that result. The existence of many reasonable factor models of human personality suggests that people are diverse not only in their personality, but in how they perceive personality.

References

[1] Allport, G.W. & Odbert, H.S. (1936). Trait names: a psycho-lexical study, Psychological Monographs 47, 211.
[2] Benet-Martinez, V. & John, O.P. (1998). Los Cincos Grandes across cultures and ethnic groups: multitrait multimethod analyses of the Big Five in Spanish and English, Journal of Personality and Social Psychology 75, 729–750.
[3] Cattell, R.B. (1995). Personality structure and the new fifth edition of the 16PF, Educational & Psychological Measurement 55(6), 926–937.
[4] Digman, J.M. (1997). Higher-order factors of the Big Five, Journal of Personality and Social Psychology 73, 1246–1256.
[5] Eysenck, H.J. & Eysenck, S.B.G. (1991). Manual of the Eysenck Personality Scales (EPQ Adults), Hodder and Stoughton, London.
[6] Fabrigar, L.R., Wegener, D.T., MacCallum, R.C. & Strahan, E.J. (1999). Evaluating the use of exploratory factor analysis in psychological research, Psychological Methods 3, 272–299.
[7] Goldberg, L.R. & Velicer, W.F. (in press). Principles of exploratory factor analysis, in Differentiating Normal and Abnormal Personality, 2nd Edition, S. Strack, ed., Springer, New York.
[8] Hofstee, W.K.B., de Raad, B. & Goldberg, L.R. (1992). Integration of the Big Five and circumplex approaches to trait structure, Journal of Personality and Social Psychology 63, 146–163.
[9] John, O.P. (1990). The "Big Five" factor taxonomy: dimensions of personality in the natural language and in questionnaires, in Handbook of Personality: Theory and Research, L.A. Pervin, ed., Guilford Press, New York, pp. 66–100.
[10] McCrae, R.R. & Costa Jr, P.T. (1996). Toward a new generation of personality theories: theoretical contexts for the five-factor model, in The Five-Factor Model of Personality: Theoretical Perspectives, J.S. Wiggins, ed., Guilford Press, New York, pp. 51–87.
[11] Norman, W. (1963). Toward an adequate taxonomy of personality attributes, Journal of Abnormal and Social Psychology 66(6), 574–583.
[12] Spearman, C. (1904). "General intelligence" objectively determined and measured, American Journal of Psychology 15, 201–293.
[13] Thurstone, L.L. (1934). The vectors of mind, Psychological Review 41, 1–32.
[14] Wiggins, J.S. (1982). Circumplex models of interpersonal behavior in clinical psychology, in Handbook of Research Methods in Clinical Psychology, P.C. Kendall & J.N. Butcher, eds, Wiley, New York, pp. 183–221.

WILLIAM F. CHAPLIN

Factor Loadings see History of Factor Analysis: A Statistical Perspective

Factor Score Estimation

Introduction

Factor analysis is concerned with two problems. The first problem is concerned with determining a factor pattern matrix based on either the principal components analysis or the common factor model. Factor loadings in the pattern matrix indicate how
Factor Score Estimation 637

highly the observed variables are related to the principal components or the common factors, both of which can be thought of as latent variables. The second problem is concerned with estimating latent variable scores for each case. Latent variable scores, commonly referred to as factor scores, are useful and often necessary. Consider that the number of observed variables may be large; obtaining the (typically fewer) factor scores facilitates subsequent analyses. To cite another example, factor scores – at least when derived under the common factor model – are likely to be more reliable than observed scores. Related to the idea of higher reliability is the belief that a factor score is a pure, univocal (homogeneous) measure of a latent variable, while an observed score may be ambiguous because we do not know what combination of latent variables may be represented by that observed score.

A number of methods have been proposed for obtaining factor scores. When these methods are applied to factors derived under the principal components model, the scores are 'exact', exact in the sense that a unique set of factor scores can be found for the principal components that are supposed to denote their true population values. It does not matter whether scores are derived for all n components, or only for some m (m ≤ n) of them. In contrast, factor scores are not uniquely determinable for the factors of the common factor model: an infinite number of sets of factor scores are possible for any one set of common factors and thus, their true values must be estimated. Factor score indeterminacy arises from the indeterminacy of the common factor model itself.

Principal Component Scores

Factor scores computed for a set of principal components – henceforth to be referred to as principal component scores – are straightforwardly calculated. As noted above, this is true no matter how many of the n possible principal components are retained. In order to describe principal component scores, we begin with a matrix equation for a single case in which only m of the principal components have been retained:

Z_{n×1} = A_{n×m} F_{m×1},   (1)

where Z is an n × 1 column vector of n standardized observed variables, A is an n × m pattern matrix of the loadings of n observed variables on the m principal components, and F is an m × 1 column vector of m principal component scores. The principal component scores are given by premultiplying both sides of (1) by A′ and solving for F:

A′_{m×n} Z_{n×1} = A′_{m×n} A_{n×m} F_{m×1}

F_{m×1} = (A′A)^{−1} A′_{m×n} Z_{n×1}
        = D^{−1}_{m×m} A′_{m×n} Z_{n×1},   (2)

where D = A′A is an m × m diagonal matrix of the m eigenvalues. Equation (2) implies that a principal component score is constructed in the following way. First, each of the n loadings (symbolized by a) from the principal component's column in pattern matrix A is divided by the eigenvalue (λ) of the principal component (i.e., a/λ). Second, a/λ is multiplied by the score of the observed variable z associated with the loading (i.e., a/λ × z). And then third, the n a/λ × z terms are summed, constructing the principal component f from their linear combination:

f_k = Σ_{j=1}^{n} (a_{jk}/λ_k) × z_j,   (3)

where a_{jk} is the loading for the jth observed variable (j = 1, 2, . . . , n) on the kth principal component (k = 1, 2, . . . , m; m ≤ n), and λ_k is the eigenvalue of the kth principal component.

For example, assume we have retained three principal components from eight observed variables:

A = [ .71 .11 .16
      .82 .15 .20
      .93 .19 .24
      .10 .77 .28
      .22 .88 .32
      .24 .21 .36
      .28 .23 .71
      .39 .32 .77 ]   (4)

and the eight observed scores for a person are

z = [ .10
      .22
      −.19
      −.25
      .09
      .23
      .15
      −.19 ].   (5)
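The arithmetic in (2) and (3) is easy to verify numerically. The following NumPy sketch is an illustration added here, not part of the original example; the eigenvalues 2.39, 1.64, and 1.53 are the values reported for the three retained components in this example.

```python
import numpy as np

# Loadings of the eight observed variables on the three retained
# components, from (4), and one person's standardized scores, from (5).
A = np.array([
    [0.71, 0.11, 0.16],
    [0.82, 0.15, 0.20],
    [0.93, 0.19, 0.24],
    [0.10, 0.77, 0.28],
    [0.22, 0.88, 0.32],
    [0.24, 0.21, 0.36],
    [0.28, 0.23, 0.71],
    [0.39, 0.32, 0.77],
])
z = np.array([0.10, 0.22, -0.19, -0.25, 0.09, 0.23, 0.15, -0.19])
lam = np.array([2.39, 1.64, 1.53])  # eigenvalues of the retained components

# Equation (3): f_k = sum_j (a_jk / lambda_k) * z_j,
# i.e., F = D^{-1} A' z in the matrix form of (2).
f = (A / lam).T @ z
print(np.round(f, 2))  # -> [ 0.04 -0.05  0.01]
```

Each entry of f is the eigenvalue-scaled inner product in (3), reproducing the hand calculation for all three components at once.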
The three eigenvalues are, respectively, 2.39, 1.64, and 1.53. The first, second, and third principal component scores are calculated as

f_1 = .04 = (.71/2.39) × .10 + (.82/2.39) × .22 + (.93/2.39) × (−.19)
          + (.10/2.39) × (−.25) + (.22/2.39) × .09 + (.24/2.39) × .23
          + (.28/2.39) × .15 + (.39/2.39) × (−.19)

f_2 = −.05 = (.11/1.64) × .10 + (.15/1.64) × .22 + (.19/1.64) × (−.19)
           + (.77/1.64) × (−.25) + (.88/1.64) × .09 + (.21/1.64) × .23
           + (.23/1.64) × .15 + (.32/1.64) × (−.19)

f_3 = .01 = (.16/1.53) × .10 + (.20/1.53) × .22 + (.24/1.53) × (−.19)
          + (.28/1.53) × (−.25) + (.32/1.53) × .09 + (.36/1.53) × .23
          + (.71/1.53) × .15 + (.77/1.53) × (−.19).   (6)

Component scores can be computed using either the unrotated pattern matrix or the rotated pattern matrix; both are of equivalent statistical validity. The scores obtained using the rotated matrix are simply rescaled transformations of scores obtained using the unrotated matrix.

Common Factor Scores

Why are Common Factor Scores Indeterminate?

Scores from the common factor model are estimated because it is mathematically impossible to determine a unique set of them – an infinite number of such sets exist. This results from the underidentification of the common factor model (see Factor Analysis: Exploratory; Identification). An underidentified model is a model for which not enough information in the data is present to estimate all of the model's unknown parameters. In the principal components model, identification is achieved by imposing two restrictions: (a) the first component accounts for the maximum amount of variance possible, the second the next, and so on; and (b) the components are uncorrelated with each other. Imposing these two restrictions, the unknown parameters in the principal components model – the n × m factor loadings – can be uniquely estimated. Thus, the principal components model is identified: the n × m factor loadings to be estimated are less than or equal in number to the n(n + 1)/2 correlations available to estimate them.

In contrast, even with the imposition of the two restrictions, the common factor model remains underidentified for the following reason. The model postulates not only the existence of m common factors underlying n variables, requiring the specification of n × m factor loadings (as in the principal components model); it also postulates the existence of n specific factors, resulting in a model with (n × m) + n parameters to be estimated, greater in number than the n(n + 1)/2 available to estimate them. As a result, the n × m factor loadings have an infinite number of possible values. Logically then, the factor scores would be expected to have an infinite number of possible values.

Methods for Estimating Common Factor Scores

Estimation by Regression

Thomson [9] was the first to suggest that ordinary least-squares regression methods (see Least Squares Estimation) can be used to obtain estimates of factor scores. The information required to find the regression weights for the factors on the observed variables – the correlations among the observed variables and the correlation between the factors and observed variables – is available from the factor analysis. The least-squares criterion is to minimize the
sum of the squared differences between predicted and true factor scores, which is analogous to the generic least-squares criterion of minimizing the sum of the squared differences between predicted and observed scores.

We express the linear regression of any factor f on the observed variables z in matrix form for one case as

F̂_{1×m} = Z_{1×n} B_{n×m},   (7)

where B is a matrix of weights for the regression of the m factors on the n observed variables, and F̂ is a row vector of m estimated factor scores.

When the common factors are orthogonal, we use the following matrix equation to obtain B:

B_{n×m} = R^{−1}_{n×n} A_{n×m},   (8)

where R is the matrix of correlations between the n observed variables. If the common factors are nonorthogonal, we also require Φ, the correlation matrix among the m factors:

B_{n×m} = R^{−1}_{n×n} A_{n×m} Φ_{m×m}.   (9)

To illustrate the regression method, we perform a common factor analysis on a set of six observed variables, retaining two nonorthogonal factors. We use data from three cases and define

Z = [ 0.00 −0.57 −1.15 0.00 −1.12 −0.87
      −1.00 −0.57 0.57 0.00 0.80 −0.21
      1.00 1.15 0.57 0.00 0.32 1.09 ],

R = [ 1.00
      0.31 1.00
      0.48 0.54 1.00
      0.69 0.31 0.45 1.00
      0.34 0.30 0.26 0.41 1.00
      0.37 0.41 0.57 0.39 0.38 1.00 ],

A = [ 0.65 0.19
      0.59 0.08
      0.59 0.17
      0.12 0.72
      0.14 0.70
      0.24 0.34 ],

Φ = [ 1.00
      0.45 1.00 ].   (10)

On the basis of (9), the regression weights are

B = R^{−1} A Φ

  = [ 2.05
      −0.01 1.48
      −0.41 −0.64 1.99
      −1.18 −0.01 −0.20 2.11
      −0.09 −0.20 0.16 −0.35 1.32
      −0.02 −0.14 −0.70 −0.12 −0.33 1.64 ] × [ 0.65 0.19
                                               0.59 0.08
                                               0.59 0.17
                                               0.12 0.72
                                               0.14 0.70
                                               0.24 0.34 ] × [ 1.00
                                                               0.45 1.00 ]

  = [ 0.65 −0.19
      0.33 −0.01
      0.33 0.10
      −0.29 0.42
      0.22 0.54
      −0.14 0.01 ].   (11)

Then, based on (7), the two-factor scores for the three cases are

F̂ = [ 0.00 −0.57 −1.15 0.00 −1.12 −0.87
      −1.00 −0.57 0.57 0.00 0.80 −0.21
      1.00 1.15 0.57 0.00 0.32 1.09 ] × [ 0.65 −0.19
                                          0.33 −0.01
                                          0.33 0.10
                                          −0.29 0.42
                                          0.22 0.54
                                          −0.14 0.01 ]
  = [ −0.69 −0.72
      −0.44 0.68
      1.14 0.04 ].   (12)

The regression estimates have the following properties:

1. The multiple correlation between each factor score and the common factors is maximized.
2. Each factor score estimate f̂ is uncorrelated with its own residual f − f̂ and the residual of every other estimate.
3. Even when the common factors are orthogonal, the estimates f̂ are mutually correlated.
4. Even when the common factors are orthogonal, the estimate f̂ of one factor can be correlated with any of the other m − 1 common factors.
5. Factor scores obtained through regression are biased estimates of their population values.

Depending on one's point of view as to what properties factor scores should have, properties 3 and 4 may or may not be problematic. If one believes that the univocality of a factor is diminished when it is correlated with another factor, then estimating factor scores by regression is considered a significantly flawed procedure. According to this view, univocality is compromised when variance in the factor is in part due to the influence of other factors. According to an alternative view, if in the population factors are correlated, then their estimated scores should be as well.

Minimizing Unique Factors

Bartlett [2] proposed a method of factor score estimation in which the least-squares criterion is to minimize the difference between the predicted and unique factor scores instead of minimizing the difference between the predicted and 'true' factor scores that is used in regression estimation. Unlike the regression method, Bartlett's method produces a factor score estimate that only correlates with its own factor and not with any other factor. However, correlations among the estimated scores of different factors still remain. In addition, Bartlett estimates, again unlike regression estimates, are unbiased. This is because they are maximum likelihood estimates of the population factor scores: it is assumed that the unique factor scores are multivariate normally distributed (see Catalogue of Probability Density Functions).

Bartlett's method specifies that for one case

Z_{1×n} = F̂_{1×m} A′_{m×n} + V̂_{1×n} U_{n×n},   (13)

where Z is a row vector of n observed variable scores, F̂ is a row vector of m estimated factor scores, A is the factor pattern matrix of loadings for the n observed variables on the m factors, V̂ is a row vector of n estimated unique scores, and U is a diagonal matrix of the standard deviations of the n unique factors. The common factor analysis provides both A and U.

Recalling that F̂_{1×m} = Z_{1×n} B_{n×m} (7), we obtain the factor score weight matrix B to estimate the factor scores in F̂:

B_{n×m} = U^{−2}_{n×n} A_{n×m} (A′_{m×n} U^{−2}_{n×n} A_{n×m})^{−1}, and therefore

F̂_{1×m} = Z_{1×n} B_{n×m} = Z_{1×n} U^{−2}_{n×n} A_{n×m} (A′_{m×n} U^{−2}_{n×n} A_{n×m})^{−1}.   (14)

U^{−2} is the inverse of a diagonal matrix of the variances of the n unique factor scores. Using the results from (14), we can obtain the unique factor scores with

V̂_{1×n} = Z_{1×n} U^{−1}_{n×n} − F̂_{1×m} A′_{m×n} U^{−1}_{n×n}.   (15)

For our example of Bartlett's method, we define U² as a diagonal matrix of the n unique factor variances,

U² = [ 0.41 0.00 0.00 0.00 0.00 0.00
       0.00 0.61 0.00 0.00 0.00 0.00
       0.00 0.00 0.43 0.00 0.00 0.00
       0.00 0.00 0.00 0.39 0.00 0.00
       0.00 0.00 0.00 0.00 0.75 0.00
       0.00 0.00 0.00 0.00 0.00 0.54 ],   (16)

and U as a diagonal matrix of the n unique factor standard deviations,

U = [ 0.64 0.00 0.00 0.00 0.00 0.00
      0.00 0.78 0.00 0.00 0.00 0.00
      0.00 0.00 0.66 0.00 0.00 0.00
      0.00 0.00 0.00 0.62 0.00 0.00
      0.00 0.00 0.00 0.00 0.87 0.00
      0.00 0.00 0.00 0.00 0.00 0.73 ].   (17)
Following (14), the factor score weight matrix is obtained by

B = U^{−2} A (A′ U^{−2} A)^{−1}

  = [ 2.44 0.00 0.00 0.00 0.00 0.00
      0.00 1.64 0.00 0.00 0.00 0.00
      0.00 0.00 2.38 0.00 0.00 0.00
      0.00 0.00 0.00 2.63 0.00 0.00
      0.00 0.00 0.00 0.00 1.33 0.00
      0.00 0.00 0.00 0.00 0.00 1.85 ] × A × (A′ U^{−2} A)^{−1}

  = [ 1.59 0.46
      0.97 0.13
      1.40 0.40
      0.31 1.89
      0.19 0.93
      0.44 0.63 ] × [ 0.48 −0.23
                      −0.23 0.52 ]

  = [ 0.66 −0.12
      0.44 −0.15
      0.59 −0.11
      −0.28 0.92
      −0.12 0.45
      0.07 0.23 ].   (18)

With B defined as above, the estimated common factor scores for the three cases are

F̂ = [ 0.00 −0.57 −1.15 0.00 −1.12 −0.87
      −1.00 −0.57 0.57 0.00 0.80 −0.21
      1.00 1.15 0.57 0.00 0.32 1.09 ] × [ 0.66 −0.12
                                          0.44 −0.15
                                          0.59 −0.11
                                          −0.28 0.92
                                          −0.12 0.45
                                          0.07 0.23 ]

  = [ −0.85 −0.49
      −0.69 0.45
      1.54 0.04 ],   (19)

and from (15), the unique factor scores for the three cases are
V̂ = Z U^{−1} − F̂ A′ U^{−1}

  = [ 0.00 −0.73 −1.74 0.00 −1.29 −1.19
      −1.56 −0.73 0.86 0.00 0.92 −0.29
      1.56 1.47 0.86 0.00 0.36 1.49 ] − [ −0.41 −0.42 −0.39 −0.28 −0.40 −0.27
                                          −0.23 −0.28 −0.22 0.15 0.19 −0.01
                                          0.64 0.71 0.60 0.13 0.21 0.28 ]

  = [ 0.41 −0.31 −1.39 0.28 −0.89 −0.92
      −1.33 −0.44 1.09 −0.15 0.73 −0.27
      0.92 0.76 0.28 −0.13 0.16 1.21 ],   (20)

where U^{−1} = diag(1.56, 1.28, 1.51, 1.61, 1.15, 1.37).

Bartlett factor score estimates can always be distinguished from regression factor score estimates by examining the variance of the factor scores. While regression estimates have variances ≤ 1, Bartlett estimates have variances ≥ 1 [7]. This can be explained as follows. The regression estimation procedure divides the factor score f into two uncorrelated parts, the regression part f̂ and the residual part f − f̂. Thus,

f = f̂ + e,   (21)

where

e = f − f̂.   (22)

Since the e are assumed multivariate normally distributed, the f̂ can further be written as the sum of the factor score f and a residual ε:

f̂ = f + ε,   (23)

where

ε = f̂ − f.   (24)

The result is that the variance of f̂ is the sum of the unit variance of f and the variance of ε, the error about the true value.

Uncorrelated Scores Minimizing Unique Factors

Anderson and Rubin [1] revised Bartlett's method so that each factor score estimate is uncorrelated with the other m − 1 factors and the estimates are not correlated with each other. These two properties result
from the following matrix equation for the factor score estimates:

F̂ = Z_{1×n} U^{−2}_{n×n} A_{n×m} (A′_{m×n} U^{−2}_{n×n} R_{n×n} U^{−2}_{n×n} A_{n×m})^{−1/2},   (25)

where R is, as before, the matrix of correlations among the n observed variables. While resembling (14), (25) is substantially more complex to solve: the term A′_{m×n} U^{−2}_{n×n} R_{n×n} U^{−2}_{n×n} A_{n×m} is raised to a power of −1/2. This power indicates that the inversion of the symmetric square root of the matrix product is required. The symmetric square root of a matrix can be found for any positive definite symmetric matrix. To illustrate, we define G as an n × n positive semidefinite symmetric matrix. The symmetric square root of G, G^{1/2}, must meet the following condition:

G_{n×n} = G^{1/2}_{n×n} G^{1/2}_{n×n}.   (26)

Perhaps the most straightforward method of obtaining G^{1/2} is to obtain the spectral decomposition of G, such that G can be reproduced by a function of its eigenvalues (λ) and eigenvectors (x):

G_{n×n} = X_{n×n} D_{n×n} X′_{n×n},   (27)

where X is an n × n matrix of eigenvectors and D is an n × n diagonal matrix of eigenvalues. It follows then that

G^{1/2}_{n×n} = X_{n×n} D^{1/2}_{n×n} X′_{n×n}.   (28)

If we set G = A′_{m×n} U^{−2}_{n×n} R_{n×n} U^{−2}_{n×n} A_{n×m}, (25) can now be rewritten as

F̂ = Z_{1×n} U^{−2}_{n×n} A_{n×m} G^{−1/2}.   (29)

To illustrate the Anderson and Rubin method, we specify

X = [ 0.74 −0.67
      0.67 0.74 ]   (30)

and

D = [ 23.73 0.00
      0.00 1.64 ].   (31)

Then, for A′_{m×n} U^{−2}_{n×n} R_{n×n} U^{−2}_{n×n} A_{n×m}, the spectral decomposition is

G = [ 0.74 −0.67
      0.67 0.74 ] [ 23.73 0.00
                    0.00 1.64 ] [ 0.74 0.67
                                  −0.67 0.74 ]   (32)

and therefore

G^{1/2} = [ 0.74 −0.67
            0.67 0.74 ] [ 23.73 0.00
                          0.00 1.64 ]^{1/2} [ 0.74 0.67
                                              −0.67 0.74 ]

        = [ 0.74 −0.67
            0.67 0.74 ] [ 4.87 0.00
                          0.00 1.28 ] [ 0.74 0.67
                                        −0.67 0.74 ]

        = [ 3.26 1.79
            1.79 2.89 ].   (33)

Then, from (29),

F̂ = Z U^{−2} A G^{−1/2},

where G^{−1/2} = (G^{1/2})^{−1} = [ 0.46 −0.29
                                    −0.29 0.52 ]

is the inverse of the matrix obtained in (33), and U^{−2} A is the same 6 × 2 matrix computed in (18). Thus

F̂ = [ 0.00 −0.57 −1.15 0.00 −1.12 −0.87
      −1.00 −0.57 0.57 0.00 0.80 −0.21
      1.00 1.15 0.57 0.00 0.32 1.09 ]
  × [ 0.60 −0.21
      0.41 −0.21
      0.53 −0.19
      −0.40 0.90
      −0.18 0.43
      0.03 0.20 ] = [ −0.67 −0.32
                      −0.68 0.53
                      1.35 −0.20 ].   (34)

The unique factor scores are computed as in the Bartlett method (15). Substituting the results of the Anderson and Rubin method into (15) yields

V̂ = [ 0.32 −0.40 −1.44 0.19 −1.01 −0.99
      −1.34 −0.45 1.07 −0.18 0.68 −0.30
      1.03 0.86 0.36 −0.01 0.33 1.31 ].   (35)

Conclusion

For convenience, we reproduce the factor scores estimated by the regression, Bartlett, and Anderson and Rubin methods:

Regression
[ −0.69 −0.72
  −0.44 0.68
  1.14 0.04 ]

Bartlett
[ −0.85 −0.49
  −0.69 0.45
  1.54 0.04 ]

Anderson–Rubin
[ −0.67 −0.32
  −0.68 0.53
  1.35 −0.20 ]

The similarity of the factor score estimates computed by the three methods is striking. This is in part surprising: empirical studies have found that although the factor score estimates obtained from different methods correlate substantially, they often have very different values [8]. So it would seem that the important issue is not which of the three estimation methods should be used, but whether any of them should be used at all due to factor score indeterminacy, implying that only principal component scores should be obtained. Readers seeking additional information on this area of controversy specifically, and on factor scores generally, should consult, in addition to those references already cited, [3–6, 10].

References

[1] Anderson, T.W. & Rubin, H. (1956). Statistical inference in factor analysis, Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability 5, 111–150.
[2] Bartlett, M.S. (1937). The statistical conception of mental factors, British Journal of Psychology 28, 97–104.
[3] Cattell, R.B. (1978). The Scientific Use of Factor Analysis in the Behavioral and Life Sciences, Plenum Press, New York.
[4] Gorsuch, R.L. (1983). Factor Analysis, 2nd Edition, Lawrence Erlbaum Associates, Hillsdale.
[5] Harman, H.H. (1976). Modern Factor Analysis, 3rd Edition revised, The University of Chicago Press, Chicago.
[6] Lawley, D.N. & Maxwell, A.E. (1971). Factor Analysis as a Statistical Method, American Elsevier Publishing, New York.
[7] McDonald, R.P. (1985). Factor Analysis and Related Methods, Lawrence Erlbaum Associates, Hillsdale.
[8] Mulaik, S.A. (1972). The Foundations of Factor Analysis, McGraw-Hill, New York.
[9] Thomson, G.H. (1939). The Factorial Analysis of Human Ability, Houghton Mifflin, Boston.
[10] Yates, A. (1987). Multivariate Exploratory Data Analysis: A Perspective on Exploratory Factor Analysis, State University of New York Press, Albany.

SCOTT L. HERSHBERGER

Factorial Designs

A factorial design is one in which two or more treatments (or classifications for variables such as sex) are investigated simultaneously and, in the ideal case, all possible combinations of each treatment (or classification) occur together in the design. In a one-way design, we might ask whether two different drugs lead to a significant difference in average adjustment scores. In a factorial design, we might ask whether the two drugs differ in effectiveness and whether the effectiveness of the drugs changes when each drug is applied under the administration of four different dosage levels. The first independent variable is the type of drug (with two levels) and the second independent variable is the dosage (with four levels). This design would be a 2 × 4 factorial ANOVA (see Analysis of Variance) design.

A major advantage of a factorial design is that we can evaluate two independent variables in a single experiment, as illustrated in Table 1. Another