
# Exploratory Factor Analysis

## Learning Objectives

1. Differentiate factor analysis techniques from other multivariate techniques.
2. Distinguish between exploratory and confirmatory uses of factor analytic techniques.
3. Understand the seven stages of applying factor analysis.
4. Distinguish between R and Q factor analysis.
5. Identify the differences between component analysis and common factor analysis models.
6. Describe how to determine the number of factors to extract.
7. Explain the concept of rotation of factors.
8. Describe how to name a factor.
9. Explain the additional uses of factor analysis.
10. State the major limitations of factor analytic techniques.

Factor analysis is particularly suitable for analyzing the patterns of complex, multidimensional relationships encountered by researchers.

Factor analysis can be used to examine the underlying patterns or relationships for a large number of variables and to determine whether the information can be condensed or summarized in a smaller set of factors or components.

Factor analysis is a technique whose primary purpose is to define the underlying structure among the variables in the analysis.

It provides the tools for analyzing the structure of the interrelationships (correlations) among a large number of variables by defining sets of variables that are highly interrelated, known as factors. These groups of variables (factors), which are by definition highly intercorrelated, are assumed to represent dimensions within the data.

Factor analysis reduces a large number of overlapping variables to a smaller set of factors.

Factor analytic techniques can achieve their purposes from ...

The starting point in factor analysis is the research problem. The general purpose of factor analytic techniques is to find a way to condense (summarize) the information contained in a number of original variables into a smaller set of new, composite dimensions or variates (factors) with a minimum loss of information.

In meeting its objectives, factor analysis is keyed to four issues: specifying the unit of analysis, achieving data summarization and/or data reduction, variable selection, and using factor analysis results with other multivariate techniques.

Factor analysis is actually a more general model in that it can identify the structure of relationships among either variables or respondents by examining either the correlations between the variables or the correlations between the respondents.

If the objective of the research were to summarize the characteristics, factor analysis would be applied to a correlation matrix of the variables.

Factor analysis also may be applied to a correlation matrix of the individual respondents based on their characteristics. Referred to as Q factor analysis, this method combines or condenses large numbers of people into distinctly different groups within a larger population. Most researchers utilize some type of cluster analysis to group individual respondents.
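
As a rough illustration of the two starting points, the sketch below computes both correlation matrices from the same data with NumPy. The data values and the names `X`, `R_matrix`, and `Q_matrix` are made up for this example and do not come from the slides.

```python
import numpy as np

# Hypothetical data: 5 respondents (rows) measured on 3 variables (columns).
X = np.array([
    [7.0, 6.0, 2.0],
    [5.0, 5.0, 3.0],
    [6.0, 7.0, 1.0],
    [2.0, 3.0, 6.0],
    [3.0, 1.0, 7.0],
])

# Correlations between the variables (columns): one row/column per variable.
R_matrix = np.corrcoef(X, rowvar=False)

# Correlations between the respondents (rows): one row/column per respondent.
# This is the input that Q factor analysis would work from.
Q_matrix = np.corrcoef(X)

print(R_matrix.shape)  # 3 x 3, variables by variables
print(Q_matrix.shape)  # 5 x 5, respondents by respondents
```

The only difference between the two inputs is which axis of the data matrix is treated as the set of objects being correlated.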

## Data Summarization and Data Reduction

DATA SUMMARIZATION The fundamental concept in data summarization is the definition of structure.

DATA REDUCTION Factor analysis can also be used to achieve data reduction by (1) identifying representative variables from a much larger set of variables for use in subsequent multivariate analyses, or (2) creating an entirely new set of variables, much smaller in number, to partially or completely replace the original set of variables.

The design of a factor analysis involves three basic decisions:

(1) calculation of the input data (a correlation matrix) to meet the specified objectives of grouping variables or respondents;
(2) design of the study in terms of number of variables, measurement properties of variables, and the types of allowable variables; and
(3) the sample size necessary, both in absolute terms and as a function of the number of variables in the analysis.

The first decision focuses on calculating the input data for the analysis. There are two forms of factor analysis: R-type versus Q-type factor analysis. Both types of factor analysis utilize a correlation matrix as the basic data input. With R-type factor analysis, the researcher would use a traditional correlation matrix (correlations between the variables). With Q-type factor analysis, the correlation matrix is computed between the individual respondents, and the result is a factor matrix that identifies similar individuals.

From the results of a Q factor analysis, we could identify groups or clusters of individuals that demonstrate a similar pattern on the variables included in the analysis.

## How does Q-type factor analysis differ from cluster analysis?

Q-type factor analysis is based on the intercorrelations between the respondents, whereas cluster analysis forms groupings based on a distance-based similarity measure between the respondents' scores on the variables being analyzed.

To illustrate this difference, consider Figure 3, which contains the scores of four respondents over three different variables. A Q-type factor analysis of these four respondents would yield two groups with similar covariance structures, consisting of respondents A and C versus B and D. In contrast, the clustering approach would be sensitive to the actual distances among the respondents' scores and would lead to a grouping of the closest pairs. Thus, with a cluster analysis approach, respondents A and B would be placed in one group and C and D in the other group.
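
The contrast can be reproduced with a small sketch. The scores below are hypothetical values chosen to show the same pattern; they are not the data from Figure 3, and the helper names `pearson` and `best_match` are introduced only for this example.

```python
import math

# Hypothetical scores for four respondents on three variables
# (illustrative values only, not the data from Figure 3).
scores = {
    "A": [6, 7, 9],
    "B": [9, 7, 6],
    "C": [1, 2, 4],
    "D": [4, 2, 1],
}

def pearson(x, y):
    """Pearson correlation between two equal-length score profiles."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dev_x, dev_y))
    denominator = math.sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))
    return numerator / denominator

def best_match(name, measure, pick):
    """Return the other respondent that maximizes or minimizes `measure`."""
    others = [other for other in scores if other != name]
    return pick(others, key=lambda other: measure(scores[name], scores[other]))

# Q-type view: pair each respondent with the most highly correlated profile.
most_correlated = {name: best_match(name, pearson, max) for name in scores}

# Cluster view: pair each respondent with the nearest profile (Euclidean distance).
nearest = {name: best_match(name, math.dist, min) for name in scores}

print(most_correlated)
print(nearest)
```

With these values the correlation-based pairing links A with C and B with D (their profiles have the same shape at different levels), while the distance-based pairing links A with B and C with D (their scores are close in magnitude).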

## Variable Selection and Measurement Issues

Two specific questions must be answered at this point: (1) What type of variables can be used in factor analysis? and (2) How many variables should be included? In terms of the types of variables included, the primary requirement is that a correlation value can be calculated among all variables. Metric variables are easily measured by several types of correlations. Nonmetric variables are more problematic because they cannot use the same types of correlation measures used by metric variables. The most prudent approach is to avoid nonmetric variables. If a nonmetric variable must be included, one approach is to define dummy variables (coded 0-1) to represent categories of nonmetric variables. If all the variables are dummy variables, then specialized forms of factor analysis, such as Boolean factor analysis, are more appropriate [5]. The researcher should also attempt to minimize the number of variables included but still maintain a reasonable number of variables per factor. If a study is being designed to assess a proposed structure, ...

Factor analysis is designed to find patterns among groups of variables, and it is of little use in identifying factors composed of only a single variable. Finally, when designing a study to be factor analyzed, the researcher should, if possible, identify several key variables that closely reflect the hypothesized underlying factors.

## Sample Size

The researcher generally would not factor analyze a sample of fewer than 50 observations, and preferably the sample size should be 100 or larger. As a general rule, the minimum is to have at least five times as many observations as the number of variables to be analyzed, and a more acceptable sample size would have a 10:1 ratio.

The researcher should always try to obtain the highest cases-per-variable ratio to minimize the chances of overfitting the data. To do so, the researcher may employ the most parsimonious set of variables, guided by conceptual and practical considerations, and then obtain an adequate sample size for the number of variables examined.
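
These rules of thumb are easy to check before collecting or analyzing data. The helper below is a minimal sketch; the function name `sample_size_check` and the returned keys are hypothetical, and the thresholds are simply the ones stated above.

```python
def sample_size_check(n_obs, n_vars):
    """Compare a planned sample against the rules of thumb stated above."""
    ratio = n_obs / n_vars
    return {
        "cases_per_variable": ratio,
        "at_least_50_observations": n_obs >= 50,
        "at_least_100_observations": n_obs >= 100,
        "meets_5_to_1_minimum": ratio >= 5,
        "meets_10_to_1_preferred": ratio >= 10,
    }

# Example: 120 respondents and 15 variables give 8 cases per variable.
report = sample_size_check(n_obs=120, n_vars=15)
for rule, value in report.items():
    print(f"{rule}: {value}")
```

In this example the sample clears the absolute thresholds and the 5:1 minimum but falls short of the 10:1 ratio, which would suggest either more observations or a more parsimonious variable set.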

## Stage 3: Assumptions in Factor Analysis

Conceptual Issues
Statistical Issues

Factor analysis
Statistical multivariate technique
Data reduction technique
Factors must cover most of data variability
One type of variables considered
Critical assumptions
1. Conceptual
2. Statistical

## Selected variables & sample

Sample must be homogeneous & representative; otherwise, poor factor analysis results
Major issue: an underlying structure may NOT exist in the selected variables
Presence of correlated variables does NOT guarantee relevance, even if they meet statistical requirements
Researcher MUST ensure that observed patterns are conceptually valid & appropriate for the study

Researcher must ensure that the original variables are sufficiently intercorrelated
Measures to diagnose factorability of the correlation matrix:

1. Data matrix must have sufficient correlation & partial correlation
   - Bartlett test of sphericity
   - Measure of sampling adequacy (MSA > 0.50)
2. An extended version of MSA at the individual-variable level
   - Calculate MSA for each variable
   - Exclude the variable with the lowest MSA (< 0.50) & redo calculations until all included variables have MSA > 0.50
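
The diagnostics above can be sketched with NumPy using their usual textbook formulas: Bartlett's statistic is -[(n - 1) - (2p + 5)/6] ln|R| with p(p - 1)/2 degrees of freedom, and MSA compares squared correlations with squared partial correlations. The function names, the simulated data, and the stopping rule at two variables are assumptions made for this example; the p-value lookup against the chi-square distribution is left out.

```python
import numpy as np

def bartlett_sphericity(R, n_obs):
    """Return Bartlett's chi-square statistic and its degrees of freedom."""
    p = R.shape[0]
    _, log_det = np.linalg.slogdet(R)
    chi_square = -((n_obs - 1) - (2 * p + 5) / 6) * log_det
    return chi_square, p * (p - 1) // 2

def msa(R):
    """Return (overall MSA, per-variable MSA) for a correlation matrix."""
    inv_R = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv_R), np.diag(inv_R)))
    partial = -inv_R / scale                      # partial correlations
    off_diagonal = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.where(off_diagonal, R ** 2, 0.0)
    q2 = np.where(off_diagonal, partial ** 2, 0.0)
    per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + q2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + q2.sum())
    return overall, per_variable

def drop_low_msa(R, names, cutoff=0.50):
    """Drop the lowest-MSA variable one at a time until all reach the cutoff."""
    names = list(names)
    while len(names) > 2:
        _, per_variable = msa(R)
        worst = int(np.argmin(per_variable))
        if per_variable[worst] >= cutoff:
            break
        R = np.delete(np.delete(R, worst, axis=0), worst, axis=1)
        names.pop(worst)
    return R, names

# Simulated data (an assumption): 200 cases, 6 variables driven by 2 factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 2))
pattern = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
X = factors @ pattern.T + 0.5 * rng.normal(size=(200, 6))
R = np.corrcoef(X, rowvar=False)

chi_square, dof = bartlett_sphericity(R, n_obs=200)
overall_msa, variable_msa = msa(R)
kept_R, kept_names = drop_low_msa(R, ["x1", "x2", "x3", "x4", "x5", "x6"])
print(chi_square, dof, overall_msa, variable_msa, kept_names)
```

A large chi-square relative to its degrees of freedom indicates that the correlation matrix is not an identity matrix, which is the condition the Bartlett test checks.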

## STAGE 4: DERIVING FACTORS AND ASSESSING OVERALL FIT

Once the correlation matrix is prepared, the researcher is ready to apply factor analysis to identify the underlying structure of relationships. In doing so, decisions must be made concerning (1) the method of extracting the factors (common factor analysis versus components analysis) and (2) the number of factors selected to represent the underlying structure in the data.

SELECTING THE FACTOR EXTRACTION METHOD The researcher can choose from two similar, yet unique, methods for defining (extracting) the factors to represent the structure of the variables in the analysis.

The total variance of a variable can be divided into three types of variance:

1. Common variance is defined as that variance in a variable that is shared with all other variables in the analysis.
2. Specific variance (also known as unique variance) is that variance associated with only a specific variable. This variance cannot be explained by the correlations to the other variables but is still associated uniquely with a single variable.
3. Error variance is also variance that cannot be explained by correlations with other variables, but it is due to unreliability in the data-gathering process, measurement error, or a random component in the measured phenomenon.

## COMMON FACTOR ANALYSIS VERSUS COMPONENT ANALYSIS

The selection of one method over the other is based on two criteria: (1) the objectives of the factor analysis and (2) the amount of prior knowledge about the variance in the variables.

The most direct comparison between the two methods is by their use of the explained versus unexplained variance:

Component analysis considers the total variance and derives factors that contain small proportions of unique variance and, in some instances, error variance.

Common factor analysis considers only the common or shared variance, assuming that both the unique and error variance are not of interest in defining the structure of the variables.
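
One conventional way to express this difference computationally, which the slides do not spell out, is through the diagonal of the correlation matrix: component analysis keeps unities (total variance), while common factor analysis replaces them with communality estimates (shared variance only). The sketch below uses squared multiple correlations as initial communality estimates, a common but not unique choice; the correlation matrix is hypothetical.

```python
import numpy as np

# Hypothetical correlation matrix for four variables.
R = np.array([
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.1],
    [0.5, 0.4, 1.0, 0.2],
    [0.1, 0.1, 0.2, 1.0],
])

# Component analysis: unities on the diagonal, so the eigenvalues
# partition the total variance (their sum equals the number of variables).
component_eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Common factor analysis: replace the diagonal with communality estimates.
# Squared multiple correlations (SMC) are used here as an assumed estimate.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)
common_eigenvalues = np.linalg.eigvalsh(R_reduced)[::-1]

print("total variance analyzed: ", component_eigenvalues.sum())
print("common variance analyzed:", common_eigenvalues.sum())
```

The second total is smaller than the first because the unique and error portions of each variable have been removed from the diagonal before extraction.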

## Component factor analysis is most appropriate when:

Data reduction is a primary concern, focusing on the minimum number of factors needed to account for the maximum portion of the total variance represented in the original set of variables, and

Prior knowledge suggests that specific and error variance represent a relatively small proportion of the total variance.

## Common factor analysis is most appropriate when:

The primary objective is to identify the latent dimensions or constructs represented in the original variables, and

The researcher has little knowledge about the amount of specific and error variance and therefore wishes to eliminate this variance.

LATENT ROOT CRITERION

The latent root criterion is simple to apply to either components analysis or common factor analysis. The rationale for the latent root criterion is that any individual factor should account for the variance of at least a single variable if it is to be retained for interpretation.
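
In component analysis each standardized variable contributes a variance of 1, so this rationale amounts to retaining factors whose latent roots (eigenvalues) exceed 1. A minimal sketch with a hypothetical correlation matrix of two blocks of three variables:

```python
import numpy as np

# Hypothetical correlation matrix: two blocks of three variables,
# correlated 0.6 within a block and 0.1 between blocks.
R = np.full((6, 6), 0.1)
R[:3, :3] = 0.6
R[3:, 3:] = 0.6
np.fill_diagonal(R, 1.0)

# Latent roots (eigenvalues) of the correlation matrix, largest first.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Retain only factors that account for at least one variable's worth of variance.
n_retained = int((eigenvalues > 1.0).sum())

print(np.round(eigenvalues, 3))
print("factors retained:", n_retained)
```

For this matrix the two largest latent roots are 2.5 and 1.9 and the remaining four are 0.4, so two factors are retained.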
A PRIORI CRITERION

The a priori criterion is a simple yet reasonable criterion under certain circumstances. When applying it, the researcher knows how many factors to extract before undertaking the factor analysis.

PERCENTAGE OF VARIANCE CRITERION

The percentage of variance criterion is an approach based on achieving a specified cumulative percentage of total variance extracted by successive factors. The purpose is to ensure practical significance for the derived factors by ensuring that they explain at least a specified amount of variance.
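
The criterion reduces to a running total over the eigenvalues. In the sketch below the eigenvalues and the 60 percent target are illustrative assumptions, since the slides do not specify a threshold, and `factors_for_target` is a hypothetical helper name.

```python
def factors_for_target(eigenvalues, target_pct):
    """Return (number of factors, cumulative %) once the target is reached."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, value in enumerate(eigenvalues, start=1):
        cumulative += 100.0 * value / total
        if cumulative >= target_pct:
            return count, cumulative
    return len(eigenvalues), cumulative

# Hypothetical eigenvalues from a six-variable analysis, largest first.
example_eigenvalues = [2.5, 1.9, 0.4, 0.4, 0.4, 0.4]

n_factors, explained = factors_for_target(example_eigenvalues, target_pct=60.0)
print(f"{n_factors} factors explain {explained:.1f}% of the total variance")
```

Here the first factor alone explains about 41.7 percent, so a second factor is needed to pass the 60 percent target.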
SCREE TEST CRITERION

Recall that with the component analysis factor model the later factors extracted contain both common and unique variance. Although all factors contain at least some unique variance, the proportion of unique variance is substantially higher in later factors. The scree test is used to identify the optimum number of factors that can be extracted before the amount of unique variance begins to dominate the common variance structure.

HETEROGENEITY OF THE RESPONDENTS

Shared variance among variables is the basis for both common and component factor models. An underlying assumption is that shared variance extends across the entire sample. If the sample is heterogeneous with regard to at least one subset of the variables, then the first factors will represent those variables that are more homogeneous across the entire sample. Variables that are better discriminators between the subgroups of the sample will load on later factors.

When the objective is to identify factors that discriminate among the subgroups of a sample, the researcher should extract additional factors beyond those indicated by the methods just discussed and examine the additional factors' ability to discriminate among the groups. If they prove less beneficial in discrimination, the solution can be run again and these later factors eliminated.

## Obtaining a Factor Analysis

1. Click Analyze and select Dimension Reduction, then Factor. A Factor Analysis box will appear.
2. Move the variables/scale items to the Variable box.
3. Factor extraction: when the variables are in the Variable box, select Extraction.
4. When the Factor Extraction box appears, select Scree Plot and keep all default selections, including Principal Component Analysis, Based on Eigenvalue of 1, and Unrotated factor solution.
5. During factor extraction, keep the factor rotation default of None. Press Continue.
6. During factor rotation, decide on the number of factors based on the factor extraction phase and enter the desired number by choosing Fixed number of factors and entering the number of factors to extract. Under Rotation, choose Varimax. Press Continue, then OK.
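
For readers working outside a menu-driven package, the same sequence (principal component extraction, a fixed number of factors, varimax rotation) can be approximated in NumPy. This is a sketch under stated assumptions, not the package's implementation: the loading pattern used to build the correlation matrix is hypothetical, the function names are invented for this example, and the rotation uses a widely circulated SVD-based varimax iteration without Kaiser normalization, so results may differ slightly from packaged software.

```python
import numpy as np

def principal_component_loadings(R, n_factors):
    """Unrotated loadings: eigenvectors scaled by the square root of eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1][:n_factors]
    return eigenvectors[:, order] * np.sqrt(eigenvalues[order])

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation (raw, without Kaiser normalization)."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        target = loadings.T @ (rotated ** 3 - rotated @ column_ss / n_vars)
        u, s, vt = np.linalg.svd(target)
        rotation = u @ vt                     # nearest orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical correlation matrix built from an assumed two-factor pattern.
pattern = np.array([
    [0.8, 0.1], [0.7, 0.2], [0.6, 0.1],
    [0.2, 0.7], [0.1, 0.8], [0.1, 0.6],
])
R = pattern @ pattern.T
np.fill_diagonal(R, 1.0)

unrotated = principal_component_loadings(R, n_factors=2)  # fixed number of factors
rotated, rotation = varimax(unrotated)

print(np.round(unrotated, 2))
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each variable's communality (the sum of its squared loadings) is the same before and after rotation; only the distribution of loadings across the factors changes.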