Exploratory Factor Analysis

LEARNING OBJECTIVES
1. Differentiate factor analysis techniques from other multivariate techniques.
2. Distinguish between exploratory and confirmatory uses of factor analytic techniques.
3. Understand the seven stages of applying factor analysis.
4. Distinguish between R and Q factor analysis.
5. Identify the differences between component analysis and common factor analysis models.
6. Describe how to determine the number of factors to extract.
7. Explain the concept of rotation of factors.
8. Describe how to name a factor.
9. Explain the additional uses of factor analysis.
10. State the major limitations of factor analytic techniques.

Understanding Factor Analysis


This chapter describes factor analysis, a technique particularly suitable for analyzing the patterns of complex, multidimensional relationships encountered by researchers.

Factor analysis can be used to examine the underlying patterns or relationships among a large number of variables and to determine whether the information can be condensed or summarized in a smaller set of factors or components.

WHAT IS FACTOR ANALYSIS?


Factor analysis is an interdependence technique whose primary purpose is to define the underlying structure among the variables in the analysis.

Factor analysis provides the tools for analyzing the structure of the interrelationships (correlations) among a large number of variables by defining sets of variables that are highly interrelated, known as factors. These groups of variables (factors), which are by definition highly intercorrelated, are assumed to represent dimensions within the data.

Factor analysis is viewed as a data-reduction technique because it reduces a large number of overlapping variables to a smaller set of factors.

Factor analytic techniques can achieve their purposes from either an exploratory or a confirmatory perspective.

FACTOR ANALYSIS DECISION PROCESS


STAGE 1: OBJECTIVES OF FACTOR ANALYSIS

The starting point in factor analysis is the research problem. The general purpose of factor analytic techniques is to find a way to condense (summarize) the information contained in a number of original variables into a smaller set of new, composite dimensions or variates (factors) with a minimum loss of information.

In meeting its objectives, factor analysis is keyed to four issues: specifying the unit of analysis, achieving data summarization and/or data reduction, variable selection, and using factor analysis results with other multivariate techniques.

Specifying the Unit of Analysis


Factor analysis is actually a more general model in that it can identify the structure of relationships among either variables or respondents by examining either the correlations between the variables or the correlations between the respondents.

If the objective of the research is to summarize the characteristics (variables), factor analysis is applied to a correlation matrix of the variables. This is referred to as R factor analysis.

Factor analysis may also be applied to a correlation matrix of the individual respondents based on their characteristics. Referred to as Q factor analysis, this method combines or condenses large numbers of people into distinctly different groups within a larger population. Most researchers, however, use some type of cluster analysis to group individual respondents.

Achieving Data Summarization Versus Data Reduction

DATA SUMMARIZATION The fundamental concept involved in data summarization is the definition of structure.

DATA REDUCTION Factor analysis can also be used to achieve data reduction by (1) identifying representative variables from a much larger set of variables for use in subsequent multivariate analyses, or (2) creating an entirely new set of variables, much smaller in number, to partially or completely replace the original set of variables.

STAGE 2: DESIGNING A FACTOR ANALYSIS


The design of a factor analysis involves three basic decisions:

(1) calculation of the input data (a correlation matrix) to meet the specified objectives of grouping variables or respondents;
(2) design of the study in terms of the number of variables, the measurement properties of variables, and the types of allowable variables; and
(3) the sample size necessary, both in absolute terms and as a function of the number of variables in the analysis.

Correlations Among Variables or Respondents


The first decision in the design of a factor analysis focuses on calculating the input data for the analysis. There are two forms of factor analysis: R-type and Q-type. Both types use a correlation matrix as the basic data input.

With R-type factor analysis, the researcher uses a traditional correlation matrix (correlations among variables). With Q-type factor analysis, the correlation matrix is instead computed between the individual respondents, and the result is a factor matrix that identifies similar individuals.

From the results of a Q factor analysis, we could identify groups or clusters of individuals that demonstrate a similar pattern on the variables included in the analysis.

How does Q-type factor analysis differ from cluster analysis?

Q-type factor analysis is based on the intercorrelations between the respondents, whereas cluster analysis forms groupings based on a distance-based similarity measure between the respondents' scores on the variables being analyzed.

To illustrate this difference, consider Figure 3, which contains the scores of four respondents over three different variables. A Q-type factor analysis of these four respondents would yield two groups with similar covariance structures, consisting of respondents A and C versus B and D. In contrast, the clustering approach would be sensitive to the actual distances among the respondents' scores and would lead to a grouping of the closest pairs. Thus, with a cluster analysis approach, respondents A and B would be placed in one group and C and D in the other group.
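
The contrast between correlation-based and distance-based grouping can be checked numerically. The following is a minimal sketch in plain Python. The scores are hypothetical values chosen to reproduce the pattern described above; they are not taken from Figure 3. The helper names (`pearson`, `euclidean`, `closest_partner`) are illustrative only.

```python
import math

# Hypothetical scores for four respondents on three variables (V1, V2, V3).
# These numbers are an assumption made for illustration.
scores = {
    "A": [7, 7, 8],
    "B": [8, 6, 6],
    "C": [2, 2, 3],
    "D": [3, 1, 1],
}


def pearson(x, y):
    """Pearson correlation between two respondents' score profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)


def euclidean(x, y):
    """Euclidean distance between two respondents' score profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def closest_partner(name, measure, prefer_high):
    """Return the other respondent that is most similar under `measure`."""
    others = [other for other in scores if other != name]
    pick = max if prefer_high else min
    return pick(others, key=lambda other: measure(scores[name], scores[other]))


for name in scores:
    by_correlation = closest_partner(name, pearson, prefer_high=True)
    by_distance = closest_partner(name, euclidean, prefer_high=False)
    print(f"{name}: correlation partner = {by_correlation}, "
          f"distance partner = {by_distance}")
```

With these assumed scores, the correlation view pairs A with C and B with D (same profile shape, different level), while the distance view pairs A with B and C with D (similar level, different shape).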

Variable Selection and Measurement Issues


Two specific questions must be answered at this point: (1) What types of variables can be used in factor analysis? and (2) How many variables should be included?

In terms of the types of variables included, the primary requirement is that a correlation value can be calculated among all variables. Metric variables are easily measured by several types of correlations. Nonmetric variables are more problematic because they cannot use the same types of correlation measures used by metric variables. The most prudent approach is to avoid nonmetric variables. If a nonmetric variable must be included, one approach is to define dummy variables (coded 0-1) to represent categories of nonmetric variables. If all the variables are dummy variables, then specialized forms of factor analysis, such as Boolean factor analysis, are more appropriate [5].

The researcher should also attempt to minimize the number of variables included but still maintain a reasonable number of variables per factor. If a study is being designed to assess a proposed structure, the researcher should include several variables to represent each proposed factor.

The strength of factor analysis lies in finding patterns among groups of variables, and it is of little use in identifying factors composed of only a single variable. Finally, when designing a study to be factor analyzed, the researcher should, if possible, identify several key variables that closely reflect the hypothesized underlying factors.
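
The 0-1 dummy coding mentioned for nonmetric variables can be sketched as follows. The `region` variable and its categories are hypothetical, and the `drop_first` option is an assumption added here: dropping one category is a common way to avoid a perfectly redundant column, because the full set of dummies for one variable always sums to 1.

```python
def dummy_code(values, drop_first=False):
    """Turn one nonmetric variable into 0-1 dummy variables, one per category."""
    categories = sorted(set(values))
    if drop_first:
        # Omit the first category; it is implied when all other dummies are 0.
        categories = categories[1:]
    return {
        f"is_{category}": [1 if value == category else 0 for value in values]
        for category in categories
    }


# Hypothetical nonmetric variable observed for five respondents.
region = ["north", "south", "north", "west", "south"]

for column, coded in dummy_code(region, drop_first=True).items():
    print(column, coded)
```

Each resulting column is a 0-1 variable for which a correlation with the other variables can be computed, which is the primary requirement stated above.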

Sample Size

The researcher generally would not factor analyze a sample of fewer than 50 observations, and preferably the sample size should be 100 or larger. As a general rule, the minimum is to have at least five times as many observations as the number of variables to be analyzed, and a more acceptable sample size would have a 10:1 ratio.

The researcher should always try to obtain the highest cases-per-variable ratio possible to minimize the chances of overfitting the data. To do so, the researcher may employ the most parsimonious set of variables, guided by conceptual and practical considerations, and then obtain an adequate sample size for the number of variables examined.
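
The sample-size guidelines above reduce to a few comparisons. The sketch below encodes them directly; the function name and the returned keys are illustrative choices, not part of any statistical package.

```python
def sample_size_check(n_observations, n_variables):
    """Compare a planned sample against the rules of thumb stated above."""
    ratio = n_observations / n_variables
    return {
        "cases_per_variable": ratio,
        "meets_absolute_minimum": n_observations >= 50,   # fewer than 50: do not analyze
        "meets_preferred_size": n_observations >= 100,    # 100 or larger preferred
        "meets_minimum_ratio": ratio >= 5,                # at least 5 cases per variable
        "meets_preferred_ratio": ratio >= 10,             # 10:1 is more acceptable
    }


# Hypothetical study: 120 respondents measured on 20 variables.
for rule, outcome in sample_size_check(120, 20).items():
    print(f"{rule}: {outcome}")
```

In this hypothetical case the sample passes both absolute thresholds and the 5:1 minimum, but falls short of the 10:1 ratio, which suggests either collecting more cases or trimming the variable set.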

STAGE 3: ASSUMPTIONS IN FACTOR ANALYSIS

Conceptual Issues
Statistical Issues

Factor analysis
   Statistical multivariate technique
   Data reduction technique
   Factors must account for most of the variability in the data
   Only one type of variable is considered

Critical assumptions
1. Conceptual (higher impact)
2. Statistical

Selected variables and sample

The sample must be homogeneous and representative; otherwise the factor analysis results will be poor.
Major issue: the underlying structure that factor analysis assumes may NOT exist in the selected variables.
The presence of correlated variables does NOT guarantee relevance, even if the statistical requirements are met.
The researcher MUST ensure that the observed patterns are conceptually valid and appropriate for the study.

To produce representative factors, researchers MUST ensure that the original variables are sufficiently intercorrelated.

Measures to diagnose the factorability of the correlation matrix:

1. Overall measures of intercorrelation
   Visual examination: the data matrix must show sufficient correlations, and the partial correlations should be small
   Bartlett test of sphericity
   Measure of sampling adequacy (MSA > 0.50)
2. Variable-specific measures of intercorrelation
   An extension of the MSA to the individual-variable level
   Calculate the MSA for each variable
   Exclude the variable with the lowest MSA (< 0.50) and redo the calculations until all included variables have MSA > 0.50
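
The MSA screening loop above can be sketched in plain Python. This assumes the usual Kaiser-Meyer-Olkin definition, which the slides do not spell out: the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations, with partial correlations obtained from the inverse of the correlation matrix. The correlation matrix, variable names, and function names are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def msa(corr):
    """Return (overall MSA, list of per-variable MSA values)."""
    n = len(corr)
    inv = invert(corr)
    # Squared simple and partial correlations, off-diagonal entries only.
    r2 = [[corr[i][j] ** 2 if i != j else 0.0 for j in range(n)]
          for i in range(n)]
    p2 = [[inv[i][j] ** 2 / (inv[i][i] * inv[j][j]) if i != j else 0.0
           for j in range(n)] for i in range(n)]
    per_variable = [sum(r2[i]) / (sum(r2[i]) + sum(p2[i])) for i in range(n)]
    total_r2, total_p2 = sum(map(sum, r2)), sum(map(sum, p2))
    return total_r2 / (total_r2 + total_p2), per_variable


def prune_by_msa(names, corr, threshold=0.50):
    """Drop the lowest-MSA variable until every remaining MSA exceeds threshold."""
    names = list(names)
    corr = [list(row) for row in corr]
    while len(names) > 2:
        _, per_variable = msa(corr)
        worst = min(range(len(names)), key=lambda i: per_variable[i])
        if per_variable[worst] > threshold:
            break
        del names[worst]
        corr = [[v for j, v in enumerate(row) if j != worst]
                for i, row in enumerate(corr) if i != worst]
    return names


# Hypothetical correlation matrix: three variables, all pairwise r = 0.6.
names = ["x1", "x2", "x3"]
example_corr = [[1.0, 0.6, 0.6], [0.6, 1.0, 0.6], [0.6, 0.6, 1.0]]
overall, per_variable = msa(example_corr)
print(f"overall MSA = {overall:.3f}")  # about 0.72 for this matrix
print("retained variables:", prune_by_msa(names, example_corr))
```

The loop stops at two variables because the MSA of a two-variable set is always exactly 0.50 and carries no diagnostic information. In practice a statistical package reports these values, so the sketch is only meant to make the procedure concrete.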

STAGE 4: DERIVING FACTORS AND ASSESSING OVERALL FIT

Once the variables are specified and the correlation matrix is prepared, the researcher is ready to apply factor analysis to identify the underlying structure of relationships. In doing so, decisions must be made concerning (1) the method of extracting the factors (common factor analysis versus components analysis) and (2) the number of factors selected to represent the underlying structure in the data.

Selecting the Factor Extraction Method

The researcher can choose from two similar, yet unique, methods for defining (extracting) the factors to represent the structure of the variables in the analysis.

PARTITIONING THE VARIANCE OF A VARIABLE

The total variance of any variable can be divided (partitioned) into three types of variance:

1. Common variance is defined as that variance in a variable that is shared with all other variables in the analysis.
2. Specific variance (also known as unique variance) is that variance associated with only a specific variable. This variance cannot be explained by the correlations to the other variables but is still associated uniquely with a single variable.
3. Error variance is also variance that cannot be explained by correlations with other variables, but it is due to unreliability in the data-gathering process, measurement error, or a random component in the measured phenomenon.
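
As a worked illustration of the three-way partition, the sketch below assumes a standardized variable (total variance 1.0), hypothetical loadings on two factors, and a hypothetical reliability estimate. It relies on two standard identities that the slides do not state: common variance (the communality) is the sum of squared loadings, and error variance is one minus the reliability. None of the numbers come from the text.

```python
def partition_variance(loadings, reliability):
    """Split the unit variance of a standardized variable into three parts."""
    common = sum(loading ** 2 for loading in loadings)  # communality
    error = 1.0 - reliability                           # unreliable portion
    specific = 1.0 - common - error                     # reliable but unshared
    if specific < -1e-12:
        raise ValueError("communality cannot exceed the reliability")
    return {"common": common, "specific": specific, "error": error}


# Hypothetical variable: loadings of 0.7 and 0.4, reliability of 0.85.
parts = partition_variance([0.7, 0.4], reliability=0.85)
for kind, share in parts.items():
    print(f"{kind} variance: {share:.2f}")
```

Here 65% of the variance is common, 20% is specific, and 15% is error, so the three parts add back to the total variance of 1.0.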

COMMON FACTOR ANALYSIS VERSUS COMPONENT ANALYSIS

The selection of one method over the other is based on two criteria: (1) the objectives of the factor analysis and (2) the amount of prior knowledge about the variance in the variables.

The most direct comparison between the two methods is by their use of the explained versus unexplained variance:

Component analysis considers the total variance and derives factors that contain small proportions of unique variance and, in some instances, error variance.
Common factor analysis considers only the common or shared variance, assuming that both the unique and error variance are not of interest in defining the structure of the variables.

Component factor analysis is most appropriate when:

Data reduction is a primary concern, focusing on the minimum number of factors needed to account for the maximum portion of the total variance represented in the original set of variables, and
Prior knowledge suggests that specific and error variance represent a relatively small proportion of the total variance.

Common factor analysis is most appropriate when:

The primary objective is to identify the latent dimensions or constructs represented in the original variables, and
The researcher has little knowledge about the amount of specific and error variance and therefore wishes to eliminate this variance.

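LATENT ROOT CRITERION

This technique is simple to apply to either components analysis or common factor analysis. The rationale for the latent root criterion is that any individual factor should account for the variance of at least a single variable if it is to be retained for interpretation. Because each standardized variable contributes a value of 1 to the total of the latent roots (eigenvalues), only factors with eigenvalues greater than 1 are retained.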
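
The latent root rule can be applied mechanically once the eigenvalues are known. A minimal sketch follows, using a hypothetical set of eigenvalues for six standardized variables (they sum to 6, as eigenvalues of a correlation matrix must). The function name and the numbers are illustrative assumptions.

```python
def latent_root_count(eigenvalues, cutoff=1.0):
    """Count the factors whose eigenvalue exceeds the cutoff."""
    return sum(1 for value in eigenvalues if value > cutoff)


# Hypothetical eigenvalues from a six-variable correlation matrix.
eigenvalues = [2.9, 1.5, 0.7, 0.4, 0.3, 0.2]

print("factors retained:", latent_root_count(eigenvalues))
```

Only the first two factors account for more variance than a single variable, so the rule retains two factors for these assumed eigenvalues.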
A PRIORI CRITERION

The a priori criterion is a simple yet reasonable criterion under certain circumstances. When applying it, the researcher already knows how many factors to extract before undertaking the factor analysis.

PERCENTAGE OF VARIANCE CRITERION

The percentage of variance criterion is an approach based on achieving a specified cumulative percentage of total variance extracted by successive factors. The purpose is to ensure practical significance for the derived factors by ensuring that they explain at least a specified amount of variance.
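
The percentage of variance rule amounts to accumulating eigenvalues until a target share of the total is reached. The sketch below assumes hypothetical eigenvalues and hypothetical targets of 60% and 80%; the text does not prescribe a specific percentage, so the target is left as a parameter.

```python
def factors_for_variance(eigenvalues, target_share):
    """Smallest number of factors whose cumulative variance share meets the target."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= target_share:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues from a six-variable correlation matrix.
variance_eigenvalues = [2.9, 1.5, 0.7, 0.4, 0.3, 0.2]

for target in (0.60, 0.80):
    needed = factors_for_variance(variance_eigenvalues, target)
    print(f"target {target:.0%}: {needed} factors")
```

With these assumed values, two factors explain about 73% of the total variance and three explain 85%, so the answer depends directly on the threshold the researcher specifies.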
SCREE TEST CRITERION

Recall that with the component analysis factor model the later factors extracted contain both common and unique variance. Although all factors contain at least some unique variance, the proportion of unique variance is substantially higher in later factors. The scree test is used to identify the optimum number of factors that can be extracted before the amount of unique variance begins to dominate the common variance structure.

HETEROGENEITY OF THE RESPONDENTS

Shared variance among variables is the basis for both common and component factor models. An underlying assumption is that shared variance extends across the entire sample. If the sample is heterogeneous with regard to at least one subset of the variables, then the first factors will represent those variables that are more homogeneous across the entire sample. Variables that are better discriminators between the subgroups of the sample will load on later factors.

When the objective is to identify factors that discriminate among the subgroups of a sample, the researcher should extract additional factors beyond those indicated by the methods just discussed and examine the additional factors' ability to discriminate among the groups. If they prove less beneficial in discrimination, the solution can be run again and these later factors eliminated.


Obtaining a Factor Analysis

1. Click Analyze and select Dimension Reduction, then Factor. A Factor Analysis box will appear.
2. Move the variables/scale items to the Variables box.
3. Factor extraction: when the variables are in the Variables box, select Extraction.
4. When the Factor Extraction box appears, select Scree Plot and keep all default selections, including:
   Principal components analysis
   Based on eigenvalue of 1
   Unrotated factor solution
5. During factor extraction, keep the factor rotation default of None. Press Continue.
6. During factor rotation, decide on the number of factors based on the factor extraction phase, then choose Fixed number of factors and enter the desired number of factors to extract. Under Rotation, choose Varimax. Press Continue, then OK.