Exploratory Factor Analysis

LEARNING OBJECTIVES

1. ... techniques.
2. Distinguish between exploratory and confirmatory uses of factor analytic techniques.
3. Understand the seven stages of applying factor analysis.
4. Distinguish between R and Q factor analysis.
5. Identify the differences between component analysis and common factor analysis models.
6. Describe how to determine the number of factors to extract.
7. Explain the concept of rotation of factors.
8. Describe how to name a factor.
9. Explain the additional uses of factor analysis.
10. State the major limitations of factor analytic techniques.

... complex, multidimensional relationships encountered by researchers.

Factor analysis can be utilized to examine the underlying patterns or relationships for a large number of variables and to determine whether the information can be condensed or summarized in a smaller set of factors or components.

... structure among the variables in the analysis.

... among a large number of variables by defining sets of variables that are highly interrelated, known as factors. These groups of variables (factors), which are by definition highly intercorrelated, are assumed to represent dimensions within the data.

... smaller set of factors.

... factor analytic techniques can achieve their purposes from ...

... problem.

The general purpose of factor analytic techniques is to find a way to condense (summarize) the information contained in a number of original variables into a smaller set of new, composite dimensions or variates (factors) with a minimum loss of information.

In meeting its objectives, factor analysis is keyed to four issues: specifying the unit of analysis, achieving data summarization and/or data reduction, variable selection, and using factor analysis results with other multivariate techniques.

... identify the structure of relationships among either variables or respondents by examining either the correlations between the variables or the correlations between the respondents.

If the objective of the research were to summarize the characteristics, factor analysis would be applied to a correlation matrix of the variables.

Factor analysis also may be applied to a correlation matrix of the individual respondents based on their characteristics. Referred to as Q factor analysis, this method combines or condenses large numbers of people into distinctly different groups within a larger population. Most researchers utilize some type of cluster analysis to group individual respondents.

DATA REDUCTION

Factor analysis can also be used to achieve data reduction by (1) identifying representative variables from a much larger set of variables for use in subsequent multivariate analyses, or (2) creating an entirely new set of variables, much smaller in number, to partially or completely replace the original set of variables.

(1) calculation of the input data (a correlation matrix) to meet the specified objectives of grouping variables or respondents;
(2) design of the study in terms of number of variables, measurement properties of variables, and the types of allowable variables; and
(3) the sample size necessary, both in absolute terms and as a function of the number of variables in the analysis.

... two forms of factor analysis: R-type versus Q-type factor analysis. Both types of factor analysis utilize a correlation matrix as the basic data input. With R-type factor analysis, the researcher would use a traditional correlation matrix.

In this Q-type factor analysis, the results would be a factor matrix that would identify similar individuals.

From the results of a Q factor analysis, we could identify groups or clusters of individuals that demonstrate a similar pattern on the variables included in the analysis.

Q-type factor analysis is based on the intercorrelations between the respondents, whereas cluster analysis forms groupings based on a distance-based similarity measure between the respondents' scores on the variables being analyzed.

To illustrate this difference, consider Figure 3, which contains the scores of four respondents over three different variables. A Q-type factor analysis of these four respondents would yield two groups with similar covariance structures, consisting of respondents A and C versus B and D. In contrast, the clustering approach would be sensitive to the actual distances among the respondents' scores and would lead to a grouping of the closest pairs. Thus, with a cluster analysis approach, respondents A and B would be placed in one group and C and D in the other group.
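The contrast between correlation-based and distance-based grouping can be sketched in a few lines of Python. The scores below are hypothetical values chosen to reproduce the pattern described above; they are not the values from Figure 3, and the helper `closest` is an illustrative name only.

```python
import numpy as np

# Hypothetical scores (NOT the Figure 3 values): rows are respondents
# A, B, C, D; columns are three variables.
labels = ["A", "B", "C", "D"]
scores = np.array([
    [7.0, 8.0, 9.0],  # A: high level, rising profile
    [9.0, 8.0, 7.0],  # B: high level, falling profile
    [1.0, 2.0, 3.0],  # C: low level, rising profile
    [3.0, 2.0, 1.0],  # D: low level, falling profile
])

# Q-type input: correlations between respondents (rows of the data).
q_corr = np.corrcoef(scores)

# Cluster-analysis input: Euclidean distances between respondents.
diff = scores[:, None, :] - scores[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))


def closest(matrix, i, largest):
    """Label of the respondent most similar to respondent i (self excluded)."""
    values = matrix[i].copy()
    values[i] = -np.inf if largest else np.inf
    j = values.argmax() if largest else values.argmin()
    return labels[j]


# Most similar partner under each definition of similarity.
q_partner = {labels[i]: closest(q_corr, i, largest=True) for i in range(4)}
d_partner = {labels[i]: closest(dist, i, largest=False) for i in range(4)}

print("Correlation-based partners:", q_partner)
print("Distance-based partners:   ", d_partner)
```

With these made-up scores, the correlation view pairs A with C and B with D (same profile shape), while the distance view pairs A with B and C with D (same overall level).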

Two specific questions must be answered at this point: (1) What type of variables can be used in factor analysis? and (2) How many variables should be included? In terms of the types of variables included, the primary requirement is that a correlation value can be calculated among all variables. Metric variables are easily measured by several types of correlations. Nonmetric variables are more problematic because they cannot use the same types of correlation measures used by metric variables. The most prudent approach is to avoid nonmetric variables. If a nonmetric variable must be included, one approach is to define dummy variables (coded 0-1) to represent categories of nonmetric variables. If all the variables are dummy variables, then specialized forms of factor analysis, such as Boolean factor analysis, are more appropriate [5]. The researcher should also attempt to minimize the number of variables included but still maintain a reasonable number of variables per factor. If a study is being designed to assess a proposed structure, ...

... use in identifying factors composed of only a single variable. Finally, when designing a study to be factor analyzed, the researcher should, if possible, identify several key variables that closely reflect the hypothesized underlying factors.

Sample Size

... size should be 100 or larger. As a general rule, the minimum is to have at least five times as many observations as the number of variables to be analyzed, and the more acceptable sample size would have a 10:1 ratio.

The researcher should always try to obtain the highest cases-per-variable ratio to minimize the chances of overfitting the data. In order to do so, the researcher may employ the most parsimonious set of variables, guided by conceptual and practical considerations, and then obtain an adequate sample size for the number of variables examined.
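The rules of thumb above are simple arithmetic, so a small helper can make them concrete. This is only a sketch: the function name `sample_size_check` and the example figures (150 observations, 20 variables) are hypothetical.

```python
def sample_size_check(n_obs, n_vars):
    """Compare a planned analysis against the rules of thumb quoted above."""
    ratio = n_obs / n_vars
    return {
        "cases_per_variable": ratio,
        "at_least_100_cases": n_obs >= 100,
        "meets_5_to_1_minimum": ratio >= 5,
        "meets_10_to_1_preferred": ratio >= 10,
    }


# Hypothetical study: 150 respondents, 20 variables.
check = sample_size_check(n_obs=150, n_vars=20)
for rule, value in check.items():
    print(f"{rule}: {value}")
```

In this example the study clears the 100-case and 5:1 guidelines but falls short of the 10:1 ratio, which would argue for either more respondents or a more parsimonious variable set.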

Analysis

Conceptual Issues

Statistical Issues

Factor analysis

Statistical multivariate technique

Data reduction technique

Factors must cover most of data variability

Only 1 type of variable considered

Critical assumptions

1. Conceptual

2. Statistical

Sample must be homogeneous & representative; otherwise, poor factor analysis results

Major issue: some underlying structure does NOT ...

Presence of correlated variables does NOT guarantee relevance, even if statistical requirements are met

Researcher MUST ensure that observed patterns are conceptually valid & appropriate for the study

Measures to diagnose factorability of correlation matrix

1. Data

Bartlett test of sphericity

Measure of sampling adequacy (MSA > 0.50)

2. Variables

Calculate MSA for each variable

Exclude the variable with the lowest MSA (< 0.50) & redo calculations until all included variables have MSA > 0.50
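The diagnostics listed above can be sketched with NumPy alone. The formulas used here are the standard textbook ones (Bartlett's chi-square approximation and the Kaiser-Meyer-Olkin form of MSA); they are not spelled out in these slides, so treat them as assumptions. The function names and the example correlation matrix are hypothetical.

```python
import numpy as np


def bartlett_sphericity(R, n_obs):
    """Bartlett's chi-square statistic and degrees of freedom for matrix R."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) // 2
    return chi_square, dof


def msa(R):
    """Overall and per-variable measure of sampling adequacy."""
    R_inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(R_inv))
    partial = -R_inv / np.outer(d, d)  # partial (anti-image) correlations
    r2 = R ** 2
    a2 = partial ** 2
    np.fill_diagonal(r2, 0.0)  # only off-diagonal terms enter the measure
    np.fill_diagonal(a2, 0.0)
    per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + a2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + a2.sum())
    return overall, per_variable


def drop_low_msa(R, names, threshold=0.50):
    """Remove the lowest-MSA variable, one at a time, until all reach threshold."""
    names = list(names)
    while len(names) > 2:
        _, per_variable = msa(R)
        worst = int(np.argmin(per_variable))
        if per_variable[worst] >= threshold:
            break
        R = np.delete(np.delete(R, worst, axis=0), worst, axis=1)
        del names[worst]
    return R, names


# Hypothetical correlation matrix: three variables, all correlated 0.5.
R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])

chi_square, dof = bartlett_sphericity(R, n_obs=100)
overall, per_variable = msa(R)
kept_R, kept_names = drop_low_msa(R, ["V1", "V2", "V3"])

print(f"Bartlett chi-square = {chi_square:.2f} on {dof} df")
print(f"Overall MSA = {overall:.3f}; per variable = {np.round(per_variable, 3)}")
print("Variables retained:", kept_names)
```

The Bartlett statistic would be compared against a chi-square distribution with the returned degrees of freedom; that lookup is left out here to keep the sketch dependency-free.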

ASSESSING OVERALL FIT

... identify the underlying structure of relationships. In doing so, decisions must be made concerning (1) the method of extracting the factors (common factor analysis versus components analysis) and (2) the number of factors selected to represent the underlying structure in the data.

Selecting the Factor Extraction Method

The researcher can choose between two methods for defining (extracting) the factors to represent the structure of the variables in the analysis.

VARIABLE

1. Common variance is defined as that variance in a variable that is shared with all other variables in the analysis.
2. Specific variance (also known as unique variance) is that variance associated with only a specific variable. This variance cannot be explained by the correlations to the other variables but is still associated uniquely with a single variable.
3. Error variance is also variance that cannot be explained by correlations with other variables, but it is due to unreliability in the data-gathering process, measurement error, or a random component in the measured phenomenon.

COMMON FACTOR ANALYSIS VERSUS COMPONENT ANALYSIS

The selection of one method over the other is based on two criteria: (1) the objectives of the factor analysis and (2) the amount of prior knowledge about the variance in the variables.

The most direct comparison between the two methods is by their use of the explained versus unexplained variance:

Component analysis considers the total variance and derives factors that contain small proportions of unique variance and, in some instances, error variance.

Common factor analysis considers only the common or shared variance, assuming that both the unique and error variance are not of interest in defining the structure of the variables.

Component analysis is most appropriate when:

... minimum number of factors needed to account for the maximum portion of the total variance represented in the original set of variables, and

Prior knowledge suggests that specific and error variance ...

Common factor analysis is most appropriate when:

The primary objective is to identify the latent dimensions or constructs represented in the original variables, and

The researcher has little knowledge about the amount of specific and error variance and therefore wishes to eliminate this variance.
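One common way to picture the difference between the two models is through the diagonal of the correlation matrix each one analyzes. The sketch below assumes that component analysis keeps unities on the diagonal (total variance), while common factor analysis replaces them with communality estimates; squared multiple correlations are used here as one typical initial estimate. Neither the diagonal view nor the estimate is stated in these slides, and the correlation matrix is hypothetical.

```python
import numpy as np

# Hypothetical correlation matrix for three variables.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

# Component analysis: unities on the diagonal, so the total variance
# of every variable enters the analysis.
component_input = R.copy()

# Common factor analysis: the diagonal holds communality estimates, so
# only shared variance enters. Squared multiple correlations (SMC) are
# used as the estimate (an assumption made for this sketch).
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
common_input = R.copy()
np.fill_diagonal(common_input, smc)

total_variance = np.trace(component_input)   # equals the number of variables
common_variance = np.trace(common_input)     # strictly smaller

print("Communality estimates:", np.round(smc, 3))
print(f"Variance analyzed: components = {total_variance:.2f}, "
      f"common factor = {common_variance:.2f}")
```

The gap between the two traces is the specific and error variance that common factor analysis sets aside.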

... or common factor analysis. The rationale for the latent root criterion is that any individual factor should account for the variance of at least a single variable if it is to be retained for interpretation.
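Because each standardized variable contributes a variance of 1, the rationale above translates into keeping factors whose latent roots (eigenvalues) exceed 1. A minimal sketch, using a hypothetical correlation matrix built from two uncorrelated pairs of variables:

```python
import numpy as np

# Hypothetical correlation matrix: variables 1-2 and 3-4 form two pairs.
R = np.array([[1.0, 0.7, 0.0, 0.0],
              [0.7, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.6],
              [0.0, 0.0, 0.6, 1.0]])

# Latent roots of the correlation matrix, largest first.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Latent root criterion: retain factors with eigenvalue greater than 1.
n_factors = int((eigenvalues > 1.0).sum())

print("Eigenvalues:", np.round(eigenvalues, 2))
print("Factors retained:", n_factors)
```

The eigenvalues sum to the number of variables, so a factor with a latent root below 1 explains less than one variable's worth of variance.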

A PRIORI CRITERION

... certain circumstances. When applying it, the researcher knows how many factors to extract before undertaking the factor analysis.

... a specified cumulative percentage of total variance extracted by successive factors. The purpose is to ensure practical significance for the derived factors by ensuring that they explain at least a specified amount of variance.
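The cumulative-percentage idea can be sketched as follows. The eigenvalues and the target percentages are hypothetical; the slides do not state a particular threshold, so `target_pct` is left as a parameter the researcher must choose.

```python
import numpy as np


def factors_for_variance(eigenvalues, target_pct):
    """Smallest number of factors whose cumulative variance reaches target_pct."""
    ev = np.asarray(eigenvalues, dtype=float)
    cumulative = 100.0 * np.cumsum(ev) / ev.sum()
    n = int(np.searchsorted(cumulative, target_pct)) + 1
    return min(n, len(ev)), cumulative


# Hypothetical eigenvalues for five variables (they sum to 5).
eigenvalues = [2.5, 1.2, 0.6, 0.4, 0.3]

n60, cumulative = factors_for_variance(eigenvalues, target_pct=60)
n80, _ = factors_for_variance(eigenvalues, target_pct=80)

print("Cumulative % of variance:", np.round(cumulative, 1))
print("Factors needed for 60%:", n60)
print("Factors needed for 80%:", n80)
```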

SCREE TEST CRITERION

Recall that with the component analysis factor model the later factors extracted contain both common and unique variance. Although all factors contain at least some unique variance, the proportion of unique variance is substantially higher in later factors. The scree test is used to identify the optimum number of factors that can be extracted before the amount of unique variance begins to dominate the common variance structure.

HETEROGENEITY OF THE RESPONDENTS

Shared variance among variables is the basis for both common and component factor models. An underlying assumption is that shared variance extends across the entire sample. If the sample is heterogeneous with regard to at least one subset of the variables, then the first factors will represent those variables that are more homogeneous across the entire sample. Variables that are better discriminators between the subgroups of the sample will load on later factors.

When the objective is to identify factors that discriminate among the subgroups of a sample, the researcher should extract additional factors beyond those indicated by the methods just discussed and examine the additional factors' ability to discriminate among the groups. If they prove less beneficial in discrimination, the solution can be run again and these later factors eliminated.

Click: Analyze and select Dimension Reduction, then Factor. A Factor Analysis box will appear.

Move variables/scale items to the Variables box.

Factor extraction: When the variables are in the Variables box, select Extraction.

When the Factor Extraction box appears, select Scree plot and keep all default selections, including: principal components analysis, extraction based on an eigenvalue of 1, and unrotated factor solution.

During factor extraction, keep the factor rotation default of None. Press Continue.

During factor rotation: Decide on the number of factors based on the factor extraction phase, and enter it by choosing Fixed number of factors and entering the desired number of factors to extract. Under Rotation, choose Varimax. Press Continue, then OK.
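For readers without SPSS, the two-pass procedure above (extract principal components, then rerun with a fixed number of factors and varimax rotation) can be approximated in NumPy. This is a rough sketch under stated assumptions: the correlation matrix is hypothetical, the function names are illustrative, and the rotation uses the widely used SVD-based varimax iteration without Kaiser normalization, so its loadings need not match SPSS output exactly.

```python
import numpy as np


def principal_component_loadings(R, n_factors):
    """Unrotated loadings: eigenvectors scaled by the root of their eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1][:n_factors]
    return eigenvectors[:, order] * np.sqrt(eigenvalues[order])


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation via repeated SVD (no Kaiser normalization)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = (rotated ** 2).sum(axis=0)
        gradient = loadings.T @ (rotated ** 3 - rotated * column_ss / p)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break  # no meaningful improvement since the last iteration
        criterion = new_criterion
    return loadings @ rotation


# Hypothetical correlation matrix: variables 1-2 and 3-4 form two groups.
R = np.array([[1.0, 0.7, 0.2, 0.2],
              [0.7, 1.0, 0.2, 0.2],
              [0.2, 0.2, 1.0, 0.6],
              [0.2, 0.2, 0.6, 1.0]])

# Pass 1: extraction only, to decide how many factors to keep.
eigenvalues = np.linalg.eigvalsh(R)[::-1]
n_factors = int((eigenvalues > 1.0).sum())

# Pass 2: fixed number of factors, then varimax rotation.
unrotated = principal_component_loadings(R, n_factors)
rotated = varimax(unrotated)

print("Factors retained:", n_factors)
print("Unrotated loadings:\n", np.round(unrotated, 2))
print("Varimax-rotated loadings:\n", np.round(rotated, 2))
```

Because the rotation is orthogonal, each variable's communality (row sum of squared loadings) is unchanged; only the distribution of loadings across factors shifts, which is what makes the rotated solution easier to interpret.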

