
Factor Analysis and Principal Components

By A. Subrahmanyam
Factor Analysis

Factor analyses are performed by examining the pattern of correlations (or covariances) between the observed measures. Factor analysis is used:

To get a small set of variables (preferably uncorrelated) from a large set of variables (most of which are correlated with each other)

To create indexes with variables that measure similar things (conceptually)
Uses of Factor Analysis

Factor analysis is primarily used for data reduction or structure detection.

The purpose of data reduction is to remove redundant (highly correlated) variables from the data file, perhaps replacing the entire data file with a smaller number of uncorrelated variables.

The purpose of structure detection is to examine the underlying (or latent) relationships between the variables.

It can also be used to construct a questionnaire that measures an underlying variable.
Two types of factor analysis:

Exploratory
It is exploratory when you do not have a pre-defined idea of the structure or of how many dimensions are in a set of variables.

Confirmatory
It is confirmatory when you want to test specific hypotheses about the structure or the number of dimensions underlying a set of variables.
Important terminology

Factor loading: interpreted as the Pearson correlation between the variable and the factor.

Extraction: the process by which the factors are determined from a large set of variables.

Communality: the sum of the squared factor loadings across all factors for a given variable (row) is the variance in that variable accounted for by all the factors, and this is called the communality. The communality measures the proportion of variance in a given variable explained by all the factors jointly.
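The communality calculation above can be sketched in a few lines. This is a minimal illustration assuming Python with NumPy; the loadings matrix is hypothetical, not from any real dataset.

```python
import numpy as np

# Hypothetical loadings for 3 variables (rows) on 2 factors (columns).
loadings = np.array([
    [0.8, 0.3],   # variable 1
    [0.7, 0.4],   # variable 2
    [0.2, 0.9],   # variable 3
])

# Communality of each variable: sum of its squared loadings across factors.
communalities = (loadings ** 2).sum(axis=1)
print(communalities)  # variable 1: 0.8**2 + 0.3**2 = 0.73
```

For standardized variables, each communality lies between 0 and 1; what remains (1 minus the communality) is the variable's unique variance.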
Important terminology (cont.)

Eigenvalues:

The eigenvalue for a given factor measures the variance in all the variables that is accounted for by that factor.

Eigenvalues are often used to decide how many factors to retain: take as many factors as there are eigenvalues greater than 1.

The amount of standardized variance in a single variable is 1, so the eigenvalues sum to the number of variables; an eigenvalue divided by the number of variables gives the proportion of variance accounted for by that factor.
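The eigenvalue rule above can be sketched as follows. This is an illustrative example assuming Python with NumPy and simulated data (the variable structure is hypothetical); the eigenvalues come from the correlation matrix, and their sum equals the number of variables.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 observations of 5 variables, where the first
# three share a common source, so the correlation matrix has structure.
common = rng.normal(size=(200, 1))
data = np.hstack([
    common + 0.5 * rng.normal(size=(200, 3)),  # three correlated variables
    rng.normal(size=(200, 2)),                 # two independent variables
])

corr = np.corrcoef(data, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)[::-1]   # sorted descending

# Retain as many factors as there are eigenvalues greater than 1.
n_factors = int((eigenvalues > 1).sum())

# For standardized variables the eigenvalues sum to the number of
# variables (5 here), since each variable contributes variance 1.
print(eigenvalues, eigenvalues.sum(), n_factors)
```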
Principal Component Analysis

Principal components: one of the extraction methods.

A principal component is a linear combination of observed variables that is independent (orthogonal) of the other components.

The first component accounts for the largest amount of variance in the input data; the second component accounts for the largest amount of the remaining variance.
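The variance-ordering property above can be seen directly in a sketch. This assumes Python with NumPy and scikit-learn; the data are simulated from a hypothetical two-source structure.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# Hypothetical data: 4 observed variables driven by 2 latent sources.
latent = rng.normal(size=(300, 2))
mixing = np.array([[1.0, 0.2],
                   [0.9, 0.1],
                   [0.1, 1.0],
                   [0.2, 0.8]])
data = latent @ mixing.T + 0.1 * rng.normal(size=(300, 4))

pca = PCA().fit(data)

# Each component's share of the total variance, in decreasing order:
# the first component captures the most, the second the most of what
# remains, and so on.
print(pca.explained_variance_ratio_)
```

Because the simulated data have only two underlying sources, the first two components should capture nearly all of the variance.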
Principal Component Analysis (cont.)

Components being orthogonal means they are uncorrelated.

Principal components and principal axis are the most commonly used methods.

When there is multicollinearity, use principal components.

Rotations are often done; try to use varimax.
Principal Component Analysis (cont.)

A possible application of principal components:

E.g., in survey research it is common to have many questions that address one issue (e.g., customer service). These questions are likely to be highly correlated, which makes them problematic to use in some statistical procedures (e.g., regression). Instead, one can use factor scores, computed from the factor loadings on each orthogonal component.
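The survey application above can be sketched as follows, assuming Python with NumPy and scikit-learn. The "customer service" items and the outcome are simulated and purely hypothetical; a single component score stands in for the correlated items in the regression.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
# Hypothetical survey: 6 highly correlated "customer service" items,
# all driven by one underlying satisfaction level.
satisfaction = rng.normal(size=(250, 1))
items = satisfaction + 0.3 * rng.normal(size=(250, 6))
y = 2.0 * satisfaction.ravel() + rng.normal(size=250)

# Using the 6 raw items as regressors risks multicollinearity; instead,
# regress on a single component score computed from the items.
scores = PCA(n_components=1).fit_transform(items)
model = LinearRegression().fit(scores, y)
print(model.score(scores, y))  # R^2 of the component-score regression
```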
Principal Component Analysis (cont.)

What does principal component analysis do? It takes a set of correlated variables and creates a smaller set of uncorrelated variables. These newly created variables are called principal components.

There are two main objectives for using PCA:

1. Reduce the dimensionality of the data.
In simple English: turn p variables into fewer than p variables. While reducing the number of variables we attempt to keep as much information of the original variables as possible. Thus we try to reduce the number of variables without loss of information.

2. Identify new meaningful underlying variables.
This is often not possible. The principal components created are linear combinations of the original variables and often don't lend themselves to any meaning beyond that.
Rotation

Objective: to facilitate interpretation.

Orthogonal rotation: done when data reduction is the objective and factors need to be orthogonal.
maintains independence of factors
more commonly seen
Ex: varimax, quartimax, equamax, parsimax, etc.

Oblique rotation: used when there is reason to allow factors to be correlated.
allows dependence of factors
can be harder to interpret once you lose independence of factors
Ex: oblimin and promax
Varimax Rotation

It seeks the rotated loadings that maximize the variance of the squared loadings for each factor; the goal is to make some of these loadings as large as possible, and the rest as small as possible in absolute value.

The varimax method encourages the detection of factors each of which is related to few variables.

By default the rotation is varimax, which produces orthogonal factors. This means that the factors are not correlated with each other. This setting is recommended when you want to identify variables to create indexes or new variables without inter-correlated components.
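A varimax-rotated solution can be sketched with scikit-learn's factor analysis, which accepts a varimax rotation option. This assumes Python with NumPy and scikit-learn (version 0.24 or later for the rotation parameter); the data are simulated from a hypothetical two-factor structure.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
# Hypothetical data: two latent factors, each driving three of six variables.
factors = rng.normal(size=(400, 2))
true_loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
                          [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
data = factors @ true_loadings.T + 0.3 * rng.normal(size=(400, 6))

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(data)

# After varimax, each variable should load strongly on one factor and
# near zero on the other, which is what eases interpretation.
print(np.round(fa.components_, 2))
```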
Number of methods:

There are 8 factoring methods, including principal components.

Principal axis: accounts for correlations between the variables.

Unweighted least squares: minimizes the residual between the observed and the reproduced correlation matrix.

Generalized least squares: similar to unweighted least squares but gives more weight to the variables with stronger correlations.

Maximum likelihood: generates the solution that is the most likely to have produced the correlation matrix.

Alpha factoring: considers the variables as a sample; does not use factor loadings.

Image factoring: decomposes the variables into a common part and a unique part, then works with the common part.