
Industrial Statistics

MS3001 Advanced Marketing Research

Faculty of Science University of Colombo

Application of Multivariate Statistical Methods in Marketing Research

Session 3 Factor Analysis

Factor Analysis
Analysis of Interdependence: for data reduction and the discovery of underlying themes in the data

Factor Analysis (FA)

The term "factor analysis" was first introduced by Thurstone (1931).
Factor analysis tries to simplify attitudinal data by providing an alternative way of looking at it:
What are the main underlying themes in the data?
Which perceptions are related?

FA is based on analysing the correlation matrix of the attributes and aims to identify questions that measure what respondents see as similar or related concepts.
Essentially, factor analysis is applied as a data reduction or structure detection method.

Illustration
Can a set of 30 imagery statements for the shampoo category be simplified without any loss of information?
There seem to be as many as 42 purchase decision criteria for my category. Can you help summarize these criteria?
What would Customer Service Orientation constitute? What variables? Can the variables be grouped into themes / dimensions?
I want to deploy an objectively tested, valid scale for my Customer Satisfaction studies. My team has developed a huge battery of statements. Can you help?

Factor Analysis
Investigates interrelationships among variables.
Variable reduction exercise: reduces the variables to a smaller set of factors with minimal loss of information.
Used to define or discover themes or underlying (latent) dimensions of a large set of attributes / variables.
Often an intermediate step to some other procedure, e.g. factors are used as independent variables in Multiple Regression.
Interdependence technique: no variable is designated dependent or independent.
All variables should be metric (interval).
Large samples are preferred.
The worth of the solution often depends on the intuitive interpretability of the factors rather than on statistical rules.

Principle behind Factor Analysis

To understand the principle behind factor analysis, let us consider just two variables from a study measuring people's satisfaction with their lives:
1. How satisfied are they with their hobbies?
2. How intensely are they pursuing their hobbies?

One can summarize the correlation between the two variables in a scatter plot. A regression line can then be fitted that represents the best summary of the linear relationship between the two variables.

Regression between two variables

[Figure: Simple Regression. Scatter plot with fitted line. Axes: Satisfaction with Hobbies and Intensity of pursuing Hobby, both on 7-point scales]

Combining two variables in Single Factor

If we could define a variable representing the regression line, then that variable would capture the essence of the two variables.
In a sense, we have reduced two variables to one factor.

The new factor is a linear combination of the two given variables.

Combining two correlated variables into one factor illustrates the basic idea of Factor Analysis or Principal Component Analysis (PCA).
If we extend the two-variable example to multiple variables, the computations become more involved, but the basic principle of representing two or more variables by a single factor remains the same.

Orthogonal Factors
After we have found the line on which the variance is maximal, there remains variability around this line.
We continue and define another line that maximizes the remaining variability.
In this manner, consecutive factors are extracted.
Because each factor is defined to maximize the variability that is not captured by the preceding factor, consecutive factors are independent of each other.

Put another way, consecutive factors are uncorrelated, or ORTHOGONAL, to each other.
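
The two-variable idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated (the names `satisfaction` and `intensity` are hypothetical stand-ins for the two hobby questions), and the components are obtained from an eigen-decomposition of the correlation matrix using NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two positively correlated ratings for 200 respondents.
satisfaction = rng.normal(size=200)
intensity = 0.8 * satisfaction + 0.6 * rng.normal(size=200)
X = np.column_stack([satisfaction, intensity])

# Standardize, then decompose the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(R)

# eigh returns ascending order; put the largest-variance component first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Each component score is a linear combination of the standardized variables.
scores = Z @ eigenvectors

share_first = eigenvalues[0] / eigenvalues.sum()
score_corr = np.corrcoef(scores, rowvar=False)[0, 1]
print(f"Variance captured by the first component: {share_first:.1%}")
print(f"Correlation between the two component scores: {score_corr:.6f}")
```

The first component plays the role of the single factor that summarizes both variables, and the near-zero correlation between the two score columns is what "orthogonal" means in practice.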

Factor Analysis: Example 1

Customers were asked to rate bus travel on a number of attributes on a 10-point scale:
1 = Doesn't describe bus travel at all
10 = Totally describes bus travel

Attributes: Relaxed, Friendly, Nervous, Tolerate it, Easy, Interesting, Uncertain, Waste of time

Which statements did they rate similarly, i.e. which statements are correlated? These point to common themes in the data.

Factor Analysis: Example 1 Statements

[Figure: Correlations, grouped. Statements: Q1 Relaxed, Q2 Friendly, Q3 Nervous, Q4 Tolerate it, Q5 Easy, Q6 Interesting, Q7 Uncertain, Q8 Waste of time]

Factor Analysis: Example 1 Component Matrix

Loadings = correlation between statements and factor

Statement            Component 1   Component 2
Q2 Friendly             0.823
Q1 Relaxed              0.803
Q6 Interesting          0.732
Q5 Easy                 0.725
Q4 Tolerate it          0.456
Q7 Uncertain                          0.767
Q3 Nervous                            0.697
Q8 Waste of time                      0.691

Small cross-loadings also appear in the matrix (-0.186, -0.265, 0.253, -0.144); their rows are not identifiable here.

The first four statements load mainly on the first factor: positive about bus travel.
The other four load on the second factor: negative about bus travel.
"Tolerate it" loads on both.

Factor Analysis: Example 1 Component Matrix

Loadings = correlation between statements and factor (small loadings suppressed)

Statement            Component 1   Component 2
Q2 Friendly             0.82
Q1 Relaxed              0.80
Q6 Interesting          0.73
Q5 Easy                 0.72
Q4 Tolerate it          0.45
Q7 Uncertain                          0.77
Q3 Nervous                            0.70
Q8 Waste of time                      0.70

The first four statements load mainly on the first factor: positive about bus travel.
The other four load on the second factor: negative about bus travel.
"Tolerate it" loads on both.

Principal components analysis (PCA):
The most common form of factor analysis, PCA seeks a linear combination of variables such that the maximum variance is extracted from the variables. It then removes this variance and seeks a second linear combination which explains the maximum proportion of the remaining variance, and so on. This is called the principal axis method and results in orthogonal (uncorrelated) factors. PCA analyzes total (common and unique) variance.

[Figure: Correlations, grouped. Statement order: Q1 Relaxed, Q2 Friendly, Q5 Easy, Q6 Interesting, Q4 Tolerate it, Q3 Nervous, Q7 Uncertain, Q8 Waste of time]

FA: Example 2, How much variance do the factors explain?

Total Variance Explained

            Initial Eigenvalues          Extraction Sums of            Rotation Sums of
                                         Squared Loadings              Squared Loadings
Component   Total   % of Var   Cum %     Total   % of Var   Cum %      Total   % of Var   Cum %
 1          5.459   49.628     49.628    5.459   49.628     49.628     2.894   26.312     26.312
 2          1.249   11.359     60.986    1.249   11.359     60.986     2.634   23.948     50.260
 3           .900    8.179     69.165     .900    8.179     69.165     2.080   18.905     69.165
 4           .830    7.546     76.711
 5           .631    5.736     82.448
 6           .478    4.348     86.795
 7           .431    3.917     90.713
 8           .353    3.208     93.921
 9           .295    2.682     96.603
10           .204    1.850     98.453
11           .170    1.547    100.000

Extraction Method: Principal Component Analysis.

How much of the total variation in the data is explained by the factors?
The factors should explain at least 2/3 of the variance. In this data, the first three factors explain 69% of the variance.

FA: Example 2: Identifying factors from the Factor loadings

Rotated Component Matrix(a)

                                                                                 Component
Attribute                                                                        1       2       3
Offers value-for-money products and services                                   .865    .257   -.006
Offers wide range of products and services to suit different needs             .836    .101    .192
Progressive and provides innovative insurance solutions                        .741    .197    .432
Has expertise in providing insurance solutions                                 .657    .326    .267
A reputable insurance provider/company                                         .251    .849    .086
An insurance company I can trust                                               .187    .809    .208
Global insurance company                                                       .425    .593    .283
An insurance company with financial strength                                   .074    .575    .458
One of the insurance companies that I would first recommend to my customers    .172    .086    .821
Has strong working relationships with its distributors/intermediaries          .200    .342    .689
Established local insurance company                                            .334    .481    .543

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.

Review factor loadings to decipher the factors. The factor loadings are the correlations between the factor and the attribute.
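
As a rough illustration of reading a loading matrix, the sketch below assigns each attribute to the factor on which it has the largest absolute loading, using the rotated loadings from this example. The helper name `dominant_factor` is hypothetical, and "largest absolute loading" is only one simple reading rule, not a substitute for judgment about what the factors mean.

```python
# Rotated loadings from Example 2: attribute -> (factor 1, factor 2, factor 3).
loadings = {
    "Offers value-for-money products and services": (0.865, 0.257, -0.006),
    "Offers wide range of products and services": (0.836, 0.101, 0.192),
    "Progressive, innovative insurance solutions": (0.741, 0.197, 0.432),
    "Has expertise in providing insurance solutions": (0.657, 0.326, 0.267),
    "A reputable insurance provider/company": (0.251, 0.849, 0.086),
    "An insurance company I can trust": (0.187, 0.809, 0.208),
    "Global insurance company": (0.425, 0.593, 0.283),
    "An insurance company with financial strength": (0.074, 0.575, 0.458),
    "Would first recommend to my customers": (0.172, 0.086, 0.821),
    "Strong working relationships with distributors": (0.200, 0.342, 0.689),
    "Established local insurance company": (0.334, 0.481, 0.543),
}


def dominant_factor(row):
    """Return the 1-based index of the factor with the largest absolute loading."""
    return max(range(len(row)), key=lambda j: abs(row[j])) + 1


assignment = {name: dominant_factor(row) for name, row in loadings.items()}

for factor in (1, 2, 3):
    print(f"Factor {factor}:")
    for name, assigned in assignment.items():
        if assigned == factor:
            print(f"  {name}")
```

Attributes whose second-largest loading is close to the largest (for example "Established local insurance company") are the ones worth discussing before naming the factors.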

Example (Naming exercise - Identifying factors)

Rotated Component Matrix(a)

                                                                                 Component
Attribute                                                                        1       2       3
Factor 1: Practical solutions
Offers value-for-money products and services                                   .865    .257   -.006
Offers wide range of products and services to suit different needs             .836    .101    .192
Progressive and provides innovative insurance solutions                        .741    .197    .432
Has expertise in providing insurance solutions                                 .657    .326    .267
Factor 2: Reputation
A reputable insurance provider/company                                         .251    .849    .086
An insurance company I can trust                                               .187    .809    .208
Global insurance company                                                       .425    .593    .283
An insurance company with financial strength                                   .074    .575    .458
Factor 3: Distribution / established
One of the insurance companies that I would first recommend to my customers    .172    .086    .821
Has strong working relationships with its distributors/intermediaries          .200    .342    .689
Established local insurance company                                            .334    .481    .543

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.

A three-factor solution is selected for these data:
1. Practical solutions
2. Reputation
3. Distribution / how well established

Key Concepts
Eigenvalue
Also called a characteristic root. The eigenvalue for a given factor measures the variance in all the variables which is accounted for by that factor. The ratio of eigenvalues is the ratio of explanatory importance of the factors with respect to the variables. If a factor has a low eigenvalue, then it is contributing little to the explanation of variances in the variables and may be ignored as redundant with more important factors.

Variance Explained
To get the percent of variance in all the variables accounted for by each factor, add the sum of the squared factor loadings for that factor (column) and divide by the number of variables. (Note the number of variables equals the sum of their variances as the variance of a standardized variable is 1.) This is the same as dividing the factor's eigenvalue by the number of variables.
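
The arithmetic can be checked against the rotated loadings of Example 2. The sketch below sums the squared loadings in each column and divides by the number of variables; the variable names are illustrative, and small differences from the SPSS table (2.894, 2.634, 2.080) are expected because the loadings are rounded to three decimals.

```python
# Rotated loadings from Example 2, one row per attribute: (factor 1, factor 2, factor 3).
rotated_loadings = [
    (0.865, 0.257, -0.006),
    (0.836, 0.101, 0.192),
    (0.741, 0.197, 0.432),
    (0.657, 0.326, 0.267),
    (0.251, 0.849, 0.086),
    (0.187, 0.809, 0.208),
    (0.425, 0.593, 0.283),
    (0.074, 0.575, 0.458),
    (0.172, 0.086, 0.821),
    (0.200, 0.342, 0.689),
    (0.334, 0.481, 0.543),
]

n_variables = len(rotated_loadings)
n_factors = len(rotated_loadings[0])

# Sum of squared loadings per factor (column).
sum_sq = [sum(row[j] ** 2 for row in rotated_loadings) for j in range(n_factors)]

# Standardized variables each have variance 1, so total variance = n_variables.
percent = [100 * s / n_variables for s in sum_sq]

for j in range(n_factors):
    print(f"Factor {j + 1}: sum of squared loadings {sum_sq[j]:.3f}, "
          f"{percent[j]:.1f}% of variance")
print(f"Total explained: {sum(percent):.1f}%")
```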

Types of Rotation
Varimax rotation is an orthogonal rotation of the factor axes to maximize the variance of the squared loadings of a factor (column) on all the variables (rows) in a factor matrix, which has the effect of differentiating the original variables by extracted factor. Each factor will tend to have either large or small loadings of any particular variable. A varimax solution yields results which make it as easy as possible to identify each variable with a single factor. This is the most common rotation option.

Promax rotation is an alternative non-orthogonal (oblique) rotation method which is computationally faster than the direct oblimin method and therefore is sometimes used for very large datasets.
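
For readers who want to see what an orthogonal rotation does numerically, here is a minimal sketch of a commonly used SVD-based iteration for raw varimax (without Kaiser normalization, so it will not reproduce SPSS output exactly). The function names `varimax` and `varimax_criterion`, the iteration limits, and the toy loading matrix are all assumptions made for illustration.

```python
import numpy as np


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return np.var(loadings ** 2, axis=0).sum()


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Proportional to the gradient of the varimax criterion.
        gradient = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / p
        u, s, vt = np.linalg.svd(loadings.T @ gradient)
        rotation = u @ vt  # closest orthogonal matrix to loadings.T @ gradient
        current = s.sum()
        if previous != 0.0 and current < previous * (1 + tol):
            break
        previous = current
    return loadings @ rotation, rotation


# Toy example: a clean two-factor structure, deliberately turned by 30 degrees.
simple = np.array([[0.80, 0.0], [0.70, 0.0], [0.75, 0.0],
                   [0.0, 0.60], [0.0, 0.90], [0.0, 0.70]])
theta = np.deg2rad(30)
turn = np.array([[np.cos(theta), np.sin(theta)],
                 [-np.sin(theta), np.cos(theta)]])
unrotated = simple @ turn

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
print("criterion before:", round(varimax_criterion(unrotated), 4))
print("criterion after: ", round(varimax_criterion(rotated), 4))
```

Because the rotation matrix is orthogonal, each variable's communality (row sum of squared loadings) is unchanged; only the way the explained variance is split across factors changes.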

How many Factors?

There are two common criteria for choosing the number of factors.
The Kaiser Criterion
We retain only factors with an eigenvalue greater than 1. In essence, this is like saying that unless a factor extracts at least as much as the equivalent of one original variable, we drop it.
The Scree Test
A graphical method is the scree test. We plot the eigenvalues in a simple line plot and look for the place where the smooth decrease of eigenvalues appears to level off towards the right of the plot. To the right of this point, presumably, one finds only "factorial scree" ("scree" is the geological term for the debris which collects on the lower part of a rocky slope).
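
A short sketch of how these rules play out on the eigenvalues from Example 2. The variable names are illustrative, and the "at least 2/3 of the variance" threshold is the guideline quoted in that example; the printed drops between consecutive eigenvalues are a crude numerical stand-in for eyeballing a scree plot.

```python
# Initial eigenvalues from the Total Variance Explained table in Example 2.
eigenvalues = [5.459, 1.249, 0.900, 0.830, 0.631, 0.478,
               0.431, 0.353, 0.295, 0.204, 0.170]

# Kaiser criterion: keep factors whose eigenvalue exceeds 1.
n_kaiser = sum(1 for ev in eigenvalues if ev > 1.0)

# Cumulative percentage of variance explained.
total = sum(eigenvalues)
cumulative = []
running = 0.0
for ev in eigenvalues:
    running += ev
    cumulative.append(100 * running / total)

# Smallest number of factors explaining at least two thirds of the variance.
n_two_thirds = next(i + 1 for i, c in enumerate(cumulative) if c >= 100 * 2 / 3)

# Drops between consecutive eigenvalues: large drops mark the steep slope.
drops = [eigenvalues[i] - eigenvalues[i + 1] for i in range(len(eigenvalues) - 1)]

print("Kaiser criterion keeps:", n_kaiser)
print("Factors needed for 2/3 of variance:", n_two_thirds)
print("Drops:", [round(d, 3) for d in drops])
```

On these data the two rules disagree (two factors versus three), which is one reason the final choice also rests on interpretability.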

Scree Plot Criteria

The scree plot helps you to determine the optimal number of components. The eigenvalue of each component in the initial solution is plotted. Generally the components on the steep slope are extracted

[Figure: Scree Plot. x-axis: Component Number; y-axis: Eigenvalue]

Statistics to look at
KMO (Kaiser-Meyer-Olkin measure): should be more than 0.5
Tells whether the partial correlations between variables are small or large.
Ranges from 0 to 1 and should be close to 1. Below 0.5 implies factor analysis won't be useful.

Bartlett's Test of Sphericity: should be significant
Tells whether the variables are correlated or not.
The Bartlett significance level should be very small, say below 0.05, and NOT above 0.10.
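
Both statistics can be computed directly from a correlation matrix. The sketch below uses the standard textbook formulas (overall KMO from squared correlations and squared partial correlations; Bartlett's chi-square from the determinant of the correlation matrix) and assumes NumPy and SciPy are available. The function names and the simulated one-factor data are hypothetical; SPSS reports the same quantities in its KMO and Bartlett's Test table.

```python
import numpy as np
from scipy.stats import chi2


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale  # partial correlations (off-diagonal entries)
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = (corr[off_diag] ** 2).sum()
    q2 = (partial[off_diag] ** 2).sum()
    return r2 / (r2 + q2)


def bartlett_sphericity(corr, n):
    """Chi-square statistic, degrees of freedom and p-value for H0: corr = identity."""
    p = corr.shape[0]
    statistic = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(corr))
    dof = p * (p - 1) / 2
    return statistic, dof, chi2.sf(statistic, dof)


# Hypothetical data: 300 respondents, 6 items driven by one common factor.
rng = np.random.default_rng(1)
common = rng.normal(size=(300, 1))
items = 0.7 * common + 0.7 * rng.normal(size=(300, 6))
corr = np.corrcoef(items, rowvar=False)

kmo_value = kmo(corr)
statistic, dof, p_value = bartlett_sphericity(corr, n=300)
print(f"KMO = {kmo_value:.3f}")
print(f"Bartlett chi-square = {statistic:.1f}, df = {dof:.0f}, p = {p_value:.3g}")
```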

Factor analysis considerations

Choosing the number of factors is an art as much as a science
Usual practice is to run several alternative analyses.
The collaborative judgment of researcher and analyst is important to generate a solution that provides a plausible explanation and interpretation of the factors.

One must achieve a balance between, on the one hand, having enough factors to explain the variation in the original data satisfactorily and, on the other, not having so many factors that little or no data reduction has been achieved.
Look for at least 65-70% of variance explained with scale data, but 50% or more with binary data.

How big a sample is needed?
The larger the sample size, the more accurately we can estimate the correlations between questions and the more repeatable the analysis will be.
A sample of 400 or more should provide a stable factor analysis.
Minimum sample size of about 200.

What types of scales work best?

Preferably interval data, as the correlations are better estimated (a 5- or 7-point Likert agree/disagree scale is actually ordinal data but is treated as interval).
Binary (yes/no) variables often have lower correlations.

Question
Factor - respondent vs response ?
Factor vs Cluster ?

Factor Analysis - Summary

Summarises large amounts of data
Identifies patterns that can otherwise be hard to find
By basing factors on data patterns, the analysis is based on actual results, not preconceptions or questionnaire issues
Used in conjunction with multiple linear regression (MLR)

But....
Not all variability in the data is usually accounted for in factor analysis
Factors can be hard to interpret, as they represent many measures
Factors depend on the data, and can differ for different sets of data

What it does
Identifies underlying families of parameters that are highly correlated; each family represents a different factor.
Helps reduce data from a large number of parameters to a small number of factors.
Produces a set of independent variables to be used for further analysis.

Examples of application
What are the main characteristics on which consumers form brand images in their minds?
What are the main service aspects retailers consider when evaluating overall satisfaction with the service provided by a supplier?
Which 10 or 15 attributes should I finalize for measurement from a list of 30 attributes?

Factor Analysis Quick Reference Guide

Requirements
Type of scales: Interval
Free association data (ordinal type) could be treated as binary interval data.
Some binary nominal scales (involving opposites) could be treated as interval.
Exclude variables with very low variance, if any.
If a pair of variables has a very high correlation, keep one and exclude the other.

Outputs and how to interpret

Factor loading: correlation between factor and standardized variable. A higher loading means more weight of the variable in the factor.
Use Promax rotation to get correlated factors; for exploratory understanding.
Use Varimax rotation when uncorrelated factors are required; to be used for regression and cluster analysis.
Remove variables which load on multiple factors for cleaner solutions, especially for regression.

Guidelines
Number of factors, based on:
Total variance explained above 60%
Eigenvalue above 1
Number of variables divided by 3 or 4
KMO measure: (0-1) should be close to 1. Below 0.5 implies factor analysis won't be useful.
Bartlett significance level should be very small, say below 0.05, and NOT above 0.10.

How to use SPSS

Analyze > Data Reduction > Factor
Extraction > Method > Principal components (tick Scree plot)
Rotation > Method > Varimax
Eigenvalues over 1, or number of factors = ?
Descriptives > KMO and Bartlett's test of sphericity
Scores > Save as variables > Method: Regression
Options > Missing values > Exclude cases pairwise