
Welcome to PowerPoint slides
for
Chapter 13: Factor Analysis for Data Reduction

Marketing Research: Text and Cases
Slide 1
Introduction

1. Factor Analysis is a set of techniques used for
understanding variables by grouping them into
“factors” consisting of similar variables

2. It can also be used to confirm whether a
hypothesized set of variables groups into a factor or
not

3. It is most useful when a large number of variables
needs to be reduced to a smaller set of “factors” that
contain most of the variance of the original variables

4. Generally, Factor Analysis is done in two stages,
called
• Extraction of Factors and
• Rotation of the Solution obtained in stage 1

5. Factor Analysis is best performed with interval or
ratio-scaled variables
Slide 2
Application Areas/Example

1. In marketing research, a common application area of
Factor Analysis is to understand the underlying motives of
consumers who buy a product category or a brand

2. The worked-out example in the chapter will help clarify
the use of Factor Analysis in Marketing Research

3. In this example, we assume that a two-wheeler
manufacturer is interested in determining which variables his
potential customers think about when they consider his
product

4. Let us assume that twenty two-wheeler owners were
surveyed by this manufacturer (or by a marketing research
company on his behalf). They were asked to indicate on a
seven-point scale (1 = Completely Agree, 7 = Completely
Disagree) their agreement or disagreement with a set of ten
statements relating to their perceptions and some attributes of
the two-wheelers.

5. The objective of doing Factor Analysis is to find
underlying "factors" which would be fewer than 10 in
number, but would be linear combinations of some of the
original 10 variables
Slide 3
The research design for data collection can be stated as
follows:

Twenty 2-wheeler users were surveyed about their
perceptions and image attributes of the vehicles they
owned. Each of them was asked ten questions, all
answered on a scale of 1 to 7 (1 = completely agree,
7 = completely disagree).

1. I use a 2-wheeler because it is affordable.
2. It gives me a sense of freedom to own a 2-wheeler.
3. Low maintenance cost makes a 2-wheeler very
economical in the long run.
4. A 2-wheeler is essentially a man’s vehicle.
5. I feel very powerful when I am on my 2-wheeler.
6. Some of my friends who don’t have their own
vehicle are jealous of me.
7. I feel good whenever I see the ad for my 2-wheeler on
T.V., in a magazine or on a hoarding.
8. My vehicle gives me a comfortable ride.
9. I think 2-wheelers are a safe way to travel.
10. Three people should be legally allowed to travel
on a 2-wheeler.
Slide 4

The input data containing responses of twenty respondents
to the 10 statements are in Appendix 1, in the form of a
20-row by 10-column matrix (reproduced below).

                 QUESTION NO.
S. No.    1   2   3   4   5   6   7   8   9  10
   1      1   4   1   6   5   6   5   2   3   2
   2      2   3   2   4   3   3   3   5   5   2
   3      2   2   2   1   2   1   1   7   6   2
   4      5   1   4   2   2   2   2   3   2   3
   5      1   2   2   5   4   4   4   1   1   2
   6      3   2   3   3   3   3   3   6   5   3
   7      2   2   5   1   2   1   2   4   4   5
   8      4   4   3   4   4   5   3   2   3   3
   9      2   3   2   6   5   6   5   1   4   1
  10      1   4   2   2   1   2   1   4   4   1

Table contd on next slide...


Slide 4 contd
                 QUESTION NO.
S. No.    1   2   3   4   5   6   7   8   9  10
  11      1   5   1   3   2   3   2   2   2   1
  12      1   6   1   1   1   1   1   1   2   2
  13      3   1   4   4   4   3   3   6   5   3
  14      2   2   2   2   2   2   2   1   3   2
  15      2   5   1   3   2   3   2   2   1   6
  16      5   6   3   2   1   3   2   5   5   4
  17      1   4   2   2   1   2   1   1   1   3
  18      2   3   1   1   2   2   2   3   2   2
  19      3   3   2   3   4   3   4   3   3   3
  20      4   3   2   7   6   6   6   2   3   6
Slide 5

The data are subjected to Factor Analysis in two stages
(although there are two stages, both outputs can be requested
at the same time, at least in SPSS, by the process described
in the SPSS Commands Appendix to the chapter).
1. In stage 1, we request the software package used (SPSS,
Statistica, etc.) to EXTRACT factors with an eigenvalue
of 1 or higher. The method requested is PRINCIPAL
COMPONENTS. This gives us the output in Figs. 2 and 3.
2. In stage 2, we request ROTATION of the solution obtained
in stage 1.

Fig. 2: Factor/Component Matrix (Unrotated)

            Factor 1   Factor 2   Factor 3
VAR00001     .17581     .66967     .49301
VAR00002    -.13577    -.60774     .25369
VAR00003    -.10651     .81955     .21827
VAR00004     .96647    -.03627    -.09745
VAR00005     .95098     .16594    -.13593
VAR00006     .95184    -.08442    -.02522
VAR00007     .97128     .09591    -.04636
VAR00008    -.32171     .77498    -.03757
VAR00009    -.06890     .73502    -.48213
VAR00010     .16143     .31862    -.81356
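
The two stages described on this slide can be sketched in Python with NumPy. This is only an illustration, not the SPSS procedure from the chapter: it assumes that principal components are extracted from the correlation matrix of the Slide 4 data, and it assumes a varimax rotation (without Kaiser normalization), since the slides do not name the rotation method. The helper name `varimax` is hypothetical. The signs of the computed loadings are arbitrary, and the numbers need not match the printed figures exactly.

```python
import numpy as np

# Responses of the 20 respondents (rows) to the 10 statements (columns),
# copied from the table on Slide 4.
X = np.array([
    [1, 4, 1, 6, 5, 6, 5, 2, 3, 2], [2, 3, 2, 4, 3, 3, 3, 5, 5, 2],
    [2, 2, 2, 1, 2, 1, 1, 7, 6, 2], [5, 1, 4, 2, 2, 2, 2, 3, 2, 3],
    [1, 2, 2, 5, 4, 4, 4, 1, 1, 2], [3, 2, 3, 3, 3, 3, 3, 6, 5, 3],
    [2, 2, 5, 1, 2, 1, 2, 4, 4, 5], [4, 4, 3, 4, 4, 5, 3, 2, 3, 3],
    [2, 3, 2, 6, 5, 6, 5, 1, 4, 1], [1, 4, 2, 2, 1, 2, 1, 4, 4, 1],
    [1, 5, 1, 3, 2, 3, 2, 2, 2, 1], [1, 6, 1, 1, 1, 1, 1, 1, 2, 2],
    [3, 1, 4, 4, 4, 3, 3, 6, 5, 3], [2, 2, 2, 2, 2, 2, 2, 1, 3, 2],
    [2, 5, 1, 3, 2, 3, 2, 2, 1, 6], [5, 6, 3, 2, 1, 3, 2, 5, 5, 4],
    [1, 4, 2, 2, 1, 2, 1, 1, 1, 3], [2, 3, 1, 1, 2, 2, 2, 3, 2, 2],
    [3, 3, 2, 3, 4, 3, 4, 3, 3, 3], [4, 3, 2, 7, 6, 6, 6, 2, 3, 6],
], dtype=float)

# Stage 1: principal components of the 10 x 10 correlation matrix.
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)          # eigh returns ascending order
order = np.argsort(eigvals)[::-1]             # re-sort in descending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

keep = eigvals >= 1.0                         # retain eigenvalues of 1 or more
loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])


def varimax(L, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation (assumed method; hypothetical helper)."""
    p, k = L.shape
    rot = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        Lr = L @ rot
        grad = L.T @ (Lr**3 - Lr @ np.diag((Lr**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        rot = u @ vt                          # nearest orthogonal matrix
        if s.sum() < criterion * (1 + tol):   # stop when no further gain
            break
        criterion = s.sum()
    return L @ rot, rot


# Stage 2: rotate the retained loadings.
rotated, rotation = varimax(loadings)

print("Eigenvalues:", np.round(eigvals, 5))
print("Factors retained:", int(keep.sum()))
print("Unrotated loadings:\n", np.round(loadings, 5))
print("Rotated loadings:\n", np.round(rotated, 5))
```

Because the rotation is orthogonal, it redistributes variance among the retained factors without changing the communality of any variable.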
Slide 6

Interpretation of the Output

1. The first step in interpreting the output is to look at the
factors extracted, their eigenvalues and the cumulative
percentage of variance (Fig. 3, reproduced below).

Fig. 3: Total Variance Explained

Variable   Communality  *  Factor  Eigenvalue  Pct of Var  Cum Pct
VAR00001     .72243     *    1      3.88282       38.8       38.8
VAR00002     .45214     *    2      2.77701       27.8       66.6
VAR00003     .73056     *    3      1.37475       13.7       80.3
VAR00004     .94488     *
VAR00005     .95038     *
VAR00006     .91376     *
VAR00007     .95474     *
VAR00008     .79869     *
VAR00009     .77745     *
VAR00010     .78946     *
Slide 6 contd...

1. We note that three factors have been extracted,
based on our criterion that only factors with
eigenvalues of 1 or more should be extracted. We see
from the Cum. Pct. (Cumulative Percentage of
Variance Explained) column in Fig. 3 that the
three factors extracted together account for 80.3
percent of the total variance (the information
contained in the original ten variables). This is a
pretty good bargain, because we are able to
economise on the number of variables (from 10
we have reduced them to 3 underlying factors),
while losing only about 20 percent of the
information content (80 percent is retained by the
3 factors extracted out of the 10 original
variables).
2. This represents a reasonably good solution for our
problem.
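
The percentages quoted above follow from the eigenvalues. With standardized variables, each of the 10 variables contributes one unit of variance, so a factor's share is its eigenvalue divided by 10. A minimal sketch, using the three eigenvalues from Fig. 3 (the variable names are illustrative):

```python
# Eigenvalues of the three retained factors, from Fig. 3.
eigenvalues = [3.88282, 2.77701, 1.37475]
n_variables = 10  # total variance of 10 standardized variables

pct_of_var = [100 * e / n_variables for e in eigenvalues]

cum_pct = []
running = 0.0
for pct in pct_of_var:
    running += pct
    cum_pct.append(running)

for i, (pct, cum) in enumerate(zip(pct_of_var, cum_pct), start=1):
    print(f"Factor {i}: {pct:.1f}% of variance, {cum:.1f}% cumulative")
```

Rounded to one decimal place, this reproduces the Pct of Var column (38.8, 27.8, 13.7) and the Cum Pct column (38.8, 66.6, 80.3) of Fig. 3.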
Slide 7

1. Now, we try to interpret what these 3 extracted
factors represent. This we can accomplish by looking
at Figs. 4 and 2, the rotated and unrotated factor
matrices.

Fig. 4: Rotated Component Matrix

            Factor 1   Factor 2   Factor 3
VAR00001     .13402     .34749     .76402
VAR00002    -.18143    -.64300    -.07596
VAR00003    -.10944     .62985     .56742
VAR00004     .96986    -.06383    -.01338
VAR00005     .96455     .13362     .04660
VAR00006     .94544    -.13868     .02600
VAR00007     .97214     .02862     .09411
VAR00008    -.26169     .85203     .06517
VAR00009     .00891     .87772    -.08347
VAR00010     .07209    -.10990     .87874
Slide 7 contd...

1. Looking at Fig. 4, the rotated factor matrix, we
notice that variable nos. 4, 5, 6 and 7 have
loadings of 0.96986, 0.96455, 0.94544 and
0.97214 on factor 1 (we look down the Factor 1
column in Fig. 4, and look for high loadings close
to 1.00). This suggests that factor 1 is a
combination of these four original variables. Fig.
2 also suggests a similar grouping. Therefore,
there is no problem interpreting factor 1 as a
combination of “a man’s vehicle” (statement in
variable 4), “feeling of power” (variable 5),
“others are jealous of me” (variable 6) and “feel
good when I see my 2-wheeler ads” (variable 7).

2. At this point, the researcher’s task is to find a
suitable phrase which captures the essence of the
original variables which form the underlying
concept or “factor”. In this case, factor 1 could be
named “male ego”, or “machismo”, or “pride of
ownership” or something similar. With the same
mathematical output, the interpretations of different
researchers may differ.
Slide 8

1. Now we will attempt to interpret factor 2. We look
in Fig. 4, down the column for Factor 2, and find that
variables 8 and 9 have high loadings of 0.85203 and
0.87772, respectively. This indicates that factor 2 is a
combination of these two variables.

2. But if we look at Fig. 2, the unrotated factor matrix,
a slightly different picture emerges. Here, variable 3
also has a high loading on factor 2, along with
variables 8 and 9. It is left to the researcher to decide
which interpretation to use, as there are no hard and
fast rules. Assuming we decide to use all three
variables, the related statements are “low
maintenance”, “comfort” and “safety” (from
statements 3, 8 and 9). We may combine these
variables into a factor called “utility” or “functional
features” or any other similar word or phrase which
captures the essence of these three statements /
variables.
Slide 8 contd...

3. For interpreting factor 3, we look at the column labelled
Factor 3 in Fig. 4 and find that variables 1 and 10 load
high on factor 3. According to the unrotated factor matrix of
Fig. 2, only variable 10 loads high on factor 3. Supposing we
stick to Fig. 4, the combination of “affordability” and
“cost saving by 3 people legally riding on a 2-wheeler” gives
the impression that factor 3 could be “economy” or “low
cost”.

4. We have now completed the interpretation of the 3 factors
with eigenvalues of 1 or more. We will now look at some
additional issues which may be of importance in using factor
analysis.
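
The grouping of variables read off the rotated matrix can be sketched as a simple filter on the loadings. The code below copies the values from Fig. 4 and assumes a cutoff of 0.7 for a "high" loading. That cutoff is an illustrative choice, not a rule given in the chapter.

```python
import numpy as np

# Rotated loadings copied from Fig. 4 (rows: VAR00001 to VAR00010).
rotated = np.array([
    [ .13402,  .34749,  .76402],
    [-.18143, -.64300, -.07596],
    [-.10944,  .62985,  .56742],
    [ .96986, -.06383, -.01338],
    [ .96455,  .13362,  .04660],
    [ .94544, -.13868,  .02600],
    [ .97214,  .02862,  .09411],
    [-.26169,  .85203,  .06517],
    [ .00891,  .87772, -.08347],
    [ .07209, -.10990,  .87874],
])

CUTOFF = 0.7  # assumed threshold for a "high" loading

# For each factor, collect the 1-based numbers of the variables whose
# absolute loading reaches the cutoff.
groups = {}
for factor in range(rotated.shape[1]):
    high = np.flatnonzero(np.abs(rotated[:, factor]) >= CUTOFF)
    groups[factor + 1] = [int(i) + 1 for i in high]

for factor, variables in groups.items():
    print(f"Factor {factor}: variables {variables}")
```

With this cutoff the filter returns variables 4, 5, 6 and 7 for factor 1, variables 8 and 9 for factor 2, and variables 1 and 10 for factor 3, which is the grouping used in the interpretation above.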
Slide 9

Additional Issues in Interpreting Solutions

1. We must guard against the possibility that a
variable may load highly on more than one
factor. Strictly speaking, a variable should load
close to 1.00 on one and only one factor, and load
close to 0 on the other factors. If this is not the
case, it indicates that either the sample of
respondents has more than one opinion about the
variable, or that the question / variable may be
unclear in its phrasing.

2. The other issue important in the practical use of
factor analysis is the answer to the question “what
should be considered a high loading and what is
not a high loading?” Here, unfortunately, there is
no clear-cut guideline, and many a time, we must
look at relative values in the factor matrix.
Sometimes, 0.7 may be treated as a high value,
while sometimes 0.9 could be the cutoff for high
values.
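
Both issues can be checked mechanically. The sketch below uses the rotated loadings from Fig. 4 and a hypothetical helper, `loading_report`, which lists the variables that load high on more than one factor and those that load high on none. The cutoffs of 0.5, 0.7 and 0.9 are illustrative choices.

```python
import numpy as np

# Rotated loadings copied from Fig. 4 (rows: VAR00001 to VAR00010).
rotated = np.array([
    [ .13402,  .34749,  .76402],
    [-.18143, -.64300, -.07596],
    [-.10944,  .62985,  .56742],
    [ .96986, -.06383, -.01338],
    [ .96455,  .13362,  .04660],
    [ .94544, -.13868,  .02600],
    [ .97214,  .02862,  .09411],
    [-.26169,  .85203,  .06517],
    [ .00891,  .87772, -.08347],
    [ .07209, -.10990,  .87874],
])


def loading_report(loadings, cutoff):
    """Return (cross-loading variables, variables with no high loading)."""
    n_high = (np.abs(loadings) >= cutoff).sum(axis=1)
    cross = [int(i) + 1 for i in np.flatnonzero(n_high > 1)]
    unassigned = [int(i) + 1 for i in np.flatnonzero(n_high == 0)]
    return cross, unassigned


for cutoff in (0.5, 0.7, 0.9):
    cross, unassigned = loading_report(rotated, cutoff)
    print(f"cutoff {cutoff}: cross-loading {cross}, no high loading {unassigned}")
```

At a cutoff of 0.5, variable 3 loads on both factor 2 and factor 3. At 0.7, variables 2 and 3 no longer load high on any factor, and at 0.9 only variables 4 to 7 remain assigned. The choice of cutoff therefore changes the interpretation.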
Slide 9…contd…
Additional Issues (Contd.)

1. The proportion of variance in any one of the original
variables which is captured by the extracted factors is
known as Communality. For example, Fig. 3 tells us
that after 3 factors were extracted and retained, the
communality is 0.72243 for variable 1, 0.45214 for
variable 2 and so on (from the column labelled
Communality in Fig. 3). This means that 0.72243 or
72.24 percent of the variance (information content) of
variable 1 is being captured by our 3 extracted factors
together. Variable 2 exhibits a low communality value
of 0.45214. This implies that only 45.21 percent of the
variance in variable 2 is captured by our extracted
factors. This may also partially explain why variable 2
does not appear in our final interpretation of the factors
(in the earlier section). It is possible that variable 2 is
an independent variable which does not combine well
with any other variable, and therefore should be further
investigated separately. “Freedom” could be a different
concept in the minds of our target audience.

2. As a final comment, it is again the author’s
recommendation that we use the rotated factor matrix
(rather than the unrotated factor matrix) for interpreting
factors, particularly when we use the principal
components method for extraction of factors in stage 1.
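
The communality of a variable is the sum of its squared loadings across the retained factors. The sketch below recomputes the communalities from the rotated loadings of Fig. 4 and compares them with the column printed in Fig. 3. The threshold of 0.5 for flagging a "low" communality is an assumption made for illustration.

```python
import numpy as np

# Rotated loadings copied from Fig. 4 (rows: VAR00001 to VAR00010).
rotated = np.array([
    [ .13402,  .34749,  .76402],
    [-.18143, -.64300, -.07596],
    [-.10944,  .62985,  .56742],
    [ .96986, -.06383, -.01338],
    [ .96455,  .13362,  .04660],
    [ .94544, -.13868,  .02600],
    [ .97214,  .02862,  .09411],
    [-.26169,  .85203,  .06517],
    [ .00891,  .87772, -.08347],
    [ .07209, -.10990,  .87874],
])

# Communality column as printed in Fig. 3.
reported = np.array([.72243, .45214, .73056, .94488, .95038,
                     .91376, .95474, .79869, .77745, .78946])

# Communality = row sum of squared loadings.
communality = (rotated ** 2).sum(axis=1)

LOW = 0.5  # assumed threshold for a "low" communality
low_vars = [int(i) + 1 for i in np.flatnonzero(communality < LOW)]

for i, (c, r) in enumerate(zip(communality, reported), start=1):
    print(f"Variable {i}: computed {c:.5f}, reported {r:.5f}")
print("Low communality:", low_vars)
print(f"Share of total variance: {100 * communality.sum() / 10:.1f}%")
```

The computed values agree with Fig. 3 to about four decimal places, only variable 2 falls below the assumed threshold, and the communalities sum to roughly 80.3 percent of the total variance, the same figure as the Cum Pct column of Fig. 3.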
