
Factor Analysis

Prepared for : IBS Gurgaon


By Prof. Manisha Kapoor
December 2020
Agenda
1. Understanding Factor Analysis
2. Purpose of using factor analysis & Business Applications
• Factor analysis in Bank
• Case study Psychographic profile – Telecom Brand
3. Assumptions of Factor analysis
4. Nike Case study
5. 2 types – PCA & CFA
6. Statistics associated with Factor Analysis
Interpreting/Naming the factors
7. How many factors? - SPSS – 1st Graphical method (scree plot), 2nd Eigen values
8. Assignment – Inverter case study
9. Assignment – Elan Jeans
1. Understanding
factor analysis
What is factor analysis ?
Factor Analysis
• Data Reduction / Data Summarization technique
• Interdependence technique

Factor analysis is a general name denoting a class of procedures primarily used for data reduction and summarization.

Variables are not classified as either dependent or independent. Instead, the whole set of interdependent relationships among variables is examined in order to define a set of common dimensions called factors.
2. Purpose of
using factor
analysis
Factor Analysis in Banks
Factor Analysis earns interest in Banks
• How do consumers evaluate banks? Respondents in a survey were asked to rate the importance
of 15 bank attributes. A 5-point scale ranging from not important to very important was
employed. These data were analyzed via principal components analysis

On a scale of 1 to 5, how important is the following attribute?
Interest rates on loans: 1 (Not at all important)  2  3  4  5 (Very important)
4 Factor solution
A four-factor solution resulted, with the factors being labeled as traditional services, convenience, visibility, and competence.

Factor 1: Traditional services
• Interest rates on loans
• Reputation in the community
• Low rates for checking
• Friendly and personalized service
• Easy-to-read monthly statements
• Obtainability of loans

Factor 2: Convenience
• Convenient branch location
• Convenient ATM locations
• Speed of service
• Convenient banking hours

Factor 3: Visibility
• Recommendations from friends and relatives
• Attractiveness of the physical structure
• Community involvement
• Obtainability of loans

Factor 4: Competence
• Employee competence
• Availability of auxiliary banking services
It was concluded that consumers evaluated banks using the four basic factors of traditional services, convenience, visibility, and competence, and banks must excel on these factors to project a good image. By emphasizing these factors, JPMorgan Chase & Co. became one of the largest U.S. banks and bought the banking operations of bankrupt rival Washington Mutual in September 2008.
Factor analysis is used in the following circumstances:

1. To identify latent or underlying factors from an array of seemingly important variables by analysing correlations between variables
➢ Six psychographic variables into two
➢ Academic, sports, cultural
➢ Honesty, Satisfaction

2. To identify a new, smaller set of uncorrelated factors to replace the original set of correlated variables in subsequent multivariate analysis (regression or discriminant analysis).
➢ Identifying smaller set of variables from larger set – drop others to avoid problem of multicollinearity

3. To identify a smaller set of salient (surrogate) variables from a larger set for use in subsequent multivariate analysis.
➢ Parsimony or data reduction

Factor Analysis: Segmentation study by Telecom brand
Telecom: Market Segmentation Case Study
Factors used as input for segmentation
WHO AM I?

Segments: Quite Users, Enthusiast, Savvy users, Browsers, Experiencer, Nay Sayers, Devotees, Main Streamer

• High • Major
• High
representation representation • High % of 40+ • High % of 40+
• Spread Across → • Spread Across • Spread Across representation
Age Avg. Age: 28.8 → Avg. Age: 29.8 → Avg. Age: 27.8
from Medium from (14-19 yrs) yrs. Avg. Age: yrs. Avg. Age:
from 20-29 years
age (20 – 29 yrs.) & (30 – 39 yrs) → 31.9 30.7
Avg. Age: 27.3
→ Avg. Age: 26.7 Avg. Age: 27.8

• Very High
• Very high • High
• Straddle except representation
Town Class • Straddle • High: Metros
Tier 3
• Tier 1 representation representation • Tier 1
from Tier 2 &
from Tier 4 from Metros
Tier 3

• Moderately High • High on SEC A & • Very High on SEC


SEC on SEC A SEC C
• Straddle • High on SEC D
A
• Straddle • SEC C & SEC D • Very High SEC B

• Male/ Female: • Male/ Female:


Gender • Male Skewed • Male Skewed • Female Skewed
Balanced Balanced
• Male Skewed • Male Skewed • Male Skewed

• Permanent • Permanent • Permanent • Permanent


Residential Resident Resident • More permanent • More Permanent Résident Resident • Permanent • Permanent
Status • Migratory • Migratory migrants Residents • Migratory • Migratory Resident Migrants
Population Population Population population

• Very High
• High Businessman • Skilled Workers • Skilled worker,
Student • Skilled Workers • Unskilled, Petty
• Industrialist • Shop Owners • Supervisory
• Petty trader • Very High • Balanced • Shop owners traders
Occupation • Junior • Clerical Salesman
• Clerical Student Mix • officers/executiv • Very high
level,
• Officers/ • Junior level • Junior executives
• Salesman e- senior Housewives
Executives Executives • Businessmen
• Supervisory level
Contents
➢ Research Objective
➢ Approach to Segmentation
➢ Techniques used
➢ Sampling Methodology
➢ Weights and Quotas
➢ Sample Construct and City Classification
➢ Target group
➢ Segment Size
➢ Segment Description
➢ Segment Snippets
➢ Segment Profiling
▪ Demographics
▪ Psychographics
▪ Mobile & Internet Usage
▪ Entertainment - TV Vs. Smartphone
▪ Alternate modes
▪ Importance for best MI experience
▪ Need & expectations
▪ Acceptance to Concept
RESEARCH OBJECTIVE

• The core objective of the study is to segment the Mobile Internet (2G/3G) smartphone users to create different target groups by identifying the distinctness on the following parameters -

▪ Profile based on life values, attitude towards technology, views about smartphone & MI of the customers
▪ Understand behavior, usage trends (initial, current and future) to understand evolution of various application categories used
▪ To identify areas of stress and apps used to find needs of customers
▪ To profile market based on usage (MB, minutes and rupees)
▪ To profile market based on demographics, emerging vs mature, app vs browser usage
▪ Barriers and importance of various application categories
Key Information Areas
› Life Value – A value is a belief, a mission, or a philosophy that is meaningful. Whether we are consciously aware of them or not, every individual has a core set of personal values. Values can range from the commonplace, such as the belief in hard work and punctuality, to the more psychological, such as self-reliance, concern for others, and harmony of purpose. Respondents were asked various questions about life values, which help in capturing their attitude towards life in general.

› Views about Smartphone – Respondents were asked for opinions of their smartphones, such as how well they understand its features and what could increase their usage (screen size, HD quality, internet connectivity). They were also asked about importance, indulgence and emotional connection, such as the smartphone as an extension of personality and as a source of entertainment.

› Attitude towards Technology – Respondents were asked about the impact of technology and how they perceive it: whether it has eased life or created confusion, whether it helps in planning life, whether they are first to buy tech products, whether they see tech products as investments, etc.

› Opinion about Mobile Internet – Respondents were asked whether Mobile Internet makes them feel vulnerable or empowered, whether it makes them socially active, and whether it is a convenience, an image and productivity enhancer, preferred over other devices, etc.
Approach to Segmentation
Market segmentation method was undertaken to identify the customers’ attitudes towards technology, life
values, smartphone and Mobile Internet opinion , behavior and unmet needs that could prominently appear in
different segments. In order to derive the insights, Latent Class Analysis was used to identify the various
segments

STEPS IN SEGMENTATION

▪ As a first step, a-priori factors were created using the respondents' opinions on Smart Phones, Life Values, Attitudes towards Technology, Views on Technology and Attitudes towards Mobile Internet (MI)

▪ These factors were used as the main segmenting variables

▪ Demographic and Usage behavior variables were used as Covariates

▪ The resulting segments were profiled in terms of variables used for segmentation as well as other variables

The resulting segments are homogeneous within and heterogeneous across the segmenting variables. The segments are
profiled on all variables which are found to significantly distinguish between them
Segmenting Variables
Market segmentation

The following segmenting variables were used to profile segments –

1. Geographic segmentation – Metro, Tier 1, Tier 2, Tier 3 and Tier 4

2. Demographic segmentation – based on age, gender, social class, family size, occupation, type of residents etc.

3. Psychographic segmentation – based on lifestyle and/or personality characteristics, attitudes towards technology, views about Mobile Internet and opinion on smartphones

4. Behavioural segmentation – based on occasion & frequency, category used, service usage (money spent, minutes of usage and MB of usage of mobile data)

5. Needs – Maxdiff was processed for BU and BAU and tagged with each segment to know the order of importance of these statements for each segment
A-priori factors
A note on techniques used
• Factor analysis is done to reduce a large number of variables into a smaller set of factors, each grouping variables that are highly correlated and generally point to a bigger dimension
• The researcher should be able to name each factor, with the name indicating the common thread across the variables in that factor. However, many times an experienced researcher already has prior knowledge of the possible factors and wants to test whether it holds. Such factors are known as a-priori factors
• One uses these a-priori factors in factor analysis and generates the component matrix to see the factor loadings of the variables in each factor. Factor loadings above 0.5 generally indicate that the variables are part of that factor. A factor loading of less than 0.5 indicates that the variable should be dropped from that a-priori factor.
• Once the researcher is satisfied that the variables in a factor have high loadings and the variance explained is more than 50%, the factor scores are generated
• These factors are then used for further analysis (in this case, Latent Class segmentation)
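The checks described above (loadings of at least 0.5, plus the Cronbach's alpha reported with each factor on the following slides) can be sketched in a few lines. This is only an illustration: the ratings, the item names and the helper names `cronbach_alpha` and `retained_items` are hypothetical and are not taken from the study.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a block of items.

    items: list of columns, one list of respondent ratings per item.
    """
    k = len(items)
    n = len(items[0])
    # Summated scale score of each respondent across the k items
    totals = [sum(col[i] for col in items) for i in range(n)]
    item_var_sum = sum(pvariance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var_sum / pvariance(totals))

def retained_items(loadings, cutoff=0.5):
    """Keep items whose absolute loading reaches the cutoff (0.5 on the slide)."""
    return [name for name, value in loadings.items() if abs(value) >= cutoff]

# Hypothetical 5-point ratings from six respondents on three items
ratings = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 5],
    [5, 5, 2, 4, 1, 4],
]
alpha = cronbach_alpha(ratings)

# Hypothetical loadings of three items on one a-priori factor
loadings = {"item_a": 0.77, "item_b": 0.74, "item_c": 0.42}
kept = retained_items(loadings)

print(round(alpha, 3), kept)
```

In this made-up example `item_c` would be dropped from the a-priori factor, and the factor would then be re-run on the remaining items before factor scores are generated.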
Factor Analysis

Factor - 1 : Good Qualities
• HELPFULNESS - Making an effort to assist others: .768
• EQUALITY - Desiring equal opportunity for all: .800
• EFFICIENCY - Getting things done effectively and on time: .792
• DUTY - Fulfilling obligations to family, community and country: .759
• OPEN-MINDEDNESS - Being open-minded: .742
• HONESTY - Being sincere, having integrity: .789
• JUSTICE - Protecting individual rights: .789
Variance Explained = 77.7%
Cronbach's Alpha = 0.952

Factor - 2 : Power, Authority and Wealth
• POWER - Having control over people and resources: .895
• STATUS - Achieving a higher social status: .883
• WEALTH - Having material possessions, a lot of money: .878
• ADVENTURE - Seeking adventure and risk: .840
Variance Explained = 76.0%
Cronbach's Alpha = 0.897

Factor - 3 : Fun & Enjoyment
• CREATIVITY - Being creative, imaginative: .876
• PROTECTING THE ENVIRONMENT - Helping to preserve the natural environment: .849
• PLEASURE - Indulging my desires: .864
• ENJOYING LIFE - Doing things because I like them: .871
• HAVING FUN - Having a good time: .851
Variance Explained = 74.4%
Cronbach's Alpha = 0.911

Factor - 4 : Traditional
• TRADITIONAL GENDER ROLES - Following traditional roles for men and women: .920
• TRADITION - Preserving time-honored customs: .920
Variance Explained = 84.6%
Cronbach's Alpha = 0.818
Factor Analysis

Factor - 5 : Less Ambitious, Life is full of problems
• My interests are somewhat narrow and limited: .774
• Life is not easy as some people see it. It's full of problems: .743
• I never get any breaks in life and there are limited opportunities one gets: .753
• It is not necessary to take risk in life to succeed: .780
Variance Explained = 58.1%
Cronbach's Alpha = 0.760

Factor - 6 : Negative for MI, Not Image Booster
• Smartphones with 3G connections are a common thing and no longer an image booster: .733
• Accessing mobile internet makes me frustrated: .771
• Mobile internet has made my life more disorganized: .738
• Watching videos on my smartphone's small screen is difficult and uncomfortable and rarely used by people like me: .749
Variance Explained = 55.9%
Cronbach's Alpha = 0.737

Factor - 7 : Savvy user, emotional, indulgent
• I understand how to use most of the features on my smartphone: .756
• Having one smartphone that can do everything is very convenient: .774
• I think of my smart phone as a source of entertainment: .739
• My smart phone is an extension of my personality: .771
• I enjoy customizing the look and feel of my smart phone: .766
• I expect the quality of video on my smart phone to be HD: .767
• I would use the Internet on my smart phone more often, if the websites and applications load more easily: .777
• I would use the Internet on my smart phone more often, if the screen were easier to read: .769
Variance Explained = 58.5%
Cronbach's Alpha = 0.899

Factor - 8 : Tech products complicate life
• More options means more confusion: .739
• Technology like Smartphones, Tablets, Mobile Internet has complicated life: .771
• The newer the product the more complicated it is: .704
• It makes me vulnerable to outer world: .739
Variance Explained = 54.6%
Cronbach's Alpha = 0.721
Factor Analysis

Factor - 9 : Tech products do not create impressions
• Impressions are not formed merely by carrying the latest technology products: .800
• Smartphones with 3G connections are a common thing and no longer an image booster: .800
Variance Explained = 64%
Cronbach's Alpha = 0.438

Factor - 10 : Not socially active, not brand conscious, Smart Phone is just a tool
• I just want to enjoy hi-tech products and it is not essential for me to know brand and/or technology behind it: .762
• Smartphone is just a tool for me: .824
• I don’t feel any relevance in being connected online all time: .769
Variance Explained = 61.7%
Cronbach's Alpha = 0.689

Factor - 11 : MI not productivity enhancer, other tools exist
• Mobile Internet does not significantly enhance my productivity: .686
• Purely a perception as other devices also offer equal multi-tasking avenue: .776
• Mobile internet has increased my access to things but has not increased efficiency: .806
Variance Explained = 57.4%
Cronbach's Alpha = 0.627

Factor - 12 : Does not share on social network sites but checks what others do
• I don’t share every bit of my life socially, but update and share photos, status only on few occasions: .730
• I do more communication through voice calls than through text or messaging on IM: .748
• I just go to social websites to read, check other people's posts, videos, photos: .687
Variance Explained = 52.2%
Cronbach's Alpha = 0.462

Factor - 13 : In the moment activities and Flexibility
• I live my daily life largely by being “in the moment” created activities: .783
• Flexibility should be offered to people and resources so that they behave the way they want: .783
Variance Explained = 61.3%
Cronbach's Alpha = 0.368

Factor - 14 : Cares about others, believes in equality and justice and looks for comfort and security
• I primarily perceive myself as a person who really cares about others and strongly believes in equality and justice: .728
• In my life I prefer daily routines, which brings me comfort and security: .728
Variance Explained = 52.9%
Cronbach's Alpha = 0.110
Factor Analysis

Factor - 15 : In the moment activities and Flexibility
• I only lookout for new options in technology when my current technology becomes obsolete: .715
• I am amongst those who are the first to buy the latest technology product: .762
• I seek products based on relevance and not just blindly following trend: .746
Variance Explained = 55%
Cronbach's Alpha = 0.590

Factor - 16 : In the moment activities and Flexibility
• I was curious or had a strong desire for using mobile internet: .736
• It provided too many options and further confuses me: .723
• I can stop using mobile internet at will: .735
Variance Explained = 53.5%
Cronbach's Alpha = 0.564

Factor - 17 : Not Proactive
• I know what I want from the provider and would call him myself rather he calling me often: .800
• I am not that demanding in terms of performance: .800
Variance Explained = 64%
Cronbach's Alpha = 0.431

Factor - 18 : Prefer MI for Business, Prefer Internet on PC
• I use Mobile Internet primarily for Business purposes: .775
• I prefer to do online transaction & ecommerce like Payments of bills, booking tickets, buying products etc. through my fixed regular internet connection on a PC: .775
Variance Explained = 60.1%
Cronbach's Alpha = 0.336
Factor Analysis

Factor - 19 : MI Constraints
• There is hardly anything more than I could do with mobile internet besides what I was earlier doing with other devices: .809
• Even though I can now do things on the go because of mobile internet, but there are constraints due to which I need to use other devices: .809
Variance Explained = 65.4%
Cronbach's Alpha = 0.468

Factor - 20 : Ready to pay but no wastage
• All my choices are motivated by a need for efficiency and no wastage: .693
• I lookout for newer better performing brands/products and keep switching to other brands: .684
• I have a preference to own more number of technology products than go for few quality products: .728
Variance Explained = 49.3%
Cronbach's Alpha = 0.486

Factor - 21 : Technology Dependent
• I cannot rely on wireless internet alone: .802
• Lot of the things I undertake will come to standstill without technology: .802
Variance Explained = 64.4%
Cronbach's Alpha = 0.446

Factor - 22 : Tech Savvy
• I don't mind paying something extra to have the more technologically advanced products: .797
• Tech products like Smartphone, Tablets are like investments for me that helps organize my life and make me productive in my daily life: .797
Variance Explained = 63.4%
Cronbach's Alpha = 0.423

Factor - 23 : Technology Dependent
• Not necessary for me to be reachable wherever I am: .771
• I don’t like to publish my own photos all the time, but be a member of the bigger group and contribute to that group whenever I can: .771
Variance Explained = 59.5%
Cronbach's Alpha = 0.319
1
Quite Users (28%)
Demographics
Cluster 1

Who am I?

Age - Has fair representation from across all age bands (Mean age – 28.76)
M:F - High representation of Males
SEC - Moderately High on SEC A
City Classification: Cuts across Tiers
Occupation - High Businessman/Industrialist, Junior Officers/Exec, when compared with average
Average Household Size: 3.31
Fair representation of Permanent Resident and Migratory Population

“I have a preference to own more number of technology products than go for few quality products”
Psychographics
Cluster 1

Attitude towards life/Persona
› Somewhat positive on good qualities such as power, tradition, fun and enjoyment
› Somewhat (less ambitious & feels life full of problems)

Lifestyle
› Looks for Efficiency and no Wastage, Comfort and Security

Attitude towards Technology
› Somewhat feels tech products don't create impressions, somewhat feels tech products complicate life

Views about Internet
› Has increased access to things but has not increased efficiency, MI for business purposes

Opinion on Smartphone
› Just want to use smartphone to make and receive calls

“Just want to use smartphone to make and receive calls”
1. Market Segmentation – Factor analysis helps in identifying the underlying dimensions from the observed variables, which helps the researcher to group customers

2. Identifying dimensions – Factor analysis helps in identifying the product or brand attributes that consumers consider important while making a buying decision. Factors that explain a higher share of the variance contribute more in the factor analysis procedure. This information helps the researcher to figure out which variables account for more of the variance, and their contribution to consumer decision making can subsequently be tested.
Factor Analysis
Business Applications

1. Market segmentation
Use: It can be used for identifying the underlying variables on which to group the customers
Groups: New car buyers might be grouped based on the relative emphasis they place on economy, convenience, performance, comfort and luxury. This might result in five segments: economy seekers, convenience seekers, performance seekers, comfort seekers, and luxury seekers.

2. Product research
Use: In product research, factor analysis can be employed to determine the brand attributes that influence consumer choice.
Groups: Toothpaste brands might be evaluated in terms of protection against cavities, whiteness of teeth, taste, fresh breath, and price

3. Advertising studies
Use: To understand media consumption habits of target market
Groups: The users of frozen foods may be heavy viewers of cable TV, see a lot of movies, and listen to country music.

4. Pricing studies
Use: To identify characteristics of price-sensitive customers
Groups: For example, these consumers might be methodical, economy minded, and home centered.
Latent variables
Example : Factor analysis attempts to explain 100 variables based on some
common underlying dimensions
Example 1
Model Representation in Factor Analysis
• We have a set of observed variables for which we have data, and we are trying to explain the variances and covariances between them (typically using a sample to build a model for the population). The model explains these variances and covariances through unobserved variables
• We have sample data on some observed variables, such as whether individuals experience insomnia, suicidal thoughts, hyperventilation or nausea
• There can be covariance between insomnia and suicidal thoughts, say Cov (I, S) = 0.3. We are trying to come up with a model which will explain that covariance in the population
• Factor analysis assumes that the covariance among these observed variables is due to some unobserved factors
• So here it can be that the individual is depressed or is experiencing some form of anxiety

• We assume these underlying factors, Anxiety and Depression, cause the variances and covariances among the observed variables
• So there is a weighting of Depression on each of the observed characteristics, as well as a weighting of Anxiety on each of them. These two unobserved factors have a causal effect on the observed characteristics
• For example, the weight of Depression on insomnia is W11 and the weight of Anxiety on insomnia is W21. We are trying to estimate these weights and the unobserved factors
• The unobserved factors can themselves be correlated, and there can be other underlying factors
• We are trying to explain the variance of insomnia in the population and to estimate it from sample data. We suppose that a proportion of the variance of insomnia is due to the shared unobserved factors. This proportion is called the communality, because it is the part of the variance explained by factors that are common to the other observed variables
• There is also some proportion of the variance of insomnia that is not explained by the unobserved factors. This is the unique variance of that observed variable, represented by the term e1
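A small numeric sketch may help show how shared factors produce a covariance such as Cov (I, S) = 0.3. It assumes, for simplicity, that the two factors are uncorrelated and have unit variance, so that the covariance between two different observed variables is the sum of the products of their weights. All weights below are made up for illustration; they are not estimates from any data.

```python
# Hypothetical weights of each observed variable on (Depression, Anxiety).
# Assumption: the two factors are uncorrelated with unit variance.
weights = {
    "insomnia":       (0.50, 0.20),
    "suicidal":       (0.50, 0.25),
    "hyperventilate": (0.10, 0.70),
    "nausea":         (0.15, 0.60),
}

def implied_cov(a, b):
    """Covariance between two different observed variables implied by the weights."""
    return sum(wa * wb for wa, wb in zip(weights[a], weights[b]))

cov_is = implied_cov("insomnia", "suicidal")        # 0.5*0.5 + 0.2*0.25
cov_hn = implied_cov("hyperventilate", "nausea")    # 0.1*0.15 + 0.7*0.6

print(round(cov_is, 3), round(cov_hn, 3))
```

With these made-up weights the first pair shares mostly the Depression factor and the second pair mostly the Anxiety factor, which is the pattern the model is meant to recover from the observed covariances.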
Introduction to Factor Analysis
What is purpose?

“To estimate a model which explains variance/covariance between a set of observed variables (in a population) by a set of (fewer) unobserved factors + weightings”

[Figure: path diagram built up over three slides. Observed variables: Insomnia, Suicidal, Hyperventilate, Nausea. Underlying variables (shared factors): Depression, Anxiety, linked to the observed variables by weights W11, W21, W12. Unique terms E1 to E4, one per observed variable. Covariation in Insomnia = Shared Factors + Unique]

• We are trying to estimate these unobserved factors
Model representation in Factor Analysis

[Figure: path diagram. Unique factors E1 to E4 point to the observed variables Insomnia (Y1), Suicidal (Y2), Hyperventilate (Y3) and Nausea (Y4). The common (shared) factors Depression (F1) and Anxiety (F2) point to the observed variables with weights W11, W21, W12. Covariation in Insomnia = Shared Factors + Unique]

• The covariation among the variables is described in terms of a small number of common factors plus a unique factor for each variable
• Mathematically, factor analysis is similar to multiple regression analysis, in that each variable is expressed as a linear combination of underlying factors
• The amount of variance a variable shares with all other variables included in the analysis is referred to as communality or proportion of variance shared by the common factor

$Y_1 = A_{11}F_1 + A_{12}F_2 + e_1$
$\mathrm{Var}(Y_1) = A_{11}^2 + A_{12}^2 + \sigma^2$
where $A_{11}^2 + A_{12}^2$ is the communality and $\sigma^2$ is the unique variance (the variance of $e_1$)
$A_{12}$ → loading of variable Y1 (insomnia) on factor F2 (Anxiety)
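As a worked instance of the variance decomposition above, the sketch below splits the variance of a standardized variable into communality and unique variance. It assumes uncorrelated factors with unit variance and a variable with total variance 1; the two loadings are hypothetical.

```python
def decompose_variance(loadings):
    """Split the variance of a standardized variable (total variance 1).

    Assumption: factors are uncorrelated with unit variance, so the
    communality is the sum of squared loadings.
    """
    communality = sum(a * a for a in loadings)
    unique = 1.0 - communality
    return communality, unique

# Hypothetical loadings of insomnia (Y1) on Depression (F1) and Anxiety (F2)
a11, a12 = 0.6, 0.5
communality, unique = decompose_variance((a11, a12))

print(round(communality, 2), round(unique, 2))
```

Here 0.6² + 0.5² = 0.61 of the variance is shared with the common factors and the remaining 0.39 is unique to the variable.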
Model representation in Factor Analysis

• Mathematically, factor analysis is similar to multiple regression analysis, in that each variable is expressed as a linear combination of underlying factors

$X_i = A_{i1}F_1 + A_{i2}F_2 + \dots + A_{im}F_m + V_iU_i$

• The common factors themselves can be expressed as linear combinations of the observed variables

$F_i = W_{i1}X_1 + W_{i2}X_2 + W_{i3}X_3 + \dots + W_{ik}X_k$
3. Assumptions
of factor analysis
Assumptions
• Variables must be interrelated
- 20 unrelated variables=20 factors
- Matrix must have sufficient number of correlations
• Sample must be homogeneous
• Metric variables assumed
• Sample size
- Min 50, prefer 100
- Min 5 observations/item, prefer 10 observations/item
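The sample-size rules of thumb listed above can be turned into a quick check before running the analysis. This is a minimal sketch; the function name and the three labels it returns are illustrative choices, not part of any package.

```python
def sample_size_check(n_obs, n_items):
    """Classify a design against the rules of thumb on the slide.

    Minimum: 50 observations and 5 observations per item.
    Preferred: 100 observations and 10 observations per item.
    """
    ratio = n_obs / n_items
    if n_obs < 50 or ratio < 5:
        return "insufficient"
    if n_obs >= 100 and ratio >= 10:
        return "preferred"
    return "acceptable"

# Hypothetical designs
print(sample_size_check(45, 5))     # fewer than 50 observations
print(sample_size_check(60, 10))    # 6 observations per item
print(sample_size_check(200, 15))   # more than 10 observations per item
```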
Steps
1. Formulate the problem
2. Construct the correlation matrix
3. Determine the method of factor analysis
4. Determine the number of factors
5. Rotate the factors
6. Interpret the factors
7. Calculate the factor scores / Select the surrogate variables
8. Determine the model fit
4. Nike Case
Study
THE NIKE CASE
Application of
➢ Factor Analysis

1. Formulate the problem: Underlying variables on which to group Nike users

Nike Case

• Conduct the following analysis on Nike data. Consider only the following variables: awareness, attitude, preference, intention and loyalty toward Nike.
a. Analyze this data using principal component analysis, using the varimax rotation procedure
b. Interpret the factors extracted
c. Calculate the factor scores for each respondent
d. If surrogate variables were to be selected, which one would you select?
e. Examine the model fit
f. Analyze the data using common factor analysis and answer questions b through e again.
For the factor analysis to be appropriate, the variables must be correlated.

If the correlations between all the variables are small, factor analysis may not be appropriate.

We would also expect that variables that are highly correlated with each other would also highly correlate with the same factor or factors.
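Before factoring, the correlation matrix can be inspected directly. The sketch below computes Pearson correlations for three of the variable names used in the case. The ratings are hypothetical and are not the Nike data; the helper names are illustrative.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of ratings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(data):
    """Nested dict of pairwise correlations, keyed by variable name."""
    names = list(data)
    return {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

# Hypothetical 7-point ratings from eight respondents (not the Nike data)
data = {
    "awareness": [6, 2, 7, 4, 1, 6, 5, 7],
    "attitude":  [6, 3, 5, 4, 2, 7, 5, 6],
    "loyalty":   [3, 6, 4, 5, 6, 2, 4, 3],
}
R = correlation_matrix(data)

for name, row in R.items():
    print(name, [round(row[other], 2) for other in data])
```

If most off-diagonal entries were close to zero, there would be little shared variance for factors to summarize.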
2. Construct the Correlation Matrix - 1
There are relatively high correlations among Awareness, Attitude and Preference.

We would expect these variables to correlate with the same set of factors.

Similarly, purchase intention and loyalty are highly correlated with each other, so we would expect them to correlate with a common set of factors.
2. Construct the Correlation Matrix - 2
KMO measure of sampling adequacy
• Indicates whether the correlations between pairs of variables can be explained by other variables
• A value greater than 0.5 is desirable. So factor analysis is appropriate

Bartlett's test of sphericity
• Is statistically significant
• Tests not the individual correlations but the correlation matrix as a whole
• Asks whether this correlation matrix is different from an identity matrix
• We can reject the null hypothesis that the variables are uncorrelated in the population (p-value less than 0.05). So factor analysis is appropriate

Thus, factor analysis may be considered an appropriate technique for analysing the correlation matrix
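The two diagnostics can be computed from a correlation matrix with the usual textbook formulas. The sketch below assumes NumPy is available; the matrix `R` and the sample size `n` are hypothetical, and the function names are illustrative rather than part of SPSS or any library.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for H0: R is an identity matrix."""
    p = R.shape[0]
    stat = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return stat, df

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)          # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)    # mask for off-diagonal entries
    r2 = np.sum(R[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + q2)

# Hypothetical correlation matrix for three variables, n = 100 respondents
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
stat, df = bartlett_sphericity(R, n=100)
adequacy = kmo(R)

# Compare stat with the chi-square critical value for df degrees of freedom
# (7.815 at the 0.05 level for df = 3) to decide whether to reject H0.
print(round(float(stat), 2), df, round(float(adequacy), 3))
```

For an identity matrix the determinant is 1, so the Bartlett statistic is 0 and the null hypothesis cannot be rejected; the further the matrix is from identity, the larger the statistic.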
Session 2
5. PCA & CFA (2 types)
Types of Factor Analysis
• Exploratory Factor Analysis (EFA)
➢ Used to discover underlying structure
➢ Principal components analysis (PCA) (Thurstone)
▪ Considers the total variance and derives factors that contain small amounts of unique and error variance
▪ Unity inserted on diagonal of matrix
▪ Often used in physical science
➢ Common factor analysis (Spearman)
▪ Considers only the common or shared variance and ignores the unique and error variance; it is more complicated and thus less used
▪ In SPSS known as principal axis factoring
➢ Both PCA and common factor analysis give similar answers most of the time, especially when the number of variables exceeds 30 or the communalities exceed 0.6 for most variables
• Confirmatory Factor Analysis (CFA)
➢ Used to test whether data fit a priori expectations for data structure
➢ Structural Equation modeling
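The difference between the two exploratory methods shows up in what sits on the diagonal of the matrix being factored. The sketch below assumes NumPy and a hypothetical correlation matrix. For common factor analysis it shows only the first step of principal axis factoring, using squared multiple correlations as initial communality estimates, which is one common choice and not necessarily what a given package does by default.

```python
import numpy as np

# Hypothetical correlation matrix for three variables
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

# PCA: unities stay on the diagonal, so the total variance is analysed
pca_eigs = np.linalg.eigvalsh(R)[::-1]

# Common factor analysis (first principal-axis step): replace the diagonal
# with initial communality estimates, here squared multiple correlations
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)
paf_eigs = np.linalg.eigvalsh(R_reduced)[::-1]

print("PCA eigenvalues:", np.round(pca_eigs, 3), "sum =", round(float(pca_eigs.sum()), 3))
print("Reduced-matrix eigenvalues:", np.round(paf_eigs, 3), "sum =", round(float(paf_eigs.sum()), 3))
```

The PCA eigenvalues sum to the number of variables (total variance), while the eigenvalues of the reduced matrix sum only to the estimated common variance.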
3. Determine the method of Factor Analysis – PCA or CFA - 1

PCA or CFA (common factor analysis)?

First, let’s do it through PCA

PCA – The total variance is used to derive the factors

CFA – Only the common (shared) variance is used to derive the factors
6. Statistics
associated with
Factor Analysis
Key statistics associated with Factor Analysis
1. Bartlett's test of sphericity: test statistic used to examine the hypothesis that the variables are uncorrelated in the population, i.e. that the population correlation matrix is an identity matrix (each variable correlates perfectly with itself but not with the others)
2. Correlation matrix: simple correlations between all variables
3. Communality: the amount of variance a variable shares with all other variables included in the analysis, i.e. the proportion of variance shared with the common factors (think of it as R-squared)
4. Eigenvalues: total variance explained by each factor
5. Factor loadings: simple correlations between the variables and the factors
6. Factor loading plot: plot of the original variables using the factor loadings as coordinates
7. Factor matrix: factor loadings of all variables on all the factors extracted
8. Factor scores: composite scores estimated for each respondent on the derived factors
9. Factor score coefficient matrix: weights or factor score coefficients used to combine the standardized variables to obtain factor scores
10. KMO measure of sampling adequacy
11. Percentage of variance
12. Residuals
13. Scree plot
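Two of the statistics listed above, Bartlett's test of sphericity and the KMO measure, can be computed directly from a correlation matrix. The sketch below uses the standard textbook formulas with NumPy. The correlation matrix and the sample size of 100 are hypothetical, and the p-value lookup against the chi-square distribution is left out.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    p = R.shape[0]
    chi_square = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    dof = p * (p - 1) // 2
    return chi_square, dof

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                      # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)       # off-diagonal mask
    r2 = np.sum(R[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + q2)

# Hypothetical correlation matrix built from illustrative loadings.
L = np.array([[0.80, 0.10], [0.70, 0.20], [0.75, 0.15],
              [0.10, 0.80], [0.20, 0.70]])
R = L @ L.T
np.fill_diagonal(R, 1.0)

chi_square, dof = bartlett_sphericity(R, n=100)  # n = 100 is an assumed sample size
print("Bartlett chi-square:", round(float(chi_square), 2), "df:", dof)
print("KMO:", round(float(kmo(R)), 3))
```

A large chi-square relative to its degrees of freedom argues against the identity-matrix hypothesis, and a KMO closer to 1 suggests the correlations are suitable for factoring.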
Model representation in Factor Analysis

[Diagram: the observed variables Insomnia (Y1), Suicidal (Y2), Hyperventilate (Y3) and Nausea (Y4) each have a unique factor (E1 to E4) and are linked through weights (W11, W12, W21) to the underlying variables, i.e. the common factors Depression (F1) and Anxiety (F2)]

Covariation in Insomnia = Shared Factors + Unique

• The covariation among the variables is described in terms of a small number of common factors plus a unique factor for each variable
• Mathematically, factor analysis is similar to multiple regression analysis, in that each variable is expressed as a linear combination of underlying factors
• The amount of variance a variable shares with all other variables included in the analysis is referred to as communality or proportion of variance shared by the common factor

$Y_1 = A_{11}F_1 + A_{12}F_2 + e_1$
$\mathrm{Var}(Y_1) = \underbrace{A_{11}^2 + A_{12}^2}_{\text{communality}} + \underbrace{\sigma^2}_{\text{unique variance}}$
$A_{12}$ → loading of variable Y1 (insomnia) on factor F2 (Anxiety)
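As a small worked instance of the variance decomposition, the sketch below plugs hypothetical loadings into the formula. It assumes a standardized variable (variance 1) and uncorrelated common factors with unit variance; the loading values 0.8 and 0.3 are made up for illustration.

```python
# Hypothetical loadings of Y1 (insomnia) on F1 (depression) and F2 (anxiety).
a11, a12 = 0.8, 0.3

# Communality: variance of Y1 shared with the common factors.
communality = a11 ** 2 + a12 ** 2

# Unique variance (sigma squared), given that Var(Y1) = 1 for a standardized variable.
unique_variance = 1.0 - communality

print("Communality:", round(communality, 2))
print("Unique variance:", round(unique_variance, 2))
```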
3. Communalities
• This is the measure of the amount of variance accounted for in the individual items by the components.
• In a one-component solution it is just the square of the correlation.

Initial: the communality for each variable is 1, as unities were inserted in the diagonal of the correlation matrix.

Extraction: these values tell us the proportion of variance for each variable that can be explained by the factors.

The communalities under Extraction differ from those under Initial because not all of the variance associated with the variables is explained unless all the factors are retained.
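The relationship between the Initial and Extraction communalities can be checked numerically. The sketch below runs a principal components extraction on a hypothetical correlation matrix with NumPy and compares the communalities when two components are kept with those when all five are kept.

```python
import numpy as np

# Hypothetical correlation matrix built from illustrative loadings.
L = np.array([[0.80, 0.10], [0.70, 0.20], [0.75, 0.15],
              [0.10, 0.80], [0.20, 0.70]])
R = L @ L.T
np.fill_diagonal(R, 1.0)

# Eigen-decomposition, sorted so the largest eigenvalue comes first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Component loadings: eigenvectors scaled by the square root of the eigenvalues.
component_loadings = eigenvectors * np.sqrt(eigenvalues)

initial = np.ones(R.shape[0])                                   # unities on the diagonal
extraction_2 = np.sum(component_loadings[:, :2] ** 2, axis=1)   # 2 components kept
extraction_all = np.sum(component_loadings ** 2, axis=1)        # all components kept

print("Initial:            ", initial)
print("Extraction (2 kept):", np.round(extraction_2, 3))
print("Extraction (all):   ", np.round(extraction_all, 3))
```

Only when every component is retained do the extraction communalities return to 1.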
5. How many factors?
Determine the number of factors: graphical method (scree plot) & eigenvalues; SPSS output
4. Determine the number of factors

How many factors?
1. A priori determination: prior knowledge of the researcher
2. Determination based on eigenvalues: only factors with an eigenvalue greater than 1 are retained
3. Determination based on scree plot: a scree plot is a plot of the eigenvalues against the number of factors in order of extraction (look for a distinct break)
4. Determination based on percentage of variance: factors are retained so that the cumulative percentage of variance extracted reaches a satisfactory level
5. Determination based on split-half reliability
6. Determination based on significance tests: significance of separate eigenvalues
4. Number of factors
Eigenvalue-greater-than-1 rule: keep the factors or components that have eigenvalues greater than 1. Notice the extraction sums of squared loadings.

Reduce to 2 components

How well is it doing?

Very good: the 2 components explain 82% of the variance
• The initial eigenvalues column gives the eigenvalues.
• As expected, they are in decreasing order.
• The eigenvalue for a factor indicates the total variance attributed to that factor.
• The total variance accounted for by all the factors is 5, which equals the number of variables (sum of eigenvalues = number of components).
• Factor 1 accounts for a variance of 2.386, which is (2.386/5) or 47.72 percent of the total variance.
• The first two factors account for 82.64 percent of the total variance.
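The percentages quoted above follow directly from the two reported eigenvalues and the number of variables. The short sketch below reproduces that arithmetic; only the two eigenvalues shown on the slide are used, since the remaining three are not reported.

```python
n_variables = 5
retained = [2.386, 1.746]   # the two eigenvalues reported in the SPSS output

# Percentage of total variance explained by each retained factor.
percent = [100 * ev / n_variables for ev in retained]
cumulative = sum(percent)

# The remaining three eigenvalues must share what is left of the total of 5,
# so none of them can exceed 1 (eigenvalues of a correlation matrix are non-negative).
remaining_total = n_variables - sum(retained)

print("Percent per factor:", [round(p, 2) for p in percent])
print("Cumulative percent:", round(cumulative, 2))
print("Variance left for the other factors:", round(remaining_total, 3))
```

Because the leftover variance is below 1 in total, the eigenvalue-greater-than-1 rule cannot retain a third factor here.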
4. Number of factors

[Scree plot: eigenvalues plotted against component number; the first two eigenvalues are 2.386 and 1.746]

• This scree plot is just the eigenvalues plotted from left to right
• Notice the big drops from component 1 to 2 and from 2 to 3, after which the plot levels off
• Retain the components above the scree, i.e. before the point where the plot no longer drops much or drops only gradually
• Scree: stones/rubble fallen from a mountain

So, both total variance and the scree plot suggest 2 factors
Factor Rotation
Rotation of Factors

• In rotating the factors we would like each factor to have nonzero or significant loadings or coefficients for only some of the variables… we try to increase interpretability

Orthogonal or uncorrelated rotations
• Force the factors to be uncorrelated

Oblique or correlated rotations
• Allow the factors to be correlated
5. Rotate factors
How does each of the items do in getting into a component?

Factor loadings tell us how strong the relationship is between an item and a component in our analysis. For example, Attitude loads (correlates) 0.87 on component 1; Attitude loads highest, though other items also load on it. Judge loadings by a rule of thumb (0.3) or check their significance.

• Rotation does not affect the communalities or the percentage of total variance explained.
• However, the percentage of variance accounted for by each factor does change: the variance explained by the individual factors is redistributed by rotation.
• Rotation applies only if there are 2 or more components.
• Look at the loadings and name the factors: Consumer perception/Attitude and Purchase/usage
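The claim that rotation leaves the communalities and the total variance unchanged, while redistributing variance across factors, can be checked with a small varimax sketch. The SVD-based iteration below is a standard way to compute a varimax rotation; the unrotated loading matrix is hypothetical, and this is not meant to reproduce SPSS output exactly (SPSS applies Kaiser normalization by default).

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation using the standard SVD-based iteration."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated * (np.sum(rotated ** 2, axis=0) / p)
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                       # always an orthogonal matrix
        if s.sum() < criterion * (1 + tol):     # stop once the criterion stalls
            break
        criterion = s.sum()
    return loadings @ rotation

# Hypothetical unrotated loadings of 5 variables on 2 components.
unrotated = np.array([[0.70, 0.40], [0.65, 0.35], [0.70, 0.45],
                      [0.60, -0.50], [0.55, -0.55]])
rotated = varimax(unrotated)

print("Communalities before:", np.round(np.sum(unrotated ** 2, axis=1), 3))
print("Communalities after: ", np.round(np.sum(rotated ** 2, axis=1), 3))
print("Variance per factor before:", np.round(np.sum(unrotated ** 2, axis=0), 3))
print("Variance per factor after: ", np.round(np.sum(rotated ** 2, axis=0), 3))
```

The row sums of squared loadings (communalities) match before and after rotation, while the column sums (variance per factor) are free to change.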
Interpreting/Naming the factors
Factor Analysis : Nike Case
What is the purpose?

“To estimate a model which explains variance/covariance between a set of observed variables
(in a population) by a set of (fewer) unobserved factors + weightings”

[Diagram: observed variables Awareness, Attitude, Preference, Purchase Intention and Loyalty; the two underlying variables are still unknown (? and ?)]

• We are trying to estimate these unobserved factors

Factor Analysis : Nike Case

[Diagram: the observed variables Awareness, Attitude, Preference, Purchase Intention and Loyalty, with the underlying variables now named Perception and Usage]

➢ Look at the loadings and name the factors: Consumer perception/Attitude and Purchase/usage

6. Interpret factors
• Interpretation is facilitated by identifying the variables that have large loadings on the same factor. That factor can then be interpreted in terms of the variables that load high on it.
• Another useful aid in interpretation is to plot the variables using the factor loadings as coordinates. Variables at the end of an axis are those that have high loadings on only that factor.
• Variables that are not near any of the axes are related to both factors.
• If a factor cannot be clearly defined in terms of the original variables, it should be labelled as an undefined or a general factor.
7. Calculate Factor Scores
• Factor analysis has its own stand-alone value. However, if the goal of factor analysis is to reduce the original set of variables to a smaller set of composite variables
(factors) for use in subsequent multivariate analysis, it is useful to compute factor scores for each respondent.

The factor scores can be used instead of the original variables in subsequent multivariate analysis

These factor scores can be used as independent variables for multiple regression/chi square etc.

SPSS procedure for calculating factor scores for each respondent

Below data tab

Factor analysis → factor scores → click (Regression/Bartlett/Anderson-Rubin)

• Factor scores : composite scores estimated for each respondent on the derived factors

• The weights or factor score coefficients used to combine the standardized variables are obtained from the factor score coefficient matrix

• Only in the case of PCA is it possible to compute exact factor scores. Moreover, in PCA these scores are uncorrelated

• In common factor analysis (CFA) the scores are estimated, and there is no guarantee that the factors will be uncorrelated
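For the PCA case, the factor score computation can be sketched in a few lines of NumPy: the standardized variables are combined using the factor score coefficient matrix. The respondent data below are simulated purely for illustration, and the choice of two components is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated ratings: 200 respondents x 5 variables driven by 2 latent factors.
latent = rng.normal(size=(200, 2))
weights = np.array([[0.80, 0.70, 0.75, 0.10, 0.20],
                    [0.10, 0.20, 0.15, 0.80, 0.70]])
X = latent @ weights + 0.5 * rng.normal(size=(200, 5))

# Standardize the variables and compute their correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Keep the two components with the largest eigenvalues.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1][:2]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Factor score coefficient matrix for PCA: eigenvectors / sqrt(eigenvalue).
score_coefficients = eigenvectors / np.sqrt(eigenvalues)

# One row of composite scores per respondent.
scores = Z @ score_coefficients
score_corr = np.corrcoef(scores, rowvar=False)

print("Scores shape:", scores.shape)
print("Correlation between the two score columns:", round(float(score_corr[0, 1]), 6))
```

The two score columns come out uncorrelated with unit variance, which is what makes them convenient inputs for a later regression.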


Using Factor Scores for regression
• Once you have identified factors composed of interdependent variables you can create new variables to be tested

Compute a mean score for each new factor composed of the interdependent variables

Factors then become IV/DV in analysis

Creating New factor variables by grouping manually


SPSS Procedure
1. Transform → Compute variable
2. Target variable → Input name (no spaces,
MPERCEPQUAL)
3. Numeric expression→ Mean (X1, X2,X3)
4. OK
5. Repeat
6. Target variable → Input name (MPURCHASE)
7. Numeric expression → Mean (X4,X5)
8. OK
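Outside SPSS, the same manual grouping amounts to averaging the items that belong to each factor. The sketch below mirrors the Compute Variable steps in plain Python; the respondent ratings are hypothetical, while the variable names MPERCEPQUAL and MPURCHASE follow the procedure above.

```python
# Hypothetical ratings for three respondents on items X1..X5.
respondents = [
    {"X1": 6, "X2": 5, "X3": 7, "X4": 2, "X5": 3},
    {"X1": 3, "X2": 4, "X3": 2, "X4": 6, "X5": 7},
    {"X1": 5, "X2": 5, "X3": 5, "X4": 4, "X5": 4},
]

def mean_of(row, items):
    """Average the listed items for one respondent."""
    values = [row[item] for item in items]
    return sum(values) / len(values)

for row in respondents:
    row["MPERCEPQUAL"] = mean_of(row, ["X1", "X2", "X3"])  # Mean(X1, X2, X3)
    row["MPURCHASE"] = mean_of(row, ["X4", "X5"])          # Mean(X4, X5)

for row in respondents:
    print(row["MPERCEPQUAL"], row["MPURCHASE"])
```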
8. Select surrogate variables
• Surrogate variables let you conduct subsequent analysis and interpret the results in terms of the original variables rather than factor scores

For each factor, select the variable with the highest loading on it in the factor matrix

Prior knowledge can also guide the choice
Introduction to Factor Analysis
What is the purpose?

“To estimate a model which explains variance/covariance between a set of observed variables
(in a population) by a set of (fewer) unobserved factors + weightings”

[Diagram: observed variables Insomnia, Suicidal, Hyperventilate and Nausea linked through weights W11, W12 and W21 to the underlying variables (common factors) Depression and Anxiety]

• We are trying to estimate these unobserved factors

9. Determine the model fit
• A basic assumption underlying factor analysis is that the observed correlation between variables can be attributed to common factors.

Hence the correlations between the variables can be deduced or reproduced from the estimated correlations between the variables and the
factors

The differences between the observed correlations (as given in the input correlation matrix) and the reproduced correlations (as
estimated from the factor matrix) can be examined to determine the model fit. These differences are called Residuals

If there are many large residuals, the factor model does not provide a good fit to the data and the model should be reconsidered
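The residual check described above can be sketched as follows: reproduce the correlations from the loadings of the retained factors, subtract them from the observed correlations, and count the large residuals. The correlation matrix is hypothetical, principal components extraction is assumed, and the 0.05 cut-off for a "large" residual is a convention assumed here rather than something stated in the slides.

```python
import numpy as np

def residual_matrix(R, n_factors):
    """Observed minus reproduced correlations for a principal components solution."""
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1][:n_factors]
    loadings = eigenvectors[:, order] * np.sqrt(eigenvalues[order])
    reproduced = loadings @ loadings.T
    return R - reproduced

# Hypothetical correlation matrix built from illustrative loadings.
L = np.array([[0.80, 0.10], [0.70, 0.20], [0.75, 0.15],
              [0.10, 0.80], [0.20, 0.70]])
R = L @ L.T
np.fill_diagonal(R, 1.0)

residuals = residual_matrix(R, n_factors=2)
off_diagonal = ~np.eye(R.shape[0], dtype=bool)

# Share of off-diagonal residuals larger than the assumed 0.05 cut-off.
share_large = float(np.mean(np.abs(residuals[off_diagonal]) > 0.05))

print(np.round(residuals, 3))
print("Share of large residuals:", round(share_large, 2))
```

A high share of large residuals would be the signal, in the slide's terms, that the factor model should be reconsidered.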
11. Assignment 1 (Individual)
Consumer’s Perception on Inverters in India
12. Assignment 2 (Individual)
Marketing Dilemma for ‘Elan’ Jeans Brand
THE ELAN JEANS CASE :
Application of Factor Analysis

• Samad found that product varieties, brand/name/image, marketing campaigns, updated trends in style, functional usage and feel are the factors that influence consumers' jeans brand purchases. Is it true? Explain

• Samad suggested that the company should invest in innovative product styles and building brand trust to create differentiation at the utmost level?

• Is there any difference between male and female consumers regarding these factors? Find out and explain by doing extra analysis
Appendix : SPSS
Screen Captures
SPSS Principal Components Analysis
The following are the detailed steps for running principal components analysis on the toothpaste attribute ratings (V1 to V6) using the data of Table 19.1.
Thank You
