PURPOSE
6/14/2023
FACTOR ANALYSIS
APPLICATIONS OF PCA
❖ A retail chain may want to understand how consumers select stores for
their purchases based on 80 different characteristics (e.g., store type,
services offered).
→ 80 variables may be too many to understand how consumers make
decisions and develop action plans. Instead, a few more general
dimensions would be helpful for creating profiles (e.g. “salesperson”,
“product range”)
❖ upper-lip
❖ eyebrow-position
❖ nose-width
❖ eye-position
❖ eye/eyebrow-length
❖ face-width
BASIC CONCEPT
• Identifying areas of variance in data to best discriminate between key
underlying phenomena observed
– Areas of greatest “signal” in the data
• We want a smaller set of variables that explain most of the variance in the
original data, in more compact and insightful form
PRINCIPAL COMPONENTS
• The first principal component is the direction of greatest variability
(variance) in the data
• The direction of greatest variability is that in which the average square
of the projection of the (mean-centred) data is greatest
• And so on …
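As a rough illustration of the idea above, the following minimal sketch finds the first principal component as the leading eigenvector of the sample covariance matrix. The function name `principal_components` and the simulated two-variable data are assumptions made for this example; they are not taken from the slides.

```python
import numpy as np

def principal_components(X):
    """Eigenvalues and unit eigenvectors of the sample covariance matrix,
    ordered from largest to smallest eigenvalue."""
    Xc = X - X.mean(axis=0)                  # centre each variable
    cov = np.cov(Xc, rowvar=False)           # q x q covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Hypothetical data: two strongly related variables.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
X = np.column_stack([x, 0.5 * x + 0.1 * rng.normal(size=500)])

eigvals, eigvecs = principal_components(X)
pc1 = eigvecs[:, 0]                          # direction of greatest variance

# The variance of the data projected onto pc1 equals the largest eigenvalue.
projected_variance = np.var((X - X.mean(axis=0)) @ pc1, ddof=1)
```

The later columns of `eigvecs` are orthogonal to `pc1` and are ordered by decreasing variance.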
To explain all the variation in the original data, we would need (in
general) all q principal components.
Most of the variation, however, can usually be explained using merely the
“first few” principal components.
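One way to see how much the first few components capture is to look at each eigenvalue's share of the total. The sketch below assumes simulated data in which six observed variables are driven by two hidden dimensions; the data and variable names are illustrative only.

```python
import numpy as np

# Hypothetical data: q = 6 observed variables generated from 2 latent dimensions.
rng = np.random.default_rng(1)
n, q = 300, 6
latent = rng.normal(size=(n, 2))
weights = rng.normal(size=(2, q))
X = latent @ weights + 0.2 * rng.normal(size=(n, q))

# Eigenvalues of the covariance matrix, largest first.
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]

explained = eigvals / eigvals.sum()   # share of total variance per component
cumulative = np.cumsum(explained)     # share explained by the first k components

for k, share in enumerate(cumulative, start=1):
    print(f"first {k} component(s): {share:.1%} of total variance")
```

Only with all q components does the cumulative share reach 100%.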
DIMENSIONALITY REDUCTION
We can ignore the components of lesser significance.
You do lose some information, but if the eigenvalues are small, you don’t lose
much
– q dimensions in original data
– calculate q eigenvectors and eigenvalues
– choose only the first m eigenvectors, based on their eigenvalues
– final data set has only m dimensions
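The steps above can be sketched as follows. This is a minimal illustration on simulated data, not a reference implementation; the helper name `reduce_dimensions` and the choice m = 2 are assumptions for the example.

```python
import numpy as np

def reduce_dimensions(X, m):
    """Project q-dimensional data onto its first m principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :m]                           # q x m matrix of kept eigenvectors
    scores = Xc @ W                              # n x m reduced data set
    return scores, W, mean, eigvals

# Hypothetical data: 5 variables, two of which carry most of the variance.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5)) @ np.diag([3.0, 2.0, 0.3, 0.2, 0.1])

scores, W, mean, eigvals = reduce_dimensions(X, m=2)

# Map the reduced data back to q dimensions to measure what was lost.
X_approx = scores @ W.T + mean
lost = np.sum((X - X_approx) ** 2) / (len(X) - 1)
print(lost, eigvals[2:].sum())   # the two numbers agree up to rounding
```

The information lost equals the sum of the discarded eigenvalues, which is why dropping components with small eigenvalues costs little.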
First criterion:
ASSUMPTIONS
1. GIGO (garbage in, garbage out)
2. Sample size
3. Levels of measurement
4. Normality
5. Linearity
6. Outliers
7. Factorability
ASSUMPTION TESTING:
SAMPLE SIZE
Some guidelines:
ASSUMPTION TESTING:
LEVEL OF MEASUREMENT
ASSUMPTION TESTING:
NORMALITY
ASSUMPTION TESTING:
LINEARITY
ASSUMPTION TESTING:
OUTLIERS
ASSUMPTION TESTING:
FACTORABILITY
Check the factorability of the correlation matrix (i.e., how suitable
is the data for factor analysis?) by one or more of the following
methods:
ASSUMPTION TESTING:
FACTORABILITY (CORRELATIONS)
Are there SOME correlations over .3? If so, proceed
with PCA
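A quick way to run this check is to inspect the off-diagonal entries of the correlation matrix. The sketch below is one possible approach; it takes absolute values so that strong negative correlations also count, which is an assumption beyond what the slide states. The function name and simulated data are illustrative.

```python
import numpy as np

def share_of_correlations_above(X, threshold=0.3):
    """Correlation matrix and the fraction of variable pairs whose
    absolute correlation exceeds the threshold."""
    R = np.corrcoef(X, rowvar=False)
    pairs = np.abs(R[np.triu_indices_from(R, k=1)])   # each pair counted once
    return R, float(np.mean(pairs > threshold))

rng = np.random.default_rng(4)

# Hypothetical data set 1: five variables sharing one common dimension.
common = rng.normal(size=(500, 1))
X_related = common + 0.5 * rng.normal(size=(500, 5))

# Hypothetical data set 2: five unrelated variables.
X_unrelated = rng.normal(size=(500, 5))

_, share_related = share_of_correlations_above(X_related)
_, share_unrelated = share_of_correlations_above(X_unrelated)
print(share_related, share_unrelated)
```

If no pair exceeds the threshold, as in the second data set, there is little shared variance for components to summarise.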
ASSUMPTION TESTING:
FACTORABILITY: MEASURES OF SAMPLING
ADEQUACY
❖ Global diagnostic indicators - correlation matrix is factorable if:
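The specific criteria from this slide are not preserved here. As a hedged illustration only, the sketch below computes two commonly used global diagnostics, the overall Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and the Bartlett sphericity test statistic. That these are the indicators the slide refers to is an assumption, and the function names and simulated data are hypothetical.

```python
import numpy as np

def kmo_overall(R):
    """Overall KMO: squared correlations relative to squared correlations
    plus squared partial correlations (off-diagonal entries only)."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                       # partial correlation matrix
    off = ~np.eye(R.shape[0], dtype=bool)        # mask for off-diagonal entries
    r2 = np.sum(R[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + p2)

def bartlett_statistic(R, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test that
    the correlation matrix is an identity matrix."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df

# Hypothetical data: five variables sharing one common dimension.
rng = np.random.default_rng(5)
n = 400
X = rng.normal(size=(n, 1)) + 0.5 * rng.normal(size=(n, 5))
R = np.corrcoef(X, rowvar=False)

kmo = kmo_overall(R)
chi2, df = bartlett_statistic(R, n)
print(kmo, chi2, df)
```

A KMO closer to 1 and a large Bartlett statistic relative to its degrees of freedom both point toward a factorable correlation matrix.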
EXPLAINED VARIANCE
SCREE PLOT
[Diagram, shown on two consecutive slides: observed variables (Variable 2 to Variable 5 are legible) linked to Factor 1, Factor 2 and Factor 3]
INITIAL SOLUTION:
UNROTATED FACTOR STRUCTURE
❖ Some variables may not load highly on any factors (check: low
communality)
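To make the communality check concrete, the sketch below computes unrotated loadings from the correlation matrix and sums their squares per variable. It assumes principal-component extraction and simulated data with one deliberately unrelated variable; the function name is illustrative.

```python
import numpy as np

def loadings_and_communalities(X, m):
    """Unrotated loadings on the first m components of the correlation
    matrix, and each variable's communality (sum of squared loadings)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:m]        # indices of the m largest
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    communalities = np.sum(loadings ** 2, axis=1)
    return loadings, communalities

# Hypothetical data: four related variables plus one unrelated variable.
rng = np.random.default_rng(3)
f = rng.normal(size=(400, 1))
related = f + 0.5 * rng.normal(size=(400, 4))
unrelated = rng.normal(size=(400, 1))
X = np.column_stack([related, unrelated])

loadings, communalities = loadings_and_communalities(X, m=1)
print(np.round(communalities, 2))   # the last variable stands out as low
```

A variable with low communality shares little variance with the retained factors and is a candidate for removal or for a solution with more factors.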
❖ Orthogonal (Varimax): produces uncorrelated factors
❖ Oblique (Oblimin): allows correlations between factors
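For the orthogonal case, the following is a minimal sketch of a widely used iterative varimax algorithm based on the singular value decomposition. It is one common formulation, not necessarily what a given statistics package does internally, and the example loading matrix is made up. Oblimin is not covered here.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Rotate a (variables x factors) loading matrix with an orthogonal
    matrix chosen to increase the varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    total = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated * (np.sum(rotated ** 2, axis=0) / p)
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                        # nearest orthogonal matrix
        new_total = np.sum(s)
        if total != 0 and new_total < total * (1 + tol):
            break                                # no meaningful improvement
        total = new_total
    return loadings @ rotation, rotation

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.sum(np.var(L ** 2, axis=0)))

# Hypothetical unrotated loadings: a clean two-factor pattern turned by 30 degrees.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.9, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.9]])
angle = np.deg2rad(30)
turn = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle),  np.cos(angle)]])
unrotated = simple @ turn

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, each variable's communality is unchanged; only the way variance is distributed across factors differs.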
❖ Consider interpretability
INTERPRETABILITY
❖ It is dangerous to be driven by factor loadings only. Think
carefully, and be guided by theory and common sense when
selecting a factor structure.
FACTOR MATRIX
FACTOR MATRIX
Healthful
Artificial
Popularity
Interesting/exciting
❖ Choose a loading cut-off such that factors can be interpreted from the
loadings above it but not from those below it
LIMITATIONS
Best Practice:
Consequently, a standard procedure in factor analysis should be to divide
the sample randomly into two or more groups and run a factor analysis on
each group independently. If the same factors emerge in each analysis,
confidence that the results do not represent a statistical accident is
increased.
CASE OVERVIEW
1. Agriculture
2. Mining
3. Construction
4. Manufacturing (durable goods)
5. Manufacturing (nondurable goods)
6. Transportation
7. Communications
8. Electricity, gas, sanitation
9. Wholesale trade
10. Retail trade
11. Fiduciary, insurance, real estate
12. Services
13. Government

The sample contains 50 observations, one for each of the 50 US states.