Scaling

bow83755_app H_001-012.
qxd
23/5/08
4:52 PM
Page H1
Appendix H
APPENDIX H: Factor Analysis, Cluster Analysis, and Multidimensional Scaling¹
In the following two exercises, we illustrate factor analysis, cluster analysis, and multidimensional scaling.
H.1 Factor Analysis: An Application of Correlation

A personnel officer interviewed and rated 48 job applicants on the following 15 variables.
 1 Form of application letter
 2 Appearance
 3 Academic ability
 4 Likeability
 5 Self-confidence
 6 Lucidity
 7 Honesty
 8 Salesmanship
 9 Experience
10 Drive
11 Ambition
12 Grasp
13 Potential
14 Keenness to join
15 Suitability
In order to better understand the relationships between the 15 variables, the personnel officer will
use a technique called factor analysis. The first step in factor analysis is to standardize each variable. A variable is standardized by calculating the mean and the standard deviation of the 48 values
of the variable and then subtracting from each value the mean and dividing the resulting difference
by the standard deviation. The variance of the values of each standardized variable can be shown
to be equal to 1, and the pairwise correlations between the standardized variables can be shown to
be equal to the pairwise correlations between the original variables. Although we will not give the
48 values of each of the 15 variables (see Kendall (1980)), we present in Table H.1 a matrix containing the pairwise correlations of these variables. Considering the matrix, we note that there are
so many fairly large pairwise correlations that it is difficult to understand the relationships between
the 15 variables. When we use factor analysis, we determine whether there are uncorrelated
factors, fewer in number than 15, that (1) explain a large percentage of the total variation in the 15
variables and (2) help us to understand the relationships between the 15 variables.
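The standardization step described above can be sketched as follows. This is a minimal illustration, not the personnel officer's actual computation: the two short rating lists are hypothetical (the 48 applicant ratings are not reproduced in this appendix), and the sample standard deviation (divisor n - 1) is an assumption, since the text does not say which divisor is used.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def sample_std(xs):
    # Sample standard deviation (divisor n - 1); an assumed convention.
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def standardize(xs):
    # Subtract the mean from each value, then divide by the standard deviation.
    m, s = mean(xs), sample_std(xs)
    return [(x - m) / s for x in xs]

def correlation(xs, ys):
    # Pearson pairwise correlation.
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratings for two variables (not the Kendall data).
appearance = [6, 9, 7, 5, 6, 7, 9, 9]
drive = [5, 8, 6, 6, 8, 7, 8, 9]

z_appearance = standardize(appearance)
z_drive = standardize(drive)

variance_z = sample_std(z_appearance) ** 2
r_original = correlation(appearance, drive)
r_standardized = correlation(z_appearance, z_drive)

print("variance of standardized variable:", round(variance_z, 6))
print("correlation before and after:", round(r_original, 6), round(r_standardized, 6))
```

The two printed correlations agree, which is the property the text relies on when it works from the correlation matrix in Table H.1 rather than from the raw ratings.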
To find the desired factors, we first find what are called principal components. The first principal component is the composite of the 15 standardized variables that explains the highest percentage of the total of the variances of these variables. The SPSS output in Figure H.1 tells us that the first principal component is

$$y_{(1)} = .44676x_1 + .58285x_2 + .10900x_3 + \cdots + .64584x_{15}$$

where $x_1, x_2, \ldots, x_{15}$ denote the 15 standardized variables. Here, the coefficient multiplied by each $x_i$ is called the factor loading of $y_{(1)}$ on $x_i$ and can be shown to equal the pairwise correlation between $y_{(1)}$ and $x_i$. For example, the factor loading .58285 says that the pairwise correlation between $y_{(1)}$ and $x_2$ is .58285. The SPSS output also tells us that the variance (or eigenvalue) of the 48 values of $y_{(1)}$ is 7.50395. Furthermore, since the sum of the variances of the 15 standardized variables is 15, the SPSS output tells us that the variance of $y_{(1)}$ explains $(7.50395/15) \times 100\% = 50\%$ of the total variation in the standardized variables. Similarly, the SPSS output shows the second principal component, which has a variance of 2.06148 and explains $(2.06148/15) \times 100\% = 13.7\%$ of the total variation in the standardized variables. In all, there are 15 principal components that are uncorrelated with each other and explain a cumulative percentage of 100 percent of the total variation in the 15 variables. Also, note that the variance of a particular principal component can be shown to equal the sum of the squared pairwise correlations between the principal component and the 15 standardized variables. For example, examining the first column of pairwise correlations in the upper portion of Figure H.1, it follows that the variance of the first principal component is

$$(.44676)^2 + (.58285)^2 + \cdots + (.64584)^2 = 7.50395$$
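The arithmetic in this paragraph can be checked directly. The short sketch below uses the 15 loadings in the FACTOR 1 column of the upper portion of Figure H.1; the variable names are illustrative only.

```python
# Factor loadings of the first principal component on x1, ..., x15,
# read from the FACTOR 1 column of the upper portion of Figure H.1.
first_pc_loadings = [
    0.44676, 0.58285, 0.10900, 0.61698, 0.79807,
    0.86688, 0.43330, 0.88244, 0.36549, 0.86261,
    0.87185, 0.90776, 0.91310, 0.71033, 0.64584,
]

# Variance (eigenvalue) = sum of the squared loadings.
variance_pc1 = sum(loading ** 2 for loading in first_pc_loadings)

# Each standardized variable has variance 1, so the total variance is 15.
total_variance = len(first_pc_loadings)
pct_explained = 100 * variance_pc1 / total_variance

print(round(variance_pc1, 5))   # close to the eigenvalue 7.50395 on the SPSS output
print(round(pct_explained, 1))  # percentage of total variation explained
```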
Although the SPSS output shows the percentage of the total variation explained by each of
the 15 principal components, it only shows 7 of these principal components. The reason is that,
¹ Some of the discussion and three examples in this appendix are based on Chapters 15 and 16 in Intermediate Statistical Methods and Applications, A Computer Package Approach (Prentice Hall, 1983) by Mark L. Berenson, David M. Levine, and Mathew Goldstein.
TABLE H.1 A Matrix of Pairwise Correlations for the Applicant Data
(The matrix is symmetric; each row lists the correlations with variables 1 through the row's own variable.)

Variable     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
 1        1.00
 2         .24  1.00
 3         .04   .12  1.00
 4         .31   .38   .00  1.00
 5         .09   .43   .00   .30  1.00
 6         .23   .37   .08   .48   .81  1.00
 7         .11   .35   .03   .65   .41   .36  1.00
 8         .27   .48   .05   .35   .82   .83   .23  1.00
 9         .55   .14   .27   .14   .02   .15   .16   .23  1.00
10         .35   .34   .09   .39   .70   .70   .28   .81   .34  1.00
11         .28   .55   .04   .35   .84   .76   .21   .86   .20   .78  1.00
12         .34   .51   .20   .50   .72   .88   .39   .77   .30   .71   .78  1.00
13         .37   .51   .29   .61   .67   .78   .42   .73   .35   .79   .77   .88  1.00
14         .47   .28   .32   .69   .48   .53   .45   .55   .21   .61   .55   .55   .54  1.00
15         .59   .38   .14   .33   .25   .42   .00   .55   .69   .62   .43   .53   .57   .40  1.00
Source: Reproduced by permission of the Publishers, Charles Griffin & Company Ltd., of London and High Wycombe, from Kendall, Multivariate
Analysis, 2nd. (1980).
since we wish to obtain final factors that are fewer in number than the number of original variables, we have instructed SPSS to retain 7 principal components for further study. The choice
of 7 principal components, while somewhat arbitrary, is based on the belief that 7 principal components will explain a high percentage of the total variation in the 15 variables. The SPSS output
FIGURE H.1 SPSS Output of a Factor Analysis of the Applicant Data (7 Factors Used)

FACTOR MATRIX USING PRINCIPAL FACTOR, NO ITERATIONS

       FACTOR 1(a)  FACTOR 2(b)  FACTOR 3   FACTOR 4   FACTOR 5   FACTOR 6   FACTOR 7
x1       0.44676      0.61880     0.37635   -0.12148    0.10168    0.42496    0.08504
x2       0.58285     -0.05019    -0.01995    0.28167    0.75188   -0.03325    0.00345
x3       0.10900      0.33907    -0.49450    0.71393   -0.18095    0.16113    0.18206
x4       0.61698     -0.18150     0.57968    0.35707   -0.09904    0.07837   -0.05714
x5       0.79807     -0.35611    -0.29930   -0.17939    0.00025    0.00377    0.06620
x6       0.86688     -0.18544    -0.18414   -0.06923   -0.17813    0.11744   -0.30132
x7       0.43330     -0.58195     0.36036    0.44570   -0.06052   -0.21591    0.06539
x8       0.88244     -0.05647    -0.24821   -0.22786    0.02960   -0.06262    0.00981
x9       0.36549      0.79438     0.09258    0.07431   -0.08999   -0.25962   -0.06758
x10      0.86261      0.06908    -0.09993   -0.16645   -0.17554   -0.17549    0.29665
x11      0.87185     -0.09840    -0.25565   -0.20948    0.13698    0.07573    0.12514
x12      0.90776     -0.03023    -0.13453    0.09726   -0.06359    0.10194   -0.24685
x13      0.91310      0.03250    -0.07327    0.21842   -0.10489    0.04666   -0.00366
x14      0.71033     -0.11478     0.55801   -0.23496   -0.10071    0.05911    0.14353
x15      0.64584      0.60374     0.10687   -0.02889    0.06431   -0.29308   -0.10537

FACTOR   EIGENVALUE   PCT OF VAR   CUM PCT
 1       7.50395(a)      50.0        50.0
 2       2.06148(b)      13.7        63.8
 3       1.46768          9.8        73.6
 4       1.20910          8.1        81.6
 5       0.74143          4.9        86.6
 6       0.48402          3.2        89.8
 7       0.34408          2.3        92.1
 8       0.31027          2.1        94.1
 9       0.25965          1.7        95.9
10       0.20575          1.4        97.2
11       0.15093          1.0        98.3
12       0.09327          0.6        98.9
13       0.07628          0.5        99.4
14       0.05766          0.4        99.8
15       0.03441          0.2       100.0

VARIMAX ROTATED FACTOR MATRIX

       FACTOR 1   FACTOR 2   FACTOR 3   FACTOR 4   FACTOR 5   FACTOR 6   FACTOR 7   COMMUNALITY
x1      0.12359    0.04204    0.42738   -0.00497    0.85336    0.09437    0.01521     0.93708
x2      0.32636    0.21176    0.11729    0.05621    0.07715    0.90226    0.01101     0.98841
x3      0.05396   -0.02816    0.13368    0.97451   -0.01201    0.03936    0.00014     0.97293
x4      0.22106    0.85846    0.13049   -0.01215    0.26494    0.09997    0.11479     0.89636
x5      0.91144    0.15413   -0.08310   -0.04208   -0.05072    0.13904   -0.06943     0.88989
x6      0.87938    0.25709    0.10119    0.01702    0.05912   -0.00285    0.32778     0.96088
x7      0.20161    0.87606   -0.13423    0.00066   -0.22952    0.16057   -0.06982     0.90948
x8      0.90070    0.07788    0.21967   -0.05953    0.05564    0.16142   -0.04510     0.90031
x9      0.06497   -0.03039    0.88690    0.16270    0.20105   -0.01159   -0.00158     0.85878
x10     0.79694    0.20942    0.35909    0.02333    0.10603   -0.02550   -0.34034     0.93618
x11     0.89427    0.06033    0.08680   -0.01813    0.16585    0.26018   -0.11304     0.91921
x12     0.79690    0.30629    0.23598    0.14095    0.12915    0.14954    0.29053     0.92786
x13     0.73031    0.40428    0.29019    0.26489    0.16244    0.14080    0.06079     0.90108
x14     0.45932    0.56662    0.16988   -0.38607    0.42522   -0.03753   -0.16248     0.91856
x15     0.33966    0.07614    0.84300   -0.01002    0.18417    0.17055    0.00713     0.89497

(a) First principal component has variance 7.50395.
(b) Second principal component has variance 2.06148.
tells us that this choice is reasonable: the first 7 principal components explain 92.1 percent of the total variation in the 15 variables. The reason that we need to further study the 7 principal components is that, in general, principal components tend to be correlated with many of the variables (see the factor loadings on the SPSS output) and thus tend to be difficult to interpret in a meaningful way. For this reason, we rotate the 7 principal components by using VARIMAX rotation. This technique attempts to find final uncorrelated factors, each of which loads highly on (that is, is strongly correlated with) a limited number of the 15 original standardized variables and loads as low as possible on the rest of the standardized variables. The SPSS output shows the results of the VARIMAX rotation. Examining the check marks that we have placed on the output, we see that Factor 1 loads heavily on variables 5 (self-confidence), 6 (lucidity), 8 (salesmanship), 10 (drive), 11 (ambition), 12 (grasp), and 13 (potential). Therefore, Factor 1 might be interpreted as an "extroverted personality" dimension. Factor 2 loads heavily on variables 4 (likeability) and 7 (honesty). Therefore, Factor 2 might be interpreted as an "agreeable personality" dimension. Similarly, Factors 3 through 7 might be interpreted as the following dimensions: Factor 3: "experience"; Factor 4: "academic ability"; Factor 5: "form of application letter"; Factor 6: "appearance"; Factor 7: no discernible dimension. Note that, although variable 14 (keenness to join) does not load heavily on any factor, its correlation of .56662 with Factor 2 ("agreeable personality") might mean that it should be interpreted to be part of the agreeable personality dimension.
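To make the idea behind VARIMAX rotation concrete, here is a small brute-force sketch for the special case of two factors. It is not the algorithm SPSS uses (which handles any number of factors and typically applies a row normalization first); it simply tries a grid of rotation angles and keeps the one that maximizes the varimax criterion, the sum over factors of the variance of the squared loadings. The loading matrix is hypothetical.

```python
import math

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        squares = [row[j] ** 2 for row in loadings]
        mean_sq = sum(squares) / p
        total += sum((s - mean_sq) ** 2 for s in squares) / p
    return total

def rotate(loadings, angle):
    """Apply one orthogonal (rigid) rotation to every row of a two-factor matrix."""
    c, s = math.cos(angle), math.sin(angle)
    return [[x * c - y * s, x * s + y * c] for x, y in loadings]

def varimax_two_factors(loadings, steps=90):
    """Brute force: try angles 0, 1, ..., 89 degrees and keep the best one."""
    angles = [math.radians(90.0 * k / steps) for k in range(steps)]
    best = max(angles, key=lambda a: varimax_criterion(rotate(loadings, a)))
    return rotate(loadings, best)

# Hypothetical example: a clean two-factor pattern, deliberately rotated by
# 30 degrees so that every variable loads on both factors.
simple = [[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]]
unrotated = rotate(simple, math.radians(30))

rotated = varimax_two_factors(unrotated)
for row in rotated:
    print([round(value, 3) for value in row])
```

Because the rotation is orthogonal, each variable's communality (its row sum of squared loadings) is unchanged; only the way that variance is split between the factors changes, which is what makes the rotated loadings easier to interpret.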
We next note that the communality to the right of each variable in Figure H.1 is the proportion of the variance of the variable that is explained by the 7 factors. The communality for each variable can be shown to equal the sum of the squared pairwise correlations between the variable and the 7 factors. For example, examining the first row of pairwise correlations in the lower portion of Figure H.1, it follows that the communality for variable 1 is

$$(.12359)^2 + (.04204)^2 + \cdots + (.01521)^2 = .93708$$
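The communality calculation can be verified with the seven rotated loadings for $x_1$ in the lower portion of Figure H.1. This is only an arithmetic check; the variable names are illustrative.

```python
# Rotated loadings of x1 (form of application letter) on Factors 1-7,
# read from the VARIMAX ROTATED FACTOR MATRIX in Figure H.1.
x1_rotated_loadings = [0.12359, 0.04204, 0.42738, -0.00497, 0.85336, 0.09437, 0.01521]

# Communality = sum of squared correlations between the variable and the factors.
communality_x1 = sum(loading ** 2 for loading in x1_rotated_loadings)

print(round(communality_x1, 5))  # close to the .93708 shown on the SPSS output
```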
All of the communalities in Figure H.1 seem high. However, some statisticians might say that we have retained too many factors. To understand this, note that the upper portion of Figure H.1 tells us that the sum of the variances of the first seven factors is

$$7.50395 + 2.06148 + 1.46768 + 1.20910 + .74143 + .48402 + .34408 = 13.81174$$

This variance is $(13.81174/15) \times 100\% = 92.1\%$ of the sum of the variances of the 15 standardized variables. Some statisticians would suggest that we retain a factor only if its variance exceeds 1, the variance of each standardized variable. If we do this, we would retain 4 factors, since the variance of the fourth factor is 1.20910 and the variance of the fifth factor is .74143. Figure H.2 gives the SAS output obtained by using 4 factors. Examining the check marks that we have placed on the output, we see that Factors 1 through 4 might be interpreted as follows: Factor 1: "extroverted personality"; Factor 2: "experience"; Factor 3: "agreeable personality"; Factor 4: "academic ability." Variable 2 (appearance) does not load heavily on any of these four factors and thus is best treated as its own factor, which agrees with Factor 6 on the SPSS output in Figure H.1. Variable 1 (form of application letter) loads heavily on Factor 2 ("experience"). In summary, there is not much difference between the 7 factor and 4 factor solutions. We might therefore conclude that the 15 variables can be reduced to the following 5 uncorrelated factors: "extroverted personality," "experience," "agreeable personality," "academic ability," and "appearance."
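The "retain a factor only if its variance exceeds 1" rule and the cumulative percentages can be reproduced from the 15 eigenvalues listed in Figure H.1. The sketch below is a simple arithmetic check; the variable names are illustrative.

```python
# Eigenvalues (variances) of the 15 principal components, from Figure H.1.
eigenvalues = [
    7.50395, 2.06148, 1.46768, 1.20910, 0.74143,
    0.48402, 0.34408, 0.31027, 0.25965, 0.20575,
    0.15093, 0.09327, 0.07628, 0.05766, 0.03441,
]

# Total variance of 15 standardized variables is 15 (each has variance 1).
total_variance = len(eigenvalues)

# Retention rule: keep a factor only if its variance exceeds 1.
retained = [ev for ev in eigenvalues if ev > 1]

# Cumulative percentage of total variation explained.
cumulative_pct = []
running = 0.0
for ev in eigenvalues:
    running += ev
    cumulative_pct.append(round(100 * running / total_variance, 1))

print(len(retained))      # number of factors kept by the rule
print(cumulative_pct[3])  # cumulative percentage for the first 4 factors
print(cumulative_pct[6])  # cumulative percentage for the first 7 factors
```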
a In Applied Multivariate Techniques (John Wiley and Sons, 1996), Subhash Sharma considers a study in which 143 respondents rated three brands of laundry detergents on 12 product attributes using a 5-point Likert scale. The 12 product attributes are:
V1: Gentle to natural fabrics
V2: Won't harm colors
V3: Won't harm synthetics
V4: Safe for lingerie
V5: Strong, powerful
V6: Gets dirt out
V7: Makes colors bright
V8: Removes grease stains
V9: Good for greasy oil
V10: Pleasant fragrance
V11: Removes collar soil
V12: Removes stubborn stains
FIGURE H.2 SAS Output of a Factor Analysis of the Applicant Data (4 Factors Used)

PRINCIPAL AXIS

PRIOR ESTIMATES OF COMMUNALITY
X1 through X9 and XA through XF: 1.000000 each

                  1         2         3         4         5         6         7         8
EIGENVALUES    7.503986  2.061498  1.467686  1.209097  0.741423  0.484018  0.344075  0.310272
PORTION        0.500     0.137     0.098     0.081     0.049     0.032     0.023     0.021
CUM PORTION    0.500     0.638     0.736     0.816     0.866     0.898     0.921     0.941

                  9        10        11        12        13        14        15
EIGENVALUES    0.259652  0.205746  0.150932  0.093269  0.076283  0.057655  0.034407
PORTION        0.017     0.014     0.010     0.006     0.005     0.004     0.002
CUM PORTION    0.959     0.972     0.983     0.989     0.994     0.998     1.000

4 FACTORS WILL BE RETAINED.

FACTOR PATTERN
      FACTOR1    FACTOR2    FACTOR3    FACTOR4
X1    0.44676    0.61880    0.37635   -0.12148   FORM OF APPLICATION LETTER
X2    0.58285   -0.05019   -0.01995    0.28166   APPEARANCE
X3    0.10900    0.33907   -0.49449    0.71391   ACADEMIC ABILITY
X4    0.61699   -0.18149    0.57967    0.35706   LIKEABILITY
X5    0.79807   -0.35610   -0.29930   -0.17939   SELF CONFIDENCE
X6    0.86688   -0.18543   -0.18414   -0.06923   LUCIDITY
X7    0.43330   -0.58195    0.36035    0.44569   HONESTY
X8    0.88244   -0.05647   -0.24821   -0.22786   SALESMANSHIP
X9    0.36549    0.79437    0.09258    0.07431   EXPERIENCE
XA    0.86261    0.06908   -0.09993   -0.16645   DRIVE
XB    0.87186   -0.09840   -0.25564   -0.20948   AMBITION
XC    0.90776   -0.03023   -0.13453    0.09726   GRASP
XD    0.91310    0.03250   -0.07327    0.21842   POTENTIAL
XE    0.71033   -0.11478    0.55800   -0.23495   KEENNESS TO JOIN
XF    0.64584    0.60373    0.10687   -0.02889   SUITABILITY

VARIMAX ROTATED FACTOR PATTERN
      FACTOR1    FACTOR2    FACTOR3    FACTOR4
X1    0.11447    0.83336    0.11063   -0.13808   FORM OF APPLICATION LETTER
X2    0.43964    0.14979    0.39417    0.22555   APPEARANCE
X3    0.06115    0.12744    0.00557    0.92792   ACADEMIC ABILITY
X4    0.21559    0.24667    0.87360   -0.08137   LIKEABILITY
X5    0.91896   -0.10368    0.16241   -0.06219   SELF CONFIDENCE
X6    0.86439    0.10195    0.25878    0.00642   LUCIDITY
X7    0.21715   -0.24607    0.86440    0.00341   HONESTY
X8    0.91799    0.20635    0.08773   -0.04938   SALESMANSHIP
X9    0.08530    0.84871   -0.05537    0.21919   EXPERIENCE
XA    0.79576    0.35407    0.15950   -0.05026   DRIVE
XB    0.91641    0.16268    0.10496   -0.04184   AMBITION
XC    0.80415    0.25872    0.34049    0.15153   GRASP
XD    0.73917    0.32885    0.42493    0.22980   POTENTIAL
XE    0.43597    0.36420    0.54105   -0.51862   KEENNESS TO JOIN
XF    0.37950    0.79807    0.07847    0.08221   SUITABILITY

VARIANCE EXPLAINED BY EACH FACTOR
      FACTOR1    FACTOR2    FACTOR3    FACTOR4
      5.745474   2.735065   2.413961   1.347767
Table H.2 is a matrix containing the pairwise correlations between the variables, and Figure H.3 is the SPSS output of a factor analysis of the detergent data. Why did the analyst choose to retain two factors? Discuss why Factor 1 can be interpreted to be the ability of the detergent to clean clothes. Discuss why Factor 2 can be interpreted to be the mildness of the detergent.
b Table H.3 shows the output of a factor analysis of the ratings of 82 respondents who were asked to evaluate a particular discount store on 29 attributes using a 7-point Likert scale. Interpret and give names to the five factors.
[Margin note: Good service/friendly; Price level; Attractiveness; Spaciousness; Size]
FIGURE H.3 SPSS Output of a Factor Analysis of the Detergent Data²

INITIAL STATISTICS:
VARIABLE   COMMUNALITY   FACTOR   EIGENVALUE   PCT OF VAR   CUM PCT
V1           .42052         1      6.30111        52.5        52.5
V2           .39947         2      1.81757        15.1        67.7
V3           .56533         3       .66416         5.5        73.2
V4           .56605         4       .57155         4.8        78.0
V5           .60467         5       .55995         4.7        82.6
V6           .57927         6       .44517         3.7        86.3
V7           .69711         7       .41667         3.5        89.8
V8           .74574         8       .32554         2.7        92.5
V9           .66607         9       .27189         2.3        94.8
V10          .59287        10       .25690         2.1        96.9
V11          .71281        11       .19159         1.6        98.5
V12          .64409        12       .17789         1.5       100.0

ROTATED FACTOR MATRIX:
        FACTOR 1   FACTOR 2
V1       .12289     .65101
V2       .13900     .64781
V3       .24971     .78587
V4       .29387     .74118
V5       .73261     .15469
V6       .73241     .20401
V7       .77455     .22464
V8       .85701     .20629
V9       .80879     .19538
V10      .69326     .23923
V11      .77604     .25024
V12      .79240     .19822

TABLE H.2 Correlation Matrix for Detergent Data²

         V1       V2       V3       V4       V5       V6       V7       V8       V9       V10      V11      V12
V1    1.00000  0.41901  0.51840  0.56641  0.18122  0.17454  0.23034  0.30647  0.24051  0.21192  0.27443  0.20694
V2    0.41901  1.00000  0.57599  0.49886  0.18666  0.24648  0.22907  0.22526  0.21967  0.25879  0.32132  0.25853
V3    0.51840  0.57599  1.00000  0.64325  0.29080  0.34428  0.41083  0.34028  0.32854  0.38828  0.39433  0.36712
V4    0.56641  0.49886  0.64325  1.00000  0.38360  0.39637  0.37699  0.40391  0.42337  0.36564  0.33691  0.36734
V5    0.18122  0.18666  0.29080  0.38360  1.00000  0.57915  0.59400  0.67623  0.69269  0.43873  0.55485  0.65261
V6    0.17454  0.24648  0.34428  0.39637  0.57915  1.00000  0.57756  0.70103  0.62280  0.62174  0.59855  0.57845
V7    0.23034  0.22907  0.41083  0.37699  0.59400  0.57756  1.00000  0.67682  0.68445  0.54175  0.78361  0.63889
V8    0.30647  0.22526  0.34028  0.40391  0.67623  0.70103  0.67682  1.00000  0.69813  0.68589  0.71115  0.71891
V9    0.24051  0.21967  0.32854  0.42337  0.69269  0.62280  0.68445  0.69813  1.00000  0.58579  0.64637  0.69111
V10   0.21192  0.25879  0.38828  0.36564  0.43873  0.62174  0.54175  0.68589  0.58579  1.00000  0.62250  0.63494
V11   0.27443  0.32132  0.39433  0.33691  0.55485  0.59855  0.78361  0.71115  0.64637  0.62250  1.00000  0.63973
V12   0.20694  0.25853  0.36712  0.36734  0.65261  0.57845  0.63889  0.71891  0.69111  0.63494  0.63973  1.00000
² The source of Table H.2 and Figure H.3 is Applied Multivariate Techniques by Subhash Sharma, John Wiley and Sons, Inc., New York, 1996.
H.2 Cluster Analysis and Multidimensional Scaling

Professional baseball and tennis were less popular in 2000 than in the late 1970s and early
1980s. To see why this might be true, we consider a study by Levine (1977) concerning the
perceptions of various sports in 1977. Levine had 45 undergraduate students give each of boxing (BX), basketball (BK), golf (G), swimming (SW), skiing (SK), baseball (BB), ping pong
(PP), hockey (HK), handball (H), track and field (TF), bowling (BW), tennis (T), and football
(F) an integer rating of 1 to 7 on six scales: fast moving (1) versus slow moving (7); complicated
rules (1) versus simple rules (7); team oriented (1) versus individual (7); easy to play (1) versus
hard to play (7); noncontact (1) versus contact (7); competition against opponent (1) versus competition against standard (7). The first two rows of Table H.4 present a particular undergraduate's ratings of boxing and basketball on each of the six scales, and Table H.5 presents the average rating by all 45 undergraduates of each sport on each of the six scales.
To better understand the perceptions of the 13 sports, we will cluster them into groups. The first
step in doing this is to consider the distance between each pair of sports for each undergraduate. For
example, to calculate the distance between boxing and basketball for the undergraduate whose ratings are given in Table H.4, we calculate the paired difference between the ratings on each of the six
scales, square each paired difference, sum the six squared paired differences, and find the square root
of this sum. The resulting distance is 5.9161. A distance for each undergraduate for each pair of
sports can be found, and then an average distance over the 45 undergraduates for each pair of sports
can be calculated. Statistical software packages do this, but these packages sometimes standardize
the individual ratings before calculating the distances. We will not discuss the various ways in which
TABLE H.3 Factor Analysis of the Discount Store Data³

                                              Factor
Scale                                   I     II    III    IV     V    Communality
 1. Good service                       .79   .15   .06   .12   .07      .67
 2. Helpful salespersons               .75   .03   .04   .13   .31      .68
 3. Friendly personnel                 .74   .07   .17   .09   .14      .61
 4. Clean                              .59   .31   .34   .15   .25      .65
 5. Pleasant store to shop in          .58   .15   .48   .26   .10      .67
 6. Easy to return purchases           .56   .23   .13   .03   .03      .39
 7. Too many clerks                    .53   .00   .02   .23   .37      .47
 8. Attracts upper-class customers     .46   .06   .25   .00   .17      .31
 9. Convenient location                .36   .30   .02   .19   .03      .26
10. High quality products              .34   .27   .31   .12   .25      .36
11. Good buys on products              .02   .88   .09   .10   .03      .79
12. Low prices                         .03   .74   .14   .00   .13      .59
13. Good specials                      .35   .67   .05   .10   .14      .60
14. Good sales on products             .30   .67   .01   .08   .16      .57
15. Reasonable value for price         .17   .52   .11   .02   .03      .36
16. Good store                         .41   .47   .47   .12   .11      .63
17. Low pressure salespersons          .20   .30   .28   .03   .05      .18
18. Bright store                       .02   .10   .75   .26   .05      .61
19. Attractive store                   .19   .03   .67   .34   .24      .66
20. Good displays                      .33   .15   .61   .15   .20      .57
21. Unlimited selections of products   .09   .00   .29   .03   .00      .09
22. Spacious shopping                  .00   .20   .00   .70   .10      .54
23. Easy to find items you want        .36   .16   .10   .57   .01      .49
24. Well-organized layout              .02   .05   .25   .54   .17      .39
25. Well-spaced merchandise            .20   .15   .27   .52   .16      .43
26. Neat                               .38   .12   .45   .49   .34      .72
27. Big store                          .20   .15   .06   .07   .65      .49
28. Ads frequently seen by you         .03   .20   .07   .09   .42      .23
29. Fast checkout                      .30   .16   .00   .25   .33      .28
Percentage of variance explained        16    12     9     8     5
Cumulative variance explained           16    28    37    45    50
³ The source of Table H.3 is Marketing Research, Sixth Edition, by David A. Aaker, V. Kumar, and George S. Day, John Wiley and Sons, Inc., New York, 1998.
TABLE H.4 A Particular Undergraduate's Ratings of Boxing and Basketball

                    (1) Fast Mvg.   (1) Compl.   (1) Team   (1) Easy to Play   (1) Ncon.   (1) Comp Opp.
Sport               (7) Slow Mvg.   (7) Simple   (7) Indv.  (7) Hard to Play   (7) Con.    (7) Comp Std.
Boxing                    3              5           7              4              6             1
Basketball                2              3           2              4              4             2
Paired Difference         1              2           5              0              2            -1

$$\text{Distance} = \sqrt{(1)^2 + (2)^2 + (5)^2 + (0)^2 + (2)^2 + (-1)^2} = \sqrt{35} = 5.9161$$
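The distance calculation shown with Table H.4 can be sketched in a few lines. The two rating vectors below are the boxing and basketball ratings as read from Table H.4; the function name is illustrative.

```python
import math

def euclidean_distance(ratings_a, ratings_b):
    # Square each paired difference, sum the squares, and take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ratings_a, ratings_b)))

# One undergraduate's ratings on the six scales, as read from Table H.4.
boxing = [3, 5, 7, 4, 6, 1]
basketball = [2, 3, 2, 4, 4, 2]

distance = euclidean_distance(boxing, basketball)
print(round(distance, 4))  # square root of 35
```

Repeating this for every pair of sports and every undergraduate, and then averaging over the 45 undergraduates, gives the kind of average distances reported in Table H.6 (apart from any standardization the software applies).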
such standardization can be done. Rather, we note that Table H.6 presents a matrix containing the
average distance over the 45 undergraduates for each pair of sports, and we note that this matrix has
been obtained by using a software package that uses a standardization procedure. There are many
different approaches to using the average distances to cluster the sports. We will discuss one approach: the hierarchical, complete linkage approach. Hierarchical clustering implies that, once
two sports are clustered together at a particular stage, they are considered to be permanently joined
and cannot be separated into different clusters at a later stage. Complete linkage bases the merger
of two clusters of sports (either cluster of which can be an individual sport) on the maximum distance between sports in the clusters. For example, since Table H.6 shows that the smallest average
distance is the average distance between football and hockey, which is 2.20, football and hockey are
TABLE H.5 Average Rating of Each Sport on Each of the Six Scales

                 (1) Fast Mvg.   (1) Compl.   (1) Team   (1) Easy to Play   (1) Ncon.   (1) Comp Opp.
Sport            (7) Slow Mvg.   (7) Simple   (7) Indv.  (7) Hard to Play   (7) Con.    (7) Comp Std.
Boxing               3.07           4.62        6.62          4.78            6.02          1.73
Basketball           1.84           3.78        1.56          3.82            4.89          2.27
Golf                 6.13           4.49        6.58          3.84            1.82          4.11
Swimming             2.87           5.02        5.29          3.64            2.22          4.36
Skiing               2.13           4.60        5.96          5.22            2.51          4.71
Baseball             4.78           4.18        2.16          3.33            3.60          2.67
Ping-Pong            3.18           5.13        5.38          2.91            2.04          2.20
Hockey               1.71           3.22        1.82          5.04            5.96          2.49
Handball             2.53           4.67        4.78          3.71            2.78          2.31
Track & field        2.82           4.38        4.47          3.84            2.89          3.82
Bowling              5.07           5.16        5.40          3.11            1.60          3.73
Tennis               2.89           3.78        5.47          4.09            2.16          2.42
Football             2.42           2.76        1.44          5.00            6.47          2.33

TABLE H.6 A Matrix Containing the Average Distances

Sport    BX     BK     G      SK     SW     BB     PP     HK     H      TF     BW     T
BK      3.85
G       4.33   4.88
SK      3.80   4.05   3.73
SW      3.81   3.81   3.56   2.84
BB      4.12   3.15   3.83   4.16   3.60
PP      3.74   3.56   3.61   3.67   2.72   3.41
HK      3.85   2.58   5.11   4.02   4.17   3.49   4.27
H       3.41   3.24   3.92   3.25   2.80   3.34   2.58   3.52
TF      3.81   3.36   3.88   3.20   2.84   3.37   3.06   3.72   2.75
BW      4.07   4.23   2.72   3.75   2.89   3.32   2.87   4.58   3.13   3.26
T       3.49   3.32   3.59   3.19   2.82   3.25   2.54   3.58   2.33   2.72   2.85
F       3.86   2.51   5.15   4.38   4.41   3.43   4.35   2.20   3.68   3.84   4.67   3.69

Stage markers on the original table identify the distances that determine the first four stages of clustering: (1) HK and F, 2.20; (2) H and T, 2.33; (3) BK and HK, 2.58; (4) PP and H, 2.58.
Source of Tables H.4, H.5, and H.6 and of Figures H.4 and H.5: D. M. Levine, "Nonmetric Multidimensional Scaling and Hierarchical Clustering: Procedures for the Investigation of the Perception of Sports," Research Quarterly, Vol. 48 (1977), pp. 341-348.
clustered together in the first stage of clustering (see the tree diagram in Figure H.4). Since the second smallest average distance is the average distance between tennis and handball, which is 2.33,
tennis and handball are clustered together in the second stage of clustering. The third smallest average distance is the average distance between football and basketball, which is 2.51, but football has
already been clustered with hockey. The average distance between hockey and basketball is 2.58,
and so the average distance between basketball and the cluster containing football and hockey is
2.58, the maximum of 2.51 and 2.58. This average distance is equal to the average distance between ping pong and the cluster containing tennis and handball, which (as shown in Table H.6) is the maximum of 2.54 and 2.58, that is, 2.58. There is no other average distance as small as 2.58.
Furthermore, note that the distance between basketball and football is 2.51, whereas the distance
between ping pong and tennis is 2.54. Therefore, we will break the tie between the two average
distances of 2.58 by adding basketball to the cluster containing football and hockey in the third stage
of clustering. Then, we add ping pong to the cluster containing tennis and handball in the fourth
stage of clustering. Figure H.4 shows the results of all 12 stages of clustering.
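The first four stages described above can be reproduced with a short complete linkage sketch. To keep it small, it uses only six of the 13 sports (BK, HK, F, PP, H, T) with their average distances read from Table H.6, so only the first four merges correspond to the stages in the text; the fifth merge simply joins the two remaining groups and is not a stage of the full analysis. The tie-breaking rule (prefer the pair of clusters with the smaller minimum distance) mirrors the reasoning in the text, and the function names are illustrative.

```python
from itertools import combinations

# Average distances read from Table H.6 for six of the 13 sports.
DIST = {
    frozenset(("BK", "HK")): 2.58, frozenset(("BK", "F")): 2.51,
    frozenset(("BK", "PP")): 3.56, frozenset(("BK", "H")): 3.24,
    frozenset(("BK", "T")): 3.32, frozenset(("HK", "F")): 2.20,
    frozenset(("HK", "PP")): 4.27, frozenset(("HK", "H")): 3.52,
    frozenset(("HK", "T")): 3.58, frozenset(("F", "PP")): 4.35,
    frozenset(("F", "H")): 3.68, frozenset(("F", "T")): 3.69,
    frozenset(("PP", "H")): 2.58, frozenset(("PP", "T")): 2.54,
    frozenset(("H", "T")): 2.33,
}

def linkage_key(cluster_a, cluster_b):
    pair_distances = [DIST[frozenset((a, b))] for a in cluster_a for b in cluster_b]
    # Complete linkage uses the maximum distance between members;
    # ties are broken by the smaller minimum distance, as in the text.
    return (max(pair_distances), min(pair_distances))

def complete_linkage(items):
    clusters = [(item,) for item in items]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage_key(*pair))
        height = linkage_key(a, b)[0]
        # Once merged, two clusters are never separated again (hierarchical).
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((sorted(a + b), height))
    return merges

merges = complete_linkage(["BK", "HK", "F", "PP", "H", "T"])
for stage, (members, height) in enumerate(merges, start=1):
    print(stage, members, height)
```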
At the end of seven stages of clustering, six clusters have been formed. They are:
Cluster 1: Boxing
Cluster 2: Skiing
Cluster 3: Swimming, Ping Pong, Handball, Tennis, Track and Field
Cluster 4: Golf, Bowling
Cluster 5: Basketball, Hockey, Football
Cluster 6: Baseball
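FIGURE H.4 A Tree Diagram Showing Clustering of the 13 Sports
[Tree diagram. Sports listed in this order: Boxing, Skiing, Swimming, Ping Pong, Handball, Tennis, Track & Field, Golf, Bowling, Basketball, Hockey, Football, Baseball. The merges are labeled with their stage numbers, and the axis is labeled Distance.]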
In Figure H.5 we present a two-dimensional graph in which we place ovals around these six clusters. This graph is the result of a procedure called multidimensional scaling. To understand this procedure, note that, since each sport is represented by six ratings, each sport exists geometrically as a point in six-dimensional space. Multidimensional scaling uses the relative average distances between the sports in the six-dimensional space (that is, the relationships between the average distances in Table H.6) and attempts to find points in a lower-dimensional space that have approximately the same relative average distances between them. In this example we illustrate mapping the six-dimensional space into a two-dimensional space, because a two-dimensional space allows us to most easily interpret the results of multidimensional scaling, that is, to study the location of the sports relative to each other and thereby determine the overall factors or dimensions that appear to separate the sports. Figure H.5 gives the output of multidimensional scaling that is given by a standard statistical software system (we will not discuss the numerical procedure used to actually carry out the multidimensional scaling). By comparing the sports near the top of Axis II with the sports near the bottom, and by using the average ratings in Table H.5, we see that Axis II probably represents the factor "team versus individual." By comparing the sports on the left of Axis I with the sports on the right, and by using the average ratings in Table H.5, we see that Axis I probably represents the factor "degree of action," which combines contact/noncontact aspects with fast moving/slow moving aspects. Also, note that the two clusters that have been formed at the end of 11 stages of clustering in Figure H.4 support the existence of the "team versus individual" factor.
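The general idea of recovering a low-dimensional configuration from distances alone can be illustrated with classical (metric) multidimensional scaling. This is a swapped-in, simpler technique: the study behind Figure H.5 used nonmetric multidimensional scaling in a statistical package, whose numerical procedure is not described here. The sketch below uses only the standard library, finds the two leading eigenvectors by power iteration, and is run on a hypothetical distance matrix (the corners of a 3-by-4 rectangle), for which an exact two-dimensional configuration exists.

```python
import math

def double_center(dist):
    """B = -1/2 * (squared distances with row, column, and grand means removed)."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]  # equals column means (symmetric matrix)
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]

def top_eigenpair(matrix, iterations=500):
    """Power iteration for the largest eigenvalue and its unit eigenvector."""
    n = len(matrix)
    v = [float(2 ** i) for i in range(n)]  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigenvalue, v

def classical_mds_2d(dist):
    b = double_center(dist)
    n = len(b)
    coords = [[0.0, 0.0] for _ in range(n)]
    for axis in range(2):
        lam, v = top_eigenpair(b)
        for i in range(n):
            coords[i][axis] = math.sqrt(max(lam, 0.0)) * v[i]
        # Deflate: remove the component just found before finding the next axis.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords

# Hypothetical distances: corners of a rectangle with sides 3 and 4 (diagonal 5).
distances = [
    [0.0, 3.0, 5.0, 4.0],
    [3.0, 0.0, 4.0, 5.0],
    [5.0, 4.0, 0.0, 3.0],
    [4.0, 5.0, 3.0, 0.0],
]

points = classical_mds_2d(distances)
recovered = [[math.dist(p, q) for q in points] for p in points]
for row in recovered:
    print([round(d, 4) for d in row])
```

As with Figure H.5, the recovered axes have no built-in meaning; the configuration is determined only up to rotation and reflection, and the axes must be interpreted by looking at which objects fall where.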
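FIGURE H.5 Multidimensional Scaling of the 13 Sports
[Two-dimensional plot with horizontal Axis I and vertical Axis II. Plotted points: F, HK, BK, BB, TF, H, T, BX, PP, SW, BW, G, SK, with ovals drawn around the six clusters.]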
TABLE H.7 Average Ratings of the Food Types on Three Scales

Food               Spicy/Bland   Heavy/Light   High/Low Calories
Japanese (JPN)         2.8           3.2              3.4
Cantonese (CNT)        2.6           5.3              5.4
Szechuan (SCH)         6.6           3.6              3.0
French (FR)            3.5           4.5              5.1
Mexican (MEX)          6.4           4.3              4.3
Mandarin (MAN)         3.4           4.1              4.2
American (AMR)         2.3           5.8              5.7
Spanish (SPN)          4.7           5.4              4.9
Italian (ITL)          4.6           6.0              6.2
Greek (GRK)            5.3           4.7              6.0
Although the perceptions of sports in 1977 relate to sports in general (and not just to professional sports), and although these perceptions do not directly relate to the popularity of sports, note that high-action, team-oriented sports (football and basketball) tended to be popular in 2000. Considering the Axis II factor ("team versus individual"), it might be that high baseball player salaries, free agency, frequent player moves, and the inability of small market teams to compete made baseball seem less team oriented to fans in 2000. Perhaps more revenue sharing between small and large market teams would improve the situation. Considering the Axis I factor ("degree of action"), it might be that "power tennis" (partially due to new tennis racquet technologies) and the resulting shorter rallies made tennis seem less action oriented to fans in 2000. Perhaps limiting the power of tennis racquets and thus allowing smaller, exciting players (like Jimmy Connors and John McEnroe of the 1970s and 1980s) to be major competitors might help increase the degree of action in tennis.
a In Intermediate Statistical Methods and Applications, A Computer Package Approach (Prentice Hall,
1983), Mark L. Berenson, David M. Levine, and Mathew Goldstein consider a marketing research
study concerning the similarities and differences between the ten types of food shown in Table H.7.
Each type of food was given an integer rating of 1 to 7 on three scales: bland (1) versus spicy (7);
light (1) versus heavy (7); and low calories (1) versus high calories (7). Table H.7 gives the average
value for each of the food types on the three scales. Figures H.6 and H.7 present the results of a
cluster analysis and multidimensional scaling of the 10 food types.
(1) Discuss why the two axes in Figure H.7 may be interpreted as oriental versus western and
spicy versus bland.
(2) Using Table H.7 and Figures H.6 and H.7, discuss the similarities and differences between the
food types.
(3) Suppose that you are in charge of choosing restaurants to be included in a new riverfront
development that initially will include a limited number of restaurants. How might Table H.7
and Figures H.6 and H.7 help you to make your choice?
FIGURE H.6 A Cluster Analysis of the 10 Food Types
[Tree diagram. Food types listed in this order: Japanese, Cantonese, Mandarin, French, American, Italian, Spanish, Greek, Mexican, Szechuan.]

FIGURE H.7 Multidimensional Scaling of the 10 Food Types
[Two-dimensional plot with axes labeled Axis I and Axis II. Plotted points: CNT, JPN, SCH, MAN, ITL, FR, SPN, GRK, MEX, AMR.]
b Automakers use multidimensional scaling to measure the images of their cars. Customer surveys ask
owners of different car makes to rank their autos from 1 to 10 for such qualities as youthfulness,
luxury, and practicality. The responses are used to carry out multidimensional scaling, which
produces a perceptual map showing the images of the different cars. Figure H.8 is a perceptual map
showing car images in 1984. After viewing the map in Figure H.8, Chrysler concluded that
Plymouth, Dodge, and Chrysler needed to present a more youthful image and that Plymouth and
FIGURE H.8 Multidimensional Scaling Showing Car Images in 1984
[Perceptual Map: Brand Images. Plotted brands: Cadillac, Lincoln, Mercedes, Ford, Pontiac, Chevrolet, Datsun, Toyota, Dodge, Plymouth, Porsche, BMW, Chrysler, Buick, Oldsmobile, VW. Labels at the edges of the map: "Has a Touch of Class / A Car I'd Be Proud to Own / Distinctive Looking"; "Conservative Looking / Appeals to Older People"; "Has Spirited Performance / Appeals to Young People / Fun to Drive / Sporty Looking"; "Very Practical / Provides Good Gas Mileage / Affordable."]
Source: Chrysler Corp.
Source: Marketing Research: Methodological Foundations (page 492), by Gilbert A. Churchill, Jr., The Dryden Press, Orlando, 1995.
Source: John Koten, "Car Makers Use Image Map as Tool to Position Products," The Wall Street Journal (March 22, 1984), p. 31. Reprinted by permission of The Wall Street Journal, Dow Jones & Company, Inc., 1984. All Rights Reserved Worldwide.
Dodge needed to move up on the luxury scale. By the year 2000 Chrysler had introducted cars such
as the Dodge Neon, Plymouth Breeze, Dodge Intrepid, Chrysler Concorde, and Chrysler 300 M.
These cars are more youthful and/or luxurious and tremendously increased Chrysler sales. What does
the perceptual map say about the Buick and Oldsmobile divisions of General Motors? By 2000
General Motors had made Buick the family car division and had introduced new Oldsmobiles that
were more youthful and performance oriented. Do you think a perceptual map in 2002 would show
the same relationships between the Buick and Oldsmobile divisions?
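The step in exercise b from survey ratings to a perceptual map can be sketched in Python using classical (Torgerson) multidimensional scaling. This is one of several multidimensional scaling methods and is not necessarily the one behind Figure H.8. The brand labels P through T and their ratings on youthfulness, luxury, and practicality are hypothetical, and the function name classical_mds and its parameters are our own. The sketch uses only the standard library, finding the leading eigenvectors by power iteration rather than calling a linear algebra package.

```python
from math import dist, sqrt

# HYPOTHETICAL average ratings (youthfulness, luxury, practicality), 1 to 10.
# Placeholders for illustration only; not taken from any survey.
brands = {
    "P": (2.0, 3.0, 8.0),
    "Q": (3.0, 9.0, 4.0),
    "R": (9.0, 8.0, 3.0),
    "S": (8.0, 2.0, 7.0),
    "T": (5.0, 5.0, 5.0),
}


def classical_mds(points, dims=2, iterations=500):
    """Return {label: [coordinates]} from classical (Torgerson) MDS."""
    labels = list(points)
    n = len(labels)
    # Squared Euclidean distances between the rating profiles.
    d2 = [[dist(points[p], points[q]) ** 2 for q in labels] for p in labels]
    # Double centering turns squared distances into inner products.
    row_mean = [sum(r) / n for r in d2]
    grand_mean = sum(row_mean) / n
    gram = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]
    coords = {label: [] for label in labels}
    for _axis in range(dims):
        # Power iteration: repeated multiplication converges to the
        # eigenvector with the largest remaining eigenvalue.
        v = [float(i + 1) for i in range(n)]
        for _step in range(iterations):
            w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        gv = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * gv[i] for i in range(n))
        # Map coordinate = eigenvector entry times sqrt(eigenvalue).
        scale = sqrt(max(eigenvalue, 0.0))
        for i, label in enumerate(labels):
            coords[label].append(v[i] * scale)
        # Deflate so the next pass finds the next eigenvector.
        gram = [[gram[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
                for i in range(n)]
    return coords


car_map = classical_mds(brands)
for label, (x, y) in car_map.items():
    print(label, round(x, 2), round(y, 2))
```

Brands that end up close together on the resulting two-dimensional map have similar rating profiles. The orientation and sign of the axes are arbitrary, so interpreting the axes (as in exercises a and b) still requires looking back at the original ratings.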
