Preface

This solution manual was prepared as an aid for instrctors who wil benefit by

having solutions available. In addition to providing detailed answers to most of the
problems in the book, this manual can help the instrctor determne which of the
problems are most appropriate for the class.
The vast majority of the problems have been solved with the help of available
the problems have been solved with
software (SAS, S~Plus, Minitab). A few of
computer
hand calculators. The reader should keep in mind that round-off errors can occurparcularly in those problems involving long chains of arthmetic calculations.

We would like to take this opportnity to acknowledge the contrbution of many
students, whose homework formd the basis for many of the solutions. In paricular, we
would like to thank Jorge Achcar, Sebastiao Amorim, W. K. Cheang, S. S. Cho, S. G.
Chow, Charles Fleming, Stu Janis, Richard Jones, Tim Kramer, Dennis Murphy, Rich
Raubertas, David Steinberg, T. J. Tien, Steve Verril, Paul Whitney and Mike Wincek.
Dianne Hall compiled most of the material needed to make this current solutions manual
consistent with the sixth edition of the book.
The solutions are numbered in the same manner as the exercises in the book.
Thus, for example, 9.6 refers to the 6th exercise of chapter 9.
We hope this manual is a useful aid for adopters of our Applied Multivariate
Statistical Analysis, 6th edition, text. The authors have taken a litte more active role in
the preparation of the current solutions manual. However, it is inevitable that an error or
two has slipped through so please bring remaining errors to our attention. Also,
comments and suggestions are always welcome.
Richard A. Johnson
Dean W. Wichern

Chapter 1
1.1

Xl =" 4.29

X2 = 15.29

51i = 4.20

522 = 3.56

S12 = 3.70

1.2 a)
Scatter Plot and Marginal Dot Plots

.
.

.

.

.

.

.

.

.
.
.

.

17.5

.

15.0

I'

.

.

.

.

.

12~5

.

.

.

)C

10.0

.

7.5

.
.

.
.
.

.
.

5.0
0

4

2

6

10

8

12

xl

b) SlZ is negative

c)
Xi =5.20 x2 = 12.48 sii = 3.09 S22 = 5.27

SI2 = -15.94 'i2 = -.98
Large Xl occurs with small Xz and vice versa.
d)
x = 12.48

(5.20 )

Sn --

-15.94)

-15.94
( 3.09

5.27

R =(
1 -.98)
-.98 1

.

40~ UJ R = .577c ) L (synetric) 2 .. hard to be definitive about nature of marginal distributions. 50 .'. .70 sii = 82. . The marginal distribution of Xi seems reasonably symmetrc.. ... However.(1 (synet:..2 1. .. ..3 SnJ6 : -~::J x = - -... .69 Large profits (X2) tend to be associated with large sales (Xi)... . . Since sample size is small.. . 150 xl 200 250 300 b) Xi = 155. ... ..60 x2 = 14. small profits with small sales. . .. ._. 1.. . .4 a) There isa positive correlation between Xl and Xi. . ... 3~OJ . .. .. . .'.26 'i2 = . 10 .PJot andMarginaldøøt:~llôt!.. .85 SI2 = 273. .'. 100 .03 S22 = 4. 25 20 . . . I' )C 15 . ... . marginal distribution of Xi appears to be skewed to the right.": SCëtter.

5 a) There is negative correlation between X2 and X3 and negative correlation between Xl and X3. 0 50 100 150 xl 200 250 . . Sêåttêr'Plotäl'(i'Marginal. . . . . .. . . .3 1. . . 400 . . 1200 M . '-' . . . . 300 . . . 1200 M . . 400 . x2 . . . . .DotPiØ_:i. . . 800 . .. . The marginal distribution of X2 seems reasonably symmetric. . . . . . . . . . . . .. .. . . . .'llÎi:lîtfjtì. The marginal distribution of Xi appears to be skewed to the right.I.alÎ. . 0 25 20 15 10 . .800 . The marginal distribution of X3 also appears to be skewed to the right. .Scatiêr. . 1600 )C . .~sxli. . 1600 )C . .Plötànd:Marginal.

17. n* ***** ***** ****** **** S a. 6.42 1 a) Hi stograms Xs Xi NUMBER OF HIDDLE OF HIDIILE OF INTERVAL co OBSERVATioNS ***** 5 INTERVAL 5. 18. 2 1 s 9. J.85) -.85 1. 14. . 6.90 -. 20. ** *** u** Uu. 16. 13.948. 11. 90. 1* OBSERVA T I'ONS 1.. s.OF 19 ******************* 9 ********* .26 (.42 .91 (155.36 82..60J 1 -. 4. 40. 110.45 461. u***** 7 10. 10. * 1 n * X3 2. 6. 14.36) -948.69 R = ( ~69 4. 6. ******** S ****** X2 NUHBER OF OBSERVATIONS 1 * HIDDLE OF INTERVAL 30.70 710. s. NUMBER OF OBSERVATIONS 7 ******* 9 ********* 1* 25 ************************* . 8. *********** 11 5 6 U* J 7.26 Sn = 273.3 U* HIDDLE OF * 1 Xl :s ***** 4.5 b) 273. 70. 12. :3 . NUMBER OF. . 15. NUMB£R. 4. 12. J. 22.S LS.. OF 08SERVATIONS 2 ** o o X7 HIDDLE OF INTERVAL 2. 16. 24. 7. 50. oJ .4 1.85 -32018.32018. **** 1 0 INTE"RVAL HIDIILE OF INTERVAL 4 0 19. J 2 3 10 12 a 100. 5. 60.45 -.* . 80. 21. un* 6 4 4 ********** ************ ******** 0 2. 4.03 x = 14. * 0 a. X4 NUltEiER OF OI4SERVATIOllS 7 ******* B ******** 10.6 . ******* 5 n*** 2 2 ** u 1* 2 1 ** * 20. 7. NUHBER OF S u*** INTERVAL * 1 I' I DOLE . 9. OBSERVA T IONS 1 J ***$********* 15 *************** a ******** 5 1 ui** * J **** *** 4 7. 26.

602 2.816 1.440 7.658 6. 093 3.058 2..5 1.. 714 4.138 .p1'Ot (vari ab 1 e space) ~ ~tem space. Xl 2 2 4 Scatter. 154 1 . 2 . 571 -2. 4 4 .052 1 .467 (syrtric) The pair x3' x4 exhibits a small to moderate positive correlation and so does the pair x3' xs' Most of the entries are small.452 .1 79 .172 11..5 b) 293. 191 -.6 2. 062 -.405 3.360 73 .369 3..260 1 .019 30.7 ill b) 3 x2 .580 n 1 0 .857 - x -2 .67 -1 . .486 S = -.7-91 .548 = 2. 1. 048 9.) 1 .095 .755 :609 .241 .354 30.

48 s 12 = 9.1.236 Using (1-20) d(P.-1. = /5 = 2.5).-3).Q)' /~H-1 )'+2(l)(-1-1 )(-1-0) '2t(-~0).5) The resulting ellipse is: X1 1. (3. we first obtain the coordinates of some points sati sfyi ng the equation: (-1. (2. 5 . 0 . (0. .5).0) is given by the expression t(xi-n2+ ~ (x1-1 )x2 + 2t x~ = 1.09 s 22 = 6.-2. To sketch the locus of points defined by this equation.9 a) sl1 = 20. .-1.6). 5 xi 10 .3).1.2. 19 X2 . (1.38S Using (1-20) the locus of points a c~nstant squared distance 1 from Q = (1. (0. -"5 .' =j~~ = 1.-6 1.8 Using (1-12) d(P.Q) = 1(-1-1 )2+(_1_0)2. (1. . .5). (2.6).

Therefore this is a distance for correlated variables if it is non-negative easily if we write for all values of xl' xz' But this follows 2. xl + 4xZ + x1x2 = (xl + r'2) + T x2 .2 = YZ. 4(x.?o.)(x2-y2) + (x2-YZ): = =.Q) ~O. .(x1-Yfx2+Y2):1 + 3(Xi-Yi):1.?0 so d(P.x2) = (0.-Yi)4 + Z(-l )(x1-Yl )(x2-YZ) + (x2-Y2):¿' = 14(Y1-xi):¿ + 2(-i)(yi-x.Q) = 14(X. we conclude that this is not a validdistan~e function.P) Next. and aZ2 = 4. The s€cond term is zero in this last ex.pr.)(yZ-x2) + (xz-Yz):¿' = d(Q.-yi)2.11 d(P. . b) In order for this expression to be a distance it has to be non-negative for 2. 1. 2. a12 = ~.1) we have xl-2xZ = -Z.2(xi-y.7 1.10 a) This equation is of the fonn (1-19) with aii = 1. for (xl . :¿ all values xl' xz' Since. 1 1 15 2.essi'on only if xl = Y1 and then the first is zero only if x.

. -1 c) The generalization to p-dimensions is given by d(Q.P) = max(lx. 1 -1 7 x.I.lxpl)' 1.lx21..8 1. ...1 1 . is X2 .O) is .13 Place the faci'ity at C-3..141) = 4 b) The locus of points whosesquar~d distance from (n.P) =max (1-31.12 a) If P = (-3.4) then d(Q.

' 230. 20:5.35 865. . . )(2 130..31 236.+ )(4 . s group.60 82.62 13. . 2:5:5. . Strong positive correlation.10 61 . b) Mul tipl e-scl eros. . . 280. 320.+ +______+_____+-------------+------~.32 3 as .53 = 1146. 94 221 '.+ . . . 1:5:5. .* I: I: 160..78 -20. . .38 (synetric) 337. 180. No obvious "unusual" observations. 93 90. 240... .91 Sn 61 ....14 a) 360.+ .48 286. . 1 3 -27 .+ 200. 42 . .+ .64 x = 12. 07 179.16 116.80 .72 -218.. . 65 812.9 1.

244 1 . H)6 .78 273.239 . 04 .28 2.548 1 R = (symmetric) .22 .56 1 95.132 . 37 . 67 3.200 1 1 R -.2u 1 03 .84 1.134 1 .892 .15 2.61 95.10 . (syietric) 1 .167 -.08 11 0.727 .114 1 .57 1. 99 147.13 sn = 1 01 .139 .62 5.438 .375 .133 = 1 ( synetrit: ) 1 Non multiple-sclerosis group.454 .28 1.173 1 .35 2.21 i = 1 .32 183 .127 .49 2.123 .896 .

. .58 1. ~ I I +. 1 t I i . . . .~~ . .. . 3. 1 . . + . . .21 2. . + .75 G. . . . I .. 1 .75 . . . .. 1 ... .. 2. . t.0 . . .'_ 1 X:i . . . . . . 1 . . 1 1 III 2.. .. . I . . + t 1 . . I 1 . l 1 . . + . 1. .. . 1 . .. + .P. 1 . . . . . J - . . .25 ". ~ e . 3.81 x = 2. 1 I ..llfl . ~ . i J . t 1 1 . . + . t I : . .7'5 3. 1 1 I -- ~ . . 3.. . . . . . . . . . I 1 -- . . . . cl .. 1 I . .2 . "'. 1 - E E .80 . 2 ... .- 2. . 1. .88 t. . . . .15 a) Scatterplot of x2 and x3. .1 3 . .25 Z "~A ACTIVITY X% b) 3.. . . . . . . . + .25 1. .. ...1) 3.. '.l - . . I 1 . . . it . 1 . 1I.25 . .. .11 1. .2 I t 1 . .. .. l . . ..54 1. . .. . . ... . . . J 1I 2 1 . . .z.. . .27 . .14 ~.. . 1 I 1 . 1. .". 1 t .S11 .--. +. 75f) .o . . . . I -. . 1 1 . .. .

535 .92 .362 .12 . (synetric) 1 .57 .704 -.61 1. 01 0 .02 .496 .455 .15 -.. .85 .11 . appetite and activity have a moderate positive correl a tion.12 4.34 .071 1 The largest correlation is between appetite and amount of food eaten.27 . 077 ' .551 1 . A1 so.85 .156 .15 Sn .39 .61 .11 .346 1 (syretric) .386 .11 .035 1 -. Both activity and appetite have moderate positive correlations with symptoms.02 -.09 1.537 .01 ..58 .21 = .187 1 R = .~6 .

8438 0.500 .0192822 0.67789 0.801 . and the 1500m and 3000m runs.02145tiO Sn - 0.74909 0.66826 0.0161219 0.669 .69146 0.500 1.152 .0161635 0.732 .0087559 0.674 .197 1.400 .854 . .060 .212 .368 1.065 .720.0202555 0.875 .021 .000 .0085522 0.00000 -0. .0177938 0.941.74369 0.0076395 0.0177938 0.66826 0.732 .806 .0161219 0.74218 0.621 .193 28.62555 1.00000 0.61192 0.55222 0.544 10. 0.0202555 0.000 .16 There are signficant positive correlations among al variable.867 .eation is 0.847 2.728 .809 .4420 0.80980 0.00000 0. R = 0. The lowest correlation is .55222 0.680 .820 .867 . for example.905 1. 7348 .08 153.72889 0.00000 0.0081886 0.082 .0087559 0.61882 0.082 .193 51.72889 1.0109612 0.027 .61192 1.74369 0.060 .674 .909 1.254 23.875 2.44020 0.02 4.178 .0076395 0.0077483 0.0081886 0.00000 0.7044 0.0077483 0.0771429 0.00000 x- 0.0667051 0. the 1 OOm and 200m dashes.801 .0192822 0.871 .677 .0099633 0.99 .4420 between Dominant humeru and Ulna.85181 1.806 1. 11.197 3.799 1.0085522 0.474 10.544 1.854 .199 .338 .000 .677 R = .12 .0111057 0.368 1.0124815 0.0214560 0. Paricularly large correlations occur between running events that are "similar".791 .85181 0.265 x = 2.021 .74909 0.680 .89365 1.152 6.152 .0099633 0. 7927 1.74218 0.720 .799 .871 .230 .0161635 0.338 .0170261 0.254 10.652 28. and the highest corr.0170261 0.508 .17 There are large positive correlations among all variables.000 .973 1.905 .230 4.0123332 0.782 .178 .8183 1.007 .000 3.212 .073 .065 .67789 0.13 1.973 .0123332 0.6938 0. 0.80980 1.0641052 0.027 .820 .474 10.809 .199 .941 1.000 .000 .400 .508 265.728 .791 .0641052 0.782 .669 .19 9.89365 bewteen Dominant hemero and Hemeru.89365 0.0101752 1.909 .62 So= 4.36 .ti2555 0.61882 0.69146 0.

675 .824 .147 .741 .816 .097 .105 .672 .092 .000 .102 .096 .66 .096 .074 .095 .082 .694 .177 .806 .875 .102 x = 6.115 .114 .100 .729 .972 1.093 .100 .000 .804 .81 .906 .081 8.938 .875 .854 1.118 .144 .776 .105 .093 7.938 1.797 .108 .082 .60 Sn = .096 .138 .729 .096 .71 .081 .000 .54 4.731 .852 .99 .086 .854 .62 .124 .694 .741 .816 .086 . 8.675 .075 .672 R = .097 .660 .797 .824 .081 .147 .14 1.081 . The correlation matrix for running events measured in meters per second is very similar to the correlation matrix for the running event times given in Exercise 1.167 1.804 1.065 .000 .776 .091 .866 .065 .972 .000 .731 .906 1.806 .906 1.906 . Notice the correlations decrease as the distances between pairs of running events increase (see the first column of the correlation matrx R).660 .852 .094 .118 5.144 .17.18 There are positive correlations among all variables.095 .108 .094 5.000 .866 .092 .000 .114 .075 .

00 '. Q I c: i- C .. .. z: .I: II ~ " o' " ... . c: .'" - -. ". . : o.......II 0' i:: .. : ... 00 " ...' ....: ... o' : '.. . -z .. ..... . c .. c: i- ...I: i = c: ... . ... -0 c: C CD c.¡c:: ...- UI ..... :: ..I: ~ ..... : ...' " " .: o c: en . .' " CI -. ..C co . . t- ~ CI - . z: .0 - - . :...QI .. .. ... : . ..¡c:. - .. -I: . c: ...19 (a) o _R A 0 IUS RADIUS LHUI..~ -.: C . .. I o' '0 . in ..15 1.- ... .... 0..... = C ~ . .. en " o' .C co co . .. .. ..o'' 0. z- C .. . . . '" o . c: . o en .~.. - ....ERUS tlUME~US ILULNA ULNA c- ..I: .0 - QI CI ".. . ..' . ... ". i :: .: . .. '" .... 00 00. QI t- " ... C ..

~..~.. .... . . . . . .8.0' .-.1fiii: '. .. .. . . 'L ~. -it . .... . .. .... ~ ..~. "-:f' I! .!.~~ . f .i -.. ..19 (b) ~. ' tl.. . ... . ... .. ... I..1. :i~ . . .- :.~: .. i:_. . t . . ... . ... . . -i-. "..1. _. . . \. . . .t:. :. .. A.. l.~. . l. . ..... \.to l . " ... . . · . . .. . . . " ". . . ... -: .. ll. .c .. . . 'l .: ...'\: . -Ii... \. .... . . .... . .t':". :.. :..~ . .. .~ "~ .16 1. · ~c. -l .. . .. i l 0 \. ~.. . ~ \.. t- ..-. ~..~..:..' ~... .I.. .' !t ~ 1" ~. ~l t... . ..~ ~. .' .. . P.¡ . .l .' \ . .." . . . . .. .. \. . . . . ..lý .~.. ~: ..... ~ .. .. .o... . ... .( . .. . . . . ....1... . .. . .:~ t. . .... . . . .. " . ~. . ...... . .

. but bent.l l . in the lower left hand part could be outliers.17 1.tion.."" i. \.. ... (b) . From the highlighted plot in (b) (actually non-bankrupt group not highlighted). . ... . which is apparently located in the bankrupt group. . '" ~'~t .. x1 . '. . .' A L_-l_ X .. . . .". . there is one outlier in the nonbankruptgroup. . . . . ~'T · '\ . \ l'ø.20 Xl (a) .. . .ö . . \ . .. .. .. . (a1 The plot looks like a cigar shape. \ X3 X1 .... .. . .. ~ x3 . . Some observations. L _ _ (.." r . . . besides the strung out pattern to the right. . . ~ .. (ll) The dotted line in the plot would be an orientation for the classificà. . ..

. .. . . . there is an orientation to classify into two groups.. ... .. . G~~ . . . .. The observation in the upper right is the outlier.. .. . . . . ..18 1. .e Outlier Q (a) There are two outliers in the upper right and lower right corners of the plot. .. . . ... .-. .. tfe' Ó... .. .~. ...l. ... . . As indiCated in the plot... . X1 .. ... . X3 ... . . . (b) Only the points in the gasoline group are highlighted.. .. ... ... .. .21 o (a) o (b) Outlier Outlier ~~ô 0'" .. . .. .. . . X1 .

..I . Xz . .. x./ . . . . Outliers ~ . . . .../. . .. .. .19 1. .../ .. . ~./ ... G Outlier ~ø ~~ ~ø. . . . . . . .. . .. . .. ... / \. ./ ~e~~\.et ot)~ . X1 Xz/ · ..~ . . .e . . . . . .. ~~e ... ./ . . fi .. )l" ... . .../ / .. .s. ... . .. .. . . ... .22 possible outliers are indicated. · ). . . . . .. . . fi .." . . Xz . t I ~~\e X1 . x.. il . .

. ci .u VI C s. -u:: cc: s. ~ciVI en :: . iu ci ~VI i-U:: iVI -~ . c( . ci VI en :: a.... V) oi = V) s. 20 . s.. c: fa s. ~ci :: iu VI II ai ci ~VI:: .G M N. 0 IØ U s. ci to :: iCo .Q -0 VI CI . VI CI VI ci u . s..- -u Cd ci Q.c. ~ i-aci .u :: "' e a..c u N oi c: -= s.. ..VI Q.c :: en i- Cd ci VI . ..Q c: ci ~ s.. --i. z: Cd CJ I) . '" +J VI ra ci . II C' Cd . U .c:s. ci V) Co = V) i..~a: i-n: +J l-0 .:: oi :: V) s. ci . ~VIa.. . ..a:u 4. s.I -a: ra :: VI s...I¡ Cd a c: s.. c: . ~ C" ~ :: iU ~ci ~ ~s.

24 20 10 13 4 C1 uster 2 3 9 14 19 18 C1 uster 3 22 .s 1 .21 Cl uster 1 1.

the same manner as those in Example 1.22 Clust~r 4 16 8 11 Cl uster 5 21 5 Cluster 6 17 12 2 C1 uster 7 We have cluster~d these faces in. plausible. for instance.ched from 7. however. other groupings are~qually . Note. utilities 9 and 18 l1ight be swit. .12. '5 Cluster 2 toC1 uster 3 and so forth.

. ..: l.. / ." .. .~.. '/ . . -..'-1 . 10 4 -.... -...emai ni ng stars cl usters.25 We illustrate one cluster of "stars..." I: .- (not ....- .. '. ...l.i ... f ~ 20 ".": ..'.1 ¡..0: . ...¡.. .... -. The shown) can be gr~uped in 3 or 4 additional r.~."-.......l ". 13 '-a..23 1.

487 0. d O.35 16628... ..368 0.. N '" .37 1. .208 0.37 1..282 0.000 O. .282 1.317 0.23 3.423 0.624 1.56 2. 860 0 .409 0. 35 90 1100 t30 5.605 -0. . . .85 -0. CD CD ..96 2. ~ .482 -0. ..09 98. .QOO ~0.44 81. .368 0. 3816 1742. . .116 0. 56 2.23 -0.55 SaleHt 3.~66 1555.801 0.260 0. .4342 50.74 2. Breed . .05 46. -:- .02 Breed YrHgt FtFrBody PrctFFB Frame BkFat 2.113 0. ..317 o .691 1.. .23 3. . .277 -0.28 -226. .691 0. cci .699 0. 75 10. .38 145. ~ .38 116. . .74 2.. ..44 -0.940 0. .82 43.81 2..344 0 . . . \. .487 0..390 0.49 -0.. a.523 0. .116 -0.615 0. .9474 70. .4 0.82 43.479 0.~15 0.368 -0.01 480 .2 0..423 1 . 525 0 .488 -0. 1263 0.78 15. . . . 27 -1.. .5224 995..0 ..14 272..000 -0. ..73 -429 .47 'SaleWt 46.27 1. .102 -0. .525 0. . .605 0. .I .17 4.: . . . .47 -0. .344 -0.198 0.3 0.32 25308. .94 128. . .801 -0.64 450.434 0..224 1. . . .05 -0.0 6.260 1.94 -0. .02 15.94 128. 368 0.277 0. '" d '" d .46 272. .472 -0.~'.: Frame :i .44 1. l. CD CD on SaleHt l: -:. . .97 145 . 624 0 . . . . .000 Sn SalePr -429.26 Bull data R (a) XBAR Breed SalePr YrHgt FtFrBody PrctFFB Frame BkFat SaleHt SaleWt 4. .434 -0.113 0.72 6592. BkFat . ~ .. . Breed .521 0. . 47 2.24 1..3158 0. .41 82.79 98.28 1.000 0.5 .. .0 8.00 0. ..000 -0.24 -0. . . . .92 206. 44 81..-. 72 6592. -.5££ 0.81 8481. . ..i'.02 383026.860 0.55 4. . .488 0.02 0. . .521 0.14 -0.05 3.. .. . .1. . ..0 7. .32 25308 . .409 0.2895 0.. .46 2.168 0. .41 82. 2 4 6 8 §! .000 0. '.472 0. . 38 -0. 523 0 . .000 0.49 51. . .l . .699 0.47 5813.38 9. .224 0.198 0.1967 1. . . t 0. . .. . . .208 £4.8816 6.-'. . 940 -0. 26 206 .555 0.79 116. 8 --¡.479 0. ~ .23 0.566 1. .09 -226..73 1. .05 5813.. .92 1. :.78 1.00 3.168 -0. FtFrB I . . . . ~ g 2 4 6 8 50 52 54 1i 58 60 .24 450. o CD . .390 0. . 17 -1. . I. 75 51.00 480..482 1.102 0. . . .

.391. . 500 . 1500 iI üi . This single point has reasonably large effect on correlation reducing the positive correlation by more than half when added to the national park data set. 0 0 1 .173 Scatterplot of Size Y5 viSitors 2500 . Correlation with this park removed is r = . . Gæct '5lio\£~ "' . .. 2000 . . (c) The correlation coefficient is a dimensionless measure of association. . . The correlation in (b) would not change if size were measured in square miles instead of acres.1000 . . .27 (a) Correlation r = . 2 3 4 5 6 7 8 9 Visitors (b) Great Smoky is unusual park.25 1.