Discriminant Analysis
This sounds like a hypothesis that could be tested with MANOVA, and it is, but it can also be tested with discriminant analysis. First, let's look at what MANOVA will tell us about this hypothesis.
Effect      Test                  F            Hypothesis df   Partial Eta Squared   Noncent. Parameter   Observed Power(a)
Intercept   Pillai's Trace        467.961(b)   4.000           .980                  1871.844             1.000
            Wilks' Lambda         467.961(b)   4.000           .980                  1871.844             1.000
            Hotelling's Trace     467.961(b)   4.000           .980                  1871.844             1.000
            Roy's Largest Root    467.961(b)   4.000           .980                  1871.844             1.000
WCONCENT    Pillai's Trace        7.857        8.000           .440                  62.852               1.000
            Wilks' Lambda         11.793(b)    8.000           .547                  94.344               1.000
            Hotelling's Trace     16.473       8.000           .634                  131.787              1.000
            Roy's Largest Root    33.443(c)    4.000           .770                  133.772              1.000
a. Computed using alpha = .05
b. Exact statistic
c. The statistic is an upper bound on F that yields a lower bound on the significance level.
d. Design: Intercept+WCONCENT
Here we see that the hypothesis is confirmed: country's wealth concentration has a significant main effect on the set of four indicators.
As you can note from the output, the univariate F tests for each of the four variables are all significant at p < .001. But what this output doesn't tell us is what sort of combination of these four variables the countries differ on, or whether there is more than one combination on which they are significantly different.
Number of Functions Extracted in DA
The discriminant analysis procedure extracts at most m (the number of discriminating variables) or k - 1 underlying dimensions, or canonical discriminant functions, whichever is smaller, where k is the number of groups or categories of the nominal-level variable. For example, we have three categories of country's wealth concentration, so two of these functions are extracted. Think of the total amount of variation in country's wealth concentration that you could predict with one or more different combinations of the four variables (gini index, civil liberties score, etc.) as 100%. The first new canonical variable (a weighted combination of the four) accounts for 96.4% of it, and the second canonical variable for the remaining 3.6%. Combining these two improves the prediction.

Wilks' Lambda
Test of Function(s)   Wilks' Lambda   Chi-square   df
1 through 2           .205            64.215       8
2                     .890            4.726        3

Here's Wilks' lambda again. Combining both discriminant functions allows you to predict all but .205 of the variation in level of wealth concentration.
Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          3.344(a)     96.4            96.4           .877
2          .124(a)      3.6             100.0          .332
Of the variance explained in wealth concentration, 96.4% was explained by the first function and 3.6% by the second one. Some variance of course remains unexplained.
Two other values that you see in the output are the eigenvalue and the canonical correlation. The eigenvalue can be interpreted as the variance of its respective discriminant function: the larger it is, the more of the group separation that function carries. The canonical correlation is the correlation between the new canonical variable, formed by applying the weights from the discriminant function to the four predictors, and the levels of wealth concentration.
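The numbers in the eigenvalue table are tied together by a few standard identities, which the following minimal sketch checks using only the two rounded eigenvalues from the output. The variable and function names are illustrative. Each function's percent of variance is its share of the eigenvalue total, the canonical correlation is sqrt(eigenvalue / (1 + eigenvalue)), and Wilks' lambda is the product of 1 / (1 + eigenvalue) over the functions being tested.

```python
import math

# Eigenvalues as printed (rounded) in the Eigenvalues table.
eigenvalues = [3.344, 0.124]

# Share of the explained between-group variation carried by each function.
total = sum(eigenvalues)
pct_variance = [100 * ev / total for ev in eigenvalues]

# Canonical correlation of each function with group membership.
canonical_corr = [math.sqrt(ev / (1 + ev)) for ev in eigenvalues]

def wilks_lambda(evs):
    """Wilks' lambda for a set of functions: product of 1 / (1 + eigenvalue)."""
    result = 1.0
    for ev in evs:
        result *= 1 / (1 + ev)
    return result

lambda_1_through_2 = wilks_lambda(eigenvalues)   # test of functions 1 through 2
lambda_2 = wilks_lambda(eigenvalues[1:])         # test of function 2 alone

print([round(p, 1) for p in pct_variance])
print([round(r, 3) for r in canonical_corr])
print(round(lambda_1_through_2, 3), round(lambda_2, 3))
```

Run against the rounded eigenvalues, these should reproduce the 96.4% and 3.6% split, the canonical correlations of .877 and .332, and the lambdas of .205 and .890 to within rounding.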
Unstandardized coefficients
The standardized and unstandardized canonical discriminant function coefficients are like the beta weights and the b weights in multiple regression. The ones on the right, with a constant, are the unstandardized coefficients; they are like the b weights and the intercept, and you use them with raw scores to classify new cases as to country's wealth concentration. The ones on the left are the standardized coefficients, which means the variables are all measured on the same scale, so the weights can be compared to determine the relative importance of each variable in explaining group separation (differences in level of wealth concentration).
The standardized coefficients can be used to classify new cases if the four discriminating variables are expressed in standard (z) scores.
These coefficients or weights tell you how the four original variables combine to make a new one that maximally separates the countries based on their wealth concentration. You can interpret the standardized discriminant function coefficients as a measure of the relative importance of each of the original predictors. We will only interpret the first function, since it explains so much more of the variance in country's wealth concentration than the second one, and the second function was not significant. Function 1 could be labeled "inequality," since it is defined by the high positive loading of the gini index and the high negative loading of political rights. The human development score and civil liberties score are comparatively unimportant in describing the separation among the categories of country's wealth concentration.
Canonical Discriminant Function Coefficients
                                                            Function 1   Function 2
human devel score: hi=more                                  -1.240       4.207
Political rights score                                      -.366        -.303
Civil liberties score                                       .027         .535
Gini index: 0=perfect $ equality, 100=perfect inequality    .126         .069
(Constant)                                                  -2.384       -7.167
This table shows the group centroids (vectors of means) on the two new canonical variables formed by applying the discriminant function weights. Notice how well function 1 separates the low wealth concentration countries from the high wealth concentration countries. You can think of the centroid for each group or level as that group's average discriminant score on that function, where for raw scores the discriminant score on function 1 is -2.384 - 1.240 (human development score) - .366 (political rights score) + .027 (civil liberties score) + .126 (gini index). New cases would be classified into the group whose centroid is closest to their own vector of scores.
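As a rough illustration of how the unstandardized weights turn raw scores into discriminant scores, the sketch below applies the coefficients from the Canonical Discriminant Function Coefficients table to one case and then picks the nearest centroid. The country values and the centroid coordinates are hypothetical placeholders, not numbers from the output, and the function names are invented for this example.

```python
# Unstandardized coefficients from the coefficients table, in the order
# (constant, human development, political rights, civil liberties, gini index).
FUNCTION_1 = (-2.384, -1.240, -0.366, 0.027, 0.126)
FUNCTION_2 = (-7.167, 4.207, -0.303, 0.535, 0.069)

def discriminant_score(coefs, human_dev, pol_rights, civ_lib, gini):
    """Constant plus the weighted sum of the four raw scores."""
    constant, b_hd, b_pr, b_cl, b_gini = coefs
    return (constant + b_hd * human_dev + b_pr * pol_rights
            + b_cl * civ_lib + b_gini * gini)

def nearest_centroid(scores, centroids):
    """Label of the group whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((s - c) ** 2 for s, c in zip(scores, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Hypothetical raw scores for one country (not taken from the data set).
case = {"human_dev": 0.8, "pol_rights": 2, "civ_lib": 2, "gini": 35.0}
scores = (discriminant_score(FUNCTION_1, **case),
          discriminant_score(FUNCTION_2, **case))

# Placeholder centroids for illustration only; substitute the values from
# the Functions at Group Centroids table of your own output.
centroids = {"low": (-2.0, 0.0), "moderate": (0.0, 0.0), "high": (2.0, 0.0)}

print(scores, nearest_centroid(scores, centroids))
```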
Territorial Map from Discriminant Analysis
[Figure: territorial map; region labels include High and Medium]
This territorial map plots the location of cases based on their discriminant scores. Note, for example, that most of the low wealth concentration cases (the 1s) are concentrated on the negative end of function 1 (i.e., they are negative on inequality) and the high wealth concentration cases (the 3s) are on the positive end (i.e., they are positive on inequality), consistent with the location of their group means (centroids) on the function (see arrows).
Functions at Group Centroids
Quadratic Classification
[Figure: classification regions labeled High, Medium, and Low Wealth Concentration]
One way of handling the problem of unequal covariances across groups (i.e., you flunked the Box's M test) is to base the classification not on the pooled (combined) covariance matrix but on the separate group covariance matrices (this is an option in SPSS). Notice that you get a slightly different result.
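To see why separate covariance matrices can change the classification, here is a deliberately simplified one-variable sketch: with a single predictor, the covariance matrix reduces to a variance, and the rule assigns a case to the group with the highest normal log density (equal priors assumed). The two groups, their means and variances, and the function names are all hypothetical.

```python
import math

def log_normal_density(x, mean, var):
    """Log of the normal density with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def classify(x, groups, pooled_var=None):
    """Assign x to the group with the highest log density.

    groups maps label -> (mean, variance). If pooled_var is given, every
    group uses it (linear rule); otherwise each group uses its own
    variance (quadratic rule). Equal priors are assumed.
    """
    def score(label):
        mean, var = groups[label]
        return log_normal_density(x, mean, var if pooled_var is None else pooled_var)
    return max(groups, key=score)

# Hypothetical groups with very different spreads.
groups = {"A": (0.0, 1.0), "B": (3.0, 9.0)}
pooled = (1.0 + 9.0) / 2      # pooled variance, assuming equal group sizes

print(classify(-4.0, groups, pooled_var=pooled))  # pooled (linear) rule
print(classify(-4.0, groups))                     # separate (quadratic) rule
```

With the pooled variance the case at -4 goes to the group with the nearer mean (A); with separate variances it goes to B, because B's wide spread makes an extreme score more plausible there.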
Original: Count
Concentration of Wealth      Predicted Group Membership
in Hands of Few              LowWealthConcentr   ModerateWealthConcentr   HighWealthConcentr
LowWealthConcentr            17                  1                        0
ModerateWealthConcentr       1                   4                        2
HighWealthConcentr           0                   3                        17
Ungrouped cases              4                   15                       9

Original: %
Concentration of Wealth      Predicted Group Membership
in Hands of Few              LowWealthConcentr   ModerateWealthConcentr   HighWealthConcentr
LowWealthConcentr            94.4                5.6                      .0
ModerateWealthConcentr       14.3                57.1                     28.6
HighWealthConcentr           .0                  15.0                     85.0
Ungrouped cases              14.3                53.6                     32.1
Recall that the new canonical variables created by applying the discriminant function weights to the four original variables could be used to classify cases. It's best to have a holdout sample to test how well the new canonical variables classify cases that weren't part of the development or training sample, but we can go back and reclassify the existing cases to see how well the new canonical variables classify cases back into the groups they belong to. According to the table above, when the discriminant functions were used to predict a country's level of wealth concentration from the four variables, 84.4% of the original grouped cases were correctly reclassified back into their original categories (p(2), the hit rate). Note that the largest proportion of errors was in reclassifying the middle category (moderate wealth concentration), while the classification was nearly perfect for the low wealth concentration countries (only one error).
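The hit rate can be recomputed directly from the counts for the grouped cases in the classification table. In this short sketch (variable names are illustrative) the diagonal cells are the correct reclassifications.

```python
# Rows are actual groups, columns are predicted groups, both in the order
# low, moderate, high (counts for grouped cases from the classification table).
order = ["low", "moderate", "high"]
confusion = {
    "low":      [17, 1, 0],
    "moderate": [1, 4, 2],
    "high":     [0, 3, 17],
}

# Diagonal cells are the cases put back into their own group.
correct = sum(confusion[group][i] for i, group in enumerate(order))
total = sum(sum(row) for row in confusion.values())
hit_rate = 100 * correct / total

# Percent correct within each actual group.
per_group = {group: 100 * confusion[group][i] / sum(confusion[group])
             for i, group in enumerate(order)}

print(correct, total, round(hit_rate, 1))
print({group: round(pct, 1) for group, pct in per_group.items()})
```

That is 38 of 45 grouped cases, which matches the 84.4% reported, and the per-group figures match the diagonal of the percentage table (94.4, 57.1, 85.0).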
Classification Rules
Decision rules developed from discriminant analysis can be influenced by knowledge of or expectations about the relative size in the population of the levels of the grouping variable.
E.g., approximately 5% of the population of mortgagees will default in a given year, so the prior probabilities are 5% for the default group and 95% for the non-default group.
In cases where these prior probabilities are not known, they are often based on the sample sizes for the levels of the grouping variable, if the sample is a random sample from the population.
Some decision rules treat the prior probabilities as equal across all levels and let the discriminating variables do all the classification work.
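One way to see the effect of prior probabilities is through Bayes' rule: the posterior probability of each group is proportional to its prior times the likelihood of the case's scores in that group. The sketch below uses the 5% and 95% priors from the mortgage example; the likelihood values are hypothetical and the function name is made up for illustration.

```python
def posteriors(priors, likelihoods):
    """Posterior probability of each group: prior * likelihood, normalized."""
    joint = {group: priors[group] * likelihoods[group] for group in priors}
    total = sum(joint.values())
    return {group: value / total for group, value in joint.items()}

# Hypothetical likelihoods of one applicant's scores under each group;
# the scores look three times more typical of defaulters.
likelihoods = {"default": 0.30, "no_default": 0.10}

equal = posteriors({"default": 0.5, "no_default": 0.5}, likelihoods)
informed = posteriors({"default": 0.05, "no_default": 0.95}, likelihoods)

print(max(equal, key=equal.get), round(equal["default"], 3))
print(max(informed, key=informed.get), round(informed["default"], 3))
```

With equal priors the case is assigned to the default group; with the 5%/95% priors the same evidence is no longer enough, and the case is assigned to the non-default group.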
As mentioned earlier, sometimes a decision is made in advance to test a discriminant function by holding out a sample and then using the function obtained on the training sample to classify the new cases from the holdout sample.
An alternative approach is the leave-one-out method, which is an option in SPSS under the Classify button.
Each case is deleted in turn from the training sample and is classified by means of the classification rule established on the remaining observations.
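The leave-one-out loop itself is simple to sketch. To keep the example self-contained, a plain nearest-centroid rule stands in for the full discriminant classification rule (this substitution is an assumption made for brevity, not what SPSS does), and the data are toy values.

```python
def centroid(rows):
    """Mean vector of a list of equal-length feature tuples."""
    return tuple(sum(column) / len(rows) for column in zip(*rows))

def classify(x, centroids):
    """Label of the centroid closest to x (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def leave_one_out_hit_rate(data):
    """data is a list of (features, label) pairs.

    Each case is held out in turn, the rule is rebuilt from the remaining
    cases, and the held-out case is classified with that rule.
    """
    hits = 0
    for i, (x, label) in enumerate(data):
        training = data[:i] + data[i + 1:]
        groups = {}
        for features, group in training:
            groups.setdefault(group, []).append(features)
        centroids = {group: centroid(rows) for group, rows in groups.items()}
        if classify(x, centroids) == label:
            hits += 1
    return hits / len(data)

# Toy, well-separated data (hypothetical).
toy = [((1.0, 1.2), "low"), ((0.8, 1.0), "low"), ((1.1, 0.9), "low"),
       ((3.0, 3.1), "high"), ((2.9, 3.3), "high"), ((3.2, 2.8), "high")]

print(leave_one_out_hit_rate(toy))
```

Because each case is classified by a rule that never saw it, the resulting hit rate is a less optimistic estimate than reclassifying the training cases.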
Wilks' Lambda
Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
1 through 2           .222            62.440       4    .000
2                     .944            2.372        1    .124
The stepwise discriminant analysis tossed out two of the four variables for not measuring up: the two that seemed to have the lowest weights on the first function in the original DA. Note that these new canonical variables don't explain quite as much variance (lambda is .222, a little bigger than the .205 in the original analysis), and the classification correctness rate is lower (75.6% compared to 84.4%). The original seems better, as long as your goal is not to find the most parsimonious solution using the fewest predictors.
Classification Results(a)

Original: Count
Concentration of Wealth      Predicted Group Membership
in Hands of Few              LowWealthConcentr   ModerateWealthConcentr   HighWealthConcentr
LowWealthConcentr            17                  1                        0
ModerateWealthConcentr       2                   3                        2
HighWealthConcentr           0                   6                        14
Ungrouped cases              4                   14                       10

Original: %
Concentration of Wealth      Predicted Group Membership
in Hands of Few              LowWealthConcentr   ModerateWealthConcentr   HighWealthConcentr
LowWealthConcentr            94.4                5.6                      .0
ModerateWealthConcentr       28.6                42.9                     28.6
HighWealthConcentr           .0                  30.0                     70.0
Ungrouped cases              14.3                50.0                     35.7
Functions at Group Centroids
Concentration of Wealth in Hands of Few   Function 1
LowWealthConcentr                         -2.023
ModerateWealthConcentr                    -.022
HighWealthConcentr                        1.828
Table 1
Table 2
Move the Country's Wealth Concentration variable into the Grouping window and set the range to a minimum of 1 and a maximum of 3.
Move the Number of peaceful political demonstrations, Political rights, and Number of strikes variables into the Independents box.
Select Enter Independents together (not stepwise for now).
Click on the Classify button; under Prior Probabilities set All Groups Equal, under Display select Summary table, and click Continue.
Click on the Statistics button and check Means, Univariate ANOVAs, Box's M, and Unstandardized function coefficients, and click Continue.
Click OK, and compare your output to the next several slides.
Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          .635(a)      98.4            98.4           .623
2          .010(a)      1.6             100.0          .102

Functions at Group Centroids
Concentration of Wealth in Hands of Few   Function 1
LowWealthConcentr                         1.052
ModerateWealthConcentr                    -.384
HighWealthConcentr                        -.658
Original: Count
Concentration of Wealth      Predicted Group Membership
in Hands of Few              LowWealthConcentr   ModerateWealthConcentr   HighWealthConcentr
LowWealthConcentr            21                  1                        0
ModerateWealthConcentr       5                   1                        8
HighWealthConcentr           7                   4                        16
Ungrouped cases              14                  7                        28

Original: %
Concentration of Wealth      Predicted Group Membership
in Hands of Few              LowWealthConcentr   ModerateWealthConcentr   HighWealthConcentr
LowWealthConcentr            95.5                4.5                      .0
ModerateWealthConcentr       35.7                7.1                      57.1
HighWealthConcentr           25.9                14.8                     59.3
Ungrouped cases              28.6                14.3                     57.1