Professional Documents
Culture Documents
1) Overview
2) Basic Concept
3) Relation to Regression and ANOVA
4) Discriminant Analysis Model
5) Statistics Associated with Discriminant Analysis
6) Conducting Discriminant Analysis
i. Formulation
ii. Estimation
iii. Determination of Significance
iv. Interpretation
v. Validation
i. Formulation
ii. Estimation
iv. Interpretation
v. Validation
i. Estimation
v. An Illustrative Application
10) Summary
Copyright © 2010 Pearson Education, Inc. 18-4
Similarities and Differences between ANOVA,
Regression, and Discriminant Analysis
Table 18.1
Differences
Nature of the
dependent Metric Metric Categorical
variables
Nature of the
independent Categorical Metric Metric
variables
Fig. 18.1
X2 G1
1 1 2 2
G2
1 1 11 2
1 1 1 1 2
1
2 2
2 22
1 2
21
1 22
22
G1
G2 X1
Fig. 18.2
Formulate the Problem
1 1 50.2 5 8 3 43 M (2)
2 1 70.3 6 7 4 61 H (3)
3 1 62.9 7 5 6 52 H (3)
4 1 48.5 7 5 5 36 L (1)
5 1 52.7 6 6 4 55 H (3)
6 1 75.0 8 7 5 68 H (3)
7 1 46.2 5 3 3 62 M (2)
8 1 57.0 2 4 6 51 M (2)
9 1 64.1 7 5 4 57 H (3)
10 1 68.1 7 6 5 45 H (3)
11 1 73.4 6 7 5 44 H (3)
12 1 71.9 5 8 4 64 H (3)
13 1 56.2 1 8 6 54 M (2)
14 1 49.3 4 2 3 56 H (3)
15 1 62.0 5 6 2 58 H (3)
16 2 32.1 5 4 3 58 L (1)
17 2 36.2 4 3 2 55 L (1)
18 2 43.2 2 5 2 57 M (2)
19 2 50.4 5 2 4 37 M (2)
20 2 44.1 6 6 3 42 M (2)
21 2 38.3 6 6 2 45 L (1)
22 2 55.0 1 2 2 57 M (2)
23 2 46.1 3 5 3 51 L (1)
24 2 35.0 6 4 5 64 L (1)
25 2 37.3 2 7 4 54 L (1)
26 2 41.8 5 1 3 56 M (2)
27 2 57.0 8 3 2 36 M (2)
28 2 33.4 6 8 2 50 L (1)
29 2 37.5 3 2 3 48 L (1)
30 2 41.3 3 3 2 42 L (1)
1 1 50.8 4 7 3 45 M(2)
2 1 63.6 7 4 7 55 H (3)
3 1 54.0 6 7 4 58 M(2)
4 1 45.0 5 4 3 60 M(2)
5 1 68.0 6 6 6 46 H (3)
6 1 62.1 5 6 3 56 H (3)
7 2 35.0 4 3 4 54 L (1)
8 2 49.6 5 3 5 39 L (1)
9 2 39.4 6 5 3 44 H (3)
10 2 37.0 2 6 5 51 L (1)
11 2 54.5 7 3 3 37 M(2)
12 2 38.2 2 2 3 49 L (1)
INCOME 1.00000
TRAVEL 0.19745 1.00000
VACATION 0.09148 0.08434 1.00000
HSIZE 0.08887 -0.01681 0.07046 1.00000
AGE - 0.01431 -0.19709 0.01742 -0.04301 1.00000
INCOME 0.74301
TRAVEL 0.09611
VACATION 0.23329
HSIZE 0.46911
AGE 0.20922
Structure Matrix:
Pooled within-groups correlations between discriminating variables & canonical discriminant functions
(variables ordered by size of correlation within function)
FUNC 1
INCOME 0.82202
HSIZE 0.54096
VACATION 0.34607
TRAVEL 0.21337
AGE 0.16354 Cont.
Copyright © 2010 Pearson Education, Inc. 18-21
Results of Two-Group Discriminant Analysis
Table 18.4, cont.
Unstandardized Canonical Discriminant Function Coefficients
FUNC 1
INCOME 0.8476710E-01
TRAVEL 0.4964455E-01
VACATION 0.1202813
HSIZE 0.4273893
AGE 0.2454380E-01
(constant) -7.975476
Canonical discriminant functions evaluated at group means (group centroids)
Group FUNC 1
1 1.29118
2 -1.29118
Classification results for cases selected for use in analysis
Predicted Group Membership
Actual Group No. of Cases 1 2
Group 1 15 12 3
80.0% 20.0%
Group 2 15 0 15
0.0% 100.0%
Percent of grouped cases correctly classified: 90.00%
Cont.
Copyright © 2010 Pearson Education, Inc. 18-22
Results of Two-Group Discriminant Analysis
Table 18.4, cont.
Structure Matrix:
Table 18.5, cont.
FUNC 1 FUNC 2
INCOME 0.85556* -0.27833
HSIZE 0.19319* 0.07749
VACATION 0.21935 0.58829*
TRAVEL 0.14899 0.45362*
AGE 0.16576 0.34079*
Group 1 10 9 1 0
90.0% 10.0% 0.0%
Group 2 10 1 9 0
10.0% 90.0% 0.0%
Group 3 10 0 2 8
0.0% 20.0% 80.0%
Percent of grouped cases correctly classified: 86.67%
Classification results for cases not selected for use in the analysis
Predicted Group Membership
Actual Group No. of Cases 1 2 3
Group 1 4 3 1 0
75.0% 25.0% 0.0%
Group 2 4 0 3 1
0.0% 75.0% 25.0%
Group 3 4 1 0 3
25.0% 0.0% 75.0%
Percent of grouped cases correctly classified: 75.00%
Fig. 18.3
Across: Function 1
Down: Function 2
4.0
1 1
1 *1 3
23 3 *3 3
1 1 12 * 3 3
0.0 1 1 2 2
3
1 2 2
2
-4.0
* indicates a group
centroid
Fig. 18.4
13
13
13 Across: Function 1
8.0 13 Down: Function 2
13 * Indicates a
13
13 group centroid
4.0 113
112 3
112233
* 1 1 1 2 2 2 2 3 3*
112 * 223
0.0 121 2 233
1 12 2233
1 12 1 2 2 223
233
-4.0 1122
223
2
112
11122 233
1 121 2 2 2233
112 223
-8.0 11122 233
The probability of success may be modeled using the logit model as:
P
log = a +a X +a X +... +a X
1 - P
e 0 1 1 2 2 k k
P =
a X
n
log
1 - P
Or e i i
i= 0
exp( a X )
P =
i i
i =0
1 + exp(
k
a X i i )
i =0
Where:
P = Probability of success
Xi = Independent variable i
ai = parameter to be estimated.
Where,
nonmetric.
Table 18.7
Model Summary
Analyze>Classify>Discriminant …
6. Click OK.
Analyze>Classify>Discriminant Analysis