
Discriminant analysis

Similarities and Differences between ANOVA,
Regression, and Discriminant Analysis

                                        ANOVA        REGRESSION   DISCRIMINANT ANALYSIS

Similarities
  Number of dependent variables         One          One          One
  Number of independent variables       Multiple     Multiple     Multiple

Differences
  Nature of the dependent variable      Metric       Metric       Categorical
  Nature of the independent variables   Categorical  Metric       Metric

Discriminant Analysis

• Discriminant analysis is a technique for analyzing data when the criterion or dependent variable is categorical and the predictor or independent variables are interval in nature.
• When the criterion variable has two categories, the technique is known as two-group discriminant analysis.
• When three or more categories are involved, the technique is referred to as multiple discriminant analysis.
• Groups must be mutually exclusive and exhaustive.
Discriminant Analysis

The objectives of discriminant analysis are as follows:

• Development of discriminant functions, or linear combinations of the predictor or independent variables, which will best discriminate between the categories of the criterion or dependent variable (groups).
• Examination of whether significant differences exist among the groups, in terms of the predictor variables.
• Determination of which predictor variables contribute to most of the intergroup differences.
• Classification of cases to one of the groups based on the values of the predictor variables.
• Evaluation of the accuracy of classification.
Conducting Discriminant Analysis

Formulate the Problem

Estimate the Discriminant Function Coefficients

Determine the Significance of the Discriminant Function

Interpret the Results

Assess Validity of Discriminant Analysis


Example: Buyers and Non-Buyers of Wool
Respondents may belong to one of two groups:
• those who are prospective buyers of wool
• those who are not
The groups are discriminated on a set of four user characteristics: durability, light weight, low investment, and rot resistance.
Does a linear combination of these four characteristics allow one to discriminate between buyers and non-buyers of wool?
Running discriminant analysis
(two groups)

Discriminant function
(Target variable: Buyers/Non-buyers)

$z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$

where $z$ is the discriminant score and $x_1, \dots, x_4$ are the predictors:
$x_1$ = Durability
$x_2$ = Light weight
$x_3$ = Low investment
$x_4$ = Rot resistance

The $\beta$ discriminant coefficients need to be estimated.
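
As a rough illustration of how the discriminant score is obtained for one respondent, the sketch below adds the intercept to the weighted sum of the four ratings. The coefficient values and the respondent's ratings are hypothetical, chosen only to show the arithmetic; in practice the coefficients are estimated from the data.

```python
# Hypothetical coefficients, for illustration only (not estimated from data).
INTERCEPT = -4.0
COEFFS = {
    "durability": 0.5,
    "light_weight": 0.3,
    "low_investment": 0.4,
    "rot_resistance": 0.2,
}


def discriminant_score(ratings):
    """Return z = intercept + sum of coefficient * rating over the predictors."""
    return INTERCEPT + sum(COEFFS[name] * ratings[name] for name in COEFFS)


# Hypothetical ratings for a single respondent.
respondent = {
    "durability": 6,
    "light_weight": 5,
    "low_investment": 4,
    "rot_resistance": 5,
}
print("discriminant score:", round(discriminant_score(respondent), 3))
```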
Preliminary analysis

Descriptive statistics
Test for difference in group means
Correlation matrix
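
A minimal sketch of the first two preliminary checks for a single predictor: the group means, and a one-way ANOVA F statistic as the test for a difference between them. The ratings are invented for illustration, and the helper name `one_way_f` is an assumption of this sketch.

```python
from statistics import mean

# Hypothetical 1-7 ratings of "durability", invented for illustration.
buyers = [6, 7, 5, 6]
nonbuyers = [3, 4, 2, 3]


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    # Between-group and within-group sums of squares.
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ssb / df_between) / (ssw / df_within)


print("group means:", mean(buyers), mean(nonbuyers))
print("F statistic:", round(one_way_f([buyers, nonbuyers]), 3))
```

A large F suggests that the predictor separates the groups on its own; the same check would be repeated for each predictor.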
Fisher’s linear discriminant analysis

• The discriminant function is the starting point.
• Key assumptions behind linear DA:
(a) the predictors are normally distributed;
(b) homogeneity of variance (the groups have equal variance-covariance matrices);
(c) no multicollinearity among the independent variables (pairwise correlations below 0.75).
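
The multicollinearity assumption can be screened with pairwise correlations. The sketch below flags any pair of predictors whose absolute Pearson correlation reaches the 0.75 threshold quoted above. The ratings are hypothetical, and `pearson` is a small helper written for this example.

```python
from itertools import combinations
from statistics import mean

# Hypothetical ratings for five respondents, invented for illustration.
predictors = {
    "durability": [1, 2, 3, 4, 5],
    "light_weight": [2, 4, 6, 8, 11],
    "low_investment": [5, 1, 4, 2, 3],
}
THRESHOLD = 0.75  # cutoff quoted in the text


def pearson(x, y):
    """Return the Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Pairs of predictors that are too highly correlated.
flagged = [
    (a, b)
    for a, b in combinations(predictors, 2)
    if abs(pearson(predictors[a], predictors[b])) >= THRESHOLD
]
print("pairs at or above the threshold:", flagged)
```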
Discriminant function

• The function is estimated so that between-group variance is maximized and within-group variance is minimized.
• The unstandardized coefficients (b) are used to create the discriminant function (equation).
• The eigenvalue is SSB/SSW, the ratio of the between-group to the within-group sum of squares of the discriminant scores.
• The larger the eigenvalue, the more of the variance in the dependent variable is explained by that function.
• The canonical correlation is the measure of association between the discriminant function and the dependent variable.
• The square of the canonical correlation coefficient is the proportion of variance explained in the dependent variable.
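
To make the link between the eigenvalue and the canonical correlation concrete, the sketch below computes SSB and SSW from a set of discriminant scores, takes their ratio as the eigenvalue, and derives the canonical correlation as the square root of eigenvalue / (1 + eigenvalue). The scores are hypothetical, not output from a fitted model.

```python
from math import sqrt
from statistics import mean

# Hypothetical discriminant scores per group, invented for illustration.
scores = {
    "buyers": [1.0, 2.0, 1.5, 1.5],
    "nonbuyers": [-1.0, -2.0, -1.5, -1.5],
}


def sums_of_squares(groups):
    """Return (SSB, SSW) for a list of groups of discriminant scores."""
    all_scores = [z for g in groups for z in g]
    grand = mean(all_scores)
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((z - mean(g)) ** 2 for g in groups for z in g)
    return ssb, ssw


ssb, ssw = sums_of_squares(list(scores.values()))
eigenvalue = ssb / ssw
canonical_r = sqrt(eigenvalue / (1 + eigenvalue))
print("eigenvalue:", eigenvalue)
print("canonical correlation:", round(canonical_r, 4))
print("proportion of variance explained:", round(canonical_r ** 2, 4))
```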
Key Terms

• Look at the Wilks' lambda values for the overall model and for each independent variable.
• Wilks' lambda is SSWithin/SSTotal.
• Wilks' lambda indicates the significance of the discriminant function and gives the proportion of total variability not explained, i.e. it is the complement of the squared canonical correlation (lambda = 1 - squared canonical correlation).
• Smaller values of Wilks' lambda indicate greater discriminatory ability of the function.
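
A small sketch of Wilks' lambda as SSWithin/SSTotal, applied first to one predictor and then to a set of discriminant scores. For the scores it also checks that lambda equals one minus the proportion of variance explained (SSB/SST). All numbers are hypothetical.

```python
from statistics import mean


def wilks_lambda(groups):
    """Return SSWithin / SSTotal for a list of groups of values."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ss_total = sum((v - grand) ** 2 for v in values)
    return ss_within / ss_total


# Hypothetical ratings on one predictor (buyers, non-buyers).
durability = [[6, 7, 5, 6], [3, 4, 2, 3]]
# Hypothetical discriminant scores (buyers, non-buyers).
z_scores = [[1.0, 2.0, 1.5, 1.5], [-1.0, -2.0, -1.5, -1.5]]

lam_variable = wilks_lambda(durability)
lam_function = wilks_lambda(z_scores)
print("lambda for the predictor:", round(lam_variable, 4))
print("lambda for the function:", round(lam_function, 4))
print("variance explained by the function:", round(1 - lam_function, 4))
```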
Standardized discriminant function coefficients
The standardized discriminant function coefficients in the table serve the same purpose as beta weights in multiple regression (partial coefficients): they indicate the relative importance of the independent variables in predicting the dependent variable.
They allow you to compare variables measured on different scales.
Coefficients with large absolute values correspond to variables with greater discriminating ability.
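
One common way to obtain a standardized coefficient is to multiply the unstandardized coefficient by the pooled within-group standard deviation of that predictor. The sketch below does this for a single predictor; the coefficient value and the ratings are hypothetical.

```python
from math import sqrt
from statistics import mean

# Hypothetical unstandardized coefficient for "durability".
b_durability = 0.9
# Hypothetical ratings on "durability" (buyers, non-buyers).
groups = [[6, 7, 5, 6], [3, 4, 2, 3]]


def pooled_within_sd(groups):
    """Return the pooled within-group standard deviation."""
    n = sum(len(g) for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return sqrt(ssw / (n - len(groups)))


standardized = b_durability * pooled_within_sd(groups)
print("standardized coefficient:", round(standardized, 4))
```

Because each coefficient is rescaled by the spread of its own predictor, the results can be compared across predictors measured in different units.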
Structure coefficients

• The structure matrix table shows the correlation of each variable with each discriminant function.
• These correlations serve like factor loadings in factor analysis: they help identify the variables with the largest absolute correlations with each discriminant function.
• They are also termed discriminant loadings.
• Loadings in the structure matrix are not interpreted unless they are 0.30 or higher in absolute value.
• Many researchers use the structure matrix correlations because they are considered more accurate than the Standardized Canonical Discriminant Function Coefficients
in multiple discriminant analysis
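
A structure loading can be sketched as the pooled within-groups correlation between a predictor and the discriminant scores: both are centered on their own group means before the correlation is taken. The data and the helper name `pooled_within_correlation` are assumptions made for this illustration.

```python
from math import sqrt
from statistics import mean

# Hypothetical (predictor ratings, discriminant scores) for each group.
groups = {
    "buyers": ([6, 7, 5, 6], [1.5, 2.0, 1.5, 1.0]),
    "nonbuyers": ([3, 4, 2, 3], [-1.5, -1.0, -2.0, -1.5]),
}


def pooled_within_correlation(groups):
    """Correlate predictor and score after centering each on its group mean."""
    sxz = sxx = szz = 0.0
    for x, z in groups.values():
        mx, mz = mean(x), mean(z)
        sxz += sum((a - mx) * (b - mz) for a, b in zip(x, z))
        sxx += sum((a - mx) ** 2 for a in x)
        szz += sum((b - mz) ** 2 for b in z)
    return sxz / sqrt(sxx * szz)


loading = pooled_within_correlation(groups)
print("structure loading:", loading)
print("interpretable (|loading| >= 0.30):", abs(loading) >= 0.30)
```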
A further way of interpreting discriminant analysis results is to describe each group in terms of its profile, using the group means of the predictor variables. The group means of the discriminant scores are called centroids.

Cases with scores near a centroid are predicted as belonging to that group.
Classification

Each respondent is classified into one of the two groups by comparing the discriminant score with a cutoff, the average of the two group centroids (appropriate when the groups are of equal size).
Hit ratio = number of correct predictions / total number of cases
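
The classification rule and the hit ratio can be sketched as follows. The discriminant scores are hypothetical, the groups are of equal size, and the sketch assumes the buyer centroid lies above the non-buyer centroid, so scores above the cutoff are assigned to the buyer group.

```python
from statistics import mean

# Hypothetical discriminant scores by actual group, invented for illustration.
scores = {
    "buyer": [1.5, 2.0, 1.5, -0.2],
    "nonbuyer": [-1.5, -1.0, -2.0, 0.3],
}

# Group centroids: mean discriminant score of each group.
centroids = {group: mean(z) for group, z in scores.items()}
# Cutoff: average of the two centroids (equal group sizes assumed).
cutoff = (centroids["buyer"] + centroids["nonbuyer"]) / 2


def classify(z):
    """Assign a score to the group whose centroid lies on its side of the cutoff."""
    return "buyer" if z > cutoff else "nonbuyer"


correct = sum(classify(z) == group for group, zs in scores.items() for z in zs)
total = sum(len(zs) for zs in scores.values())
hit_ratio = correct / total
print("cutoff:", round(cutoff, 3))
print("hit ratio:", hit_ratio)
```

A hit ratio computed on the same cases used to estimate the function tends to be optimistic, which is why validity is usually assessed on a holdout sample.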
