Discriminant Analysis
Presented by
Amritashish Bagchi, Anshuman Mishra & Sukanta Goswami
Definition
Discriminant analysis is a multivariate
statistical technique used for classifying a
set of observations into predefined
groups.
OBJECTIVE
To understand group differences and to predict the
likelihood that a particular entity will belong to a
particular class or group based on independent
variables.
Purpose
1) The main purpose is to classify a subject
into one of the two groups on the basis of
some independent traits.
2) A second purpose of the discriminant
analysis is to study the relationship
between group membership and the
variables used to predict the group
membership.
Situations for its use
When the dependent variable is
dichotomous or multichotomous.
1. Sample size
Group sizes of the dependent variable should not be
grossly different (e.g. 80:20); in such cases logistic
regression may be preferred.
The sample size should be at least five times the
number of independent variables.
2. Normal distribution
Each of the independent variables is normally
distributed.
3. Homogeneity of variances / covariances
All variables have linear and homoscedastic
relationships.
4. Outliers
Outliers should not be present in the data.
DA is highly sensitive to the inclusion of
outliers.
5. Non-multicollinearity
There should not be any high correlation among the
independent variables.
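One way to screen for multicollinearity is to compute pairwise Pearson correlations among the predictors and flag the large ones. The sketch below is illustrative only: the variable names, the data and the 0.8 threshold are assumptions, not values from the slides.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def highly_correlated_pairs(variables, threshold=0.8):
    """Return (name1, name2, r) for every pair with |r| >= threshold."""
    names = list(variables)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(variables[names[i]], variables[names[j]])
            if abs(r) >= threshold:
                flagged.append((names[i], names[j], round(r, 3)))
    return flagged

# Made-up predictor values for five cases (assumption for illustration).
predictors = {
    "height": [160, 165, 170, 175, 180],
    "weight": [55, 60, 66, 70, 77],
    "patience": [7, 3, 6, 2, 5],
}
flagged = highly_correlated_pairs(predictors)
print(flagged)
```

A flagged pair suggests dropping or combining one of the two variables before building the discriminant function.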
6. Mutually exclusive
The groups must be mutually exclusive, with
every subject or case belonging to only one
group.
7. Classification
Each of the allocations to the dependent
categories in the initial classification is
correct.
8. Variability
No independent variables should have a zero
variability in either of the groups formed by
the dependent variable.
Terminology
1) Variables in the analysis
2) Discriminant function
A discriminant function is a latent variable which is
constructed as a linear combination of independent
variables, such that
Z = c + b1X1 + b2X2 + ... + bnXn
The discriminant function is also known as the
canonical root. It is used to classify the
subjects/cases into one of the two groups on the
basis of the observed values of the predictor
variables.
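As a minimal sketch of how such a function is applied, the code below computes Z for one case and assigns a group by comparing Z with a cutoff. The constant, coefficients, case values and the cutoff of 0 are hypothetical and chosen only for illustration.

```python
def discriminant_score(constant, coefficients, values):
    """Return Z = c + b1*X1 + b2*X2 + ... + bn*Xn."""
    return constant + sum(b * x for b, x in zip(coefficients, values))

def classify(z, cutoff=0.0):
    """Assign group 2 when the score exceeds the cutoff, otherwise group 1."""
    return 2 if z > cutoff else 1

# Hypothetical constant and coefficients (not taken from the slides).
c = -10.0
b = [0.05, 0.30]

# One case with two observed predictor values.
case = [170.0, 6.0]
z = discriminant_score(c, b, case)
print(z, classify(z))
```

The cutoff is usually placed between the mean scores of the two groups; 0 is used here only for simplicity.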
3) Classification matrix
In DA, it serves as a yardstick for measuring the
accuracy of a model in classifying an individual/case
into one of the two groups. It is also known as the confusion
matrix, assignment matrix, or prediction matrix. It shows
what percentage of the existing data points are
correctly classified by the model developed in DA.
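The classification matrix can be sketched as a cross-tabulation of actual against predicted group membership, with the percentage correctly classified read off its diagonal. The group labels and data below are made up for illustration.

```python
def classification_matrix(actual, predicted, groups):
    """Count cases by actual group (outer key) and predicted group (inner key)."""
    matrix = {a: {p: 0 for p in groups} for a in groups}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def percent_correct(matrix):
    """Percentage of cases on the diagonal, i.e. correctly classified."""
    total = sum(sum(row.values()) for row in matrix.values())
    hits = sum(matrix[g][g] for g in matrix)
    return 100.0 * hits / total

# Made-up actual and predicted memberships for ten cases.
actual = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
predicted = [1, 1, 1, 2, 2, 2, 2, 2, 1, 2]

matrix = classification_matrix(actual, predicted, groups=[1, 2])
hit_ratio = percent_correct(matrix)
print(matrix, hit_ratio)
```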
4) Stepwise method of discriminant analysis
The discriminant function can be developed either by
entering all independent variables together or in a
stepwise manner, depending upon whether the study is
confirmatory or exploratory.
5) Power of discriminatory variables
After developing the model in the discriminant analysis
based on the selected independent variables, it is important
to know the relative importance of the variables so selected.
6) Box's M Test
Box's M test is used to test the null hypothesis that the
covariance matrices do not differ between the groups formed by
the dependent variable. If Box's M test is not significant, it
indicates that the assumption of equal covariance matrices
required for DA holds true.
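For readers who want to see what the test computes, the sketch below evaluates Box's M and its commonly used chi-square approximation in plain Python. The function names and the data are assumptions for illustration; statistical packages report the statistic together with a p-value, which is not computed here.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of a list of cases."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return det

def box_m(groups):
    """Return (M, chi-square approximation, degrees of freedom)."""
    g, p = len(groups), len(groups[0][0])
    sizes = [len(rows) for rows in groups]
    total = sum(sizes)
    covs = [covariance_matrix(rows) for rows in groups]
    # Pooled within-group covariance matrix.
    pooled = [[sum((n - 1) * s[i][j] for n, s in zip(sizes, covs)) / (total - g)
               for j in range(p)] for i in range(p)]
    m = (total - g) * math.log(determinant(pooled)) - sum(
        (n - 1) * math.log(determinant(s)) for n, s in zip(sizes, covs))
    c = ((2 * p * p + 3 * p - 1) / (6.0 * (p + 1) * (g - 1))) * (
        sum(1.0 / (n - 1) for n in sizes) - 1.0 / (total - g))
    return m, m * (1 - c), p * (p + 1) * (g - 1) // 2

# Made-up data: two groups, two predictors per case.
group_a = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
group_b = [(2.0, 5.0), (4.0, 4.0), (6.0, 9.0), (8.0, 6.0), (10.0, 12.0)]
m_stat, chi2, df = box_m([group_a, group_b])
print(m_stat, chi2, df)
```

The chi-square value is compared with the critical value for the reported degrees of freedom; a value below it is consistent with equal covariance matrices.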
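7) Eigenvalues
The eigenvalue is an index of overall fit. It is the
ratio of between-group to within-group variation in
the discriminant scores, so a larger eigenvalue
indicates better discrimination.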
8) Wilks' lambda
It measures the efficiency of the discriminant function
in the model.
Its value is the proportion of the total variability
in the discriminant scores that is not explained by
differences among the groups, so values closer to 0
indicate better discrimination.
9) Canonical correlation
The canonical correlation is the multiple
correlation between the predictors and the
discriminant function. With only one function it
provides an index of overall model fit, and its
square (R2) is interpreted as the proportion of
variance explained.
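For a single discriminant function, the eigenvalue, Wilks' lambda and the canonical correlation are all derived from the same between-group and within-group sums of squares of the discriminant scores. The sketch below shows that link using made-up scores; the function name is hypothetical.

```python
import math

def fit_indices(scores_by_group):
    """Return (eigenvalue, Wilks' lambda, canonical correlation) for one function."""
    all_scores = [z for group in scores_by_group for z in group]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_between = 0.0
    ss_within = 0.0
    for group in scores_by_group:
        group_mean = sum(group) / len(group)
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        ss_within += sum((z - group_mean) ** 2 for z in group)
    ss_total = ss_between + ss_within
    eigenvalue = ss_between / ss_within          # between / within
    wilks_lambda = ss_within / ss_total          # unexplained proportion
    canonical_r = math.sqrt(ss_between / ss_total)
    return eigenvalue, wilks_lambda, canonical_r

# Made-up discriminant scores for two groups of four cases each.
scores = [[-2.0, -1.0, -1.0, 0.0], [0.0, 1.0, 1.0, 2.0]]
eigenvalue, wilks_lambda, canonical_r = fit_indices(scores)
print(eigenvalue, wilks_lambda, canonical_r)
```

With one function, Wilks' lambda equals 1 / (1 + eigenvalue), and the squared canonical correlation equals 1 minus Wilks' lambda.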
Detailed procedure
STEPS IN ANALYSIS :
STEP 1.
In step one, the independent variables which have
discriminating power are chosen.
STEP 2.
A discriminant function model is developed by using
the coefficients of the independent variables.
STEP 3.
In step three, Wilks' lambda is computed for testing
the significance of the discriminant function.
STEP 4.
In step four, the independent variables which are
important in discriminating between the groups are
identified.
STEP 5.
[Figure: labels Height, Judgement, Patience; axis marked at -4.390, 0 and 4.390]
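The model-building step can be illustrated with a small two-group, two-predictor example. The sketch below uses Fisher's linear discriminant with a pooled covariance matrix and places the cutoff at 0, midway between the group means, which assumes equal priors. This is one common way to obtain the coefficients and is not necessarily the procedure used by the presenters; the data and function names are made up, and variable selection and significance testing are not covered.

```python
def mean_vector(rows):
    """Mean of each of the two predictors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mean):
    """2x2 sum of squares and cross-products about the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_two_group(group1, group2):
    """Return (constant, coefficients) of Z = c + b1*X1 + b2*X2."""
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter(group1, m1), scatter(group2, m2)
    dof = len(group1) + len(group2) - 2
    sp = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]
    det = sp[0][0] * sp[1][1] - sp[0][1] * sp[1][0]
    inv = [[sp[1][1] / det, -sp[0][1] / det],
           [-sp[1][0] / det, sp[0][0] / det]]
    diff = [m2[0] - m1[0], m2[1] - m1[1]]
    # Coefficients: inverse pooled covariance times difference of means.
    b = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    # Constant chosen so that Z = 0 midway between the two group means.
    c = -sum(b[i] * (m1[i] + m2[i]) / 2 for i in range(2))
    return c, b

def score(c, b, row):
    """Discriminant score of one case."""
    return c + b[0] * row[0] + b[1] * row[1]

# Made-up cases: (predictor 1, predictor 2) for each group.
group1 = [(160.0, 4.0), (165.0, 5.0), (162.0, 3.0), (168.0, 6.0)]
group2 = [(178.0, 7.0), (182.0, 8.0), (175.0, 6.0), (185.0, 9.0)]

c, b = fit_two_group(group1, group2)
predicted = [2 if score(c, b, r) > 0 else 1 for r in group1 + group2]
print(c, b, predicted)
```

Cases with a positive score are assigned to the second group and cases with a negative score to the first.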
Thank you