Prepared by-
Sumit Jain
Introduction-
Discriminant analysis, or DA, is a technique for analysing marketing
research data when the criterion (dependent) variable is categorical and
the predictor (independent) variables are interval in nature. In other
words, discriminant analysis is a statistical method that
researchers use to understand the
relationship between a "dependent variable" and one
or more "independent variables." A dependent
variable is the variable that a researcher is trying to
explain or predict from the values of the independent
variables. Discriminant analysis is similar to regression
analysis and analysis of variance (ANOVA). The
principal difference between discriminant analysis and
the other two methods lies in the nature of
the dependent variable: it is categorical in discriminant
analysis, whereas regression and ANOVA use a continuous
dependent variable.
Examples-
For example, an educational researcher may want to
investigate which variables discriminate between high school
graduates who decide (1) to go to college, (2) to attend a trade
or professional school, or (3) to seek no further training or
education. For that purpose the researcher could collect data
on numerous variables prior to students' graduation. After
graduation, most students will naturally fall into one of the
three categories. Discriminant Analysis could then be used to
determine which variable(s) are the best predictors of students'
subsequent educational choice.
As another example, a medical researcher may record different
variables relating to patients' backgrounds in order to learn
which variables best predict whether a patient is likely to
recover completely (group 1), partially (group 2), or not at all
(group 3). A biologist could record different characteristics of
similar types (groups) of flowers, and then perform a
discriminant function analysis to determine the set of
characteristics that allows for the best discrimination between
the types.
Purpose-
The main purpose of a discriminant function analysis is to
predict group membership based on a linear combination of the
interval variables. The procedure begins with a set of
observations where both group membership and the values of the
interval variables are known. The end result of the procedure is a
model that allows prediction of group membership when only the
interval variables are known. A second purpose of discriminant
function analysis is an understanding of the data set, as a careful
examination of the prediction model that results from the
procedure can give insight into the relationship between group
membership and the variables used to predict group membership.
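The procedure described above can be sketched for the simplest case: two groups and two interval predictors. This is a minimal illustration in plain Python, assuming equal prior probabilities and a pooled within-group covariance matrix; the data values and the function names (fit_discriminant, predict) are hypothetical and chosen only for this example.

```python
# Minimal two-group linear discriminant sketch (pure Python, two predictors).
# All data values and names below are hypothetical, for illustration only.

def mean_vector(rows):
    """Column means of a list of (x1, x2) observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def pooled_covariance(group_a, group_b):
    """Pooled within-group covariance matrix (2x2) for two predictors."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group_a, group_b):
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    total[i][j] += d[i] * d[j]
    dof = len(group_a) + len(group_b) - 2
    return [[total[i][j] / dof for j in range(2)] for i in range(2)]

def invert_2x2(m):
    """Inverse of a 2x2 matrix (assumes it is not singular)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def fit_discriminant(group_a, group_b):
    """Return (weights, cutoff) for the linear score weights . x."""
    mean_a, mean_b = mean_vector(group_a), mean_vector(group_b)
    inv = invert_2x2(pooled_covariance(group_a, group_b))
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    weights = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
               inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cutoff halfway between the projected group means (equal priors assumed).
    score_a = sum(w * x for w, x in zip(weights, mean_a))
    score_b = sum(w * x for w, x in zip(weights, mean_b))
    return weights, (score_a + score_b) / 2

def predict(weights, cutoff, x):
    """Assign a new case to group 'A' or 'B' from its predictors only."""
    score = sum(w * v for w, v in zip(weights, x))
    return "A" if score > cutoff else "B"

# Observations where both group membership and predictor values are known.
group_a = [(6.0, 7.0), (7.0, 8.0), (8.0, 6.5), (7.5, 7.5)]
group_b = [(2.0, 3.0), (3.0, 2.5), (2.5, 4.0), (3.5, 3.5)]

weights, cutoff = fit_discriminant(group_a, group_b)
new_case_label = predict(weights, cutoff, (7.0, 7.0))
```

The fitted weights also serve the second purpose: a predictor with a larger weight (on comparable scales) contributes more to separating the groups.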
Objectives-
• To classify cases into groups using a discriminant prediction
equation.
• To test theory by observing whether cases are classified as
predicted.
• To investigate differences between or among groups.
• To determine the most parsimonious way to distinguish among
groups.
• To determine the percent of variance in the dependent variable
explained by the independent variables.
• To determine the percent of variance in the dependent variable
explained by the independent variables over and above the variance
accounted for by control variables, using sequential
discriminant analysis.
• To assess the relative importance of the independent
variables in classifying the dependent variable.
• To discard variables which are only weakly related to group
distinctions.
• To infer the meaning of MDA dimensions which
distinguish groups, based on discriminant loadings.
Multiple discriminant analysis (MDA) is an extension of
discriminant analysis and a cousin of multivariate analysis of
variance (MANOVA), sharing many of the same assumptions and
tests. MDA is used to classify a categorical dependent which has
more than two categories, using as predictors a number of
interval or dummy independent variables. MDA is sometimes
also called discriminant factor analysis or canonical discriminant
analysis.
Assumptions in Discriminant analysis-
8. Assumes linearity: The discriminant functions should be linear and related
to each other.
Application-
Career Counsellors
Suppose we have two groups of high school
graduates: Those who choose to attend
college after graduation and those who do
not. We could have measured students'
stated intention to continue on to college
one year prior to graduation. If the means
for the two groups (those who actually went
to college and those who did not) are
different, then we can say that intention to
attend college as stated one year prior to
graduation allows us to discriminate
between those who are and are not college
bound (and this information may be used by
career counsellors to provide the
appropriate guidance to the respective
students).
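The comparison of group means in this example can be sketched as follows. The intention scores are invented for illustration (a hypothetical 1 to 10 scale), and the pooled two-sample t statistic is one common way to judge whether the means differ; it is an assumption of this sketch, not something the slides prescribe.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical stated-intention scores, measured one year before graduation
# (1 = no intention to attend college, 10 = certain to attend).
went_to_college = [8, 9, 7, 9, 8, 10]
did_not_go = [3, 5, 4, 2, 4, 3]

def pooled_t_statistic(a, b):
    """Two-sample t statistic using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))

mean_gap = mean(went_to_college) - mean(did_not_go)
t_stat = pooled_t_statistic(went_to_college, did_not_go)

print(f"difference in group means: {mean_gap:.2f}")
print(f"pooled t statistic: {t_stat:.2f}")
```

A large gap between the group means relative to the within-group spread suggests that stated intention discriminates well between the two groups.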
Marketing-
In marketing, discriminant analysis
was once often used to determine
the factors which distinguish
different types of customers and/or
products on the basis of surveys or
other forms of collected data.
Logistic regression or other methods
are now more commonly used. The
use of discriminant analysis in
marketing can be described by the
following steps:
Formulate the problem and gather
data - Identify the salient attributes
consumers use to evaluate products in
this category - Use quantitative
marketing research techniques (such
as surveys) to collect data from a
sample of potential customers
concerning their ratings of all the
product attributes. The data collection
stage is usually done by marketing
research professionals. Survey
questions ask the respondent to rate a
product from one to five (or 1 to 7, or 1
to 10) on a range of chosen attributes.
Anywhere from five to twenty
attributes are chosen. They could
include things like: ease of use,
weight, accuracy, durability,
colourfulness, price, or size. The
attributes chosen will vary depending
on the product being studied. The
same question is asked about all the
products in the study. The data for
multiple products is codified and
input into a statistical program such
as R, SPSS or SAS. (This step is the
Estimate the Discriminant Function
Coefficients and determine the statistical
significance and validity - Choose the
appropriate discriminant analysis method. The
direct method involves estimating the
discriminant function so that all the predictors
are assessed simultaneously. The stepwise
method enters the predictors sequentially. The
two-group method should be used when the
dependent variable has two categories or
states. The multiple discriminant method is
used when the dependent variable has three or
more categorical states. Use Wilks's Lambda to
test for significance in SPSS or the F statistic in SAS.
The most common method used to test validity
is to split the sample into an estimation or
analysis sample, and a validation or holdout sample.
The estimation sample is used in constructing the
discriminant function. The validation sample is used to
construct a classification matrix which contains the
number of correctly classified and incorrectly classified
cases. The percentage of correctly classified cases is
called the hit ratio.
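The classification matrix and hit ratio can be computed from the holdout cases as in the sketch below. The labels and predictions are made-up values standing in for the output of an already fitted discriminant function; the helper names are hypothetical.

```python
from collections import Counter

def classification_matrix(actual, predicted):
    """Count cases for each (actual group, predicted group) pair."""
    return Counter(zip(actual, predicted))

def hit_ratio(actual, predicted):
    """Share of holdout cases whose predicted group matches the actual group."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)

# Hypothetical validation (holdout) sample: true groups and the groups
# assigned by a discriminant function estimated on the analysis sample.
actual = ["buyer", "buyer", "non-buyer", "non-buyer",
          "buyer", "non-buyer", "buyer", "non-buyer"]
predicted = ["buyer", "non-buyer", "non-buyer", "non-buyer",
             "buyer", "buyer", "buyer", "non-buyer"]

matrix = classification_matrix(actual, predicted)
ratio = hit_ratio(actual, predicted)

for (true_group, assigned_group), count in sorted(matrix.items()):
    print(f"actual={true_group:10s} predicted={assigned_group:10s} n={count}")
print(f"hit ratio: {ratio:.2%}")
```

Computing the hit ratio on the holdout sample rather than on the estimation sample avoids an overly optimistic accuracy figure.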
Prediction of Elections:
In this case the variables can be various social and economic factors,
coupled with party effort parameters. Some of these variables can
be as follows:
(1) No. of new projects implemented by incumbent party
(4) SEC division of the electorate (in form of ratios)
(5) Profession-wise division of the electorate
electorate in view.
Outcome of terrorist attacks with hostages:
(1) Number of terrorists
(2) Strength of their support in the local population
Careful training on past cases can help the government decide
whether to use force or negotiation to neutralize the terrorist threat.
MEDICINE AND DIAGNOSTICS
Insolvency prediction (Case study on Spanish Banks)
Unlike other financial problems, there are a
great number of agents facing business failure, so
research on this topic has been of growing interest
in recent decades. Insolvency, early detection of
financial distress, or conditions leading to
insolvency of insurance companies have been a
concern of parties such as insurance regulators,
investors, management, financial analysts, banks,
auditors, policy holders and consumers. This
concern has arisen from the necessity of
protecting the general public
In short, Discriminant Analysis is a
very useful tool (1) for detecting the
variables that allow the researcher to
discriminate between different
(naturally occurring) groups, and (2)
for classifying cases into different
groups with better-than-chance
accuracy.