Discriminant Analysis
It is a linear combination of features/variables that ensures maximum separation between the groups under consideration.
[Figure: Experience vs. Skills scatter plot illustrating within-group variance and between-group variance]
• Logistic regression is more similar to LDA than ANOVA is, as both explain a categorical variable by the values of continuous independent variables.
• Logistic regression is preferable in applications where it is not reasonable to assume that the independent variables are normally distributed, which is a fundamental assumption of the LDA method.
• LDA is preferred when there are multiple output classes.
Difference between PCA & LDA
[Figure: principal component directions PC1 and PC2 on an Experience vs. Skills scatter plot]
PCA:
1. PCA works on the concept of finding the direction in which there is maximum variance.
2. PCA is unsupervised learning.

LDA:
1. LDA works on the concept of maximizing class separability.
2. LDA is supervised learning.
3. Discriminant analysis is used when the groups are known a priori.
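The contrast between the two criteria can be seen numerically. A minimal NumPy sketch with made-up toy data (all names and values are illustrative): PCA takes the top eigenvector of the overall covariance and ignores the labels, while the two-class (Fisher) LDA direction is proportional to S_W⁻¹(µ1 − µ2).

```python
import numpy as np

# Toy 2-D data (hypothetical): two classes that overlap along the
# high-variance x-axis but separate along the low-variance y-axis.
rng = np.random.default_rng(1)
X1 = rng.normal([0.0, 0.0], [3.0, 0.3], size=(50, 2))
X2 = rng.normal([0.0, 1.0], [3.0, 0.3], size=(50, 2))
X = np.vstack([X1, X2])

# PCA: direction of maximum total variance -- class labels are ignored.
eigvals, eigvecs = np.linalg.eigh(np.cov(X.T))
pca_dir = eigvecs[:, -1]            # eigenvector of the largest eigenvalue

# LDA (two classes): direction of maximum class separability,
# w proportional to S_W^-1 (mu1 - mu2), S_W = within-class scatter.
S_W = np.cov(X1.T) * (len(X1) - 1) + np.cov(X2.T) * (len(X2) - 1)
lda_dir = np.linalg.solve(S_W, X1.mean(axis=0) - X2.mean(axis=0))
lda_dir /= np.linalg.norm(lda_dir)

# PCA follows the spread (x-axis); LDA follows the separation (y-axis).
```

On this data the PCA direction is dominated by the x-component and the LDA direction by the y-component, which is exactly the supervised/unsupervised difference in the table above.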
LDA as dimension reduction
[Figure: projecting 2-dimensional data with class means µ1 and µ2 onto two candidate axes (Line 1 and Line 2); one projection yields correct classification in 1 dimension while the other causes misclassification]
• LDA uses both variables to create a new axis. By doing so, it maximizes the separation between the 2 classes.
• Here µ1 and µ2 are the means of the two groups, and S1 and S2 are the deviations (scatter) within each group.

Maximize (µ1 − µ2)² / (S1² + S2²)
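The criterion above can be checked numerically. A minimal sketch with made-up (Experience, Skills) points: project each class onto a candidate axis w and compute (m1 − m2)² / (s1² + s2²).

```python
import numpy as np

def fisher_criterion(w, X1, X2):
    """Fisher ratio (m1 - m2)^2 / (s1^2 + s2^2) for a projection axis w."""
    w = np.asarray(w, dtype=float)
    p1, p2 = X1 @ w, X2 @ w          # 1-D projections of each class
    return (p1.mean() - p2.mean()) ** 2 / (p1.var() + p2.var())

# Hypothetical (Experience, Skills) points for two classes
X1 = np.array([[2.0, 3.0], [3.0, 4.0], [4.0, 5.0]])
X2 = np.array([[6.0, 1.0], [7.0, 2.0], [8.0, 1.0]])

# The axis [1, -1] separates these classes far better than [1, 1]:
j_good = fisher_criterion([1, -1], X1, X2)   # ratio = 50.0
j_bad = fisher_criterion([1, 1], X1, X2)     # ratio = 0.5
```

A large ratio means the projected class means are far apart relative to the within-class spread, which is exactly what the new LDA axis maximizes.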
Multiclass classification

[Figure: class centroids in the X-Y plane with pairwise distances d1, d2, d3 to be maximized]

• If there are 2 classes, LDA will segregate the classes using a 1-dimensional vector.
• If there are 3 classes, LDA will segregate the classes using a 2-dimensional vector.
• If there are k classes, LDA will segregate the classes using a (k-1)-dimensional vector.
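The k−1 limit comes from the between-class scatter matrix S_B: it is a sum of k rank-one terms whose weighted mean deviations sum to zero, so its rank is at most k−1. A small NumPy check with made-up data (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# k = 3 hypothetical classes in 5 dimensions (equal class sizes)
means = [np.zeros(5), np.full(5, 3.0), np.arange(5.0)]
classes = [rng.normal(m, 1.0, size=(30, 5)) for m in means]
grand_mean = np.vstack(classes).mean(axis=0)

# Between-class scatter: S_B = sum_i n_i (m_i - gm)(m_i - gm)^T
S_B = sum(len(c) * np.outer(c.mean(axis=0) - grand_mean,
                            c.mean(axis=0) - grand_mean)
          for c in classes)

rank = np.linalg.matrix_rank(S_B)   # at most k - 1 = 2 useful directions
```

Even though the data live in 5 dimensions, S_B has rank 2, so LDA can extract at most k − 1 = 2 discriminant directions.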
How does LDA predict and separate the classes?
• In most cases, LDA predicts classes in one of 2 ways:
1) Bayes rule
2) Distance
Application
a) Altman Z-Score
Bayes rule
Conditional probability is the likelihood of an outcome occurring, based on a previous outcome having occurred.

P(A|B) = P(B|A) · P(A) / P(B)

where P(A|B) is the posterior probability, P(A) is the prior probability, and P(B) is the marginal probability.

Applying the rule to two classes, Y=1 and Y=0:

P(Y=1|X) = P(X|Y=1) · P(Y=1) / P(X)
Example: What is the probability that a person goes out to play when it is Sunny & Hot?

Outlook      Yes   No   P(Outlook|Yes)   P(Outlook|No)
Sunny         2     3        2/9              3/5
Overcast      4     0        4/9              0
Rainy         3     2        3/9              2/5
Total         9     5         1               1

Temperature  Yes   No   P(Temp|Yes)      P(Temp|No)
Hot           2     2        2/9              2/5
Mild          4     2        4/9              2/5
Cool          3     1        3/9              1/5
Total         9     5         1               1
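Using the counts above, the "Sunny & Hot" question can be answered with Bayes' rule under a conditional-independence (naive Bayes) assumption; exact fractions keep the arithmetic transparent:

```python
from fractions import Fraction

# Priors from the table totals: 9 "Yes" days and 5 "No" days out of 14
p_yes, p_no = Fraction(9, 14), Fraction(5, 14)

# Class-conditional probabilities read off the tables
score_yes = Fraction(2, 9) * Fraction(2, 9) * p_yes  # P(Sunny|Yes)*P(Hot|Yes)*P(Yes)
score_no = Fraction(3, 5) * Fraction(2, 5) * p_no    # P(Sunny|No)*P(Hot|No)*P(No)

# Normalize to get the posterior probability of playing
posterior_yes = score_yes / (score_yes + score_no)
print(posterior_yes)   # 10/37, about 0.27 -- so "No play" is more likely
```

The denominator P(X) cancels in the normalization step, which is why only the two unnormalized scores are needed.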
Where S is the covariance matrix of the predictors (the diagonal contains the variances and the off-diagonals contain the covariances of every pair of predictors).
T indicates that the vector should be transposed.
Computing the statistical distance of a customer from the centre of the acceptors class

New customer profile:
Average credit card spends    2.7
Age                           44
Income                       100

Class means (the Acceptors column is the centroid of the acceptors):
                            Non-Acceptors   Acceptors
Mean of average CC spends        1.73          3.91
Mean of Age                     45.37         45.07
Mean of Income                  66.24        144.75

Covariance matrix S and its inverse S⁻¹:

    | 995.5   14.21    7.77 |⁻¹    |  0.0011  -0.0034  -0.0001 |
    | 14.21    4.39   -0.06 |   =  | -0.0034   0.2388   0.0003 |
    |  7.77   -0.06  134.07 |      | -0.0001   0.0003   0.0075 |

Classification score of a customer x relative to a class centroid ȳ:

    c_s(x, ȳ) = ȳᵀ S⁻¹ x − (1/2) ȳᵀ S⁻¹ ȳ
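The score can be evaluated directly from the numbers above. A sketch, assuming the variable order in the covariance matrix is [Income, CC spends, Age]; the diagonal variances are consistent with that ordering, but the slide does not state it, so treat the ordering as an assumption:

```python
import numpy as np

# Inverse covariance matrix S^-1 as given above
S_inv = np.array([[ 0.0011, -0.0034, -0.0001],
                  [-0.0034,  0.2388,  0.0003],
                  [-0.0001,  0.0003,  0.0075]])

# Assumed variable order: [Income, CC spends, Age]
x = np.array([100.0, 2.7, 44.0])          # new customer profile
mu_acc = np.array([144.75, 3.91, 45.07])  # acceptors centroid
mu_non = np.array([66.24, 1.73, 45.37])   # non-acceptors centroid

def score(x, mu):
    """Linear score c_s(x, mu) = mu^T S^-1 x - 0.5 * mu^T S^-1 mu."""
    return mu @ S_inv @ x - 0.5 * mu @ S_inv @ mu

# The customer is assigned to whichever class gives the larger score
s_acc, s_non = score(x, mu_acc), score(x, mu_non)
```

The quadratic term (1/2)ȳᵀS⁻¹ȳ penalizes centroids that are far from the origin in the S⁻¹ metric, so the comparison reduces to a linear decision rule in x.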
Altman Z-Score
• The Altman Z-score is the output of a credit-strength test that gauges a publicly traded manufacturing company's likelihood of bankruptcy.
• Z-Score = 1.2A + 1.4B + 3.3C + 0.6D + 1.0E
  where A = working capital / total assets, B = retained earnings / total assets, C = earnings before interest and tax / total assets, D = market value of equity / total liabilities, E = sales / total assets.
• An Altman Z-score below 1.8 suggests a company might be headed for bankruptcy, while a score above 3 suggests the company is in a solid financial position.
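The formula is straightforward to encode. A minimal sketch with an illustrative, made-up set of ratios (not real company data):

```python
def altman_z(a, b, c, d, e):
    """Altman Z-score: 1.2A + 1.4B + 3.3C + 0.6D + 1.0E."""
    return 1.2 * a + 1.4 * b + 3.3 * c + 0.6 * d + 1.0 * e

# Hypothetical ratios for a company
z = altman_z(a=0.2, b=0.2, c=0.1, d=1.0, e=1.0)
# 0.24 + 0.28 + 0.33 + 0.60 + 1.00 = 2.45 -> between the 1.8 and 3.0 thresholds
```

A score of 2.45 falls in the "grey zone" between the distress threshold (1.8) and the safe threshold (3.0) mentioned above.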