Information-Gain-Calculator
Confusion Matrix

Example: an optical scanner on an assembly line. X is the true condition of a part ("+" or "-"); Y is the test classification ("Positive" or "Negative").

Probability Distributions

P(X)      p(a, b)            a = 0.20, b = 0.80
P(Y)      p(c, d)            c = 0.30, d = 0.70
p(X,Y)    p(e, f, g, h)      e = 0.10, f = 0.10, g = 0.20, h = 0.60
P(X)P(Y)  p(ac, ad, bc, bd)  ac = 0.06, ad = 0.14, bc = 0.24, bd = 0.56

Joint distribution p(X,Y) with its marginals:

            Y = "Positive"   Y = "Negative"   P(X)
X = "+"     e = 0.10         f = 0.10         a = 0.20
X = "-"     g = 0.20         h = 0.60         b = 0.80
P(Y)        c = 0.30         d = 0.70

Conditional Probabilities

p(Test POS | "+")   e/a   0.50
p(Test NEG | "+")   f/a   0.50
p(Test POS | "-")   g/b   0.25
p(Test NEG | "-")   h/b   0.75
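The marginals and conditional probabilities above can be sketched in Python (the spreadsheet's cell names a, b, c, d and e, f, g, h are kept as variable names; the numbers are the scanner example):

```python
# Joint distribution p(X, Y) for the assembly-line scanner example
joint = {
    ("+", "POS"): 0.10,  # e
    ("+", "NEG"): 0.10,  # f
    ("-", "POS"): 0.20,  # g
    ("-", "NEG"): 0.60,  # h
}

# Marginals: P(X) = (a, b), P(Y) = (c, d), each a row/column sum of the joint
a = joint[("+", "POS")] + joint[("+", "NEG")]   # P(X = "+") = 0.20
b = joint[("-", "POS")] + joint[("-", "NEG")]   # P(X = "-") = 0.80
c = joint[("+", "POS")] + joint[("-", "POS")]   # P(Y = POS) = 0.30
d = joint[("+", "NEG")] + joint[("-", "NEG")]   # P(Y = NEG) = 0.70

# Conditional probabilities of the classification given the condition
tpr = joint[("+", "POS")] / a   # p(Test POS | "+") = e/a = 0.50
fnr = joint[("+", "NEG")] / a   # p(Test NEG | "+") = f/a = 0.50
fpr = joint[("-", "POS")] / b   # p(Test POS | "-") = g/b = 0.25
tnr = joint[("-", "NEG")] / b   # p(Test NEG | "-") = h/b = 0.75

print(a, b, c, d, tpr, fnr, fpr, tnr)
```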
Entropies

Name                                        Entropy
Probability of the Condition, P(X)          H(X)   = 0.7219
Probability of the Classification, P(Y)     H(Y)   = 0.8813
Joint Distribution of X and Y, p(X,Y)       H(X,Y) = 1.5710
Product Distribution of X and Y, P(X)P(Y)   H(X) + H(Y) = 1.6032
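These entropies follow directly from the distributions above; a minimal sketch (base-2 logs, so entropy is in bits):

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits: H = sum of p * log2(1/p) over nonzero p."""
    return sum(p * log2(1.0 / p) for p in probs if p > 0)

H_X  = entropy([0.2, 0.8])             # condition P(X):        H(X)   ≈ 0.7219
H_Y  = entropy([0.3, 0.7])             # classification P(Y):   H(Y)   ≈ 0.8813
H_XY = entropy([0.1, 0.1, 0.2, 0.6])   # joint p(X,Y):          H(X,Y) ≈ 1.5710

print(H_X, H_Y, H_XY)
```

Because the product distribution P(X)P(Y) treats X and Y as independent, its entropy is exactly H(X) + H(Y).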
Mutual Information

I(X;Y) = Relative Entropy of Joint and Product Distributions = D(p(X,Y) || P(X)P(Y))

       = e*log(e/ac) + f*log(f/ad) + g*log(g/bc) + h*log(h/bd)
       = 0.0737 + (-0.0485) + (-0.0526) + 0.0597
       = 0.0323

(all logs base 2; e = 0.10, f = 0.10, g = 0.20, h = 0.60 and ac = 0.06, ad = 0.14, bc = 0.24, bd = 0.56)

Equivalently:

I(X;Y) = H(Y) - H(Y|X) = 0.8813 - 0.8490 = 0.0323
         where H(Y) = c*log(1/c) + d*log(1/d) = 0.5211 + 0.3602 = 0.8813
         and H(Y|X) = a*H(e/a, f/a) + b*H(g/b, h/b) = 0.8490

I(X;Y) = H(X) - H(X|Y) = 0.7219 - 0.6897 = 0.0323
         where H(X|Y) = c*H(e/c, g/c) + d*H(f/d, h/d) = 0.6897

I(X;Y) = H(X) + H(Y) - H(X,Y) = 0.7219 + 0.8813 - 1.5710 = 0.0323

Rates derived from the conditional probabilities:

True Positive Rate                 e/a = 0.50
False Negative Rate                f/a = 0.50
False Positive Rate                g/b = 0.25
True Negative Rate                 h/b = 0.75

Positive Predictive Value (PPV)    e/c = 0.3333
1 - PPV                            g/c = 0.6667
Negative Predictive Value (NPV)    h/d = 0.8571
1 - NPV                            f/d = 0.1429
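The four expressions for I(X;Y) above can be checked against each other in a few lines of Python (again keeping the sheet's cell names as variables):

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits."""
    return sum(p * log2(1.0 / p) for p in probs if p > 0)

e, f, g, h = 0.1, 0.1, 0.2, 0.6   # joint p(X, Y)
a, b = e + f, g + h               # marginal P(X)
c, d = e + g, f + h               # marginal P(Y)

H_X, H_Y = entropy([a, b]), entropy([c, d])
H_XY = entropy([e, f, g, h])
H_Y_given_X = H_XY - H_X          # ≈ 0.8490
H_X_given_Y = H_XY - H_Y          # ≈ 0.6897

# Four equivalent expressions for the mutual information
i1 = H_X + H_Y - H_XY
i2 = H_Y - H_Y_given_X
i3 = H_X - H_X_given_Y
# KL divergence between the joint and the product distribution
i4 = sum(p * log2(p / q) for p, q in
         [(e, a * c), (f, a * d), (g, b * c), (h, b * d)])

print(i1, i2, i3, i4)   # all ≈ 0.0323
```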
Copyright Daniel Egger / Attribution 4.0 International (CC BY 4.0)
Venn diagram courtesy of Konrad Voelkel - Wikipedia: https://en.wikipedia.org/wiki/Information_diagram