Working With
Imbalanced Data
What is imbalanced data in the first
place?
In a classification dataset, imbalance means
that the number of samples per class differs
significantly from one class to another.
Even a single class that is much larger or
smaller than the rest can cause imbalance.
Is there enough information in the
data?
Accuracy is not your only friend
Beyond accuracy, there are other metrics you can take into account:
Cohen’s Kappa
ROC Curve
Confusion Matrix
Precision
Recall
F-Score
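To see why accuracy alone can mislead, here is a minimal sketch in plain Python that computes several of the metrics above for a hypothetical classifier that always predicts the majority class. The data, the function names, and the 95/5 class split are assumptions made for illustration only; the ROC curve is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def summarize(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, F1 and Cohen's kappa from hard labels."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Cohen's kappa compares observed agreement with chance agreement.
    p_observed = accuracy
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = ((p_observed - p_chance) / (1 - p_chance)
             if p_chance != 1 else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


# Hypothetical data: 95 negatives, 5 positives, and a model that
# always answers "negative".
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
summary = summarize(y_true, y_pred)
print(summary)
```

In this example accuracy is 0.95, yet recall, F1, and Cohen's kappa are all 0, which shows that the model never finds the minority class.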
Resample
Some of the techniques to consider:
- Oversample (repopulate) the minority classes.
- Undersample the majority classes.
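The two resampling options can be sketched with the standard library alone. This is a minimal illustration using random duplication and random dropping; the function names, the seed, and the 95/5 toy dataset are assumptions, and real projects often use more refined schemes.

```python
import random
from collections import Counter, defaultdict


def _group_by_label(samples, labels):
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    return groups


def random_oversample(samples, labels, seed=0):
    """Duplicate random samples until every class matches the largest one."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = max(len(items) for items in groups.values())
    out_x, out_y = [], []
    for label, items in groups.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_x.extend(items + extra)
        out_y.extend([label] * target)
    return out_x, out_y


def random_undersample(samples, labels, seed=0):
    """Drop random samples until every class matches the smallest one."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = min(len(items) for items in groups.values())
    out_x, out_y = [], []
    for label, items in groups.items():
        out_x.extend(rng.sample(items, target))
        out_y.extend([label] * target)
    return out_x, out_y


# Hypothetical dataset: 95 "neg" samples and 5 "pos" samples.
samples = list(range(100))
labels = ["neg"] * 95 + ["pos"] * 5

over_x, over_y = random_oversample(samples, labels)
under_x, under_y = random_undersample(samples, labels)
print(Counter(over_y), Counter(under_y))
```

Oversampling keeps all the original data but repeats minority samples, while undersampling discards majority samples; resampling is normally applied to the training split only, so that evaluation still reflects the real class distribution.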
Some Algorithms That Are Immune to
Imbalance
Support Vector Machine (a penalized one)
Penalized Models
Penalization adjusts the model so that
mistakes on minority classes cost more
during training.
By analogy, it ensures equity between
classes.
- Penalized SVM
- Penalized Linear Discriminant Analysis (LDA)
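One common way to penalize a model is to weight each class inversely to its frequency and apply that weight inside the loss. The sketch below assumes the frequently used "balanced" heuristic, n_samples / (n_classes * class_count), and a class-weighted hinge loss of the kind a linear SVM minimizes; the function names and the 90/10 toy labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


def weighted_hinge_loss(scores, labels, weights):
    """Mean hinge loss where each sample is scaled by its class weight.

    labels must be -1 or +1; scores are raw decision values.
    """
    total = sum(weights[y] * max(0.0, 1.0 - y * s)
                for s, y in zip(scores, labels))
    return total / len(labels)


# Hypothetical data: 90 negatives, 10 positives, and a model that
# scores every sample as confidently negative.
labels = [-1] * 90 + [1] * 10
scores = [-1.0] * 100

weights = balanced_class_weights(labels)
uniform = {-1: 1.0, 1: 1.0}
print(weights)
print(weighted_hinge_loss(scores, labels, uniform))
print(weighted_hinge_loss(scores, labels, weights))
```

With uniform weights, ignoring the minority class costs little; with balanced weights, the same behaviour is penalized five times more heavily, which pushes training toward a boundary that respects the minority class.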
Is the Problem One-Class
Classification?
When there are both positive and negative
classes, the problem is binary classification.
As a special case, we may only have
information about one class,
for example deciding whether something is an
apple or not using only apple data.
We call this a One-Class Classification problem.
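The apple example can be sketched with a deliberately simple one-class model: learn the centroid of the known class and accept anything no farther from it than the farthest training sample. This is a stand-in chosen for brevity, not a One-Class SVM, and the class name, features (weight in grams, diameter in cm), and values are all hypothetical.

```python
import math


class CentroidOneClass:
    """Accept a sample if it lies within the training radius of the centroid."""

    def fit(self, samples):
        dims = len(samples[0])
        count = len(samples)
        self.center = [sum(s[d] for s in samples) / count for d in range(dims)]
        # Radius = distance of the farthest training sample from the centroid.
        self.radius = max(self._distance(s) for s in samples)
        return self

    def _distance(self, sample):
        return math.sqrt(sum((a - b) ** 2
                             for a, b in zip(sample, self.center)))

    def predict(self, sample):
        """True means 'looks like the training class'."""
        return self._distance(sample) <= self.radius


# Hypothetical apple measurements: (weight in grams, diameter in cm).
# Features are left unscaled here; in practice they should be normalized.
apples = [(150, 7.0), (160, 7.2), (155, 7.1), (148, 6.9), (162, 7.3)]
model = CentroidOneClass().fit(apples)

is_apple_like = model.predict((157, 7.0))   # close to the training apples
is_grape_like = model.predict((5, 2.0))     # far from anything seen
print(is_apple_like, is_grape_like)
```

Only apple data is used for fitting; everything outside the learned region is treated as "not an apple" without ever seeing a negative example.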