3.2 Performance Measure

Traditional performance measures, such as overall accuracy, are inappropriate for evaluating models trained on skewed data. Therefore, we employ Receiver Operating Characteristic curves [7], or ROC curves, to evaluate the models constructed in this study. ROC curves graph the true positive rate on the y-axis versus the false positive rate on the x-axis. The resulting curve illustrates the trade-off between detection rate and false alarm rate. Traditional performance metrics consider only the default decision threshold, whereas ROC curves illustrate the performance across all decision thresholds. For a single numeric measure, the area under the ROC curve (AUC) is widely used, providing a general idea of the predictive potential of the classifier.
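Because the AUC summarizes ranking quality across all thresholds, it can be computed directly from classifier scores as the probability that a randomly chosen positive example is scored above a randomly chosen negative one (ties counting one half). The following minimal Python sketch illustrates this rank-based formulation; it is illustrative only, not the evaluation code used in this study:

    def auc(labels, scores):
        # AUC as a rank statistic: the fraction of (positive, negative)
        # pairs in which the positive example receives the higher score,
        # with ties counted as one half.
        pos = [s for y, s in zip(labels, scores) if y == 1]
        neg = [s for y, s in zip(labels, scores) if y == 0]
        wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
        return wins / (len(pos) * len(neg))

    # Example: three positives, three negatives, one misordered pair.
    print(auc([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]))  # 0.888...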
3.3 Design Summary

All experiments were performed using 10-fold cross validation. That is, the datasets were split into ten partitions, with each model trained on nine of the partitions and evaluated on the remaining one, so that every example is used for testing exactly once.
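The protocol can be sketched as follows; the partitioning scheme and the train_and_score callback are our own illustrative assumptions, not the authors' code:

    import random

    def ten_fold_cv(examples, train_and_score, k=10, seed=0):
        # Shuffle once, then carve the index list into k disjoint folds.
        idx = list(range(len(examples)))
        random.Random(seed).shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        scores = []
        for i in range(k):
            test = [examples[j] for j in folds[i]]
            train = [examples[j] for f in folds[:i] + folds[i + 1:] for j in f]
            # train_and_score (hypothetical) builds a model on train and
            # returns its performance, e.g. AUC, on test.
            scores.append(train_and_score(train, test))
        return sum(scores) / k  # mean performance over the ten folds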
4 Results

Table 2 shows the performance (measured using AUC) of AdaBoost (AdaB), SMOTEBoost (SmoteB) and RUSBoost (RusB) on all seven of the datasets used in our experiments, as well as the results without boosting (None). Also included is the average performance across all seven datasets. For each dataset, the best performing algorithm is indicated by a bold value. Table 2 shows that for all datasets, the best performing technique is always either RUSBoost or SMOTEBoost. A t-test was performed to compare the performances of RUSBoost and SMOTEBoost. An underlined value in the RUSBoost column indicates that RUSBoost performed significantly better than SMOTEBoost at the α = 5% significance level. Similarly, an underlined value in the SMOTEBoost column indicates that its performance was significantly better than RUSBoost.

Simple boosting (AdaBoost) is very effective at improving classification performance when learning from imbalanced data. On average, using AdaBoost improves the performance of RIPPER from 0.6805 to 0.8598, an increase of over 25%. The improvement is greatest on the PC1 and SolarFlare datasets (45% and 55%, respectively). Clearly, boosting has great potential for improving the performance of classifiers built on imbalanced data. While AdaBoost successfully improves classification performance in all of our experiments, both RUSBoost and SMOTEBoost perform better than AdaBoost on every dataset used in this study.

The overall performance of RUSBoost (the average performance across all seven datasets) is significantly better than that of SMOTEBoost. While SMOTEBoost improves the average performance across all seven datasets from 0.6805 (without boosting) to 0.8816, RUSBoost performs even better, achieving an AUC of 0.8991. As indicated by the underlined value, the performance of RUSBoost is significantly better than that of SMOTEBoost at the α = 5% significance level.
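A test of this kind can be sketched as follows, assuming per-fold AUC values and a paired design; the pairing and the function name are our assumptions rather than a description of the exact procedure used in this study:

    from scipy import stats

    def significantly_better(auc_a, auc_b, alpha=0.05):
        # auc_a, auc_b: per-fold AUCs of two methods on the same CV folds.
        # A paired t-test asks whether the mean per-fold difference is
        # nonzero; method A wins if the difference is significant and its
        # mean AUC is the higher of the two.
        t_stat, p_value = stats.ttest_rel(auc_a, auc_b)
        return p_value < alpha and t_stat > 0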
For three of the seven datasets, SMOTEBoost resulted in a higher AUC than RUSBoost. However, only one of those three datasets results in a statistically significant difference in performance. While SMOTEBoost results in slightly better performance on the CCCS12 and PC1 datasets, these differences are not significant. SatImage is the only dataset where SMOTEBoost significantly outperforms RUSBoost (0.9539 vs. 0.9450).

On the other hand, RUSBoost resulted in a higher AUC than SMOTEBoost for four of the seven datasets. In all but one of these datasets, the difference in performance was statistically significant. For the Ecoli dataset, RUSBoost performed slightly better than SMOTEBoost, but the difference was not significant. For the Mammography, SolarFlare and SP3 datasets, RUSBoost significantly outperformed SMOTEBoost. The largest difference in performance between RUSBoost and SMOTEBoost occurred on the SP3 dataset, where RUSBoost achieved an AUC of 0.7871 compared to SMOTEBoost's 0.7096, an improvement of nearly 11%. In summary, RUSBoost significantly outperformed SMOTEBoost on three datasets, while SMOTEBoost significantly outperformed RUSBoost on only one.
5 Conclusion

In this work we present RUSBoost, a new technique for learning from skewed datasets. RUSBoost combines random undersampling with boosting, resulting in improved classification performance when training data is imbalanced. RUSBoost is compared to the SMOTEBoost algorithm proposed by Chawla et al. [3]. Instead of performing a complex oversampling algorithm (SMOTE), we utilize a simple undersampling technique (random undersampling) that has been shown to result in excellent performance when training models on imbalanced data. The result, RUSBoost, is a simple alternative to SMOTEBoost which is computationally less expensive, results in faster model training times, and provides competitive performance.
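To make this combination concrete, the sketch below pairs an AdaBoost-style weight update with random undersampling of the majority class inside each boosting round. It is a hedged illustration under our own assumptions (NumPy arrays, decision stumps as weak learners, a binary 0/1 encoding with 1 as the minority class, and a 1:1 post-sampling class ratio), not the exact algorithm specified in this paper:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def rusboost(X, y, rounds=10, seed=0):
        # X, y: NumPy arrays; y in {0, 1} with 1 the rare minority class.
        rng = np.random.default_rng(seed)
        w = np.full(len(y), 1.0 / len(y))              # boosting weights
        minority = np.where(y == 1)[0]
        majority = np.where(y == 0)[0]
        learners, alphas = [], []
        for _ in range(rounds):
            # Random undersampling: keep every minority example and draw
            # an equal-sized random subset of the majority class.
            sampled = rng.choice(majority, size=len(minority), replace=False)
            idx = np.concatenate([minority, sampled])
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X[idx], y[idx], sample_weight=w[idx])
            pred = stump.predict(X)
            err = np.sum(w * (pred != y))              # weighted error, full set
            if err == 0 or err >= 0.5:                 # degenerate round; stop
                break
            alpha = 0.5 * np.log((1.0 - err) / err)
            # Raise weights of misclassified examples, lower the rest.
            w *= np.exp(alpha * np.where(pred != y, 1.0, -1.0))
            w /= w.sum()
            learners.append(stump)
            alphas.append(alpha)
        return learners, alphas

    def rusboost_predict(learners, alphas, X):
        # Weighted vote of the weak learners, mapped to {-1, +1}.
        votes = np.zeros(len(X))
        for stump, alpha in zip(learners, alphas):
            votes += alpha * np.where(stump.predict(X) == 1, 1.0, -1.0)
        return (votes > 0).astype(int)

Because each round trains its weak learner on a small balanced sample rather than a synthetically enlarged one, far less data passes through the base learner than under SMOTEBoost, which is the source of the training-time advantage noted above.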
Our results show that RUSBoost performs as well as or better than SMOTEBoost on six of the seven training datasets used in this study. SMOTEBoost significantly outperformed RUSBoost on only one of the seven datasets, while RUSBoost performed significantly better than SMOTEBoost on three of the datasets. Overall, RUSBoost significantly outperformed SMOTEBoost. RUSBoost achieves this superior performance while offering reduced computational complexity and training times compared to SMOTEBoost. We highly recommend RUSBoost as an alternative to SMOTEBoost when constructing learners from imbalanced datasets.

Future work will continue to investigate the performance of RUSBoost. We will evaluate RUSBoost using additional learners and performance metrics, as well as additional datasets. Further, we will compare the performance of RUSBoost to other boosting algorithms designed to address the class imbalance problem.

References

[1] A. Asuncion and D. Newman. UCI machine learning repository. http://www.ics.uci.edu/~mlearn/MLRepository.html, 2007. University of California, Irvine, School of Information and Computer Sciences.
[2] N. V. Chawla, L. O. Hall, K. W. Bowyer, and W. P. Kegelmeyer. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16:321–357, 2002.
[3] N. V. Chawla, A. Lazarevic, L. O. Hall, and K. Bowyer. SMOTEBoost: improving prediction of the minority class in boosting. In Proceedings of Principles of Knowledge Discovery in Databases, 2003.
[4] W. W. Cohen. Fast effective rule induction. In Proceedings of the 12th International Conference on Machine Learning, pages 115–123. Morgan Kaufmann, 1995.
[5] C. Drummond and R. C. Holte. C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling. In Workshop on Learning from Imbalanced Data Sets II, International Conference on Machine Learning, 2003.
[6] Y. Freund and R. Schapire. Experiments with a new boosting algorithm. In Proceedings of the Thirteenth International Conference on Machine Learning, pages 148–156, 1996.
[7] F. Provost and T. Fawcett. Robust classification for imprecise environments. Machine Learning, 42:203–231, 2001.
[8] J. Van Hulse, T. M. Khoshgoftaar, and A. Napolitano. Experimental perspectives on learning from imbalanced data. In Proceedings of the 24th International Conference on Machine Learning, pages 935–942, Corvallis, OR, USA, June 2007.
[9] G. M. Weiss. Mining with rarity: a unifying framework. SIGKDD Explorations, 6(1):7–19, 2004.