Research into learning from imbalanced data has increasingly captured the attention of both academia and industry, especially when the class distribution is highly skewed. This paper compares the Area Under the Receiver Operating Characteristic Curve (AUC) performance of bagging at different levels of class imbalance. Despite the popularity of bagging in many real-world applications, existing research has not clearly answered some questions, e.g., which bagging predictors achieve the best performance, and whether bagging remains superior to single learners as the class distribution changes. We comprehensively evaluate the AUC performance of bagging predictors with 12 base learners at different levels of class imbalance, which we produce by applying a sampling technique to 14 imbalanced data sets. Our experimental results indicate that Decision Table (DTable) and RepTree are the learning algorithms with the best bagging AUC performance. For most base learners, the AUC of the bagging predictor is statistically superior to that of the single learner; the exceptions are Support Vector Machines (SVM) and Decision Stump (DStump).
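The comparison described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' experimental setup: the synthetic two-Gaussian data, the helper names (`auc`, `set_imbalance`, `fit_stump`, `fit_bagging`, `run_demo`), the minority fractions, and the number of bags are all assumptions made for illustration, and a decision stump is used as the base learner only because it is short to write. The sketch varies the minority fraction by sampling, then reports the test AUC of a single learner and of its bagged version.

```python
import random

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def set_imbalance(X, y, minority_fraction, rng):
    """Keep every majority (0) example and sample minority (1) examples so that
    the minority class makes up roughly `minority_fraction` of the result."""
    maj = [i for i, yi in enumerate(y) if yi == 0]
    mino = [i for i, yi in enumerate(y) if yi == 1]
    k = max(1, round(minority_fraction * len(maj) / (1 - minority_fraction)))
    keep = maj + rng.sample(mino, min(k, len(mino)))
    return [X[i] for i in keep], [y[i] for i in keep]

def fit_stump(X, y):
    """One-split decision stump; each side scores with its fraction of positives."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            pl, pr = sum(left) / len(left), sum(right) / len(right)
            impurity = len(left) * pl * (1 - pl) + len(right) * pr * (1 - pr)
            if best is None or impurity < best[0]:
                best = (impurity, j, t, pl, pr)
    if best is None:  # no valid split: fall back to a constant score
        p = sum(y) / len(y)
        return lambda row: p
    _, j, t, pl, pr = best
    return lambda row: pl if row[j] <= t else pr

def fit_bagging(X, y, n_bags, rng):
    """Train one stump per bootstrap sample and average their scores."""
    n = len(X)
    models = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: sum(m(row) for m in models) / len(models)

def make_data(n_per_class, rng):
    """Synthetic data (assumption): two overlapping 2-D Gaussian classes."""
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_per_class)]
    X += [[rng.gauss(1.5, 1), rng.gauss(1.5, 1)] for _ in range(n_per_class)]
    return X, [0] * n_per_class + [1] * n_per_class

def run_demo(seed=0):
    """Return {minority_fraction: (single-learner AUC, bagged AUC)} on a fixed test set."""
    rng = random.Random(seed)
    X_train, y_train = make_data(150, rng)
    X_test, y_test = make_data(150, rng)
    results = {}
    for frac in (0.5, 0.25, 0.1):
        Xs, ys = set_imbalance(X_train, y_train, frac, rng)
        single = fit_stump(Xs, ys)
        bagged = fit_bagging(Xs, ys, 15, rng)
        results[frac] = (auc([single(r) for r in X_test], y_test),
                         auc([bagged(r) for r in X_test], y_test))
    return results

if __name__ == "__main__":
    for frac, (a_single, a_bag) in run_demo().items():
        print(f"minority fraction {frac}: single AUC={a_single:.3f}, bagged AUC={a_bag:.3f}")
```

The numbers printed by this toy run say nothing about the 14 data sets or 12 base learners evaluated in the paper; the sketch only shows the shape of the procedure (resample to a target class distribution, train single and bagged learners, compare AUC).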