This study investigates the performance of bagging when learning from imbalanced medical data. Highly accurate prediction models matter to data miners, and especially so in imbalanced medical applications, where practitioners care more about the minority class than the majority class. Yet a traditional supervised learning algorithm rarely predicts the minority class well, even when it scores well on the most commonly used evaluation metric, Accuracy. Bagging is a simple yet effective ensemble method that has been applied to many real-world problems, but several questions remain open: does bagging outperform single learners on medical data-sets; which learners are the best predictors for each medical data-set; and what is the best predictive performance achievable on each medical data-set when sampling techniques are applied? We conduct an extensive empirical study of 12 learning algorithms on 8 medical data-sets, evaluated with four performance measures: True Positive Rate (TPR), True Negative Rate (TNR), the Geometric Mean (G-mean) of the accuracy rates of the majority and minority classes, and Accuracy. In addition, statistical analyses instil confidence in the validity of the conclusions of this research.
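A minimal sketch of the four evaluation measures named above, computed from confusion-matrix counts with the minority class taken as the positive class. The function name and the example counts are illustrative assumptions, not taken from the study; the formulas themselves are the standard definitions of TPR, TNR, G-mean, and Accuracy.

```python
import math

def imbalance_metrics(tp, fn, tn, fp):
    """Compute TPR, TNR, G-mean, and Accuracy from confusion-matrix
    counts, treating the minority class as the positive class."""
    tpr = tp / (tp + fn)             # True Positive Rate (minority-class accuracy)
    tnr = tn / (tn + fp)             # True Negative Rate (majority-class accuracy)
    g_mean = math.sqrt(tpr * tnr)    # geometric mean of the two class accuracies
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return tpr, tnr, g_mean, accuracy

# Illustrative case: a classifier that labels everything as the
# majority class on a 90/10 split.
tpr, tnr, g_mean, acc = imbalance_metrics(tp=0, fn=10, tn=90, fp=0)
print(tpr, tnr, g_mean, acc)  # 0.0 1.0 0.0 0.9
```

The example shows why Accuracy alone is misleading on skewed data: the degenerate classifier reaches 0.9 Accuracy while never detecting the minority class, and G-mean correctly collapses to zero.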