Evaluating misclassifications in imbalanced data

Authors:
William Elazmeh;Nathalie Japkowicz;Stan Matwin
Affiliations:
School of Information Technology and Engineering, University of Ottawa, Canada;School of Information Technology and Engineering, University of Ottawa, Canada;School of Information Technology and Engineering, University of Ottawa, Canada
Venue:
ECML'06 Proceedings of the 17th European conference on Machine Learning
Year:
2006

Abstract

Evaluating classifier performance with ROC curves is popular in the machine learning community. To date, the only method for assessing the confidence of ROC curves is to construct ROC bands. Under severe class imbalance, with few instances of the minority class, ROC bands become unreliable. We propose a generic framework for classifier evaluation that identifies a segment of an ROC curve in which misclassifications are balanced. Confidence is measured by Tango's 95% confidence interval for the difference in misclassification between the two classes. We test our method on a two-class problem with severe class imbalance. Our evaluation favors classifiers with low numbers of misclassifications in both classes. Our results show that the proposed evaluation method yields higher confidence than ROC bands.
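
The confidence measure named in the abstract can be illustrated with a minimal sketch of Tango's score-based interval for a difference between paired proportions. The abstract does not give the computation, so the mapping used here is an assumption: b is taken to be the number of false positives, c the number of false negatives, and n the total number of test instances, so that the interval is for (b - c) / n. The function names, the bisection solver, and the example counts are hypothetical and chosen only for illustration.

```python
import math


def tango_score(b, c, n, delta):
    """Tango's score statistic for the hypothesis that the true difference is delta.

    b, c: discordant counts (assumed here: false positives, false negatives).
    n:    total number of instances.
    """
    A = 2.0 * n
    B = -b - c + (2.0 * n - b + c) * delta
    C = -c * delta * (1.0 - delta)
    # Constrained maximum-likelihood estimate of the cell probability for c.
    q = (math.sqrt(max(B * B - 4.0 * A * C, 0.0)) - B) / (2.0 * A)
    var = n * (2.0 * q + delta * (1.0 - delta))
    num = b - c - n * delta
    if var <= 0.0:
        # Degenerate variance at the edge of the parameter space.
        return math.copysign(math.inf, num) if num != 0 else 0.0
    return num / math.sqrt(var)


def tango_interval(b, c, n, z=1.959964):
    """Approximate 95% interval for (b - c) / n, found by bisection on the score."""
    d_hat = (b - c) / n

    def solve(target, lo, hi):
        # The score decreases as delta grows, so move lo right while score > target.
        for _ in range(200):
            mid = (lo + hi) / 2.0
            if tango_score(b, c, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    lower = solve(+z, -1.0, d_hat)
    upper = solve(-z, d_hat, 1.0)
    return lower, upper


# Hypothetical counts: 12 false positives, 3 false negatives, 200 test instances.
lower, upper = tango_interval(12, 3, 200)
print(f"difference = {(12 - 3) / 200:.3f}, interval = ({lower:.4f}, {upper:.4f})")
```

Under this reading, an interval that contains zero would indicate that the two kinds of misclassification are not significantly unbalanced at that operating point. Bisection is used only to keep the sketch dependency-free; any one-dimensional root finder would serve.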