We present a statistical analysis of the AUC as an evaluation criterion for classification scoring models. First, we consider significance tests for the difference between AUC scores of two algorithms on the same test set. We derive exact moments under simplifying assumptions and use them to examine approximate practical methods from the literature. We then compare AUC to empirical misclassification error when the prediction goal is to minimize future error rate. We show that the AUC may be preferable to empirical error even in this case and discuss the tradeoff between approximation error and estimation error underlying this phenomenon.
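To make the quantities in the abstract concrete, the sketch below computes the AUC of a scoring model as the Wilcoxon-Mann-Whitney statistic, the empirical misclassification error at a fixed threshold, and an interval for the difference between the AUC scores of two models evaluated on the same test set. It is an illustration only: the paired percentile bootstrap is a generic stand-in and is not the exact-moment analysis the abstract describes, and the function names, the 0.5 threshold, and the 95% level are all assumptions introduced here.

```python
import random


def auc(scores, labels):
    """AUC as the Wilcoxon-Mann-Whitney statistic.

    Fraction of (positive, negative) pairs in which the positive example
    receives the higher score; ties count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def error_rate(scores, labels, threshold=0.5):
    """Empirical misclassification error when predicting 1 iff score >= threshold.

    The threshold value is an assumption of this sketch.
    """
    wrong = sum((s >= threshold) != (y == 1) for s, y in zip(scores, labels))
    return wrong / len(labels)


def paired_bootstrap_auc_diff(scores_a, scores_b, labels, n_boot=2000, seed=0):
    """95% percentile bootstrap interval for AUC(a) - AUC(b).

    Test examples are resampled jointly, so both models are always scored
    on the same resampled set. This is a generic stand-in, not the
    method analysed in the paper.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample has only one class
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(a, ys) - auc(b, ys))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.6, 0.1]
    print(auc(scores, labels))         # 3 of 4 pairs ordered correctly: 0.75
    print(error_rate(scores, labels))  # 2 of 4 examples misclassified: 0.5
```

On the toy data, one misordered pair costs the AUC a quarter of its range while the thresholded error sits at 0.5, which illustrates that the two criteria summarize the same scores differently. An interval from `paired_bootstrap_auc_diff` that excludes zero would suggest a difference between the two models under the assumptions above.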