Computer systems that learn: classification and prediction methods from statistics, neural nets, machine learning, and expert systems
An Experimental and Theoretical Comparison of Model SelectionMethods
Machine Learning - Special issue on the eighth annual conference on computational learning theory, (COLT '95)
Data mining in metric space: an empirical analysis of supervised learning performance criteria
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Estimation of Dependences Based on Empirical Data: Springer Series in Statistics (Springer Series in Statistics)
Classifier Loss Under Metric Uncertainty
ECML '07 Proceedings of the 18th European conference on Machine Learning
An Improved Model Selection Heuristic for AUC
ECML '07 Proceedings of the 18th European conference on Machine Learning
AUC: a statistically consistent and more discriminating measure than accuracy
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Evolving neural networks with maximum AUC for imbalanced data classification
HAIS'10 Proceedings of the 5th international conference on Hybrid Artificial Intelligence Systems - Volume Part I
Hi-index | 0.00 |
Model selection is an important and ubiquitous task in machine learning. To select models with the best future classification performance measured by a goal metric, an evaluation metricis often used to select the best classification model among the competing ones. A common practice is to use the same goal and evaluation metric. However, in several recent studies, it is claimed that using an evaluation metric (such as AUC) other than the goal metric (such as accuracy) results in better selection of the correct models. In this paper, we point out a flaw in the experimental design of those studies, and propose an improved method to test the claim. Our extensive experiments show convincingly that only the goal metric itself can most reliably select the correct classification models.