Proper Model Selection with Significance Test

  • Authors:
  • Jin Huang; Charles X. Ling; Harry Zhang; Stan Matwin

  • Affiliations:
  • School of Information Tech. and Eng., University of Ottawa, Canada; Department of Computer Science, The University of Western Ontario, Canada; Faculty of Computer Science, University of New Brunswick, Canada; School of Information Tech. and Eng., University of Ottawa, Canada

  • Venue:
  • ECML PKDD '08: Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
  • Year:
  • 2008

Abstract

Model selection is an important and ubiquitous task in machine learning. To select the model with the best future classification performance as measured by a goal metric, an evaluation metric is often used to choose the best classification model among the competing ones. A common practice is to use the same metric as both goal and evaluation metric. However, several recent studies claim that using an evaluation metric (such as AUC) other than the goal metric (such as accuracy) results in better selection of the correct models. In this paper, we point out a flaw in the experimental design of those studies and propose an improved method to test the claim. Our extensive experiments show convincingly that only the goal metric itself can most reliably select the correct classification models.
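
To make the distinction between goal and evaluation metric concrete, below is a minimal sketch (not from the paper) of the model-selection setting the abstract describes: several candidate classifiers are compared by a cross-validated evaluation metric (accuracy or AUC), and the selected model's future performance is then measured with the goal metric (here accuracy). The dataset, candidate models, and cross-validation settings are illustrative assumptions, not the authors' experimental setup.

```python
# Hedged sketch of model selection with a goal metric vs. an evaluation metric.
# All concrete choices (data, candidates, CV folds) are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "nb": GaussianNB(),
}

goal_metric = "accuracy"                       # metric we ultimately care about
evaluation_metrics = ["accuracy", "roc_auc"]   # metrics used to pick a model

for eval_metric in evaluation_metrics:
    # Select the candidate with the best cross-validated evaluation-metric score.
    scores = {
        name: cross_val_score(model, X_train, y_train, cv=5,
                              scoring=eval_metric).mean()
        for name, model in candidates.items()
    }
    best_name = max(scores, key=scores.get)
    # Measure the selected model's "future" performance with the goal metric.
    best_model = candidates[best_name].fit(X_train, y_train)
    test_acc = accuracy_score(y_test, best_model.predict(X_test))
    print(f"selected by {eval_metric}: {best_name}, "
          f"test {goal_metric} = {test_acc:.3f}")
```

The paper's claim concerns exactly this comparison: whether selecting by an evaluation metric different from the goal metric (e.g., AUC when accuracy is the goal) ever yields a better model under the goal metric.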