AUC: a better measure than accuracy in comparing learning algorithms
AI'03 Proceedings of the 16th Canadian society for computational studies of intelligence conference on Advances in artificial intelligence
Predictive accuracy has been used as the main, and often only, criterion for evaluating the predictive performance of classification learning algorithms. In recent years, the area under the ROC (Receiver Operating Characteristic) curve, or simply AUC, has been proposed as an alternative single-number measure for evaluating learning algorithms. In this paper, we prove that AUC is a better measure than accuracy. More specifically, we present rigorous definitions of consistency and discriminancy for comparing two evaluation measures for learning algorithms. We then present empirical evaluations and a formal proof establishing that AUC is indeed statistically consistent with, and more discriminating than, accuracy. This result is significant because, to our knowledge, it is the first formal proof that AUC is a better measure than accuracy for evaluating learning algorithms.
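To make the discriminancy claim concrete, the sketch below (illustrative code, not from the paper) computes accuracy at a fixed 0.5 threshold and AUC via the Wilcoxon-Mann-Whitney statistic on a hypothetical toy dataset. The two score vectors `scores_a` and `scores_b` are chosen so that both classifiers have the same accuracy, yet AUC still separates them because it is sensitive to the full ranking rather than only to which side of the threshold each score falls.

```python
def auc(labels, scores):
    """AUC as the Wilcoxon-Mann-Whitney statistic: the fraction of
    (positive, negative) pairs ranked correctly, ties counting half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Accuracy after thresholding scores into hard 0/1 predictions."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Toy example (hypothetical): three positives followed by three negatives.
labels = [1, 1, 1, 0, 0, 0]
# Both classifiers misclassify the same two examples at threshold 0.5,
# so their accuracies are identical (4/6), but classifier A ranks the
# misclassified positive higher than B does, so its AUC is higher.
scores_a = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]   # AUC = 8/9
scores_b = [0.9, 0.8, 0.1, 0.6, 0.3, 0.2]   # AUC = 6/9

print(accuracy(labels, scores_a), accuracy(labels, scores_b))  # equal
print(auc(labels, scores_a), auc(labels, scores_b))            # differ
```

Accuracy alone cannot distinguish the two classifiers here, while AUC can — a small instance of the "more discriminating" property the paper proves in general.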