Comparing Naive Bayes, Decision Trees, and SVM with AUC and Accuracy

Authors:
Jin Huang;Jingjing Lu;Charles X. Ling
Venue:
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Year:
2003

Abstract

Predictive accuracy has often been used as the main and often only evaluation criterion for the predictive performance of classification or data mining algorithms. In recent years, the area under the ROC (Receiver Operating Characteristics) curve, or simply AUC, has been proposed as an alternative single-number measure for evaluating performance of learning algorithms. In our previous work, we proved that AUC is, in general, a better measure (defined precisely) than accuracy. Many popular data mining algorithms should then be re-evaluated in terms of AUC. For example, it is well accepted that Naive Bayes and decision trees are very similar in accuracy. How do they compare in AUC? Also, how does the recently developed SVM (Support Vector Machine) compare to traditional learning algorithms in accuracy and AUC? We will answer these questions in this paper. Our conclusions will provide important guidelines in data mining applications on real-world datasets.
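To make the distinction between the two measures concrete, the sketch below computes accuracy and AUC for two hypothetical classifiers on a toy binary problem. It is only an illustration and not the paper's experimental code: the function names, the 0.5 decision threshold, and the labels and scores are all assumptions chosen for the example. AUC is computed with the standard pairwise (Mann-Whitney) formulation, i.e. the fraction of positive-negative pairs in which the positive example receives the higher score, with ties counted as one half.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of examples classified correctly at a fixed threshold.

    The 0.5 default threshold is an assumption made for this sketch.
    """
    correct = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return correct / len(labels)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Uses the pairwise (Mann-Whitney) formulation; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration): three positives, then three negatives.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]    # hypothetical classifier A
scores_b = [0.6, 0.55, 0.1, 0.9, 0.2, 0.15]  # hypothetical classifier B

acc_a, acc_b = accuracy(labels, scores_a), accuracy(labels, scores_b)
auc_a, auc_b = auc(labels, scores_a), auc(labels, scores_b)

print(f"A: accuracy={acc_a:.3f} AUC={auc_a:.3f}")
print(f"B: accuracy={acc_b:.3f} AUC={auc_b:.3f}")
```

On this toy data both classifiers make the same two mistakes at the threshold, so their accuracies are identical, yet A ranks positives above negatives far more consistently than B and therefore has the higher AUC. This is the kind of difference that accuracy alone cannot show.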