On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach

  • Authors: Steven L. Salzberg
  • Affiliations: Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA. E-mail: salzberg@cs.jhu.edu
  • Venue: Data Mining and Knowledge Discovery
  • Year: 1997

Abstract

An important component of many data mining projects is finding a good classification algorithm, a process that requires very careful thought about experimental design. If not done very carefully, comparative studies of classification and other types of algorithms can easily result in statistically invalid conclusions. This is especially true when one is using data mining techniques to analyze very large databases, which inevitably contain some statistically unlikely data. This paper describes several phenomena that can, if ignored, invalidate an experimental comparison. These phenomena and the conclusions that follow apply not only to classification, but to computational experiments in almost any aspect of data mining. The paper also discusses why comparative analysis is more important in evaluating some types of algorithms than others, and provides some suggestions about how to avoid the pitfalls suffered by many experimental studies.
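
One pitfall the abstract alludes to is the multiple-comparisons problem: when many candidate classifiers are tested against a baseline on the same data, some will look significantly better purely by chance. Below is a minimal, hypothetical Python sketch (not from the paper) illustrating this effect, together with a Bonferroni-adjusted threshold, a correction of the kind the paper discusses. All sizes and names here (`N_TEST`, `N_CLASSIFIERS`, and so on) are illustrative assumptions.

```python
# Illustrative sketch of the multiple-comparisons pitfall (assumed setup,
# not code from the paper). Every candidate classifier is constructed to be
# exactly as good as the baseline, so any "significant win" is a false positive.

import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(0)

N_TEST = 1000        # test-set size (assumed)
N_CLASSIFIERS = 100  # number of candidate classifiers compared (assumed)
ALPHA = 0.05         # nominal significance level

false_positives_raw = 0
false_positives_bonferroni = 0

for _ in range(N_CLASSIFIERS):
    # Baseline and candidate are each correct on a test example with p = 0.5,
    # independently, i.e. the candidate is truly no better than the baseline.
    baseline_correct = rng.random(N_TEST) < 0.5
    candidate_correct = rng.random(N_TEST) < 0.5

    # Sign test on the examples where the two classifiers disagree
    # (the idea behind McNemar-style paired comparisons).
    cand_only = int(np.sum(candidate_correct & ~baseline_correct))
    base_only = int(np.sum(baseline_correct & ~candidate_correct))
    n_disagree = cand_only + base_only
    if n_disagree == 0:
        continue
    p = binomtest(cand_only, n_disagree, 0.5, alternative="greater").pvalue

    if p < ALPHA:
        false_positives_raw += 1
    if p < ALPHA / N_CLASSIFIERS:  # Bonferroni-adjusted threshold
        false_positives_bonferroni += 1

print(f"spurious 'wins' at alpha={ALPHA}: {false_positives_raw}")
print(f"spurious 'wins' after Bonferroni adjustment: {false_positives_bonferroni}")
```

With 100 comparisons at a nominal alpha of 0.05, roughly five spurious "wins" are expected without correction, while the Bonferroni-adjusted threshold of 0.05/100 typically yields none.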