Given a data set and a number of supervised learning algorithms, we would like to find the algorithm with the smallest expected error. Existing pairwise tests allow a comparison of only two algorithms; range tests and ANOVA check whether multiple algorithms have the same expected error but cannot identify the one with the smallest. We propose a methodology, the MultiTest algorithm, whereby we order supervised learning algorithms taking into account 1) the results of pairwise statistical tests on expected error (what the data tells us), and 2) our prior preferences, e.g., due to complexity. We define the problem in graph-theoretic terms and propose an algorithm to find the "best" learning algorithm in terms of these two criteria or, in the more general case, to order learning algorithms in terms of their "goodness." Simulation results using five classification algorithms on 30 data sets indicate the utility of the method. The proposed method can be generalized to regression and other loss functions by using a suitable pairwise test.
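The abstract suggests a tournament-style construction: run a one-sided pairwise test for each pair of algorithms, draw a directed "preferred over" edge according to the outcome, and break statistical ties with the prior preference. Below is a minimal Python sketch of that idea, not the exact MultiTest procedure from the paper; the function names (`order_algorithms`, `test`) and the tie-breaking and cycle-handling rules are illustrative assumptions.

```python
from itertools import combinations

def order_algorithms(errors, prior_rank, test, alpha=0.05):
    """Order K learning algorithms from best to worst (a sketch, not MultiTest).

    errors[i]     : list of validation error rates of algorithm i
    prior_rank[i] : prior preference of algorithm i (lower = preferred,
                    e.g., a simpler model)
    test(x, y)    : p-value of the one-sided H0 "mean error of x <= mean error of y";
                    p < alpha means x has significantly HIGHER error than y
    """
    K = len(errors)
    edges = set()  # (i, j) in edges means "i is preferred over j"
    for i, j in combinations(range(K), 2):
        if test(errors[i], errors[j]) < alpha:     # i significantly worse than j
            edges.add((j, i))
        elif test(errors[j], errors[i]) < alpha:   # j significantly worse than i
            edges.add((i, j))
        elif prior_rank[i] <= prior_rank[j]:       # statistical tie: use prior
            edges.add((i, j))
        else:
            edges.add((j, i))

    # Repeatedly pick an undominated node (no remaining node is preferred
    # over it); the pairwise outcomes may form a cycle, in which case we
    # fall back to the prior preference -- an assumption of this sketch.
    remaining, order = set(range(K)), []
    while remaining:
        undominated = [n for n in remaining
                       if not any((m, n) in edges for m in remaining)]
        pool = undominated if undominated else remaining
        best = min(pool, key=lambda n: prior_rank[n])
        order.append(best)
        remaining.remove(best)
    return order  # order[0] is the "best" algorithm under these criteria
```

As the pairwise test, a paired t-test over cross-validation folds could serve, e.g. `test = lambda x, y: scipy.stats.ttest_rel(x, y, alternative='greater').pvalue` (SciPy >= 1.6), provided the error vectors of all algorithms come from the same folds; any suitable pairwise test, including one for regression losses, could be substituted.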