This paper focuses on wrapper-based feature selection for a 1-nearest-neighbor (1-NN) classifier. We consider in particular the case of small sample sizes of a few hundred instances, which is common in biomedical applications. We propose a technique for calculating the complete bootstrap for a 1-nearest-neighbor classifier (i.e., averaging over all desired test/train partitions of the data). The complete bootstrap and the complete cross-validation error estimate, both of which have lower variance, are applied as novel selection criteria and compared with the standard bootstrap and cross-validation in combination with three optimization techniques - sequential forward selection (SFS), binary particle swarm optimization (BPSO), and simplified social impact theory-based optimization (SSITO). The experimental comparison on ten datasets leads to the following conclusions: for all three search methods examined here, the complete criteria are a significantly better choice than standard 2-fold cross-validation, 10-fold cross-validation, and bootstrap with 50 trials, irrespective of the selected number of search iterations. All the complete criterion-based 1-NN wrappers with SFS search performed better than the widely used FILTER and SIMBA methods. We also demonstrate the benefits and properties of our approaches on an important and novel real-world application: automatic detection of the subthalamic nucleus.
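To make the wrapper setup concrete, the sketch below shows a standard bootstrap error estimate for a 1-NN classifier driving sequential forward selection. This is a rough illustration only: it uses the ordinary out-of-bag bootstrap with a fixed number of trials, not the paper's complete-bootstrap criterion (which averages over all test/train partitions), and all function names and parameters are illustrative assumptions rather than the authors' implementation.

```python
import random

def one_nn_predict(train_X, train_y, x):
    # 1-NN: return the label of the closest training point
    # under squared Euclidean distance.
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]

def bootstrap_error(X, y, feats, trials=20, seed=0):
    # Standard bootstrap error estimate for 1-NN on a feature subset:
    # resample n instances with replacement, classify the out-of-bag
    # points, and average the error over `trials` resamples.
    rng = random.Random(seed)  # fixed seed: same partitions for every subset
    n = len(X)
    Xf = [[row[j] for j in feats] for row in X]
    errs = []
    for _ in range(trials):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]
        if not oob:
            continue
        tX = [Xf[i] for i in idx]
        ty = [y[i] for i in idx]
        wrong = sum(one_nn_predict(tX, ty, Xf[i]) != y[i] for i in oob)
        errs.append(wrong / len(oob))
    return sum(errs) / len(errs)

def sfs(X, y, trials=20):
    # Sequential forward selection: greedily add the feature that most
    # lowers the bootstrap 1-NN error; stop when no candidate improves it.
    remaining = set(range(len(X[0])))
    selected, best_err = [], 1.0
    while remaining:
        cand, err = min(((f, bootstrap_error(X, y, selected + [f], trials))
                         for f in sorted(remaining)), key=lambda t: t[1])
        if err >= best_err:
            break
        selected.append(cand)
        remaining.remove(cand)
        best_err = err
    return selected, best_err
```

BPSO and SSITO would plug into the same criterion by scoring candidate feature masks with `bootstrap_error`; swapping in a lower-variance complete criterion changes only the scoring function, not the search.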