This paper focuses on wrapper-based feature selection for a 1-nearest-neighbor (1-NN) classifier. We consider in particular the case of small sample sizes of a few hundred instances, which is common in biomedical applications. We propose a technique for calculating the complete bootstrap for a 1-nearest-neighbor classifier (i.e., averaging over all desired test/train partitions of the data). The complete bootstrap and the complete cross-validation error estimate, both of which have lower variance, are applied as novel selection criteria and compared with the standard bootstrap and cross-validation in combination with three optimization techniques - sequential forward selection (SFS), binary particle swarm optimization (BPSO), and simplified social impact theory-based optimization (SSITO). The experimental comparison on ten datasets leads to the following conclusions: for all three search methods examined here, the complete criteria are a significantly better choice than standard 2-fold cross-validation, 10-fold cross-validation, and bootstrap with 50 trials, irrespective of the selected number of search iterations. All the complete criterion-based 1-NN wrappers with SFS search performed better than the widely used FILTER and SIMBA methods. We also demonstrate the benefits and properties of our approaches on an important and novel real-world application: automatic detection of the subthalamic nucleus.
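To make the wrapper setup concrete, the sketch below shows a standard bootstrap error estimate for a 1-NN classifier driving sequential forward selection. This is a rough illustration only: it uses the ordinary out-of-bag bootstrap with a fixed number of trials, not the paper's complete-bootstrap criterion (which averages over all test/train partitions), and all function names and parameters are illustrative assumptions rather than the authors' implementation.

```python
import random

def one_nn_predict(train_X, train_y, x):
    # 1-NN: return the label of the closest training point
    # under squared Euclidean distance.
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]

def bootstrap_error(X, y, feats, trials=20, seed=0):
    # Standard bootstrap error estimate for 1-NN on a feature subset:
    # resample n instances with replacement, classify the out-of-bag
    # points, and average the error over `trials` resamples.
    rng = random.Random(seed)  # fixed seed: same partitions for every subset
    n = len(X)
    Xf = [[row[j] for j in feats] for row in X]
    errs = []
    for _ in range(trials):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]
        if not oob:
            continue
        tX = [Xf[i] for i in idx]
        ty = [y[i] for i in idx]
        wrong = sum(one_nn_predict(tX, ty, Xf[i]) != y[i] for i in oob)
        errs.append(wrong / len(oob))
    return sum(errs) / len(errs)

def sfs(X, y, trials=20):
    # Sequential forward selection: greedily add the feature that most
    # lowers the bootstrap 1-NN error; stop when no candidate improves it.
    remaining = set(range(len(X[0])))
    selected, best_err = [], 1.0
    while remaining:
        cand, err = min(((f, bootstrap_error(X, y, selected + [f], trials))
                         for f in sorted(remaining)), key=lambda t: t[1])
        if err >= best_err:
            break
        selected.append(cand)
        remaining.remove(cand)
        best_err = err
    return selected, best_err
```

BPSO and SSITO would plug into the same criterion by scoring candidate feature masks with `bootstrap_error`; swapping in a lower-variance complete criterion changes only the scoring function, not the search.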