Less biased measurement of feature selection benefits

Authors:
Juha Reunanen
Affiliations:
ABB, Web Imaging Systems, Helsinki, Finland
Venue:
SLSFS'05 Proceedings of the 2005 international conference on Subspace, Latent Structure and Feature Selection
Year:
2005

References:

Pattern recognition: statistical, structural and neural approaches
C4.5: programs for machine learning
Floating search methods in feature selection. Pattern Recognition Letters
Adaptive floating search methods in feature selection. Pattern Recognition Letters, Special issue on pattern recognition in practice VI
Multiple Comparisons in Induction Algorithms. Machine Learning
An introduction to variable and feature selection. The Journal of Machine Learning Research
Overfitting in making comparisons between variable selection methods. The Journal of Machine Learning Research
A Direct Method of Nonparametric Measurement Selection. IEEE Transactions on Computers
A study of cross-validation and bootstrap for accuracy estimation and model selection. IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence, Volume 2

Abstract

In feature selection, classification accuracy typically needs to be estimated in order to guide the search towards useful subsets. It has been shown earlier [1] that such estimates should not be used directly to determine the optimal subset size or the benefit of choosing the optimal subset. The reason is a phenomenon called overfitting, because of which these estimates tend to be biased. Previously, an outer loop of cross-validation has been suggested to counter this problem. However, this paper points out that a straightforward implementation of such an approach still gives biased estimates of the increase in accuracy that could be obtained by selecting the best-performing subset. In addition, two methods are suggested that circumvent this problem and give virtually unbiased results while adding almost no computational overhead.
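
The overfitting effect and the outer cross-validation loop mentioned in the abstract can be illustrated with a small sketch. It is not the paper's code and does not implement the two methods the paper proposes, which the abstract does not describe. Everything in it is an assumption made for illustration: the one-dimensional nearest-class-mean classifier, the search restricted to single-feature subsets, the fold assignment by index, and all function names. The inner estimate is the cross-validation score that guides the selection; the outer estimate is measured on data that took no part in the selection.

```python
import random
from statistics import mean


def fit_predict(train, test, feat):
    """Nearest-class-mean classifier on one feature; returns test accuracy."""
    centers = {}
    for label in (0, 1):
        values = [x[feat] for x, y in train if y == label]
        if values:  # a class may be missing from a small training set
            centers[label] = mean(values)
    hits = 0
    for x, y in test:
        pred = min(centers, key=lambda c: abs(x[feat] - centers[c]))
        hits += (pred == y)
    return hits / len(test)


def folds(data, k):
    """Yield (train, test) splits; the sample at position i goes to fold i % k."""
    for f in range(k):
        test = [s for i, s in enumerate(data) if i % k == f]
        train = [s for i, s in enumerate(data) if i % k != f]
        yield train, test


def cv_accuracy(data, feat, k):
    """Cross-validated accuracy of a single feature."""
    return mean(fit_predict(tr, te, feat) for tr, te in folds(data, k))


def select_feature(data, n_feats, k):
    """Pick the feature with the best inner CV score; return (index, score)."""
    scores = [cv_accuracy(data, f, k) for f in range(n_feats)]
    best = max(range(n_feats), key=scores.__getitem__)
    return best, scores[best]


def nested_cv(data, n_feats, k_outer=5, k_inner=5):
    """Outer loop: selection sees only the outer training part of each split."""
    inner, outer = [], []
    for train, test in folds(data, k_outer):
        best, estimate = select_feature(train, n_feats, k_inner)
        inner.append(estimate)                        # score that guided the search
        outer.append(fit_predict(train, test, best))  # score on untouched data
    return mean(inner), mean(outer)


def noise_data(n=40, n_feats=50, seed=0):
    """Features and labels are independent, so no feature is truly useful."""
    rng = random.Random(seed)
    return [([rng.gauss(0, 1) for _ in range(n_feats)], rng.randint(0, 1))
            for _ in range(n)]


if __name__ == "__main__":
    inner, outer = nested_cv(noise_data(), 50)
    print(f"inner estimate: {inner:.2f}, outer estimate: {outer:.2f}")
```

On pure-noise data like this, the inner estimate would be expected to sit above chance level because it is the maximum over many noisy scores, while the outer estimate should stay near 0.5; the exact figures depend on the seed and the sample size. Note that the abstract's point goes further: even with such an outer loop, a straightforward comparison can still misstate the gain from picking the best-performing subset.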