Improving incremental wrapper-based feature subset selection by using re-ranking

Authors:
Pablo Bermejo;José A. Gámez;José M. Puerta
Affiliations:
Computing Systems Department, Universidad de Castilla-La Mancha, Spain;Computing Systems Department, Universidad de Castilla-La Mancha, Spain;Computing Systems Department, Universidad de Castilla-La Mancha, Spain
Venue:
IEA/AIE'10 Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I
Year:
2010

Abstract

This paper deals with the problem of supervised wrapper-based feature subset selection in datasets with a very large number of attributes. In such datasets, sophisticated search algorithms such as beam search, branch and bound, best first, and genetic algorithms become intractable in the wrapper approach because of the high number of wrapper evaluations they require. For this reason, hybrid selection algorithms have recently appeared in the literature: starting from a filter ranking, they perform an incremental wrapper selection over that ranking. Although they work well, these methods still have two problems: (1) depending on the complexity of the wrapper search method, the number of wrapper evaluations can still be too large; and (2) they rely on a univariate ranking that does not take into account the interaction between the variables already included in the selected subset and the remaining ones. In this paper we propose to work incrementally at two levels (block level and attribute level) in order to use a filter re-ranking method based on conditional mutual information. The results show that this drastically reduces the number of wrapper evaluations without degrading the quality of the obtained subset (in fact, we obtain the same accuracy while reducing the number of selected attributes).
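
The two-level idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the function names (`mutual_info`, `loo_1nn_accuracy`, `rerank_iwss`), the leave-one-out 1-NN wrapper, the CMIM-style re-ranking score (the minimum, over the already selected attributes, of the conditional mutual information with the class), the toy data, and the stopping rule (stop as soon as a whole block contributes no attribute).

```python
from collections import Counter
from math import log


def mutual_info(x, y, z=None):
    """I(X;Y|Z) in nats for discrete sequences; z=None gives plain I(X;Y)."""
    n = len(x)
    z = z if z is not None else [0] * n
    nxyz, nz = Counter(zip(x, y, z)), Counter(z)
    nxz, nyz = Counter(zip(x, z)), Counter(zip(y, z))
    return sum(c / n * log(c * nz[k] / (nxz[i, k] * nyz[j, k]))
               for (i, j, k), c in nxyz.items())


def loo_1nn_accuracy(columns, labels, subset):
    """Hypothetical wrapper: leave-one-out 1-NN accuracy with Hamming distance."""
    rows = list(zip(*(columns[f] for f in subset)))
    n = len(labels)
    hits = 0
    for i in range(n):
        nearest = min((k for k in range(n) if k != i),
                      key=lambda k: sum(a != b for a, b in zip(rows[i], rows[k])))
        hits += labels[nearest] == labels[i]
    return hits / n


def rerank_iwss(columns, labels, block_size, evaluate):
    """Incremental wrapper selection over blocks, re-ranking between blocks."""
    # Initial univariate filter ranking by I(X;C), best first.
    remaining = sorted(range(len(columns)),
                       key=lambda f: -mutual_info(columns[f], labels))
    selected, best, evaluations = [], 0.0, 0
    while remaining:
        # Block level: take the top of the current ranking.
        block, remaining = remaining[:block_size], remaining[block_size:]
        added = False
        # Attribute level: keep an attribute only if the wrapper score improves.
        for f in block:
            evaluations += 1
            score = evaluate(selected + [f])
            if score > best:
                selected, best, added = selected + [f], score, True
        if not added:  # assumed stopping rule: the block contributed nothing
            break
        # Re-rank the rest conditioned on what is already selected (CMIM-style).
        remaining.sort(key=lambda f: -min(
            mutual_info(columns[f], labels, columns[s]) for s in selected))
    return selected, best, evaluations


# Toy data (made up): attribute 0 equals the class, 1 duplicates it, 2-5 are noise.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    list(labels),
    list(labels),
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1, 0, 0, 1],
]
selected, best, evaluations = rerank_iwss(
    columns, labels, block_size=2,
    evaluate=lambda subset: loo_1nn_accuracy(columns, labels, subset))
print(selected, best, evaluations)  # prints: [0] 1.0 4
```

In this toy run the search stops after the second block, so only 4 of the 6 attributes are ever passed to the wrapper. In practice the wrapper would be a cross-validated classifier, and the block size would be a tunable parameter.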