Improving incremental wrapper-based feature subset selection by using re-ranking

  • Authors: Pablo Bermejo; José A. Gámez; José M. Puerta

  • Affiliation (all authors): Computing Systems Department, Universidad de Castilla-La Mancha, Spain

  • Venue: IEA/AIE'10 Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I
  • Year: 2010

Abstract

This paper deals with the problem of supervised wrapper-based feature subset selection in datasets with a very large number of attributes. In such datasets, sophisticated search algorithms such as beam search, branch and bound, best-first search, and genetic algorithms become intractable in the wrapper approach because of the high number of wrapper evaluations they require. Recent literature has therefore turned to hybrid selection algorithms, which compute a filter ranking and then perform an incremental wrapper selection over that ranking. Although these methods work well, they still have two problems: (1) depending on the complexity of the wrapper search method, the number of wrapper evaluations can still be too large; and (2) they rely on a univariate ranking that does not take into account the interaction between the variables already included in the selected subset and the remaining ones. In this paper we propose to work incrementally at two levels (block level and attribute level) so that a filter re-ranking method based on conditional mutual information can be used. The results show that this drastically reduces the number of wrapper evaluations without degrading the quality of the obtained subset: accuracy stays the same while fewer attributes are selected.
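
The two-level idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the names `cond_mutual_info`, `rerank_select`, and `make_evaluator` are hypothetical; conditional mutual information is estimated on the joint value of all selected attributes (practical only for small discrete data); attributes rejected within a block are discarded; and the search stops when a whole block adds nothing. The wrapper is a toy table-lookup classifier scored on the data it was built from, where a real wrapper would use cross-validation.

```python
from collections import Counter
from itertools import product
from math import log2


def cond_mutual_info(x, y, z):
    """Empirical I(X; Y | Z) in bits for equal-length discrete sequences."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    return sum(
        c / n * log2(n_z[zi] * c / (n_xz[xi, zi] * n_yz[yi, zi]))
        for (xi, yi, zi), c in n_xyz.items()
    )


def rerank_select(X, y, evaluate, block_size):
    """Block-level re-ranking with attribute-level incremental wrapper steps."""
    remaining = list(range(len(X[0])))
    selected = []
    best = evaluate(selected)  # baseline score of the empty subset
    n_evals = 0                # candidate subsets passed to the wrapper
    while remaining:
        # Block level: re-rank what is left, conditioned on the selected subset.
        z = [tuple(row[j] for j in selected) for row in X]
        remaining.sort(
            key=lambda j: -cond_mutual_info([row[j] for row in X], y, z)
        )
        block, remaining = remaining[:block_size], remaining[block_size:]
        grew = False
        # Attribute level: keep an attribute only if the wrapper score improves.
        for j in block:
            score = evaluate(selected + [j])
            n_evals += 1
            if score > best:
                selected.append(j)
                best = score
                grew = True
        if not grew:  # assumed stopping rule: the block added nothing
            break
    return selected, best, n_evals


def make_evaluator(X_train, y_train, X_test, y_test):
    """Toy wrapper: accuracy of a majority-vote lookup table on the subset."""
    default = Counter(y_train).most_common(1)[0][0]

    def evaluate(subset):
        votes = {}
        for row, label in zip(X_train, y_train):
            key = tuple(row[j] for j in subset)
            votes.setdefault(key, Counter())[label] += 1
        hits = 0
        for row, label in zip(X_test, y_test):
            key = tuple(row[j] for j in subset)
            pred = votes[key].most_common(1)[0][0] if key in votes else default
            hits += pred == label
        return hits / len(y_test)

    return evaluate


# Toy data: attribute 1 duplicates attribute 0, attribute 3 is irrelevant.
X = [[a, a, b, n] for a, b, n in product([0, 1], repeat=3)]
y = [2 * row[0] + row[2] for row in X]
evaluate = make_evaluator(X, y, X, y)  # same data twice, for brevity only

selected, best, n_evals = rerank_select(X, y, evaluate, block_size=2)
print(selected, best, n_evals)
```

On this toy data the duplicated attribute has zero conditional mutual information once its twin is selected, which is the kind of interaction a univariate ranking cannot see. Estimating the conditional term on the full joint of the selected subset does not scale to many attributes, so a practical version would need a cheaper approximation.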