Speeding up the wrapper feature subset selection in regression by mutual information relevance and redundancy analysis

  • Authors:
  • Gert Van Dijck;Marc M. Van Hulle

  • Affiliations:
  • Computational Neuroscience Research Group, Laboratorium voor Neuro-en Psychofysiologie, K.U. Leuven, Leuven, Belgium;Computational Neuroscience Research Group, Laboratorium voor Neuro-en Psychofysiologie, K.U. Leuven, Leuven, Belgium

  • Venue:
  • ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
  • Year:
  • 2006

Abstract

A hybrid filter/wrapper feature subset selection algorithm for regression is proposed. First, features are passed through a relevance and redundancy filter based on the mutual information between regressor and target variables. We introduce permutation tests to identify features that are statistically significantly relevant or redundant. Second, a wrapper searches for good candidate feature subsets by taking the regression model into account. The advantage of the hybrid approach is threefold. First, the filter identifies interesting features independently of the regression model and hence allows for an easier interpretation. Second, because the filter part is computationally less expensive, the overall algorithm provides good candidate subsets faster than a stand-alone wrapper approach. Third, the wrapper takes the bias of the regression model into account, because the regression model guides the search for optimal features. Results are shown for the ‘Boston housing’ and ‘orange juice’ benchmarks, using a multilayer perceptron as the regression model.
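
The relevance part of the filter step can be illustrated with a minimal sketch of a permutation test on mutual information. This is not the authors' implementation: the abstract does not specify the mutual information estimator, the number of permutations, or the significance level, so the equal-width histogram estimator, the function names (`binned`, `mutual_information`, `permutation_p_value`), the parameters `n_bins` and `n_perm`, and the synthetic data are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def binned(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against constant input
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=8):
    """Plug-in histogram estimate of I(X;Y) in nats (assumed estimator)."""
    bx, by = binned(x, n_bins), binned(y, n_bins)
    n = len(x)
    cx, cy, cxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    # sum over occupied cells of p(i,j) * log( p(i,j) / (p(i) p(j)) )
    return sum((c / n) * math.log(c * n / (cx[i] * cy[j]))
               for (i, j), c in cxy.items())


def permutation_p_value(x, y, n_perm=200, seed=0):
    """Permutation test of H0: x and y are independent.

    Shuffling y destroys any dependence on x while keeping both marginals,
    so the shuffled MI values sample the null distribution of the estimator.
    """
    rng = random.Random(seed)
    observed = mutual_information(x, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if mutual_information(x, y_perm) >= observed:
            exceed += 1
    # add-one correction so the estimated p-value is never exactly zero
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic illustration (hypothetical data, not from the paper):
# the target depends on x_rel but not on x_noise.
data_rng = random.Random(1)
x_rel = [data_rng.gauss(0, 1) for _ in range(300)]
x_noise = [data_rng.gauss(0, 1) for _ in range(300)]
y = [2.0 * a + 0.3 * data_rng.gauss(0, 1) for a in x_rel]

mi_rel, p_rel = permutation_p_value(x_rel, y)
mi_noise, p_noise = permutation_p_value(x_noise, y)
print(f"relevant feature: MI={mi_rel:.3f}, p={p_rel:.3f}")
print(f"noise feature:    MI={mi_noise:.3f}, p={p_noise:.3f}")
```

A feature would be kept as relevant when its p-value falls below a chosen significance level. Under the same assumptions, calling `permutation_p_value` on a pair of features instead of a feature and the target gives one possible way to flag redundancy; how the paper combines the two tests is not stated in the abstract.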