Embedding Monte Carlo search of features in tree-based ensemble methods

  • Authors:
  • Francis Maes;Pierre Geurts;Louis Wehenkel

  • Affiliations:
  • Dept. of Electrical Engineering and Computer Science, Institut Montefiore, University of Liège, Liège, Belgium (all authors)

  • Venue:
  • ECML PKDD'12: Proceedings of the 2012 European Conference on Machine Learning and Knowledge Discovery in Databases, Volume Part I
  • Year:
  • 2012

Abstract

Feature generation is the problem of automatically constructing good features for a given target learning problem. While most feature generation algorithms follow either the filter or the wrapper approach, this paper focuses on embedded feature generation. We propose a general scheme for embedding feature generation in a wide range of tree-based learning algorithms, including single decision trees, random forests, and tree boosting. The scheme formalizes feature construction as a sequential decision-making problem, which is addressed by a tractable Monte Carlo search algorithm coupled with node splitting. This leads to fast algorithms that are applicable to large-scale problems. We empirically analyze the performance of these tree-based learners, with and without the feature generation capability, on several standard datasets.
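
The idea of coupling a Monte Carlo search over constructed features with node splitting can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration: the operator set `BINARY_OPS`, the uniform random rollout policy, the variance-reduction split score, and all function names are hypothetical and are not taken from the paper. Each rollout makes a sequence of random construction decisions (stop at a variable, or apply an operator to two sub-expressions) until a complete feature is obtained; the feature is then scored by the best threshold split it allows at the current node.

```python
import operator
import random

# Hypothetical operator set; the abstract does not say which operators are used.
BINARY_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def sample_feature(n_vars, max_depth, rng):
    """One random rollout: take construction decisions until the expression is complete."""
    if max_depth == 0 or rng.random() < 0.5:
        return ("var", rng.randrange(n_vars))  # terminal decision: pick an input variable
    op = rng.choice(sorted(BINARY_OPS))  # non-terminal decision: pick an operator
    left = sample_feature(n_vars, max_depth - 1, rng)
    right = sample_feature(n_vars, max_depth - 1, rng)
    return (op, left, right)


def evaluate(feature, row):
    """Compute the value of a constructed feature on one example."""
    if feature[0] == "var":
        return row[feature[1]]
    op, left, right = feature
    return BINARY_OPS[op](evaluate(left, row), evaluate(right, row))


def variance(values):
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_threshold(feature_values, y):
    """Return (variance reduction, threshold) of the best split on one feature."""
    pairs = sorted(zip(feature_values, y))
    n = len(pairs)
    total = variance(y)
    best = (0.0, None)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        score = total - (len(left) * variance(left) + len(right) * variance(right)) / n
        if score > best[0]:
            best = (score, (pairs[i - 1][0] + pairs[i][0]) / 2)
    return best


def search_split(X, y, n_rollouts=200, max_depth=2, seed=0):
    """Sample candidate features at a node and keep the best-scoring split."""
    rng = random.Random(seed)
    best = (0.0, None, None)  # (score, feature, threshold)
    for _ in range(n_rollouts):
        feature = sample_feature(len(X[0]), max_depth, rng)
        values = [evaluate(feature, row) for row in X]
        score, threshold = best_threshold(values, y)
        if threshold is not None and score > best[0]:
            best = (score, feature, threshold)
    return best
```

On a small XOR-like dataset such as `X = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]]` with `y = [1.0, 1.0, 0.0, 0.0]`, no threshold on a raw input variable reduces the variance, whereas a constructed feature such as the product of the two inputs separates the classes with a single threshold. In an embedded scheme of this kind, `search_split` would be called once per node while growing each tree, so the search budget `n_rollouts` trades split quality against training time.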