On oblique random forests

  • Authors:
  • Bjoern H. Menze; B. Michael Kelm; Daniel N. Splitthoff; Ullrich Koethe; Fred A. Hamprecht

  • Affiliations:
  • Interdisciplinary Center for Scientific Computing, University of Heidelberg, Germany and Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge; Interdisciplinary Center for Scientific Computing, University of Heidelberg, Germany; Interdisciplinary Center for Scientific Computing, University of Heidelberg, Germany; Interdisciplinary Center for Scientific Computing, University of Heidelberg, Germany; Interdisciplinary Center for Scientific Computing, University of Heidelberg, Germany

  • Venue:
  • ECML PKDD'11: Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases, Volume Part II
  • Year:
  • 2011

Abstract

In his original paper on random forests, Breiman proposed two different decision tree ensembles: one generated from "orthogonal" trees with thresholds on individual features in every split, and one from "oblique" trees separating the feature space by randomly oriented hyperplanes. Despite rising interest in the random forest framework, however, ensembles built from orthogonal trees (RF) have received most, if not all, of the attention so far. In the present work we propose to employ "oblique" random forests (oRF) built from multivariate trees which explicitly learn optimal split directions at internal nodes using linear discriminative models, rather than using random coefficients as in the original oRF. This oRF outperforms RF, as well as other classifiers, on nearly all data sets except those with discrete factorial features. Learned node models perform distinctly better than random splits. An oRF feature importance score proves preferable to standard RF feature importance scores such as Gini or permutation importance. The topology of the oRF decision space appears to be smoother and better adapted to the data, resulting in improved generalization performance. Overall, the oRF proposed here may be preferred over standard RF on most learning tasks involving numerical and spectral data.
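
The difference between an orthogonal split and a learned oblique split at a single tree node can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only speaks of "linear discriminative models", so ridge regression on ±1 labels is assumed here as one possible node model. The function names, the `ridge` parameter, the Gini split criterion, and the synthetic two-feature data set with a diagonal class boundary are all hypothetical choices made for illustration.

```python
import numpy as np

def gini(y):
    """Gini impurity of a binary label vector with values in {0, 1}."""
    if y.size == 0:
        return 0.0
    p = y.mean()
    return 2.0 * p * (1.0 - p)

def split_impurity(scores, y, threshold):
    """Weighted Gini impurity after thresholding one-dimensional scores."""
    left = scores <= threshold
    n_left, n_right = left.sum(), (~left).sum()
    return (n_left * gini(y[left]) + n_right * gini(y[~left])) / y.size

def best_threshold(scores, y):
    """Scan midpoints between sorted unique scores for the lowest impurity."""
    values = np.unique(scores)
    if values.size < 2:
        return values[0], gini(y)
    candidates = (values[:-1] + values[1:]) / 2.0
    impurities = [split_impurity(scores, y, t) for t in candidates]
    best = int(np.argmin(impurities))
    return candidates[best], impurities[best]

def orthogonal_split(X, y, features):
    """Axis-aligned split: best (feature, threshold, impurity) among `features`."""
    results = [(f,) + best_threshold(X[:, f], y) for f in features]
    return min(results, key=lambda r: r[2])

def oblique_split(X, y, features, ridge=1.0):
    """Learned oblique split (assumed node model: ridge regression on +/-1 labels)."""
    Xs = X[:, features]
    mean = Xs.mean(axis=0)
    Xc = Xs - mean                      # center the selected features
    targets = 2.0 * y - 1.0             # map labels {0, 1} to {-1, +1}
    A = Xc.T @ Xc + ridge * np.eye(len(features))
    w = np.linalg.solve(A, Xc.T @ targets)   # split direction
    threshold, impurity = best_threshold(Xc @ w, y)
    return w, mean, threshold, impurity

# Synthetic data (an assumption): two numerical features, diagonal class boundary.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
features = [0, 1]

feat, orth_threshold, orth_impurity = orthogonal_split(X, y, features)
w, mean, obl_threshold, obl_impurity = oblique_split(X, y, features)

print("orthogonal split impurity:", orth_impurity)
print("oblique split impurity:   ", obl_impurity)
print("learned direction:        ", w)
```

On data of this kind, a single threshold on one feature cannot follow the diagonal boundary, whereas the learned direction can align with it, so the oblique split leaves the child nodes purer. A full forest would repeat such a split recursively on bootstrap samples with a random feature subset at each node; those parts are omitted here.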