Combining bagging and random subspaces to create better ensembles

  • Authors:
  • Panče Panov; Sašo Džeroski

  • Affiliations:
  • Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia (both authors)

  • Venue:
  • IDA'07: Proceedings of the 7th International Conference on Intelligent Data Analysis
  • Year:
  • 2007

Abstract

Random forests are one of the best performing methods for constructing ensembles. They derive their strength from two aspects: using random subsamples of the training data (as in bagging) and randomizing the algorithm for learning base-level classifiers (decision trees). The base-level algorithm randomly selects a subset of the features at each step of tree construction and chooses the best one among these. We propose to achieve a similar effect by combining bagging with the random subspace method: the latter randomly selects a subset of the features once, at the start, and then applies a deterministic version of the base-level algorithm, so the resulting base-level classifiers are somewhat similar to those produced by the randomized algorithm. The results of our experiments show that the proposed approach performs comparably to random forests, with the added advantage of being applicable to any base-level algorithm without the need to randomize it.
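
To make the idea concrete, the sketch below illustrates the general scheme the abstract describes: each ensemble member is trained on a bootstrap sample of the rows (bagging) and on a randomly chosen subset of the columns (random subspaces), while the base-level learner itself stays deterministic. This is an illustrative reconstruction, not the authors' code; the class name, the 50% subspace size, the majority-vote aggregation, and the use of scikit-learn's DecisionTreeClassifier as a stand-in for an arbitrary base-level algorithm are all assumptions made for the example.

```python
# Illustrative sketch (not the paper's implementation): bagging + random
# subspaces around an unmodified, deterministic base-level learner.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


class BaggedSubspaceEnsemble:
    def __init__(self, base_learner=DecisionTreeClassifier, n_estimators=50,
                 feature_fraction=0.5, random_state=0):
        self.base_learner = base_learner          # any deterministic learner
        self.n_estimators = n_estimators
        self.feature_fraction = feature_fraction  # size of each random subspace
        self.rng = np.random.default_rng(random_state)

    def fit(self, X, y):
        n_samples, n_features = X.shape
        k = max(1, int(self.feature_fraction * n_features))
        self.members = []
        for _ in range(self.n_estimators):
            rows = self.rng.integers(0, n_samples, n_samples)      # bootstrap sample
            cols = self.rng.choice(n_features, k, replace=False)   # random subspace
            model = self.base_learner().fit(X[np.ix_(rows, cols)], y[rows])
            self.members.append((model, cols))
        return self

    def predict(self, X):
        # Majority vote over the predictions of all ensemble members.
        votes = np.array([m.predict(X[:, cols]) for m, cols in self.members])
        return np.apply_along_axis(
            lambda v: np.bincount(v).argmax(), axis=0, arr=votes)


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    ensemble = BaggedSubspaceEnsemble().fit(X, y)
    print("training accuracy:", (ensemble.predict(X) == y).mean())
```

Because the randomization lives entirely in how the training data and features are sampled, the base-level learner can be swapped for any deterministic algorithm without modification, which is the practical advantage the abstract highlights over random forests.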