Reduced bootstrap aggregating of learning algorithms

  • Authors:
  • Rafael Pino-Mejías
  • María-Dolores Jiménez-Gamero
  • María-Dolores Cubiles-de-la-Vega
  • Antonio Pascual-Acosta

  • Affiliations:
  • Andalousian Prospective Center, Avda. Reina Mercedes, s/n, 41012 Seville, Spain and Department of Statistics, University of Seville, Avda. Reina Mercedes, s/n, 41012 Seville, Spain
  • Department of Statistics, University of Seville, Avda. Reina Mercedes, s/n, 41012 Seville, Spain
  • Department of Statistics, University of Seville, Avda. Reina Mercedes, s/n, 41012 Seville, Spain
  • Andalousian Prospective Center, Avda. Reina Mercedes, s/n, 41012 Seville, Spain and Department of Statistics, University of Seville, Avda. Reina Mercedes, s/n, 41012 Seville, Spain

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2008


Abstract

Bagging is based on the combination of models fitted to bootstrap samples of a training data set. There is considerable evidence that such an ensemble method can significantly reduce the variance of the prediction model. Moreover, several techniques have been proposed to achieve a variance reduction in the bootstrap resampling process itself, as in our reduced bootstrap methodology, where the generated bootstrap samples are forced to have a number of distinct original observations between two appropriate values k1 and k2. In this paper, we first describe the reduced bootstrap and consider possible values for k1 and k2. Secondly, we propose to employ the reduced bootstrap for bagging unstable learning algorithms such as decision trees and neural networks. An empirical comparison over classification and regression problems shows a tendency to reduce the variance of the test error. We have also performed a theoretical analysis for learning algorithms that can be approximated by a quadratic expansion, obtaining expressions for the relative gain in efficiency of the reduced bootstrap aggregating procedure.
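The core idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it draws bootstrap samples by rejection, keeping only those whose number of distinct original observations lies in the interval [k1, k2], and then averages the predictions of models fitted to each retained sample (the regression form of bagging). The trivial sample-mean "model" in the usage example is a placeholder assumption standing in for an unstable learner such as a decision tree or neural network.

```python
import random
from statistics import mean

def reduced_bootstrap_samples(data, b, k1, k2, rng):
    """Rejection sampling: draw size-n bootstrap samples (with replacement)
    until b of them satisfy the reduced bootstrap constraint, i.e. have
    between k1 and k2 distinct original observations."""
    n = len(data)
    samples = []
    while len(samples) < b:
        idx = [rng.randrange(n) for _ in range(n)]
        if k1 <= len(set(idx)) <= k2:
            samples.append([data[i] for i in idx])
    return samples

def reduced_bagging_predict(samples, fit, x):
    """Bagging aggregation for regression: average the predictions at x
    of the models fitted to each reduced bootstrap sample.
    `fit` maps a sample to a predictor function."""
    return mean(fit(s)(x) for s in samples)

# Usage with a placeholder learner (predicts the sample mean regardless of x).
rng = random.Random(1)
data = list(range(20))
samples = reduced_bootstrap_samples(data, 5, 11, 14, rng)
prediction = reduced_bagging_predict(
    samples, lambda s: (lambda x: mean(s)), x=0)
```

For an ordinary bootstrap sample of size n, the expected number of distinct observations is roughly n(1 - 1/e) ≈ 0.632n, so bounds chosen near that value (as in the example, 11 to 14 for n = 20) keep the rejection loop cheap; very narrow or extreme bounds would make rejection sampling slow.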