An ensemble technique for stable learners with performance bounds

  • Authors:
  • Ian Davidson

  • Affiliations:
  • Department of Computer Science, SUNY-Albany, Albany, NY

  • Venue:
  • AAAI'04 Proceedings of the 19th National Conference on Artificial Intelligence
  • Year:
  • 2004

Abstract

Ensemble techniques such as bagging and DECORATE exploit the "instability" of learners, such as decision trees, to create a diverse set of models. However, creating a diverse set of models for stable learners such as naïve Bayes is difficult, as they are relatively insensitive to changes in the training data. Furthermore, many popular ensemble techniques do not have a rigorous underlying theory and often provide no insight into how many models to build. We formally define a stable learner in terms of the second-order derivative of its posterior density function and propose an ensemble technique specifically for stable learners. Our ensemble technique, bootstrap model averaging, creates a number of bootstrap samples from the training data, builds a model from each, and then sums the joint instance and class probability over all models built. We show that for stable learners our ensemble technique, in the limit of infinitely many bootstrap samples, approximates posterior model averaging (also known as the optimal Bayes classifier, OBC). For finitely many bootstrap samples we estimate the increase over the OBC error using Chebyshev bounds. We empirically illustrate our approach's usefulness for several stable learners and verify the correctness of our bounds.
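
The following is a minimal sketch of the bootstrap-resample-and-average idea described in the abstract, assuming scikit-learn's GaussianNB as the stable base learner and NumPy for resampling. Note one simplification: the paper sums the joint instance-and-class probability over all models, whereas this sketch averages the conditional class probabilities that predict_proba exposes; the parameter names (n_models, seed) are illustrative, not from the paper.

```python
# Bootstrap model averaging sketch for a stable learner (naive Bayes).
# Assumes every class appears in each bootstrap sample so that the
# predict_proba columns of all models line up.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

def bootstrap_model_average(X, y, X_test, n_models=50, seed=0):
    """Fit one naive Bayes model per bootstrap sample and average
    the per-class probabilities over all fitted models."""
    rng = np.random.default_rng(seed)
    n = len(X)
    summed_proba = None
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)          # bootstrap sample (draw n points with replacement)
        model = GaussianNB().fit(X[idx], y[idx])  # stable base learner on this sample
        proba = model.predict_proba(X_test)       # this model's class probabilities
        summed_proba = proba if summed_proba is None else summed_proba + proba
    return summed_proba / n_models                # average over all models built

X, y = load_iris(return_X_y=True)
proba = bootstrap_model_average(X, y, X, n_models=25)
pred = proba.argmax(axis=1)                       # predict the class with highest averaged probability
print("training accuracy:", (pred == y).mean())
```

As more bootstrap models are averaged, the combined prediction stabilises, which is the intuition behind the paper's claim that infinitely many bootstrap samples approximate posterior model averaging.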