On the Boosting Pruning Problem

  • Authors:
  • Christino Tamon; Jie Xiang

  • Venue:
  • ECML '00 Proceedings of the 11th European Conference on Machine Learning
  • Year:
  • 2000

Abstract

Boosting is a powerful method for improving the predictive accuracy of classifiers. The ADABOOST algorithm of Freund and Schapire has been applied successfully in many domains [2, 10, 12], and the combination of ADABOOST with the C4.5 decision tree algorithm has been called the best off-the-shelf learning algorithm in practice. Unfortunately, in some applications the number of decision trees ADABOOST requires to reach reasonable accuracy is enormous, making the ensemble very costly to store. This problem was first studied by Margineantu and Dietterich [7], who proposed an empirical method called Kappa pruning that prunes the boosted ensemble of decision trees without sacrificing much accuracy. In this work-in-progress we propose a potential improvement to the Kappa pruning method and also study the boosting pruning problem from a theoretical perspective. We point out that the boosting pruning problem is intractable even to approximate. Finally, we suggest a margin-based theoretical heuristic for this problem.
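The kappa statistic measures how often two classifiers agree beyond what chance alone would predict; Kappa pruning ranks pairs of ensemble members by this statistic and keeps the most diverse (lowest-kappa) pairs. The sketch below illustrates that idea; the function names and the simple greedy selection are illustrative assumptions, not the exact procedure of Margineantu and Dietterich [7]:

```python
from itertools import combinations

def kappa(preds_a, preds_b, n_classes):
    """Kappa agreement statistic between two classifiers' predictions."""
    m = len(preds_a)
    # Contingency table: table[a][b] counts points labelled a by one
    # classifier and b by the other.
    table = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(preds_a, preds_b):
        table[a][b] += 1
    theta1 = sum(table[i][i] for i in range(n_classes)) / m  # observed agreement
    theta2 = sum(                                            # agreement by chance
        (sum(table[i]) / m) * (sum(row[i] for row in table) / m)
        for i in range(n_classes)
    )
    if theta2 == 1.0:  # degenerate case: both classifiers always predict one class
        return 1.0
    return (theta1 - theta2) / (1.0 - theta2)

def kappa_prune(all_preds, n_keep, n_classes):
    """Keep n_keep classifiers, favouring pairs with the lowest kappa
    (the most diverse pairs), in the spirit of Kappa pruning."""
    pairs = sorted(
        combinations(range(len(all_preds)), 2),
        key=lambda p: kappa(all_preds[p[0]], all_preds[p[1]], n_classes),
    )
    kept = []
    for i, j in pairs:
        for idx in (i, j):
            if idx not in kept:
                kept.append(idx)
            if len(kept) == n_keep:
                return sorted(kept)
    return sorted(kept)
```

As a sanity check on the statistic: two classifiers with identical predictions get kappa = 1, and two that disagree on every point (with balanced class marginals) get kappa = -1, so the pruner prefers to retain the latter kind of pair.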