Selection of decision stumps in bagging ensembles

Authors:
Gonzalo Martínez-Muñoz;Daniel Hernández-Lobato;Alberto Suárez
Affiliations:
Computer Science Department, Universidad Autónoma de Madrid, Madrid, Spain;Computer Science Department, Universidad Autónoma de Madrid, Madrid, Spain;Computer Science Department, Universidad Autónoma de Madrid, Madrid, Spain
Venue:
ICANN'07 Proceedings of the 17th international conference on Artificial neural networks
Year:
2007

Abstract

This article presents a comprehensive study of different ensemble pruning techniques applied to a bagging ensemble composed of decision stumps. Six different ensemble pruning methods are tested. Four of these are greedy strategies based on first reordering the elements of the ensemble according to some rule that takes into account the complementarity of the predictors with respect to the classification task. Subensembles of increasing size are then constructed by incorporating the ordered classifiers one by one. A halting criterion stops the aggregation process before the complete original ensemble is recovered. The other two approaches are selection techniques that attempt to identify optimal subensembles using either genetic algorithms or semidefinite programming. Experiments performed on 24 benchmark classification tasks show that the selection of a small subset (approximately 10-15%) of the original pool of stumps generated with bagging can significantly increase the accuracy and reduce the complexity of the ensemble.
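
The greedy ordering-and-halting scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the four ordering rules or the halting criterion, so the sketch makes two loudly labeled assumptions: stumps are ordered by how much each one lowers the training error of the majority vote built so far (a reduce-error style rule), and aggregation halts at a fixed fraction of the pool (15%, taken from the range quoted above). All function names are hypothetical, and this is not presented as the authors' implementation.

```python
import random


def fit_stump(X, y):
    """Fit a one-split stump to labels in {-1, +1} by exhaustive search."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for sign in (1, -1):
                errors = sum(
                    1 for row, label in zip(X, y)
                    if (sign if row[f] <= t else -sign) != label
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, sign)
    return best[1:]  # (feature index, threshold, sign)


def stump_predict(stump, row):
    f, t, sign = stump
    return sign if row[f] <= t else -sign


def bag_stumps(X, y, n_stumps, rng):
    """Bagging: fit each stump on a bootstrap sample of the training set."""
    n = len(X)
    stumps = []
    for _ in range(n_stumps):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def vote_error(votes, y):
    """Error rate of the summed votes; a tied vote counts as an error."""
    return sum(1 for v, label in zip(votes, y) if v * label <= 0) / len(y)


def order_by_greedy_error(stumps, X, y):
    """ASSUMED ordering rule: at each step add the stump that minimizes
    the training error of the subensemble's majority vote."""
    preds = [[stump_predict(s, row) for row in X] for s in stumps]
    remaining = list(range(len(stumps)))
    votes = [0] * len(X)
    order, errors = [], []
    while remaining:
        best_k = min(
            remaining,
            key=lambda k: vote_error(
                [v + p for v, p in zip(votes, preds[k])], y
            ),
        )
        votes = [v + p for v, p in zip(votes, preds[best_k])]
        remaining.remove(best_k)
        order.append(best_k)
        errors.append(vote_error(votes, y))
    return order, errors


def prune(stumps, X, y, fraction=0.15):
    """ASSUMED halting criterion: keep a fixed fraction of the ordered pool."""
    order, errors = order_by_greedy_error(stumps, X, y)
    keep = max(1, round(fraction * len(stumps)))
    return [stumps[k] for k in order[:keep]], errors
```

Calling `prune(bag_stumps(X, y, 100, random.Random(0)), X, y)` on a training set with labels in {-1, +1} would return the first 15 stumps of the ordered sequence, together with the training error of every intermediate subensemble. Ordering on the training set and cutting at a fixed size are only one possible design; the genetic-algorithm and semidefinite-programming selectors mentioned in the abstract are not covered by this sketch.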