We propose to determine the size of a parallel ensemble by estimating the minimum number of classifiers required to obtain stable aggregate predictions. Assuming that majority voting is used, we give a statistical description of the convergence of the ensemble prediction to its asymptotic (infinite-size) limit. The analysis of the voting process shows that for most test instances the ensemble prediction stabilizes after only a few classifiers are polled. By contrast, a small but non-negligible fraction of instances requires a large number of classifier queries before their predictions stabilize. Specifically, the fraction of instances whose stable prediction requires more than T classifiers has, for T ≫ 1, a universal form proportional to T^(-1/2). The ensemble size is then determined as the minimum number of classifiers needed to estimate the infinite-ensemble prediction at an average confidence level α close to one. This approach differs from previous proposals, which determine the size at which the prediction error (rather than the predictions themselves) stabilizes. In particular, it does not require estimates of the generalization performance of the ensemble, which can be unreliable. It is generally valid because it relies solely on the statistical description of the convergence of majority voting to its asymptotic limit. Extensive experiments with representative parallel ensembles (bagging and random forest) illustrate the application of the proposed framework to a wide range of classification problems. These experiments show that the optimal ensemble size is strongly dependent on the particular classification problem considered.
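As an illustration of the T^(-1/2) behavior described above, the following is a minimal Monte Carlo sketch in Python with NumPy. The uniform draw of per-instance vote probabilities and the stabilization criterion are illustrative assumptions, not the authors' exact model: each simulated test instance has a probability p that a randomly drawn ensemble member votes for class 1, and the stabilization time is the last point at which the running majority vote disagrees with the infinite-ensemble prediction sign(p - 1/2).

```python
import numpy as np

rng = np.random.default_rng(0)

n_instances = 10000  # simulated test instances
max_T = 1001         # classifiers polled per instance

# Per-instance probability that a randomly drawn classifier votes for
# class 1 (uniform draw is an illustrative assumption; instances with
# p near 0.5 are the ones whose majority vote stabilizes slowly).
p = rng.uniform(0.0, 1.0, size=n_instances)
asymptotic = p > 0.5  # infinite-ensemble majority prediction

# Poll classifiers sequentially: votes[i, t] is the vote of the
# (t+1)-th classifier on instance i.
votes = rng.random((n_instances, max_T)) < p[:, None]
cum = np.cumsum(votes, axis=1, dtype=np.int32)
t = np.arange(1, max_T + 1, dtype=np.int32)
running = 2 * cum > t  # running majority vote after t classifiers

# Stabilization time: last t at which the running vote disagrees with
# the asymptotic prediction (0 if it never disagrees).
disagree = running != asymptotic[:, None]
last = max_T - np.argmax(disagree[:, ::-1], axis=1)
stab_time = np.where(disagree.any(axis=1), last, 0)

# The fraction of instances still unstable after T classifiers should
# decay roughly like T**-0.5.
for T in (10, 50, 100, 500):
    frac = (stab_time > T).mean()
    print(f"T={T:4d}  fraction unstable: {frac:.4f}  T**-0.5: {T**-0.5:.4f}")
```

In this simulation the printed fractions track the T**-0.5 reference column up to a constant factor, which is the qualitative behavior the abstract describes: once the expected fraction of still-unstable instances falls below 1 - α, polling additional classifiers changes very few predictions, and the corresponding T can serve as the ensemble size.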